infer0 for developers

Let your users bring their own AI provider. infer0 handles the OAuth flow so each user connects their own API key. You call a single API in the SDK format you prefer.

01 Register app
02 Auth redirect
03 Exchange code
04 Inference

1. Register your app

Sign in to infer0 and register your app under OAuth Apps. You'll get a client_id and client_secret. The redirect URI must match your app's callback exactly, including protocol, hostname, and path.

2. Authorization redirect

GET /oauth/authorize

Redirect the user to infer0's authorization endpoint to start the OAuth flow.

https://infer0.com/oauth/authorize?client_id=<client_id>&redirect_uri=<callback_url>&response_type=code

The user signs in (if needed), selects a provider, and approves the request. infer0 redirects back to your callback with a code parameter.
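The redirect URL can be assembled with the standard URL API so the callback is percent-encoded correctly. This is a minimal sketch: the helper name buildAuthorizeUrl and the example client_id and callback values are illustrative and not part of infer0's API.

```typescript
// Hypothetical helper: builds the authorization URL from the three
// query parameters documented above.
function buildAuthorizeUrl(clientId: string, redirectUri: string): string {
  const url = new URL("https://infer0.com/oauth/authorize");
  url.searchParams.set("client_id", clientId);
  url.searchParams.set("redirect_uri", redirectUri); // percent-encoded automatically
  url.searchParams.set("response_type", "code");
  return url.toString();
}

// Placeholder values; substitute your registered client_id and callback.
const authorizeUrl = buildAuthorizeUrl(
  "my-client-id",
  "https://app.example.com/callback",
);
```

Redirect the user's browser to authorizeUrl. The redirect_uri passed here should be the same one registered in step 1.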

3. Exchange code for tokens

POST /v1/oauth/token

Trade the authorization code for an access token and refresh token.

POST https://infer0.com/v1/oauth/token
Content-Type: application/x-www-form-urlencoded

grant_type=authorization_code&code=<code>&client_id=<client_id>&client_secret=<client_secret>&redirect_uri=<redirect_uri>

HTTP 200

{
  "access_token": "infer0_at_xxx",
  "token_type": "Bearer",
  "expires_in": 3600,
  "refresh_token": "infer0_rt_xxx",
  "scope": "inference userinfo"
}

The access_token expires in 1 hour. The refresh_token expires in 30 days and is single-use. The access token works for all three inference endpoints and for fetching user info.
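The exchange above might look like the following on a server using the built-in fetch API. This is a sketch under assumptions: the helper names buildTokenRequestBody and exchangeCode are hypothetical, and error handling is reduced to a single thrown Error.

```typescript
// Shape of the token response shown above.
interface TokenResponse {
  access_token: string;
  token_type: string;
  expires_in: number; // seconds
  refresh_token: string;
  scope: string;
}

// Hypothetical helper: builds the form-encoded body for the authorization_code grant.
function buildTokenRequestBody(
  code: string,
  clientId: string,
  clientSecret: string,
  redirectUri: string,
): string {
  return new URLSearchParams({
    grant_type: "authorization_code",
    code,
    client_id: clientId,
    client_secret: clientSecret,
    redirect_uri: redirectUri,
  }).toString();
}

// Hypothetical helper: call this from server-side code so the
// client_secret is not shipped to the browser.
async function exchangeCode(
  code: string,
  clientId: string,
  clientSecret: string,
  redirectUri: string,
): Promise<TokenResponse> {
  const res = await fetch("https://infer0.com/v1/oauth/token", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: buildTokenRequestBody(code, clientId, clientSecret, redirectUri),
  });
  if (!res.ok) {
    throw new Error(`Token exchange failed with HTTP ${res.status}`);
  }
  return (await res.json()) as TokenResponse;
}
```

Store both tokens together with the time they were issued, since expires_in is relative to the response time.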

4. Inference

Use the access token to call infer0's API. infer0 routes the request to the user's configured provider and model automatically. Three API endpoints are available. Pick the one that matches your preferred SDK format. The response always matches the request protocol, regardless of which upstream provider handles it.

POST /v1/chat/completions

OpenAI Chat Completions format.

POST https://infer0.com/v1/chat/completions
Authorization: Bearer <access_token>
Content-Type: application/json

{
  "messages": [
    { "role": "user", "content": "Hello" }
  ]
}

HTTP 200

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1718000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 9,
    "total_tokens": 19
  }
}

or with the OpenAI SDK:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://infer0.com/v1",
  apiKey: "<access_token>",
});

const chat = await client.chat.completions.create({
  model: "ignored", // required by the SDK types; infer0 uses the user's configured model
  messages: [{ role: "user", content: "Hello" }],
});

POST /v1/messages

Anthropic Messages format.

POST https://infer0.com/v1/messages
Authorization: Bearer <access_token>
Content-Type: application/json
anthropic-version: 2023-06-01

{
  "messages": [
    { "role": "user", "content": "Hello" }
  ]
}

HTTP 200

{
  "id": "msg_xxx",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! How can I help you today?"
    }
  ],
  "model": "claude-sonnet-4-20250514",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 10,
    "output_tokens": 9
  }
}

or with the Anthropic SDK:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  // The Anthropic SDK appends /v1/messages to baseURL itself.
  baseURL: "https://infer0.com",
  apiKey: "<access_token>",
});

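const message = await client.messages.create({
  model: "ignored", // required by the SDK types; infer0 uses the user's configured model
  max_tokens: 1024, // required by the SDK types; ignored by infer0
  messages: [{ role: "user", content: "Hello" }],
});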

POST /v1/responses

OpenAI Responses API format.

POST https://infer0.com/v1/responses
Authorization: Bearer <access_token>
Content-Type: application/json

{
  "input": "Hello"
}

HTTP 200

{
  "id": "resp_xxx",
  "object": "response",
  "created_at": 1718000000,
  "model": "gpt-4o-mini",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Hello! How can I help you today?"
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 10,
    "output_tokens": 9,
    "total_tokens": 19
  }
}

or with the OpenAI SDK:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://infer0.com/v1",
  apiKey: "<access_token>",
});

const response = await client.responses.create({
  model: "ignored", // accepted but ignored; infer0 uses the user's configured model
  input: "Hello",
});

5. Parameters

For cross-provider compatibility, only these request parameters are accepted. All other parameters (temperature, top_p, max_tokens, tools, response_format, etc.) are silently ignored.

Parameter  Endpoints                           Description
messages   /v1/chat/completions, /v1/messages  Array of message objects with role and content.
input      /v1/responses                       String or array of input items. The prompt content.
model      All                                 Accepted but ignored. The user's configured model is always used.
stream     All                                 Set to true to receive a server-sent event stream. Defaults to false.
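Because unsupported parameters are dropped without an error, it can help to strip them on the client so your logs reflect what infer0 actually acts on. The helper below is a hypothetical sketch; the name pickAcceptedParams is not part of any SDK, and it does not check which endpoint a parameter belongs to.

```typescript
// Parameters infer0 accepts, per the table above.
const ACCEPTED_PARAMS = ["messages", "input", "model", "stream"] as const;

// Hypothetical helper: returns a copy of the request body that keeps
// only the accepted parameters.
function pickAcceptedParams(
  body: Record<string, unknown>,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of ACCEPTED_PARAMS) {
    if (key in body) out[key] = body[key];
  }
  return out;
}

const sent = pickAcceptedParams({
  messages: [{ role: "user", content: "Hello" }],
  temperature: 0.2, // dropped here; infer0 would silently ignore it anyway
  stream: true,
});
// `sent` now contains only `messages` and `stream`.
```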

6. How model selection works

infer0 supports multiple API formats (OpenAI Chat, Anthropic Messages, OpenAI Responses). Regardless of which format you use, the model field value you send is ignored. The actual model is determined by each user's provider configuration, not by your app.

Requested vs actual model

// Your app always sends the same request format:
POST /v1/chat/completions  (OpenAI SDK)
POST /v1/messages          (Anthropic SDK)
POST /v1/responses         (OpenAI Responses SDK)
{ "model": "ignored", "input/messages": [...] }

// User A has OpenAI / gpt-4o-mini
// infer0 routes to: OpenAI gpt-4o-mini

// User B has Anthropic / claude-sonnet-4-20250514
// infer0 routes to: Anthropic claude-sonnet-4-20250514

// User C has Google / gemini-2.5-pro
// infer0 routes to: Google gemini-2.5-pro

The model field in your request acts as a placeholder. infer0 replaces it with the user's configured model before forwarding to the upstream provider.

Tradeoffs

User control (by design)

Each user chooses their own provider and model. Your app doesn't need to know or care. This means different users may get different results from the same request, which is the intended behavior for a BYO-provider app.

Reproducibility

The same request can produce different results across users because each may be using a different provider or model. If your app needs deterministic model behavior, consider whether a BYO-provider approach fits your use case.

Debugging

When a user reports an issue, you'll need to know which provider and model they're using. Ask them to check their AI Providers page. The resolved model is not currently returned in the API response.

Developer constraints

You cannot pin a specific model version across all users. If your app requires a particular model or provider to function correctly, infer0's routing model may not be the right fit.

7. Errors and edge cases

infer0 returns JSON error responses with a consistent structure. Your app should handle these cases gracefully.

No provider configured

The user signed in and authorized your app but has not connected an AI provider. Your app cannot make inference requests until the user adds a provider.

HTTP 400
{
  "error": {
    "message": "No provider configured",
    "code": "no_provider"
  }
}

Recommended: Ask the user to add a provider on infer0's AI Providers page. You can redirect them to https://infer0.com/providers or let them retry after configuring one.

Authorization revoked

The user revoked your app's access from their Authorizations page. The access token is no longer valid.

HTTP 403
{
  "error": {
    "message": "Authorization revoked or not found",
    "code": "auth_revoked"
  }
}

Recommended: Prompt the user to re-authorize your app by redirecting them through the OAuth flow again.

Access token expired

Access tokens expire after 1 hour, after which they are no longer accepted.

HTTP 401
{
  "error": {
    "message": "Invalid or expired token",
    "code": "auth_error"
  }
}

Recommended: Use the refresh token to get a new access token. If the refresh token is also expired or revoked, redirect the user through the full OAuth flow again.

Provider key invalid or revoked

The user's provider API key is invalid or has been revoked. This is surfaced as a provider error from the upstream API.

HTTP 401
{
  "error": {
    "message": "Provider error: 401 Unauthorized",
    "code": "provider_error"
  }
}

Recommended: Ask the user to check their provider key on the AI Providers page and re-enter it if needed.

Provider quota or rate exceeded

The user's provider account has hit a rate limit or quota. The error message is forwarded from the upstream provider.

HTTP 429
{
  "error": {
    "message": "Provider error: 429 Too Many Requests",
    "code": "provider_error"
  }
}

Recommended: Implement exponential backoff and retry. If the error persists, notify the user that their provider account may need a plan upgrade.
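One way to implement the backoff is sketched below. The base delay, cap, and retry count are arbitrary example values rather than infer0 recommendations, and both helper names are hypothetical.

```typescript
// Hypothetical helper: delay in milliseconds before retry number `attempt`
// (0-based), doubling each time and capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Hypothetical helper: re-runs `fn` while it returns HTTP 429,
// up to maxRetries additional attempts, then returns the last response.
async function withBackoff(
  fn: () => Promise<Response>,
  maxRetries = 4,
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fn();
    if (res.status !== 429 || attempt >= maxRetries) return res;
    await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
  }
}
```

With the defaults, the waits are 500 ms, 1 s, 2 s, and 4 s. Adding random jitter to each delay is a common refinement when many clients retry at once.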

Unsupported model or feature

The user's provider does not support a feature your app requested (e.g. a parameter the provider doesn't accept). The upstream provider returns the error.

HTTP 400
{
  "error": {
    "message": "Provider error: 400 {'error': {'message': 'Unsupported parameter: ...'}}",
    "code": "provider_error"
  }
}

Recommended: Check which provider the user has configured and adjust request parameters accordingly. Some features (like response_format or tools) are provider-specific.

infer0 service unavailable

infer0 is down or unreachable. The user's provider keys remain unaffected.

HTTP 502 / 503 / 504
{
  "error": {
    "message": "Provider error: upstream failure",
    "code": "provider_error"
  }
}

Recommended: Implement a retry with backoff. If requests continue to fail, degrade gracefully, inform the user, and avoid blocking the rest of your app.
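The cases above can be folded into a single dispatch function keyed on the error code and HTTP status. This is a sketch only: the Action labels and the function name recommendedAction are hypothetical, and the mapping covers just the status and code combinations listed in this section.

```typescript
// Error body structure shown in the examples above.
interface Infer0Error {
  error: { message: string; code: string };
}

// Hypothetical labels for what the app should do next.
type Action =
  | "add_provider"
  | "reauthorize"
  | "refresh_token"
  | "check_provider_key"
  | "retry_with_backoff"
  | "adjust_request"
  | "unknown";

// Maps a failed response to the recommendation documented for that case.
function recommendedAction(status: number, body: Infer0Error): Action {
  const code = body.error.code;
  if (code === "no_provider") return "add_provider";
  if (code === "auth_revoked") return "reauthorize";
  if (code === "auth_error") return "refresh_token";
  if (code === "provider_error") {
    // provider_error is shared by several cases, so the status disambiguates.
    if (status === 401) return "check_provider_key";
    if (status === 429 || status >= 500) return "retry_with_backoff";
    if (status === 400) return "adjust_request";
  }
  return "unknown";
}
```

Note that HTTP 401 appears twice in this section, so the code field, not the status alone, decides between refreshing the token and asking the user to check their provider key.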

8. Login / SSO (optional)

The same access token doubles as an identity token. Use the userinfo endpoint to get the user's profile:

GET /v1/userinfo

curl https://infer0.com/v1/userinfo -H "Authorization: Bearer <access_token>"

{
  "sub": "user-uuid",
  "email": "user@example.com",
  "name": "User Name",
  "picture": "https://..."
}

Use this to look up or create users in your own database when they sign in with infer0. No extra setup is needed. The same OAuth flow gives you the token and the user profile.
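A look-up-or-create step could be sketched as follows. It assumes sub is a stable, unique identifier for the infer0 user, and it uses an in-memory Map as a stand-in for your own database. The AppUser type and findOrCreateUser name are hypothetical.

```typescript
// Profile fields returned by /v1/userinfo, as shown above.
interface UserInfo {
  sub: string;
  email: string;
  name: string;
  picture: string;
}

// Hypothetical user record in your own application.
interface AppUser {
  id: string;
  email: string;
  name: string;
  picture: string;
}

// In-memory stand-in for a database table keyed by the infer0 `sub`.
const users = new Map<string, AppUser>();

// Returns the existing user for this `sub`, refreshing profile fields,
// or creates a new record on first sign-in.
function findOrCreateUser(info: UserInfo): AppUser {
  const existing = users.get(info.sub);
  const user: AppUser = {
    id: existing ? existing.id : info.sub,
    email: info.email,
    name: info.name,
    picture: info.picture,
  };
  users.set(info.sub, user);
  return user;
}
```

Keying on sub rather than email keeps the record stable if the user later changes their email address.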

9. Refreshing the access token

POST /v1/oauth/refresh

Access tokens expire after 1 hour. Use the refresh token to get a new one without asking the user to re-authorize. Refresh tokens are single-use. Each response includes a new refresh_token.

POST https://infer0.com/v1/oauth/refresh
Content-Type: application/x-www-form-urlencoded

grant_type=refresh_token&refresh_token=<refresh_token>&client_id=<client_id>&client_secret=<client_secret>

HTTP 200

{
  "access_token": "infer0_at_xxx",
  "token_type": "Bearer",
  "expires_in": 3600,
  "refresh_token": "infer0_rt_xxx",
  "scope": "inference userinfo"
}

The previous refresh_token is invalidated, so persist the new refresh_token from each response and use it for the next refresh.
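Because refresh tokens rotate, the stored token pair should be replaced as a unit after every refresh. The sketch below shows one way to track expiry and apply a refresh response. The StoredTokens shape, the 60-second safety margin, and all function names are assumptions for illustration.

```typescript
// Hypothetical storage shape for one user's tokens.
interface StoredTokens {
  accessToken: string;
  refreshToken: string;
  expiresAt: number; // Unix time in milliseconds
}

// Subset of the refresh response fields used here.
interface RefreshResponse {
  access_token: string;
  refresh_token: string;
  expires_in: number; // seconds
}

// True when the access token has expired or will within skewMs.
function isExpiring(tokens: StoredTokens, now: number, skewMs = 60_000): boolean {
  return now >= tokens.expiresAt - skewMs;
}

// Converts a token response into the record to persist, replacing
// both the access token and the rotated refresh token.
function applyTokenResponse(res: RefreshResponse, now: number): StoredTokens {
  return {
    accessToken: res.access_token,
    refreshToken: res.refresh_token,
    expiresAt: now + res.expires_in * 1000,
  };
}

// Calls the refresh endpoint documented above and returns the new record.
async function refreshTokens(
  tokens: StoredTokens,
  clientId: string,
  clientSecret: string,
): Promise<StoredTokens> {
  const res = await fetch("https://infer0.com/v1/oauth/refresh", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "refresh_token",
      refresh_token: tokens.refreshToken,
      client_id: clientId,
      client_secret: clientSecret,
    }).toString(),
  });
  if (!res.ok) throw new Error(`Refresh failed with HTTP ${res.status}`);
  return applyTokenResponse((await res.json()) as RefreshResponse, Date.now());
}
```

If several requests can trigger a refresh at the same time, serialize them so that only one call spends the single-use refresh token.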

10. Managing provider configs

Your end users manage their own provider keys through AI Providers. They can add, remove, or switch providers at any time. Your app doesn't need to change anything. You never see or handle their API keys.

Security & privacy

Prompts and completions stay private.

infer0 does not log or store prompt or completion content. Metadata such as timestamps and token counts may be retained for rate limiting, but message bodies are never stored.

Credentials are encrypted.

Provider API keys are encrypted with AES-256-GCM before storage. Access tokens and refresh tokens are hashed. Encryption keys are managed by Cloudflare's secure infrastructure and never exposed to the application.

Staff cannot read your keys.

Encrypted data cannot be read by infer0 staff. The encryption keys are stored in Cloudflare's secure infrastructure, separate from the database.

Revoke access anytime.

Users can revoke any app's access from their Authorizations page. Revoking immediately invalidates the associated tokens and blocks further requests. Providers can be deleted from AI Providers.

Data retention.

infer0 retains account profiles (email, name, avatar), encrypted provider configurations, and OAuth authorization records. Users can delete their providers and revoke authorizations at any time.

If infer0 is unavailable.

A user's API keys with their AI provider (OpenAI, Anthropic, etc.) remain valid and are unaffected. App requests to infer0 will fail until service resumes. We recommend developers handle this gracefully.