infer0 for developers
Let your users bring their own AI provider. infer0 handles the OAuth flow so each user connects their own API key. You call a single API in the SDK format you prefer.
1. Register your app
Sign in to infer0 and register your app under OAuth Apps. You'll get a client_id and client_secret. The redirect URI you register must match your app's callback URL exactly, including protocol, hostname, and path.
2. Authorization redirect
GET /oauth/authorize
Redirect the user to infer0's authorization endpoint to start the OAuth flow.
https://infer0.com/oauth/authorize?client_id=<client_id>&redirect_uri=<callback_url>&response_type=code
The user signs in (if needed), selects a provider, and approves the request.
infer0 redirects back to your callback with a code parameter.
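The redirect and callback steps above can be sketched as two small helpers. This is a minimal sketch that uses only the endpoint and query parameters shown in this section; the function names buildAuthorizeUrl and readAuthorizationCode are hypothetical and not part of any infer0 SDK.

```typescript
// Endpoint and parameter names are taken from the authorization URL above.
const AUTHORIZE_ENDPOINT = "https://infer0.com/oauth/authorize";

// Build the URL to redirect the user to. searchParams percent-encodes the
// redirect URI so it survives as a single query parameter.
function buildAuthorizeUrl(clientId: string, redirectUri: string): string {
  const url = new URL(AUTHORIZE_ENDPOINT);
  url.searchParams.set("client_id", clientId);
  url.searchParams.set("redirect_uri", redirectUri);
  url.searchParams.set("response_type", "code");
  return url.toString();
}

// Read the `code` parameter from the full callback URL; null if it is absent.
function readAuthorizationCode(callbackUrl: string): string | null {
  return new URL(callbackUrl).searchParams.get("code");
}
```

Pass the same redirect URI here that you registered in step 1, since the two must match exactly.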
3. Exchange code for tokens
POST /v1/oauth/token
Trade the authorization code for an access token and refresh token.
POST https://infer0.com/v1/oauth/token
Content-Type: application/x-www-form-urlencoded
grant_type=authorization_code&code=<code>&client_id=<client_id>&client_secret=<client_secret>&redirect_uri=<redirect_uri>
HTTP 200
{
"access_token": "infer0_at_xxx",
"token_type": "Bearer",
"expires_in": 3600,
"refresh_token": "infer0_rt_xxx",
"scope": "inference userinfo"
}
The access_token expires in 1 hour. The refresh_token expires in 30 days and is single-use. The access token works for all three inference endpoints and for fetching user info.
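As a rough illustration, the exchange can be done server-side with fetch. This sketch assumes a runtime with a global fetch (for example Node 18+). The ClientCredentials type and both function names are hypothetical; the form fields and response shape are the ones shown above.

```typescript
// Shape of the token response shown above.
interface TokenResponse {
  access_token: string;
  token_type: string;
  expires_in: number; // seconds
  refresh_token: string;
  scope: string;
}

// Hypothetical container for the values you received in step 1.
interface ClientCredentials {
  clientId: string;
  clientSecret: string;
  redirectUri: string;
}

// Build the form-encoded body for the authorization_code grant.
function buildTokenRequestBody(code: string, creds: ClientCredentials): URLSearchParams {
  return new URLSearchParams({
    grant_type: "authorization_code",
    code,
    client_id: creds.clientId,
    client_secret: creds.clientSecret,
    redirect_uri: creds.redirectUri,
  });
}

// Exchange the code for tokens. Run this on your server so the
// client_secret never reaches the browser.
async function exchangeCode(code: string, creds: ClientCredentials): Promise<TokenResponse> {
  const res = await fetch("https://infer0.com/v1/oauth/token", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: buildTokenRequestBody(code, creds).toString(),
  });
  if (!res.ok) {
    throw new Error(`Token exchange failed with HTTP ${res.status}`);
  }
  return (await res.json()) as TokenResponse;
}
```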
4. Inference
Use the access token to call infer0's API. infer0 routes the request to the user's configured provider and model automatically. Three API endpoints are available. Pick the one that matches your preferred SDK format. The response always matches the request protocol, regardless of which upstream provider handles it.
POST /v1/chat/completions
OpenAI Chat Completions format.
POST https://infer0.com/v1/chat/completions
Authorization: Bearer <access_token>
Content-Type: application/json
{
"messages": [
{ "role": "user", "content": "Hello" }
]
}
HTTP 200
{
"id": "chatcmpl-xxx",
"object": "chat.completion",
"created": 1718000000,
"model": "gpt-4o-mini",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 9,
"total_tokens": 19
}
}
Or with the OpenAI SDK:
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://infer0.com/v1",
  apiKey: "<access_token>",
});
const chat = await client.chat.completions.create({
  // The SDK's types require a model; infer0 ignores the value (see section 6).
  model: "placeholder",
  messages: [{ role: "user", content: "Hello" }],
});
POST /v1/messages
Anthropic Messages format.
POST https://infer0.com/v1/messages
Authorization: Bearer <access_token>
Content-Type: application/json
anthropic-version: 2023-06-01
{
"messages": [
{ "role": "user", "content": "Hello" }
]
}
HTTP 200
{
"id": "msg_xxx",
"type": "message",
"role": "assistant",
"content": [
{
"type": "text",
"text": "Hello! How can I help you today?"
}
],
"model": "claude-sonnet-4-20250514",
"stop_reason": "end_turn",
"usage": {
"input_tokens": 10,
"output_tokens": 9
}
}
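Or with the Anthropic SDK:
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({
  // The SDK appends /v1/messages to baseURL itself.
  baseURL: "https://infer0.com",
  // authToken is sent as "Authorization: Bearer", matching the request above.
  authToken: "<access_token>",
});
const message = await client.messages.create({
  // model and max_tokens are required by the SDK's types; infer0 ignores both.
  model: "placeholder",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello" }],
});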
POST /v1/responses
OpenAI Responses API format.
POST https://infer0.com/v1/responses
Authorization: Bearer <access_token>
Content-Type: application/json
{
"input": "Hello"
}
HTTP 200
{
"id": "resp_xxx",
"object": "response",
"created_at": 1718000000,
"model": "gpt-4o-mini",
"output": [
{
"type": "message",
"role": "assistant",
"content": [
{
"type": "output_text",
"text": "Hello! How can I help you today?"
}
]
}
],
"usage": {
"input_tokens": 10,
"output_tokens": 9,
"total_tokens": 19
}
}
Or with the OpenAI SDK:
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://infer0.com/v1",
apiKey: "<access_token>",
});
const response = await client.responses.create({
input: "Hello",
});
5. Parameters
For cross-provider compatibility, only these request parameters are accepted. All other parameters (temperature, top_p, max_tokens, tools, response_format, etc.) are silently ignored.
6. How model selection works
infer0 supports multiple API formats (OpenAI Chat, Anthropic Messages, OpenAI Responses). Regardless of which format you use, the model field value you send is ignored. The actual model is determined by each user's provider configuration, not by your app.
Requested vs actual model
// Your app always sends the same request format:
POST /v1/chat/completions (OpenAI SDK)
POST /v1/messages (Anthropic SDK)
POST /v1/responses (OpenAI Responses SDK)
{ "model": "ignored", "input/messages": [...] }
// User A has OpenAI / gpt-4o-mini
// infer0 routes to: OpenAI gpt-4o-mini
// User B has Anthropic / claude-sonnet-4-20250514
// infer0 routes to: Anthropic claude-sonnet-4-20250514
// User C has Google / gemini-2.5-pro
// infer0 routes to: Google gemini-2.5-pro
The model field in your request acts as a placeholder. infer0 replaces it with the user's configured model before forwarding to the upstream provider.
Tradeoffs
User control (by design)
Each user chooses their own provider and model. Your app doesn't need to know or care. This means different users may get different results from the same request, which is the intended behavior for a BYO-provider app.
Reproducibility
The same request can produce different results across users because each may be using a different provider or model. If your app needs deterministic model behavior, consider whether a BYO-provider approach fits your use case.
Debugging
When a user reports an issue, you'll need to know which provider and model they're using. Ask them to check their AI Providers page. The resolved model is not currently returned in the API response.
Developer constraints
You cannot pin a specific model version across all users. If your app requires a particular model or provider to function correctly, infer0's routing model may not be the right fit.
7. Errors and edge cases
infer0 returns JSON error responses with a consistent structure. Your app should handle these cases gracefully.
No provider configured
The user signed in and authorized your app but has not connected an AI provider. Your app cannot make inference requests until the user adds a provider.
HTTP 400
{
"error": {
"message": "No provider configured",
"code": "no_provider"
}
}
Recommended: Ask the user to add a provider on infer0's AI Providers page. You can redirect them to https://infer0.com/providers or let them retry after configuring one.
Authorization revoked
The user revoked your app's access from their Authorizations page. The access token is no longer valid.
HTTP 403
{
"error": {
"message": "Authorization revoked or not found",
"code": "auth_revoked"
}
}
Recommended: Prompt the user to re-authorize your app by redirecting them through the OAuth flow again.
Access token expired
Access tokens expire after 1 hour. The token is no longer accepted.
HTTP 401
{
"error": {
"message": "Invalid or expired token",
"code": "auth_error"
}
}
Recommended: Use the refresh token to get a new access token. If the refresh token is also expired or revoked, redirect the user through the full OAuth flow again.
Provider key invalid or revoked
The user's provider API key is invalid or has been revoked. This is surfaced as a provider error from the upstream API.
HTTP 401
{
"error": {
"message": "Provider error: 401 Unauthorized",
"code": "provider_error"
}
}
Recommended: Ask the user to check their provider key on the AI Providers page and re-enter it if needed.
Provider quota or rate exceeded
The user's provider account has hit a rate limit or quota. The error message is forwarded from the upstream provider.
HTTP 429
{
"error": {
"message": "Provider error: 429 Too Many Requests",
"code": "provider_error"
}
}
Recommended: Implement exponential backoff and retry. If the error persists, notify the user that their provider account may need a plan upgrade.
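One possible shape for the backoff recommended above is sketched here. The base delay, cap, and retry count are arbitrary assumptions rather than values published by infer0, and the helper names are hypothetical.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Call `send` again while it keeps returning HTTP 429, up to maxRetries
// extra attempts. The last response is returned either way, so the caller
// can still surface a persistent 429 to the user.
async function sendWithBackoff<R extends { status: number }>(
  send: () => Promise<R>,
  maxRetries = 4,
): Promise<R> {
  let res = await send();
  for (let attempt = 0; attempt < maxRetries && res.status === 429; attempt++) {
    await sleep(backoffDelayMs(attempt));
    res = await send();
  }
  return res;
}
```

With these defaults the waits are 500 ms, 1 s, 2 s, and 4 s. Adding random jitter to each delay is a common refinement when many clients retry at once.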
Unsupported model or feature
The user's provider does not support a feature your app requested (e.g. a parameter the provider doesn't accept). The upstream provider returns the error.
HTTP 400
{
"error": {
"message": "Provider error: 400 {'error': {'message': 'Unsupported parameter: ...'}}",
"code": "provider_error"
}
}
Recommended: Check which provider the user has configured and adjust request parameters accordingly. Some features (like response_format or tools) are provider-specific.
infer0 service unavailable
infer0 is down or unreachable. The user's provider keys remain unaffected.
HTTP 502 / 503 / 504
{
"error": {
"message": "Provider error: upstream failure",
"code": "provider_error"
}
}
Recommended: Implement a retry with backoff. If requests continue to fail, degrade gracefully, inform the user, and avoid blocking the rest of your app.
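The cases in this section can be folded into a single dispatch function. The following is a sketch only: the NextStep labels and the function name are hypothetical, and the mapping relies solely on the status codes and error codes listed above.

```typescript
// Hypothetical labels for what your app should do next.
type NextStep =
  | "connect_provider"
  | "reauthorize"
  | "refresh_token"
  | "check_provider_key"
  | "retry_with_backoff"
  | "adjust_request"
  | "unknown";

// Error body structure shown throughout this section.
interface Infer0Error {
  error: { message: string; code: string };
}

// Map an HTTP status plus error code to the recommended action.
function nextStepFor(status: number, body: Infer0Error): NextStep {
  const code = body.error.code;
  if (code === "no_provider") return "connect_provider";
  if (code === "auth_revoked") return "reauthorize";
  if (code === "auth_error") return "refresh_token";
  if (code === "provider_error") {
    // provider_error covers several situations; the status tells them apart.
    if (status === 401) return "check_provider_key";
    if (status === 429 || status >= 500) return "retry_with_backoff";
    if (status === 400) return "adjust_request";
  }
  return "unknown";
}
```

Branching on the code first and the status second keeps the provider_error cases, which share one code, distinguishable.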
8. Login / SSO (optional)
The same access token doubles as an identity token. Use the userinfo endpoint to get the user's profile:
GET /v1/userinfo
curl https://infer0.com/v1/userinfo -H "Authorization: Bearer <access_token>"
{
"sub": "user-uuid",
"email": "user@example.com",
"name": "User Name",
"picture": "https://..."
}
Use this to look up or create users in your own database when they sign in with infer0. No extra setup is needed. The same OAuth flow gives you the token and the user profile.
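A look-up-or-create step might look like the sketch below. It assumes that sub is a stable, unique identifier for the user, which this page does not state explicitly, and it uses an in-memory Map as a stand-in for your database. The function names are hypothetical.

```typescript
// Profile shape returned by the userinfo endpoint above.
interface UserInfo {
  sub: string;
  email: string;
  name: string;
  picture: string;
}

// Fetch the profile for the user who owns this access token.
async function fetchUserInfo(accessToken: string): Promise<UserInfo> {
  const res = await fetch("https://infer0.com/v1/userinfo", {
    headers: { Authorization: `Bearer ${accessToken}` },
  });
  if (!res.ok) {
    throw new Error(`userinfo failed with HTTP ${res.status}`);
  }
  return (await res.json()) as UserInfo;
}

// In-memory stand-in for your users table, keyed by `sub` (assumed stable).
const usersBySub = new Map<string, UserInfo>();

// Create the user on first sign-in, refresh the stored profile afterwards.
function upsertUser(profile: UserInfo): { created: boolean } {
  const created = !usersBySub.has(profile.sub);
  usersBySub.set(profile.sub, profile);
  return { created };
}
```

Keying on sub rather than email means a user who changes their email address still maps to the same record, provided the assumption about sub holds.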
9. Refreshing the access token
POST /v1/oauth/refresh
Access tokens expire after 1 hour. Use the refresh token to get a new one without asking the user to re-authorize. Refresh tokens are single-use. Each response includes a new refresh_token.
POST https://infer0.com/v1/oauth/refresh
Content-Type: application/x-www-form-urlencoded
grant_type=refresh_token&refresh_token=<refresh_token>&client_id=<client_id>&client_secret=<client_secret>
HTTP 200
{
"access_token": "infer0_at_xxx",
"token_type": "Bearer",
"expires_in": 3600,
"refresh_token": "infer0_rt_xxx",
"scope": "inference userinfo"
}
The previous refresh_token is invalidated once the new one is issued, so store the refresh_token from each response and discard the old one.
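Because refresh tokens are single-use, the rotation logic is worth keeping in one place. The sketch below shows one way to track expiry and rotate both tokens together. The StoredTokens shape, the 60-second safety margin, and the function names are assumptions for illustration; the request fields are the ones shown above.

```typescript
// Shape of the refresh response shown above.
interface TokenResponse {
  access_token: string;
  token_type: string;
  expires_in: number; // seconds
  refresh_token: string;
  scope: string;
}

// Hypothetical record your app persists per user.
interface StoredTokens {
  accessToken: string;
  refreshToken: string;
  expiresAtMs: number; // absolute expiry, in epoch milliseconds
}

// Convert a token response into the stored form, fixing the expiry time.
function toStoredTokens(res: TokenResponse, nowMs: number): StoredTokens {
  return {
    accessToken: res.access_token,
    refreshToken: res.refresh_token,
    expiresAtMs: nowMs + res.expires_in * 1000,
  };
}

// Refresh slightly early so a request never goes out with a stale token.
function needsRefresh(tokens: StoredTokens, nowMs: number, skewMs = 60_000): boolean {
  return nowMs >= tokens.expiresAtMs - skewMs;
}

// Trade the current refresh token for a new pair. Persist the result
// immediately: the old refresh token no longer works after this call.
async function refreshTokens(
  tokens: StoredTokens,
  clientId: string,
  clientSecret: string,
): Promise<StoredTokens> {
  const res = await fetch("https://infer0.com/v1/oauth/refresh", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "refresh_token",
      refresh_token: tokens.refreshToken,
      client_id: clientId,
      client_secret: clientSecret,
    }).toString(),
  });
  if (!res.ok) {
    throw new Error(`Token refresh failed with HTTP ${res.status}`);
  }
  return toStoredTokens((await res.json()) as TokenResponse, Date.now());
}
```

If several requests for the same user can run concurrently, serialize the refresh so that two callers do not both spend the same single-use refresh token.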
10. Managing provider configs
Your end users manage their own provider keys on the AI Providers page. They can add, remove, or switch providers at any time without any change to your app. You never see or handle their API keys.
Security & privacy
Prompts and completions stay private.
infer0 does not log or store prompt or completion content. Metadata such as timestamps and token counts may be retained for rate limiting, but message bodies are never stored.
Credentials are encrypted.
Provider API keys are encrypted with AES-256-GCM before storage. Access tokens and refresh tokens are hashed. Encryption keys are managed by Cloudflare's secure infrastructure and never exposed to the application.
Staff cannot read your keys.
Encrypted data cannot be read by infer0 staff. The encryption keys are stored in Cloudflare's secure infrastructure, separate from the database.
Revoke access anytime.
Users can revoke any app's access from their Authorizations page. Revoking immediately invalidates the associated tokens and blocks further requests. Providers can be deleted from AI Providers.
Data retention.
infer0 retains account profiles (email, name, avatar), encrypted provider configurations, and OAuth authorization records. Users can delete their providers and revoke authorizations at any time.
If infer0 is unavailable.
A user's API keys with their AI provider (OpenAI, Anthropic, etc.) remain valid and are unaffected. App requests to infer0 will fail until service resumes. We recommend developers handle this gracefully.