Error codes
Canonical error responses across endpoints with likely causes and fixes.
Summary
- 400 — Validation error (missing/invalid fields or provider-parameter mismatch)
- 401 — Authentication error (missing/invalid API key)
- 402 — Payment required (quota/credits exhausted)
- 413 — Payload too large (exceeds request size limits)
- 429 — Rate limit exceeded (per-key or per-IP)
- 500 — Internal server error (unexpected failure)
400 — Validation error
Common cases:
- OpenAI model used with Anthropic-only params (e.g., top_k, stop_sequences, thinking).
- Anthropic model used with OpenAI-only params (e.g., logit_bias, max_completion_tokens).
- stream is not a boolean.
- Missing required fields (e.g., messages, max_tokens for Anthropic).
Example response:
{
  "error": "Parameters not supported for Anthropic models: response_format, frequency_penalty"
}
How to fix:
- Check Schemas and the selected model's provider; remove incompatible fields.
- Ensure required fields are present and correctly typed.
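As an illustrative sketch, a client can strip provider-incompatible fields before sending. The parameter lists below are the examples named on this page, not an exhaustive schema:

```typescript
// Illustrative (not exhaustive) lists of provider-specific parameters.
const ANTHROPIC_ONLY = new Set(['top_k', 'stop_sequences', 'thinking']);
const OPENAI_ONLY = new Set(['logit_bias', 'max_completion_tokens', 'response_format', 'frequency_penalty']);

// Drop fields the target provider rejects, returning a cleaned request body.
function stripIncompatible(
  body: Record<string, unknown>,
  provider: 'openai' | 'anthropic',
): Record<string, unknown> {
  const reject = provider === 'openai' ? ANTHROPIC_ONLY : OPENAI_ONLY;
  return Object.fromEntries(Object.entries(body).filter(([k]) => !reject.has(k)));
}
```

Consult Schemas for the authoritative field lists rather than hard-coding them in production clients.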
401 — Authentication error
Common cases:
- Missing or incorrect header.
- Inactive API key or zero credits.
Headers matrix:
- OpenAI-compatible routes: Authorization: Bearer $API_KEY
- Anthropic-compatible routes: x-api-key: $API_KEY (or Authorization: Bearer)
How to fix:
- Verify header name/value and that the key is active with credits.
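A minimal sketch of building the right headers for either route style (the helper name is illustrative, not an SDK function):

```typescript
// Build auth headers per the matrix above: Bearer token for OpenAI-compatible
// routes, x-api-key for Anthropic-compatible routes.
function authHeaders(apiKey: string, style: 'openai' | 'anthropic'): Record<string, string> {
  return style === 'openai'
    ? { Authorization: `Bearer ${apiKey}`, 'Content-Type': 'application/json' }
    : { 'x-api-key': apiKey, 'Content-Type': 'application/json' };
}
```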
402 — Payment required
- Credits or quota exhausted.
- Top up credits in dashboard; retry after balance is available.
413 — Payload too large
- Request exceeded size limits (e.g., large prompts, embedded files).
- Reduce input size; consider Batches for long-running workloads.
429 — Rate limit exceeded
- Per-key or per-IP limit reached.
- Apply exponential backoff, spread bursts, or upgrade plan.
- Honor the Retry-After header when present: if it is numeric seconds, sleep that duration; if it is an HTTP date, retry after that timestamp.
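Both Retry-After formats (delta-seconds and HTTP-date, per RFC 9110) can be normalized to a wait in milliseconds. A small sketch:

```typescript
// Parse a Retry-After header value into milliseconds to wait.
// Returns null when the value is neither delta-seconds nor a parseable HTTP date.
function parseRetryAfter(value: string, now: number = Date.now()): number | null {
  const seconds = Number(value);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000); // "120" => 120000
  const date = Date.parse(value); // e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  if (Number.isNaN(date)) return null;
  return Math.max(0, date - now);
}
```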
Example response:
{
  "error": "Rate limit exceeded (key)"
}
Client backoff (TypeScript):
async function retryWithBackoff<T>(fn: () => Promise<T>, attempts = 5): Promise<T> {
  let delay = 500;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (e: any) {
      if (e?.status !== 429) throw e; // only retry rate limits here
      // Prefer the Retry-After header's value when your client surfaces it;
      // otherwise sleep with jitter to spread retries from concurrent clients.
      await new Promise(r => setTimeout(r, delay + Math.random() * 250));
      delay = Math.min(delay * 2, 8000); // exponential backoff, capped at 8s
    }
  }
  throw new Error('Retries exhausted');
}
500 — Internal server error
- Unexpected provider/network errors, transient failures.
- Capture the x-request-id response header and contact support if the problem persists.
Provider-specific notes
- OpenAI-compatible: streaming emits chat.completion.chunk; the final frame sets finish_reason and, when stream_options.include_usage=true, a final usage-only chunk precedes [DONE].
- Anthropic-compatible: streaming emits message_* and content_block_* events; usage fields appear in message_delta/message_stop.
Retryability matrix
| Status | Retryable | Guidance |
|---|---|---|
| 400 | No | Fix request (schema/params) and resend |
| 401 | No | Fix auth header/key state |
| 402 | No | Add credits; then resend |
| 413 | No | Reduce payload size; consider Batches |
| 429 | Yes | Backoff with jitter; honor Retry-After |
| 5xx | Yes | Exponential backoff; cap retries |
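The matrix above collapses into a small predicate for client retry loops:

```typescript
// Per the retryability matrix: only 429 and 5xx are worth retrying;
// 400/401/402/413 require changing the request or account state first.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status < 600);
}
```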
Mid-stream errors (SSE)
When a failure occurs after headers are sent, errors are delivered inside the stream instead of a JSON body with a status code.
- OpenAI Chat Completions: the stream may terminate early; clients should treat an abrupt close as an error. If stream_options.include_usage was enabled, you may still receive a final usage-only chunk.
- OpenAI Responses: emits response.error with details, then a terminal response.completed carrying status: "failed".
- Anthropic Messages: emits a { type: "error", error: { ... } } event and then closes; no non-spec stop frames follow.
After the stream ends, fetch final usage via the generation ID with GET /api/v1/generations?id=<ID>.
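A minimal sketch of classifying parsed stream events for failure, using the event shapes described above (the nested error message field is an assumption; inspect your provider's actual payloads):

```typescript
type StreamOutcome = { failed: boolean; reason?: string };

// Classify a single parsed SSE event for mid-stream failure.
function classifyEvent(event: any): StreamOutcome {
  // Anthropic Messages: { type: "error", error: { ... } } then the stream closes.
  if (event?.type === 'error') {
    return { failed: true, reason: event.error?.message };
  }
  // OpenAI Responses: terminal response.completed with status "failed".
  if (event?.type === 'response.completed' && event.response?.status === 'failed') {
    return { failed: true, reason: 'response status: failed' };
  }
  return { failed: false };
}
```

For Chat Completions, an abrupt connection close (no [DONE]) must additionally be treated as a failure by the transport layer, since no error event is emitted.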