Error codes

Canonical error responses across endpoints with likely causes and fixes.

Summary

  • 400 — Validation error (missing/invalid fields or provider-parameter mismatch)
  • 401 — Authentication error (missing/invalid API key)
  • 402 — Payment required (quota/credits exhausted)
  • 413 — Payload too large (exceeds request size limits)
  • 429 — Rate limit exceeded (per-key or per-IP)
  • 500 — Internal server error (unexpected failure)

400 — Validation error

Common cases:

  • OpenAI model used with Anthropic-only params (e.g., top_k, stop_sequences, thinking).
  • Anthropic model used with OpenAI-only params (e.g., logit_bias, max_completion_tokens).
  • stream is not a boolean.
  • Missing required fields (e.g., messages, max_tokens for Anthropic).

Example response:

{
  "error": "Parameters not supported for Anthropic models: response_format, frequency_penalty"
}

How to fix:

  • Check Schemas and the selected model's provider. Remove incompatible fields.
  • Ensure required fields are present and correctly typed.
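One defensive option is to strip provider-incompatible parameters client-side before sending. A minimal sketch, assuming the parameter lists shown above (they are illustrative, not exhaustive — consult Schemas for the full set):

```typescript
// Hypothetical helper: drop parameters the target provider rejects.
// These sets mirror the examples in this section and are NOT exhaustive.
const OPENAI_ONLY = new Set(['logit_bias', 'max_completion_tokens', 'response_format', 'frequency_penalty']);
const ANTHROPIC_ONLY = new Set(['top_k', 'stop_sequences', 'thinking']);

function stripUnsupportedParams(
  body: Record<string, unknown>,
  provider: 'openai' | 'anthropic',
): Record<string, unknown> {
  // Anthropic models reject OpenAI-only params, and vice versa.
  const banned = provider === 'anthropic' ? OPENAI_ONLY : ANTHROPIC_ONLY;
  return Object.fromEntries(Object.entries(body).filter(([k]) => !banned.has(k)));
}
```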

401 — Authentication error

Common cases:

  • Missing or incorrect header.
  • Inactive API key or zero credits.

Headers matrix:

  • OpenAI-compatible routes: Authorization: Bearer $API_KEY
  • Anthropic-compatible routes: x-api-key: $API_KEY (or Authorization: Bearer)
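The matrix above can be encoded as a small helper so the header name never drifts between route families (a sketch; header names come from the matrix, everything else is illustrative):

```typescript
// Build the auth header for the given route family.
// Anthropic-compatible routes also accept Authorization: Bearer,
// but x-api-key is the conventional choice there.
function authHeaders(apiKey: string, style: 'openai' | 'anthropic'): Record<string, string> {
  return style === 'anthropic'
    ? { 'x-api-key': apiKey }
    : { Authorization: `Bearer ${apiKey}` };
}
```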

How to fix:

  • Verify header name/value and that the key is active with credits.

402 — Payment required

  • Credits or quota exhausted.
  • Top up credits in dashboard; retry after balance is available.

413 — Payload too large

  • Request exceeded size limits (e.g., large prompts, embedded files).
  • Reduce input size; consider Batches for long-running workloads.

429 — Rate limit exceeded

  • Per-key or per-IP limit reached.
  • Apply exponential backoff, spread bursts, or upgrade plan.
  • Honor Retry-After header when present. If numeric seconds, sleep that duration; if HTTP date, retry after that timestamp.

Example response:

{
  "error": "Rate limit exceeded (key)"
}

Client backoff (TypeScript):

async function retryWithBackoff<T>(fn: () => Promise<T>, attempts = 5): Promise<T> {
  let delay = 500;
  for (let i = 0; i < attempts; i++) {
    try { return await fn(); } catch (e: any) {
      if (e?.status !== 429) throw e; // only 429 is retryable here
      // Honor Retry-After (numeric seconds) when present; otherwise use
      // exponential backoff with jitter. Assumes the thrown error exposes
      // status and, optionally, response headers.
      const retryAfter = Number(e?.headers?.get?.('retry-after'));
      const waitMs = Number.isFinite(retryAfter) && retryAfter > 0
        ? retryAfter * 1000
        : delay * (0.5 + Math.random() / 2);
      await new Promise(r => setTimeout(r, waitMs));
      delay = Math.min(delay * 2, 8000);
    }
  }
  throw new Error('Retries exhausted');
}

500 — Internal server error

  • Unexpected provider/network errors, transient failures.
  • Capture x-request-id and contact support if persistent.

Provider-specific notes

  • OpenAI-compatible: streaming emits chat.completion.chunk; final frame sets finish_reason and, when stream_options.include_usage=true, a final usage-only chunk precedes [DONE].
  • Anthropic-compatible: streaming emits message_* and content_block_* events; usage fields appear in message_delta / message_stop.
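A client that supports both providers needs to classify frames differently per family. The sketch below encodes the notes above; the field access is an assumption based on the event names listed here, not a full schema:

```typescript
type StreamSignal = 'content' | 'usage' | 'done' | 'other';

// Classify one parsed SSE payload from either provider.
function classifyChunk(provider: 'openai' | 'anthropic', data: any): StreamSignal {
  if (provider === 'openai') {
    // chat.completion.chunk frames: the final content frame sets finish_reason;
    // with stream_options.include_usage, a usage-only chunk (empty choices) follows.
    if (data.usage && (!data.choices || data.choices.length === 0)) return 'usage';
    if (data.choices?.[0]?.finish_reason) return 'done';
    if (data.choices?.[0]?.delta?.content != null) return 'content';
    return 'other';
  }
  // Anthropic message_* / content_block_* events; usage rides on
  // message_delta, and message_stop terminates the stream.
  if (data.type === 'message_stop') return 'done';
  if (data.type === 'message_delta' && data.usage) return 'usage';
  if (data.type === 'content_block_delta') return 'content';
  return 'other';
}
```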

Retryability matrix

  Status  Retryable  Guidance
  400     No         Fix request (schema/params) and resend
  401     No         Fix auth header/key state
  402     No         Add credits; then resend
  413     No         Reduce payload size; consider Batches
  429     Yes        Backoff with jitter; honor Retry-After
  5xx     Yes        Exponential backoff; cap retries
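The matrix reduces to a one-line predicate that a retry loop can gate on (a sketch of the table above):

```typescript
// Per the retryability matrix: only 429 and 5xx warrant automatic retry;
// all 4xx errors other than 429 require a request or account fix first.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status < 600);
}
```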

Mid-stream errors (SSE)

When a failure occurs after response headers have been sent, the error is delivered inside the stream rather than as a JSON body with an HTTP status code.

  • OpenAI Chat Completions: the stream may terminate early; clients should handle abrupt close as an error. If stream_options.include_usage was enabled, you may still receive a final usage-only chunk.
  • OpenAI Responses: emits response.error with details and then a terminal response.completed carrying status: "failed".
  • Anthropic Messages: emits a { type: "error", error: { ... } } event and then closes; no non-spec stop frames follow.

After the stream ends, fetch final usage via the generation ID with GET /api/v1/generations?id=<ID>.
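Clients can surface these in-stream failures with a small check per event. The event shapes below follow the bullets above and are assumptions, not a complete schema:

```typescript
// Return an error message if this parsed SSE event signals an in-stream
// failure, or null if the event is not an error frame.
function streamError(provider: 'openai-responses' | 'anthropic', event: any): string | null {
  if (provider === 'anthropic' && event.type === 'error') {
    // Anthropic Messages: { type: "error", error: { ... } }, then the stream closes.
    return event.error?.message ?? 'unknown stream error';
  }
  if (provider === 'openai-responses' && event.type === 'response.error') {
    // OpenAI Responses: response.error precedes the terminal failed frame.
    return event.error?.message ?? 'unknown stream error';
  }
  return null;
}
```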

See also