Error codes

Canonical error responses across endpoints with likely causes and fixes.

Summary

  • 400 — Validation error (missing/invalid fields or provider-parameter mismatch)
  • 401 — Authentication error (missing/invalid API key)
  • 402 — Payment required (quota/credits exhausted)
  • 413 — Payload too large (exceeds request size limits)
  • 429 — Rate limit exceeded (per-key or per-IP)
  • 500 — Internal server error (unexpected failure)

400 — Validation error

Common cases:

  • OpenAI model used with Anthropic-only params (e.g., top_k, stop_sequences, thinking).
  • Anthropic model used with OpenAI-only params (e.g., logit_bias, max_completion_tokens).
  • stream is not a boolean.
  • Missing required fields (e.g., messages, max_tokens for Anthropic).

Example response:

{
  "error": "Parameters not supported for Anthropic models: response_format, frequency_penalty"
}

How to fix:

  • Check Schemas and the selected model's provider. Remove incompatible fields.
  • Ensure required fields are present and correctly typed.
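One defensive option is to strip provider-incompatible parameters client-side before sending. A minimal sketch, assuming the parameter lists shown above (they are illustrative, not exhaustive — consult Schemas for the full set):

```typescript
// Hypothetical helper: drop parameters the target provider rejects.
// These sets mirror the examples in this section and are NOT exhaustive.
const OPENAI_ONLY = new Set(['logit_bias', 'max_completion_tokens', 'response_format', 'frequency_penalty']);
const ANTHROPIC_ONLY = new Set(['top_k', 'stop_sequences', 'thinking']);

function stripUnsupportedParams(
  body: Record<string, unknown>,
  provider: 'openai' | 'anthropic',
): Record<string, unknown> {
  // Anthropic models reject OpenAI-only params, and vice versa.
  const banned = provider === 'anthropic' ? OPENAI_ONLY : ANTHROPIC_ONLY;
  return Object.fromEntries(Object.entries(body).filter(([k]) => !banned.has(k)));
}
```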

401 — Authentication error

Common cases:

  • Missing or incorrect header.
  • Inactive API key or zero credits.

Headers matrix:

  • OpenAI-compatible routes: Authorization: Bearer $API_KEY
  • Anthropic-compatible routes: x-api-key: $API_KEY (or Authorization: Bearer)
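The matrix above can be encoded as a small helper so the header name never drifts between route families (a sketch; header names come from the matrix, everything else is illustrative):

```typescript
// Build the auth header for the given route family.
// Anthropic-compatible routes also accept Authorization: Bearer,
// but x-api-key is the conventional choice there.
function authHeaders(apiKey: string, style: 'openai' | 'anthropic'): Record<string, string> {
  return style === 'anthropic'
    ? { 'x-api-key': apiKey }
    : { Authorization: `Bearer ${apiKey}` };
}
```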

How to fix:

  • Verify header name/value and that the key is active with credits.

402 — Payment required

  • Credits or quota exhausted.
  • Top up credits in dashboard; retry after balance is available.

413 — Payload too large

  • Request exceeded size limits (e.g., large prompts, embedded files).
  • Reduce input size; consider Batches for long-running workloads.

429 — Rate limit exceeded

  • Per-key or per-IP limit reached.
  • Apply exponential backoff, spread bursts, or upgrade plan.
  • Honor Retry-After header when present. If numeric seconds, sleep that duration; if HTTP date, retry after that timestamp.

Example response:

{
  "error": "Rate limit exceeded (key)"
}

Client backoff (TypeScript):

async function retryWithBackoff<T>(fn: () => Promise<T>, attempts = 5): Promise<T> {
  let delay = 500;
  for (let i = 0; i < attempts; i++) {
    try { return await fn(); } catch (e: any) {
      if (e?.status !== 429) throw e; // only 429 is retryable here
      // Honor Retry-After (numeric seconds) when present; otherwise use
      // exponential backoff with jitter. Assumes the thrown error exposes
      // status and, optionally, response headers.
      const retryAfter = Number(e?.headers?.get?.('retry-after'));
      const waitMs = Number.isFinite(retryAfter) && retryAfter > 0
        ? retryAfter * 1000
        : delay * (0.5 + Math.random() / 2);
      await new Promise(r => setTimeout(r, waitMs));
      delay = Math.min(delay * 2, 8000);
    }
  }
  throw new Error('Retries exhausted');
}

500 — Internal server error

  • Unexpected provider/network errors, transient failures.
  • Capture x-request-id and contact support if persistent.

Provider-specific notes

  • OpenAI-compatible: streaming emits chat.completion.chunk; final frame sets finish_reason and, when stream_options.include_usage=true, a final usage-only chunk precedes [DONE].
  • Anthropic-compatible: streaming emits message_* and content_block_* events; usage fields appear in message_delta / message_stop.
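A client that supports both providers needs to classify frames differently per family. The sketch below encodes the notes above; the field access is an assumption based on the event names listed here, not a full schema:

```typescript
type StreamSignal = 'content' | 'usage' | 'done' | 'other';

// Classify one parsed SSE payload from either provider.
function classifyChunk(provider: 'openai' | 'anthropic', data: any): StreamSignal {
  if (provider === 'openai') {
    // chat.completion.chunk frames: the final content frame sets finish_reason;
    // with stream_options.include_usage, a usage-only chunk (empty choices) follows.
    if (data.usage && (!data.choices || data.choices.length === 0)) return 'usage';
    if (data.choices?.[0]?.finish_reason) return 'done';
    if (data.choices?.[0]?.delta?.content != null) return 'content';
    return 'other';
  }
  // Anthropic message_* / content_block_* events; usage rides on
  // message_delta, and message_stop terminates the stream.
  if (data.type === 'message_stop') return 'done';
  if (data.type === 'message_delta' && data.usage) return 'usage';
  if (data.type === 'content_block_delta') return 'content';
  return 'other';
}
```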

Retryability matrix

  Status  Retryable  Guidance
  400     No         Fix request (schema/params) and resend
  401     No         Fix auth header/key state
  402     No         Add credits; then resend
  413     No         Reduce payload size; consider Batches
  429     Yes        Backoff with jitter; honor Retry-After
  5xx     Yes        Exponential backoff; cap retries
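The matrix reduces to a one-line predicate that a retry loop can gate on (a sketch of the table above):

```typescript
// Per the retryability matrix: only 429 and 5xx warrant automatic retry;
// all 4xx errors other than 429 require a request or account fix first.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status < 600);
}
```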

Mid-stream errors (SSE)

When a failure occurs after response headers have been sent, the error is delivered inside the stream rather than as a JSON body with an HTTP status code.

  • OpenAI Chat Completions: the stream may terminate early; clients should handle abrupt close as an error. If stream_options.include_usage was enabled, you may still receive a final usage-only chunk.
  • OpenAI Responses: emits response.error with details and then a terminal response.completed carrying status: "failed".
  • Anthropic Messages: emits a { type: "error", error: { ... } } event and then closes; no non-spec stop frames follow.

After the stream ends, fetch final usage via the generation ID with GET /api/v1/generations?id=<ID>.
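Clients can surface these in-stream failures with a small check per event. The event shapes below follow the bullets above and are assumptions, not a complete schema:

```typescript
// Return an error message if this parsed SSE event signals an in-stream
// failure, or null if the event is not an error frame.
function streamError(provider: 'openai-responses' | 'anthropic', event: any): string | null {
  if (provider === 'anthropic' && event.type === 'error') {
    // Anthropic Messages: { type: "error", error: { ... } }, then the stream closes.
    return event.error?.message ?? 'unknown stream error';
  }
  if (provider === 'openai-responses' && event.type === 'response.error') {
    // OpenAI Responses: response.error precedes the terminal failed frame.
    return event.error?.message ?? 'unknown stream error';
  }
  return null;
}
```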

See also