Unified endpoint (OpenAI + Anthropic)
Endpoint: POST /api/v1/messages
Use this endpoint for a provider-agnostic request shape that supports both OpenAI and Anthropic models. It accepts a familiar `messages` array and streams OpenAI-compatible chunks when `stream: true`.
Request schema
Required fields:
- `model` (string) – one of the allowed model IDs configured by the service.
- `messages` (array) – conversation messages. Use OpenAI-style messages for OpenAI models or Anthropic-style content blocks for Anthropic models.
Optional fields (OpenAI-style):
- `temperature` (0..2)
- `max_tokens` (int)
- `top_p` ((0..1])
- `stop` (string | string[])
- `frequency_penalty` (-2..2)
- `presence_penalty` (-2..2)
- `response_format` – supports `{ type: 'text' | 'json_object' | 'json_schema', json_schema?: { name: string; schema?: object; strict?: boolean } }`
- `tools`, `tool_choice`, legacy `function_call`/`functions`
- `n`, `seed`, `user`, `logit_bias`, `parallel_tool_calls`, `store`
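As a sketch of the `json_schema` response format described above, a request body might look like the following (the model ID and schema are illustrative; use the model IDs your service allows):

```python
# Hypothetical request body using response_format with a JSON schema.
# The schema name and fields are examples, not part of the API contract.
body = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Name a city and its country."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "city_info",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
            },
        },
    },
}
```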
Optional fields (Anthropic-style):
- `top_k` (int > 0)
- `stop_sequences` (string[])
- `metadata` (object)
- `thinking` (object)
- `cache_control` (object)
Common:
- `stream` (boolean) – must be a boolean. When true, the response is SSE with OpenAI `chat.completion.chunk` frames.
- `mcp_servers` (array) – optional list of MCP server definitions to enhance tool calling.
- `prompt_cache` (object) – optional hinting for prompt caching.
- `timeoutMs` (number) – optional per-request timeout.
- `input_file_id` (string) – optional file ID to load a JSONL request from your uploaded files (the first valid request is used; body fields take precedence).
Note: The unified endpoint enforces provider-parameter compatibility. If you choose an OpenAI model, Anthropic-only params (such as `top_k` or `stop_sequences`) are rejected with HTTP 400, and vice versa.
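The compatibility rule above could be sketched like this. The parameter sets come from the field lists in this document; the exact server-side routing and registry are internal, so treat this as an illustration only:

```python
# Sketch of the provider-parameter compatibility check described above.
# These sets are assembled from this page's field lists; the real service
# may partition parameters differently.
ANTHROPIC_ONLY = {"top_k", "stop_sequences", "metadata", "thinking", "cache_control"}
OPENAI_ONLY = {"response_format", "frequency_penalty", "presence_penalty",
               "logit_bias", "parallel_tool_calls", "store", "n", "seed"}

def incompatible_params(request: dict, provider: str) -> list[str]:
    """Return the request keys the chosen provider does not accept."""
    rejected = ANTHROPIC_ONLY if provider == "openai" else OPENAI_ONLY
    return sorted(k for k in request if k in rejected)

# Sending an Anthropic-only param to an OpenAI model would trigger a 400:
bad = incompatible_params({"model": "gpt-4o-mini", "top_k": 5}, "openai")
# bad == ["top_k"]
```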
Messages format
- For OpenAI models, send OpenAI-style messages (`{ role, content }`), where `content` may be a string or an array with parts like `text`, `image_url`, etc.
- For Anthropic models, send Anthropic-style content blocks (e.g., `text`, `tool_use`, `tool_result`).
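To make the two message shapes concrete, here is a minimal sketch of each (the URLs and IDs are placeholders):

```python
# OpenAI-style message: content is a string or an array of typed parts.
openai_msg = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
    ],
}

# Anthropic-style message: content is a list of content blocks,
# e.g. returning a tool result to the model.
anthropic_msg = {
    "role": "user",
    "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "72°F"},
    ],
}
```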
File input
If you provide `input_file_id`, the server reads the referenced file (uploaded via the Files API), which should contain newline-delimited JSON (JSONL) requests. The first valid request is merged with the request body, with body properties taking precedence.
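The merge semantics above (first valid JSONL line wins, body overrides) can be sketched as:

```python
import json

def merge_file_request(jsonl_text: str, body: dict) -> dict:
    """Pick the first valid JSONL request, then overlay the request body.

    Body properties take precedence, matching the file-input rule above.
    """
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            file_request = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip invalid lines and keep scanning
        if isinstance(file_request, dict):
            return {**file_request, **body}
    return body  # no valid request found in the file

jsonl = '{"model": "gpt-4o-mini", "temperature": 0.2}\n'
merged = merge_file_request(jsonl, {"temperature": 0.7})
# merged == {"model": "gpt-4o-mini", "temperature": 0.7}
```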
cURL example (streaming)
```bash
curl -N -X POST "https://api.kushrouter.com/api/v1/messages" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "user", "content": "Write a haiku about routers" }
    ],
    "stream": true
  }'
```
The response is an SSE stream of OpenAI-style `chat.completion.chunk` objects ending with `[DONE]`.
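A minimal consumer of that stream might look like the following sketch, which accumulates `delta.content` from each frame until the `[DONE]` sentinel (the sample frames are abbreviated; real chunks also carry `id`, `object`, `created`, and `model`):

```python
import json

def collect_content(sse_lines):
    """Join delta.content from chat.completion.chunk frames until [DONE]."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # ignore blank keep-alives and SSE comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

stream = [
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"lo"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]
# collect_content(stream) == "Hello"
```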
Error responses
Validation failures return HTTP 400 with a descriptive message, for example:

```json
{"error": "Parameters not supported for Anthropic models: response_format, frequency_penalty"}
```
Other errors:
- 401 – missing/invalid API key
- 413 – payload too large
- 429 – rate limit exceeded (key or IP)
Tool calls
When tools are used, incremental function-call deltas are streamed via `choices[0].delta.tool_calls[].function.arguments` (OpenAI format). If the provider only supplies final arguments at the end, a final tool-call delta with the complete `arguments` is emitted. The final chunk sets `finish_reason` to `tool_calls` when applicable.
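Since `arguments` arrives in fragments, a client must stitch the pieces back together per tool-call index before parsing them as JSON. A sketch (the chunk dicts are abbreviated to the fields relevant here):

```python
import json

def accumulate_tool_calls(chunks):
    """Stitch streamed tool-call argument fragments into complete calls."""
    calls = {}  # tool-call index -> {"name": ..., "arguments": ...}
    for chunk in chunks:
        for tc in chunk["choices"][0]["delta"].get("tool_calls", []):
            call = calls.setdefault(tc["index"], {"name": "", "arguments": ""})
            fn = tc.get("function", {})
            if "name" in fn:
                call["name"] = fn["name"]
            call["arguments"] += fn.get("arguments", "")
    return calls

chunks = [
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"name": "get_weather", "arguments": '{"city":'}}]}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '"Paris"}'}}]}}]},
]
calls = accumulate_tool_calls(chunks)
# calls[0]["arguments"] parses as {"city": "Paris"}
```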
Response shapes
Non-streaming JSON responses follow the OpenAI Chat Completion structure. Streaming uses OpenAI `chat.completion.chunk` frames:

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion.chunk",
  "created": 1739123456,
  "model": "...",
  "choices": [
    { "index": 0, "delta": { "content": "..." }, "finish_reason": null }
  ]
}
```

A final frame with an empty `delta` sets `finish_reason` to `stop` or `tool_calls`.