OpenAI Responses (Streaming and Non‑Streaming)
Endpoint: POST /api/openai/v1/responses
Use this endpoint for the OpenAI "Responses" API style. It supports both non‑streaming JSON responses and SSE streaming with official Responses events. Authentication is via Authorization: Bearer $API_KEY.
Request schema
Required:
- model (string) – one of the supported OpenAI-compatible model ids.
Input options (choose one of the following ways to provide input):
- input (string | array | object with content) – direct content. If a string, it is treated as a user message. If an array, it is an array of content parts.
- messages (array) – OpenAI Chat-style messages. Content arrays are normalized (e.g., input_text is coerced to text).
- instructions (string) – optional system/instructional text. When combined with input, it is sent as a system message.
- system (string) – optional system prompt (alternative to instructions).
Tools:
- tools (array) – OpenAI function tools list.
- tool_choice (string | object) – e.g., "auto", or { type: "function", function: { name: "..." } }.
Generation controls:
- temperature (number) – 0..2
- top_p (number) – 0..1
- max_output_tokens (integer) – maximum tokens for output
- response_format (object) – { type: "text" | "json_object" | "json_schema", json_schema? }
- reasoning_effort (string) – low | medium | high (where applicable)
- mcp_servers (array) – MCP server configurations, if used
- stream (boolean) – when true, returns an SSE stream of official Responses events
Notes:
- Unknown top-level parameters are rejected with HTTP 400.
- Payloads larger than allowed size return HTTP 413.
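As a rough illustration of the schema above, the following Python sketch assembles a request body and rejects unknown top-level parameters before anything is sent. The helper name build_request and the local check are assumptions made for this example; the server remains the authority on what is accepted, and the rule that either input or messages must be present is inferred from the "choose one" wording rather than stated outright.

```python
import json

# Top-level parameters listed in the request schema above.
KNOWN_PARAMS = {
    "model", "input", "messages", "instructions", "system",
    "tools", "tool_choice", "temperature", "top_p",
    "max_output_tokens", "response_format", "reasoning_effort",
    "mcp_servers", "stream",
}


def build_request(model: str, **params) -> str:
    """Return a JSON request body, failing early on unknown parameters.

    Hypothetical client-side helper: it mirrors the documented HTTP 400
    rule locally so mistakes surface before a network round trip.
    """
    unknown = set(params) - KNOWN_PARAMS
    if unknown:
        raise ValueError(f"unknown top-level parameters: {sorted(unknown)}")
    # Assumption: at least one of the two input styles must be supplied.
    if "input" not in params and "messages" not in params:
        raise ValueError("provide either 'input' or 'messages'")
    return json.dumps({"model": model, **params})


body = build_request(
    "openai:gpt-4o-2024-11-20",
    input="Say hello",
    instructions="Answer in one short sentence.",
    temperature=0.2,
    max_output_tokens=64,
)
print(body)
```

Checking parameter names locally is a convenience only; it does not validate value ranges or payload size.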
Non‑streaming response shape
When stream: false (default), the response is a JSON object. If tools are invoked, you may receive a JSON object with status: "requires_action" and required_action.submit_tool_outputs.tool_calls to execute.
Example (completed without tools):
{
  "id": "resp_...",
  "object": "response",
  "model": "openai:gpt-4o-2024-11-20",
  "created_at": 1739123456,
  "status": "completed",
  "output": [{ "type": "output_text", "text": "Hello!" }],
  "output_text": "Hello!",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 10,
    "total_tokens": 22
  }
}
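A completed response like the one above could be read as follows. This is a minimal sketch: extract_text is a hypothetical helper, and the fallback that joins output_text parts from output assumes the part shape shown in the example.

```python
import json

# The completed example from above, with the elided id kept as-is.
raw = """
{
  "id": "resp_...",
  "object": "response",
  "model": "openai:gpt-4o-2024-11-20",
  "created_at": 1739123456,
  "status": "completed",
  "output": [{ "type": "output_text", "text": "Hello!" }],
  "output_text": "Hello!",
  "usage": { "input_tokens": 12, "output_tokens": 10, "total_tokens": 22 }
}
"""


def extract_text(response: dict) -> str:
    """Prefer top-level output_text; otherwise join output_text parts."""
    if isinstance(response.get("output_text"), str):
        return response["output_text"]
    parts = response.get("output", [])
    return "".join(p.get("text", "") for p in parts if p.get("type") == "output_text")


response = json.loads(raw)
text = extract_text(response)
usage = response.get("usage", {})
print(text, usage.get("total_tokens"))
```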
Example (tools required):
{
  "id": "resp_...",
  "object": "response",
  "model": "openai:gpt-4o-2024-11-20",
  "created_at": 1739123456,
  "status": "requires_action",
  "required_action": {
    "type": "submit_tool_outputs",
    "submit_tool_outputs": {
      "tool_calls": [
        {
          "id": "call_abc123",
          "type": "function",
          "function": {
            "name": "get_weather",
            "arguments": "{\n \"city\": \"Boston\"\n}"
          }
        }
      ]
    }
  },
  "usage": {
    "input_tokens": 20,
    "output_tokens": 5,
    "total_tokens": 25
  }
}
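When the status is requires_action, the client runs the listed tool calls itself. The sketch below shows one way to dispatch them in Python. The local get_weather function, the TOOLS registry, and the shape of the returned result records (tool_call_id, output) are all assumptions for illustration; this page does not specify how tool outputs are sent back.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"Weather for {city}: unavailable in this sketch"


# Hypothetical registry mapping function names to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(response: dict) -> list:
    """Execute each requested tool call and collect its output."""
    if response.get("status") != "requires_action":
        return []
    calls = response["required_action"]["submit_tool_outputs"]["tool_calls"]
    results = []
    for call in calls:
        fn = call["function"]
        args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
        output = TOOLS[fn["name"]](**args)
        results.append({"tool_call_id": call["id"], "output": output})
    return results


response = {
    "status": "requires_action",
    "required_action": {
        "type": "submit_tool_outputs",
        "submit_tool_outputs": {
            "tool_calls": [
                {
                    "id": "call_abc123",
                    "type": "function",
                    "function": {
                        "name": "get_weather",
                        "arguments": '{\n "city": "Boston"\n}',
                    },
                }
            ]
        },
    },
}
results = run_tool_calls(response)
print(results)
```

Note that function.arguments is a string containing JSON, so it needs its own json.loads call after the outer response has been parsed.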
Streaming (SSE) events
When stream: true, the connection emits official Responses events only. Expect the following sequence as applicable:
- response.created
- response.in_progress
- response.output_item.added (for a message item)
- response.content_part.added (initial output_text part)
- response.output_text.delta (emitted repeatedly with text deltas)
- response.function_call_arguments.delta (during tool arg streaming)
- response.function_call_arguments.done (final tool args)
- response.output_text.done (final text)
- response.content_part.done
- response.output_item.done
- response.requires_action (if tools were called) OR response.completed (when finished)
The final event carries the response with status completed or requires_action and may include a usage object containing token counts and, when available, provider usage details.
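The event sequence above can be consumed with a small SSE reader. The following sketch parses event:/data: lines and accumulates text deltas. It assumes each data payload is a JSON object, that text deltas carry a delta field, and that the final event nests the response under a response key, as in OpenAI's published Responses events; those field names are not confirmed by this page. The sample lines are invented for the demonstration.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs; a blank line terminates an event."""
    event, data = None, []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            if data:
                payload = json.loads("\n".join(data))
                # Fall back to the payload's "type" if no event: line was sent.
                yield (event or payload.get("type", "message")), payload
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip())


def collect_text(events: Iterable[Tuple[str, dict]]) -> Tuple[str, dict]:
    """Join text deltas and capture the response from the final event."""
    chunks, final = [], {}
    for name, payload in events:
        if name == "response.output_text.delta":
            chunks.append(payload.get("delta", ""))
        elif name in ("response.completed", "response.requires_action"):
            final = payload.get("response", {})
    return "".join(chunks), final


# Invented sample stream, for illustration only.
sample = [
    "event: response.created",
    'data: {"type": "response.created"}',
    "",
    "event: response.output_text.delta",
    'data: {"type": "response.output_text.delta", "delta": "Hel"}',
    "",
    "event: response.output_text.delta",
    'data: {"type": "response.output_text.delta", "delta": "lo!"}',
    "",
    "event: response.completed",
    'data: {"type": "response.completed", "response": {"status": "completed", "usage": {"total_tokens": 22}}}',
    "",
]
text, final = collect_text(iter_sse_events(sample))
print(text, final.get("status"))
```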
Example curl:
curl -N https://api.kushrouter.com/api/openai/v1/responses \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai:gpt-4o-2024-11-20",
    "input": "Say hello",
    "stream": true
  }'
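For readers working in Python, a rough equivalent of the curl call using only the standard library might look like this. The URL, headers, and body come from the curl example; the helper names make_request and stream_lines are made up for this sketch, and the "test-key" fallback exists only so the snippet runs without credentials. The network call is wrapped in a function that the snippet does not invoke.

```python
import json
import os
import urllib.request

ENDPOINT = "https://api.kushrouter.com/api/openai/v1/responses"


def make_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def stream_lines(req: urllib.request.Request):
    """Yield decoded SSE lines as they arrive (performs a network call)."""
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            yield raw.decode("utf-8")


req = make_request(
    os.environ.get("API_KEY", "test-key"),  # placeholder key if unset
    {"model": "openai:gpt-4o-2024-11-20", "input": "Say hello", "stream": True},
)
print(req.get_method(), req.full_url)
# To stream for real: for line in stream_lines(req): print(line, end="")
```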
Errors
- 400 – invalid JSON or unknown/invalid parameters
- 401 – missing/invalid API key
- 413 – payload too large
- 429 – rate limit exceeded (may be returned before a stream starts)
- 500 – internal error
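One possible client-side policy for these status codes is sketched below. Which codes are worth retrying is a judgment call made for this example, not something this page prescribes: 429 and 500 are treated as transient here, while 400, 401, and 413 are treated as needing a changed request. The function names and default limits are illustrative.

```python
import random

# Meanings taken from the error list above.
ERROR_MEANINGS = {
    400: "invalid JSON or unknown/invalid parameters",
    401: "missing/invalid API key",
    413: "payload too large",
    429: "rate limit exceeded",
    500: "internal error",
}

# Assumption for this sketch: only these are worth retrying unchanged.
RETRYABLE = {429, 500}


def should_retry(status: int, attempt: int, max_attempts: int = 4) -> bool:
    """Retry transient errors until the attempt budget is used up."""
    return status in RETRYABLE and attempt < max_attempts


def backoff_seconds(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff with full jitter, capped at `cap` seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


for status in (400, 429):
    action = "retry" if should_retry(status, attempt=0) else "fix request"
    print(status, ERROR_MEANINGS[status], "->", action)
```

Because a 429 may arrive before a stream starts, this check belongs on the initial HTTP status, before any SSE parsing begins.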
Tool calling notes
- Tool calls stream incrementally as response.function_call_arguments.delta and finalize with response.function_call_arguments.done.
- When a provider only supplies final arguments at the end, you still receive a final ...arguments.done event with complete arguments.
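Both cases in the notes above (incremental deltas, or a single final done event) can be handled by one accumulator. The sketch below is a hypothetical client-side class; the payload fields item_id, delta, and arguments follow OpenAI's published event shapes and are assumed, not confirmed, for this endpoint.

```python
import json


class ToolArgsAccumulator:
    """Collect streamed tool-call arguments, keyed by item id."""

    def __init__(self):
        self._buffers = {}   # item id -> partial argument string
        self.completed = {}  # item id -> parsed argument dict

    def handle(self, name: str, payload: dict) -> None:
        key = payload.get("item_id", "default")
        if name == "response.function_call_arguments.delta":
            self._buffers[key] = self._buffers.get(key, "") + payload.get("delta", "")
        elif name == "response.function_call_arguments.done":
            # Prefer the complete string on the done event; otherwise fall
            # back to whatever the deltas accumulated.
            raw = payload.get("arguments") or self._buffers.get(key, "")
            self._buffers.pop(key, None)
            self.completed[key] = json.loads(raw) if raw else {}


acc = ToolArgsAccumulator()
# Case 1: arguments arrive incrementally, then a done event.
acc.handle("response.function_call_arguments.delta", {"item_id": "a", "delta": '{"city": '})
acc.handle("response.function_call_arguments.delta", {"item_id": "a", "delta": '"Boston"}'})
acc.handle("response.function_call_arguments.done", {"item_id": "a"})
# Case 2: the provider sends only the final arguments.
acc.handle("response.function_call_arguments.done", {"item_id": "b", "arguments": '{"city": "Paris"}'})
print(acc.completed)
```

Parsing only on the done event avoids calling json.loads on a partial argument string, which would fail mid-stream.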