# Create Chat Completion
## Authentication
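No authentication example survives in this section. A minimal sketch, assuming the gateway accepts standard Bearer-token authentication in the `Authorization` header (the header scheme and key format are assumptions, not confirmed by this page):

```python
# Assumed: Bearer-token auth via the Authorization header.
API_KEY = "sk-raven-..."  # placeholder; substitute your real gateway key

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
```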
## Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID (e.g., `gpt-4o`, `claude-sonnet-4-20250514`) |
| `messages` | array | Yes | Array of message objects |
| `stream` | boolean | No | Enable SSE streaming (default: `false`) |
| `temperature` | number | No | Sampling temperature (0-2; default varies by model) |
| `top_p` | number | No | Nucleus sampling parameter (0-1) |
| `max_tokens` | integer | No | Maximum number of tokens to generate |
| `stop` | string or array | No | Up to 4 stop sequences |
| `tools` | array | No | Function/tool definitions |
| `tool_choice` | string or object | No | `auto`, `none`, `required`, or a specific function |
| `response_format` | object | No | Response format (e.g., `{"type": "json_object"}`) |
### Message Object

| Field | Type | Required | Description |
|---|---|---|---|
| `role` | string | Yes | One of `system`, `user`, `assistant`, or `tool` |
| `content` | string or array | Yes | Message content (text or multimodal parts) |
| `name` | string | No | Optional name for the message author |
| `tool_calls` | array | No | Tool calls (for assistant messages) |
| `tool_call_id` | string | No | Tool call ID (for tool result messages) |
### Content Parts (Multimodal)

When `content` is an array, each element is a content part:
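The content-part schema itself is not reproduced above. A sketch assuming the OpenAI-style `text` and `image_url` part types (an assumption; verify against the gateway's actual part table):

```python
# A user message mixing a text part and an image part (OpenAI-style
# part types assumed; URL and prompt are illustrative).
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
    ],
}
```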
### Tool Definition
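The tool-definition example is missing from this section. A sketch in the OpenAI-style shape, where `parameters` is a JSON Schema object (the schema style is assumed, and `get_weather` is a hypothetical function):

```python
# An OpenAI-style function tool definition. `get_weather` and its
# parameters are hypothetical; the page's own example is not shown.
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}
```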
## Examples

### Basic Request
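The request example itself is not included above. A self-contained sketch using Python's standard library; the gateway URL and API key are placeholders, not values from this page:

```python
import json
from urllib import request

# Hypothetical gateway URL and key -- placeholders, not from this page.
RAVEN_URL = "https://raven.example.com/v1/chat/completions"
API_KEY = "sk-raven-..."

payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "temperature": 0.7,
}

req = request.Request(
    RAVEN_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
# response = request.urlopen(req)  # network call omitted in this sketch
```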
### Response
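The response body is not shown in this section. A sketch of a successful non-streaming response in an OpenAI-compatible shape (assumed; all identifiers and token counts below are illustrative):

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
}
```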
### Streaming Request
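The streaming example is not reproduced above. A sketch of the request body with `stream` enabled, plus a consumer over illustrative SSE lines (the chunk shape and the `[DONE]` terminator follow the OpenAI convention, which this gateway is assumed to mirror):

```python
import json

# Request body enabling SSE streaming.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Count to three"}],
    "stream": True,
}

# Consuming sketch over illustrative chunk lines (values are made up;
# the [DONE] sentinel is the OpenAI convention, assumed here).
sample_lines = [
    'data: {"choices":[{"delta":{"content":"1"}}]}',
    'data: {"choices":[{"delta":{"content":" 2"}}]}',
    'data: [DONE]',
]

text = ""
for line in sample_lines:
    data = line[len("data: "):]
    if data == "[DONE]":
        break
    chunk = json.loads(data)
    text += chunk["choices"][0]["delta"].get("content", "")
```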
### Streaming Response Format

Each chunk follows the SSE format: a `data:` line carrying a JSON payload.

### Function Calling
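The function-calling example is missing from this section. A sketch of the round trip in the OpenAI-style schema (assumed): the assistant returns `tool_calls`, the client executes the function and replies with a `tool` message referencing the call ID. All values below are illustrative:

```python
import json

# Illustrative assistant message containing a tool call (OpenAI-style
# schema assumed; ids and arguments are made up).
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        }
    ],
}

call = assistant_message["tool_calls"][0]
args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string

# The tool's result goes back as a `tool` message referencing the call id.
tool_result = {
    "role": "tool",
    "tool_call_id": call["id"],
    "content": '{"temp_c": 18}',
}
```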
## Raven-Specific Response Headers

Every response from the chat completions endpoint includes these additional headers:

| Header | Example | Description |
|---|---|---|
| `X-Raven-Provider` | openai | Provider that handled the request |
| `X-Raven-Model` | gpt-4o | Model that was used |
| `X-Raven-Latency-Ms` | 450 | Total request latency in milliseconds |
| `X-Guardrail-Warnings` | PII detected in input | Guardrail warnings (only if triggered) |
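Reading these headers client-side depends on your HTTP library; a minimal sketch over a plain dict with the example values from the table above:

```python
# Header values copied from the table above; how you obtain the headers
# object depends on your HTTP client (a plain dict stands in here).
headers = {
    "X-Raven-Provider": "openai",
    "X-Raven-Model": "gpt-4o",
    "X-Raven-Latency-Ms": "450",
}

latency_ms = int(headers["X-Raven-Latency-Ms"])  # header values arrive as strings
if "X-Guardrail-Warnings" in headers:  # present only when a guardrail fires
    print(headers["X-Guardrail-Warnings"])
```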
## Error Responses

### Model Not Found

**400 Bad Request**
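The error body is not shown here. A plausible OpenAI-compatible shape (assumed; the model name, field names, and error code are illustrative):

```json
{
  "error": {
    "message": "Model 'gpt-5-ultra' not found",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
```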
### Rate Limited

**429 Too Many Requests**
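This page does not document a retry policy. A generic client-side sketch for handling 429s with exponential backoff (a common convention, not a documented Raven behavior; honor a `Retry-After` header first if the gateway sends one):

```python
# Exponential backoff delays for retrying 429 responses -- a generic
# client-side convention, not documented Raven behavior. Add jitter and
# prefer a Retry-After header when present.
def backoff_delays(base=1.0, factor=2.0, retries=4):
    return [base * factor**i for i in range(retries)]

delays = backoff_delays()  # seconds to wait before each retry attempt
```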
### Budget Exceeded

**402 Payment Required**
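The budget-exceeded body is likewise not shown. A sketch in the same assumed error envelope (all field values are illustrative; Raven's actual error code may differ):

```json
{
  "error": {
    "message": "Budget exceeded for this API key",
    "type": "invalid_request_error",
    "code": "budget_exceeded"
  }
}
```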