Create Chat Completion

POST /v1/chat/completions
Creates a chat completion using the specified model. This endpoint is fully compatible with the OpenAI Chat Completions API. Raven automatically resolves the model to the correct provider, translates the request, and returns the response in OpenAI format.

Authentication

Authorization: Bearer rk_live_...
Optionally pass your own provider key:
X-Provider-Key: sk-your-provider-key
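
As a minimal sketch of the two headers above, the helper below assembles them into a dictionary that an HTTP client could send. The function name `build_headers` and its parameter names are illustrative assumptions, not part of Raven's API.

```python
from typing import Dict, Optional


def build_headers(raven_key: str, provider_key: Optional[str] = None) -> Dict[str, str]:
    """Build headers for POST /v1/chat/completions (hypothetical helper)."""
    headers = {
        "Authorization": f"Bearer {raven_key}",
        "Content-Type": "application/json",
    }
    # X-Provider-Key is optional: include it only when passing your own provider key.
    if provider_key is not None:
        headers["X-Provider-Key"] = provider_key
    return headers
```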

Request Body

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (e.g., gpt-4o, claude-sonnet-4-20250514) |
| messages | array | Yes | Array of message objects |
| stream | boolean | No | Enable SSE streaming (default: false) |
| temperature | number | No | Sampling temperature (0-2, default varies by model) |
| top_p | number | No | Nucleus sampling parameter (0-1) |
| max_tokens | integer | No | Maximum tokens to generate |
| stop | string or array | No | Up to 4 stop sequences |
| tools | array | No | Function/tool definitions |
| tool_choice | string or object | No | auto, none, required, or specific function |
| response_format | object | No | Response format (e.g., {"type": "json_object"}) |
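
The ranges in the table can be checked on the client before a request is sent. The sketch below is one way to do that in Python. `build_request_body` is a hypothetical helper, and treating an empty messages list as an error is an assumption of this sketch. Whether Raven itself rejects out-of-range values is not stated here.

```python
from typing import Any, Dict, List, Optional, Union


def build_request_body(
    model: str,
    messages: List[Dict[str, Any]],
    *,
    stream: bool = False,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    max_tokens: Optional[int] = None,
    stop: Optional[Union[str, List[str]]] = None,
) -> Dict[str, Any]:
    """Assemble a request body, checking the ranges listed in the table."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("at most 4 stop sequences are allowed")

    body: Dict[str, Any] = {"model": model, "messages": messages}
    if stream:
        body["stream"] = True
    # Omit unset optional parameters so the model's own defaults apply.
    optional = {
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
        "stop": stop,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```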

Message Object

| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | system, user, assistant, or tool |
| content | string or array | Yes | Message content (text or multimodal parts) |
| name | string | No | Optional name for the message author |
| tool_calls | array | No | Tool calls (for assistant messages) |
| tool_call_id | string | No | Tool call ID (for tool result messages) |

Content Parts (Multimodal)

When content is an array, each element is a content part:
{
  "role": "user",
  "content": [
    { "type": "text", "text": "What is in this image?" },
    {
      "type": "image_url",
      "image_url": { "url": "https://example.com/photo.jpg" }
    }
  ]
}
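
A message like the one above can be produced programmatically. The sketch below builds a user message with one text part and one image part. `image_question` is a hypothetical helper name.

```python
from typing import Any, Dict


def image_question(text: str, image_url: str) -> Dict[str, Any]:
    """Build a user message whose content is a list of content parts."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }
```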

Tool Definition

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "parameters": {
      "type": "object",
      "properties": {
        "city": { "type": "string", "description": "City name" }
      },
      "required": ["city"]
    }
  }
}

Examples

Basic Request

curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer rk_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7,
    "max_tokens": 500
  }'

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1705312800,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}
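
To illustrate how a client might read this response, the sketch below pulls the reply text, finish reason, and total token count out of the parsed JSON. `extract_reply` is a hypothetical helper. It looks only at the first choice and falls back to 0 when no usage object is present.

```python
from typing import Any, Dict, Optional, Tuple


def extract_reply(response: Dict[str, Any]) -> Tuple[Optional[str], str, int]:
    """Return (content, finish_reason, total_tokens) from the first choice."""
    choice = response["choices"][0]
    # content may be None when the assistant returns tool calls instead of text.
    content = choice["message"].get("content")
    total_tokens = response.get("usage", {}).get("total_tokens", 0)
    return content, choice["finish_reason"], total_tokens
```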

Streaming Request

curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer rk_live_abc123..." \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'

Streaming Response Format

Each chunk follows the SSE format:
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1705312800,"model":"gpt-4o","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1705312800,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1705312800,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1705312800,"model":"gpt-4o","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":9,"completion_tokens":2,"total_tokens":11}}

data: [DONE]
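
A client reassembles the full reply by concatenating the delta content of each chunk until it sees the [DONE] sentinel. The sketch below does this for an iterable of already-decoded lines. `collect_stream` is a hypothetical helper, and it assumes a single choice per chunk and one JSON payload per data line.

```python
import json
from typing import Any, Dict, Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Accumulate streamed text from SSE lines; return (text, usage or None)."""
    parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not a data field.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                parts.append(text)
        # Usage, when present, is reported on the final chunk.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage
```

With an HTTP library that exposes the response body line by line, the same function can consume the live stream directly.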

Function Calling

curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer rk_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "What is the weather in Paris?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {"type": "string"}
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
Response with tool call:
{
  "id": "chatcmpl-def456",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\": \"Paris\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
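
When finish_reason is tool_calls, the client is expected to run the requested function and send the result back in a message with role tool and the matching tool_call_id. The sketch below shows one way to do that step. `TOOL_REGISTRY`, `run_tool_calls`, and the stub `get_weather` implementation are all hypothetical.

```python
import json
from typing import Any, Callable, Dict, List


def get_weather(city: str) -> str:
    # Stand-in implementation; a real one would query a weather service.
    return f"Weather lookup for {city} is not implemented in this sketch"


# Maps tool names from the request's tools array to local callables.
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def run_tool_calls(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one role="tool" message per tool call in the first choice."""
    message = response["choices"][0]["message"]
    results = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        # arguments arrives as a JSON-encoded string, so decode it first.
        arguments = json.loads(function["arguments"])
        output = TOOL_REGISTRY[function["name"]](**arguments)
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": output}
        )
    return results
```

In the usual OpenAI-style flow, the assistant message and these tool messages are appended to messages and sent in a follow-up request so the model can produce its final answer.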

Raven-Specific Response Headers

Every response from the chat completions endpoint includes these additional headers:
| Header | Example | Description |
|---|---|---|
| X-Raven-Provider | openai | Provider that handled the request |
| X-Raven-Model | gpt-4o | Model that was used |
| X-Raven-Latency-Ms | 450 | Total request latency in milliseconds |
| X-Guardrail-Warnings | PII detected in input | Guardrail warnings (only if triggered) |
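
These headers can be read from any HTTP client's response headers. The sketch below collects them into a dictionary and converts the latency to an integer. `read_raven_headers` is a hypothetical helper, and the keys of the returned dictionary are illustrative.

```python
from typing import Dict, Mapping, Union


def read_raven_headers(headers: Mapping[str, str]) -> Dict[str, Union[str, int, None]]:
    """Extract the Raven-specific headers; missing headers map to None."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    latency = lowered.get("x-raven-latency-ms")
    return {
        "provider": lowered.get("x-raven-provider"),
        "model": lowered.get("x-raven-model"),
        "latency_ms": int(latency) if latency is not None else None,
        # Present only when a guardrail was triggered.
        "guardrail_warnings": lowered.get("x-guardrail-warnings"),
    }
```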

Error Responses

Model Not Found

{
  "error": {
    "code": "MODEL_NOT_SUPPORTED",
    "message": "Model \"gpt-5-turbo\" is not supported in Raven. Please contact us at [email protected]."
  }
}
Status: 400 Bad Request

Rate Limited

{
  "error": {
    "message": "Rate limit exceeded (requests per minute)",
    "code": "RATE_LIMITED"
  }
}
Status: 429 Too Many Requests

Budget Exceeded

{
  "error": {
    "message": "Budget exceeded for this organization",
    "code": "BUDGET_EXCEEDED"
  }
}
Status: 402 Payment Required
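
All three errors share the same envelope, so a client can map them onto a single exception type. The sketch below is one possible shape. `RavenAPIError` and `error_from_response` are hypothetical names, and treating only 429 as retryable is an assumption of this sketch rather than documented Raven behavior.

```python
from typing import Any, Dict


class RavenAPIError(Exception):
    """Carries the HTTP status plus the code and message from the error body."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message

    @property
    def retryable(self) -> bool:
        # Assumption: only rate limiting (429) is worth retrying after a delay.
        return self.status == 429


def error_from_response(status: int, body: Dict[str, Any]) -> RavenAPIError:
    """Build an exception from a parsed error response body."""
    error = body.get("error", {})
    return RavenAPIError(
        status,
        error.get("code", "UNKNOWN"),
        error.get("message", ""),
    )
```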