Raven provides a fully OpenAI-compatible API endpoint at /v1/chat/completions. This means you can switch to Raven by changing just two configuration values — your base URL and API key. No other code changes are required.

Drop-in Replacement

Before (Direct OpenAI)

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-..."
});
```

After (Through Raven)

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "rk_live_...",                // Your Raven virtual key
  baseURL: "http://localhost:4000/v1"   // Raven endpoint
});
```
Everything else stays exactly the same. client.chat.completions.create(), streaming, function calling — all work identically.

The /v1/chat/completions Endpoint

Raven’s primary proxy endpoint is fully compatible with OpenAI’s Chat Completions API:
```
POST http://localhost:4000/v1/chat/completions
```
When you send a request, Raven:
  1. Authenticates your virtual key
  2. Resolves the model field to the correct provider
  3. Translates the request format if needed (e.g., OpenAI format to Anthropic's native format)
  4. Forwards the request to the upstream provider
  5. Normalizes the response back to OpenAI format
  6. Returns it to your application
This means you can use model: "claude-sonnet-4-20250514" with the OpenAI SDK and Raven handles the translation automatically.
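As a sketch of what that flow looks like from the client's side, here is the same call made with plain fetch (built into Node 18+), no SDK required. The virtual key and port are the placeholder values used elsewhere on this page:

```typescript
// Raw POST to Raven's /v1/chat/completions (Node 18+ global fetch).
// The virtual key below is a placeholder.
const payload = {
  model: "claude-sonnet-4-20250514", // resolved to Anthropic by Raven
  messages: [{ role: "user", content: "Hello" }],
};

async function callRaven() {
  const res = await fetch("http://localhost:4000/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer rk_live_abc123def456",
    },
    body: JSON.stringify(payload),
  });
  return res.json(); // OpenAI-shaped response, regardless of provider
}
```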

Auto Model-to-Provider Resolution

Raven automatically resolves model names to the correct provider. You do not need to specify which provider to use — just set the model:
```typescript
// Routes to OpenAI
await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello" }]
});

// Routes to Anthropic
await client.chat.completions.create({
  model: "claude-sonnet-4-20250514",
  messages: [{ role: "user", content: "Hello" }]
});
```

Supported Parameters

All standard OpenAI Chat Completions parameters are supported:
| Parameter | Type | Description |
| --- | --- | --- |
| `model` | string | Model ID (required) |
| `messages` | array | Array of message objects (required) |
| `stream` | boolean | Enable SSE streaming |
| `temperature` | number | Sampling temperature (0–2) |
| `top_p` | number | Nucleus sampling parameter |
| `max_tokens` | integer | Maximum tokens to generate |
| `stop` | string or array | Stop sequences |
| `tools` | array | Function/tool definitions |
| `tool_choice` | string or object | Tool selection strategy |
| `response_format` | object | Response format (e.g., JSON mode) |
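A request body exercising several of these parameters at once might look like this (the values are illustrative):

```typescript
// Illustrative request body combining several supported parameters.
const request = {
  model: "gpt-4o",
  messages: [
    { role: "system", content: "Reply with a JSON object." },
    { role: "user", content: "List three primary colors." },
  ],
  temperature: 0.2,
  top_p: 0.9,
  max_tokens: 200,
  stop: ["\n\n"],
  response_format: { type: "json_object" }, // JSON mode
};
```

Passed to `client.chat.completions.create(request)`, this behaves the same whether the model resolves to OpenAI or another provider.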

Streaming

Streaming works identically to OpenAI. Set stream: true and consume SSE chunks:
```typescript
const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Write a haiku" }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
```
Raven normalizes streaming output from all providers to the OpenAI SSE format, so even Anthropic streams arrive in the same format. See Streaming for details.
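On the wire, each SSE event is a `data:` line carrying an OpenAI-shaped chunk, terminated by a `data: [DONE]` sentinel. A minimal parser for one such line (a sketch of the wire format, not Raven's internals) looks like:

```typescript
// Extract the delta text from one OpenAI-format SSE line.
// Returns "" for the terminal "data: [DONE]" sentinel or non-data lines.
function deltaFromSseLine(line: string): string {
  if (!line.startsWith("data: ") || line === "data: [DONE]") return "";
  const chunk = JSON.parse(line.slice("data: ".length));
  return chunk.choices?.[0]?.delta?.content ?? "";
}
```

Because Raven normalizes every provider to this format, the same parser works for OpenAI and Anthropic streams alike.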

Function Calling

Function/tool calling works exactly as it does with OpenAI:
```typescript
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "What's the weather in Paris?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get current weather for a city",
        parameters: {
          type: "object",
          properties: {
            city: { type: "string", description: "City name" }
          },
          required: ["city"]
        }
      }
    }
  ]
});
```
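When the model decides to call the tool, the response carries `tool_calls` instead of text; you execute the function yourself and send the result back as a `tool` message. A sketch of assembling that follow-up request (the helper and type names here are ours, not part of Raven):

```typescript
type ToolCall = { id: string; function: { name: string; arguments: string } };
type Message = {
  role: string;
  content: string | null;
  tool_calls?: ToolCall[];
  tool_call_id?: string;
};

// Append the assistant's tool_calls message plus one "tool" result message
// per call, producing the messages array for the follow-up request.
function withToolResults(
  messages: Message[],
  assistant: Message,
  results: Record<string, string>,
): Message[] {
  const toolMessages = (assistant.tool_calls ?? []).map((tc) => ({
    role: "tool",
    tool_call_id: tc.id,
    content: results[tc.function.name] ?? "",
  }));
  return [...messages, assistant, ...toolMessages];
}
```

The returned array is sent back through `client.chat.completions.create()` unchanged, exactly as with OpenAI directly.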

Compatible SDKs

Any SDK that supports a custom base URL works with Raven. Here is a complete example using the official OpenAI SDK:
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "rk_live_abc123def456",
  baseURL: "http://localhost:4000/v1"
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain quantum computing in one paragraph." }
  ],
  temperature: 0.7,
  max_tokens: 500
});

console.log(response.choices[0].message.content);
```

Using Non-OpenAI Models

The power of Raven is that you use the same OpenAI-compatible API to access any provider:
```typescript
// Access Anthropic models through the OpenAI SDK
const response = await client.chat.completions.create({
  model: "claude-sonnet-4-20250514",
  messages: [{ role: "user", content: "Hello" }]
});
```
Raven automatically resolves the model to the correct provider and translates the request format.

Feature Support

| Feature | Support |
| --- | --- |
| Chat completions | Full |
| Streaming | Full |
| Function/tool calling | Full |
| Vision (image inputs) | Full |
| System messages | Full |
| Temperature, top_p, etc. | Full |
| Stop sequences | Full |
| Max tokens | Full |
| Response format (JSON) | Full |
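Vision requests use the standard OpenAI multi-part content format, mixing text and image parts in a single user message. For example (the image URL is a placeholder):

```typescript
// Standard OpenAI vision message: mixed text and image_url content parts.
const visionRequest = {
  model: "gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What is in this image?" },
        { type: "image_url", image_url: { url: "https://example.com/cat.png" } },
      ],
    },
  ],
};
```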

Raven-Specific Response Headers

Raven adds informational headers to every response:
| Header | Description |
| --- | --- |
| `X-Raven-Provider` | The provider that handled the request |
| `X-Raven-Model` | The model that was used |
| `X-Raven-Latency-Ms` | Total request latency in milliseconds |
| `X-Guardrail-Warnings` | Guardrail warnings, if any were triggered |
These headers are additive and do not break OpenAI SDK compatibility.
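With plain fetch, these headers can be read directly off the Response object. A small sketch (the helper name is ours; the header names come from the table above):

```typescript
// Collect Raven's informational headers from a fetch Response's headers.
function ravenMeta(headers: Headers) {
  return {
    provider: headers.get("X-Raven-Provider"),
    model: headers.get("X-Raven-Model"),
    latencyMs: Number(headers.get("X-Raven-Latency-Ms") ?? NaN),
    guardrailWarnings: headers.get("X-Guardrail-Warnings"),
  };
}
```

After `const res = await fetch(...)`, call `ravenMeta(res.headers)` to log or record the routing metadata alongside the completion.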

Vercel AI SDK

You can also use Raven with the Vercel AI SDK for `generateText`, `streamText`, and `useChat`:
```typescript
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
import { generateText } from "ai";

const raven = createOpenAICompatible({
  name: "raven",
  apiKey: "rk_live_...",
  baseURL: "http://localhost:4000/v1",
});

const { text } = await generateText({
  model: raven("gpt-4o"),
  prompt: "Hello!",
});
```