Core Concepts

Architecture Overview

Raven is an open-source AI gateway you self-host on your own infrastructure. It acts as an intelligent proxy between your application and LLM providers. Every request flows through a pipeline of modules that handle authentication, routing, guardrails, caching, and logging.

Your App → Raven Gateway → Provider (OpenAI, Anthropic, etc.)
              ↓
    Auth → Rate Limit → Budget → Guardrails → Cache → Route → Forward → Analyze → Log

Organizations

An organization is your workspace in Raven. All resources — providers, keys, budgets, team members — belong to an organization.

Each user can belong to multiple organizations
Resources are isolated between organizations

Virtual Keys

Virtual keys are API keys that your applications use to authenticate with Raven. They are not your provider API keys — they’re Raven-managed keys that map to your configured providers.

Feature	Description
Prefix	Keys start with `rk_live_` or `rk_test_`
Rate Limits	Per-key requests-per-minute (RPM) and requests-per-day (RPD) limits
Environment	Separate live and test environments
Expiration	Optional expiration date
Budgets	Attach cost budgets to individual keys
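
The per-key RPM and RPD limits can be pictured as two counters per key, one per minute and one per day. The following is a minimal fixed-window sketch in Python. The class name `KeyRateLimiter`, its parameters, and the fixed-window strategy are assumptions for illustration, not a description of how Raven enforces limits.

```python
from collections import defaultdict


class KeyRateLimiter:
    """Hypothetical fixed-window limiter: one counter per key per window."""

    def __init__(self, rpm: int, rpd: int):
        self.rpm = rpm  # maximum requests per minute
        self.rpd = rpd  # maximum requests per day
        # (key, window kind, window index) -> requests seen in that window
        self._counts = defaultdict(int)

    def allow(self, key: str, now: float) -> bool:
        """Return True and record the request if both limits have room."""
        minute = (key, "minute", int(now // 60))
        day = (key, "day", int(now // 86400))
        if self._counts[minute] >= self.rpm or self._counts[day] >= self.rpd:
            return False
        self._counts[minute] += 1
        self._counts[day] += 1
        return True
```

`now` is a Unix timestamp in seconds (for example `time.time()`). A production limiter would also evict old windows and share counters across gateway instances; both are left out here.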

Providers

A provider is a configured connection to an LLM service. You store your credentials once, and Raven handles routing requests to the right provider based on the model requested.
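
One way to picture model-based routing is a lookup from the requested model name to a configured provider. The sketch below assumes a simple name-prefix table; the table, the function name `resolve_provider`, and the provider identifiers are hypothetical and stand in for whatever routing rules you configure in Raven.

```python
# Hypothetical prefix table for illustration only.
MODEL_PREFIX_TO_PROVIDER = {
    "gpt-": "openai",
    "claude-": "anthropic",
}


def resolve_provider(model: str, configured: set[str]) -> str:
    """Pick the configured provider whose prefix matches the requested model."""
    for prefix, provider in MODEL_PREFIX_TO_PROVIDER.items():
        if model.startswith(prefix) and provider in configured:
            return provider
    raise LookupError(f"no configured provider for model {model!r}")
```

Here `configured` is the set of providers for which credentials have been stored; a request for a model whose provider is not configured fails instead of being forwarded.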

Supported Providers

OpenAI

Anthropic

More providers are being added. See the Providers page for the latest list.

Models

Models are the LLMs available through your configured providers. Raven maintains a catalog of models with metadata including:

Pricing — Input and output token costs
Context window — Maximum token capacity
Capabilities — Chat, function calling, vision, etc.
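
To make the metadata concrete, a catalog entry can be modeled as a small record, with request cost derived from the per-token prices. This is a sketch under assumptions: the field names, the `ModelEntry` and `request_cost` names, and the per-million-token pricing unit are illustrative and may not match Raven's catalog schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical catalog record; field names are illustrative."""
    name: str
    input_cost_per_mtok: float   # USD per 1M input tokens (assumed unit)
    output_cost_per_mtok: float  # USD per 1M output tokens (assumed unit)
    context_window: int          # maximum token capacity
    capabilities: frozenset[str]


def request_cost(model: ModelEntry, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request, given its token usage."""
    total = (input_tokens * model.input_cost_per_mtok
             + output_tokens * model.output_cost_per_mtok)
    return total / 1_000_000
```

With an entry priced at 2.0 and 8.0 USD per million input and output tokens, a request using 1,000 input and 500 output tokens costs about 0.006 USD.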

Request Pipeline

Every request that hits the Raven gateway goes through this pipeline:

Authentication

Validates the virtual key and resolves the organization.

Rate Limiting

Checks RPM/RPD limits for the virtual key.

Budget Check

Ensures the request won’t exceed any configured budgets.

Guardrails

Runs content through configured guardrail rules (PII, topic blocking, regex).

Cache Check

Checks the cache for a matching previous response.

Routing

Resolves the target provider and model based on routing rules.

Upstream Request

Forwards the request to the resolved provider with retry and fallback logic.

Response Analysis

Extracts token usage from the response and calculates the request cost.

Logging

Records the full request lifecycle for analytics and audit.
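
The stages above can be sketched as an ordered list of functions that share a per-request context, where any stage may reject the request. The Python below is a simplified illustration, not Raven's code: `Context`, `run_pipeline`, and the stage functions are hypothetical names, only three of the nine stages are shown, the upstream call is a stand-in, and stopping at the first rejection is an assumption.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Context:
    """Hypothetical per-request state passed from stage to stage."""
    key: str
    prompt: str
    response: Optional[str] = None
    rejected: Optional[str] = None
    trace: list[str] = field(default_factory=list)


Stage = Callable[[Context], None]


def authenticate(ctx: Context) -> None:
    # Accept only keys carrying one of the documented prefixes.
    if not ctx.key.startswith(("rk_live_", "rk_test_")):
        ctx.rejected = "invalid key"


def guardrails(ctx: Context) -> None:
    # Illustrative regex rule: block text that looks like a US SSN.
    if re.search(r"\b\d{3}-\d{2}-\d{4}\b", ctx.prompt):
        ctx.rejected = "guardrail: pii"


def upstream(ctx: Context) -> None:
    # Stand-in for forwarding the request to the resolved provider.
    ctx.response = f"echo: {ctx.prompt}"


def run_pipeline(ctx: Context, stages: list[Stage]) -> Context:
    """Run stages in order, recording each one and stopping on rejection."""
    for stage in stages:
        stage(ctx)
        ctx.trace.append(stage.__name__)
        if ctx.rejected:
            break
    return ctx
```

Calling `run_pipeline(Context("rk_test_x", "hello"), [authenticate, guardrails, upstream])` runs all three stages, while a prompt containing an SSN-shaped string stops at `guardrails` and never reaches `upstream`.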

Environments

Raven supports two environments:

Live — Production traffic. Keys prefixed with rk_live_.
Test — Development and testing. Keys prefixed with rk_test_.

This lets you safely test configuration changes without affecting production traffic.
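
Because the environment is encoded in the key prefix, client-side tooling can tell which environment a key targets before sending any traffic. A minimal sketch, assuming only the two documented prefixes; the `Environment` enum and `environment_for` function are hypothetical helpers, not part of Raven.

```python
from enum import Enum


class Environment(Enum):
    LIVE = "live"
    TEST = "test"


def environment_for(key: str) -> Environment:
    """Map a virtual key to its environment using the documented prefixes."""
    if key.startswith("rk_live_"):
        return Environment.LIVE
    if key.startswith("rk_test_"):
        return Environment.TEST
    raise ValueError("key does not start with rk_live_ or rk_test_")
```

A check like this can guard a test suite against accidentally running with a live key.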
