Architecture Overview

Raven is an open-source AI gateway you self-host on your own infrastructure. It acts as an intelligent proxy between your application and LLM providers. Every request flows through a pipeline of modules that handle authentication, routing, guardrails, caching, and logging.
Your App → Raven Gateway → Provider (OpenAI, Anthropic, etc.)

    Auth → Rate Limit → Budget → Guardrails → Cache → Route → Forward → Analyze → Log

Organizations

An organization is your workspace in Raven. All resources — providers, keys, budgets, team members — belong to an organization.
  • Each user can belong to multiple organizations
  • Resources are isolated between organizations
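The two rules above can be sketched as a small data model. This is only an illustration under stated assumptions: the `Organization` class, its field names, and `get_resource` are hypothetical and are not taken from Raven's code.

```python
from dataclasses import dataclass, field


@dataclass
class Organization:
    """Hypothetical workspace record; field names are illustrative only."""
    org_id: str
    member_ids: set[str] = field(default_factory=set)
    # Resources (providers, keys, budgets) keyed by resource ID.
    resources: dict[str, str] = field(default_factory=dict)


def get_resource(orgs: dict[str, Organization], user_id: str,
                 org_id: str, resource_id: str) -> str:
    """Return a resource only if the user is a member of the owning org."""
    org = orgs[org_id]
    if user_id not in org.member_ids:
        raise PermissionError(f"{user_id} is not a member of {org_id}")
    # Lookups never cross organization boundaries: the resource must live
    # in this organization's own mapping, otherwise a KeyError is raised.
    return org.resources[resource_id]


# One user ("alice") belongs to two organizations; "bob" belongs to one.
orgs = {
    "acme": Organization("acme", {"alice"}, {"key_1": "virtual key"}),
    "globex": Organization("globex", {"alice", "bob"}, {"key_2": "virtual key"}),
}
```

Here `alice` can read resources in both organizations, while `bob` is rejected when asking for anything in `acme`, and `key_2` is not visible through `acme` at all.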

Virtual Keys

Virtual keys are API keys that your applications use to authenticate with Raven. They are not your provider API keys — they’re Raven-managed keys that map to your configured providers.
Feature       Description
Prefix        Keys start with rk_live_ or rk_test_
Rate Limits   Per-key RPM (requests per minute) and RPD (requests per day) limits
Environment   Separate live and test environments
Expiration    Optional expiration date
Budgets       Attach cost budgets to individual keys
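To make the prefix, expiration, and rate-limit features concrete, here is a minimal sketch of a per-key check. It assumes simple fixed-window counters; the `VirtualKey` class and its field names are hypothetical, and Raven's actual limiter may use a different algorithm.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class VirtualKey:
    """Illustrative record; field names are assumptions, not Raven's schema."""
    key: str
    rpm_limit: int                         # requests per minute
    rpd_limit: int                         # requests per day
    expires_at: Optional[datetime] = None  # None means the key never expires
    # Fixed-window counters: window label -> requests seen in that window.
    _counts: dict = field(default_factory=dict, repr=False)

    def check(self, now: datetime) -> str:
        """Return "ok", or the reason the request is rejected."""
        if not self.key.startswith(("rk_live_", "rk_test_")):
            return "invalid_prefix"
        if self.expires_at is not None and now >= self.expires_at:
            return "expired"
        minute = now.strftime("m:%Y-%m-%dT%H:%M")
        day = now.strftime("d:%Y-%m-%d")
        if self._counts.get(minute, 0) >= self.rpm_limit:
            return "rpm_exceeded"
        if self._counts.get(day, 0) >= self.rpd_limit:
            return "rpd_exceeded"
        # Count the request only once it has passed every check.
        self._counts[minute] = self._counts.get(minute, 0) + 1
        self._counts[day] = self._counts.get(day, 0) + 1
        return "ok"


key = VirtualKey("rk_test_example", rpm_limit=2, rpd_limit=3)
start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
results = [key.check(start) for _ in range(3)]  # ["ok", "ok", "rpm_exceeded"]
later = start + timedelta(minutes=1)            # a fresh minute window
```

The sketch never evicts old windows, so a real limiter would need expiry on its counters (for example, a store with TTLs). Rejected requests are deliberately not counted against the limits.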

Providers

A provider is a configured connection to an LLM service. You store your credentials once, and Raven handles routing requests to the right provider based on the model requested.
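As a rough illustration of routing by requested model, the sketch below matches the model name against prefixes registered for each provider. Prefix matching is an assumption made for this example; Raven may resolve models through its catalog or routing rules instead. The `Provider` class, `resolve_provider`, and the placeholder credentials are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    """A configured connection; the credential is stored once, here."""
    name: str
    api_key: str                      # placeholder, never a real secret
    model_prefixes: tuple[str, ...]   # model-name prefixes this provider serves


def resolve_provider(model: str, providers: list[Provider]) -> Provider:
    """Pick the first provider whose prefix matches the requested model."""
    for provider in providers:
        # str.startswith accepts a tuple and matches any of its entries.
        if model.startswith(provider.model_prefixes):
            return provider
    raise LookupError(f"no configured provider serves model {model!r}")


providers = [
    Provider("openai", "<openai-api-key>", ("gpt-",)),
    Provider("anthropic", "<anthropic-api-key>", ("claude-",)),
]
```

With this setup, the application only names a model in its request; the provider credential is looked up on the gateway side and never leaves it.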

Supported Providers

  • OpenAI
  • Anthropic

More providers are being added. See the Providers page for the latest list.

Models

Models are the LLMs available through your configured providers. Raven maintains a catalog of models with metadata, including:
  • Pricing — Input and output token costs
  • Context window — Maximum token capacity
  • Capabilities — Chat, function calling, vision, etc.
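A catalog entry with these three kinds of metadata might look like the sketch below, which also shows how pricing and context window could be used. The model name, prices, and field names are placeholders invented for illustration, not values from Raven's catalog.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ModelInfo:
    """Hypothetical catalog entry; all values below are placeholders."""
    provider: str
    input_price_per_mtok: Decimal    # USD per 1M input tokens
    output_price_per_mtok: Decimal   # USD per 1M output tokens
    context_window: int              # maximum tokens (prompt + completion)
    capabilities: frozenset[str]


CATALOG = {
    "example-model": ModelInfo(
        provider="example-provider",
        input_price_per_mtok=Decimal("3.00"),
        output_price_per_mtok=Decimal("15.00"),
        context_window=128_000,
        capabilities=frozenset({"chat", "function_calling"}),
    ),
}

MILLION = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in USD for one request, using exact decimal arithmetic."""
    info = CATALOG[model]
    return (info.input_price_per_mtok * input_tokens
            + info.output_price_per_mtok * output_tokens) / MILLION


def fits_context(model: str, prompt_tokens: int, max_output_tokens: int) -> bool:
    """True if the prompt plus the requested completion fits the window."""
    return prompt_tokens + max_output_tokens <= CATALOG[model].context_window
```

For example, 1,000 input tokens and 500 output tokens at the placeholder prices cost 0.003 + 0.0075 = 0.0105 USD. `Decimal` is used rather than `float` so that accumulated costs do not drift through rounding.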

Request Pipeline

Every request that hits the Raven gateway goes through this pipeline:
1. Authentication
   Validates the virtual key and resolves the organization.

2. Rate Limiting
   Checks RPM/RPD limits for the virtual key.

3. Budget Check
   Ensures the request won’t exceed any configured budgets.

4. Guardrails
   Runs content through configured guardrail rules (PII, topic blocking, regex).

5. Cache Check
   Checks cache for matching previous responses.

6. Routing
   Resolves the target provider and model based on routing rules.

7. Upstream Request
   Forwards the request to the resolved provider with retry and fallback logic.

8. Response Analysis
   Analyzes the response for token usage and cost calculation.

9. Logging
   Records the full request lifecycle for analytics and audit.
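The staged flow above can be sketched as an ordered list of functions, where any stage may short-circuit the request by returning a response. This is a toy model under stated assumptions, not Raven's implementation: only authentication, guardrails, cache check, forwarding, and logging are modeled (rate limiting, budgets, routing, and response analysis are omitted for brevity), the upstream call is replaced by an echo, and every name is hypothetical.

```python
import re
from typing import Callable, Optional

Stage = Callable[[dict], Optional[dict]]

KEYS = {"rk_test_demo": "org_demo"}             # virtual key -> organization
CACHE: dict[tuple[str, str], dict] = {}         # (model, prompt) -> response
BLOCKED = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # toy PII rule (US SSN shape)
LOG: list[dict] = []                            # one entry per request


def authenticate(ctx: dict) -> Optional[dict]:
    org = KEYS.get(ctx["key"])
    if org is None:
        return {"status": 401, "error": "invalid virtual key"}
    ctx["org"] = org  # later stages can rely on the resolved organization
    return None


def guardrails(ctx: dict) -> Optional[dict]:
    if BLOCKED.search(ctx["prompt"]):
        return {"status": 400, "error": "blocked by guardrail"}
    return None


def cache_check(ctx: dict) -> Optional[dict]:
    hit = CACHE.get((ctx["model"], ctx["prompt"]))
    if hit is not None:
        return {**hit, "cached": True}  # served without an upstream call
    return None


def forward(ctx: dict) -> Optional[dict]:
    # Stand-in for the upstream provider call: echoes the prompt back.
    response = {"status": 200, "text": f"echo: {ctx['prompt']}", "cached": False}
    CACHE[(ctx["model"], ctx["prompt"])] = response
    return response


PIPELINE: list[Stage] = [authenticate, guardrails, cache_check, forward]


def handle(key: str, model: str, prompt: str) -> dict:
    ctx = {"key": key, "model": model, "prompt": prompt}
    response: Optional[dict] = None
    for stage in PIPELINE:
        response = stage(ctx)
        if response is not None:  # a stage rejected or answered the request
            break
    assert response is not None   # the final stage always returns a response
    # Logging runs for every request, including rejected and cached ones.
    LOG.append({"org": ctx.get("org"), "model": model, "status": response["status"]})
    return response
```

Calling `handle` twice with the same key, model, and prompt returns a fresh response first and a cached one second; an unknown key stops at authentication, and a prompt matching the toy PII pattern stops at guardrails. Keeping the stages in a list makes their order explicit and easy to change.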

Environments

Raven supports two environments:
  • Live — Production traffic. Keys prefixed with rk_live_.
  • Test — Development and testing. Keys prefixed with rk_test_.
This lets you safely test configuration changes without affecting production traffic.
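Because the environment is encoded in the key prefix, it can be derived from the key alone. The sketch below shows one way to do that; the `Environment` enum and `environment_of` function are hypothetical names used for illustration.

```python
from enum import Enum


class Environment(Enum):
    LIVE = "live"
    TEST = "test"


# The two prefixes documented above, mapped to their environments.
PREFIXES = {
    "rk_live_": Environment.LIVE,
    "rk_test_": Environment.TEST,
}


def environment_of(key: str) -> Environment:
    """Derive the environment from a virtual key's prefix."""
    for prefix, env in PREFIXES.items():
        if key.startswith(prefix):
            return env
    raise ValueError("unknown key prefix: expected rk_live_ or rk_test_")
```

A guard such as `environment_of(key) is Environment.TEST` in a test suite helps prevent a production key from being used by accident.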