How It Works
Each virtual key can have two independent rate limits:

| Limit | Window | Purpose |
|---|---|---|
| RPM (Requests Per Minute) | 60 seconds | Controls burst traffic |
| RPD (Requests Per Day) | 86,400 seconds | Controls total daily usage |
If neither RPM nor RPD is set on a virtual key, rate limiting is skipped entirely for that key.
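The interaction between the two limits can be sketched as follows. This is a minimal in-memory illustration, not Raven's implementation: `WindowCounter` and `KeyLimiter` are hypothetical names, and the fixed-window counting is a simplifying assumption made for brevity. It shows the two behaviours described above: a request must pass every configured limit, and a key with no limits is never throttled.

```python
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WindowCounter:
    """Fixed-window request counter (a simplification for illustration)."""
    limit: int
    window_seconds: int
    count: int = 0
    window_start: Optional[float] = None

    def roll(self, now: float) -> None:
        # Reset the counter once the current window has elapsed.
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0


class KeyLimiter:
    """Hypothetical per-key limiter holding optional RPM and RPD counters."""

    def __init__(self, rpm: Optional[int] = None, rpd: Optional[int] = None):
        self.counters: List[WindowCounter] = []
        if rpm is not None:
            self.counters.append(WindowCounter(rpm, 60))       # 60-second window
        if rpd is not None:
            self.counters.append(WindowCounter(rpd, 86_400))   # 86,400-second window

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Neither RPM nor RPD set: rate limiting is skipped for this key.
        if not self.counters:
            return True
        for counter in self.counters:
            counter.roll(now)
        # Reject if any configured limit is already exhausted.
        if any(c.count >= c.limit for c in self.counters):
            return False
        # Count the request against every configured limit.
        for counter in self.counters:
            counter.count += 1
        return True
```

Checking both counters before incrementing either means a request rejected by the daily limit does not also use up per-minute capacity.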
Configuring Rate Limits
Set rate limits when creating or updating a virtual key in the dashboard or via the API.
API Configuration
Example Configurations
| Use Case | RPM | RPD | Rationale |
|---|---|---|---|
| Development key | 30 | 1,000 | Light usage for testing |
| Production key | 600 | 100,000 | Typical production workload |
| Batch processing | 120 | 500,000 | High daily volume, moderate burst |
| Internal tool | 10 | 500 | Low-frequency, cost-controlled |
Token Bucket via Redis
Rate limits are enforced using the token bucket algorithm, backed by Redis via the `rate-limiter-flexible` library. This provides:
- Distributed enforcement — Works across multiple Raven instances sharing the same Redis
- Sub-millisecond latency — Redis operations add minimal overhead to each request
- Atomic operations — No race conditions under high concurrency
- Automatic expiry — Counters reset naturally when the time window elapses
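For readers unfamiliar with the algorithm, here is a generic token bucket in plain Python. It is a single-process sketch for illustration only: Raven delegates the real work to Redis through `rate-limiter-flexible`, and the class name, parameters, and refill policy below are assumptions rather than Raven's internals.

```python
import time
from typing import Optional


class TokenBucket:
    """In-memory token bucket: holds up to `capacity` tokens, refilled continuously."""

    def __init__(self, capacity: int, refill_per_second: float, now: Optional[float] = None):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start with a full bucket
        self.updated_at = time.monotonic() if now is None else now

    def try_acquire(self, now: Optional[float] = None, cost: float = 1.0) -> bool:
        now = time.monotonic() if now is None else now
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(float(self.capacity), self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        # Spend tokens if enough are available; otherwise reject the request.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

This version is not safe across processes. In a distributed deployment, the refill and the spend have to happen as a single atomic operation in the shared store, which is the role Redis plays in the list above.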
How Counters Work
429 Responses
When a rate limit is exceeded, Raven returns a `429 Too Many Requests` response:
Rate Limits vs. Budgets
Rate limits and budgets serve different purposes:

| Feature | Rate Limits | Budgets |
|---|---|---|
| Unit | Request count | Dollar amount |
| Scope | Per virtual key | Org, team, or key |
| Window | Minute / day | Day / week / month |
| Purpose | Throughput control | Cost control |
Monitoring Rate Limits
Rate limit events are tracked in Raven’s observability layer:
- Prometheus: `raven_rate_limit_exceeded_total` counter with a `key_id` label
- Events: `key.rate_limited` events are emitted and available via the event stream
- Analytics: Rate-limited requests appear in the dashboard analytics with a `429` status code
Best Practices
Set both RPM and RPD
RPM alone does not prevent a key from being used all day at a moderate rate. Combine RPM for burst protection with RPD for daily caps.
Use separate keys for separate workloads
Give each application or team its own virtual key with appropriate limits. This prevents one workload from starving another.
Start conservative, increase as needed
Begin with lower limits and increase them based on observed usage patterns in the analytics dashboard.
Handle 429s gracefully in your application
Implement exponential backoff in your client code. The Raven SDK handles this automatically.
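If you are not using the SDK, a retry loop along these lines is one way to back off. It is a sketch under stated assumptions: `call_with_backoff` and its parameters are hypothetical, `send` stands in for whatever function issues your request and returns the HTTP status code, and the delay values are illustrative defaults rather than Raven recommendations.

```python
import random
import time
from typing import Callable


def call_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a status other than 429 or attempts run out."""
    status = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            return status
        # Exponential cap (base, 2x base, 4x base, ...) bounded by max_delay.
        cap = min(max_delay, base_delay * 2 ** (attempt - 1))
        # Full jitter: sleep a random time up to the cap so that many
        # clients throttled at once do not all retry at the same moment.
        sleep(random.uniform(0, cap))
        status = send()
    return status
```

The function returns the last status it saw, so a caller that still receives `429` after `max_attempts` tries can decide whether to queue the work, drop it, or surface an error.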