## Dashboard Overview

Navigate to Analytics in the dashboard to see a real-time view of your gateway activity:

- **Requests** — Total request volume, success rate, and error breakdowns over time.
- **Costs** — Spending trends by provider, model, team, and key.
- **Performance** — Latency percentiles, cache hit rates, and throughput metrics.
## Request Metrics

Every request through the gateway is tracked with detailed metrics:

| Metric | Description |
|---|---|
| Input tokens | Tokens in the prompt |
| Output tokens | Tokens in the response |
| Reasoning tokens | Extended thinking tokens (e.g., Claude) |
| Cached tokens | Provider-level cached tokens |
| Cost | Calculated cost based on model pricing |
| Latency | End-to-end response time in milliseconds |
| Status | HTTP status code |
| Cache hit | Whether the response was served from Raven’s cache |
| Tool calls | Number of function/tool calls in the request |
| Images | Number of images in the request |
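As an illustration of the fields above, a per-request metrics record could look like the following. This is a sketch only; the field names are hypothetical and do not reflect Raven's actual storage schema.

```python
from dataclasses import dataclass

@dataclass
class RequestMetrics:
    """One gateway request, mirroring the metrics table above (hypothetical names)."""
    input_tokens: int
    output_tokens: int
    reasoning_tokens: int
    cached_tokens: int
    cost_usd: float
    latency_ms: int
    status: int
    cache_hit: bool
    tool_calls: int
    images: int

# Example: a successful, cache-missed request that made one tool call.
req = RequestMetrics(
    input_tokens=1200, output_tokens=350, reasoning_tokens=0,
    cached_tokens=400, cost_usd=0.0048, latency_ms=820,
    status=200, cache_hit=False, tool_calls=1, images=0,
)
```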
## Token Tracking

Raven breaks down token usage into granular categories:

- Input tokens — What you send to the model
- Output tokens — What the model generates
- Reasoning tokens — Internal thinking tokens (for models like Claude with extended thinking)
- Cached tokens — Tokens served from provider-level caching (e.g., OpenAI prompt caching)
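These categories matter for billing because provider-cached tokens are typically charged at a discount. A minimal sketch of the arithmetic, using placeholder prices (per million tokens) rather than any real model's pricing:

```python
def billable_cost(input_tokens, cached_tokens, output_tokens,
                  input_price, cached_price, output_price):
    """Cost in USD; prices are per million tokens. Cached tokens are the
    portion of the input billed at the provider's discounted cache rate.
    All rates here are illustrative placeholders, not real pricing."""
    uncached = input_tokens - cached_tokens
    return (uncached * input_price
            + cached_tokens * cached_price
            + output_tokens * output_price) / 1_000_000

# 1,000 input tokens, 400 of them served from the provider-level cache.
cost = billable_cost(1000, 400, 200,
                     input_price=3.00, cached_price=1.50, output_price=15.00)
```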
## Cost Analytics

Analyze costs across multiple dimensions:

| Dimension | What It Shows |
|---|---|
| By provider | Compare spend across OpenAI, Anthropic, etc. |
| By model | Identify which models cost the most |
| By virtual key | Track spend per API key |
| By team | Attribute costs to specific teams |
| By time | View daily, weekly, or monthly spending trends |
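The dashboard performs these rollups for you; conceptually, each dimension is just a group-by over flat request records. A sketch with hypothetical record fields:

```python
from collections import defaultdict

def spend_by(requests, dimension):
    """Sum cost over any dimension, e.g. 'provider', 'model', or 'team'."""
    totals = defaultdict(float)
    for r in requests:
        totals[r[dimension]] += r["cost_usd"]
    return dict(totals)

requests = [
    {"provider": "openai", "model": "gpt-4o", "team": "search", "cost_usd": 0.012},
    {"provider": "anthropic", "model": "claude-sonnet", "team": "search", "cost_usd": 0.020},
    {"provider": "openai", "model": "gpt-4o-mini", "team": "support", "cost_usd": 0.003},
]
by_provider = spend_by(requests, "provider")
```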
## Cache Hit Rates

Monitor how effectively the cache is reducing costs and latency:

- Hit rate — Percentage of requests served from cache
- Cost savings — Estimated spend avoided by cache hits
- Latency improvement — Average response time reduction
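One simple way to estimate these figures from raw request records (a sketch under the simplifying assumption that each cache hit avoids roughly the average cost and latency of a cache miss):

```python
def cache_summary(requests):
    """Return (hit_rate, estimated cost savings, average latency improvement)."""
    hits = [r for r in requests if r["cache_hit"]]
    misses = [r for r in requests if not r["cache_hit"]]
    hit_rate = len(hits) / len(requests)
    avg_miss_cost = sum(r["cost_usd"] for r in misses) / len(misses)
    avg_miss_latency = sum(r["latency_ms"] for r in misses) / len(misses)
    avg_hit_latency = sum(r["latency_ms"] for r in hits) / len(hits)
    est_savings = len(hits) * avg_miss_cost          # spend avoided by hits
    latency_gain = avg_miss_latency - avg_hit_latency  # ms saved per hit
    return hit_rate, est_savings, latency_gain

requests = [
    {"cache_hit": True,  "cost_usd": 0.0,  "latency_ms": 40},
    {"cache_hit": True,  "cost_usd": 0.0,  "latency_ms": 60},
    {"cache_hit": False, "cost_usd": 0.01, "latency_ms": 800},
    {"cache_hit": False, "cost_usd": 0.03, "latency_ms": 1200},
]
hit_rate, est_savings, latency_gain = cache_summary(requests)
```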
## Latency Monitoring

Track response times across your entire gateway:

- Average latency — Mean response time
- P50 / P95 / P99 — Percentile latency breakdowns
- Time to first token — Streaming latency measurement
- By provider — Compare latency across providers
- By model — Identify slow models
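Percentiles are used instead of plain averages because a few slow requests can dominate the mean. A minimal nearest-rank percentile sketch (Raven's exact computation may differ):

```python
import math

def percentile(latencies_ms, p):
    """Nearest-rank percentile for p in (0, 100] over a list of latencies."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

# One slow outlier (2400 ms) barely moves P50 but dominates P95/P99.
samples = [120, 95, 310, 101, 98, 2400, 105, 99, 130, 110]
p50 = percentile(samples, 50)  # → 105
p95 = percentile(samples, 95)  # → 2400
```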
## Tool Use Tracking

For requests that include function calling or tool use:

- Tool call count — How many tools were invoked per request
- Tool types — Which tools are being called most frequently
- Tool latency — Time spent in tool execution
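Per-request counts and most-frequent tools reduce to a simple tally over the tool names attached to each request. A sketch with a hypothetical record shape:

```python
from collections import Counter

def tool_usage(requests):
    """Return (tool calls per request, tool names ranked by frequency)."""
    per_request = [len(r["tool_calls"]) for r in requests]
    by_name = Counter(name for r in requests for name in r["tool_calls"])
    return per_request, by_name.most_common()

requests = [
    {"tool_calls": ["get_weather", "search_web"]},
    {"tool_calls": ["search_web"]},
    {"tool_calls": []},
]
counts, ranking = tool_usage(requests)
```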
## Session Grouping

Requests can be grouped by session to track multi-turn conversations:

- View all requests in a conversation as a single unit
- Track total cost and token usage per session
- Identify long-running or expensive sessions
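Conceptually, the rollup keys each request on its session identifier and accumulates cost, tokens, and request count. A sketch (field names hypothetical):

```python
from collections import defaultdict

def sessions_summary(requests):
    """Roll requests up by session_id: total cost, total tokens, request count."""
    sessions = defaultdict(lambda: {"cost_usd": 0.0, "tokens": 0, "requests": 0})
    for r in requests:
        s = sessions[r["session_id"]]
        s["cost_usd"] += r["cost_usd"]
        s["tokens"] += r["input_tokens"] + r["output_tokens"]
        s["requests"] += 1
    return dict(sessions)

requests = [
    {"session_id": "s1", "cost_usd": 0.01, "input_tokens": 500, "output_tokens": 200},
    {"session_id": "s1", "cost_usd": 0.02, "input_tokens": 900, "output_tokens": 400},
    {"session_id": "s2", "cost_usd": 0.005, "input_tokens": 100, "output_tokens": 50},
]
summary = sessions_summary(requests)
```

Sorting the summary by `cost_usd` then surfaces the most expensive sessions.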
## Request Logs

Navigate to Requests in the dashboard for a detailed log of every request:

- Full request and response payloads
- Token counts and cost breakdown
- Latency timing
- Provider and model used
- Cache hit/miss status
- Guardrail triggers and policy evaluations