The analytics dashboard gives you real-time visibility into your AI usage, costs, and performance across all providers and models.

Dashboard Overview

Navigate to Analytics in the dashboard to see a real-time view of your gateway activity.

Requests

Total request volume, success rate, and error breakdowns over time.

Costs

Spending trends by provider, model, team, and key.

Performance

Latency percentiles, cache hit rates, and throughput metrics.

Request Metrics

Every request through the gateway is tracked with detailed metrics.
  • Input tokens — Tokens in the prompt
  • Output tokens — Tokens in the response
  • Reasoning tokens — Extended thinking tokens (e.g., Claude)
  • Cached tokens — Provider-level cached tokens
  • Cost — Calculated cost based on model pricing
  • Latency — End-to-end response time in milliseconds
  • Status — HTTP status code
  • Cache hit — Whether the response was served from Raven’s cache
  • Tool calls — Number of function/tool calls in the request
  • Images — Number of images in the request
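To make the shape of a tracked request concrete, here is a minimal sketch of a per-request record as a data class. The field names mirror the metrics above but are illustrative, not Raven’s actual schema.

```python
from dataclasses import dataclass

# Illustrative record for one gateway request; field names are assumptions
# based on the metrics listed above, not Raven's actual schema.
@dataclass
class RequestMetrics:
    input_tokens: int
    output_tokens: int
    reasoning_tokens: int
    cached_tokens: int
    cost_usd: float
    latency_ms: float
    status: int
    cache_hit: bool
    tool_calls: int
    images: int

    @property
    def total_tokens(self) -> int:
        # Total of all generated/consumed tokens for this request.
        return self.input_tokens + self.output_tokens + self.reasoning_tokens

m = RequestMetrics(1200, 340, 0, 800, 0.0042, 912.5, 200, False, 1, 0)
print(m.total_tokens)  # 1540
```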

Token Tracking

Raven breaks down token usage into granular categories:
  • Input tokens — What you send to the model
  • Output tokens — What the model generates
  • Reasoning tokens — Internal thinking tokens (for models like Claude with extended thinking)
  • Cached tokens — Tokens served from provider-level caching (e.g., OpenAI prompt caching)
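These categories matter because they are typically billed at different rates. A rough sketch of how a per-request cost could be derived from the breakdown, using made-up per-million-token rates (real provider pricing differs; cached input is often discounted):

```python
# Illustrative $/1M-token rates only -- not real provider pricing.
RATES = {"input": 3.00, "output": 15.00, "reasoning": 15.00, "cached": 0.30}

def request_cost(input_t, output_t, reasoning_t, cached_t):
    # Cached tokens are a subset of input tokens billed at a discounted rate.
    billable_input = input_t - cached_t
    return (billable_input * RATES["input"]
            + output_t * RATES["output"]
            + reasoning_t * RATES["reasoning"]
            + cached_t * RATES["cached"]) / 1_000_000

print(round(request_cost(10_000, 2_000, 0, 8_000), 6))  # 0.0384
```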

Cost Analytics

Analyze costs across multiple dimensions:
  • By provider — Compare spend across OpenAI, Anthropic, etc.
  • By model — Identify which models cost the most
  • By virtual key — Track spend per API key
  • By team — Attribute costs to specific teams
  • By time — View daily, weekly, or monthly spending trends
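Conceptually, each of these views is the same aggregation over per-request cost records, grouped by a different field. A minimal sketch (the record shape is hypothetical):

```python
from collections import defaultdict

# Hypothetical per-request cost records; any listed field can be a grouping key.
requests = [
    {"provider": "openai", "model": "gpt-4o", "team": "search", "cost": 0.012},
    {"provider": "anthropic", "model": "claude-sonnet", "team": "search", "cost": 0.020},
    {"provider": "openai", "model": "gpt-4o", "team": "support", "cost": 0.008},
]

def spend_by(dim, rows):
    # Sum cost across rows, grouped by the chosen dimension.
    totals = defaultdict(float)
    for r in rows:
        totals[r[dim]] += r["cost"]
    return dict(totals)

print(spend_by("provider", requests))
print(spend_by("team", requests))
```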

Cache Hit Rates

Monitor how effectively the cache is reducing costs and latency:
  • Hit rate — Percentage of requests served from cache
  • Cost savings — Estimated spend avoided by cache hits
  • Latency improvement — Average response time reduction
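The three cache metrics above can be derived directly from request records. A hedged sketch, assuming a hypothetical record shape where `would_cost` is the estimated cost had the request missed the cache:

```python
# Hypothetical request records; "would_cost" is the estimated cost of a miss.
requests = [
    {"cache_hit": True,  "cost": 0.0,   "latency_ms": 45,  "would_cost": 0.004},
    {"cache_hit": False, "cost": 0.004, "latency_ms": 900, "would_cost": 0.004},
    {"cache_hit": True,  "cost": 0.0,   "latency_ms": 50,  "would_cost": 0.006},
]

hits = [r for r in requests if r["cache_hit"]]
misses = [r for r in requests if not r["cache_hit"]]

hit_rate = len(hits) / len(requests)                 # fraction served from cache
cost_savings = sum(r["would_cost"] for r in hits)    # spend avoided by hits
latency_gain = (sum(r["latency_ms"] for r in misses) / len(misses)
                - sum(r["latency_ms"] for r in hits) / len(hits))

print(f"hit rate {hit_rate:.0%}, saved ${cost_savings:.3f}, {latency_gain:.1f} ms faster")
```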

Latency Monitoring

Track response times across your entire gateway:
  • Average latency — Mean response time
  • P50 / P95 / P99 — Percentile latency breakdowns
  • Time to first token — Streaming latency measurement
  • By provider — Compare latency across providers
  • By model — Identify slow models
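For reference, P50/P95/P99 are order statistics over the recorded latencies. A small sketch using the nearest-rank method (one common convention; other percentile definitions interpolate):

```python
import math

def percentile(samples, p):
    # Nearest-rank percentile: the smallest sample such that at least
    # p% of all samples are <= it.
    s = sorted(samples)
    k = math.ceil(p / 100 * len(s)) - 1
    return s[k]

latencies_ms = [120, 340, 95, 880, 210, 150, 2400, 180, 300, 130]
for p in (50, 95, 99):
    print(f"P{p}: {percentile(latencies_ms, p)} ms")
```

Note how a single slow outlier (2400 ms) dominates P95/P99 while barely moving the mean, which is why percentile breakdowns are tracked alongside average latency.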

Tool Use Tracking

For requests that include function calling or tool use:
  • Tool call count — How many tools were invoked per request
  • Tool types — Which tools are being called most frequently
  • Tool latency — Time spent in tool execution
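Tool-call counts and type frequencies are simple tallies over the request log. An illustrative sketch (the log shape is an assumption):

```python
from collections import Counter

# Hypothetical request log entries, each listing the tools invoked.
logs = [
    {"tools": ["web_search", "calculator"]},
    {"tools": ["web_search"]},
    {"tools": []},
]

# Which tools are called most often, and the average calls per request.
tool_counts = Counter(t for r in logs for t in r["tools"])
calls_per_request = sum(len(r["tools"]) for r in logs) / len(logs)

print(tool_counts.most_common(1))  # [('web_search', 2)]
print(calls_per_request)           # 1.0
```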

Session Grouping

Requests can be grouped by session to track multi-turn conversations:
  • View all requests in a conversation as a single unit
  • Track total cost and token usage per session
  • Identify long-running or expensive sessions
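Session rollups are an aggregation keyed on a session identifier. A minimal sketch, assuming a hypothetical `session_id` field on each request record:

```python
from collections import defaultdict

# Hypothetical request records tagged with a session identifier.
requests = [
    {"session_id": "s1", "cost": 0.01,  "tokens": 500},
    {"session_id": "s1", "cost": 0.02,  "tokens": 900},
    {"session_id": "s2", "cost": 0.005, "tokens": 200},
]

# Roll up cost, tokens, and turn count per conversation.
sessions = defaultdict(lambda: {"cost": 0.0, "tokens": 0, "turns": 0})
for r in requests:
    s = sessions[r["session_id"]]
    s["cost"] += r["cost"]
    s["tokens"] += r["tokens"]
    s["turns"] += 1

print(sessions["s1"])
```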

Request Logs

Navigate to Requests in the dashboard for a detailed log of every request:
  • Full request and response payloads
  • Token counts and cost breakdown
  • Latency timing
  • Provider and model used
  • Cache hit/miss status
  • Guardrail triggers and policy evaluations
Filter and search through request logs to debug specific issues or investigate anomalies.

Live Request Streaming

The analytics dashboard includes a live view that shows requests flowing through the gateway in real time. Use it to monitor production traffic and spot issues as they happen.

Data Export

Export analytics data for external analysis via the API. The export endpoint returns JSON and supports pagination, so large datasets can be fetched page by page.
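The paging loop itself is straightforward. A hedged sketch, with the page-fetching function abstracted away; the response fields (`data`, `next_cursor`) are assumptions, not Raven’s documented API:

```python
def export_all(fetch_page):
    """Collect every row from a paginated export.

    fetch_page(cursor) -> {"data": [...], "next_cursor": str | None}
    (the response shape is an assumption for illustration).
    """
    cursor, rows = None, []
    while True:
        page = fetch_page(cursor)
        rows.extend(page["data"])
        cursor = page.get("next_cursor")
        if not cursor:  # no cursor means this was the last page
            return rows

# Fake two-page dataset standing in for the HTTP call:
pages = {
    None: {"data": [1, 2], "next_cursor": "c1"},
    "c1": {"data": [3], "next_cursor": None},
}
print(export_all(lambda c: pages[c]))  # [1, 2, 3]
```

In a real client, `fetch_page` would issue the authenticated HTTP request and pass the cursor as a query parameter.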