Key Takeaways
- OpenRouter supports over 200 AI models from 20+ providers as of Q3 2024
- OpenRouter routes to 15+ inference engines including vLLM and TensorRT-LLM
- 50+ open-source models available with fallbacks
- OpenRouter peaked at more than 500 million tokens processed in a single day in September 2024
- Daily API requests exceeded 10 million in August 2024
- Peak concurrent requests hit 50,000 per minute
- Average latency for GPT-4o on OpenRouter is 250ms
- P99 latency under 2 seconds for Claude 3.5 Sonnet
- Throughput of 1,200 tokens/second for Mixtral 8x22B
- OpenRouter has 150,000+ active monthly users
- 75% user retention rate month-over-month
- 1.2 million API keys issued since launch
- Up to 40% cost savings on Llama 3.1 via OpenRouter compared to going direct to providers
- OpenRouter generated $5M+ in provider payouts in 2024 YTD
- Average spend per user is $25/month
In short: OpenRouter serves 200+ models, processes up to 500M tokens per day at peak, and can cut costs by as much as 40%.
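Since the takeaways mention open-source models with fallbacks, here is a minimal sketch of how a fallback-aware request to OpenRouter's OpenAI-compatible chat completions endpoint can be assembled. The endpoint URL and the `models`/`messages` fields follow OpenRouter's public API; the specific model IDs are illustrative assumptions, and the actual HTTP call is left to your client of choice.

```python
# Sketch: build an OpenRouter chat request with model fallbacks.
# Field names per OpenRouter's OpenAI-compatible API; model IDs
# below are illustrative, not a recommendation.
import json

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_payload(prompt: str) -> dict:
    """Build a request body that falls back across open-source models."""
    return {
        # Models are tried in order; if the first provider fails or is
        # unavailable, OpenRouter routes the request to the next one.
        "models": [
            "meta-llama/llama-3.1-70b-instruct",
            "mistralai/mixtral-8x22b-instruct",
        ],
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_payload("Summarize the key takeaways in one sentence.")
# Serialize for sending with any HTTP client (requests, httpx, etc.),
# along with an Authorization: Bearer <OPENROUTER_API_KEY> header.
body = json.dumps(payload)
```

This fallback list is what the "50+ open-source models available with fallbacks" takeaway refers to: a single request can name several interchangeable models, and routing happens server-side.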