OpenRouter Performance Analytics

Comprehensive Speed & Efficiency Analysis

Generated on October 20, 2025 at 11:41:39

2,121 API calls analyzed

16 providers

30 models

Fastest Provider

0.234s

Crusoe

Median response time

Most Reliable

90.3%

Crusoe

Consistency score

Highest Throughput

58,955

Mistral

Tokens per second

Average Cost

$0.0002

Per 1K tokens

Total spent: $6.63

Key Insights

🚀 Speed Champion

Crusoe leads with a median response time of 0.234s, 4.3x faster than the average.
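The report does not say which average the 4.3x figure is measured against. As a rough sketch, assuming the factor is simply `average / fastest`, the implied average can be recovered from the two published numbers and compared with the unweighted mean of the 16 provider medians listed in the provider table. The function names are illustrative, not part of the report's code.

```python
from statistics import mean

# Assumption: "4.3x faster" means average_response / crusoe_median.
crusoe_median_sec = 0.234  # from the report
speedup_factor = 4.3       # from the report

# Median response times (seconds) of the 16 providers in the provider table.
provider_medians = [
    0.234, 0.389, 0.566, 0.726, 1.346, 2.642, 2.696, 3.469,
    3.737, 4.614, 5.424, 7.318, 7.963, 7.965, 11.305, 13.300,
]

def implied_average(fastest_sec: float, factor: float) -> float:
    """Average response time implied by a 'factor x faster' claim."""
    return fastest_sec * factor

def speedup(average_sec: float, fastest_sec: float) -> float:
    """How many times faster the fastest provider is than the average."""
    return average_sec / fastest_sec

avg = implied_average(crusoe_median_sec, speedup_factor)
mean_of_medians = mean(provider_medians)

print(f"Average implied by the 4.3x claim: {avg:.3f}s")
print(f"Unweighted mean of provider medians: {mean_of_medians:.3f}s")
print(f"Speedup against that mean: {speedup(mean_of_medians, crusoe_median_sec):.1f}x")
```

The two baselines differ considerably, so the report's "average" is evidently not the plain mean of the provider medians; which quantity it actually uses cannot be determined from the published figures.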

⚡ Throughput Leader

Mistral achieves the highest token generation rate at 58,955 tokens/second.
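The report does not state how "tokens per second" is averaged. One thing worth checking with a figure this high is whether it is a mean of per-request rates, since a single request with a near-zero duration can dominate such a mean, whereas dividing total tokens by total time is far less sensitive. The sketch below illustrates the difference with made-up numbers; the sample data and function names are hypothetical.

```python
from statistics import mean

# Hypothetical (tokens, generation_seconds) pairs; not taken from the report.
# The last request has a near-zero measured duration.
requests = [(500, 1.0), (600, 1.2), (400, 0.001)]

def mean_of_rates(samples):
    """Average the per-request tokens/sec values."""
    return mean(tokens / secs for tokens, secs in samples)

def rate_of_totals(samples):
    """Total tokens divided by total seconds."""
    total_tokens = sum(tokens for tokens, _ in samples)
    total_secs = sum(secs for _, secs in samples)
    return total_tokens / total_secs

print(f"Mean of per-request rates: {mean_of_rates(requests):,.0f} tokens/s")
print(f"Total tokens / total time: {rate_of_totals(requests):,.0f} tokens/s")
```

Whether this effect explains any value in the report is not known; it is only one plausible reason an averaged rate can look much larger than typical per-request throughput.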

🎯 Reliability Star

Crusoe shows the most consistent performance, with a 90.3% reliability score.

💰 Cost Efficiency

Average cost per 1K tokens is $0.0002, with total spending of $6.63 across all providers.
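Cost per 1K tokens is conventionally total cost divided by total tokens, multiplied by 1,000. The report does not publish the total token count, so the sketch below only illustrates the formula and, assuming the average is token-weighted, the range of token counts consistent with a value that rounds to $0.0002. Function names are illustrative.

```python
def cost_per_1k(total_cost_usd: float, total_tokens: float) -> float:
    """Token-weighted average cost per 1,000 tokens."""
    return total_cost_usd / total_tokens * 1000

def implied_tokens(total_cost_usd: float, cost_per_1k_usd: float) -> float:
    """Token count implied by a total cost and a per-1K-token price."""
    return total_cost_usd / cost_per_1k_usd * 1000

total_spent = 6.63  # from the report

# $0.0002 is rounded to four decimals, so the true value lies
# somewhere in [0.00015, 0.00025).
fewest = implied_tokens(total_spent, 0.00025)
most = implied_tokens(total_spent, 0.00015)

print(f"Implied total tokens: {fewest:,.0f} to {most:,.0f}")
print(f"Check: ${cost_per_1k(total_spent, 33_150_000):.4f} per 1K tokens")
```

If the report instead averages per-request costs, the implied token range does not apply.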

Fastest Providers (All 16)

Rank	Provider	Requests	Median Response	Avg Tokens/sec	Reliability	Performance
1	Crusoe	49	0.234s	5164	90.3%	Excellent
2	Anthropic	53	0.389s	1767	52.1%	Average
3	Google AI Studio	286	0.566s	4301	45.3%	Average
4	Novita	10	0.726s	979	30.9%	Average
5	Mistral	15	1.346s	58955	51.4%	Average
6	Amazon Bedrock	2	2.642s	1275	45.7%	Average
7	DeepInfra	40	2.696s	4014	14.7%	Average
8	Venice	10	3.469s	761	66.7%	Good
9	Parasail	3	3.737s	671	82.7%	Excellent
10	xAI	1,467	4.614s	2653	4.7%	Average
11	SiliconFlow	8	5.424s	16843	11.3%	Average
12	Google	14	7.318s	5008	7.4%	Average
13	Z.AI	7	7.963s	412	36.8%	Average
14	OpenAI	85	7.965s	601	13.6%	Average
15	Alibaba	52	11.305s	358	13.4%	Average
16	Liquid	20	13.300s	453	29.0%	Average
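A ranking like the one above can be produced by grouping per-request timings by provider, taking the median, and sorting ascending. The sketch below shows that with hypothetical records. The `performance_label` cut-offs (80 and 60) are an assumption chosen only because they are consistent with the labels in the table; the report does not state its actual thresholds.

```python
from collections import defaultdict
from statistics import median

# Hypothetical per-request records: (provider, response_time_sec).
records = [
    ("A", 0.2), ("A", 0.3),
    ("B", 1.0), ("B", 3.0), ("B", 2.0),
    ("C", 0.5),
]

def rank_by_median(rows):
    """Return (provider, request_count, median_sec), fastest first."""
    by_provider = defaultdict(list)
    for provider, secs in rows:
        by_provider[provider].append(secs)
    summary = [(p, len(times), median(times)) for p, times in by_provider.items()]
    return sorted(summary, key=lambda row: row[2])

def performance_label(reliability_pct: float) -> str:
    """Map a reliability score to a label (assumed cut-offs, see lead-in)."""
    if reliability_pct >= 80:
        return "Excellent"
    if reliability_pct >= 60:
        return "Good"
    return "Average"

for rank, (provider, count, med) in enumerate(rank_by_median(records), start=1):
    print(f"{rank}\t{provider}\t{count}\t{med:.3f}s")
```

Note that several rows in the table rest on very few requests (for example 2 or 3), so their medians and reliability scores are much less stable than those of heavily used providers.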

Top 20 Fastest Models

Rank	Model	Requests	Median Response	Avg Tokens/sec	Primary Provider
1	google/gemma-3-12b-it	71	0.212s	3776	Crusoe
2	anthropic/claude-3-5-haiku	50	0.381s	969	Anthropic
3	google/gemini-2.5-flash-lite-p...	175	0.520s	4025	Google AI Studio
4	google/gemini-2.5-flash	88	0.601s	5493	Google AI Studio
5	mistralai/codestral-2508	15	1.346s	58955	Mistral
6	qwen/qwen3-coder-30b-a3b-instr...	5	2.261s	32710	DeepInfra
7	openai/gpt-5-codex	33	2.724s	255	OpenAI
8	google/gemini-2.5-flash-previe...	23	3.026s	1837	Google AI Studio
9	venice/uncensored	10	3.469s	761	Venice
10	deepseek/deepseek-v3.1-terminu...	6	4.040s	16753	SiliconFlow
11	x-ai/grok-code-fast-1	1,374	4.279s	2572	xAI
12	microsoft/phi-4-reasoning-plus...	10	4.614s	452	DeepInfra
13	google/gemini-2.5-flash-image	13	7.559s	5277	Google
14	z-ai/glm-4-32b-0414	7	7.963s	412	Z.AI
15	deepseek/deepseek-chat-v3	10	10.918s	327	DeepInfra
16	qwen/qwen3-max	47	11.302s	370	Alibaba
17	x-ai/grok-4-fast	59	11.505s	5590	xAI
18	qwen/qwen-plus-2025-07-28	5	11.970s	243	Alibaba
19	liquid/lfm-7b	20	13.300s	453	Liquid
20	openai/codex-mini	6	13.457s	3658	OpenAI

Performance Statistics

📊 Response Time Range

Fastest: {provider_stats['fastest_response_sec'].min():.3f}s | Slowest: {provider_stats['slowest_response_sec'].max():.3f}s | Average: {df['response_time_sec'].mean():.3f}s

🔄 Token Processing

Total tokens processed: {total_tokens:,} | Average per request: {(total_tokens/len(df)):.0f} | Peak throughput: {provider_stats['max_tokens_per_sec'].max():.0f} tokens/s

📈 Usage Distribution

Most active provider: xAI (1,467 requests) | Average requests per provider: 133

⏱️ Time to First Token

Average TTFT: {df['time_to_first_sec'].mean():.3f}s | Fastest TTFT: {df['time_to_first_sec'].min():.3f}s | Provider with best TTFT: {provider_stats.nsmallest(1, 'avg_time_to_first_sec').index[0] if not provider_stats['avg_time_to_first_sec'].isna().all() else 'N/A'}
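Entries in this section that show literal Python format expressions instead of values look like what results when a template string is emitted without being evaluated as an f-string. A minimal sketch of rendering the response-time line correctly, using a plain list in place of the `df` and `provider_stats` objects the expressions refer to; the sample values are hypothetical and the function name is illustrative:

```python
from statistics import mean

# Hypothetical response times in seconds; the real report would read these
# from its own data rather than a hard-coded list.
response_times = [0.21, 0.95, 4.6, 13.3]

def render_response_range(times):
    """Format the fastest, slowest, and average response time."""
    # The f prefix is what makes Python evaluate the {...} expressions;
    # without it the braces are written out verbatim.
    return (
        f"Fastest: {min(times):.3f}s | "
        f"Slowest: {max(times):.3f}s | "
        f"Average: {mean(times):.3f}s"
    )

print(render_response_range(response_times))
```

The same pattern applies to the token, usage, and TTFT lines: compute each value first, then interpolate it in an f-string (or with `str.format`), and guard optional metrics such as TTFT against missing data before formatting.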