Comprehensive Speed & Efficiency Analysis
Crusoe leads with a median response time of 0.234s, processing requests 4.3x faster than average.
Mistral achieves the highest average token generation rate at 58,955 tokens/second.
Crusoe also shows the most consistent performance, with a 90.3% reliability score.
Average cost per 1K tokens is $0.0002, with total spending of $6.63 across all providers.
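The cost figure above is a simple ratio of spend to token volume. A minimal sketch of that arithmetic follows, assuming the average is computed as total cost divided by total tokens (the report does not say whether it is instead a mean of per-request rates). The function name and the sample figures are hypothetical, not taken from the report's data.

```python
def cost_per_1k_tokens(total_cost_usd: float, total_tokens: int) -> float:
    """Return the blended cost in USD per 1,000 tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_cost_usd / total_tokens * 1000


# Hypothetical figures chosen for illustration only.
rate = cost_per_1k_tokens(total_cost_usd=2.00, total_tokens=10_000_000)
print(f"${rate:.4f} per 1K tokens")
```

A blended rate like this weights every token equally, so a high-volume provider dominates the result; a mean of per-request rates would weight every request equally instead.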
| Rank | Provider | Requests | Median Response | Avg Tokens/sec | Reliability | Performance |
|---|---|---|---|---|---|---|
| 1 | Crusoe | 49 | 0.234s | 5164 | 90.3% | Excellent |
| 2 | Anthropic | 53 | 0.389s | 1767 | 52.1% | Average |
| 3 | Google AI Studio | 286 | 0.566s | 4301 | 45.3% | Average |
| 4 | Novita | 10 | 0.726s | 979 | 30.9% | Average |
| 5 | Mistral | 15 | 1.346s | 58955 | 51.4% | Average |
| 6 | Amazon Bedrock | 2 | 2.642s | 1275 | 45.7% | Average |
| 7 | DeepInfra | 40 | 2.696s | 4014 | 14.7% | Average |
| 8 | Venice | 10 | 3.469s | 761 | 66.7% | Good |
| 9 | Parasail | 3 | 3.737s | 671 | 82.7% | Excellent |
| 10 | xAI | 1,467 | 4.614s | 2653 | 4.7% | Average |
| 11 | SiliconFlow | 8 | 5.424s | 16843 | 11.3% | Average |
| 12 | | 14 | 7.318s | 5008 | 7.4% | Average |
| 13 | Z.AI | 7 | 7.963s | 412 | 36.8% | Average |
| 14 | OpenAI | 85 | 7.965s | 601 | 13.6% | Average |
| 15 | Alibaba | 52 | 11.305s | 358 | 13.4% | Average |
| 16 | Liquid | 20 | 13.300s | 453 | 29.0% | Average |
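A provider ranking like the one above can be derived from per-request rows with a single grouped aggregation. The sketch below is one way to do it, not the report's actual pipeline: `provider_name` and `response_time_sec` appear in the report's own expressions, while `tokens_per_sec`, the function name, and the sample rows are assumptions. The reliability and performance columns are omitted because the report does not define how they are computed.

```python
import pandas as pd


def summarize_providers(df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate per-request rows into one ranked row per provider."""
    stats = (
        # dropna=False keeps requests whose provider name is missing
        # instead of silently discarding them.
        df.groupby("provider_name", dropna=False)
        .agg(
            requests=("response_time_sec", "size"),
            median_response_sec=("response_time_sec", "median"),
            avg_tokens_per_sec=("tokens_per_sec", "mean"),
        )
        .sort_values("median_response_sec")
    )
    # Rank 1 is the provider with the lowest median response time.
    stats.insert(0, "rank", range(1, len(stats) + 1))
    return stats


# Hypothetical sample rows, not taken from the report's data.
sample = pd.DataFrame(
    {
        "provider_name": ["A", "A", "A", "B", "B"],
        "response_time_sec": [0.2, 0.3, 0.4, 1.0, 3.0],
        "tokens_per_sec": [100.0, 200.0, 300.0, 50.0, 150.0],
    }
)
provider_table = summarize_providers(sample)
print(provider_table)
```

Counting with `size` yields integer request counts, and ranking by median rather than mean keeps a few very slow requests from reordering the table.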
| Rank | Model | Requests | Median Response | Avg Tokens/sec | Primary Provider |
|---|---|---|---|---|---|
| 1 | google/gemma-3-12b-it | 71 | 0.212s | 3776 | Crusoe |
| 2 | anthropic/claude-3-5-haiku | 50 | 0.381s | 969 | Anthropic |
| 3 | google/gemini-2.5-flash-lite-p... | 175 | 0.520s | 4025 | Google AI Studio |
| 4 | google/gemini-2.5-flash | 88 | 0.601s | 5493 | Google AI Studio |
| 5 | mistralai/codestral-2508 | 15 | 1.346s | 58955 | Mistral |
| 6 | qwen/qwen3-coder-30b-a3b-instr... | 5 | 2.261s | 32710 | DeepInfra |
| 7 | openai/gpt-5-codex | 33 | 2.724s | 255 | OpenAI |
| 8 | google/gemini-2.5-flash-previe... | 23 | 3.026s | 1837 | Google AI Studio |
| 9 | venice/uncensored | 10 | 3.469s | 761 | Venice |
| 10 | deepseek/deepseek-v3.1-terminu... | 6 | 4.040s | 16753 | SiliconFlow |
| 11 | x-ai/grok-code-fast-1 | 1,374 | 4.279s | 2572 | xAI |
| 12 | microsoft/phi-4-reasoning-plus... | 10 | 4.614s | 452 | DeepInfra |
| 13 | google/gemini-2.5-flash-image | 13 | 7.559s | 5277 | |
| 14 | z-ai/glm-4-32b-0414 | 7 | 7.963s | 412 | Z.AI |
| 15 | deepseek/deepseek-chat-v3 | 10 | 10.918s | 327 | DeepInfra |
| 16 | qwen/qwen3-max | 47 | 11.302s | 370 | Alibaba |
| 17 | x-ai/grok-4-fast | 59 | 11.505s | 5590 | xAI |
| 18 | qwen/qwen-plus-2025-07-28 | 5 | 11.970s | 243 | Alibaba |
| 19 | liquid/lfm-7b | 20 | 13.300s | 453 | Liquid |
| 20 | openai/codex-mini | 6 | 13.457s | 3658 | OpenAI |
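The model table adds a "Primary Provider" column. A minimal sketch of how such a column might be derived is shown below, assuming "primary" means the provider that served the most requests for that model; the report does not state its definition. The column names `model_name` and `tokens_per_sec`, the function name, and the sample rows are hypothetical.

```python
import pandas as pd


def summarize_models(df: pd.DataFrame, top_n: int = 20) -> pd.DataFrame:
    """Rank models by median response time and attach a primary provider."""
    grouped = df.groupby("model_name")
    stats = grouped.agg(
        requests=("response_time_sec", "size"),
        median_response_sec=("response_time_sec", "median"),
        avg_tokens_per_sec=("tokens_per_sec", "mean"),
    )
    # Assumed definition: the most frequent provider for each model.
    # Models with no recorded provider get an empty string.
    stats["primary_provider"] = grouped["provider_name"].agg(
        lambda s: s.value_counts().index[0] if s.notna().any() else ""
    )
    return stats.sort_values("median_response_sec").head(top_n)


# Hypothetical sample rows, not taken from the report's data.
sample = pd.DataFrame(
    {
        "model_name": ["m1", "m1", "m1", "m2", "m2", "m3"],
        "provider_name": ["P", "P", "Q", "R", "R", None],
        "response_time_sec": [1.0, 2.0, 3.0, 0.5, 0.7, 9.0],
        "tokens_per_sec": [10.0, 20.0, 30.0, 40.0, 60.0, 5.0],
    }
)
model_table = summarize_models(sample)
print(model_table)
```

Under this definition, a model whose requests carry no provider name ends up with a blank primary provider instead of raising an error.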
Fastest: {provider_stats['fastest_response_sec'].min():.3f}s | Slowest: {provider_stats['slowest_response_sec'].max():.3f}s | Average: {df['response_time_sec'].mean():.3f}s
Total tokens processed: {total_tokens:,} | Average per request: {(total_tokens/len(df)):.0f} | Peak throughput: {provider_stats['max_tokens_per_sec'].max():.0f} tokens/s
Most active provider: {df['provider_name'].value_counts().index[0]} ({df['provider_name'].value_counts().iloc[0]:,} requests) | Average requests per provider: {len(df)/df['provider_name'].nunique():.0f}
Average TTFT: {df['time_to_first_sec'].mean():.3f}s | Fastest TTFT: {df['time_to_first_sec'].min():.3f}s | Provider with best TTFT: {provider_stats.nsmallest(1, 'avg_time_to_first_sec').index[0] if not provider_stats['avg_time_to_first_sec'].isna().all() else 'N/A'}
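The four summary lines above are f-string templates over `df`, `provider_stats`, and `total_tokens`. The sketch below shows one way they could be evaluated with pandas. The column names inside the templates come from the report itself; the `tokens_per_sec` and `total_tokens` columns, the way `provider_stats` is built, and the sample rows are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample rows, not taken from the report's data.
df = pd.DataFrame(
    {
        "provider_name": ["A", "A", "B"],
        "response_time_sec": [0.5, 1.5, 4.0],
        "time_to_first_sec": [0.1, 0.3, 0.8],
        "tokens_per_sec": [100.0, 300.0, 50.0],
        "total_tokens": [1000, 2000, 3000],
    }
)

# Assumed construction of the per-provider statistics frame.
provider_stats = df.groupby("provider_name").agg(
    fastest_response_sec=("response_time_sec", "min"),
    slowest_response_sec=("response_time_sec", "max"),
    max_tokens_per_sec=("tokens_per_sec", "max"),
    avg_time_to_first_sec=("time_to_first_sec", "mean"),
)
total_tokens = int(df["total_tokens"].sum())
counts = df["provider_name"].value_counts()

# Fall back to "N/A" when no provider has a TTFT measurement.
best_ttft = (
    provider_stats.nsmallest(1, "avg_time_to_first_sec").index[0]
    if not provider_stats["avg_time_to_first_sec"].isna().all()
    else "N/A"
)

lines = [
    f"Fastest: {provider_stats['fastest_response_sec'].min():.3f}s"
    f" | Slowest: {provider_stats['slowest_response_sec'].max():.3f}s"
    f" | Average: {df['response_time_sec'].mean():.3f}s",
    f"Total tokens processed: {total_tokens:,}"
    f" | Average per request: {total_tokens / len(df):.0f}"
    f" | Peak throughput: {provider_stats['max_tokens_per_sec'].max():.0f} tokens/s",
    f"Most active provider: {counts.index[0]} ({counts.iloc[0]:,} requests)"
    f" | Average requests per provider: {len(df) / df['provider_name'].nunique():.0f}",
    f"Average TTFT: {df['time_to_first_sec'].mean():.3f}s"
    f" | Fastest TTFT: {df['time_to_first_sec'].min():.3f}s"
    f" | Provider with best TTFT: {best_ttft}",
]
print("\n".join(lines))
```

Computing `value_counts()` once and reusing it avoids scanning the provider column twice, which the original single-line template does.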