AI model performance rankings
| # | Model | Index Scores | | Benchmarks | | | | | | | | Speed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.7 (Adaptive Reasoning, Max Effort) | 57.3 | 52.5 | N/A | N/A | 91.4 | 39.6 | N/A | 54.5 | N/A | N/A | 52 |
| 2 | GPT-5.4 (xhigh) | 57.2 | 57.3 | N/A | N/A | 92.0 | 41.6 | N/A | 56.6 | N/A | N/A | 32 |
| 3 | Gemini 3.1 Pro Preview [Google, via Google AI Studio] | 57.2 | 55.5 | N/A | N/A | 94.1 | 44.7 | N/A | 58.9 | N/A | N/A | 71 |
| 4 | Claude Opus 4.6 (Adaptive Reasoning, Max Effort) | 53.0 | 48.1 | N/A | N/A | 89.6 | 36.7 | N/A | 51.9 | N/A | N/A | 36 |
| 5 | Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) | 51.7 | 50.9 | N/A | N/A | 87.5 | 30.0 | N/A | 46.8 | N/A | N/A | 42 |
| 6 | GLM-5.1 (Reasoning) [Z AI, via Z.AI] | 51.4 | 43.4 | N/A | N/A | 86.8 | 28.0 | N/A | 43.8 | N/A | N/A | 21 |
| 7 | Qwen3.6 Plus [Alibaba] | 50.0 | 42.9 | N/A | N/A | 88.2 | 25.7 | N/A | 40.7 | N/A | N/A | 44 |
| 8 | GLM-5 (Reasoning) [Z AI, via Z.AI] | 49.8 | 44.2 | N/A | N/A | 82.0 | 27.2 | N/A | 46.2 | N/A | N/A | 32 |
| 9 | GPT-5.4 mini (xhigh) | 48.1 | 51.5 | N/A | N/A | 87.5 | 26.6 | N/A | 49.9 | N/A | N/A | 83 |
| 10 | Gemini 3 Flash Preview (Reasoning) | 46.4 | 42.6 | 97.0 | 89.0 | 89.8 | 34.7 | 90.8 | 50.6 | N/A | N/A | 68 |
| 11 | Qwen3.5 397B A17B (Reasoning) [Alibaba] | 45.0 | 41.3 | N/A | N/A | 89.3 | 27.3 | N/A | 42.0 | N/A | N/A | 53 |
| 12 | GPT-5.4 nano (xhigh) | 44.4 | 43.9 | N/A | N/A | 81.7 | 26.5 | N/A | 46.9 | N/A | N/A | 65 |
| 13 | MiMo-V2-Flash (Feb 2026) [Xiaomi] | 41.5 | 33.5 | N/A | N/A | 83.5 | 20.0 | N/A | 38.3 | N/A | N/A | 35 |
| 14 | Grok 4 | 41.5 | 40.5 | 92.7 | 86.6 | 87.7 | 23.9 | 81.9 | 45.7 | 99.0 | 94.3 | 45 |
| 15 | Gemma 4 31B (Reasoning) [Google, via DeepInfra] | 39.2 | 38.7 | N/A | N/A | 85.7 | 22.7 | N/A | 43.4 | N/A | N/A | 32 |
| 16 | Grok 4.1 Fast (Reasoning) | 38.6 | 30.9 | 89.3 | 85.4 | 85.3 | 17.6 | 82.2 | 44.2 | N/A | N/A | 116 |
| 17 | Claude 4.5 Haiku (Reasoning) | 37.1 | 32.6 | 83.7 | 76.0 | 67.2 | 9.7 | 61.5 | 43.3 | N/A | N/A | 66 |
| 18 | NVIDIA Nemotron 3 Super 120B A12B (Reasoning) [NVIDIA, via Nebius] | 36.0 | 31.2 | N/A | N/A | 80.0 | 19.2 | N/A | 36.0 | N/A | N/A | 89 |
| 19 | Grok 4 Fast (Reasoning) | 35.1 | 27.4 | 89.7 | 85.0 | 84.7 | 17.0 | 83.2 | 44.2 | N/A | N/A | 123 |
| 20 | Gemini 3.1 Flash-Lite Preview | 33.5 | 30.1 | N/A | N/A | 82.2 | 16.2 | N/A | 41.9 | N/A | N/A | 44 |
| 21 | gpt-oss-120B (high) | 33.3 | 28.6 | 93.4 | 80.8 | 78.2 | 18.5 | 87.8 | 38.9 | N/A | N/A | 349 |
| 22 | gpt-oss-120B (high) | 33.3 | 28.6 | 93.4 | 80.8 | 78.2 | 18.5 | 87.8 | 38.9 | N/A | N/A | 218 |
| 23 | gpt-oss-120B (high) | 33.3 | 28.6 | 93.4 | 80.8 | 78.2 | 18.5 | 87.8 | 38.9 | N/A | N/A | 768 |
| 24 | GPT-4.1 | 26.3 | 21.8 | 34.7 | 80.6 | 66.6 | 4.6 | 45.7 | 38.1 | 91.3 | 43.7 | 44 |
| 25 | GPT-4.1 mini | 22.9 | 18.5 | 46.3 | 78.1 | 66.4 | 4.6 | 48.3 | 40.4 | 92.5 | 43.0 | 51 |
| 26 | GPT-4o (Aug '24) | 18.6 | 16.6 | N/A | N/A | 52.1 | 2.9 | 31.7 | 33.1 | 79.5 | 11.7 | 14 |
| 27 | DeepSeek R1 Distill Qwen 32B [DeepSeek, via NextBit] | 17.2 | N/A | 63.0 | 73.9 | 61.5 | 5.5 | 27.0 | 37.6 | 94.1 | 68.7 | 24 |
| 28 | DeepSeek R1 Distill Llama 70B [DeepSeek, via DeepInfra] | 16.0 | 11.4 | 53.7 | 79.5 | 40.2 | 6.1 | 26.6 | 31.2 | 93.5 | 67.0 | 42 |
| 29 | Gemini 2.0 Flash-Lite (Feb '25) | 14.7 | N/A | N/A | 72.4 | 53.5 | 3.6 | 18.5 | 25.0 | 87.3 | 27.7 | 22 |
| 30 | Llama 3.3 Instruct 70B | 14.5 | 10.7 | 7.7 | 71.3 | 49.8 | 4.0 | 28.8 | 26.0 | 77.3 | 30.0 | 144 |
| 31 | Llama 4 Scout | 13.5 | 6.7 | 14.0 | 75.2 | 58.7 | 4.3 | 29.9 | 17.0 | 84.4 | 28.3 | 100 |
| 32 | GPT-4.1 nano | 13.0 | 11.2 | 24.0 | 65.7 | 51.2 | 3.9 | 32.6 | 25.9 | 84.8 | 23.7 | 35 |
| 33 | GPT-4o mini | 12.6 | N/A | 14.7 | 64.8 | 42.6 | 4.0 | 23.4 | 22.9 | 78.9 | 11.7 | 30 |
crafted by bart stefanski