User profile
rig-tester
15 runs · First seen 2026-04-16 · Avg 69.2
Total tokens: 83.7K (across every task this user has run)
Avg latency: 1307 ms (per task, across all submissions)
Tasks run: 375 (15 submissions × ~25 tasks)
Rigs used: 13 (distinct hardware tags)
Category signature
Average score per category across all 15 runs.
| Category | Avg Score |
|---|---|
| Code | 69.3 |
| Reason | 69.1 |
| Write | 71.3 |
| Tool Use | 71.2 |
| RAG | 68.0 |
| Speed | 65.4 |
Hardware mix
Rigs this user benchmarked on.
m2-ultra-192gb: 2 (13%)
m2-pro-16gb: 2 (13%)
rtx-4080-16gb-offload: 1 (7%)
a100-80gb: 1 (7%)
m3-ultra-256gb: 1 (7%)
rtx-3080-10gb: 1 (7%)
rtx-4090-24gb: 1 (7%)
m3-max-128gb: 1 (7%)
macbook-pro-m3-pro-18gb: 1 (7%)
rtx-4070-12gb: 1 (7%)
m3-air-16gb: 1 (7%)
snapdragon-x-elite-32gb: 1 (7%)
m1-air-8gb: 1 (7%)
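The percentages in this list appear to be each rig's run count divided by the 15 total runs, rounded to a whole percent. A minimal sketch of that arithmetic follows. The function name `share_table` and the input shape (one hardware tag per run) are assumptions made for illustration, not the site's actual implementation.

```python
from collections import Counter

def share_table(tags):
    """Return (tag, count, rounded percent) rows, most frequent first.

    Hypothetical helper: assumes one hardware tag per submitted run.
    """
    counts = Counter(tags)
    total = sum(counts.values())
    return [(tag, n, round(100 * n / total)) for tag, n in counts.most_common()]

# One entry per run, using the tags and counts from the list above.
tags = (
    ["m2-ultra-192gb"] * 2
    + ["m2-pro-16gb"] * 2
    + [
        "rtx-4080-16gb-offload", "a100-80gb", "m3-ultra-256gb",
        "rtx-3080-10gb", "rtx-4090-24gb", "m3-max-128gb",
        "macbook-pro-m3-pro-18gb", "rtx-4070-12gb", "m3-air-16gb",
        "snapdragon-x-elite-32gb", "m1-air-8gb",
    ]
)

rows = share_table(tags)
for tag, n, pct in rows:
    print(f"{tag}: {n} ({pct}%)")
```

Because each share is rounded independently, the displayed percentages add up to 103 rather than 100 (2 × 13 + 11 × 7).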
Provider mix
Where they spend their tokens.
deepseek: 5 (33%)
microsoft: 2 (13%)
nous: 1 (7%)
moonshot: 1 (7%)
mistral: 1 (7%)
alibaba: 1 (7%)
google: 1 (7%)
cohere: 1 (7%)
meta: 1 (7%)
huggingface: 1 (7%)
Models tried
Best score per model.
| # | Model | Best Score | Tier |
|---|---|---|---|
| 1 | DeepSeek R1 671B-A37B | 86.6 | MAINLINE |
| 2 | DeepSeek R1 Distill Qwen 32B | 82.3 | MAINLINE |
| 3 | Hermes 3 Llama 3.1 70B | 79.7 | MAINLINE |
| 4 | Kimi K2 Instruct | 77.0 | MAINLINE |
| 5 | DeepSeek V2.5 | 76.6 | MAINLINE |
| 6 | DeepSeek R1 Distill Llama 8B | 74.6 | FEEDER |
| 7 | Codestral 22B | 74.6 | FEEDER |
| 8 | DeepSeek Coder V2 16B | 74.1 | FEEDER |
| 9 | Qwen 2.5 14B Instruct | 72.7 | FEEDER |
| 10 | Gemma 3 12B IT | 68.7 | FEEDER |
| 11 | Phi 3 Small 7B | 64.8 | FEEDER |
| 12 | Aya 23 8B | 58.9 | TAP |
| 13 | Llama 3.2 3B Instruct | 52.8 | TAP |
| 14 | Phi 3 Mini 3.8B | 52.4 | TAP |
| 15 | SmolLM2 1.7B | 41.9 | TAP |
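
The "best score per model" view and the profile's overall average can both be derived from the list of runs. The sketch below assumes the average is a plain mean of run scores, which is consistent with the 69.2 shown on this profile but is not confirmed by the page. The name `best_per_model` and the `(model, score)` tuple layout are illustrative assumptions.

```python
from statistics import mean

# (model, score) pairs copied from the table above; this user has one run per model.
runs = [
    ("DeepSeek R1 671B-A37B", 86.6),
    ("DeepSeek R1 Distill Qwen 32B", 82.3),
    ("Hermes 3 Llama 3.1 70B", 79.7),
    ("Kimi K2 Instruct", 77.0),
    ("DeepSeek V2.5", 76.6),
    ("DeepSeek R1 Distill Llama 8B", 74.6),
    ("Codestral 22B", 74.6),
    ("DeepSeek Coder V2 16B", 74.1),
    ("Qwen 2.5 14B Instruct", 72.7),
    ("Gemma 3 12B IT", 68.7),
    ("Phi 3 Small 7B", 64.8),
    ("Aya 23 8B", 58.9),
    ("Llama 3.2 3B Instruct", 52.8),
    ("Phi 3 Mini 3.8B", 52.4),
    ("SmolLM2 1.7B", 41.9),
]

def best_per_model(runs):
    """Keep the highest score seen for each model (hypothetical helper)."""
    best = {}
    for model, score in runs:
        if model not in best or score > best[model]:
            best[model] = score
    return best

best = best_per_model(runs)
average = round(mean(score for _, score in runs), 1)
print(average)  # prints 69.2
```

With exactly one run per model, the best-score table and the full submissions table contain the same rows; they would diverge once a model is run more than once.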
All submissions
Every run, ordered by score.
| # | Model | Score | Tier |
|---|---|---|---|
| 1 | DeepSeek R1 671B-A37B | 86.6 | MAINLINE |
| 2 | DeepSeek R1 Distill Qwen 32B | 82.3 | MAINLINE |
| 3 | Hermes 3 Llama 3.1 70B | 79.7 | MAINLINE |
| 4 | Kimi K2 Instruct | 77.0 | MAINLINE |
| 5 | DeepSeek V2.5 | 76.6 | MAINLINE |
| 6 | DeepSeek R1 Distill Llama 8B | 74.6 | FEEDER |
| 7 | Codestral 22B | 74.6 | FEEDER |
| 8 | DeepSeek Coder V2 16B | 74.1 | FEEDER |
| 9 | Qwen 2.5 14B Instruct | 72.7 | FEEDER |
| 10 | Gemma 3 12B IT | 68.7 | FEEDER |
| 11 | Phi 3 Small 7B | 64.8 | FEEDER |
| 12 | Aya 23 8B | 58.9 | TAP |
| 13 | Llama 3.2 3B Instruct | 52.8 | TAP |
| 14 | Phi 3 Mini 3.8B | 52.4 | TAP |
| 15 | SmolLM2 1.7B | 41.9 | TAP |