User profile
gguf-pilgrim
12 runs | First seen 2026-04-16 | Avg 69.1
| Metric | Value | Notes |
|---|---|---|
| Total tokens | 67.1K | Across every task this user has run |
| Avg latency | 1221 ms | Per task, across all submissions |
| Tasks run | 300 | 12 submissions × ~25 tasks |
| Rigs used | 9 | Distinct hardware tags |
Category signature
Average score per category across all 12 runs.
| Category | Avg score |
|---|---|
| Code | 68.3 |
| Reason | 70.5 |
| Write | 69.4 |
| Tool Use | 69.2 |
| RAG | 70.6 |
| Speed | 66.5 |
Hardware mix
Rigs this user benchmarked on.
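| Rig | Runs | Share |
|---|---|---|
| cloud-api | 2 | 17% |
| rtx-4080-16gb-offload | 2 | 17% |
| macbook-pro-m3-pro-18gb | 2 | 17% |
| m3-max-128gb | 1 | 8% |
| 2x-rtx-4090 | 1 | 8% |
| rtx-3090-24gb | 1 | 8% |
| rtx-4090-24gb | 1 | 8% |
| rtx-4080-16gb | 1 | 8% |
| m3-air-16gb | 1 | 8% |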
Provider mix
Where they spend their tokens.
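| Provider | Runs | Share |
|---|---|---|
| alibaba | 2 | 17% |
| google | 2 | 17% |
| mistral | 2 | 17% |
| deepseek | 1 | 8% |
| nous | 1 | 8% |
| cohere | 1 | 8% |
| meta | 1 | 8% |
| internlm | 1 | 8% |
| bigcode | 1 | 8% |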
Models tried
Best score per model.
| # | Model | Best Score | Tier |
|---|---|---|---|
| 1 | Qwen 2.5 Coder 32B | 80.0 | MAINLINE |
| 2 | DeepSeek V3 671B-A37B | 79.6 | MAINLINE |
| 3 | Hermes 3 Llama 3.1 70B | 78.1 | MAINLINE |
| 4 | Gemma 3 27B IT | 76.9 | MAINLINE |
| 5 | Devstral Small 24B | 75.6 | MAINLINE |
| 6 | Mixtral 8x22B Instruct | 74.7 | FEEDER |
| 7 | Gemma 2 27B IT | 74.7 | FEEDER |
| 8 | Command R | 67.9 | FEEDER |
| 9 | Llama 3.1 8B Instruct | 63.6 | FEEDER |
| 10 | InternLM 2.5 7B Chat | 62.7 | FEEDER |
| 11 | Qwen 2.5 1.5B Instruct | 48.1 | TAP |
| 12 | StarCoder2 3B | 47.0 | TAP |
All submissions
Every run, ordered by score.
| # | Model | Score | Tier |
|---|---|---|---|
| 1 | Qwen 2.5 Coder 32B | 80.0 | MAINLINE |
| 2 | DeepSeek V3 671B-A37B | 79.6 | MAINLINE |
| 3 | Hermes 3 Llama 3.1 70B | 78.1 | MAINLINE |
| 4 | Gemma 3 27B IT | 76.9 | MAINLINE |
| 5 | Devstral Small 24B | 75.6 | MAINLINE |
| 6 | Mixtral 8x22B Instruct | 74.7 | FEEDER |
| 7 | Gemma 2 27B IT | 74.7 | FEEDER |
| 8 | Command R | 67.9 | FEEDER |
| 9 | Llama 3.1 8B Instruct | 63.6 | FEEDER |
| 10 | InternLM 2.5 7B Chat | 62.7 | FEEDER |
| 11 | Qwen 2.5 1.5B Instruct | 48.1 | TAP |
| 12 | StarCoder2 3B | 47.0 | TAP |