User profile
mac-stack
21 runs, first seen 2026-04-11, average score 71.5
Total tokens: 112.0K (across every task this user has run)
Avg latency: 1491 ms (per task, across all submissions)
Tasks run: 525 (21 submissions x ~25 tasks)
Rigs used: 14 (distinct hardware tags)
Category signature
Average score per category across all 21 runs.
| Category | Avg Score |
|---|---|
| Code | 71.0 |
| Reason | 71.6 |
| Write | 72.2 |
| Tool Use | 72.7 |
| RAG | 73.7 |
| Speed | 68.1 |
Hardware mix
Rigs this user benchmarked on.
| Rig | Runs | Share |
|---|---|---|
| rtx-4080-16gb-offload | 3 | 14% |
| rtx-3090-24gb | 2 | 10% |
| 4x-rtx-3090 | 2 | 10% |
| rtx-3080-10gb | 2 | 10% |
| rtx-4090-24gb | 2 | 10% |
| m3-pro-36gb | 2 | 10% |
| dgx-h100 | 1 | 5% |
| m3-ultra-256gb | 1 | 5% |
| a6000-48gb | 1 | 5% |
| cloud-api | 1 | 5% |
| m2-ultra-192gb | 1 | 5% |
| rtx-3060-12gb | 1 | 5% |
| ryzen-7950x-cpu-only | 1 | 5% |
| m3-air-16gb | 1 | 5% |
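The percentages in this section look like each rig's run count divided by the 21 total runs, rounded to a whole percent. A minimal sketch of that arithmetic, assuming round-half-up (the page does not state its rounding rule, and `share_percent` is a hypothetical helper name):

```python
def share_percent(count: int, total: int) -> int:
    """Return count/total as a whole-number percentage.

    Assumption: the page rounds half up; this is not confirmed by the source.
    """
    return int(count / total * 100 + 0.5)


TOTAL_RUNS = 21  # "21 runs" from the profile header

# The three run counts that appear in the hardware mix.
for count in (3, 2, 1):
    print(f"{count} of {TOTAL_RUNS} runs -> {share_percent(count, TOTAL_RUNS)}%")
```

Under that assumption, counts of 3, 2, and 1 give 14%, 10%, and 5%, which matches the listed shares.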
Provider mix
Where they spend their tokens.
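| Provider | Runs | Share |
|---|---|---|
| alibaba | 3 | 14% |
| deepseek | 2 | 10% |
| meta | 2 | 10% |
| community | 2 | 10% |
| internlm | 2 | 10% |
| microsoft | 2 | 10% |
| bigcode | 2 | 10% |
| ibm | 2 | 10% |
| zhipu | 1 | 5% |
| mistral | 1 | 5% |
| google | 1 | 5% |
| xai | 1 | 5% |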
Models tried
Best score per model.
| # | Model | Best Score | Tier |
|---|---|---|---|
| 1 | DeepSeek R1 671B-A37B | 86.0 | MAINLINE |
| 2 | Qwen 2.5 Coder 32B | 81.8 | MAINLINE |
| 3 | Qwen 2.5 32B Instruct | 81.0 | MAINLINE |
| 4 | Llama 3.3 70B Instruct | 76.9 | MAINLINE |
| 5 | Qwen 3 32B Instruct | 76.3 | MAINLINE |
| 6 | Magnum V4 72B | 75.3 | MAINLINE |
| 7 | GLM 4 Plus | 74.8 | FEEDER |
| 8 | Devstral Small 24B | 74.5 | FEEDER |
| 9 | InternLM 2.5 20B Chat | 73.0 | FEEDER |
| 10 | L3 70B Euryale | 72.4 | FEEDER |
| 11 | Phi 4 14B | 71.3 | FEEDER |
| 12 | Phi 3.5 MoE 42B | 69.7 | FEEDER |
| 13 | DeepSeek R1 Distill Llama 8B | 69.5 | FEEDER |
| 14 | StarCoder2 15B | 69.2 | FEEDER |
| 15 | Gemma 2 9B IT | 66.8 | FEEDER |
| 16 | Grok-1 314B | 66.6 | FEEDER |
| 17 | Granite 3.1 8B Instruct | 65.6 | FEEDER |
| 18 | Code Llama 34B Instruct | 63.6 | FEEDER |
| 19 | Granite 3.1 2B Instruct | 48.8 | TAP |
All submissions
Every run, ordered by score.
| # | Model | Score | Tier |
|---|---|---|---|
| 1 | DeepSeek R1 671B-A37B | 86.0 | MAINLINE |
| 2 | Qwen 2.5 Coder 32B | 81.8 | MAINLINE |
| 3 | Qwen 2.5 32B Instruct | 81.0 | MAINLINE |
| 4 | Llama 3.3 70B Instruct | 76.9 | MAINLINE |
| 5 | Qwen 3 32B Instruct | 76.3 | MAINLINE |
| 6 | Magnum V4 72B | 75.3 | MAINLINE |
| 7 | GLM 4 Plus | 74.8 | FEEDER |
| 8 | Devstral Small 24B | 74.5 | FEEDER |
| 9 | InternLM 2.5 20B Chat | 73.0 | FEEDER |
| 10 | L3 70B Euryale | 72.4 | FEEDER |
| 11 | Phi 4 14B | 71.3 | FEEDER |
| 12 | InternLM 2.5 20B Chat | 70.2 | FEEDER |
| 13 | Phi 3.5 MoE 42B | 69.7 | FEEDER |
| 14 | DeepSeek R1 Distill Llama 8B | 69.5 | FEEDER |
| 15 | StarCoder2 15B | 69.2 | FEEDER |
| 16 | StarCoder2 15B | 67.8 | FEEDER |
| 17 | Gemma 2 9B IT | 66.8 | FEEDER |
| 18 | Grok-1 314B | 66.6 | FEEDER |
| 19 | Granite 3.1 8B Instruct | 65.6 | FEEDER |
| 20 | Code Llama 34B Instruct | 63.6 | FEEDER |
| 21 | Granite 3.1 2B Instruct | 48.8 | TAP |
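The "Models tried" table and the headline average can both be re-derived from the 21 rows of "All submissions". The sketch below is an illustration under that assumption, not the site's actual code; `SUBMISSIONS` and `best_per_model` are hypothetical names, and the scores are copied from the table above.

```python
from statistics import mean

# (model, score) pairs copied from the "All submissions" table.
SUBMISSIONS = [
    ("DeepSeek R1 671B-A37B", 86.0),
    ("Qwen 2.5 Coder 32B", 81.8),
    ("Qwen 2.5 32B Instruct", 81.0),
    ("Llama 3.3 70B Instruct", 76.9),
    ("Qwen 3 32B Instruct", 76.3),
    ("Magnum V4 72B", 75.3),
    ("GLM 4 Plus", 74.8),
    ("Devstral Small 24B", 74.5),
    ("InternLM 2.5 20B Chat", 73.0),
    ("L3 70B Euryale", 72.4),
    ("Phi 4 14B", 71.3),
    ("InternLM 2.5 20B Chat", 70.2),
    ("Phi 3.5 MoE 42B", 69.7),
    ("DeepSeek R1 Distill Llama 8B", 69.5),
    ("StarCoder2 15B", 69.2),
    ("StarCoder2 15B", 67.8),
    ("Gemma 2 9B IT", 66.8),
    ("Grok-1 314B", 66.6),
    ("Granite 3.1 8B Instruct", 65.6),
    ("Code Llama 34B Instruct", 63.6),
    ("Granite 3.1 2B Instruct", 48.8),
]


def best_per_model(rows):
    """Keep the highest score seen for each model, sorted best first."""
    best = {}
    for model, score in rows:
        if model not in best or score > best[model]:
            best[model] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


ranking = best_per_model(SUBMISSIONS)
average = round(mean(score for _, score in SUBMISSIONS), 1)

print(f"{len(SUBMISSIONS)} runs, {len(ranking)} distinct models, average {average}")
for rank, (model, score) in enumerate(ranking, start=1):
    print(f"{rank:>2}  {model:<30} {score:.1f}")
```

The two repeated models (InternLM 2.5 20B Chat and StarCoder2 15B) collapse to their higher score, which is why 21 submissions yield 19 rows in "Models tried".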