PipelineScore
User profile

midnight-bencher

Runs: 6
First seen: 2026-04-12
Avg score: 65.8
Best PipelineScore: 81.0 (MAINLINE)
Total tokens: 32.1K (across every task this user has run)
Avg latency: 1011 ms (per task, across all submissions)
Tasks run: 150 (6 submissions x ~25 tasks)
Rigs used: 6 (distinct hardware tags)

Category signature

Average score per category across all 6 runs.

Code: 67.5
Reason: 63.6
Write: 66.3
Tool Use: 66.6
RAG: 67.1
Speed: 63.3

Hardware mix

Rigs this user benchmarked on.

m3-ultra-256gb: 1 (17%)
dgx-h100: 1 (17%)
m3-pro-36gb: 1 (17%)
rtx-3080-10gb: 1 (17%)
rtx-3060-12gb: 1 (17%)
m1-air-8gb: 1 (17%)

Provider mix

Where they spend their tokens.

meta: 3 (50%)
alibaba: 1 (17%)
community: 1 (17%)
google: 1 (17%)

Models tried

Best score per model.

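| # | Model | Best Score | Tier |
|---|-------|------------|------|
| 1 | Llama 4 70B Instruct | 81.0 | MAINLINE |
| 2 | Qwen 2.5 VL 72B | 80.0 | MAINLINE |
| 3 | LLaVA OneVision 7B | 65.2 | FEEDER |
| 4 | Gemma 2 9B IT | 65.0 | FEEDER |
| 5 | Llama 3.2 3B Instruct | 54.4 | TAP |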

All submissions

Every run, ordered by score.

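| # | Model | Score | Tier |
|---|-------|-------|------|
| 1 | Llama 4 70B Instruct | 81.0 | MAINLINE |
| 2 | Qwen 2.5 VL 72B | 80.0 | MAINLINE |
| 3 | LLaVA OneVision 7B | 65.2 | FEEDER |
| 4 | Gemma 2 9B IT | 65.0 | FEEDER |
| 5 | Llama 3.2 3B Instruct | 54.4 | TAP |
| 6 | Llama 3.2 3B Instruct | 49.4 | TAP |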
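
The summary figures on this profile can be cross-checked against the six submissions listed above. The sketch below is only an illustration: the variable and function names are hypothetical, and the rounding rules (one decimal for scores, whole percentages for shares) are assumptions about how the page presents its numbers, not a description of how PipelineScore computes them.

```python
from collections import Counter

# Scores copied from the "All submissions" list on this profile.
scores = [81.0, 80.0, 65.2, 65.0, 54.4, 49.4]

# Run counts per provider, copied from the "Provider mix" section.
providers = Counter({"meta": 3, "alibaba": 1, "community": 1, "google": 1})


def mean_score(values):
    """Arithmetic mean, rounded to one decimal (assumed display precision)."""
    return round(sum(values) / len(values), 1)


def share_percent(count, total):
    """Share of the total as a whole percentage (assumed display precision)."""
    return round(100 * count / total)


avg = mean_score(scores)              # compare with "Avg score"
best = max(scores)                    # compare with "Best PipelineScore"
total_runs = sum(providers.values())  # compare with "Runs"
tasks_run = total_runs * 25           # page states roughly 25 tasks per submission
shares = {name: share_percent(n, total_runs) for name, n in providers.items()}

print(avg, best, total_runs, tasks_run, shares)
```

Under these assumptions the computed values agree with the profile: an average of 65.8, a best score of 81.0, 150 tasks, and provider shares of 50% and 17%. The shares sum to 101% because each one is rounded independently.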