User profile

midnight-bencher

6 runs, first seen 2026-04-12, average score 65.8

Best PipelineScore: 81.0 (MAINLINE)

Best run beats 87% of all 428 submissions.

Best rig m3-ultra-256gb ranks #23 of 63 rigs.

Total tokens: 32.1K (across every task this user has run)

Avg latency: 1011 ms (per task, across all submissions)

Tasks run: 150 (6 submissions x ~34 tasks)

Rigs used: 6 (distinct hardware tags)

Category signature

Average score per category across all 6 runs.

Code: 67.5

Reason: 63.6

Tool Use: 66.6

RAG: 67.1

Speed: 63.3
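The category signature can be reproduced from the per-run table on this page. The sketch below is only an illustration: the name `category_signature` is hypothetical, and it assumes the signature is a plain unweighted mean over the 6 runs. The site's exact rounding rule is not stated, so the values are printed to two decimals rather than forced to match the one-decimal figures shown.

```python
from statistics import mean

# Per-run category scores copied from the run table on this page.
RUNS = [
    {"Code": 85.3, "Reason": 80.0, "Tool Use": 84.0, "RAG": 84.5, "Speed": 67.8},
    {"Code": 79.3, "Reason": 76.1, "Tool Use": 78.4, "RAG": 84.3, "Speed": 85.7},
    {"Code": 60.0, "Reason": 65.1, "Tool Use": 68.2, "RAG": 71.1, "Speed": 63.8},
    {"Code": 67.2, "Reason": 65.3, "Tool Use": 69.1, "RAG": 58.7, "Speed": 60.9},
    {"Code": 58.9, "Reason": 51.4, "Tool Use": 51.2, "RAG": 49.8, "Speed": 52.9},
    {"Code": 54.6, "Reason": 44.0, "Tool Use": 48.9, "RAG": 54.2, "Speed": 49.0},
]

def category_signature(runs):
    """Unweighted mean score per category (assumed aggregation, not confirmed by the site)."""
    categories = list(runs[0].keys())
    return {c: mean(r[c] for r in runs) for c in categories}

signature = category_signature(RUNS)
for name, value in signature.items():
    print(f"{name}: {value:.2f}")
```

Under that assumption, each computed mean lands within about 0.05 of the figure displayed for its category.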

Rigs this user benchmarked on.

m3-ultra-256gb: 1 (17%)

dgx-h100: 1 (17%)

m3-pro-36gb: 1 (17%)

rtx-3080-10gb: 1 (17%)

rtx-3060-12gb: 1 (17%)

m1-air-8gb: 1 (17%)

Where they spend their tokens.

meta: 3 (50%)

alibaba: 1 (17%)

community: 1 (17%)

google: 1 (17%)
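The counts and percentages in the rig and provider breakdowns are consistent with "runs per label divided by total runs, rounded to the nearest whole percent". A minimal sketch of that calculation for the provider breakdown follows; `share_table` is a hypothetical helper, and the provider list is read off the run and model tables on this page.

```python
from collections import Counter

# Provider of each of the 6 runs, read off the run and model tables on this page.
RUN_PROVIDERS = ["meta", "alibaba", "community", "google", "meta", "meta"]

def share_table(labels):
    """Return (label, count, percent) rows sorted by count, largest first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(label, n, round(100 * n / total)) for label, n in counts.most_common()]

rows = share_table(RUN_PROVIDERS)
for label, n, pct in rows:
    print(f"{label}: {n} ({pct}%)")
```

Because each percentage is rounded independently, the shares need not sum to exactly 100 (here 50 + 17 + 17 + 17 = 101).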

Best score per model.

#	Model	Provider	Best Score	Tier	Achieved
1	Llama 4 70B Instruct	meta	81.0	MAINLINE	2026-05-08
2	Qwen 2.5 VL 72B	alibaba	80.0	MAINLINE	2026-05-03
3	LLaVA OneVision 7B	community	65.2	FEEDER	2026-04-12
4	Gemma 2 9B IT	google	65.0	FEEDER	2026-05-05
5	Llama 3.2 3B Instruct	meta	54.4	TAP	2026-04-16
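The best-per-model table looks like a group-by over the individual runs, keeping the highest score for each model. The sketch below shows one way to derive it, assuming that is how the table is built; `best_per_model` is a hypothetical name and the run data is copied from the run table on this page.

```python
# (model, score, date) for each run, copied from the run table on this page.
RUNS = [
    ("Llama 4 70B Instruct", 81.0, "2026-05-08"),
    ("Qwen 2.5 VL 72B", 80.0, "2026-05-03"),
    ("LLaVA OneVision 7B", 65.2, "2026-04-12"),
    ("Gemma 2 9B IT", 65.0, "2026-05-05"),
    ("Llama 3.2 3B Instruct", 54.4, "2026-04-16"),
    ("Llama 3.2 3B Instruct", 49.4, "2026-05-18"),
]

def best_per_model(runs):
    """Keep the highest-scoring run per model, then rank models by that score."""
    best = {}
    for model, score, date in runs:
        if model not in best or score > best[model][0]:
            best[model] = (score, date)
    return sorted(best.items(), key=lambda item: item[1][0], reverse=True)

leaderboard = best_per_model(RUNS)
for rank, (model, (score, date)) in enumerate(leaderboard, start=1):
    print(f"{rank}\t{model}\t{score}\t{date}")
```

The two Llama 3.2 3B Instruct runs collapse into a single row, which is why 6 runs yield 5 models.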

Every run, ordered by score.

#	Model	Hardware	Score	Tier	Code	Reason	Tool Use	RAG	Speed	Tokens	Avg ms	Date
1	Llama 4 70B Instruct	m3-ultra-256gb	81.0	MAINLINE	85.3	80.0	84.0	84.5	67.8	5.2K	2757	2026-05-08
2	Qwen 2.5 VL 72B	dgx-h100	80.0	MAINLINE	79.3	76.1	78.4	84.3	85.7	5.6K	904	2026-05-03
3	LLaVA OneVision 7B	m3-pro-36gb	65.2	FEEDER	60.0	65.1	68.2	71.1	63.8	5.4K	854	2026-04-12
4	Gemma 2 9B IT	rtx-3080-10gb	65.0	FEEDER	67.2	65.3	69.1	58.7	60.9	5.2K	856	2026-05-05
5	Llama 3.2 3B Instruct	rtx-3060-12gb	54.4	TAP	58.9	51.4	51.2	49.8	52.9	5.5K	346	2026-04-16
6	Llama 3.2 3B Instruct	m1-air-8gb	49.4	TAP	54.6	44.0	48.9	54.2	49.0	5.2K	350	2026-05-18