User profile

inference-monk

15 runs | First seen 2026-04-14 | Avg 68.2

Best PipelineScore

84.7 MAINLINE

on DeepSeek Coder V2 236B

Best run beats 96% of all 428 submissions

Best rig cloud-api ranks #11 of 63 rigs

Total tokens: 78.8K (across every task this user has run)

Avg latency: 981 ms (per task, across all submissions)

Tasks run: 375 (15 submissions x ~34 tasks)

Rigs used: 11 (distinct hardware tags)

Category signature

Average score per category across all 15 runs.

Code: 69.3

Reason: 67.7

Tool Use: 66.6

RAG: 69.7

Speed: 67.4
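
The category figures above can be reproduced from the per-category columns of the All submissions table. The following is a minimal sketch, assuming each figure is the unweighted arithmetic mean over the 15 runs, shown to one decimal. The names `CATEGORY_SCORES` and `category_averages` are illustrative and not part of the site.

```python
from statistics import mean

# Per-category scores for the 15 runs, copied from the "All submissions"
# table in row order (rank 1 to rank 15).
CATEGORY_SCORES = {
    "Code": [87.9, 80.8, 83.5, 81.0, 76.8, 71.3, 75.5, 68.8,
             68.7, 61.3, 63.3, 59.2, 55.1, 57.6, 48.1],
    "Reason": [82.3, 86.5, 80.0, 75.7, 71.8, 72.8, 68.7, 63.4,
               61.1, 71.7, 66.6, 61.5, 56.0, 57.8, 39.1],
    "Tool Use": [80.7, 81.7, 79.0, 80.0, 71.8, 72.0, 67.9, 63.4,
                 66.4, 60.2, 69.9, 56.1, 53.9, 52.0, 44.0],
    "RAG": [84.5, 90.0, 88.5, 79.7, 72.7, 77.2, 66.6, 72.3,
            64.1, 62.3, 60.1, 60.6, 62.3, 57.6, 47.3],
    "Speed": [87.6, 81.7, 82.2, 69.2, 70.3, 75.9, 69.7, 65.9,
              61.6, 66.4, 58.9, 65.8, 54.7, 55.1, 45.9],
}


def category_averages(scores):
    """Assumed rule: unweighted mean per category, formatted to one decimal."""
    return {name: f"{mean(values):.1f}" for name, values in scores.items()}


for name, avg in category_averages(CATEGORY_SCORES).items():
    print(f"{name}: {avg}")
```

Under that assumption the sketch yields the same five values as the signature, which suggests the page does not weight runs by token count or task count.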

Hardware mix

Rigs this user benchmarked on.

rtx-4070-12gb: 4 (27%)

m3-pro-36gb: 2 (13%)

cloud-api: 1 (7%)

m3-max-64gb: 1 (7%)

h200-141gb: 1 (7%)

m3-ultra-256gb: 1 (7%)

rtx-4080-16gb-offload: 1 (7%)

rtx-4080-16gb: 1 (7%)

rtx-3060-12gb: 1 (7%)

m2-air-16gb: 1 (7%)

macbook-pro-m3-pro-18gb: 1 (7%)
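
The counts and percentages in this list appear to be a simple tally of the Hardware column in the All submissions table. The sketch below assumes each share is the run count divided by the total number of runs, rounded to a whole percent. `RUN_HARDWARE` and `mix` are hypothetical names used only for illustration.

```python
from collections import Counter

# Hardware tag of each of the 15 runs, copied from the "All submissions"
# table in row order (rank 1 to rank 15).
RUN_HARDWARE = [
    "cloud-api", "m3-max-64gb", "h200-141gb", "m3-ultra-256gb",
    "rtx-4080-16gb-offload", "m3-pro-36gb", "rtx-4070-12gb", "rtx-4070-12gb",
    "m3-pro-36gb", "rtx-4070-12gb", "rtx-4080-16gb", "rtx-4070-12gb",
    "rtx-3060-12gb", "m2-air-16gb", "macbook-pro-m3-pro-18gb",
]


def mix(tags):
    """Return (tag, count, percent) tuples, most frequent first.

    Assumed rule: percent = count / total runs, rounded to a whole number.
    """
    total = len(tags)
    return [(tag, n, round(100 * n / total))
            for tag, n in Counter(tags).most_common()]


for tag, count, percent in mix(RUN_HARDWARE):
    print(f"{tag}: {count} ({percent}%)")
```

The same tally applied to the Provider column of the Models tried table would give the provider shares. Because each share is rounded independently, the percentages need not add up to exactly 100.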

Provider mix

Where they spend their tokens.

deepseek: 2 (13%)

alibaba: 2 (13%)

nous: 2 (13%)

mistral: 2 (13%)

google: 2 (13%)

internlm: 1 (7%)

cognitivecomputations: 1 (7%)

community: 1 (7%)

cohere: 1 (7%)

bigcode: 1 (7%)

Models tried

Best score per model.

#	Model	Provider	Best Score	Tier	Achieved
1	DeepSeek Coder V2 236B	deepseek	84.7	MAINLINE	2026-04-20
2	DeepSeek R1 Distill Qwen 32B	deepseek	84.0	MAINLINE	2026-04-28
3	Qwen 3 235B-A22B MoE	alibaba	83.0	MAINLINE	2026-04-23
4	Hermes 3 Llama 3.1 70B	nous	77.2	MAINLINE	2026-05-14
5	Mistral Small 24B Instruct	mistral	73.2	FEEDER	2026-05-03
6	Gemma 3 12B IT	google	73.2	FEEDER	2026-04-25
7	Mistral Nemo 12B Instruct	mistral	71.0	FEEDER	2026-04-28
8	InternLM 2.5 7B Chat	internlm	66.6	FEEDER	2026-05-17
9	Dolphin 3.0 Llama 3.1 8B	cognitivecomputations	65.2	FEEDER	2026-05-06
10	Hermes 3 Llama 3.1 8B	nous	64.1	FEEDER	2026-04-14
11	LLaVA OneVision 7B	community	63.7	FEEDER	2026-05-14
12	Aya 23 8B	cohere	61.3	FEEDER	2026-05-15
13	StarCoder2 7B	bigcode	56.7	TAP	2026-05-13
14	Qwen 2.5 Coder 1.5B	alibaba	55.0	TAP	2026-04-22
15	Gemma 2 2B IT	google	44.0	TAP	2026-04-19

All submissions

Every run, ordered by score.

#	Model	Hardware	Score	Tier	Code	Reason	Tool Use	RAG	Speed	Tokens	Avg ms	Date
1	DeepSeek Coder V2 236B	cloud-api	84.7	MAINLINE	87.9	82.3	80.7	84.5	87.6	4.9K	900	2026-04-20
2	DeepSeek R1 Distill Qwen 32B	m3-max-64gb	84.0	MAINLINE	80.8	86.5	81.7	90.0	81.7	4.7K	1483	2026-04-28
3	Qwen 3 235B-A22B MoE	h200-141gb	83.0	MAINLINE	83.5	80.0	79.0	88.5	82.2	5.6K	947	2026-04-23
4	Hermes 3 Llama 3.1 70B	m3-ultra-256gb	77.2	MAINLINE	81.0	75.7	80.0	79.7	69.2	4.7K	2907	2026-05-14
5	Mistral Small 24B Instruct	rtx-4080-16gb-offload	73.2	FEEDER	76.8	71.8	71.8	72.7	70.3	5.2K	1458	2026-05-03
6	Gemma 3 12B IT	m3-pro-36gb	73.2	FEEDER	71.3	72.8	72.0	77.2	75.9	5.3K	777	2026-04-25
7	Mistral Nemo 12B Instruct	rtx-4070-12gb	71.0	FEEDER	75.5	68.7	67.9	66.6	69.7	5.1K	797	2026-04-28
8	InternLM 2.5 7B Chat	rtx-4070-12gb	66.6	FEEDER	68.8	63.4	63.4	72.3	65.9	4.6K	781	2026-05-17
9	Dolphin 3.0 Llama 3.1 8B	m3-pro-36gb	65.2	FEEDER	68.7	61.1	66.4	64.1	61.6	5.6K	779	2026-05-06
10	Hermes 3 Llama 3.1 8B	rtx-4070-12gb	64.1	FEEDER	61.3	71.7	60.2	62.3	66.4	5.6K	819	2026-04-14
11	LLaVA OneVision 7B	rtx-4080-16gb	63.7	FEEDER	63.3	66.6	69.9	60.1	58.9	5.0K	814	2026-05-14
12	Aya 23 8B	rtx-4070-12gb	61.3	FEEDER	59.2	61.5	56.1	60.6	65.8	5.4K	798	2026-05-15
13	StarCoder2 7B	rtx-3060-12gb	56.7	TAP	55.1	56.0	53.9	62.3	54.7	6.2K	778	2026-05-13
14	Qwen 2.5 Coder 1.5B	m2-air-16gb	55.0	TAP	57.6	57.8	52.0	57.6	55.1	5.5K	327	2026-04-22
15	Gemma 2 2B IT	macbook-pro-m3-pro-18gb	44.0	TAP	48.1	39.1	44.0	47.3	45.9	5.4K	356	2026-04-19
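
The headline figures at the top of the profile (Avg 68.2, Total tokens 78.8K, Avg latency 981 ms) can be cross-checked against the table above. This is a rough sketch, assuming the averages are unweighted means over the 15 runs and that the token total is the sum of the per-run token counts. The table shows tokens rounded to 0.1K, so the sum here is only approximate. `RUNS` and `summarize` are illustrative names.

```python
# (score, tokens in thousands as displayed, average latency in ms) per run,
# copied from the "All submissions" table in row order.
RUNS = [
    (84.7, 4.9, 900), (84.0, 4.7, 1483), (83.0, 5.6, 947),
    (77.2, 4.7, 2907), (73.2, 5.2, 1458), (73.2, 5.3, 777),
    (71.0, 5.1, 797), (66.6, 4.6, 781), (65.2, 5.6, 779),
    (64.1, 5.6, 819), (63.7, 5.0, 814), (61.3, 5.4, 798),
    (56.7, 6.2, 778), (55.0, 5.5, 327), (44.0, 5.4, 356),
]


def summarize(runs):
    """Assumed rules: unweighted means for score and latency, plain sum for tokens."""
    n = len(runs)
    avg_score = sum(score for score, _, _ in runs) / n
    total_tokens_k = sum(tokens for _, tokens, _ in runs)
    avg_latency = sum(ms for _, _, ms in runs) / n
    return {
        "runs": n,
        "avg_score": f"{avg_score:.1f}",
        "total_tokens": f"{total_tokens_k:.1f}K",
        "avg_latency_ms": round(avg_latency),
    }


print(summarize(RUNS))
```

With these inputs the sketch reproduces the three headline values, so the per-run latency average seems to be taken across submissions rather than weighted by task count.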