PipelineScore
User profile

tool-caller

Runs: 2
First seen: 2026-05-07
Avg: 76.2
Best PipelineScore: 79.3 (MAINLINE)
Total tokens: 11.5K (across every task this user has run)
Avg latency: 823 ms (per task, across all submissions)
Tasks run: 50 (2 submissions x ~25 tasks)
Rigs used: 2 (distinct hardware tags)

Category signature

Average score per category across all 2 runs.

Code: 79.4
Reason: 74.3
Write: 75.8
Tool Use: 75.7
RAG: 74.7
Speed: 75.8

Hardware mix

Rigs this user benchmarked on.

cloud-api: 1 (50%)
m3-max-128gb: 1 (50%)

Provider mix

Where they spend their tokens.

alibaba: 1 (50%)
meta: 1 (50%)

Models tried

Best score per model.

#  Model         Best Score  Tier
1  Qwen 3.6 72B  79.3        MAINLINE
2  Llama 4 405B  73.1        FEEDER

All submissions

Every run, ordered by score.

#  Model         Score  Tier
1  Qwen 3.6 72B  79.3   MAINLINE
2  Llama 4 405B  73.1   FEEDER