LEADERBOARD

Language Models

Large language models for text generation, reasoning, and analysis

15 tools ranked

Language Models Rankings

Ranked by overall ToolRoute Score across all benchmark dimensions

Rank  Tool Name                     ToolRoute Score  Output  Reliability  Efficiency  Cost  Trust  Stars
🥇    GPT-4o (Official)             92.0             94.0    90.0         85.0        45.0  95.0   52,000
🥈    Claude 3.5 Sonnet (Official)  91.0             93.0    92.0         88.0        50.0  96.0   28,000
🥉    Claude 3 Opus (Official)      90.0             92.0    90.0         75.0        35.0  96.0   28,000
#4    OpenAI MCP (Official)         88.0             90.0    86.0         82.0        45.0  92.0   8,500
#5    Gemini Pro (Official)         88.0             88.0    86.0         90.0        55.0  92.0   18,000
#6    Anthropic MCP (Official)      87.0             90.0    88.0         80.0        50.0  93.0   6,200
#7    Mistral Large (Official)      86.0             86.0    84.0         88.0        60.0  88.0   12,000
#8    DeepSeek V3                   85.0             87.0    82.0         90.0        90.0  78.0   45,000
#9    Llama 3                       84.0             85.0    82.0         92.0        95.0  80.0   65,000
#10   Command R+ (Official)         83.0             82.0    84.0         80.0        65.0  86.0   8,000
#11   Qwen 2.5                      82.0             84.0    80.0         88.0        92.0  76.0   32,000
#12   Gemini Flash (Official)       82.0             80.0    84.0         95.0        80.0  90.0   18,000
#13   Grok-2 (Official)             80.0             82.0    78.0         84.0        55.0  82.0   9,000
#14   Yi-Large                      79.0             80.0    78.0         86.0        88.0  74.0   7,000
#15   Phi-4 (Official)              78.0             76.0    80.0         94.0        95.0  85.0   15,000

Score Guide

9.0+ Exceptional
8.0+ Excellent
7.0+ Good
6.0+ Fair
<6.0 Below Average
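
The guide above can be read as a simple threshold lookup. The sketch below is illustrative only: `score_band` and its `scale` parameter are hypothetical names, not part of any ToolRoute API. The guide's thresholds look like a 10-point scale while the leaderboard scores appear to be reported out of 100, so the rescaling step is an assumption rather than documented behavior.

```python
# Thresholds copied from the Score Guide, highest first.
BANDS = [
    (9.0, "Exceptional"),
    (8.0, "Excellent"),
    (7.0, "Good"),
    (6.0, "Fair"),
]


def score_band(score: float, scale: float = 10.0) -> str:
    """Return the Score Guide label for a score.

    `scale` is the maximum of the scale the score is reported on.
    Rescaling to a 10-point scale is an assumption made for this
    sketch; the page does not state how the two scales relate.
    """
    normalized = score * 10.0 / scale
    for threshold, label in BANDS:
        if normalized >= threshold:
            return label
    return "Below Average"


if __name__ == "__main__":
    # A leaderboard-style score out of 100, and a score already on the 10-point scale.
    print(score_band(92.0, scale=100.0))
    print(score_band(5.9))
```

Under that assumption, a ToolRoute Score of 92.0 out of 100 would land in the "Exceptional" band and 78.0 in "Good".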

Contribute Benchmark Data

Help improve these rankings by submitting real-world telemetry. Contributors earn routing credits for every data point.