
AI Coding Models Leaderboard

A ranked comparison of large language model (LLM) APIs on coding benchmarks. Scores are sourced from official provider documentation, and models are sorted by SWE-bench Verified score.

Data from models.json · Data-driven and neutral
Rank | Model | Provider | SWE-bench Verified | HumanEval | Blended $/1M

Methodology

This leaderboard ranks models by their **SWE-bench Verified** score. SWE-bench Verified is a human-validated subset of SWE-bench and a widely used benchmark for resolving real-world GitHub issues. Where a provider has not officially published a SWE-bench Verified score, we show an estimated or fallback score instead (marked with *est* or *—*).
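
The ranking rule described above can be sketched roughly as follows. This is a minimal illustration, not the site's actual code: the field names (`name`, `swe_bench_verified`, `estimated`), the model names, and the scores are all hypothetical, and the real models.json schema may differ. It also assumes that models without any score are placed last, which the methodology does not state.

```python
from typing import Optional


def rank_models(models: list[dict]) -> list[dict]:
    """Sort by SWE-bench Verified score, highest first.

    Assumption: entries with no score at all sort after every scored entry.
    Ties are broken alphabetically by name so the order is deterministic.
    """
    def sort_key(model: dict):
        score: Optional[float] = model.get("swe_bench_verified")
        return (score is None, -(score or 0.0), model["name"])

    return sorted(models, key=sort_key)


def format_score(model: dict) -> str:
    """Render a score cell, using the page's 'est' and '—' markers."""
    score = model.get("swe_bench_verified")
    if score is None:
        return "—"  # no published or estimated score
    suffix = " est" if model.get("estimated") else ""
    return f"{score:.1f}{suffix}"


# Hypothetical entries; names and numbers are placeholders, not real results.
models = [
    {"name": "model-a", "swe_bench_verified": 61.0, "estimated": False},
    {"name": "model-b", "swe_bench_verified": None},
    {"name": "model-c", "swe_bench_verified": 70.5, "estimated": True},
]

for rank, model in enumerate(rank_models(models), start=1):
    print(rank, model["name"], format_score(model))
```

Sorting on a tuple keeps the "missing goes last" rule separate from the score comparison, so a missing score is never confused with a score of zero.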