Interactive tools
Six tools for researching, comparing, and pricing AI models. Factual fields such as pricing and context windows are sourced from official provider documentation.
Calculator
Cost Calculator
Enter your token volumes and workload mix. Get a monthly cost estimate per model with caching and batch discounts applied.
Recommender
Model Recommender
Answer a few questions about your task, budget, and quality bar. Get a ranked shortlist of models that fit your requirements.
Compare
Side-by-side Compare
Pick any two or more of the 19 indexed models. Pricing, benchmarks, context windows, and capability ratings side by side.
Charts
Benchmark Charts
Visual comparison of SWE-bench, MMLU, GPQA, and pricing across the full model index. Sortable and filterable.
Tracker
Model Tracker
Live status of announced, available, and deprecated models. Useful for keeping your integration plans current.
Timeline
Release Timeline
Chronological history of every major model release across OpenAI, Anthropic, Google, Meta, Mistral, and the open-weight field.
Also useful
- Pricing index: Input, output, and caching rates for all 19 models.
- Rankings: Models ranked by the benchr Rating composite score.
- Price per use case: Cheapest model for chat, coding, RAG, agents, and classification.
- How to reduce token usage: Caching, batching, and routing tactics.
When each tool is useful
Start with Rankings when you need a broad shortlist. Move to Compare when the decision is between two or three models and the trade-off is not obvious from a single benchmark. Use Calculator last, once you know your expected token mix, because small differences in cache hit rate and output length can change the bill more than the headline input price.
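To make that last point concrete, here is a minimal Python sketch of the arithmetic. The rates and token volumes are hypothetical placeholders, not figures from the pricing index, and `Pricing` and `monthly_cost` are illustrative names rather than a description of how the Calculator is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per million tokens. Every rate used below is a made-up example."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


def monthly_cost(pricing: Pricing, input_tokens: int, output_tokens: int,
                 cache_hit_rate: float) -> float:
    """Estimate a monthly bill in USD.

    cache_hit_rate is the share of input tokens billed at the cached rate (0 to 1).
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached
    total = (uncached * pricing.input_per_m
             + cached * pricing.cached_input_per_m
             + output_tokens * pricing.output_per_m)
    return total / 1_000_000


# Two hypothetical models: model_b has the lower headline input price.
model_a = Pricing(input_per_m=3.00, cached_input_per_m=0.30, output_per_m=15.00)
model_b = Pricing(input_per_m=2.50, cached_input_per_m=0.25, output_per_m=15.00)

INPUT_TOKENS = 1_000_000_000   # assumed monthly input volume
OUTPUT_TOKENS = 100_000_000    # assumed monthly output volume

# Baseline, then one change at a time.
baseline = monthly_cost(model_a, INPUT_TOKENS, OUTPUT_TOKENS, cache_hit_rate=0.2)
cheaper_input = monthly_cost(model_b, INPUT_TOKENS, OUTPUT_TOKENS, cache_hit_rate=0.2)
better_caching = monthly_cost(model_a, INPUT_TOKENS, OUTPUT_TOKENS, cache_hit_rate=0.6)
shorter_output = monthly_cost(model_a, INPUT_TOKENS, 70_000_000, cache_hit_rate=0.2)

for label, cost in [
    ("baseline", baseline),
    ("cheaper input price", cheaper_input),
    ("higher cache hit rate", better_caching),
    ("30% shorter output", shorter_output),
]:
    print(f"{label:<24} ${cost:,.0f}  (saves ${baseline - cost:,.0f})")
```

Under these assumed numbers, cutting the input rate by about 17% saves roughly $410 a month, while raising the cache hit rate from 20% to 60% saves roughly $1,080 and trimming output by 30% saves roughly $450. Real rates and workloads will shift the balance, so treat this as an illustration of why the token mix should be known before pricing.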
The chart and tracker pages are for maintenance work: checking whether a model still fits your benchmark target, watching release cadence, and spotting when a cheaper tier has caught up enough to replace a frontier model. The goal is not to crown one permanent winner, but to make model selection repeatable as prices and scores move.
Data policy
Tool data comes from the shared benchr model index and is reviewed against provider documentation. Pricing, context windows, release dates, and model IDs are treated as factual fields. Composite ratings and some filled-in benchmark scores are editorial estimates, used where providers do not publish directly comparable numbers; those estimates are documented in Methodology.