Automated translation metrics summarized for 335 language directions.
Sources — Segments come from curated, public-facing parallel corpora and benchmark-style collections, mixed for broad coverage. They are not model-generated: originals and references are assembled upstream as parallel text only; this dashboard does not redistribute raw corpora.
Mixed parallel corpora — Parallel segments spanning diverse genres and corpus scales. Each item includes a certified, human-verified reference translation, not one produced by the models being benchmarked.
| Model | fluency_v2.0 rank | Cost rank | Provider | fluency_v2.0 | chrF | BLEU | COMET | sacreBLEU | Len | Cost/1k | Badges |
|---|---|---|---|---|---|---|---|---|---|---|---|
| algebras_router_agentic | 1 | 1 | Algebras | 5.00 | 32.3 | 0.0 | 0.863 | 0.0 | 0.02 | $0.001 (algebras.ai/pricing) | ✨ Best fluency, 💰 Cheapest |
| Gemini 3 Flash (thinking-minimal) `google/gemini-3-flash-preview` | 1 | 2 | Google | 5.00 | 28.3 | 2.4 | 0.868 | 9.2 | 0.06 | $0.005 | ✨ Best fluency |
| Gemini 3 Flash `google/gemini-3-flash-preview` | 1 | 2 | Google | 5.00 | 28.4 | 0.0 | 0.863 | 0.0 | 0.02 | $0.005 | ✨ Best fluency |
| Seed 2.0 (ByteDance) `bytedance-seed/seed-2.0-lite` | 10 | 4 | ByteDance | 4.87 | 22.3 | 0.0 | 0.870 | 0.0 | 0.02 | $0.010 | |
| Qwen 3.5+ (max class) `qwen/qwen3.5-plus-02-15` | 12 | 5 | Alibaba | 4.78 | 28.7 | 2.4 | 0.860 | 9.2 | 0.06 | $0.012 | |
This release reports translation quality by language pair: medians and spread of automated scores (fluency, chrF, COMET, BLEU, sacreBLEU, length ratio) aggregated across evaluated directions. Run metadata describes recipes, metrics, and segment counts.
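The sketch below illustrates how per-direction corpus scores and the cross-direction median/spread aggregation could be computed. It is a minimal example assuming the open-source `sacrebleu` package and toy data; the direction labels, segment strings, and helper function names are illustrative, and COMET and the fluency_v2.0 metric (computed separately for this dashboard) are omitted.

```python
# Minimal sketch: per-direction corpus metrics and cross-direction aggregation.
# Assumes the `sacrebleu` package; COMET and fluency_v2.0 scoring are out of scope.
# Direction labels, segments, and function names below are illustrative only.
from statistics import median, pstdev

import sacrebleu


def score_direction(hypotheses: list[str], references: list[str]) -> dict:
    """Compute chrF, BLEU, and a simple length ratio for one language direction."""
    bleu = sacrebleu.corpus_bleu(hypotheses, [references])
    chrf = sacrebleu.corpus_chrf(hypotheses, [references])
    # Length ratio: total hypothesis tokens over total reference tokens.
    len_ratio = sum(len(h.split()) for h in hypotheses) / max(
        1, sum(len(r.split()) for r in references)
    )
    return {"BLEU": bleu.score, "chrF": chrf.score, "Len": len_ratio}


def aggregate(per_direction: dict[str, dict]) -> dict:
    """Median and spread (population std dev) of each metric across directions."""
    out = {}
    for metric in ("BLEU", "chrF", "Len"):
        values = [scores[metric] for scores in per_direction.values()]
        out[metric] = {"median": median(values), "spread": pstdev(values)}
    return out


# Toy example with two hypothetical directions.
per_dir = {
    "en-de": score_direction(["Hallo Welt"], ["Hallo Welt"]),
    "en-fr": score_direction(["Bonjour le monde"], ["Bonjour tout le monde"]),
}
print(aggregate(per_dir))
```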