Ranks upside down

Rank models by where they need to improve.

AI is doing real work. We collect honest failure signals so better LLM services can be built.

12 improvement signals. Models only.

We think AI is already useful, and we also think the weak spots should be visible. This page ranks public LLMs, not agent apps. Codex, Claude Code, OpenCode, OpenClaw, PAI and similar tools belong on a separate app page.

Top 4 user concerns

Vote here first

Other signals to watch

Still useful, just lower priority