[ agent + llm stack leaderboard ]

updated / 0 stacks
new Opus 4.8 released: SWE-bench 88.6%
# | agent + llm | score ▼ | ctx | cost_i/o | tier