The Paradox of VibeThinker:3B — SOTA Reasoning, Futile Agency
Why does a 3B parameter model that outperforms frontier models on AIME and LiveCodeBench fail entirely at agentic coding?
See all tags
1 post in total
Why does a 3B parameter model that outperforms frontier models on AIME and LiveCodeBench fail entirely at agentic coding?