vibethinker - whysooraj's Portfolio

Why does a 3B-parameter model that outperforms frontier models on AIME and LiveCodeBench fail entirely at agentic coding?