Meta? The one that released Llama 3.3? The one that actually publishes its work? What are you talking about?
Why is it so hard to believe that deepseek is just yet another amazing paper in a long line of research done by everyone. Just because it’s Chinese? Everyone will adapt to this amazing innovation and then stagnate and throw compute at it until the next one. That’s how research works.
Not to mention China has billions of more people to establish a research community…
Well I think you actually need to train a "discriminator" model on rationality tests. Probably an encoder only model like BERT just to assign a score to thoughts. Then you do monte carlo tree search.