Recently, DeepSeek announced its latest model, R1, and article after article came out praising its performance relative to cost and discussing how the release of such open-source models could genuinely change the course of LLMs forever. That is really exciting! It is also too big a scope to write about… but when a model like DeepSeek […]
The post "I Tried Making my Own (Bad) LLM Benchmark to Cheat in Escape Rooms" appeared first on Towards Data Science.
