we’ve been tracking the deepseek threads extensively in LS. related reads:
i consider the deepseek v3 paper required preread (GitHub: deepseek-ai/DeepSeek-V3)
R1 + Sonnet > R1 or O1 or R1+R1 or O1+Sonnet or any other combo: "R1+Sonnet set SOTA on aider’s polyglot benchmark" (aider)
independent repros: 1) a Notion page 2) https://buttondown.com/ainews/archive/ainews-tinyzero-reprod… 3) a post on x.com
R1 distillations are going to hit us every few days - because it’s ridiculously easy (<$400, <48hrs) to improve any base model with these chains of thought, e.g. with the Sky-T1 recipe (writeup https://buttondown.com/ainews/archive/ainews-bespoke-stratos… , 23min interview with the team https://www.youtube.com/watch?v=jrf76uNs77k)
i probably have more resources but don’t want to spam - seek out the latent space discord if you want the full stream i pulled these notes from
This article linked in the thread is great:
and the Latent Space mailings are top.
Fantastic thread.