Latest on DeepSeek R2 Release Date
April 24 Event Still Stands
- The “DeepSeek R2 Model Release” online event is still scheduled for April 24, 2025 (8–9 PM PDT) on Eventbrite, suggesting an official announcement or launch around that time.
- No new denials or delays have been reported since the March 17 rumor was debunked.
Possible Rollout Shortly After
- If DeepSeek follows industry trends, the model could become publicly available within days or weeks after the April 24 event.
No Official Confirmation Yet
- DeepSeek’s website and social media have not yet announced a hard release date, so the event is the best indicator for now.
Multihead Latent Attention (MLA) – Simplified Explanation
MLA is a more efficient version of the “attention” mechanism used in AI models like ChatGPT. Here’s the breakdown:
1. Normal Attention (Like in ChatGPT)
- The AI reads all words in a sentence and decides which ones are most important (like highlighting key notes in a textbook).
- Problem: This can be slow and expensive for long texts because it checks everything in detail.
2. Multihead Latent Attention (MLA) – The Upgrade
- “Latent” (Hidden Compression): Instead of storing full-size details for every word, the AI compresses them into a much smaller “latent” summary (like bullet points), which shrinks the memory it has to keep around while generating text.
- “Multihead” (Specialized Focus): Different “heads” (mini-experts) still analyze different aspects of the text (e.g., one for grammar, one for meaning), each reconstructing what it needs from the compressed summary.
- Result: The AI gets the same (or better) understanding while using far less memory and compute, because it stores a compact summary instead of every detail.
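The compression idea above can be sketched in a few lines of NumPy. This is only an illustrative toy, not DeepSeek's actual architecture: all dimensions, weight names (`W_down`, `W_up_k`, `W_up_v`), and sizes are made up. It shows the core trade: cache one small latent vector per token instead of full-size keys and values, then reconstruct keys/values from the latent cache when attending.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, d_model = 1024, 512   # hypothetical context length and model width
d_latent = 64                  # much smaller latent dimension (assumption)

x = rng.standard_normal((seq_len, d_model))

# Standard attention: cache full-size keys and values for every token.
W_k = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
W_v = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
k_cache = x @ W_k              # (seq_len, d_model)
v_cache = x @ W_v              # (seq_len, d_model)

# Latent-style: cache one small vector per token; reconstruct K/V on the fly.
W_down = rng.standard_normal((d_model, d_latent)) / np.sqrt(d_model)
W_up_k = rng.standard_normal((d_latent, d_model)) / np.sqrt(d_latent)
W_up_v = rng.standard_normal((d_latent, d_model)) / np.sqrt(d_latent)
latent_cache = x @ W_down      # (seq_len, d_latent)

standard_floats = k_cache.size + v_cache.size
latent_floats = latent_cache.size
print(f"standard KV cache: {standard_floats} floats")
print(f"latent cache:      {latent_floats} floats")
print(f"reduction:         {standard_floats / latent_floats:.0f}x")

# Attending still works: rebuild keys/values from the compressed cache.
q = x[-1] @ W_k                         # query for the newest token
k_rec = latent_cache @ W_up_k           # reconstructed keys (seq_len, d_model)
scores = k_rec @ q / np.sqrt(d_model)
weights = np.exp(scores - scores.max())
weights /= weights.sum()                # softmax over past tokens
out = weights @ (latent_cache @ W_up_v)  # weighted sum of reconstructed values
print("attention output shape:", out.shape)
```

With these toy sizes the latent cache is 16x smaller than the standard KV cache; the real savings in a production model depend on head count, latent width, and other details not shown here.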
Why It Matters for DeepSeek R2
- Speed: MLA could make R2 substantially more efficient than older models (figures as high as 40x have circulated, but remain unconfirmed).
- Cost: Smaller memory and compute requirements would make it cheaper to run.
- Performance: Better at long-context tasks (coding, documents, etc.).
What to Expect Next?
- April 24 (PDT): Likely the official R2 reveal.
- Late April/Early May: Possible API or public release.
- MLA in Action: If R2 uses MLA, users should notice faster, cheaper, and more accurate responses compared to older models.