Those 10M tokens are only good for tasks that can be handled easily by vector search or a knowledge graph. All models start losing track of context on complex questions beyond about 8k tokens, so you should stay under 8k of context whenever possible.
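To illustrate the vector-search point, here's a minimal sketch of keeping a prompt under that 8k budget by retrieving only the most relevant chunks. This is my own toy example, not anything from LLaMA 4: the `retrieve` helper, the bag-of-words similarity, and the word-count token proxy are all simplifying assumptions (a real setup would use proper embeddings and a real tokenizer).

```python
import math
import re
from collections import Counter

def bow(text):
    """Bag-of-words count vector (lowercased word counts)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(chunks, question, token_budget=8000):
    """Rank chunks by similarity to the question, then greedily keep
    the best-scoring ones that still fit under the token budget
    (word count is used as a crude stand-in for tokens)."""
    q = bow(question)
    ranked = sorted(chunks, key=lambda c: cosine(bow(c), q), reverse=True)
    picked, used = [], 0
    for chunk in ranked:
        cost = len(chunk.split())
        if used + cost > token_budget:
            continue  # skip chunks that would blow the budget
        picked.append(chunk)
        used += cost
    return picked
```

The idea is simply that the model never sees 10M tokens at once; you select a small, relevant slice and send only that.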
From initial investigations, this doesn't look like much of a breakthrough model. It performs very well on some benchmarks but poorly on others, and we don't believe the context window is genuinely functional: if you gave it 10M tokens of context, it wouldn't have real understanding of all of them!
Hey bro, a lot of people on social media have been sharing their test feedback on LLaMA 4, and it seems like the response hasn't been super positive. There's also a ton of juicy gossip floating around; pretty entertaining stuff. If you're interested, you should look it up; it's quite a ride.
I haven’t actually tried or tested LLaMA 4 myself, because based on the general feedback, it doesn’t seem to hold up against the real-world experience of something like Gemini 2.5 Pro Exp. So I probably won’t rush to try it anytime soon. Let’s wait for the dust to settle—public opinion tends to be way more reliable than their own hype anyway.
There is a 'sweet spot' with Gemini 2.5 Pro, though. In my experience, once you get past around the 200k-token mark, such as when pasting entire codebases, returns diminish significantly. I'm using the Google AI Studio UI via RepoPrompt here.