Groq and Llama3

What are your opinions on Llama3 70b? Also, the Groq team has made something incredible, I guess. Their response speed is unbelievable right now. If you don't end up in the queue, you get a full output of thousands of tokens in 2 seconds.

What are your thoughts on this?

There are also rumors about GPT-5 coming on Apr 22 :thinking:

We will improve prompting and let you use API keys there. We'll look into a partnership with them too!

They just released the 2024-04-09 version as a buffer for the GPT-5 anticipation. It's not going to happen anytime this month, or the month after, for sure…

Yes, we want Llama3 8b instruct on Groq with 800 tokens/s.

It’s very close to GPT-4 level. The context window isn’t huge, but Cursor covers this already.

That would be an INSANE upswing for Cursor. I sometimes catch myself just going there instead of Cursor or a web-supported AI like Perplexity. So far I’ve always gotten my answer.

Can you share where you got that from? Looking at the coding category on https://chat.lmsys.org/?leaderboard it seems pretty meh. Good for its size, especially since it’s open, but nowhere near GPT-4 from what I can see. Would love to be convinced otherwise.

Read it somewhere on Twitter. Did a few tests on groq.com, and it did very well. Groq has no instruct version, unfortunately.

Testing right now in Cursor, will know in a few days.

Of course, apply does not work, which is slowly making me f*cking angry :slight_smile:

Already 6th place for coding, with way fewer votes, among a bunch of GPT-4s.

GPT-5 will probably have something to do with agents etc. Ofc it will be 5x better than GPT-4 because of the data and training time, but I think the LLM plateau is close…

70b is good but still nothing compared to GPT-4. But it’s the most advanced open-source model for coding, I think. At least according to my observations.

I didn’t like the 8b. I tried Llama3 8b Q6 locally, but it’s not even close to the closed-source models. Not even Haiku, I think.

It would be good to use Groq Llama3 70b as the default, and then fall back to GPT-4 or Claude if the answer isn’t good enough.
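A rough sketch of that routing idea in Python. Everything here is hypothetical: `fast_model`, `strong_model`, and `is_good_enough` are placeholder callables, not Groq, OpenAI, Anthropic, or Cursor APIs. Deciding what counts as "good enough" is the hard part, and this sketch leaves it to the caller.

```python
from typing import Callable, Tuple


def answer_with_fallback(
    prompt: str,
    fast_model: Callable[[str], str],
    strong_model: Callable[[str], str],
    is_good_enough: Callable[[str, str], bool],
) -> Tuple[str, str]:
    """Try the fast model first and escalate only if its answer is rejected.

    All three callables are hypothetical placeholders supplied by the caller.
    Returns (answer, label), where label is "fast" or "strong".
    """
    draft = fast_model(prompt)
    if is_good_enough(prompt, draft):
        return draft, "fast"
    # The draft was rejected, so pay for the slower, stronger model.
    return strong_model(prompt), "strong"
```

With this shape, the strong model is only called when the quality check fails, so most requests would keep the fast model's latency.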

How many tokens does Llama3 use to browse/read a website homepage like https://www.clay.com?

Is there a simple way to evaluate the number of tokens needed for a certain task?
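For a rough ballpark without calling any API, a character-count heuristic is one option. The sketch below assumes about 4 characters per token, a common rule of thumb for English text. It is not Llama3's actual tokenizer, and HTML-heavy pages can tokenize quite differently. For exact counts you would need to run the model's own tokenizer on the text.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count.

    chars_per_token is an assumed average (about 4 for English text),
    not a property of any specific tokenizer.
    """
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return math.ceil(len(text) / chars_per_token)


if __name__ == "__main__":
    # Placeholder text standing in for the extracted text of a homepage.
    page_text = "Some extracted homepage text. " * 100
    print(estimate_tokens(page_text))
```

Whatever the model writes back counts on top of this, so the total for a task is roughly the input estimate plus the expected output length.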

Groq just announced a set of SOTA fine-tuned Llama3 models for function calling. I would love to see these guys integrated into Cursor. x.com

Right now, a huge amount of my dev time is spent sitting and waiting for changes to be applied to files after using the composer UI. If Groq could be used to speed up any part of this cycle, it could save me tons of time.
