Llama 3.1 405B has been released

Key Facts about Llama 3.1 405B

(generated by GPT-4o, and I'm not sure it's 100% accurate)

  • Parameters: 405 billion
  • Training Tokens: 15.6 trillion
  • Compute Used: Approximately 3.8e25 floating-point operations (FLOP)
  • Training Duration: Approximately 72 days
  • GPU Usage: Trained using a cluster of 16,000 H100 GPUs
  • GPU Performance: Each GPU sustained roughly 380 teraFLOPS (floating-point operations per second)
  • Compute Intensity: Around twice the compute of GPT-4
  • Status: Most compute-intensive open-weight model to date
  • Multimodal Capabilities: Supports integration of image, video, and speech inputs using a compositional approach
  • Release and Licensing: Publicly released under Llama 3 Community License
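The compute figures above are at least internally consistent: multiplying the GPU count, per-GPU throughput, and training duration reproduces the quoted total. A quick sanity check (using only the numbers from the list above):

```python
# Sanity-check the total-compute figure from the listed facts.
gpus = 16_000              # H100 cluster size
flops_per_gpu = 380e12     # ~380 teraFLOPS sustained per GPU
days = 72                  # approximate training duration
seconds = days * 86_400

total_flop = gpus * flops_per_gpu * seconds
print(f"{total_flop:.2e}")  # ~3.8e25, matching the quoted figure
```

So the ~3.8e25 FLOP total follows directly from the other three numbers, whatever their individual accuracy.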

Benchmark Performance Comparison

| Benchmark | GPT-4o | Llama 3.1-405B |
|---|---|---|
| BoolQ | 0.905 | 0.921 |
| GSM8K | 0.942 | 0.968 |
| HellaSwag | 0.891 | 0.920 |
| HumanEval | 0.921 | 0.854 |
| MMLU_Humanities | 0.802 | 0.818 |
| MMLU_Other | 0.872 | 0.875 |
| MMLU_Social_Sciences | 0.913 | 0.898 |
| MMLU_STEM | 0.696 | 0.831 |
| OpenBookQA | 0.882 | 0.908 |
| PIQA | 0.844 | 0.874 |
| Social IQA | 0.790 | 0.797 |
| TruthfulQA_MC1 | 0.825 | 0.800 |
| Winogrande | 0.822 | 0.867 |

Is there a way to use it inside of Cursor yet?


You can use it via OpenRouter: Meta: Llama 3.1 405B Instruct by meta-llama – Run with a standardized API | OpenRouter
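For anyone wanting to try it outside of Cursor, here is a minimal sketch of calling the model through OpenRouter's OpenAI-compatible chat completions endpoint. The endpoint URL and model ID are OpenRouter's published ones; the environment variable name is an assumption, and you'd need your own API key:

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL_ID = "meta-llama/llama-3.1-405b-instruct"

def build_request(prompt: str) -> dict:
    """Build the JSON payload for a single-turn chat completion."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }

def complete(prompt: str) -> str:
    """Send the prompt to OpenRouter and return the model's reply text."""
    payload = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=payload,
        headers={
            # Assumes your key is exported as OPENROUTER_API_KEY.
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

This won't give you Cursor integration, but it's enough to test the model's quality on your own prompts.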

I think the multimodal capabilities aren’t in the released model. https://www.perplexity.ai/search/recently-released-llama-3-1-40-mXuB3I9BTuqCSZtlPFeBjg

OpenRouter will work; however, it is still not possible to easily switch between API calls and calls linked to the Pro plan (the 500). @truell20


Curious whether anyone who has used it for complex tasks in Cursor can report on its performance.