Llama 3.1 405B has been released

Key Facts about Llama 3.1 405B

(generated by GPT-4o, and I'm not sure it's 100% accurate)

  • Parameters: 405 billion
  • Training Tokens: 15.6 trillion
  • Compute Used: Approximately 3.8e25 floating-point operations (FLOP)
  • Training Duration: Approximately 72 days
  • GPU Usage: Trained using a cluster of 16,000 H100 GPUs
  • GPU Performance: Each GPU sustained roughly 380 teraFLOPS (floating-point operations per second)
  • Compute Intensity: Around twice the compute of GPT-4
  • Status: Most compute-intensive open-weight model to date
  • Multimodal Capabilities: Supports integration of image, video, and speech inputs using a compositional approach
  • Release and Licensing: Publicly released under Llama 3 Community License
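The compute figures above are at least internally consistent: multiplying the GPU count, per-GPU throughput, and training duration reproduces the quoted total. A quick sanity check (using only the numbers from the list above):

```python
# Sanity-check the total-compute figure from the listed facts.
gpus = 16_000              # H100 cluster size
flops_per_gpu = 380e12     # ~380 teraFLOPS sustained per GPU
days = 72                  # approximate training duration
seconds = days * 86_400

total_flop = gpus * flops_per_gpu * seconds
print(f"{total_flop:.2e}")  # ~3.8e25, matching the quoted figure
```

So the ~3.8e25 FLOP total follows directly from the other three numbers, whatever their individual accuracy.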

Benchmark Performance Comparison

| Benchmark | GPT-4o | Llama 3.1-405B |
|---|---|---|
| BoolQ | 0.905 | 0.921 |
| GSM8K | 0.942 | 0.968 |
| HellaSwag | 0.891 | 0.920 |
| HumanEval | 0.921 | 0.854 |
| MMLU_Humanities | 0.802 | 0.818 |
| MMLU_Other | 0.872 | 0.875 |
| MMLU_Social_Sciences | 0.913 | 0.898 |
| MMLU_STEM | 0.696 | 0.831 |
| OpenBookQA | 0.882 | 0.908 |
| PIQA | 0.844 | 0.874 |
| Social IQA | 0.790 | 0.797 |
| TruthfulQA_MC1 | 0.825 | 0.800 |
| Winogrande | 0.822 | 0.867 |

Is there a way to use it inside of Cursor yet?


You can use it via OpenRouter: Meta: Llama 3.1 405B Instruct by meta-llama – Run with a standardized API | OpenRouter
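For anyone wanting to try it outside of Cursor, here is a minimal sketch of calling the model through OpenRouter's OpenAI-compatible chat completions endpoint. The endpoint URL and model ID are OpenRouter's published ones; the environment variable name is an assumption, and you'd need your own API key:

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL_ID = "meta-llama/llama-3.1-405b-instruct"

def build_request(prompt: str) -> dict:
    """Build the JSON payload for a single-turn chat completion."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }

def complete(prompt: str) -> str:
    """Send the prompt to OpenRouter and return the model's reply text."""
    payload = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=payload,
        headers={
            # Assumes your key is exported as OPENROUTER_API_KEY.
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

This won't give you Cursor integration, but it's enough to test the model's quality on your own prompts.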

I think the multimodal capabilities aren’t in the released model. https://www.perplexity.ai/search/recently-released-llama-3-1-40-mXuB3I9BTuqCSZtlPFeBjg

OpenRouter will work; however, it is still not possible to easily switch between API calls and calls linked to the Pro plan (the 500). @truell20


Curious whether anyone who has used it for complex tasks in Cursor can report on its performance.