Abstract
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
Benchmarks
| Benchmark | Model | Metrics |
|---|---|---|
| answerability-prediction-on-peerqa | Llama-3-IT-8B-32k | Macro F1: 0.2881 |
| answerability-prediction-on-peerqa | Llama-3-IT-8B-8k | Macro F1: 0.3112 |
| multi-task-language-understanding-on-mmlu | Llama 3.1 8B (CoT) | Average (%): 73.0 |
| multi-task-language-understanding-on-mmlu | DBRX Instruct 132B (5-shot) | Average (%): 73.7 |
| question-answering-on-peerqa | Llama-3-IT-8B-8k | AlignScore: 0.1098; Prometheus-2 Answer Correctness: 3.1102; Rouge-L: 0.2295 |
| question-answering-on-peerqa | Llama-3-IT-8B-32k | AlignScore: 0.1016; Prometheus-2 Answer Correctness: 3.1673; Rouge-L: 0.2286 |