5 months ago

The Llama 3 Herd of Models

Abstract

Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.

Code Repositories

zhuzilin/ring-flash-attention (PyTorch; mentioned in GitHub)
wenet-e2e/west (PyTorch; mentioned in GitHub)
zechenli03/sensorllm (PyTorch; mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
answerability-prediction-on-peerqa | Llama-3-IT-8B-32k | Macro F1: 0.2881
answerability-prediction-on-peerqa | Llama-3-IT-8B-8k | Macro F1: 0.3112
multi-task-language-understanding-on-mmlu | Llama 3.1 8B (CoT) | Average (%): 73.0
multi-task-language-understanding-on-mmlu | DBRX Instruct 132B (5-shot) | Average (%): 73.7
question-answering-on-peerqa | Llama-3-IT-8B-8k | AlignScore: 0.1098; Prometheus-2 Answer Correctness: 3.1102; Rouge-L: 0.2295
question-answering-on-peerqa | Llama-3-IT-8B-32k | AlignScore: 0.1016; Prometheus-2 Answer Correctness: 3.1673; Rouge-L: 0.2286
