LLaMA: Open and Efficient Foundation Language Models

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample

Abstract
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.
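The released checkpoints are most commonly consumed through the huggingface/transformers integration listed under Code Repositories below. As a minimal sketch of that route (the checkpoint path is a hypothetical placeholder; the weights are distributed separately and must first be converted to the Hugging Face format), loading a LLaMA model for generation looks like this:

```python
# Minimal sketch: running a converted LLaMA checkpoint with Hugging Face
# transformers. MODEL_PATH is a hypothetical placeholder; substitute a real
# local directory or Hub repository containing converted weights.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "path/to/converted-llama-7b"  # hypothetical local path

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH)

# Greedy decoding of a short continuation.
inputs = tokenizer("The theory of relativity states that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```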
Code Repositories
| Repository | Framework | Notes |
|---|---|---|
| vcskaushik/LLMzip | pytorch | Mentioned in GitHub |
| icalk-nlp/educhat | pytorch | Mentioned in GitHub |
| abhaskumarsinha/Corpus2GPT | pytorch | |
| kayvr/token-hawk | pytorch | Mentioned in GitHub |
| teelinsan/camoscio | pytorch | Mentioned in GitHub |
| krafton-ai/korani | pytorch | Mentioned in GitHub |
| akanyaani/miniLLAMA | pytorch | |
| beomi/koalpaca | pytorch | Mentioned in GitHub |
| chaoyi-wu/finetune_llama | jax | Mentioned in GitHub |
| freedomintelligence/huatuogpt | pytorch | Mentioned in GitHub |
| phoebussi/alpaca-cot | pytorch | Mentioned in GitHub |
| yuanmu97/secure-transformer-inference | pytorch | Mentioned in GitHub |
| facebookresearch/chai | pytorch | Mentioned in GitHub |
| Mind23-2/MindCode-140 | mindspore | |
| kbressem/medalpaca | pytorch | Mentioned in GitHub |
| xusenlinzy/api-for-open-llm | pytorch | Mentioned in GitHub |
| facebookresearch/llama | pytorch | Official; Mentioned in GitHub |
| aethercortex/llama-x | pytorch | Mentioned in GitHub |
| guinmoon/llmfarm | | Mentioned in GitHub |
| ganjinzero/rrhf | pytorch | Mentioned in GitHub |
| ohadrubin/rpt | jax | Mentioned in GitHub |
| squeezeailab/squeezellm | pytorch | Mentioned in GitHub |
| qwopqwop200/GPTQ-for-LLaMa | pytorch | Mentioned in GitHub |
| tatsu-lab/stanford_alpaca | pytorch | Mentioned in GitHub |
| stanfordbdhg/llama.cpp | | Mentioned in GitHub |
| replicate/cog_stanford_alpaca | pytorch | Mentioned in GitHub |
| zihanzhaosjtu/librisqa | | Mentioned in GitHub |
| huggingface/transformers | pytorch | Mentioned in GitHub |
| ggerganov/llama.cpp | pytorch | Mentioned in GitHub |
| ggml-org/llama.cpp | pytorch | Mentioned in GitHub |
| aozhongzhang/magr | pytorch | Mentioned in GitHub |
| fsoft-ai4code/codecapybara | pytorch | Mentioned in GitHub |
| young-geng/easylm | jax | Mentioned in GitHub |
| grantslatton/llama.cpp | | Mentioned in GitHub |
| chaoyi-wu/pmc-llama | pytorch | Mentioned in GitHub |
| ecolab-postech/owq | pytorch | Mentioned in GitHub |
| meta-llama/llama | pytorch | |
| batsresearch/alfred | pytorch | Mentioned in GitHub |
| llamafamily/llama-chinese | pytorch | Mentioned in GitHub |
| Lightning-AI/lit-llama | pytorch | |
| ntunlplab/traditional-chinese-alpaca | pytorch | Mentioned in GitHub |
| hamishivi/easylm | jax | Mentioned in GitHub |
| flagalpha/llama2-chinese | pytorch | Mentioned in GitHub |
| MS-P3/code5/tree/main/llama | mindspore | |
| longhao-chen/aicas2024 | pytorch | Mentioned in GitHub |
| fajri91/indommlu | pytorch | Mentioned in GitHub |
| ofa-sys/expertllama | pytorch | Mentioned in GitHub |
| ecnu-icalk/educhat | pytorch | Mentioned in GitHub |
| greenbitai/low_bit_llama | pytorch | Mentioned in GitHub |
| facico/chinese-vicuna | pytorch | Mentioned in GitHub |
| xvyaward/owq | pytorch | Mentioned in GitHub |
| xiaoman-zhang/PMC-VQA | pytorch | Mentioned in GitHub |
| MS-P3/code5/tree/main/llama2 | mindspore | |
| xzhang97666/alpacare | | Mentioned in GitHub |
Benchmarks
| Benchmark | Model | Metrics |
|---|---|---|
| arithmetic-reasoning-on-gsm8k | LLaMA 13B | Accuracy: 17.8 Parameters (B): 13 |
| arithmetic-reasoning-on-gsm8k | LLaMA 33B-maj1@k | Accuracy: 53.1 Parameters (B): 33 |
| arithmetic-reasoning-on-gsm8k | LLaMA 7B | Accuracy: 11.0 Parameters (B): 7 |
| arithmetic-reasoning-on-gsm8k | LLaMA 33B | Accuracy: 35.6 Parameters (B): 33 |
| arithmetic-reasoning-on-gsm8k | LLaMA 7B-maj1@k | Accuracy: 18.1 Parameters (B): 7 |
| arithmetic-reasoning-on-gsm8k | LLaMA 65B | Accuracy: 50.9 Parameters (B): 65 |
| arithmetic-reasoning-on-gsm8k | LLaMA 13B-maj1@k | Accuracy: 29.3 Parameters (B): 13 |
| arithmetic-reasoning-on-gsm8k | LLaMA 65B-maj1@k | Accuracy: 69.7 Parameters (B): 65 |
| code-generation-on-mbpp | LLaMA 33B (0-shot) | Accuracy: 30.2 |
| code-generation-on-mbpp | LLaMA 13B (0-shot) | Accuracy: 22 |
| code-generation-on-mbpp | LLaMA 65B (0-shot) | Accuracy: 37.7 |
| code-generation-on-mbpp | LLaMA 7B (0-shot) | Accuracy: 17.7 |
| common-sense-reasoning-on-arc-challenge | LLaMA 65B (zero-shot) | Accuracy: 56.0 |
| common-sense-reasoning-on-arc-challenge | LLaMA 7B (zero-shot) | Accuracy: 47.6 |
| common-sense-reasoning-on-arc-challenge | LLaMA 13B (zero-shot) | Accuracy: 52.7 |
| common-sense-reasoning-on-arc-challenge | LLaMA 33B (zero-shot) | Accuracy: 57.8 |
| common-sense-reasoning-on-arc-easy | LLaMA 13B (0-shot) | Accuracy: 74.8 |
| common-sense-reasoning-on-arc-easy | LLaMA 7B (0-shot) | Accuracy: 72.8 |
| common-sense-reasoning-on-arc-easy | LLaMA 33B (0-shot) | Accuracy: 80.0 |
| common-sense-reasoning-on-arc-easy | LLaMA 65B (0-shot) | Accuracy: 78.9 |
| common-sense-reasoning-on-winogrande | LLaMA 13B (0-shot) | Accuracy: 73.0 |
| common-sense-reasoning-on-winogrande | LLaMA 33B (0-shot) | Accuracy: 76.0 |
| common-sense-reasoning-on-winogrande | LLaMA 7B (0-shot) | Accuracy: 70.1 |
| common-sense-reasoning-on-winogrande | LLaMA 65B (0-shot) | Accuracy: 77.0 |
| few-shot-learning-on-medconceptsqa | meta-llama/Meta-Llama-3-8B-Instruct | Accuracy: 25.653 |
| math-word-problem-solving-on-math | LLaMA 13B | Accuracy: 3.9 Parameters (B): 13 |
| math-word-problem-solving-on-math | LLaMA 13B-maj1@k | Accuracy: 8.8 Parameters (B): 13 |
| math-word-problem-solving-on-math | LLaMA 7B | Accuracy: 2.9 Parameters (B): 7 |
| math-word-problem-solving-on-math | LLaMA 7B-maj1@k | Accuracy: 6.9 Parameters (B): 7 |
| math-word-problem-solving-on-math | LLaMA 65B | Accuracy: 10.6 Parameters (B): 65 |
| math-word-problem-solving-on-math | LLaMA 33B | Accuracy: 7.1 Parameters (B): 33 |
| math-word-problem-solving-on-math | LLaMA 65B-maj1@k | Accuracy: 20.5 Parameters (B): 65 |
| math-word-problem-solving-on-math | LLaMA 33B-maj1@k | Accuracy: 15.2 Parameters (B): 33 |
| multi-task-language-understanding-on-mmlu | LLaMA 65B (fine-tuned) | Average (%): 68.9 |
| multi-task-language-understanding-on-mmlu | LLaMA 65B (5-shot) | Average (%): 63.4 |
| multi-task-language-understanding-on-mmlu | LLaMA 33B (5-shot) | Average (%): 57.8 |
| question-answering-on-boolq | LLaMA 7B (0-shot) | Accuracy: 76.5 |
| question-answering-on-boolq | LLaMA 65B (0-shot) | Accuracy: 85.3 |
| question-answering-on-boolq | LLaMA 33B (0-shot) | Accuracy: 83.1 |
| question-answering-on-boolq | LLaMA 13B (0-shot) | Accuracy: 78.1 |
| question-answering-on-natural-questions | LLaMA 65B (few-shot, k=5) | EM: 35.0 |
| question-answering-on-natural-questions | LLaMA 65B (few-shot, k=64) | EM: 39.9 |
| question-answering-on-natural-questions | LLaMA 33B (zero-shot) | EM: 24.9 |
| question-answering-on-natural-questions | LLaMA 65B (one-shot) | EM: 31.0 |
| question-answering-on-obqa | LLaMA 7B (zero-shot) | Accuracy: 57.2 |
| question-answering-on-obqa | LLaMA 13B (zero-shot) | Accuracy: 56.4 |
| question-answering-on-obqa | LLaMA 65B (zero-shot) | Accuracy: 60.2 |
| question-answering-on-obqa | LLaMA 33B (zero-shot) | Accuracy: 58.6 |
| question-answering-on-piqa | LLaMA 33B (0-shot) | Accuracy: 82.3 |
| question-answering-on-piqa | LLaMA 7B (0-shot) | Accuracy: 79.8 |
| question-answering-on-piqa | LLaMA 13B (0-shot) | Accuracy: 80.1 |
| question-answering-on-piqa | LLaMA 65B (0-shot) | Accuracy: 82.8 |
| question-answering-on-social-iqa | LLaMA 13B (zero-shot) | Accuracy: 50.4 |
| question-answering-on-social-iqa | LLaMA 7B (zero-shot) | Accuracy: 48.9 |
| question-answering-on-social-iqa | LLaMA 65B (zero-shot) | Accuracy: 52.3 |
| question-answering-on-social-iqa | LLaMA 33B (zero-shot) | Accuracy: 50.4 |
| question-answering-on-timequestions | Llama3 | P@1: 17.8 |
| question-answering-on-triviaqa | LLaMA 65B (few-shot, k=64) | EM: 73.0 |
| question-answering-on-triviaqa | LLaMA 65B (one-shot) | EM: 71.6 |
| question-answering-on-triviaqa | LLaMA 65B (few-shot, k=5) | EM: 72.6 |
| question-answering-on-triviaqa | LLaMA 65B (zero-shot) | EM: 68.2 |
| question-answering-on-truthfulqa | LLaMA 65B | % info: 53 % true: 57 |
| question-answering-on-truthfulqa | LLaMA 7B | % info: 29 % true: 33 |
| question-answering-on-truthfulqa | LLaMA 13B | % info: 41 % true: 47 |
| question-answering-on-truthfulqa | LLaMA 33B | % info: 48 % true: 52 |
| reading-comprehension-on-race | LLaMA 33B (zero-shot) | Accuracy (High): 48.3 Accuracy (Middle): 64.1 |
| reading-comprehension-on-race | LLaMA 65B (zero-shot) | Accuracy (High): 51.6 Accuracy (Middle): 67.9 |
| reading-comprehension-on-race | LLaMA 7B (zero-shot) | Accuracy (High): 46.9 Accuracy (Middle): 61.1 |
| reading-comprehension-on-race | LLaMA 13B (zero-shot) | Accuracy (High): 47.2 Accuracy (Middle): 61.6 |
| stereotypical-bias-analysis-on-crows-pairs | LLaMA 65B | Age: 70.1 Disability: 66.7 Gender: 70.6 Nationality: 64.2 Overall: 66.6 Physical Appearance: 77.8 Race/Color: 57.0 Religion: 70.6 Sexual Orientation: 81.0 Socioeconomic status: 71.5 |
| zero-shot-learning-on-medconceptsqa | meta-llama/Meta-Llama-3-8B-Instruct | Accuracy: 25.840 |
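In the rows above, maj1@k denotes majority voting: k solutions are sampled per problem and the most frequent final answer is taken as the prediction, which accounts for the large gains over single-sample accuracy on GSM8K and MATH. EM denotes exact-match accuracy against the reference answers. A minimal sketch of the maj1@k voting step (the sampled answers are illustrative, not taken from the paper):

```python
from collections import Counter

def majority_vote(final_answers):
    """maj1@k: return the most frequent final answer among k sampled solutions."""
    return Counter(final_answers).most_common(1)[0][0]

# Illustrative: final answers extracted from k = 8 sampled chain-of-thought
# solutions to one GSM8K-style problem (values are made up).
samples = ["18", "18", "17", "18", "20", "18", "17", "18"]
print(majority_vote(samples))  # -> "18"
```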