Language Modelling On The Pile

Evaluation Metrics

Bits per byte
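
Bits per byte (BPB) is the model's total cross-entropy over a document, converted to base 2 and divided by the document's length in UTF-8 bytes; lower is better, and normalizing by bytes rather than tokens keeps the metric comparable across tokenizers. A minimal sketch of one way to compute it is below, assuming the Hugging Face transformers API; the gpt2 checkpoint and the sample text are purely illustrative.

```python
import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Illustrative checkpoint; any causal LM evaluated on Pile text works the same way.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def bits_per_byte(text: str) -> float:
    """Total negative log-likelihood over the text, in bits,
    divided by the number of UTF-8 bytes of the original text."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    # out.loss is the mean cross-entropy per predicted token, in nats.
    n_predicted = enc["input_ids"].size(1) - 1
    total_nats = out.loss.item() * n_predicted
    n_bytes = len(text.encode("utf-8"))
    return total_nats / (math.log(2) * n_bytes)

print(bits_per_byte("The Pile is an 800GB dataset of diverse text."))
```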

Evaluation Results

Performance of each model on this benchmark

| Model | Bits per byte | Paper |
|---|---|---|
| GPT-2 Small 124M (pre-trained) | 1.2253 | The Pile: An 800GB Dataset of Diverse Text for Language Modeling |
| GPT-2 Medium 355M (pre-trained) | 1.0928 | The Pile: An 800GB Dataset of Diverse Text for Language Modeling |
| GPT-2 Large 774M (pre-trained) | 1.0828 | The Pile: An 800GB Dataset of Diverse Text for Language Modeling |
| GPT-2 XL 1.5B (pre-trained) | 1.0468 | The Pile: An 800GB Dataset of Diverse Text for Language Modeling |
| GPT-3 Ada 350M (pre-trained) | 0.9631 | The Pile: An 800GB Dataset of Diverse Text for Language Modeling |
| GPT-3 Babbage 1.3B (pre-trained) | 0.8718 | The Pile: An 800GB Dataset of Diverse Text for Language Modeling |
| Test-Time Fine-Tuning with SIFT + GPT-2 (124M) | 0.862 | Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs |
| GPT-2 Large 774M (test-time training on nearest neighbors) | 0.85 | Test-Time Training on Nearest Neighbors for Large Language Models |
| Llama-3.2-Instruct 1B | 0.807 | Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs |
| GPT-3 Curie 6.7B (pre-trained) | 0.7980 | The Pile: An 800GB Dataset of Diverse Text for Language Modeling |
| Test-Time Fine-Tuning with SIFT + GPT-2 (774M) | 0.762 | Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs |
| GPT-3 | 0.742 | GLM-130B: An Open Bilingual Pre-trained Model |
| Llama-3.2-Instruct 3B | 0.737 | Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs |
| Gemma-2 2B | 0.721 | Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs |
| GPT-3 Davinci 175B (pre-trained) | 0.7177 | The Pile: An 800GB Dataset of Diverse Text for Language Modeling |
| Llama-3.2 1B | 0.697 | Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs |
| Phi-3 3.8B | 0.679 | Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs |
| Phi-3 7B | 0.678 | Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs |
| Gemma-2 9B | 0.670 | Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs |
| Phi-3 14B | 0.651 | Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs |