HyperAI


Language Modelling

Language modeling is the task of predicting the next word or character in a document. Trained language models can be applied to a range of natural language processing tasks, such as text generation, text classification, and question answering. Neural language models displaced N-gram models during the 2010s, and since the early 2020s large language models (LLMs) have become the dominant route to state-of-the-art performance. Model quality is evaluated with metrics such as cross-entropy and perplexity; common datasets include WikiText-103, One Billion Word, Text8, C4, and The Pile.
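The relationship between cross-entropy and perplexity can be illustrated with a minimal sketch. It assumes we already have the probability a model assigned to each observed token; the function names (`cross_entropy`, `perplexity`, `bits_per_token`) and the input format are hypothetical choices for illustration, not part of any particular benchmark's evaluation code.

```python
import math


def cross_entropy(token_probs):
    """Average negative log-likelihood, in nats, of the observed tokens.

    token_probs: probabilities the model assigned to each token that
    actually occurred (hypothetical input format, each in (0, 1]).
    """
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def perplexity(token_probs):
    """Perplexity is the exponential of the cross-entropy (in nats)."""
    return math.exp(cross_entropy(token_probs))


def bits_per_token(token_probs):
    """Cross-entropy converted from nats to bits."""
    return cross_entropy(token_probs) / math.log(2)


if __name__ == "__main__":
    # A model that spreads its probability evenly over 4 choices per token.
    probs = [0.25, 0.25, 0.25, 0.25]
    print(perplexity(probs))
    print(bits_per_token(probs))
```

A model that assigns probability 0.25 to every observed token has a perplexity of about 4: it is as uncertain as a uniform choice among four options. Character-level benchmarks commonly report bits per character rather than perplexity, which is the same quantity expressed in base 2 per character.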

Penn Treebank (Word Level)

GPT-3 (Zero-Shot)

GPT-2 (48 layers, h=1600)

Test-Time Fine-Tuning with SIFT + Llama-3.2 (3B)

SparseGPT (175B, 50% Sparsity)

GPT-3 175B (Few-Shot)

One Billion Word

OmniNetT (Large)

Penn Treebank (Character Level)

Mogrifier LSTM + dynamic eval

Transformer-XL + RMS dynamic eval

Spirit-LM (Expr.)

GLM-130B (3-shot)

FewCLUE (EPRSTMT)

Hybrid 4-gram VietMed-Train + ExtraText

FewCLUE (OCNLI-FC)

FewCLUE (CLUEWSC-FC)

FewCLUE (CHID-FC)

CLUE (CMRC2018)

CLUE (OCNLI_50K)

FewCLUE (BUSTM)

PubMed Cognitive Control Abstracts

PTB Diagnostic ECG Database

USPTO Backgrounds

Transformer-LS (small)

Gutenberg PG-19

PAR Transformer 24B

100 sleep nights of 8 caregivers

2000 HUB5 English

Arxiv HEP-TH citation graph

Curation Corpus

Transformer-LS (small)

Ethereum Phishing Transaction Network

language-modeling-recommendation


© HyperAI
