WikiText Long Term Dependency Language Modeling Dataset
The WikiText long-term dependency language modeling dataset contains over 100 million English tokens extracted from Wikipedia's verified Good and Featured articles.
The dataset comes in two versions: WikiText-2 and WikiText-103. Compared with the Penn Treebank (PTB), its vocabulary is much larger, and each sample preserves the full original article, making it well suited to language modeling that depends on long-term context.
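Because full articles are preserved, a common first step is to split the token stream back into per-article units. The sketch below assumes the standard WikiText file conventions (a top-level " = Title = " heading opens each article, deeper " = = Section = = " headings stay inside it, and rare words appear as the `<unk>` token in the tokenized version); the helper name `split_articles` and the sample lines are illustrative, not part of the dataset's official tooling.

```python
import re

def split_articles(lines):
    """Group WikiText-style lines into (title, body) articles, assuming
    the top-level ' = Title = ' heading convention."""
    articles, title, buf = [], None, []
    # Exactly one '=' pair marks a new article; ' = = ... = = ' is a subsection.
    heading = re.compile(r"^ = ([^=].*?) = $")
    for line in lines:
        m = heading.match(line)
        if m:
            if title is not None:
                articles.append((title, "".join(buf)))
            title, buf = m.group(1), []
        elif title is not None:
            buf.append(line)
    if title is not None:
        articles.append((title, "".join(buf)))
    return articles

# Hypothetical sample in WikiText's tokenized format.
sample = [
    " = Valkyria Chronicles III = \n",
    " Senjo no Valkyria 3 is a tactical role @-@ playing game . \n",
    " = = Gameplay = = \n",
    " The player controls units on a map with <unk> mechanics . \n",
]
print(split_articles(sample)[0][0])  # -> Valkyria Chronicles III
```

Keeping articles whole, rather than shuffling sentences, is what lets models trained on WikiText exploit dependencies that span paragraphs.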
The dataset was released by Salesforce Research in 2016; its authors are Stephen Merity, Caiming Xiong, James Bradbury, and Richard Socher. The accompanying paper is "Pointer Sentinel Mixture Models".