WikiLinks Wikipedia Link Dataset

Date

3 years ago

Size

1.71 GB

Organization

Publish URL

code.google.com

License

CC BY-NC-SA 3.0

WikiLinks is a dataset that supports searching the full text of Wikipedia by paragraph, by phrase, or by a fragment of a paragraph. It treats each Wikipedia page as representing an entity (or concept or idea), collects hyperlinks to those pages found through web searches, and uses the anchor text of each link as a mention of the linked entity. This yields large-scale labeled data without manual annotation.

The dataset includes:

  • Nearly 1.9 billion words from more than 4 million articles
  • 40 million references to 3 million entities
  • 10 compressed text files, named data-0000[0-9]-of-00010.gz
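
The file naming pattern above can be expanded programmatically. The following is a minimal sketch, not an official loader: the helper names `shard_names` and `iter_lines` are hypothetical, UTF-8 decoding is an assumption, and because this page does not describe the record format inside the files, the sketch only streams raw text lines.

```python
import gzip
from pathlib import Path
from typing import Iterator

NUM_SHARDS = 10  # the listing above reports 10 compressed files


def shard_names(num_shards: int = NUM_SHARDS) -> list[str]:
    """Expand the data-0000[0-9]-of-00010.gz pattern into concrete file names."""
    return [f"data-{i:05d}-of-{num_shards:05d}.gz" for i in range(num_shards)]


def iter_lines(data_dir: Path) -> Iterator[str]:
    """Stream text lines from every shard present in data_dir, in shard order.

    Missing shards are skipped, so a partial download can still be read.
    UTF-8 is assumed; undecodable bytes are replaced rather than raising.
    """
    for name in shard_names():
        path = data_dir / name
        if not path.exists():
            continue
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
            for line in fh:
                yield line.rstrip("\n")
```

For example, `for line in iter_lines(Path("WikiLinks/data")): ...` would walk the shards one line at a time without decompressing them to disk. Streaming is used here because the archives are large.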

This dataset was created on September 29, 2012.

WikiLinks.torrent
Seeding: 2, Downloading: 0, Completed: 694, Total Downloads: 695
  • WikiLinks/
    • README.md
      1.33 KB
    • README.txt
      2.67 KB
    • data/
      • README.txt
        6.86 KB
      • data-00000-of-00010.gz
        175.01 MB
      • data-00001-of-00010.gz
        350.24 MB
      • data-00002-of-00010.gz
        525.45 MB
      • data-00003-of-00010.gz
        700.97 MB
      • data-00004-of-00010.gz
        875.93 MB
      • data-00005-of-00010.gz
        1.03 GB
      • data-00006-of-00010.gz
        1.2 GB
      • data-00007-of-00010.gz
        1.37 GB
      • data-00008-of-00010.gz
        1.54 GB
      • data-00009-of-00010.gz
        1.71 GB
