MusicPile Large Music Dataset

Date

2 years ago

Size

6.33 GB

Organization

MusicPile is a large-scale music-language pre-training dataset jointly released by the Multimodal Art Projection Research Community, Skywork AI, and the Hong Kong University of Science and Technology. It contains 5.17 million samples and approximately 4.16 billion tokens drawn from online corpora, encyclopedias, music books, YouTube music subtitles, works in ABC notation, mathematical content, and code. Each record has three fields (id, text, and src), and each text is at most 2,048 tokens long. MusicPile covers general music knowledge, question-and-answer pairs, and core music theory, content intended to improve the music understanding and generation capabilities of large language models.
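As a rough illustration of the record layout described above, the following Python sketch checks that each record carries the three fields and tallies samples per source. The sample records, the src values, and the whitespace-based token count are assumptions made for illustration; the description does not name the tokenizer or the on-disk file format, so this is not the dataset's own tooling.

```python
from collections import Counter

REQUIRED_FIELDS = {"id", "text", "src"}
MAX_TOKENS = 2048  # per-text limit stated in the dataset description


def count_tokens(text):
    # Assumption: whitespace splitting stands in for the real tokenizer,
    # which the description does not name.
    return len(text.split())


def summarize(records):
    """Check required fields and tally samples per source."""
    per_source = Counter()
    too_long = 0
    for record in records:
        missing = REQUIRED_FIELDS - set(record)
        if missing:
            raise ValueError(f"record is missing fields: {sorted(missing)}")
        per_source[record["src"]] += 1
        if count_tokens(record["text"]) > MAX_TOKENS:
            too_long += 1
    return per_source, too_long


# Hypothetical records for illustration only; not taken from the dataset.
sample = [
    {"id": 0, "text": "X:1\nT:Example\nK:C\nCDEF|", "src": "abc"},
    {"id": 1, "text": "A major scale has seven notes.", "src": "book"},
    {"id": 2, "text": "What is a triad?", "src": "book"},
]
per_source, too_long = summarize(sample)

# Average text length implied by the published totals
# (about 4.16 billion tokens over 5.17 million samples).
avg_tokens = 4.16e9 / 5.17e6
```

The published totals imply an average of roughly 800 tokens per sample, well under the 2,048-token cap.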

MusicPile.torrent
Seeding: 1 | Downloading: 0 | Completed: 230 | Total downloads: 491
  • MusicPile/
    • README.md (1.3 KB)
    • README.txt (2.61 KB)
    • data/
      • MusicPile.zip (6.33 GB)
