Updesh Indic Synthetic Text Dataset

Date: 3 months ago

Size: 16.09 GB

Organization: Microsoft

Updesh is a synthetic text dataset for Indian languages, released by Microsoft in 2025 to support post-training of Large Language Models (LLMs) on those languages.

The dataset contains 6,800,000 inference entries and 2,100,000 generated entries across 13 languages: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Nepali, Odia, Punjabi, Tamil, Telugu, and Urdu.

Updesh_beta.torrent
Seeding: 1 | Downloading: 0 | Completed: 62 | Total downloads: 81
  • Updesh_beta/
    • README.md (1.2 KB)
    • README.txt (2.4 KB)
    • data/
      • Updesh_beta.zip (16.09 GB)
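
Since the archive is about 16 GB, it can be useful to inspect it before extracting anything. The following is a minimal Python sketch, not an official loader: it assumes the torrent was downloaded into the current directory so that the archive sits at `Updesh_beta/data/Updesh_beta.zip` as in the listing above. The helper name `summarize_archive` is hypothetical, and nothing here assumes a particular layout inside the zip.

```python
import zipfile
from pathlib import Path


def summarize_archive(zip_path):
    """Return (file_count, total_uncompressed_bytes, first_names) without extracting."""
    with zipfile.ZipFile(zip_path) as archive:
        # Skip directory entries so only real files are counted.
        infos = [info for info in archive.infolist() if not info.is_dir()]
        total_bytes = sum(info.file_size for info in infos)
        first_names = [info.filename for info in infos[:10]]
    return len(infos), total_bytes, first_names


if __name__ == "__main__":
    # Assumed location, taken from the torrent file listing.
    archive_path = Path("Updesh_beta") / "data" / "Updesh_beta.zip"
    if archive_path.exists():
        count, total_bytes, names = summarize_archive(archive_path)
        print(f"{count} files, {total_bytes / 1e9:.2f} GB uncompressed")
        for name in names:
            print(" ", name)
    else:
        print(f"Archive not found at {archive_path}")
```

Reading only the zip's central directory keeps this fast and avoids needing free disk space for the uncompressed data until the contents are known.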
