IEPile Large-Scale Information Extraction Corpus 

Date: 2 years ago
Size: 1.83 MB
Organization: Zhejiang University
Publish URL: github.com

IEPile is a large-scale, high-quality bilingual (Chinese and English) instruction fine-tuning dataset for information extraction (IE), developed by Zhejiang University. It covers three core subtasks: named entity recognition (NER), relation extraction (RE), and event extraction (EE). The dataset contains about 2 million instruction samples, totaling about 320 million tokens, and spans several domains, including general, medical, and financial.

The research team integrated 26 English and 7 Chinese IE datasets and built the instructions with its proposed "schema-based polling instruction construction method". The method has two parts, constructing a hard negative sample dictionary and generating polling instructions, and is intended to keep the dataset quality high. Training on IEPile is reported to significantly improve the performance of large models on information extraction tasks, especially their zero-shot generalization, which makes the dataset a valuable resource for information extraction research.
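The page does not spell out how the schema-based polling method works, so the following is only a minimal sketch of one plausible reading: the labels present in a text are combined with easily confused "hard negative" labels from a dictionary, and the resulting schema is split into small groups, each of which becomes one instruction. The function name `build_polling_instructions`, the parameters `group_size` and `hard_negatives`, the instruction wording, and the JSON layout are all illustrative assumptions, not the IEPile authors' implementation.

```python
import json
import random


def build_polling_instructions(text, positive_labels, all_labels,
                               hard_negatives, group_size=3, seed=0):
    """Hypothetical sketch: split a label schema into groups, one instruction each."""
    rng = random.Random(seed)

    # Start from the labels that actually occur in the text.
    schema = list(positive_labels)

    # Add hard negatives: labels assumed to be easily confused with a positive one.
    for label in positive_labels:
        for neg in hard_negatives.get(label, []):
            if neg not in schema:
                schema.append(neg)

    # Add one ordinary negative chosen at random, if any label is left over.
    remaining = [label for label in all_labels if label not in schema]
    if remaining:
        schema.append(rng.choice(remaining))

    rng.shuffle(schema)

    # Poll the schema in fixed-size groups; each group yields one instruction.
    instructions = []
    for start in range(0, len(schema), group_size):
        group = schema[start:start + group_size]
        instructions.append(json.dumps({
            "instruction": "Extract entities of the listed types from the input.",
            "schema": group,
            "input": text,
        }, ensure_ascii=False))
    return instructions


# Example call with made-up labels and a made-up hard negative dictionary.
example = build_polling_instructions(
    text="Alice joined Acme in 2020.",
    positive_labels=["person", "organization"],
    all_labels=["person", "organization", "location", "date", "event"],
    hard_negatives={"organization": ["location"]},
    group_size=2,
)
for line in example:
    print(line)
```

Under this reading, each instruction asks about only a few labels at a time, and the hard negatives make sure that a model also sees labels it should answer with an empty result.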

IEPile.torrent
Seeding: 2 | Downloading: 0 | Completed: 319 | Total Downloads: 644
  • IEPile/
    • README.md (1.47 KB)
    • README.txt (2.94 KB)
    • data/
      • IEPile-main.zip (1.83 MB)
