Mol-Instructions Large-scale Biomolecular Instruction Dataset

Date: a year ago

Size: 260.89 MB

Organization: Zhejiang University

Publish URL: github.com

Paper URL: arxiv.org

Mol-Instructions is a large-scale biomolecular instruction dataset designed for large language models. It was created by a research team from Zhejiang University in 2024. The related paper, "Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models", was accepted at ICLR 2024.

The dataset contains three types of instructions: molecule-oriented instructions, protein-oriented instructions, and biomolecular text instructions. It aims to provide rich instruction data that improves the understanding and prediction capabilities of large language models in the biomolecular domain.

The molecule-oriented part contains 148,400 instructions covering the basic properties and behaviors of small molecules, including a variety of chemical reaction and molecular design tasks. The protein-oriented part contains 505,000 instructions covering protein structure, function, and activity prediction, as well as protein design from text instructions. The biomolecular text part contains 53,000 instructions, mainly intended for natural language processing tasks in bioinformatics and cheminformatics.
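To put the three counts in proportion, the short sketch below sums them and computes each category's share of the total. The figures are taken directly from the description above; the names `COUNTS` and `summarize` are illustrative only and are not part of the dataset or its tooling.

```python
# Instruction counts as stated in the dataset description.
COUNTS = {
    "molecule-oriented": 148_400,
    "protein-oriented": 505_000,
    "biomolecular text": 53_000,
}


def summarize(counts):
    """Return the total count and each category's share of it, in percent."""
    total = sum(counts.values())
    shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
    return total, shares


total, shares = summarize(COUNTS)
print(f"total: {total:,}")  # total: 706,400
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

By these figures, protein-oriented instructions make up roughly 71.5% of the 706,400 instructions, molecule-oriented about 21.0%, and biomolecular text about 7.5%.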

Mol-Instructions.torrent
  • Mol-Instructions/
    • README.md (1.69 KB)
    • README.txt (3.39 KB)
    • data/
      • Mol-Instructions.zip (260.89 MB)
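
The listing does not describe what is inside Mol-Instructions.zip, so a reasonable first step after downloading is to inspect the archive before extracting it. The following is a minimal sketch using only Python's standard `zipfile` module. The local path is an assumption based on the file tree above, and `list_archive` is a hypothetical helper name, not something shipped with the dataset.

```python
import zipfile
from pathlib import Path


def list_archive(zip_path):
    """Return (member name, uncompressed size in bytes) for each file in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()  # skip directory entries
        ]


if __name__ == "__main__":
    # Assumed location, mirroring the torrent's folder layout.
    path = Path("Mol-Instructions/data/Mol-Instructions.zip")
    if path.exists():
        for name, size in list_archive(path):
            print(f"{size:>12,d}  {name}")
    else:
        print(f"{path} not found; download the archive first")
```

Listing members first shows how the instruction files are organized and which formats they use, without unpacking the full 260.89 MB archive.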
