APM Protein Generation Dataset
Date: 3 months ago
Size: 9.06 GB
Publish URL
Paper URL
License: Other
This is a protein generation dataset released in 2025 by Hunan University, the University of Chinese Academy of Sciences, and the ByteDance Seed Team. The associated paper is "An All-Atom Generative Model for Designing Protein Complexes".
Dataset composition
- Single-chain protein dataset: contains 187,494 samples covering a variety of protein types and functions, drawn from three databases: PDB (18,684), Swiss-Prot (140,769), and AFDB (28,041).
- Multi-chain protein dataset: contains 11,620 samples covering protein complexes of 2 to 6 chains, supporting multi-chain modeling. The data is derived from PDB biological assemblies, with three types of samples excluded: samples that appear in the SAbDab antibody database, samples containing any chain shorter than 30 residues (treated as peptides), and samples longer than 2,048 residues or lacking cluster IDs. During training, the researchers randomly cropped the multi-chain samples: for a sample with more than 384 residues, the crop is centered on an inter-chain binding interface residue pair and retains the nearest 384 amino acids.
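The exclusion rules and the interface-centered crop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `keep_sample` and `crop_around_interface`, the 10 Å contact cutoff, the use of one coordinate per residue, centering on the midpoint of the chosen pair, and reading "length" as the total residue count across chains are all assumptions made for the example.

```python
import math
import random

MIN_CHAIN_LEN = 30        # shorter chains are treated as peptides
MAX_TOTAL_LEN = 2048      # assumed to mean total residues across all chains
CROP_SIZE = 384           # residues retained after cropping
INTERFACE_CUTOFF = 10.0   # assumed contact distance in angstroms (not from the source)


def keep_sample(chain_lengths, in_sabdab, cluster_id):
    """Return True if a complex passes the three exclusion rules."""
    if in_sabdab:                                   # antibody listed in SAbDab
        return False
    if any(n < MIN_CHAIN_LEN for n in chain_lengths):
        return False
    if sum(chain_lengths) > MAX_TOTAL_LEN or cluster_id is None:
        return False
    return True


def crop_around_interface(coords, chain_ids, crop_size=CROP_SIZE,
                          cutoff=INTERFACE_CUTOFF, rng=random):
    """Return sorted indices of the residues kept by the crop.

    coords    : list of (x, y, z) tuples, one per residue
    chain_ids : list of chain labels, same length as coords
    """
    n = len(coords)
    if n <= crop_size:                              # small samples are kept whole
        return list(range(n))

    # Candidate interface pairs: residues on different chains within the cutoff.
    pairs = [(i, j)
             for i in range(n) for j in range(i + 1, n)
             if chain_ids[i] != chain_ids[j]
             and math.dist(coords[i], coords[j]) <= cutoff]
    if not pairs:
        raise ValueError("no inter-chain contacts found within the cutoff")

    # Pick one pair at random and center the crop on its midpoint (assumption).
    i, j = rng.choice(pairs)
    center = tuple((a + b) / 2 for a, b in zip(coords[i], coords[j]))

    # Keep the crop_size residues nearest to the center, in sequence order.
    nearest = sorted(range(n), key=lambda k: math.dist(coords[k], center))
    return sorted(nearest[:crop_size])
```

Because the interface pair is chosen at random on each call, the same complex can yield different crops across training epochs, which is one plausible reading of "randomly cropped". The pairwise scan is quadratic in the number of residues, which is acceptable at the 2,048-residue limit but would need a spatial index for larger inputs.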
APM.torrent