GeMS Chemical Mass Spectrometry Dataset
GeMS is a chemical mass spectrometry dataset for storing and analyzing tandem mass spectrometry (MS/MS) data. It aims to provide large-scale raw data for molecular characterization, compound identification, and metabolomics research. The dataset integrates a large number of unlabeled spectra from the Global Natural Products Social Molecular Networking (GNPS) repository and is the core data foundation of the DreaMS project. The associated paper is "Self-supervised learning of molecular representations from millions of tandem mass spectra using DreaMS".
The dataset contains hundreds of millions of mass spectra (for example, about 200 million in the GeMS-C1 subset). Each record combines structured numerical data (the mass-to-charge ratio and intensity pairs of a spectrum) with metadata such as the spectrum's source and experimental conditions. It is one of the largest public mass spectrometry datasets currently available and can support very large-scale model training.
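To make the record layout concrete, the following is a minimal Python sketch of how one spectrum could be held in memory as (m/z, intensity) pairs plus metadata. The `Spectrum` class, its field names, and the example values are hypothetical illustrations, not the actual GeMS file format or the DreaMS API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Spectrum:
    """Hypothetical in-memory record for one MS/MS spectrum (not the GeMS format)."""

    # Each peak is a (mass-to-charge ratio, intensity) pair.
    peaks: list[tuple[float, float]]
    # Free-form metadata, e.g. source and experimental conditions.
    metadata: dict[str, str] = field(default_factory=dict)

    def base_peak(self) -> tuple[float, float]:
        """Return the (m/z, intensity) pair with the highest intensity."""
        return max(self.peaks, key=lambda peak: peak[1])

    def normalized(self) -> list[tuple[float, float]]:
        """Scale intensities so that the base peak has intensity 1.0."""
        top = self.base_peak()[1]
        return [(mz, intensity / top) for mz, intensity in self.peaks]


# Made-up example values, for illustration only.
spec = Spectrum(
    peaks=[(85.03, 120.0), (127.04, 480.0), (145.05, 240.0)],
    metadata={"source": "example", "precursor_mz": "163.06"},
)
```

Normalizing to the base peak is a common preprocessing step before comparing spectra or feeding them to a model; it is shown here only as an example of working with the peak pairs.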