HyperAIHyperAI

Command Palette

Search for a command to run...

5 months ago

LDMol: A Text-to-Molecule Diffusion Model with Structurally Informative Latent Space Surpasses AR Models

Jinho Chang; Jong Chul Ye

LDMol: A Text-to-Molecule Diffusion Model with Structurally Informative Latent Space Surpasses AR Models

Abstract

With the emergence of diffusion models as a frontline generative model, many researchers have proposed molecule generation techniques with conditional diffusion models. However, the unavoidable discreteness of a molecule makes it difficult for a diffusion model to connect raw data with highly complex conditions like natural language. To address this, here we present a novel latent diffusion model dubbed LDMol for text-conditioned molecule generation. By recognizing that the suitable latent space design is the key to the diffusion model performance, we employ a contrastive learning strategy to extract novel feature space from text data that embeds the unique characteristics of the molecule structure. Experiments show that LDMol outperforms the existing autoregressive baselines on the text-to-molecule generation benchmark, being one of the first diffusion models that outperforms autoregressive models in textual data generation with a better choice of the latent domain. Furthermore, we show that LDMol can be applied to downstream tasks such as molecule-to-text retrieval and text-guided molecule editing, demonstrating its versatility as a diffusion model.

Code Repositories

jinhojsk515/ldmol
Official
pytorch
Mentioned in GitHub

Benchmarks

BenchmarkMethodologyMetrics
text-based-de-novo-molecule-generation-onLDMol
BLEU: 92.6
Exact Match: 53.3
Frechet ChemNet Distance (FCD): 0.20
Levenshtein: 6.750
MACCS FTS: 97.3
Morgan FTS: 93.1
RDK FTS: 95.0
Validity: 94.1

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing
Get Started

Hyper Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp
LDMol: A Text-to-Molecule Diffusion Model with Structurally Informative Latent Space Surpasses AR Models | Papers | HyperAI