5 months ago

GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text

Pengfei Liu; Yiming Ren; Jun Tao; Zhixiang Ren

Abstract

Large language models have made significant strides in natural language processing, enabling innovative applications in molecular science by processing textual representations of molecules. However, most existing language models cannot capture the rich information in complex molecular structures or images. In this paper, we introduce GIT-Mol, a multi-modal large language model that integrates graph, image, and text information. To facilitate the integration of multi-modal molecular data, we propose GIT-Former, a novel architecture capable of aligning all modalities into a unified latent space. Compared to the baselines, we achieve a 5%-10% accuracy increase in property prediction and a 20.2% boost in molecule generation validity. With the any-to-language molecular translation strategy, our model has the potential to perform more downstream tasks, such as compound name recognition and chemical reaction prediction.

Code Repositories

ai-hpc-research-team/git-mol (official, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
drug-discovery-on-bace | GIT-Mol(G+S) | AUC: 0.8108
drug-discovery-on-bbbp | GIT-Mol(G+S) | AUC: 0.739
drug-discovery-on-clintox | GIT-Mol(G+S) | AUC: 0.883
drug-discovery-on-sider | GIT-Mol(G+S) | AUC: 0.634
drug-discovery-on-tox21 | GIT-Mol(G+S) | AUC: 0.759
drug-discovery-on-toxcast | GIT-Mol(G+S) | AUC: 0.668
image-captioning-on-chebi-20 | GIT-Mol | BLEU: 0.924; Exact: 0.461; Levenshtein: 6.575; MACCS FTS: 0.962; Morgan FTS: 0.894; RDK FTS: 0.906; Validity: 0.899
text-based-de-novo-molecule-generation-on | GIT-Mol-caption | BLEU: 75.6; Exact Match: 5.1; Levenshtein: 26.315; MACCS FTS: 73.8; Morgan FTS: 51.9; RDK FTS: 58.2; Validity: 92.8
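
For readers unfamiliar with the generation metrics listed above, the following is a minimal sketch of how Exact, Levenshtein, and a fingerprint Tanimoto similarity (FTS) score are commonly computed over predicted and reference molecule strings. It is not taken from the GIT-Mol repository. The function names (`levenshtein`, `tanimoto`, `summarize`) are hypothetical, fingerprints are assumed to be supplied as sets of on-bit indices (in practice a cheminformatics toolkit would produce them), and exact match here is a raw string comparison, whereas real evaluations typically canonicalize molecules first.

```python
from typing import Dict, List, Set, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance where insertion, deletion, and substitution each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def tanimoto(bits_a: Set[int], bits_b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # arbitrary convention in this sketch for two empty fingerprints
    return len(bits_a & bits_b) / len(union)


def summarize(pairs: List[Tuple[str, str]]) -> Dict[str, float]:
    """Average exact-match rate and edit distance over (predicted, reference) pairs."""
    n = len(pairs)
    return {
        "exact": sum(pred == ref for pred, ref in pairs) / n,
        "levenshtein": sum(levenshtein(pred, ref) for pred, ref in pairs) / n,
    }


if __name__ == "__main__":
    # Toy SMILES pairs, purely illustrative and not drawn from any benchmark.
    print(summarize([("CCO", "CCO"), ("CCO", "CCN")]))
    print(tanimoto({1, 2, 3}, {2, 3, 4}))
```

Lower Levenshtein values and higher Exact and FTS values indicate predictions closer to the reference molecules.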
