GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text
Pengfei Liu; Yiming Ren; Jun Tao; Zhixiang Ren

Abstract
Large language models have made significant strides in natural language processing, enabling innovative applications in molecular science by processing textual representations of molecules. However, most existing language models cannot capture the rich information carried by complex molecular structures or images. In this paper, we introduce GIT-Mol, a multi-modal large language model that integrates graph, image, and text information. To facilitate the integration of multi-modal molecular data, we propose GIT-Former, a novel architecture capable of aligning all modalities into a unified latent space. Compared to the baselines, we achieve a 5%-10% accuracy increase in property prediction and a 20.2% boost in molecule generation validity. With the any-to-language molecular translation strategy, our model has the potential to perform more downstream tasks, such as compound name recognition and chemical reaction prediction.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| drug-discovery-on-bace | GIT-Mol(G+S) | AUC: 0.8108 |
| drug-discovery-on-bbbp | GIT-Mol(G+S) | AUC: 0.739 |
| drug-discovery-on-clintox | GIT-Mol(G+S) | AUC: 0.883 |
| drug-discovery-on-sider | GIT-Mol(G+S) | AUC: 0.634 |
| drug-discovery-on-tox21 | GIT-Mol(G+S) | AUC: 0.759 |
| drug-discovery-on-toxcast | GIT-Mol(G+S) | AUC: 0.668 |
| image-captioning-on-chebi-20 | GIT-Mol | BLEU: 0.924; Exact: 0.461; Levenshtein: 6.575; MACCS FTS: 0.962; Morgan FTS: 0.894; RDK FTS: 0.906; Validity: 0.899 |
| text-based-de-novo-molecule-generation-on | GIT-Mol-caption | BLEU: 75.6; Exact Match: 5.1; Levenshtein: 26.315; MACCS FTS: 73.8; Morgan FTS: 51.9; RDK FTS: 58.2; Validity: 92.8 |
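Several of the generation metrics in the table compare a predicted molecule string with a reference. The sketch below illustrates three of them in plain Python: exact match, Levenshtein (edit) distance, and a Tanimoto-style fingerprint similarity (the usual definition behind "FTS" scores). It is not the authors' evaluation code. The function names are hypothetical, the fingerprints are stand-in sets of bit indices rather than real MACCS, Morgan, or RDK fingerprints, and exact match here is naive string equality, whereas a real evaluation would typically canonicalise the molecules first.

```python
from typing import Sequence, Set


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insert, delete, and substitute."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def tanimoto(bits_a: Set[int], bits_b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(bits_a | bits_b)
    if union == 0:
        return 1.0  # assumption: two empty fingerprints count as identical
    return len(bits_a & bits_b) / union


def exact_match_rate(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that equal their reference string exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)


if __name__ == "__main__":
    # Toy SMILES strings and made-up bit sets, for illustration only.
    print(levenshtein("CCO", "CCN"))
    print(tanimoto({1, 2, 3}, {2, 3, 4}))
    print(exact_match_rate(["CCO", "CC"], ["CCO", "CN"]))
```

Lower Levenshtein distance is better, while exact match, fingerprint similarity, and validity are better when higher. Note that the two generation rows in the table appear to report their scores on different scales (fractions in one, percentages in the other).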