Knowledge-Design: Pushing the Limit of Protein Design via Knowledge Refinement

Zhangyang Gao Cheng Tan Stan Z. Li

Abstract

Recent studies have shown competitive performance in protein design that aims to find the amino acid sequence folding into the desired structure. However, most of them disregard the importance of predictive confidence, fail to cover the vast protein space, and do not incorporate common protein knowledge. After witnessing the great success of pretrained models on diverse protein-related tasks and the fact that recovery is highly correlated with confidence, we wonder whether this knowledge can push the limits of protein design further. As a solution, we propose a knowledge-aware module that refines low-quality residues. We also introduce a memory-retrieval mechanism to save more than 50% of the training time. We extensively evaluate our proposed method on the CATH, TS50, and TS500 datasets and our results show that our Knowledge-Design method outperforms the previous PiFold method by approximately 9% on the CATH dataset. Specifically, Knowledge-Design is the first method that achieves 60+% recovery on CATH, TS50 and TS500 benchmarks. We also provide additional analysis to demonstrate the effectiveness of our proposed method. The code will be publicly available.
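The abstract does not spell out how low-quality residues are selected or how the memory-retrieval mechanism works. The sketch below is one plausible reading, not the authors' implementation: it flags residues whose top predicted probability falls below a threshold, and it caches the output of a frozen encoder so that repeated inputs skip the forward pass. The names `low_confidence_positions`, `decode`, and `EmbeddingMemory`, as well as the 0.5 threshold, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence

# The 20 standard amino acids in alphabetical one-letter order (assumed vocabulary).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def low_confidence_positions(
    probs: Sequence[Sequence[float]], threshold: float = 0.5
) -> List[int]:
    """Return indices of residues whose highest class probability is below threshold.

    The threshold value is an assumption made for this sketch.
    """
    return [i for i, row in enumerate(probs) if max(row) < threshold]


def decode(probs: Sequence[Sequence[float]]) -> str:
    """Pick the most probable amino acid at every position (ties go to the first)."""
    return "".join(
        AMINO_ACIDS[max(range(len(row)), key=row.__getitem__)] for row in probs
    )


class EmbeddingMemory:
    """Cache for a frozen encoder: each distinct key is encoded only once.

    This is a hypothetical stand-in for a memory-retrieval mechanism; it only
    helps when the encoder's weights do not change during training.
    """

    def __init__(self, encoder: Callable[[str], List[float]]) -> None:
        self._encoder = encoder
        self._store: Dict[str, List[float]] = {}
        self.misses = 0  # number of real encoder calls

    def get(self, key: str) -> List[float]:
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._encoder(key)
        return self._store[key]
```

In a refinement loop built on these helpers, only the positions returned by `low_confidence_positions` would be handed to a refiner, and `EmbeddingMemory.get` would replace direct calls to the pretrained model whenever the same sequence is seen again across epochs.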

Code Repositories

A4Bio/OpenCPD (official, PyTorch, mentioned in GitHub)

Benchmarks

Benchmark                            Methodology        Perplexity   Sequence Recovery % (All)
protein-design-on-cath-4-2           GraphTrans         6.63         35.82
protein-design-on-cath-4-2           StructGNN          6.4          35.91
protein-design-on-cath-4-2           ProteinMPNN        4.61         45.96
protein-design-on-cath-4-2           Knowledge-Design   3.46         60.77
protein-design-on-cath-4-2           AlphaDesign        6.3          41.31
protein-design-on-cath-4-2           GVP                5.36         39.47
protein-design-on-cath-4-2           PiFold             4.55         51.66
protein-design-on-cath-4-2           GCA                6.05         37.64
protein-design-on-cath-4-3           ESM-IF             6.44         38.3
protein-design-on-cath-4-3           GVP-large          6.17         39.2
word-sense-disambiguation-on-ts50    SPIN               -            30.3
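
The two metrics reported above can be computed as in the following minimal sketch, which assumes the standard definitions: sequence recovery is the percentage of positions where the designed residue matches the native one, and perplexity is the exponential of the mean negative log-likelihood assigned to the native residues. How the benchmark aggregates these values across proteins is not stated on this page, so the functions below work on a single protein, and their names are illustrative.

```python
import math
from typing import Sequence


def sequence_recovery(predicted: str, native: str) -> float:
    """Percentage of positions where the predicted residue equals the native one."""
    if len(predicted) != len(native):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == n for p, n in zip(predicted, native))
    return 100.0 * matches / len(native)


def perplexity(native_probs: Sequence[float]) -> float:
    """exp(mean negative log-likelihood) over the native residues.

    native_probs[i] is the probability the model assigned to the true
    amino acid at position i; every value must be greater than zero.
    """
    if not native_probs:
        raise ValueError("need at least one probability")
    nll = -sum(math.log(p) for p in native_probs) / len(native_probs)
    return math.exp(nll)
```

For example, `sequence_recovery("ACDE", "ACDF")` gives 75.0, and a model that assigns probability 0.25 to every native residue has a perplexity of 4. On the CATH 4.2 rows above, the recovery gap between Knowledge-Design and PiFold is 60.77 minus 51.66, or 9.11 percentage points, which matches the "approximately 9%" stated in the abstract.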
