DeepMSA (version 2) is a hierarchical approach to create high-quality multiple sequence alignments (MSAs)
for monomer and multimer proteins.
The method is built on iterative sequence database searching followed by fold-based
MSA ranking and selection.
For protein monomers, MSAs are produced by three iterative MSA search pipelines (dMSA, qMSA and mMSA)
that search whole-genome (Uniclust30 and UniRef90) and
metagenome (Metaclust, BFD, Mgnify, TaraDB, MetaSourceDB and JGIclust) sequence databases.
For protein multimers, a number of hybrid MSAs are created by pairing the sequences from
monomer MSAs of the component chains, with the optimal multimer MSAs selected based on a combined score of
MSA depth and folding score of the monomer chains.
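The multimer selection step described above can be illustrated with a minimal sketch. It is not the DeepMSA2 implementation: the `HybridMSA` record, the `combined_score` formula, the `depth_weight` and `depth_cap` parameters, and the assumption that folding scores lie in [0, 1] are all hypothetical, chosen only to show how MSA depth and the folding scores of the monomer chains might be combined to rank candidate hybrid MSAs.

```python
from dataclasses import dataclass
from math import log
from typing import List


@dataclass
class HybridMSA:
    """Hypothetical record for one candidate multimer MSA."""
    name: str
    depth: int                       # number of paired sequences in the hybrid MSA
    chain_fold_scores: List[float]   # one folding score per component chain, assumed in [0, 1]


def combined_score(msa: HybridMSA, depth_weight: float = 0.5, depth_cap: int = 1000) -> float:
    """Illustrative combined score; the actual DeepMSA2 formula may differ."""
    if not msa.chain_fold_scores:
        raise ValueError("at least one chain folding score is required")
    # Log-scaled depth, saturating at depth_cap so very deep MSAs do not dominate.
    depth_term = min(1.0, log(1 + msa.depth) / log(1 + depth_cap))
    # Average folding score over the monomer chains.
    fold_term = sum(msa.chain_fold_scores) / len(msa.chain_fold_scores)
    return depth_weight * depth_term + (1.0 - depth_weight) * fold_term


def select_best(candidates: List[HybridMSA], top_n: int = 1) -> List[HybridMSA]:
    """Return the top_n candidates ranked by the combined score, best first."""
    return sorted(candidates, key=combined_score, reverse=True)[:top_n]
```

Under this assumed scoring, a shallow hybrid MSA built from well-folding monomer MSAs and a deep one built from poorly folding monomer MSAs are both ranked below a candidate that is reasonably deep and folds well, which is the balance the selection step aims for.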
Large-scale benchmark data show a significant advantage of DeepMSA2 in generating accurate MSAs
with balanced depth and alignment coverage, which are well suited to deep-learning-based
protein and protein complex structure and function prediction. To predict a structure model directly,
please use the DMFold server.
[Example output for monomer]
[Example output for multimer]