
Figure 3
Strategies for building atomic models. (a) General scheme for interpreting cryo-EM maps of unknown identity before 2021. The protein fold responsible for a region of globular density could be determined by evaluating the fit of a library of protein folds (the BALBES database) curated from the PDB to the density (Brown et al., 2015; Vagin & Teplyakov, 2010). Map regions with resolved side chains could be built manually, with side-chain predictions used to generate query sequences for BLAST searches against proteomes. Alternatively, secondary structure could be extracted from the built model and compared with precomputed profiles, or the tertiary structure compared with PDB entries using DALI (Holm, 2022) or PDBeFold (Krissinel & Henrick, 2004) to identify likely folds. (b) Current best approaches for interpreting cryo-EM maps. The protein fold responsible for a region of globular density can be determined by evaluating the fit to the density of AlphaFold2-created model libraries of entire proteomes (Jumper et al., 2021). Map regions with resolved side chains can be built automatically using AI-based tools such as ModelAngelo (Jamali et al., 2024) or DeepTracer (Pfab et al., 2021; Chang et al., 2022). Sequence assignments can be extracted and used to create profile hidden Markov models (HMMs) for proteome searches. Alternatively, the three-dimensional models can be compared with millions of predicted structures using Foldseek (van Kempen et al., 2023) to identify the exact protein or protein fold.
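The sequence-based identification step in panel (b) can be illustrated with a toy sketch. It assumes that map-based model building yields, for each built residue, a probability for each amino acid, and it ranks candidate proteome sequences by an ungapped log-odds score. This is a deliberate simplification and not a profile HMM: it has no insertion or deletion states and uses a uniform background. It is also not the procedure of ModelAngelo or any other tool cited above. All names (`log_odds_profile`, `best_window_score`, `rank_proteome`) and the example data are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequencies


def log_odds_profile(position_probs, pseudocount=0.01):
    """Turn per-position residue probabilities into log-odds scores."""
    profile = []
    for probs in position_probs:
        # Pseudocounts keep unseen residues from scoring minus infinity.
        total = sum(probs.get(aa, 0.0) + pseudocount for aa in AMINO_ACIDS)
        profile.append({
            aa: math.log(((probs.get(aa, 0.0) + pseudocount) / total) / BACKGROUND)
            for aa in AMINO_ACIDS
        })
    return profile


def best_window_score(profile, sequence):
    """Best ungapped score of the profile over all windows of the sequence."""
    width = len(profile)
    if len(sequence) < width:
        return float("-inf")
    best = float("-inf")
    for start in range(len(sequence) - width + 1):
        # Non-standard residues (e.g. 'X') contribute a neutral score of 0.
        score = sum(profile[i].get(sequence[start + i], 0.0) for i in range(width))
        best = max(best, score)
    return best


def rank_proteome(position_probs, proteome):
    """Return (score, name) pairs, best-scoring candidate first."""
    profile = log_odds_profile(position_probs)
    scored = [(best_window_score(profile, seq), name) for name, seq in proteome.items()]
    return sorted(scored, reverse=True)


# Hypothetical four-residue fragment with uncertain side-chain assignments.
position_probs = [
    {"M": 0.8, "L": 0.2},
    {"K": 0.9, "R": 0.1},
    {"V": 0.6, "I": 0.4},
    {"L": 0.7, "M": 0.3},
]
# Hypothetical miniature "proteome".
proteome = {"protA": "GGMKVLGG", "protB": "GGAAAAGG", "protC": "MRIL"}

ranking = rank_proteome(position_probs, proteome)
for score, name in ranking:
    print(f"{name}\t{score:.2f}")
```

In this toy example `protA`, which contains the most probable residue at every position, ranks first, and `protC`, which matches only the lower-probability alternatives at two positions, ranks second. A real search would use a dedicated profile-HMM package against a full proteome and would allow for gaps and for errors in the built chain.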

Acta Cryst. D: Structural Biology
ISSN: 2059-7983