Cryo-EM single-particle structure refinement and map calculation using Servalcat
(a) MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom, and (b) Scientific Computing Department, UKRI Science and Technology Facilities Council, Rutherford Appleton Laboratory, Harwell Campus, Didcot OX11 0FA, United Kingdom
*Correspondence e-mail: kyamashita@mrc-lmb.cam.ac.uk, garib@mrc-lmb.cam.ac.uk
In 2020, cryo-EM single-particle analysis achieved true atomic resolution thanks to technological developments in hardware and software. The number of high-resolution reconstructions continues to grow, increasing the importance of the accurate determination of atomic coordinates. Here, a new Python package and program called Servalcat is presented that is designed to facilitate atomic model refinement. Servalcat implements a refinement pipeline using the program REFMAC5 from the CCP4 package. After the refinement, Servalcat calculates a weighted Fo − Fc difference map, which is derived from Bayesian statistics. This map helps manual and automatic model building in real space, as is common practice in crystallography. The Fo − Fc map helps in the visualization of weak features including hydrogen densities. Although hydrogen densities are weak, they are stronger than in the electron-density maps produced by X-ray crystallography, and some H atoms are even visible at ∼1.8 Å resolution. Servalcat also facilitates atomic model refinement under symmetry constraints. If point-group symmetry has been applied to the map during reconstruction, the asymmetric unit model is refined with the appropriate symmetry constraints.
Keywords: cryo-EM; structure refinement; REFMAC5; Servalcat.
1. Notation
FT: Fourier transform of unknown true map (complex values).
Fn: Fourier transform of noise in the observed map (complex values).
Fo1, Fo2: Fourier transforms of the two unweighted and unsharpened half maps from independent reconstructions (complex values).
Fo: Fourier transform of the observed full map, (Fo1 + Fo2)/2.
Fc: Fourier transform of calculated map from atomic coordinates (complex values).
E: structure factors normalized in resolution bins, $F/\langle|F|^2\rangle^{1/2}$.
k: resolution-dependent scale factor between Fo and FT.
D: resolution-dependent scale factor between Fo and Fc.
$\sigma_T^2$: variance of signal, var(FT).
$\sigma_n^2$: variance of noise, var(Fn).
$\sigma_U^2$: variance of unexplained signal, var(DFc − kFT).
f: atomic scattering factor.
s: column vector of position in reciprocal space.
s^T: row vector of position in reciprocal space (the transpose of s).
x: column vector of position in real space.
(R, t): rotation matrix and translation vector that could be an element of a point group.
B: displacement parameter of an atom, or blurring parameter for a local or global region of a map. A real value (isotropic case) or a 3 × 3 symmetric matrix (anisotropic case). Usually B is isotropic and atomic unless otherwise stated. Also called an atomic displacement parameter (ADP) if associated with an atom.
Unless otherwise stated, all quantities in Fourier space are dependent on s.
2. Introduction
Atomic model refinement is the optimization of the model's parameters against the observed data. Atomic parameters typically include coordinates, atomic displacement parameters (ADPs) and occupancies. In crystallography, refinement is crucial because of the phase problem: the accuracy of density maps relies on the accuracy of the phases of the structure factors. Accurate phases are not observed and must be calculated from the model (Tronrud, 2004). More accurate maps may be obtained as the model becomes more accurate through the refinement. In single-particle analysis (SPA) there is no phase problem, although the Fourier coefficients can be noisy, especially at high resolution.
Accurate atomic model determination is becoming more and more important due to the 'resolution revolution' in cryo-EM SPA following the introduction of direct electron detectors and new data-processing methods (Bai et al., 2015). As of April 2021, more than 2500 SPA entries with resolutions better than 3.5 Å have been deposited in the Electron Microscopy Data Bank (EMDB; Tagari et al., 2002). This improvement in resolution has accelerated the development of methods for model building, refinement and validation. Automatic model-building programs that were originally developed for crystallography are now being adapted for cryo-EM SPA maps (Terwilliger, Adams et al., 2018; Hoh et al., 2020; Chojnowski et al., 2021). Density modification and local map sharpening can help to interpret the map (Jakobi et al., 2017; Terwilliger, Sobolev et al., 2018; Ramírez-Aportela et al., 2019; Ramlaul et al., 2019; Terwilliger et al., 2020). In general, care must be exercised when using any techniques based on prior knowledge; bias towards incorrect assumptions might lead to misinterpretation of the maps. Full-atom refinement can be performed either in real space (Afonine et al., 2018) or in reciprocal space (Murshudov, 2016).
After refinement, the model should be validated; the model should have a reasonable geometry and should describe the map well. Due to the low data-to-parameter ratio, all models will exhibit a degree of overfitting; however, the model should not deviate substantially from cross-validation data (Brown et al., 2015). MolProbity is the most widely used geometry validation tool, and includes analyses of clashes, rotamers and the Ramachandran plot (Chen et al., 2010). Map–model quality is assessed using real-space local correlations (Cragnolini et al., 2021), which have commonly been used in crystallography (Tickle, 2012). In reciprocal-space refinement the R factor can be calculated as in crystallography, but the map–model Fourier shell correlation (FSC) is preferred as it does not depend on resolution-dependent scaling and takes phases into account explicitly. An Fo − Fc map, which highlights unmodelled features and errors in the current model, is almost always used in crystallography, and some similar tools already exist for SPA (Joseph et al., 2020). The σA-weighted (m|Fo| − D|Fc|)exp(iφc) map as used in crystallography is not directly applicable to SPA, because phases are available for both Fo and Fc and we should model the error of Fo in the complex plane, rather than simply using the estimated phase error as in crystallography (see below).
In 2020, cryo-EM SPA achieved atomic resolution, according to Sheldrick's criterion (Wlodawer & Dauter, 2017), in structural analyses of apoferritin, which were reported by two groups (Nakane et al., 2020; Yip et al., 2020). Nakane et al. (2020) observed H-atom densities at 1.2 and 1.7 Å resolutions using Fo − Fc maps calculated by REFMAC5. There is a higher chance of observing hydrogen density in electron microscopy (EM) than in X-ray crystallography because of the increased contrast for the lighter elements (Clabbers & Abrahams, 2018). Nevertheless, hydrogen density is relatively weak and there is always a much higher peak from the parent atom nearby, so the Fo − Fc difference map is essential to see it. In addition, there is complexity in the interpretation of hydrogen peaks in EM. An electron in an H atom is usually shifted towards the parent atom from the nucleus position. In EM, both the electrons and the nucleus contribute to scattering, and this offset results in a shift of hydrogen density peaks beyond the position of the hydrogen nucleus (Nakane et al., 2020).
SPA structures often have point-group symmetries (rather than space-group symmetry as in crystallography). Approximately half of the SPA entries in the EMDB have non-C1 point-group symmetry according to their associated metadata. Such symmetry is advantageous and helps to reach higher resolution because it increases the effective number of particles. If the map is symmetrized, downstream analyses should be aware of it and the structural model must follow the symmetry. As in crystallography, it is natural to work in a single asymmetric unit. The MTRIX records in the PDB format or _struct_ncs_oper in the mmCIF format can be used to encode the symmetry information (see footnote 1). Currently, for structures from SPA there are only a few depositions of such models in the PDB (excepting viruses). We recommend refining and depositing an asymmetric unit model, which makes sure the symmetry copies are truly identical. It should be noted that validation tools must be aware of any applied symmetry operators, but results should be reported for the asymmetric unit only. These considerations are only valid if the map is symmetrized, and we suggest that the point-group information should be required by the deposition system.
Here, we present Servalcat, a Python package and standalone program for the refinement and map calculation of cryo-EM SPA structures. Servalcat takes unsharpened and unweighted half maps of the independent reconstructions as inputs and implements a refinement pipeline using REFMAC5, which uses a dedicated likelihood function for SPA (Murshudov, 2016). After the refinement, Servalcat calculates a sharpened and weighted Fo − Fc map derived from Bayesian statistics as described below. If the map has point-group symmetry, the user can give an asymmetric unit model and a point-group symbol, and the program will output a refined asymmetric unit model with symmetry annotation as well as a symmetry-expanded model. The noncrystallographic symmetry (NCS) constraint function in REFMAC5 has been updated to consider symmetry-related nonbonded interactions and ADP similarity restraints (to ensure the similarity of ADPs of atoms brought into close proximity via symmetry operations).
Servalcat is freely available as a standalone package and also as part of CCP-EM (Burnley et al., 2017), where the REFMAC5 interface has been updated to use Servalcat.
3. Map calculation and sharpening using signal variance
Let us assume that Fo is the result of a position-independent blurring k of the true Fourier coefficients FT with an independent zero-mean Gaussian noise with variance $\sigma_n^2$. That is,
$$F_o = kF_T + F_n, \qquad F_n \sim \mathcal{N}(0, \sigma_n^2).$$
Note that in this work we treat k as a function of resolution |s|. Multiplication by k in Fourier space is equivalent to isotropic blurring by a convolution in real space. In general, k could take on a different value at each point s in Fourier space, which would produce a position-independent but direction-dependent blurring in real space.
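The variance of the noise ($\sigma_n^2$) can be calculated from the half maps in resolution bins (Murshudov, 2016),
$$\sigma_n^2 = \tfrac{1}{4}\left\langle |F_{o1} - F_{o2}|^2 \right\rangle.$$
We will later use the relationship of $\sigma_T^2$ and $\sigma_n^2$ to the FSC, correlation coefficients in resolution bins (Rosenthal & Henderson, 2003),
$$\mathrm{FSC_{half}} = \frac{k^2\sigma_T^2}{k^2\sigma_T^2 + 2\sigma_n^2}, \qquad \mathrm{FSC_{full}} = \frac{2\,\mathrm{FSC_{half}}}{1 + \mathrm{FSC_{half}}} = \frac{k^2\sigma_T^2}{k^2\sigma_T^2 + \sigma_n^2}.$$
Let us also assume that the errors in the model follow a Gaussian distribution (Luzzati, 1952),
$$kF_T - DF_c \sim \mathcal{N}(0, \sigma_U^2),$$
where $\sigma_U^2$ is the variance of the unexplained signal.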
We need two functions: the likelihood p(Fo; Fc) for the estimation of parameters (of the atomic model and of the distribution function) and the posterior distribution p(FT; Fo, Fc) of the unknown FT for map calculation.
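The noise variance and the FSC relationships used in this section can be illustrated with a short NumPy sketch. It is not Servalcat's implementation: the function name `shell_statistics`, the flat arrays of half-map Fourier coefficients and the precomputed `bin_index` array are all assumptions made for the example, and the full-map FSC is obtained from the half-map FSC by the usual conversion 2FSC/(1 + FSC).

```python
import numpy as np

def shell_statistics(f1, f2, bin_index, n_bins):
    """Per-bin noise variance and FSC from half-map Fourier coefficients.

    f1, f2    : complex arrays holding Fo1 and Fo2 (same shape)
    bin_index : integer array giving the resolution bin of each coefficient
    Returns an (n_bins, 3) array of (var_noise, fsc_half, fsc_full).
    """
    stats = []
    for i in range(n_bins):
        sel = bin_index == i
        a, b = f1[sel], f2[sel]
        # Noise variance of the averaged (full) map: <|Fo1 - Fo2|^2> / 4
        var_noise = np.mean(np.abs(a - b) ** 2) / 4.0
        # Half-map Fourier shell correlation
        fsc_half = np.real(np.sum(a * np.conj(b))) / np.sqrt(
            np.sum(np.abs(a) ** 2) * np.sum(np.abs(b) ** 2))
        # Correlation expected for the full map
        fsc_full = 2.0 * fsc_half / (1.0 + fsc_half)
        stats.append((var_noise, fsc_half, fsc_full))
    return np.array(stats)
```

In practice the coefficients would come from a 3D FFT of each half map and `bin_index` from the length of each reciprocal-space vector; both steps are omitted here.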
3.1. Likelihood
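As derived in Murshudov (2016),
$$p(F_o; F_c) = \frac{1}{\pi(\sigma_U^2 + \sigma_n^2)}\exp\left(-\frac{|F_o - DF_c|^2}{\sigma_U^2 + \sigma_n^2}\right) \tag{7}$$
is the likelihood function that is optimized during atomic model refinement. D and $\sigma_U^2$ (the variance of the unexplained signal) are obtained in each resolution bin i by maximizing the joint likelihood of (7) over the bin,
$$\prod_{j \in \mathrm{bin}\ i} p(F_{o,j}; F_{c,j}) = \left[\pi(\sigma_{U,i}^2 + \sigma_{n,i}^2)\right]^{-N_i}\exp\left(-\frac{\sum_j |F_{o,j} - D_iF_{c,j}|^2}{\sigma_{U,i}^2 + \sigma_{n,i}^2}\right),$$
where Ni is the number of Fourier coefficients in bin i.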
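As an illustration of the per-bin estimation, the sketch below assumes the likelihood is a complex Gaussian for Fo centred on D·Fc with total variance var_U + var_noise, and that D is real. Under those assumptions the maximum has a closed form. The function name and arguments are hypothetical, and REFMAC5 may obtain these parameters differently.

```python
import numpy as np

def estimate_d_and_var_u(fo, fc, var_noise):
    """Closed-form ML estimates of D and var_U for one resolution bin.

    Assumed model: Fo ~ complex Gaussian with mean D*Fc and
    variance var_U + var_noise, with a real-valued scale D.
    """
    # dL/dD = 0  ->  D = Re(sum Fo Fc*) / sum |Fc|^2
    d = np.real(np.sum(fo * np.conj(fc))) / np.sum(np.abs(fc) ** 2)
    # dL/d(variance) = 0  ->  total variance = <|Fo - D Fc|^2>
    total = np.mean(np.abs(fo - d * fc) ** 2)
    # Subtract the known noise; a variance cannot be negative
    var_u = max(total - var_noise, 0.0)
    return d, var_u
```

The clamp at zero handles bins in which the model already explains the data to within the noise level.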
3.2. Posterior distribution and map calculation
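The posterior distribution, as derived in Murshudov (2016),
$$p(F_T; F_o, F_c) = \frac{1}{\pi\sigma_{\mathrm{post}}^2}\exp\left(-\frac{|F_T - \mu|^2}{\sigma_{\mathrm{post}}^2}\right)$$
is a 2D Gaussian distribution with the mean and variance
$$\mu = \frac{1}{k}\left[wF_o + (1 - w)DF_c\right], \tag{11}$$
$$\sigma_{\mathrm{post}}^2 = \frac{1}{k^2}\,\frac{\sigma_U^2\sigma_n^2}{\sigma_U^2 + \sigma_n^2},$$
where
$$w = \frac{\sigma_U^2}{\sigma_U^2 + \sigma_n^2}.$$
Coefficients for an Fo − Fc-type difference map can be derived as
$$\mu - \frac{DF_c}{k} = \frac{w}{k}\,(F_o - DF_c).$$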
The remaining unknown variable is k, which cannot be determined from the data alone. For position-independent isotropic Gaussian blurring, k has the form exp(−Boverall|s|²/4) and Boverall may be estimated from line fitting of a Wilson plot (Wilson, 1942). However, such an estimate is unstable, especially when only low-resolution data are available. Here, we introduce a simple approximation using the variance of the signal. Let us assume that the true map consists of atoms with the same isotropic ADP of 〈B〉, and then
$$\sigma_T^2 = \langle|F_T|^2\rangle \simeq \exp\left(-\langle B\rangle|s|^2/2\right)\sum_j f_j^2.$$
We ignored the interference terms between different atoms. Further ignoring resolution-dependent terms in $\sum_j f_j^2$, we can use kσT as a proxy for k, which gives the best sharpening for the region with a local blurring parameter of 〈B〉. kσT can be transformed as follows:
$$k\sigma_T = \sigma_n\left(\frac{\mathrm{FSC_{full}}}{1 - \mathrm{FSC_{full}}}\right)^{1/2}.$$
The Fo − Fc coefficient then finally has the form
$$\frac{1}{\sigma_n}\left(\frac{1 - \mathrm{FSC_{full}}}{\mathrm{FSC_{full}}}\right)^{1/2}\frac{\sigma_U^2}{\sigma_U^2 + \sigma_n^2}\,(F_o - DF_c). \tag{17}$$
Servalcat calculates an Fo − Fc map using (17). Note that the Fo − Fc map is only sensible when the ADPs are properly refined; otherwise we will see spurious peaks due to incorrect ADPs. For this reason, unsharpened Fo should be used as the input for atomic model (see Section 4.1); the sharpening is then consistent as the same sharpening factor is applied to Fo and Fc. Note also that the sharpening is based on the average B value, so regions having very different B values may show fewer structural features.
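The map from the estimated true Fourier coefficients (11) may be useful, but there is a risk of model bias because of the contribution from Fc. In the future, techniques may be available to resolve the issue of model bias. At the moment, Servalcat provides the following as a default map for manual inspection. This is a special case of (11) in the absence of a model, that is with D = 0 (so that the unexplained variance equals the full signal variance $k^2\sigma_T^2$),
$$\frac{\mathrm{FSC_{full}}}{k\sigma_T}\,F_o = (\mathrm{FSC_{full}})^{1/2}E_o, \tag{18}$$
where Eo is Fo normalized in resolution bins.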
This is equivalent to EMDA's normalized expected map (Warshamanage et al., 2021).
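The two kinds of map coefficients discussed in this section can be sketched in a few lines, assuming the posterior weight is var_U/(var_U + var_noise) and that kσT is recovered from the full-map FSC and the noise variance. The function `map_coefficients` and its arguments are hypothetical names for this illustration, not part of the Servalcat API, and bins with FSCfull ≤ 0 or FSCfull = 1 would need separate handling.

```python
import numpy as np

def map_coefficients(fo, fc, d, var_u, var_noise, fsc_full):
    """Sharpened, weighted Fo-Fc and Fo coefficients for one resolution bin."""
    # From FSC_full = (k sigma_T)^2 / ((k sigma_T)^2 + var_noise)
    k_sigma_t = np.sqrt(var_noise * fsc_full / (1.0 - fsc_full))
    # Weight given to the observation in the posterior mean
    w = var_u / (var_u + var_noise)
    # Difference-map coefficients, sharpened by 1 / (k sigma_T)
    diff = w * (fo - d * fc) / k_sigma_t
    # Model-free map (D = 0): the weight reduces to FSC_full
    fo_default = fsc_full * fo / k_sigma_t
    return diff, fo_default
```

Since 1 − FSCfull equals the noise variance divided by 〈|Fo|²〉, the model-free coefficients are the same as (FSCfull)^(1/2) times the normalized structure factor Eo.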
The approach here should work at any resolution where atomic model refinement is applicable.
3.3. Variance of a masked map
The significance of difference map peaks is usually defined by the r.m.s.d. (sigma) level in crystallography. However, in SPA the box size is arbitrary and the voxels outside the molecular envelope lead to underestimation of the r.m.s.d. value. Here, we demonstrate how a mask inflates sigma-scaled density and show that it is useful to normalize the map using the standard deviation within the mask.
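We consider a masked map containing n points in total, where m points are within the mask and thus the values for n − m points are zero. If we calculate the mean value of the whole data,
$$\mu_{\mathrm{total}} = \frac{1}{n}\sum_{i=1}^{n}\rho_i = \frac{m}{n}\,\mu_{\mathrm{mask}}.$$
Thus, to calculate the mean within the mask we can calculate the total mean and then use the formula for correction:
$$\mu_{\mathrm{mask}} = \frac{n}{m}\,\mu_{\mathrm{total}}.$$
For the variance,
$$\mathrm{var_{total}} = \frac{1}{n}\sum_{i=1}^{n}\rho_i^2 - \mu_{\mathrm{total}}^2 = \frac{m}{n}\left(\mathrm{var_{mask}} + \mu_{\mathrm{mask}}^2\right) - \mu_{\mathrm{total}}^2.$$
From here we can calculate varmask if we know vartotal and μtotal. If we denote f = m/n then we can write
$$\mathrm{var_{total}} = f\,\mathrm{var_{mask}} + f(1 - f)\,\mu_{\mathrm{mask}}^2, \qquad \mathrm{var_{mask}} = \frac{\mathrm{var_{total}}}{f} - \frac{1 - f}{f^2}\,\mu_{\mathrm{total}}^2.$$
If the mean inside the mask is zero then there is a simple relationship between the total variance and the variance within the mask, varmask = vartotal/f. This explains the dependence between the box size and the r.m.s.d. of a cryo-EM SPA map. Servalcat normalizes the Fo − Fc map by (varmask)^(1/2) when a mask file is given. (Otherwise only the Fo − Fc structure factors are written in MTZ format.)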
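The correction from whole-box statistics to within-mask statistics is easy to check numerically. The sketch below assumes a binary mask and population (not sample) variances; `masked_mean_and_variance` is a name chosen for this example only.

```python
import numpy as np

def masked_mean_and_variance(volume, mask):
    """Mean and variance inside a mask, recovered from whole-box statistics."""
    f = np.count_nonzero(mask) / mask.size      # f = m / n
    masked = np.where(mask, volume, 0.0)        # values outside the mask are zero
    mu_total = masked.mean()
    var_total = masked.var()
    mu_mask = mu_total / f                      # mean correction
    # var_total = f * var_mask + f * (1 - f) * mu_mask^2, solved for var_mask
    var_mask = var_total / f - (1.0 - f) * mu_mask ** 2
    return mu_mask, var_mask
```

Dividing a difference map by the square root of `var_mask` then gives sigma levels that do not depend on how much empty box surrounds the molecule.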
If we assume that the map consists of signal and noise, and there is no correlation between them, then we can claim that varmask = varsignal + varnoise. Now, in addition, if we assume that we have modelled the map fully with an atomic model (or that two maps have an almost perfect overlap of signals) then the difference maps should consist almost entirely of noise. Therefore, vardiffmap,mask = varnoise. This variance should be calculated within the mask to make sure that we do not have variance reduction because of systematically low values outside the region occupied by the macromolecule. If we want to increase the reliability of these variances for a region of interest then we may also mask out other regions where there might be signal that is not fully accounted for by the current model. This can also be practiced in crystallography.
4. Refinement procedure
In this section the refinement and map-calculation procedures are described. Everything other than the refinement in REFMAC5 itself is implemented in Servalcat using the GEMMI library (https://github.com/project-gemmi/gemmi). Fig. 1 summarizes the procedure.
4.1. Map choice
The optimal map depends on the purpose. For manual inspection, optimally sharpened and weighted maps should be used so that the best visual interpretability is achieved. In general, this does not mean the best signal-to-noise ratio, but it does mean that the details of structural features are visible in the map. On the other hand, unsharpened and unweighted maps are preferred in refinement. If a sharpened map is used, some atoms may need to be refined to have negative B values (or nonpositive definite if anisotropic), but they are constrained to be positive in the refinement, resulting in suboptimal atomic models. Blurred maps, in contrast, will just give a shifted distribution of refined B values. An unweighted map is preferred because it enables the calculation of many properties including noise variance and optimally weighted maps after refinement (see Section 3). Users should therefore be aware that the ADPs in the model are not refined against the same map that is used for visual inspection. Cross-validation (Brown et al., 2015) can also be carried out throughout refinement and model building if both half maps are readily available. Therefore, unsharpened and unweighted half maps from two independent reconstructions are considered to be optimal inputs for the Servalcat pipeline, which performs atomic model refinement followed by map calculation.
4.2. Masking and trimming
The box size in SPA is often substantially larger than the molecule, which is unnecessary for atomic model refinement. Therefore the map is masked and trimmed into a smaller box to speed up calculations, as discussed in Nicholls et al. (2018).
Half maps are first sharpened, masked at a radius of 3 Å (default) from the atom positions and then blurred by the same factor. Sharpening before masking is important to avoid masking away any of the signal (the tails of the atomic density distributions), because the raw half maps are blurred and the signal is spread out. The optimal sharpening will differ depending on the region, but here we use an overall isotropic B value estimated by comparing |Fo| with |Fc| calculated from a copy of the initial model with all ADPs set to zero. Alternatively, a user-supplied B value can be used. The sharpened–masked–unsharpened half maps are then averaged to make a full map that is used as the refinement target in REFMAC5. After the refinement, the map–model FSC is calculated using a newly created mask based on the refined model.
4.3. Point-group symmetry
If the maps are symmetrized, the user can specify a point-group symbol and give the coordinates for just a single asymmetric unit. Symmetry operators are calculated from the symbols (Cn, Dn, O, T and I) following the axis convention in RELION (Scheres, 2012), which follows the common orientation convention (Heymann et al., 2005) except for T. It is also assumed that the centre of the box is the origin of symmetry. This requires a translation for each rotation Rj, which can be calculated as c − Rjc = (I − Rj)c, where c is the origin of symmetry. Reconstruction programs such as RELION (Scheres, 2012) usually follow this assumption. However, the rotation of the axes and the position of the origin are arbitrary in general, and in future will be determined automatically using ProSHADE (Nicholls et al., 2018; Tykac, 2018) and EMDA. The model in the asymmetric unit is expanded when creating a mask and performing map trimming. The rotation matrices are invariant to changing the box sizes and shifts of the molecule. The translation vectors in the symmetry operators are recalculated for the shifted model.
REFMAC5 internally generates symmetry copies when calculating Fc and restraint terms. For anisotropic ADPs, the Baniso matrix in the Cartesian basis is transformed by $R\,B_{\mathrm{aniso}}R^{T}$. This anisotropic ADP transformation is also implemented in GEMMI.
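To make the operator construction concrete, the following sketch generates the operators of a cyclic group with translations (I − R)c and rotates an anisotropic ADP tensor. It assumes that the Cn axis lies along z through the symmetry origin c; the function names are illustrative and are not taken from Servalcat or GEMMI.

```python
import numpy as np

def cn_operators(n, centre):
    """(R, t) pairs for point group Cn about z, with t = (I - R) c."""
    ops = []
    for j in range(n):
        a = 2.0 * np.pi * j / n
        rot = np.array([[np.cos(a), -np.sin(a), 0.0],
                        [np.sin(a),  np.cos(a), 0.0],
                        [0.0,        0.0,       1.0]])
        # The translation keeps the symmetry origin c fixed: R c + t = c
        ops.append((rot, (np.eye(3) - rot) @ centre))
    return ops

def transform_aniso_b(b_aniso, rot):
    """Rotate a Cartesian anisotropic ADP tensor: R B R^T."""
    return rot @ b_aniso @ rot.T
```

If the model is shifted by a vector d (for example after trimming the box), the same function can be called with centre + d, which leaves the rotations unchanged and updates only the translations.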
During the refinement, nonbonded interaction and ADP similarity restraints are evaluated using the symmetry-expanded model, and the gradients are calculated for the model in the asymmetric unit.
If atoms are on special positions (for example on a rotation axis), they are restrained (see footnote 2) to sit on the special position and have anisotropic ADPs consistent with symmetry. Firstly, atoms are identified as being on a special position if the following condition is obeyed for any of the symmetry operators j,
$$|R_j\mathbf{x} + \mathbf{t}_j - \mathbf{x}| < \varepsilon,$$
where ɛ is a tolerance that can be modified by users. The default value is 0.25 Å. If an atom is on a special position then the program makes sure that the symmetry operators for this position form a group that is a subgroup of the point group of the map. Once the elements of the subgroup for this atom have been identified, the atom is forced to be on that position by simply replacing its coordinates with the average of its images under the Ns elements of the subgroup,
$$\mathbf{x}_s = \frac{1}{N_s}\sum_{j}\left(R_j\mathbf{x} + \mathbf{t}_j\right).$$
In every cycle, the positions of these atoms are restrained to be on their special positions by adding a term to the target function in which the deviations are summed over all elements of the special position and weighted by σx, a user-controllable weight parameter for special positions. The occupancy of the atom is adjusted based on the multiplicity of the position.
If anisotropic ADPs are used, they are also forced to obey symmetry conditions for atoms on special positions by replacing the anisotropic tensor with
$$B_s = \frac{1}{N_s}\sum_{j}R_jB_{\mathrm{aniso}}R_j^{T}.$$
After this, similarly to the positional parameters, in every cycle restraints are applied to the anisotropic tensor of the atoms on special positions to avoid violation of the symmetry condition for the ADP, with σB as a user-controllable weight parameter for Baniso values on special positions. Here, the distance between anisotropic tensors is a Frobenius distance, $|B_1 - B_2|^2 = \sum_{i,j}(B_{1,ij} - B_{2,ij})^2$.
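The handling of special positions can be sketched as follows, under the assumption that an atom is placed on its special position by averaging its symmetry images over the operators that leave it (nearly) in place, and that the ADP tensor is symmetrized in the same way. All function names are hypothetical and the restraint terms themselves are not reproduced.

```python
import numpy as np

def special_position_ops(x, ops, tol=0.25):
    """Operators (R, t) that map x onto itself to within tol (in Å)."""
    return [(r, t) for r, t in ops if np.linalg.norm(r @ x + t - x) < tol]

def snap_to_special_position(x, site_ops):
    """Replace x by the average of its images under the site operators."""
    return np.mean([r @ x + t for r, t in site_ops], axis=0)

def symmetrize_aniso_b(b_aniso, site_ops):
    """Average R B R^T over the site operators so that B obeys the symmetry."""
    return np.mean([r @ b_aniso @ r.T for r, _ in site_ops], axis=0)
```

For an atom on a general position only the identity passes the tolerance test, so both averaging functions return their input unchanged.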
4.4. H atoms
Hydrogen electrons are usually shifted towards the parent atoms by 0.1–0.2 Å (Williams et al., 2018). This must be accounted for when calculating structure factors from the atomic model (Fc). REFMAC5 and Servalcat (GEMMI) use the Mott–Bethe formula (Mott & Bragg, 1930; Bethe, 1930; Murshudov, 2016), which can conveniently take this fact into account.
The atomic scattering factor for an atom with a shifted nucleus (equation 28) depends on Δx, the positional shift of the nucleus with respect to the centre of the electron density. The hydrogen density peak in real space is shifted beyond the position of the hydrogen nucleus and varies depending on the ADP and resolution cutoff (Nakane et al., 2020). The expected peak position may be calculated by the Fourier transform of (28). The new CCP4 monomer library includes nucleus bond distances (_chem_comp_bond.value_dist_nucleus; Nicholls et al., 2021).
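As a rough guide to the kind of expression meant by (28), the Mott–Bethe relation with a displaced nucleus can be written as below. This is a sketch rather than the article's own typesetting: c stands for the lumped physical constants (its value depends on how |s| is defined), Z is the atomic number and f_X is the X-ray scattering factor of the electron cloud, all of which are symbols assumed here. The sign of the phase follows the convention F(s) = Σ f exp(2πi sᵀx), with the origin at the centre of the electron density.

```latex
f_e(\mathbf{s}) \;=\; c\,\frac{Z\,\exp\!\left(2\pi i\,\mathbf{s}^{T}\Delta\mathbf{x}\right) - f_X(\mathbf{s})}{|\mathbf{s}|^{2}}
```

With Δx = 0 this reduces to the usual Mott–Bethe form c(Z − f_X)/|s|².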
4.5. Refinement
REFMAC5 performs a refinement against the Fourier transform of a sharpened–masked–unsharpened map (see Section 4.2) using a dedicated likelihood function for SPA (7). The estimated noise is not used at the moment. No solvent model is used. The average of map–model FSC weighted by the number of Fourier coefficients in each shell (FSC average) is reported to monitor the refinement. At low resolution the use of jelly-body restraints or external restraints is encouraged to ensure a large radius of convergence and stabilize the refinement (Murshudov et al., 2011; Nicholls et al., 2012). Note that jelly-body restraints are only useful when the initial model geometry is of good quality because they try to keep the model in its current conformation. After the refinement, Servalcat shifts the model back to the original box and adjusts the translation vectors of the symmetry operators if needed. It also generates an MTZ file of map coefficients including the sharpened and weighted Fo − Fc and Fo maps (as calculated by equations 17 and 18).
4.6. User interface
Servalcat has a command-line interface. A graphical interface will be available in CCP-EM, where the REFMAC5 interface has been updated and is now based on Servalcat.
From the user's point of view, the main difference in setting up a refinement job is that the default input is now a pair of half maps. (Refinement from a single input map is still possible but is no longer the default option.) The user is also offered more control over the options for weight, symmetry and handling of H atoms. At the end of the refinement, the Fo − Fc difference map from Servalcat is made available along with the other output files in the CCP-EM launcher.
5. Methods and results
5.1. Fo − Fc map for ligand visualization
Fo − Fc omit maps are widely used to convincingly demonstrate the existence of ligands in crystallography. They are also useful for this purpose in SPA. Fig. 2 shows examples of Fo − Fc omit maps for the ligand density from EMDB entries EMD-22898 (Kern et al., 2021) and EMD-8123 (Murray et al., 2016), clearly showing support for the presence of the ligand. To generate the map from EMD-22898, chain A of the atomic model from PDB entry 7kjr was refined using the half maps under C2 symmetry constraints. For EMD-8123, PDB entry 5it7 was refined using the half maps without symmetry constraints. After the refinement, the ligand and water atoms were omitted and the Fo − Fc maps were calculated. Map values were normalized within a mask. Since a suitable mask for EMD-22898 was not available in the EMDB, one was calculated from half-map correlation using EMDA.
The weighting and sharpening scheme in Servalcat was compared with alternatives using no weights or (FSCfull)^(1/2) weights (Rosenthal & Henderson, 2003), both with sharpening by the overall B value as determined from Wilson plot fitting by RELION (Supplementary Figs. S1 and S2). Especially in the case of EMDB entry EMD-8123 (Supplementary Fig. S2), sharpening by the overall B value obtained by line fitting gave oversharpened maps.
5.2. Fo − Fc map for detecting model errors
In crystallography, Fo − Fc maps are almost always used for manual and automatic model rebuilding. Strong negative density usually indicates that parts of the model should be moved away or removed, while strong positive density implies that there are unmodelled atoms. The Fo − Fc map is typically updated after every refinement session, and refinement may be stopped when there are no significant strong peaks.
The same practice is possible in SPA. Fig. 3 illustrates the use of the Fo − Fc map for detecting model errors using EMDB entry EMD-0919 and PDB entry 6lmt (Demura et al., 2020). Chain A of the model was refined using the half maps under C8 symmetry constraints. After the refinement, the Fo − Fc map was calculated and normalized using the standard deviation of the region within the EMDB-deposited mask. In this example, it is clear from the positive and negative difference peaks that the tryptophan and methionine side chains should be repositioned. The weighting and sharpening schemes are compared in Supplementary Fig. S3, demonstrating that appropriate weighting can increase the interpretability of maps.
5.3. Hydrogen density analysis
Nakane et al. (2020) reported convincing densities for H atoms in apoferritin and GABAAR maps by cryo-EM SPA at 1.2 and 1.7 Å resolution, respectively. It is natural to ask what is the lowest resolution at which H atoms can be seen in cryo-EM SPA using currently available computational tools.
Here, we analyzed apoferritin maps from the EMDB to see if and when hydrogen densities could be observed. There are 25 mouse or human apoferritin entries at resolutions better than 2.1 Å, of which 19 had half maps and were used in the analysis (Table 1). Chain A of each model was refined using the half maps under O symmetry constraints. If there was no corresponding PDB entry, PDB entry 7a4m or 6z6u was placed in the map using MOLREP (Vagin & Teplyakov, 2010) followed by jiggle fit in Coot (Brown et al., 2015) before full atomic refinement. After ten cycles of refinement with REFMAC5, an Fo − Fc map was calculated and normalized within the mask. Riding H atoms were used in the refinement (so they are not refined, but generated at fixed positions; this is the default in REFMAC5) and they were omitted for Fo − Fc map calculation. Peaks of ≥2σ and ≥3σ were detected using PEAKMAX from the CCP4 package (Winn et al., 2011), and were associated with hydrogen positions if the distance from the peak was less than 0.3 Å. H atoms having multiple potential minima (such as those in hydroxyl, sulfhydryl or carboxyl groups) were ignored in the analysis. The ratios of the number of hydrogen peaks to the number of H atoms in the model are plotted in Fig. 4(a). The result shows that the 1.25 Å resolution data gave the highest ratio of ∼70% hydrogens detected (Fig. 5a). Even at 1.84 Å resolution approximately 17% of the H atoms may be found (Fig. 5b), while at 2.0 or 2.1 Å resolution only a few H atoms are visible in the map (Fig. 5c). The weighting and sharpening schemes are compared in Supplementary Figs. S4–S6. Note that there may be false positives due to, for example, alternative conformations or inaccuracies in the model.
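The peak-to-hydrogen association used in this analysis can be approximated by a brute-force distance search. In the sketch below, `peaks` and `hydrogens` are assumed to be N×3 arrays of Cartesian coordinates in Å that have already been extracted (for example from PEAKMAX output and from the riding-hydrogen model); symmetry mates are ignored and the function name is made up for this example.

```python
import numpy as np

def hydrogen_detection_ratio(peaks, hydrogens, max_dist=0.3):
    """Fraction of H atoms with a difference-map peak closer than max_dist (Å)."""
    if len(peaks) == 0:
        return 0.0
    # Pairwise distances: rows are H atoms, columns are peaks
    d = np.linalg.norm(hydrogens[:, None, :] - peaks[None, :, :], axis=2)
    # An H atom counts as detected if its nearest peak is within the cutoff
    return np.count_nonzero(d.min(axis=1) < max_dist) / len(hydrogens)
```

Calling the function once with the ≥2σ peak list and once with the ≥3σ list gives the two ratios per entry; for large models a k-d tree would be a more efficient alternative to the full distance matrix.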
In addition, Fo − Fc maps were generated from the 1.2 Å resolution data (PDB entry 7a4m; EMDB entry EMD-11638) using several different resolution cutoffs. These were analysed in the same way (Fig. 4c), along with Fc maps calculated from the PDB entry 7a4m model at the same resolutions (Fig. 4d). Figs. 4(c) and 4(d) show that if the cryo-EM experiment and atomic model refinement are carried out carefully, with due attention to ADPs, then some H atoms can be seen even at 2.0 Å resolution.
For comparison, we performed the same analysis using X-ray crystallographic data for (apo)ferritins deposited in the PDB. 51 re-refined atomic models available in the PDB-REDO database (Joosten et al., 2012) were downloaded, crystallographic mFo − DFc maps were calculated using REFMAC5 and density peaks for H atoms were analysed as just described. The result (Fig. 4b) confirms that, as expected, H atoms are more visible in EM than using X-rays.
6. Conclusions
A new program, Servalcat, for the refinement and validation of atomic models using cryo-EM SPA maps has been developed. The program controls the refinement flow and performs difference-map calculations. A weighted and sharpened Fo − Fc map was derived as a validation tool, obtained from the posterior distribution of FT and an approximation of an overall blurring factor calculated from the variance of the signal. We showed that such maps are useful to visualize H atoms and model errors, as in crystallography.
In this work, we assumed the blurring factor k was position-independent (see Section 3). However, in reality, blurring of maps is position- and direction-dependent, for example due to the varying mobility of different domains and/or uncertainty in the particle alignments. For such regions k should ideally be replaced with klocal, derived from a local map blurring parameter Blocal according to klocal(s) = exp(−Blocal|s|2/4) (if isotropic) or exp(−sTBlocals/4) (if anisotropic). If we could estimate Blocal values, then we would be able to use them for the visual improvement of maps. This is especially important for identifying weak densities. We are working on this subject.
We showed that many H atoms may be observed in the difference maps, even up to a resolution of 2 Å. We would expect that they should also be visible in electron diffraction (MicroED) experiments. However, high accuracy would be needed in the experiment, data analysis and model refinement in both MicroED and cryo-EM SPA to achieve this experimentally. For example, the electron dose in cryo-EM experiments is often high enough to cause radiation damage (Hattne et al., 2018); H atoms are known to suffer from radiation damage (Leapman & Sun, 1995) and this would hinder their detection. Lower dose experiments might be needed for more reliable identification of hydrogen, even at the expense of resolution.
Symmetry is widely used in cryo-EM SPA. When symmetry is imposed in the reconstruction, it should be used throughout the downstream analyses, and all software tools should be aware of it and take it into account. The asymmetric unit model should be refined under symmetry constraints, and it should be deposited in the PDB with the correct annotation of the symmetry. The PDB and EMDB deposition system will need to validate the symmetry of both the model and the map. We hope that this will become common practice in the future. The same practice should be established for helical reconstructions, in which symmetry is described by the axial symmetry type (Cn or Dn), twist and rise (He & Scheres, 2017). Servalcat will support helical symmetry in the future.
Servalcat is freely available under an open source (MPL-2.0) licence at https://github.com/keitaroyam/servalcat. The features described in this paper have been implemented in REFMAC 5.8.0291 and Servalcat 0.2.0 (which requires GEMMI 0.4.9). Servalcat is also available in the latest nightly builds of the CCP-EM suite and will be included in the upcoming version 1.6 release.
Supporting information
Supplementary Figures. DOI: https://doi.org/10.1107/S2059798321009475/qt5003sup1.pdf
Footnotes
1. There is a similar record, BIOMT, which encodes the biological assembly. In SPA, the symmetry of the map usually corresponds to the biological assembly, but this is not always the case. Both MTRIX and BIOMT records are generally required during deposition.
2. Technically, fixed position constraints would be more appropriate here. We used restraints instead of constraints for simplicity of implementation. In the future, we will implement the use of constraints instead.
Acknowledgements
The authors are grateful to Marcin Wojdyr for the implementation of Fc calculation for EM in the GEMMI library, Takanori Nakane for critical reading of the manuscript, computational structural biology group members for discussion, and Jake Grimmett and Toby Darling from the MRC–LMB Scientific Computing Department for computing support and resources.
Funding information
This work was supported by the Medical Research Council as part of UK Research and Innovation (MC_UP_A025_1012 to KY and GNM; MR/V000403/1 to CMP and TB).
References
Afonine, P. V., Poon, B. K., Read, R. J., Sobolev, O. V., Terwilliger, T. C., Urzhumtsev, A. & Adams, P. D. (2018). Acta Cryst. D74, 531–544.
Bai, X.-C., McMullan, G. & Scheres, S. H. W. (2015). Trends Biochem. Sci. 40, 49–57.
Bethe, H. (1930). Ann. Phys. 397, 325–400.
Brown, A., Long, F., Nicholls, R. A., Toots, J., Emsley, P. & Murshudov, G. (2015). Acta Cryst. D71, 136–153.
Burnley, T., Palmer, C. M. & Winn, M. (2017). Acta Cryst. D73, 469–477.
Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010). Acta Cryst. D66, 12–21.
Chojnowski, G., Sobolev, E., Heuser, P. & Lamzin, V. S. (2021). Acta Cryst. D77, 142–150.
Clabbers, M. T. B. & Abrahams, J. P. (2018). Crystallogr. Rev. 24, 176–204.
Cragnolini, T., Sahota, H., Joseph, A. P., Sweeney, A., Malhotra, S., Vasishtan, D. & Topf, M. (2021). Acta Cryst. D77, 41–47.
Danev, R., Yanagisawa, H. & Kikkawa, M. (2019). Trends Biochem. Sci. 44, 837–848.
Danev, R., Yanagisawa, H. & Kikkawa, M. (2021). Microscopy, dfab016.
Demura, K., Kusakizako, T., Shihoya, W., Hiraizumi, M., Nomura, K., Shimada, H., Yamashita, K., Nishizawa, T., Taruno, A. & Nureki, O. (2020). Sci. Adv. 6, eaba8105.
Fislage, M., Shkumatov, A. V., Stroobants, A. & Efremov, R. G. (2020). IUCrJ, 7, 707–718.
Guo, H., Franken, E., Deng, Y., Benlekbir, S., Singla Lezcano, G., Janssen, B., Yu, L., Ripstein, Z. A., Tan, Y. Z. & Rubinstein, J. L. (2020). IUCrJ, 7, 860–869.
Hattne, J., Shi, D., Glynn, C., Zee, C.-T., Gallagher-Jones, M., Martynowycz, M. W., Rodriguez, J. A. & Gonen, T. (2018). Structure, 26, 759–766.
He, S. & Scheres, S. H. W. (2017). J. Struct. Biol. 198, 163–176.
Heymann, J. B., Chagoyen, M. & Belnap, D. M. (2005). J. Struct. Biol. 151, 196–207.
Hoh, S. W., Burnley, T. & Cowtan, K. (2020). Acta Cryst. D76, 531–541.
Jakobi, A. J., Wilmanns, M. & Sachse, C. (2017). eLife, 6, e27131.
Joosten, R. P., Joosten, K., Murshudov, G. N. & Perrakis, A. (2012). Acta Cryst. D68, 484–496.
Joseph, A. P., Lagerstedt, I., Jakobi, A., Burnley, T., Patwardhan, A., Topf, M. & Winn, M. (2020). J. Chem. Inf. Model. 60, 2552–2560.
Kato, T., Makino, F., Nakane, T., Terahara, N., Kaneko, T., Shimizu, Y., Motoki, S., Ishikawa, I., Yonekura, K. & Namba, K. (2019). Microsc. Microanal. 25, 998–999.
Kern, D. M., Sorum, B., Mali, S. S., Hoel, C. M., Sridharan, S., Remis, J. P., Toso, D. B., Kotecha, A., Bautista, D. M. & Brohawn, S. G. (2021). Nat. Struct. Mol. Biol. 28, 573–582.
Leapman, R. D. & Sun, S. (1995). Ultramicroscopy, 59, 71–79.
Luzzati, V. (1952). Acta Cryst. 5, 802–810.
Mott, N. F. & Bragg, W. L. (1930). Proc. R. Soc. London A, 127, 658–665.
Murray, J., Savva, C. G., Shin, B.-S., Dever, T. E., Ramakrishnan, V. & Fernández, I. S. (2016). eLife, 5, e13567.
Murshudov, G. N. (2016). Methods Enzymol. 579, 277–305.
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
Nakane, T., Kotecha, A., Sente, A., McMullan, G., Masiulis, S., Brown, P. M. G. E., Grigoras, I. T., Malinauskaite, L., Malinauskas, T., Miehling, J., Uchański, T., Yu, L., Karia, D., Pechnikova, E. V., de Jong, E., Keizer, J., Bischoff, M., McCormack, J., Tiemeijer, P., Hardwick, S. W., Chirgadze, D. Y., Murshudov, G., Aricescu, A. R. & Scheres, S. H. W. (2020). Nature, 587, 152–156.
Naydenova, K., Peet, M. J. & Russo, C. J. (2019). Proc. Natl Acad. Sci. USA, 116, 11718–11724.
Nicholls, R. A., Long, F. & Murshudov, G. N. (2012). Acta Cryst. D68, 404–417.
Nicholls, R. A., Tykac, M., Kovalevskiy, O. & Murshudov, G. N. (2018). Acta Cryst. D74, 492–505.
Nicholls, R. A., Wojdyr, M., Joosten, R. P., Catapano, L., Long, F., Fischer, M., Emsley, P. & Murshudov, G. N. (2021). Acta Cryst. D77, 727–745.
Pintilie, G., Zhang, K., Su, Z., Li, S., Schmid, M. F. & Chiu, W. (2020). Nat. Methods, 17, 328–334.
Ramírez-Aportela, E., Vilas, J. L., Glukhova, A., Melero, R., Conesa, P., Martínez, M., Maluenda, D., Mota, J., Jiménez, A., Vargas, J., Marabini, R., Sexton, P. M., Carazo, J. M. & Sorzano, C. O. S. (2019). Bioinformatics, 36, 765–772.
Ramlaul, K., Palmer, C. M. & Aylett, C. H. (2019). J. Struct. Biol. 205, 30–40.
R Core Team (2020). R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing.
Rosenthal, P. B. & Henderson, R. (2003). J. Mol. Biol. 333, 721–745.
Scheres, S. H. W. (2012). J. Struct. Biol. 180, 519–530.
Schrodinger, LLC (2020). The PyMOL Molecular Graphics System, Version 2.4.
Tagari, M., Newman, R., Chagoyen, M., Carazo, J.-M. & Henrick, K. (2002). Trends Biochem. Sci. 27, 589.
Tan, Y. Z. & Rubinstein, J. L. (2020). Acta Cryst. D76, 1092–1103.
Terwilliger, T. C., Adams, P. D., Afonine, P. V. & Sobolev, O. V. (2018a). Nat. Methods, 15, 905–908.
Terwilliger, T. C., Sobolev, O. V., Afonine, P. V. & Adams, P. D. (2018b). Acta Cryst. D74, 545–559.
Terwilliger, T. C., Sobolev, O. V., Afonine, P. V., Adams, P. D. & Read, R. J. (2020). Acta Cryst. D76, 912–925.
Tickle, I. J. (2012). Acta Cryst. D68, 454–467.
Tronrud, D. E. (2004). Acta Cryst. D60, 2156–2168.
Tykac, M. (2018). PhD thesis. University of Cambridge. https://doi.org/10.17863/CAM.31783.
Vagin, A. & Teplyakov, A. (2010). Acta Cryst. D66, 22–25.
Warshamanage, R., Yamashita, K. & Murshudov, G. N. (2021). bioRxiv, 2021.07.26.453750.
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. New York: Springer.
Williams, C. J., Headd, J. J., Moriarty, N. W., Prisant, M. G., Videau, L. L., Deis, L. N., Verma, V., Keedy, D. A., Hintze, B. J., Chen, V. B., Jain, S., Lewis, S. M., Arendall, W. B. III, Snoeyink, J., Adams, P. D., Lovell, S. C., Richardson, J. S. & Richardson, D. C. (2018). Protein Sci. 27, 293–315.
Wilson, A. J. C. (1942). Nature, 150, 152.
Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., Emsley, P., Evans, P. R., Keegan, R. M., Krissinel, E. B., Leslie, A. G. W., McCoy, A., McNicholas, S. J., Murshudov, G. N., Pannu, N. S., Potterton, E. A., Powell, H. R., Read, R. J., Vagin, A. & Wilson, K. S. (2011). Acta Cryst. D67, 235–242.
Wlodawer, A. & Dauter, Z. (2017). Acta Cryst. D73, 379–380.
Wu, M., Lander, G. C. & Herzik, M. A. (2020). J. Struct. Biol. X, 4, 100020.
Yip, K. M., Fischer, N., Paknia, E., Chari, A. & Stark, H. (2020). Nature, 587, 157–161.
Zivanov, J., Nakane, T., Forsberg, B. O., Kimanius, D., Hagen, W. J., Lindahl, E. & Scheres, S. H. W. (2018). eLife, 7, e42166.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.