Estimation of high-order aberrations and anisotropic magnification from cryo-EM data sets in RELION-3.1

Zivanov, J.; Nakane, T.; Scheres, S. H. W.

doi:10.1107/S2052252520000081

research papers

IUCrJ

Volume 7| Part 2| March 2020| Pages 253-267

ISSN: 2052-2525

Open access

Jasenko Zivanov,^a,^b Takanori Nakane ^a and Sjors H. W. Scheres ^a ^*

^aMedical Research Council Laboratory of Molecular Biology, Cambridge CB2 0QH, England, and ^bBiozentrum, University of Basel, Switzerland
^*Correspondence e-mail: scheres@mrc-lmb.cam.ac.uk

(Received 30 October 2019; accepted 6 January 2020; online 11 February 2020)

Methods are presented that detect three types of aberrations in single-particle cryo-EM data sets: symmetrical and antisymmetrical optical aberrations and magnification anisotropy. Because these methods only depend on the availability of a preliminary 3D reconstruction from the data, they can be used to correct for these aberrations for any given cryo-EM data set, a posteriori. Using five publicly available data sets, it is shown that considering these aberrations improves the resolution of the 3D reconstruction when these effects are present. The methods are implemented in version 3.1 of the open-source software package RELION.

Keywords: cryo-EM; RELION; aberrations; anisotropic magnification.

1. Introduction

Structure determination of biological macromolecules using electron cryo-microscopy (cryo-EM) is primarily limited by the radiation dose to which samples can be exposed before they are destroyed. As a consequence of the low electron dose, cryo-EM has to rely on very noisy images. In recent years, advances in electron-detector technology and processing algorithms have enabled the reconstruction of molecular structures at resolutions sufficient for de novo atomic modelling (Fernandez-Leiro & Scheres, 2016 ). With increasing resolutions, limitations imposed by the optical system of the microscope are becoming more important. In this paper, we propose methods to estimate three optical effects – symmetrical and antisymmetrical aberrations, and magnification anisotropy – which, when considered during reconstruction, increase the attainable resolution.

In order to increase contrast, cryo-EM images are typically collected out of focus, which introduces a phase shift between the scattered and unscattered components of the electron beam. This phase shift varies with spatial frequency and gives rise to the contrast-transfer function (CTF). Since the electron-scattering potential of the sample corresponds to a real-valued function, its Fourier-space representation exhibits Friedel symmetry: the complex structure factor at spatial frequency k is the complex conjugate of the structure factor at frequency −k. Traditionally, the phase shift of these two frequencies has been assumed to be identical, which corresponds to a real-valued CTF. Imperfections of the optical system can, however, produce asymmetrical phase shifts that break the Friedel symmetry of the scattered wave. The effect of this is that the CTF has to be expressed as a complex-valued function, which affects not only the amplitudes of the structure factors but also their phases.

The phase shifts of a pair of corresponding spatial frequencies can be separated into a symmetrical component (i.e. their average shift) and an antisymmetrical component (i.e. their deviation from that average). In this paper, we will refer to the antisymmetrical component as antisymmetrical aberrations. The symmetrical component of the phase shift sometimes also deviates from that predicted by the aberration-free CTF model (Hawkes & Kasper, 1996 ). The effect of this is that the CTF is not always adequately represented by a set of elliptical rings of alternating sign, but these so-called Thon rings can take on slightly different shapes. We will refer to this deviation from the traditional CTF model as symmetrical aberrations.

In addition to the antisymmetrical and symmetrical aberrations, the recorded image itself can be distorted by a different magnification in two perpendicular directions. This is called anisotropic magnification. Anisotropic magnification can be detected by measuring the ellipticity of the power spectra of multi-crystalline test samples (Grant & Grigorieff, 2015 ). This has the advantage of providing a calibration of the absolute magnification, but does require additional experiments, and microscope alignments may drift in between such experiments. For several icosahedral virus data sets, it has been shown that anisotropic magnification may be detected and corrected by an exhaustive search over the amount and the direction of the anisotropy while comparing projections of an undistorted three-dimensional reference map with individual particle images (Yu et al., 2016 ).

Because the antisymmetrical and symmetrical aberrations and the anisotropic magnification produce different effects, we propose three different and independent methods to estimate them. We recently proposed a method to estimate a specific type of antisymmetrical aberration that arises from a tilted electron beam (Zivanov et al., 2018 ). In this paper, we propose an extension of this method that allows us to estimate arbitrary antisymmetrical aberrations expressed as linear combinations of Zernike polynomials (Zernike, 1934 ). The methods to estimate symmetrical aberrations and anisotropic magnification are novel. Similar to the method for antisymmetrical aberration correction, the method for symmetrical aberration correction also uses Zernike polynomials to model the estimated aberrations. The choice of Zernike polynomials as a basis is to some degree arbitrary, and the methods described here could be trivially altered to use any other function as a basis. In particular, we make no use of the orthogonality of Zernike polynomials, since they are only defined to be orthogonal on the entire, evenly weighted unit disc. In our case, the evidence is distributed non-uniformly across Fourier space, and accounting for this fact breaks the orthogonality of the polynomials.

Optical aberrations in the electron microscope have been studied extensively in the materials science community (Batson et al., 2002 ; Krivanek et al., 2008 ; Saxton, 1995 , 2000 ; Meyer et al., 2002 ). However, until now, their estimation has required specific test samples of known structure and of greater radiation resistance than biological samples. The methods presented in this paper work directly on cryo-EM single-particle data sets of biological samples, making it possible to estimate the effects after the data have been collected, and without performing additional experiments on specific test samples. Using data sets that are publicly available from the EMPIAR database (Iudin et al., 2016 ), we illustrate that when these optical effects are present their correction leads to reconstruction with increased resolution.

2. Materials and methods

2.1. Observation model

We are working on a single-particle cryo-EM data set consisting of a large number of particle images. We assume that we already have a preliminary 3D reference map of the particle up to a certain resolution, and that we know the approximate viewing parameters of all observed particles. This allows us to predict each particle image, which in turn allows us to estimate the parameters of the optical effects by comparing these predicted images with the observed images.

Let X_p,k ∈ $[{\bb C}]$ be the complex amplitude of the observed image of particle p ∈ $[{\bb N}]$ for 2D spatial frequency k ∈ $[{\bb Z}^2]$ . Without loss of generality, we can assume that the observed image is shifted so that the centre of the particle appears at the origin of the image. We can obtain the corresponding predicted image by integrating over the 3D reference along the appropriate viewing direction. According to the central-slice theorem, the corresponding complex amplitude V_p,k ∈ $[{\bb C}]$ of the predicted particle image is given by

$[V_{p,{\bf k}} = W(A_p {\bf k}), \eqno (1)]$

where $[W: {\bb R}^3 \mapsto {\bb C}]$ is the 3D reference map in Fourier space and A_p is a 3 × 2 projection matrix arising from the viewing angles. Since the back-projected positions of the 2D pixels k mostly fall between the 3D voxels of the reference map, we determine the values of W(A_pk) using linear interpolation.

Further, we assume that we have an estimate of the defocus and astigmatism of each particle, as well as the spherical aberration of the microscope, allowing us to also predict the CTFs. We can therefore write

$[X_{p,{\bf k}} = \exp(i \varphi_{\bf k}) {\rm CTF}_{p,{\bf k}} V_{p,{\bf k}} + n_{p,{\bf k}}, \eqno (2)]$

where φ_k is the phase-shift angle induced by the antisymmetrical aberration, CTF_p,k is the real part of the CTF and n_p,k represents the noise.

The three methods presented in the following all aim to estimate the optical effects by minimizing the squared difference between X_p,k and exp(iφ_k)CTF_p,kV_p,k. This is equivalent to a maximum-likelihood estimate under the assumption that all n_p,k are drawn from the same normal distribution.

2.2. Antisymmetrical aberrations

Antisymmetrical aberrations shift the phases in the observed images and they are expressed by the angle φ_k in (2). We assume that φ_k is constant for a sufficiently large number of particles. This assumption is necessary since, in the presence of typically strong noise, we require the information from a large number of particle images to obtain a reliable estimate.

We model φ_k using antisymmetrical Zernike polynomials as a basis,

$[\varphi_{\bf k}({\bf c}) = \textstyle\sum \limits_b c_b Z_b({\bf k}), \eqno (3)]$

where c_b ∈ $[{\bb R}]$ are the unknown coefficients describing the aberration and Z_b(k) are a subset of the antisymmetrical Zernike polynomials. The usual two-index ordering of these polynomials is omitted for the sake of clarity. This set of polynomials always includes the first-order terms Z₁⁻¹(k) and Z₁¹(k) that correspond to rigid motion in 2D. It is essential to consider these terms during estimation, since they capture any systematic errors in particle positions that arise when the positions are estimated under antisymmetrical aberrations, in particular under axial coma arising from beam tilt. In that situation, the particles are erroneously shifted in order to neutralise the coma in the mid-frequency range, which overcompensates for the phase shift in the low-frequency range. The measured phase shift is therefore a superposition of an axial coma and a translation and has to be modelled as such.

The coefficients c_b are determined by minimizing the following sum of squared differences over all particles,

$[E_{\rm asymm} = \textstyle \sum \limits_{p,{\bf k}} f_{{\bf k}}\big|X_{p,{\bf k}} - \exp[i \varphi_{\bf k}({\bf c})] {\rm CTF}_{p,{\bf k}} V_{p,{\bf k}} \big|^2, \eqno (4)]$

where f_k is a heuristical weighting term given by the FSC of the reconstruction; its purpose is to suppress the contributions of frequencies |k| for which the reference is less reliable.

Since typical data sets contain between 10⁴ and 10⁶ particles, and each particle image typically consists of more than 10⁴ Fourier pixels, optimizing the nonlinear expression in (4) directly would be highly impractical, especially since the images would likely have to be reloaded from disc in each iteration. Instead, we apply a two-step approach. Firstly, we reduce the above sum over sums of quadratic functions to a single sum over quadratic functions, one for each Fourier-space pixel k,

$[E_{\rm asymm} = \textstyle \sum \limits_{\bf k} w_{\bf k} |\exp[i \varphi_{\bf k}({\bf c})] - q_{\bf k}|^2 + K, \eqno (5)]$

where K is a constant that does not influence the optimum of c_b. The per-pixel optimal phase shifts q_k ∈ $[{\bb C}]$ and weights w_k ∈ $[{\bb R}]$ are given by

$[\eqalignno {q_{\bf k} & = \textstyle \sum \limits_p (X_{p,{\bf k}} {\rm CTF}_{p,{\bf k}} V_{p,{\bf k}}^*) / \textstyle \sum \limits_p {\rm CTF}_{p,{\bf k}}^2 |V_{p,{\bf k}}|^2, & (6) \cr w_{\bf k} &= f_{\bf k} \textstyle \sum \limits_p {\rm CTF}_{p,{\bf k}}^2 |V_{p,{\bf k}}|^2.& (7)}]$
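
The step from (4) to (5) can be checked by expanding the square for a single pixel k. The following is a sketch of that algebra, using only |exp(iφ_k)| = 1 and the definitions in (6) and (7); the per-pixel constant K_k is a symbol introduced here for illustration, with K being the sum of K_k over all pixels:

$[\eqalign{ f_{\bf k} \textstyle\sum\limits_p \big|X_{p,{\bf k}} - \exp(i\varphi_{\bf k}) {\rm CTF}_{p,{\bf k}} V_{p,{\bf k}}\big|^2 &= f_{\bf k} \textstyle\sum\limits_p \big[|X_{p,{\bf k}}|^2 + {\rm CTF}_{p,{\bf k}}^2 |V_{p,{\bf k}}|^2\big] - 2 f_{\bf k} {\rm Re}\big[\exp(-i\varphi_{\bf k}) \textstyle\sum\limits_p X_{p,{\bf k}} {\rm CTF}_{p,{\bf k}} V_{p,{\bf k}}^*\big] \cr &= w_{\bf k} \big\{1 - 2 {\rm Re}[\exp(-i\varphi_{\bf k}) q_{\bf k}] + |q_{\bf k}|^2\big\} + K_{\bf k} \cr &= w_{\bf k} \big|\exp(i\varphi_{\bf k}) - q_{\bf k}\big|^2 + K_{\bf k}, \qquad K_{\bf k} = f_{\bf k} \textstyle\sum\limits_p |X_{p,{\bf k}}|^2 - w_{\bf k} |q_{\bf k}|^2.}]$

Since K_k does not depend on φ_k, it has no influence on the optimal coefficients.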

This is the same transformation that we applied for the beam-tilt estimation in RELION-3.0 (Zivanov et al., 2018); beam tilt is in fact only one of the possible sources of antisymmetrical aberrations. The computation of q_k and w_k requires only one iteration over all the images in the data set, and for the data sets presented here it took about one hour on a 24-core 2.9 GHz Intel Xeon workstation.
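
The accumulation in (6) and (7) can be illustrated for a single Fourier pixel with a short Python sketch. This is not the RELION implementation; the function name `accumulate_phase_optimum` and the synthetic, noise-free data are assumptions made purely for illustration.

```python
import cmath
import random


def accumulate_phase_optimum(X, ctf, V, fsc_weight):
    """Return (q_k, w_k) of equations (6) and (7) for ONE Fourier pixel.

    X, V       : lists of complex amplitudes (observed, predicted), one per particle
    ctf        : list of real CTF values, one per particle
    fsc_weight : the heuristic weight f_k for this pixel
    """
    numerator = 0j     # running sum of X * CTF * conj(V)
    denominator = 0.0  # running sum of CTF^2 * |V|^2
    for x, c, v in zip(X, ctf, V):
        numerator += x * c * v.conjugate()
        denominator += c * c * abs(v) ** 2
    q = numerator / denominator
    w = fsc_weight * denominator
    return q, w


# Synthetic, noise-free particles that all share one phase shift (hypothetical data).
random.seed(0)
phi_true = 0.3
V = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
ctf = [random.uniform(-1, 1) for _ in range(200)]
X = [cmath.exp(1j * phi_true) * c * v for c, v in zip(ctf, V)]

q, w = accumulate_phase_optimum(X, ctf, V, fsc_weight=1.0)
phi_est = cmath.phase(q)  # phase angle of the per-pixel optimum
```

In this noise-free case q lies on the unit circle and its phase equals the simulated shift; with real data, q is a noisy estimate whose reliability is expressed by w.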

Once the q_k and w_k are known, the optimal c_b are determined by minimizing the following sum of squared differences using the Nelder–Mead downhill simplex (Nelder & Mead, 1965 ) method,

$[{\bf c} = {\rm argmin}_{{\bf c}^\prime}\textstyle\sum \limits_{\bf k} w_{\bf k} \big|\exp[i \varphi_{\bf k}({\bf c}^\prime)] - q_{\bf k}\big|^2. \eqno (8)]$

This step requires only seconds of computation time. In addition to making the problem tractable, this separation into two steps also allows us to inspect the phase angles of the per-pixel optima q_k visually and to determine the type of antisymmetrical aberration present in the data set.

After the optimal antisymmetrical aberration coefficients c have been determined, they are used to invert the phase shift of all observed images X when a 3D map is being reconstructed from them.

2.3. Symmetrical aberrations

Unlike the antisymmetrical aberrations, the symmetrical aberrations act on the absolute value of the CTF. In the presence of such aberrations, the CTF no longer consists of strictly elliptical rings of alternating sign, but can take on a more unusual form. In our experiments, we have specifically observed the ellipses deforming into slightly square-like shapes. In order to estimate the symmetrical aberration, we need to determine the most likely deformations of the CTFs hidden underneath the measured noisy pixels. Since the micrographs in a cryo-EM data set are usually collected at different defoci, it is not sufficient to measure the collective power spectrum of the entire data set; instead, we need to determine one deformation applied to different CTFs.

In RELION-3.1, the CTF is defined as

$[\eqalignno {{\rm CTF}_{p,{\bf k}} &= -\sin(\gamma_{p,{\bf k}}), & (9)\cr \gamma_{p,{\bf k}} & = {\bf k}^{\top} D_p {\bf k} + {{\pi} \over {2}} C_{\rm s} \lambda^3 |{\bf k}|^4 - \chi_p,& (10)}]$

where D_p is the real symmetrical 2 × 2 astigmatic-defocus matrix for particle p, C_s is the spherical aberration of the microscope, λ is the electron wavelength and χ_p is a constant offset given by the amplitude contrast and the phase shift owing to a phase plate (if one is used). We chose this formulation of astigmatism because it is both more concise and also more practical when dealing with anisotropic magnification, as shown in Section 2.4. In Appendix A, we define D_p and we show that this is equivalent to the more common formulation (Mindell & Grigorieff, 2003 ).
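
As a minimal sketch of (9) and (10), the following Python functions evaluate γ and the CTF at one spatial frequency. The function names and the numerical values are hypothetical and unit-free; the sketch assumes that D already contains whatever prefactors Appendix A assigns to it.

```python
import math


def gamma(k, D, Cs, lam, chi):
    """Phase term of equation (10) at the 2D spatial frequency k = (kx, ky)."""
    kx, ky = k
    # Quadratic form k^T D k for a 2x2 matrix D given as nested lists.
    defocus_term = D[0][0] * kx * kx + (D[0][1] + D[1][0]) * kx * ky + D[1][1] * ky * ky
    k_sq = kx * kx + ky * ky
    spherical_term = 0.5 * math.pi * Cs * lam ** 3 * k_sq * k_sq
    return defocus_term + spherical_term - chi


def ctf(k, D, Cs, lam, chi):
    """CTF value of equation (9)."""
    return -math.sin(gamma(k, D, Cs, lam, chi))


# Arbitrary illustrative values (not taken from any data set).
D = [[2.0, 0.3], [0.3, 1.5]]
Cs, lam, chi = 1.0, 0.5, 0.1

ctf_at_origin = ctf((0.0, 0.0), D, Cs, lam, chi)
ctf_plus = ctf((0.4, -0.2), D, Cs, lam, chi)
ctf_minus = ctf((-0.4, 0.2), D, Cs, lam, chi)
```

Because every term in (10) is even in k, the values at k and −k coincide, which is the real-valued, symmetrical CTF described in the text.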

We model the deformation of the CTF under symmetrical aberrations by offsetting γ,

$[{\rm CTF}_{p,{\bf k}} = -\sin[\gamma_{p,{\bf k}} + \psi_{\bf k}({\bf d})], \eqno (11)]$

where ψ_k(d) is modelled using symmetrical Zernike polynomials combined with a set of coefficients d ∈ $[{\bb R}^B]$ that describe the aberration,

$[\psi_{{\bf k}}({\bf d}) = \textstyle \sum \limits_b d_b Z_b({\bf k}). \eqno (12)]$

The optimal values of d_b are determined by minimizing another sum of squared differences,

$[\eqalignno {E_{\rm symm} & = \textstyle \sum \limits_{p,{\bf k}} f_{{\bf k}} \big|X_{p,{\bf k}} - {\rm CTF}_{p,{\bf k}} \widetilde{V}_{p,{\bf k}} \big|^2 &(13)\cr & = \textstyle\sum \limits_{p,{\bf k}} f_{{\bf k}} \big|X_{p,{\bf k}} + \sin[\gamma_{p,{\bf k}} + \psi_{\bf k}({\bf d})] \widetilde{V}_{p,{\bf k}} \big|^2, &(14)}]$

where the predicted complex amplitude $[\widetilde{V}_{p,{\bf k}}]$ contains the phase shift induced by the antisymmetrical aberration (if it is known),

$[\widetilde{V}_{p,{\bf k}} = \exp[i \varphi({\bf k})] V_{p,{\bf k}}. \eqno (15)]$

This is again a nonlinear equation with a large number of terms. In order to make its minimization tractable, we perform the following substitution,

$[\sin[\gamma_{p,{\bf k}} + \psi_{\bf k}({\bf d})] = {\bf r}_{p,{\bf k}}^\top {\bf t}_{{\bf k}}({\bf d}), \eqno(16)]$

with the known column vector r_p,k ∈ $[{\bb R}^2]$ given by

$[{\bf r}_{p,{\bf k}} = \left [ \matrix {\cos(\gamma_{p,{\bf k}}) \cr \sin(\gamma_{p,{\bf k}})}\right] \eqno (17)]$

and the unknown t_k(d) ∈ $[{\bb R}^2]$ by

$[{\bf t}_{{\bf k}}({\bf d}) = \left [ \matrix { \sin[\psi_{{\bf k}}({\bf d})] \cr \cos[\psi_{{\bf k}}({\bf d})]}\right]. \eqno (18)]$

This allows us to transform the one-dimensional nonlinear term for each pixel k into a two-dimensional linear term,

$[E_{\rm symm} = \textstyle \sum \limits_{p,{\bf k}} f_{{\bf k}} \Big| X_{p,{\bf k}} + \widetilde{V}_{p,{\bf k}} {\bf r}_{p,{\bf k}}^\top {\bf t}_{{\bf k}}({\bf d}) \Big|^2. \eqno (19)]$

In this form, we can decompose E_symm into a sum of quadratic functions over all pixels k. This is equivalent to the transformation in (5), only in two real dimensions instead of one complex dimension,

$[E_{\rm symm} = \textstyle \sum \limits_{\bf k} f_{\bf k} [{\bf t}_{\bf k}({\bf d}) - \hat{\bf t}_{\bf k}]^\top R_{\bf k} [{\bf t}_{\bf k}({\bf d}) - \hat{\bf t}_{\bf k}] + K, \eqno (20)]$

where the real symmetrical 2 × 2 matrix R_k is given by

$[R_{\bf k} = \textstyle \sum \limits_p |\widetilde{V}_{p,{\bf k}}|^2 {\bf r}_{p,{\bf k}} {\bf r}_{p,{\bf k}}^\top \eqno (21)]$

and the corresponding per-pixel optima $[\hat{{\bf t}}_{{\bf k}} \in \mathbb{R}^2]$ by

$[\eqalignno {\hat{\bf t}_{\bf k} & = -R_{\bf k}^{-1} \tau_{\bf k}, & (22) \cr \tau_{\bf k} & = \textstyle \sum \limits_p {\rm Re} (X^*_{p,{\bf k}} \widetilde{V}_{p,{\bf k}}) {\bf r}_{p,{\bf k}}.& (23)}]$

Again, computing R_k and $[\hat{{\bf t}}_{{\bf k}}]$ only requires one iteration over the data set, where for each pixel k five numbers need to be updated for each particle p: the three distinct elements of R_k (the matrix is symmetrical) and the two of τ_k. Once R_k and $[\hat{\bf t}_{\bf k}]$ are known, the optimal Zernike coefficients d are determined by minimizing E_symm in (20) using the Nelder–Mead downhill simplex algorithm. Analogously to the case of the antisymmetrical aberrations, a visual inspection of the optimal ψ_k(d) for each pixel allows us to examine the type of aberration without projecting it into the Zernike basis. The CTF phase-shift estimate for pixel k is given by $[\tan^{-1}[\hat{t}_{\bf k}^{(1)}/\hat{t}_{\bf k}^{(2)}]]$ , where $[\hat{t}_{\bf k}^{(1)}]$ and $[\hat{t}_{\bf k}^{(2)}]$ refer to the two components of $[\hat{\bf t}_{\bf k}]$.
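
The per-pixel accumulation of (21) to (23) can be sketched in Python as follows. The function names and the synthetic, noise-free data are assumptions for illustration only; the sketch handles one pixel k and relies on the particles having different values of γ, since otherwise R_k would be singular.

```python
import math
import random


def accumulate_symm(X, V_tilde, gammas):
    """Accumulate the three distinct entries of R_k and the two of tau_k for one pixel."""
    r00 = r01 = r11 = 0.0
    tau0 = tau1 = 0.0
    for x, v, g in zip(X, V_tilde, gammas):
        c, s = math.cos(g), math.sin(g)  # r = [cos(gamma), sin(gamma)], equation (17)
        power = abs(v) ** 2
        r00 += power * c * c
        r01 += power * c * s
        r11 += power * s * s
        re = (x.conjugate() * v).real    # Re(X^* V~)
        tau0 += re * c
        tau1 += re * s
    return (r00, r01, r11), (tau0, tau1)


def solve_t_hat(R, tau):
    """Per-pixel optimum t_hat = -R^{-1} tau of equation (22), via the closed-form 2x2 inverse."""
    r00, r01, r11 = R
    det = r00 * r11 - r01 * r01
    t0 = -(r11 * tau[0] - r01 * tau[1]) / det
    t1 = -(-r01 * tau[0] + r00 * tau[1]) / det
    return t0, t1


# Synthetic particles at different defoci that share one offset psi (hypothetical data).
random.seed(1)
psi_true = 0.2
gammas = [random.uniform(0.0, 2.0 * math.pi) for _ in range(300)]
V_tilde = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(300)]
X = [-math.sin(g + psi_true) * v for g, v in zip(gammas, V_tilde)]

R, tau = accumulate_symm(X, V_tilde, gammas)
t_hat = solve_t_hat(R, tau)
psi_est = math.atan2(t_hat[0], t_hat[1])  # tan^{-1}(t^(1) / t^(2))
```

Using `atan2` rather than a plain arctangent keeps the estimate in the correct quadrant when the second component is negative; this is a choice of the sketch, not something stated in the text.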

Once the coefficients d of the symmetrical aberration are known, they are used to correct any CTF that is computed in RELION-3.1.

2.4. Anisotropic magnification

To determine the anisotropy of the magnification, we again compare predicted images with the observed images. We assume that the 3D reference map W has been obtained by averaging views of the particle at in-plane rotation angles drawn from a uniform distribution. This is a realistic assumption, since unlike the angle between the particle and the ice surface, where the particle often shows a preferred orientation, the particle is oblivious to the orientation of the camera pixel grid. Thus, for a data set of a sufficient size, the anisotropy in the individual images averages out and the resulting reference map depicts an isotropically scaled 3D image of the particle (although the high-frequency information on the periphery of the particle is blurred out by the averaging). We can therefore estimate the anisotropy by determining the optimal deformation that has to be applied to the predicted images in order to best fit the observed images.

We are only looking for linear distortions of the image. Such a distortion can be equivalently represented in real space or in Fourier space: if the real-space image is distorted by a 2 × 2 matrix M, then the corresponding Fourier-space image is distorted by $[M^{-1\top}]$. We choose to operate in Fourier space since this allows us to determine the deformation of the predicted image without also distorting the CTF. We assume that the CTF parameters known at this point already fit the Thon rings observed in the image, so we only deform the particle itself.
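
The relation between the real-space and the Fourier-space distortion can be verified with a change of variables. The following sketch assumes the convention that the distorted image is g(x) = f(M⁻¹x), i.e. that a feature at position x moves to Mx; the symbols f, g, F and G are introduced here only for illustration:

$[G({\bf k}) = \int f(M^{-1} {\bf x}) \exp(-2 \pi i\, {\bf k}^\top {\bf x}) \,{\rm d}{\bf x} = |\det M| \int f({\bf y}) \exp[-2 \pi i\, (M^\top {\bf k})^\top {\bf y}] \,{\rm d}{\bf y} = |\det M| \, F(M^\top {\bf k}).]$

A feature of F at frequency k₀ therefore appears in G at the frequency k that satisfies $[M^\top {\bf k} = {\bf k}_0]$, i.e. at $[{\bf k} = M^{-1\top} {\bf k}_0]$, up to the constant scale factor |det M|.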

Formally, we define the complex amplitude V_p,k(M) of the predicted image deformed by a 2 × 2 matrix M by

$[V_{p,{\bf k}}(M) = W(A_p M {\bf k}), \eqno (24)]$

and we aim to determine such a matrix M that minimizes

$[E_{\rm mag} = \textstyle\sum \limits_{p,{\bf k}} \Big| X_{p,{\bf k}} - {\rm CTF}_{p,{\bf k}} \widetilde{V}_{p,{\bf k}}(M) \Big|^2, \eqno (25)]$

where $[\widetilde{V}]$ again refers to the phase-shifted complex amplitudes as defined in (15). We are not assuming that M is necessarily symmetrical, which allows it to express a skew component in addition to the anisotropic magnification. Such skewing effects are considered by the models commonly used in computer vision applications (Hartley, 1994 ; Hartley & Zisserman, 2003 ), but not in cryo-EM. We have decided to model the skew component as well, in case it should manifest in a data set.

The expression given in (25) is yet another sum over a large number of nonlinear terms. In order to obtain a sum over squares of linear terms, we first express the deformation by M as a set of per-pixel displacements δ_k ∈ $[{\bb R}^2]$ ,

$[M {\bf k} = {\bf k} + \delta_{\bf k}. \eqno (26)]$

Next, we perform a first-order Taylor expansion of W around A_pk. We know that this linear approximation of W is reasonable for all frequencies k at which the reference map contains any information, because the displacements δ_k are likely to be smaller than one voxel there. If they were significantly larger then they would prevent a successful computation of the complex amplitudes of the reference map at these frequencies, except if a very large number of particles were to be considered. The linear approximation is given as

$[\widetilde{V}_p({\bf k} + \delta_{\bf k}) \simeq \widetilde{V}_{p,{\bf k}} + {\bf g}_{p,{\bf k}}^\top \delta_{{\bf k}}, \eqno (27)]$

where the gradient g_p,k ∈ $[{\bb C}^2]$ is a column vector that is computed by forward-projecting the 3D gradient of W (which is given by the linear interpolation),

$[{\bf g}_{p,{\bf k}} = \exp[i \varphi({\bf k})] A_p^\top \nabla W(A_p {\bf k}). \eqno (28)]$

It is essential to compute g_p,k in this way, since computing it numerically from the already projected image V_p,k would lead to a systematic underestimation of the gradient (owing to the interpolation) and thus to a systematic overestimation of the displacement. Note also that the change in φ(k) as a result of the displacement is neglected. This is because the phase shift, like the CTF, has also been computed from the distorted images, so that we can assume it to be given correctly in the distorted coordinates.

Using the terms transformed in this way, the sum of squared errors can be approximated by

$[\eqalignno {E_{\rm {mag}} & \simeq \textstyle \sum \limits_{p,{\bf k}} f_{\bf k} \Big|X_{p,{\bf k}} - {\rm CTF}_{p,{\bf k}} \big(\widetilde{V}_{p,{\bf k}} + {\bf g}_{p,{\bf k}}^\top \delta_{\bf k}\big) \Big|^2 & (29) \cr & = \textstyle \sum \limits_{p,{\bf k}} f_{\bf k} \Big| X_{p,{\bf k}} - {\rm CTF}_{p,{\bf k}} \big[\widetilde{V}_{p,{\bf k}} + {\bf g}_{p,{\bf k}}^\top (M - I) {\bf k}\big] \Big|^2. & (30)}]$

This corresponds to two linear systems of equations to be solved in a least-squares sense, either for the per-pixel displacements δ_k (29) or for the global deformation matrix M (30). Analogously to the aberrations methods, we solve for both. Knowing the per-pixel solutions again allows us to confirm visually whether the observed deformations are consistent with a linear distortion; if they are, then the per-pixel displacements δ_k will follow a linear function of k.

The optimal displacements $[\hat{\delta}_{\bf k} \in {\bb R}^2]$ are equal to

$[\eqalignno {\hat{\delta}_{\bf k} & = S_{\bf k}^{-1} {\bf e}_{\bf k}, & (31) \cr {\bf e}_{\bf k} &= \textstyle \sum \limits_p {\rm CTF}_{p,{\bf k}} {\rm Re}[{\bf g}_{p,{\bf k}}^* (X_{p,{\bf k}} - {\rm CTF}_{p,{\bf k}} \widetilde{V}_{p,{\bf k}})], & (32)}]$

with the real symmetrical 2 × 2 matrix S_k given by

$[S_{\bf k} = \textstyle \sum \limits_p {\rm CTF}_{p,{\bf k}}^2 {\rm Re} ({\bf g}_{p,{\bf k}}^* {\bf g}_{p,{\bf k}}^\top).\eqno (33)]$

Note that this is equivalent to treating the real and imaginary components of (29) as separate equations, since Re(z*w) = Re(z)Re(w) + Im(z)Im(w) for all z, w ∈ $[{\bb C}]$ . Analogously to the estimation of the symmetrical aberrations, S_k and e_k are computed in one iteration by accumulating five numbers for each pixel k over the entire data set.
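
A minimal Python sketch of the per-pixel displacement estimate is given below for one pixel k. It is an illustration under stated assumptions rather than the RELION code: the function names and the synthetic data are hypothetical, and the residual is taken as X − CTF·Ṽ, which is what differentiating (29) with respect to δ_k yields.

```python
import random


def accumulate_displacement_terms(X, V_tilde, ctf, grads):
    """Accumulate the three distinct entries of S_k and the two of e_k for one pixel.

    grads is a list of (g1, g2) pairs holding the two complex gradient components.
    """
    s00 = s01 = s11 = 0.0
    e0 = e1 = 0.0
    for x, v, c, (g1, g2) in zip(X, V_tilde, ctf, grads):
        s00 += c * c * (g1.conjugate() * g1).real
        s01 += c * c * (g1.conjugate() * g2).real
        s11 += c * c * (g2.conjugate() * g2).real
        residual = x - c * v  # assumed residual of the undistorted prediction
        e0 += c * (g1.conjugate() * residual).real
        e1 += c * (g2.conjugate() * residual).real
    return (s00, s01, s11), (e0, e1)


def solve_displacement(S, e):
    """delta_hat = S^{-1} e using the closed-form inverse of a symmetric 2x2 matrix."""
    s00, s01, s11 = S
    det = s00 * s11 - s01 * s01
    return ((s11 * e[0] - s01 * e[1]) / det, (-s01 * e[0] + s00 * e[1]) / det)


def random_complex():
    return complex(random.gauss(0, 1), random.gauss(0, 1))


# Synthetic, noise-free particles that share one displacement (hypothetical data).
random.seed(2)
n = 200
delta_true = (0.10, -0.05)
V_tilde = [random_complex() for _ in range(n)]
grads = [(random_complex(), random_complex()) for _ in range(n)]
ctf = [random.uniform(-1, 1) for _ in range(n)]
X = [c * (v + g1 * delta_true[0] + g2 * delta_true[1])
     for v, c, (g1, g2) in zip(V_tilde, ctf, grads)]

S, e = accumulate_displacement_terms(X, V_tilde, ctf, grads)
delta_est = solve_displacement(S, e)
```

Only five real numbers per pixel are carried between particles, which matches the accumulation described in the text.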

The optimal 2 × 2 deformation matrix M is determined by first reshaping it into a column vector m ∈ $[{\bb R}^4]$ ,

$[M = \left [ \matrix { 1 + m^{(1)} & m^{(2)} \cr m^{(3)} & 1 + m^{(4)}} \right ]. \eqno (34)]$

The expression in (30) can then be written as

$[E_{\rm mag} = \textstyle \sum \limits_{p,{\bf k}} \big|X_{p,{\bf k}} - {\rm CTF}_{p,{\bf k}} \widetilde{V}_{p,{\bf k}} - {\bf a}_{p,{\bf k}}^\top {\bf m} \big|^2, \eqno (35)]$

with the column vector a_p,k ∈ $[{\bb C}^4]$ given by

$[{\bf a}_{p,{\bf k}} = {\rm CTF}_{p,{\bf k}} \left [\matrix {k^{(1)} g_{p,{\bf k}}^{(1)}\cr k^{(2)} g_{p,{\bf k}}^{(1)}\cr k^{(1)} g_{p,{\bf k}}^{(2)}\cr k^{(2)} g_{p,{\bf k}}^{(2)}}\right ]. \eqno (36)]$

We can now compute the optimal m,

$[{\bf m} = T^{-1} {\bf l}, \eqno (37)]$

where the real symmetrical 4 × 4 matrix T and the column vector l ∈ $[{\bb R}^4]$ are equal to

$[\eqalignno {T & = \textstyle \sum \limits_{p, {\bf k}} f_{\bf k} {\rm Re} ({\bf a}_{p,{\bf k}}^* {\bf a}_{p,{\bf k}}^\top ), & (38) \cr {\bf l} & = \textstyle \sum \limits_{p, {\bf k}} f_{\bf k} {\rm Re} [{\bf a}_{p,{\bf k}}^* (X_{p,{\bf k}} - {\rm CTF}_{p,{\bf k}} \widetilde{V}_{p,{\bf k}})]. & (39)}]$

There is no need to compute T and l explicitly by iterating over all particles p again, since all the necessary sums are already available as part of S_k and e_k. Instead, we only need to sum up the corresponding values over all pixels k. This is shown in Appendix B.
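
One way to assemble T and l from the per-pixel sums is sketched below in Python. Appendix B is not reproduced here, so the index layout is our reading of (33), (36), (38) and (39): the entry of a that multiplies the i-th gradient component and the j-th component of k is stored at position 2i + j, and e_k is assumed to be built from the residual X − CTF·Ṽ as in (39). The function names and the synthetic S_k and e_k are hypothetical.

```python
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        partial = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - partial) / aug[r][r]
    return x


def global_matrix(pixels):
    """Estimate M from per-pixel tuples (k, f, S, e), with S a 2x2 nested list."""
    T = [[0.0] * 4 for _ in range(4)]
    l = [0.0] * 4
    for k, f, S, e in pixels:
        for i in range(2):
            for j in range(2):
                row = 2 * i + j
                l[row] += f * k[j] * e[i]
                for i2 in range(2):
                    for j2 in range(2):
                        T[row][2 * i2 + j2] += f * k[j] * k[j2] * S[i][i2]
    m = solve_linear(T, l)  # m = T^{-1} l, equation (37)
    return [[1.0 + m[0], m[1]], [m[2], 1.0 + m[3]]]  # equation (34)


# Synthetic per-pixel sums that are consistent with a known matrix (hypothetical data).
random.seed(3)
M_true = [[1.02, 0.005], [-0.003, 0.99]]
pixels = []
for k1 in range(-3, 4):
    for k2 in range(-3, 4):
        if k1 == 0 and k2 == 0:
            continue
        a, b = random.gauss(0, 1), random.gauss(0, 1)
        S = [[a * a + 1.0, a * b], [a * b, b * b + 1.0]]  # symmetric positive definite
        d = [(M_true[0][0] - 1.0) * k1 + M_true[0][1] * k2,
             M_true[1][0] * k1 + (M_true[1][1] - 1.0) * k2]  # (M - I) k
        e = [S[0][0] * d[0] + S[0][1] * d[1], S[1][0] * d[0] + S[1][1] * d[1]]
        pixels.append(((k1, k2), 1.0, S, e))

M_est = global_matrix(pixels)
```

The loop runs over pixels only, so the cost of this step is independent of the number of particles.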

In order to correct for the anisotropy after M has been estimated, we never resample the observed images. When we compute a 3D map from a set of observed images, we do so by inserting 2D slices into the 3D Fourier-space volume. Since this process requires the insertion of 2D pixels at fractional 3D coordinates (and thus interpolation), we can avoid any additional resampling of the observed images by instead inserting pixel k into the 3D map at position A_pMk instead of at A_pk. Analogously, if the methods described in Sections 2.2 and 2.3 are applied after the distortion matrix M is known, then the predicted images are generated by reading the complex amplitude from W at 3D position A_pMk. This has been omitted from the description of these methods to aid readability.

Furthermore, when dealing with anisotropic magnification in RELION, we have chosen to always define the CTF in the undistorted 2D coordinates. The primary motivation behind this is the assumption that the spherical aberration (the second summand in equation 10) should only be radially symmetrical if the image is not distorted. For this reason, once the distortion matrix M is known, we need to transform the astigmatic-defocus matrix D into the new undistorted coordinate system. This is performed by conjugating D under M⁻¹,

$[D^\prime = M^{-1\top} D M^{-1}. \eqno (40)]$

When a CTF value is computed after this transformation has been performed, it is always computed as CTF(Mk) instead of as CTF(k).
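
The effect of (40) can be illustrated with a small Python sketch. The helper names and the numerical values are hypothetical; the sketch only checks that the astigmatic-defocus term k⊤Dk is unchanged when D′ is evaluated at Mk, whereas the spherical-aberration term is then evaluated at |Mk|.

```python
def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


def transpose_2x2(M):
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]


def matmul_2x2(A, B):
    return [[sum(A[i][n] * B[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]


def transform_defocus(D, M):
    """D' = M^{-T} D M^{-1}, equation (40)."""
    M_inv = inverse_2x2(M)
    return matmul_2x2(matmul_2x2(transpose_2x2(M_inv), D), M_inv)


def quadratic_form(D, k):
    """k^T D k for a 2D vector k."""
    return sum(k[i] * D[i][j] * k[j] for i in range(2) for j in range(2))


# Arbitrary illustrative values (not taken from any data set).
D = [[2.0, 0.3], [0.3, 1.5]]
M = [[1.02, 0.01], [0.0, 0.98]]
k = (0.4, -0.2)
Mk = (M[0][0] * k[0] + M[0][1] * k[1], M[1][0] * k[0] + M[1][1] * k[1])

D_prime = transform_defocus(D, M)
before = quadratic_form(D, k)
after = quadratic_form(D_prime, Mk)
```

The transformed matrix remains symmetrical even when M itself is not, because D′ has the form B⊤DB with B = M⁻¹.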

The Zernike polynomials that are used as a basis for the symmetrical and antisymmetrical aberrations are also defined in the undistorted coordinates, i.e. the Zernike polynomials are also evaluated at Z_b(Mk). Note that correction of these coefficients after estimating M cannot be performed analytically, but would require a numerical solution. Instead, we propose that the aberrations be estimated only after M is known. In severe cases, a better estimate of M can be obtained by repeating the magnification refinement after determining optimal defocus and astigmatism estimates using the initial estimate of M. We illustrate this scenario on a synthetic example in Section 3.4.

2.5. Implementation details

The three methods described above need to be applied to a large number of particles in order to obtain a reliable estimate. Nevertheless, we allow the three effects to vary within a data set in RELION-3.1. To facilitate this, we have introduced the concept of optics groups: partitions of the particle set that share the same optical properties, such as the voltage or pixel size (or the aberrations and the magnification matrix). As of RELION-3.1, those optical properties are allowed to vary between optics groups, while particles from different groups can still be refined together. This makes it possible to merge data sets collected on different microscopes with different magnifications and aberrations without the need to resample the images. The anisotropic magnification refinement can then be used to measure the relative magnification between the optics groups by refining their magnification against a common reference map.

Since most of the optical properties of a particle are now defined through the optics group to which it belongs, each particle STAR file written out by RELION-3.1 now contains two tables: one listing the optics groups and one listing the particles. The particles table is equivalent to the old table, except that certain optical properties are no longer listed. Those are typically the voltage, the pixel and image sizes, the spherical aberration and the amplitude contrast, and they are instead specified in the optics groups list. This reduces the overall file size, and makes manual editing of these properties easier.

A number of other optical properties are still stored in the particles list, allowing different values for different particles in the same group. These properties make up the per-particle part of the symmetrical aberration, i.e. the coefficient γ_p,k in (10). The specific parameters that can vary per particle are the following: the phase shift, defocus, astigmatism, the spherical aberration and the B-factor envelope.

The B-factor envelope is a two-dimensional parameter consisting of a scale factor S and the B factor itself. It corresponds to a Gaussian envelope over the CTF [given by S exp(−4B|k|²)] and it provides a means of weighting different particles against each other. Specifically, a greater B factor means that the particle will contribute less to the higher frequencies of the reconstruction. Although B factors on the CTF have been available in earlier releases of RELION, the method to estimate them is new in version 3.1.
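
The weighting effect of the envelope can be seen in a short Python sketch that evaluates the expression S exp(−4B|k|²) as given in the text. The function name and the numerical values are hypothetical and unit-free.

```python
import math


def b_factor_envelope(k_abs, scale, b_factor):
    """Gaussian envelope S * exp(-4 * B * |k|^2) applied on top of the CTF."""
    return scale * math.exp(-4.0 * b_factor * k_abs ** 2)


# Two hypothetical particles with the same scale but different B factors.
low_frequency, high_frequency = 0.05, 0.30
good_particle = [b_factor_envelope(k, 1.0, 5.0) for k in (low_frequency, high_frequency)]
poor_particle = [b_factor_envelope(k, 1.0, 20.0) for k in (low_frequency, high_frequency)]

# Relative weight of the poorer particle at each frequency.
relative_weight = [p / g for p, g in zip(poor_particle, good_particle)]
```

The relative weight of the particle with the greater B factor drops as the frequency increases, so it contributes less to the high-resolution part of the reconstruction.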

We have developed a new CTF refinement program that considers all particles in a given micrograph and locally optimises all of the above five parameters, while each parameter can be modelled either per particle, per micrograph or remain fixed. The program then uses the L-BFGS algorithm (Liu & Nocedal, 1989) to find the least-squares optimal parameter configuration given all the particles in the micrograph. This allows the user to find, for example, the most likely phase shift of a micrograph while simultaneously finding the most likely defocus value of each particle in it. The program has been engineered to offer a wide range of combinations, even though some of those may not appear to be useful at first, for example estimating the spherical aberration or the phase shift per particle. In this manner the program allows exceptions, for example very large particles, but we recommend that most users model only the defocus per particle and everything else per micrograph or not at all.

Note that the terms defocus and astigmatism above refer specifically to δz (defocus) and a₁ and a₂ (astigmatism), where the astigmatic defocus matrix D_p of particle p in (10) is composed as follows:

$[D_p = \left (\matrix {\delta z + a_1 & a_2 \cr a_2 & \delta z - a_1}\right).]$

As an example, this would allow the defocus to be expressed per particle by allocating a separate δz for each particle, while the astigmatism could be estimated per micrograph by requiring a₁ and a₂ to be identical for all particles.
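The composition of D_p can be sketched in a few lines. The example below builds one matrix per particle with an individual δz and a shared (a₁, a₂), mirroring the per-particle defocus and per-micrograph astigmatism case described above. The function names and all numbers are hypothetical and the units are arbitrary.

```python
def astigmatic_defocus_matrix(delta_z, a1, a2):
    """Return D_p = [[dz + a1, a2], [a2, dz - a1]] as nested lists."""
    return [[delta_z + a1, a2], [a2, delta_z - a1]]

def quadratic_form(D, k):
    """Evaluate k^T D k for a 2D frequency vector k = (kx, ky)."""
    kx, ky = k
    return D[0][0] * kx * kx + (D[0][1] + D[1][0]) * kx * ky + D[1][1] * ky * ky

# Astigmatism shared by all particles of a micrograph (hypothetical values).
a1, a2 = 0.02, -0.01
# One defocus value per particle (hypothetical values).
per_particle_dz = [1.10, 1.25, 0.95]

matrices = [astigmatic_defocus_matrix(dz, a1, a2) for dz in per_particle_dz]
for dz, D in zip(per_particle_dz, matrices):
    print(dz, D, quadratic_form(D, (0.1, 0.2)))
```

Any two of these matrices differ only by a multiple of the identity, so changing δz shifts the defocus without altering the astigmatism.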

3. Results

To validate our methods and to illustrate their usefulness, we describe four experiments using publicly available data sets. Firstly, we assess aberration correction on two data sets that were collected on a 200 keV Thermo Fisher Talos Arctica microscope. Secondly, we illustrate a limitation of our method for modelling aberrations using a data set that was collected on a 300 keV Thermo Fisher Titan Krios microscope with a Volta phase plate with defocus (Danev et al., 2017). Thirdly, we apply our methods to one of the highest resolution cryo-EM structures published so far, collected on a Titan Krios without a phase plate. Finally, we determine the precision to which the magnification matrix M can be recovered in a controlled experiment, using artificially distorted images, again from a Titan Krios microscope.

3.1. Aberration experiment at 200 keV

We reprocessed two publicly available data sets: one on rabbit muscle aldolase (EMPIAR-10181) and the other on the Thermoplasma acidophilum 20S proteasome (EMPIAR-10185). Both data sets were collected on the same 200 keV Talos Arctica microscope, which was equipped with a Gatan K2 Summit direct electron camera. At the time of the original publication (Herzik et al., 2017), the aldolase could be reconstructed to 2.6 Å resolution and the proteasome to 3.1 Å resolution using RELION-2.0.

We picked 159 352 particles for the aldolase data set and 74 722 for the proteasome. For both data sets, we performed five steps and measured the resolution at each step. Firstly, we refined the particles without considering the aberrations. The resulting 3D maps were then used to perform an initial CTF refinement in which the per-particle defoci and the aberrations were estimated. The particles were then subjected to Bayesian polishing (Zivanov et al., 2019), followed by another iteration of CTF refinement. In order to disentangle the effects of improved Bayesian polishing from the aberration correction, we also performed a refinement with the same polished particles, but assuming all aberrations to be zero. We measured the Fourier shell correlation (FSC) between the two independent half sets and against maps calculated from the known atomic models (PDB entries 1zah and 6bdf, respectively; St-Jean et al., 2005; Campbell et al., 2015). The plots are shown in Fig. 1 and the resolutions measured by the half-set method, using a threshold of 0.143, in Table 1. Plots of the aberration estimates are shown in Fig. 2.
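For readers unfamiliar with the half-set measure, the following is a minimal NumPy sketch of an FSC calculation between two cubic maps and of reading off a resolution at the 0.143 threshold. It assumes unmasked maps, integer-rounded shells and no interpolation between shells; the function names and the synthetic test maps are hypothetical, and the resolutions in Table 1 were not produced with this code.

```python
import numpy as np

def fourier_shell_correlation(half1, half2):
    """FSC between two cubic maps of equal size; one value per shell."""
    n = half1.shape[0]
    f1 = np.fft.fftn(half1)
    f2 = np.fft.fftn(half2)
    # Shell index = rounded distance from the Fourier origin, in voxels.
    freq = np.fft.fftfreq(n) * n
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)).astype(int)
    n_shells = n // 2 + 1
    keep = shell < n_shells          # discard the corners beyond Nyquist
    idx = shell[keep]
    cross = np.bincount(idx, weights=(f1 * np.conj(f2)).real[keep], minlength=n_shells)
    power1 = np.bincount(idx, weights=(np.abs(f1) ** 2)[keep], minlength=n_shells)
    power2 = np.bincount(idx, weights=(np.abs(f2) ** 2)[keep], minlength=n_shells)
    return cross / np.sqrt(power1 * power2)

def resolution_at_threshold(fsc, box_size, pixel_size, threshold=0.143):
    """Resolution (Angstrom) of the last shell before the FSC first drops below threshold."""
    below = np.nonzero(fsc[1:] < threshold)[0]   # shell 0 (the mean) is ignored
    last = len(fsc) - 1 if below.size == 0 else below[0]
    if last == 0:
        return float("inf")
    return box_size * pixel_size / last

# Synthetic example: a common signal plus independent noise in each half map.
rng = np.random.default_rng(0)
signal = rng.standard_normal((32, 32, 32))
half1 = signal + rng.standard_normal(signal.shape)
half2 = signal + rng.standard_normal(signal.shape)
fsc = fourier_shell_correlation(half1, half2)
print(resolution_at_threshold(fsc, box_size=32, pixel_size=1.0))
```

The FSC against a map calculated from an atomic model can be computed with the same first function by passing the model map as the second argument.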

Table 1
Half-set resolutions (Å) obtained at different stages of our processing pipeline in the aberration experiment on aldolase and 20S proteasome at 200 keV

Stage	Aldolase	Proteasome
Initial	2.7	3.2
First CTF refinement	2.4	2.5
Bayesian polishing	2.3	2.3
Second CTF refinement	2.1	2.3
No aberrations	2.5	3.1

Figure 1
Left: FSC plots from the aberration experiments on aldolase and 20S proteasome at 200 keV. The top plot shows the half-set FSC and the bottom plot shows the FSC against maps calculated from the respective atomic models (PDB entries 1zah and 6bdf; see text for details). Note that estimating the aberrations during the initial CTF refinement already produces a significant increase in resolution (red line). It also allows more effective Bayesian polishing and defocus refinement, increasing the resolution further (solid black line). Neglecting the aberrations while keeping the remainder of the parameters the same (dashed black line) allows us to isolate the effects of aberration correction. For the proteasome, it also exposes a slight (false) positive peak in the half-set FSC around 2.7 Å which corresponds to a negative peak in the reference FSC. This indicates that the phases of the complex amplitudes of the 3D map are, on average, flipped at this frequency band owing to the strong aberrations. Right: small regions of the resulting maps illustrating the effect of considering the aberrations. The maps correspond to the solid black lines (aberrations considered) and the dashed black lines (aberrations not considered) in the FSC plots. The aldolase maps were sharpened by a B factor of −50 Å² and contoured at 3.7σ. The proteasome maps were sharpened by a B factor of −55 Å² and contoured at 3.5σ. All maps were rendered by PyMOL v.1.8.4.1.

Figure 2
Antisymmetrical and symmetrical aberration experiments on aldolase and the 20S proteasome at 200 keV. The upper image of each pair shows the independent phase-angle estimates for each pixel, while the lower image shows the parametric fit using Zernike polynomials. These types of aberrations are referred to as trefoil or threefold astigmatism (left) and fourfold astigmatism (right). The proteasome trefoil exceeds 180° at the very high frequencies, so the sign in the per-pixel plot wraps around. This has no impact on the parametric fit. The dashed circles indicate resolutions of 1.94 and 1.98 Å, respectively.

Fig. 2 indicates that both data sets exhibit antisymmetrical as well as symmetrical aberrations. For both data sets, the shapes of both types of aberrations are well visible in the per-pixel plots, and the parametric Zernike fits capture these shapes well. The antisymmetrical aberrations correspond to a trefoil (or threefold astigmatism) combined with a slight axial coma and they are more pronounced than the symmetrical aberrations. The trefoil is visible as three alternating areas of positive and negative phase difference, with approximate threefold symmetry, in the images for the antisymmetrical aberration estimation (on the left in Fig. 2). The axial coma breaks the threefold symmetry by making one side of the image more positive and the opposite side more negative. The apparent fourfold symmetry in the images for the symmetrical aberrations (on the right in Fig. 2) corresponds to fourfold astigmatism and is strongest for the proteasome data set. The proteasome also shows the stronger antisymmetrical aberrations, which even exceed 180° at the higher frequencies. Note that because the per-pixel plots show the phase angle of $[\hat{\bf t}_{\bf k}]$ from (20), they wrap around once they reach 180°. This has no effect on the estimation of the parameters, however, since $[\hat{\bf t}_{\bf k}]$ itself, which is a 2D point on a circle, is used in the optimization and not its phase angle.
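The wrap-around argument can be illustrated with a toy calculation. The sketch below does not reproduce the estimator in (20); it only shows, for hypothetical numbers, that a phase slightly beyond 180° is displayed as a large negative angle, while the corresponding point on the unit circle remains close to a model prediction.

```python
import cmath
import math

def unit_phasor(angle_deg):
    """Point on the unit circle for a phase angle given in degrees."""
    return cmath.exp(1j * math.radians(angle_deg))

def wrapped_angle_deg(z):
    """Phase angle of z in degrees, wrapped to the interval (-180, 180]."""
    return math.degrees(cmath.phase(z))

observed = unit_phasor(190.0)    # hypothetical per-pixel value beyond 180 degrees
predicted = unit_phasor(185.0)   # hypothetical value predicted by a parametric fit

# Comparing angles directly suggests a huge discrepancy because of the wrap ...
naive_angle_error = abs(wrapped_angle_deg(observed) - 185.0)
# ... whereas the distance between the two points on the circle is small.
circle_error = abs(observed - predicted)

print(wrapped_angle_deg(observed), naive_angle_error, circle_error)
```

A fit that minimises the distance between points on the circle is therefore unaffected by where the displayed angle happens to wrap.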

The FSC plots (Fig. 1) indicate that aberration correction leads to higher resolution, as measured by both the FSC between independently refined half-maps and the FSC against maps calculated from the atomic models. Comparing the result of the second CTF refinement and its equivalent run without aberration correction (the lower two lines in Table 1; Fig. 3), the resolution increased from 2.5 to 2.1 Å for the aldolase data set and from 3.1 to 2.3 Å for the proteasome. In addition, aberration correction also allows more effective Bayesian polishing and defocus estimation, which is the reason for performing the CTF refinement twice.

Figure 3
Effects of the symmetrical aberrations on the CTF of the 20S proteasome as part of the aberration experiment at 200 keV. The image on the left shows a CTF expressed by the traditional model, while that on the right shows the fit of our new model which considers higher-order symmetrical aberrations. Note that the slightly square-like shape that arises from fourfold astigmatism cannot be expressed by the traditional model. The aberrations correspond to the bottom right image in Fig. 2.

3.2. Phase-plate experiment

We also analysed a second data set on a T. acidophilum 20S proteasome (EMPIAR-10078). This data set was collected using a Volta phase plate (VPP; Danev et al., 2017) under defocus. We picked 138 080 particles and processed them analogously to the previous experiment, except that the CTF refinement now included the estimation of anisotropic magnification. The estimated aberrations are shown in Fig. 4 and the FSCs in Fig. 6.

Figure 4
Antisymmetrical (left) and symmetrical (right) aberrations measured on the phase-plate data set. The upper image shows independent per-pixel estimates and the lower image shows the parametric fits. Note the four afterimages of previously used phase-plate spots in the upper right image. They cannot be represented by our model. The dashed circle indicates a resolution of 2.12 Å.

The purpose of a VPP is to shift the phase of the unscattered beam in order to increase the contrast against the scattered beam. This is accomplished by placing a heated film of amorphous carbon (the VPP) at the back-focal plane of the microscope and letting the electron beam pass through it after it has been scattered by the specimen. The central, unscattered beam, which exhibits much greater intensity than the scattered components, then spontaneously creates a spot of negative electric potential on the VPP (Danev et al., 2014). It is this spot which then causes the phase shift in the unscattered beam. After being used for a certain amount of time, the spot charges up even more and develops imperfections. At this point, the user will typically switch to a different position on the carbon film. The charge at the previous position will decay, although some charge may remain for an extended period. If the VPP is shifted by an insufficient distance, the old spot will reside in a position traversed by scattered rays corresponding to some higher image frequency. We hypothesize that we can observe these spots in our symmetrical aberration plots.

The symmetrical plots show a positive phase shift at the center of frequency space (Fig. 4). We hypothesize that this spot is caused by the size of the charge built up at the currently used position on the phase plate (Danev & Baumeister, 2016). Moreover, this plot shows four additional spots at higher spatial frequencies. We hypothesize that these may arise from residual charges on previously used phase-plate positions. These charges would then interfere with the diffracted rays at higher spatial frequency from the current position, resulting in the observed spots in the aberration image. The absence of the vertical neighbor spots from the antisymmetrical plot suggests that the spots were scanned in a vertically alternating but horizontally unidirectional sense. This is illustrated in Fig. 5.

Figure 5
Our interpretation of the aberration plots in Fig. 4. The presence of all four neighbouring spots in the symmetrical plot, together with the absence of the vertical neighbours from the antisymmetrical plot, indicates that the VPP spots were scanned in a vertically alternating and horizontally unidirectional sense, as shown in the first image. This partitions a majority of the spots into two classes, a and b, in which the direct vertical neighbour is located on opposite sides. The total phase shift induced by the neighboring spots is decomposed into an antisymmetrical and a symmetrical part. Both of them are averaged over particles from both classes during estimation, so the vertical neighbor partially cancels out in the antisymmetrical plot but not in the symmetrical plot.

Because these types of aberrations do not satisfy our smoothness assumptions, they cannot be modelled well using a small number of Zernike basis polynomials. Although increasing the number of Zernike polynomials would in principle allow the expression of any arbitrary aberration function, it also decreases the ability of the system to extrapolate the aberration into the unseen high-frequency regions. As a consequence, our aberration model cannot be used to neutralise the effects of the phase-plate positions, which is confirmed by the FSC plots in Fig. 6. In practice, this problem can be avoided experimentally by spacing the phase-plate positions further apart and thus arbitrarily increasing the affected frequencies.

Figure 6
Half-set (top) and map versus atomic model (bottom) FSC plots for the phase-plate data set. The atomic model used was again PDB entry 6bdf. Note that considering the aberrations does not improve the resolution, since these types of aberrations cannot be expressed by our model. Nevertheless, the CTF refinement does improve the resolution owing to the new micrograph global defocus and phase-shift estimation and owing to considering the slightly anisotropic magnification.

The estimated magnification anisotropy for this data set is relatively weak. The final magnification matrix M we recovered was

$[M = \left (\matrix {1.006 & 0.005 \cr 0.006 & 0.998}\right), ]$

which corresponds to 1.35% anisotropy along two perpendicular axes rotated by 66°.

3.3. High-resolution experiment

We applied our methods to a mouse heavy-chain apoferritin data set (EMPIAR-10216) collected on a 300 keV Titan Krios fitted with a Falcon 3 camera. At the time of its publication, the particle could be reconstructed to a resolution of 1.62 Å using RELION-3.0 (Danev et al., 2019). This data set thus offers us a means to examine the effects of higher-order aberrations and anisotropic magnification at higher resolutions.

We compared the following three reconstructions. Firstly, the original, publicly available map. Since it had been estimated using RELION-3.0, the effects of beam tilt could be corrected for, but none of the other high-order aberrations or anisotropic magnification. Secondly, the aberrations alone: for this, we proceeded from the previous refinement and first estimated the higher order aberrations and then, simultaneously, per-particle defoci and per-micrograph astigmatism. Thirdly, we performed the same procedure but only after first estimating the anisotropic magnification. For the third case, the entire procedure was repeated after a round of refinement. For all three cases, we calculated the FSC between the independently refined half-maps and the FSC against an atomic model, PDB entry 6s61, that was built into an independently reconstructed cryo-EM map of mouse apoferritin at a resolution of 1.84 Å. In the absence of a higher-resolution atomic model, comparison with PDB entry 6s61 relies on the assumption that the geometrical restraints applied during atomic modelling resulted in predictive power at resolutions beyond 1.84 Å. We used the same mask as in the original publication for correction of the solvent-flattening effects on the FSC between the independent half-maps, and we used the same set of 147 637 particles throughout.

The aberration plots in Fig. 7 show that this data set exhibits a trefoil aberration and faint fourfold astigmatism. In the magnification plot in Fig. 8, we can see a clear linear relationship between the displacement of each pixel k and its coordinates. This indicates that the measured displacements stem from a linearly distorted image and that the implied distortion is a horizontal dilation and a vertical compression. This is consistent with anisotropic magnification, since the average magnification has to be close to 1 because the reference map itself has been obtained from the same images under random in-plane angles. The smoothness of the per-pixel plot suggests that the large number of particles allows us to measure the small amount of anisotropy reliably. The magnification matrix we estimated was

$[M = \left (\matrix {1.003 & 0.001 \cr 0.001 & 0.998} \right ), ]$

which corresponds to 0.54% anisotropy. As can be seen in the FSC curves in Fig. 9, considering either of these effects is beneficial, while considering both yields a resolution of 1.57 Å, an improvement of three shells over the reconstruction obtained using RELION-3.0.

Figure 7
Higher-order aberrations measured on the high-resolution mouse apoferritin data set. The antisymmetrical plot (left) shows a significant trefoil aberration, while the symmetrical plot (right) shows a faint fourfold astigmatism. Although the aberrations are comparatively weak, they are clearly measurable and considering them does lead to a small improvement in resolution (see Fig. 9). The dashed circle indicates a resolution of 1.04 Å.

Figure 8
Anisotropic magnification plots for the high-resolution mouse apoferritin data set. The top row shows the estimated displacement for each pixel ($[\hat{\delta_{\bf k}}]$ in equation 31), while the bottom row shows the displacement corresponding to the estimated magnification matrix M (i.e. Mk − k). Note that the per-pixel estimates follow a linear relationship, indicating that the displacements are indeed caused by a linear transformation of the image. The horizontal coordinate is defined as increasing to the right and the vertical coordinate as increasing downwards, so the two plots indicate a horizontal dilation and a vertical compression. The dashed circle indicates a resolution of 1.04 Å.

Figure 9
Half-set (top) and map versus atomic model (bottom) FSC plots for the high-resolution mouse apoferritin data set. Considering the anisotropic magnification (black line) produces a further improvement in terms of resolution beyond what is attainable by considering the aberrations alone (blue line). The atomic model used was PDB entry 6s61, another publicly available cryo-EM structure. The resolution indicated by the bottom plot is limited by the fact that the resolution of the atomic model is only 1.84 Å.

3.4. Simulated anisotropic magnification experiment

To measure the performance of our anisotropic magnification estimation procedure in the presence of a larger amount of anisotropy, we also performed an experiment on synthetic data. For this experiment, we used a small subset (9487 particles from 29 movies) taken from a human apoferritin data set (EMPIAR-10200), which we had processed before (Zivanov et al., 2018). We distorted the micrographs by applying a known anisotropic magnification using MotionCor2 (Zheng et al., 2017). The relative scales applied to the images were 0.95 and 1.05, respectively, along two perpendicular axes rotated at a 20° angle. In this process, about 4% of the particles were mapped outside the images, so the number of distorted particles is slightly smaller at 9093.

We then performed four rounds of refinement on particle images extracted from the distorted micrographs in order to recover the anisotropic magnification. Each round consisted of a CTF refinement followed by an autorefinement. The CTF refinement itself was performed twice each time: once to estimate the anisotropy and then again to determine the per-particle defoci and per-micrograph astigmatism. The FSC curves for the different rounds can be seen in Fig. 10. We observe that the FSC already approaches that of the undistorted particles after the second round. In the first round, the initial 3D reference map is not precise enough to allow a reliable recovery of anisotropy.

Figure 10
Half-set (top) and map versus atomic model (bottom) FSC plots for the simulated anisotropic magnification experiment on human apoferritin. The atomic model used was PDB entry 5n27 (Ferraro et al., 2017). From the second iteration onward, the curves lie close to their final positions. Note that the resolution of the undistorted reconstruction cannot be reached by the distorted reconstructions, since particles have been lost along the way and the image pixels have been degraded by resampling.

The magnification matrix M recovered in the final round is

$[M = \left (\matrix {1.060 & -0.032 \cr -0.032 & 0.984} \right ). ]$

It corresponds to the relative scales of 0.951 and 1.049, respectively, along two perpendicular axes rotated by 19.939°, although it also contains an additional uniform scaling by a factor of 1.022. The uniform scaling factor has no influence on the refinement, but it does change the pixel size of the resulting map. We therefore note that caution must be taken to either enforce the product of the two relative scales to be 1, or to otherwise calibrate the pixel size of the map against an external reference.
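The decomposition quoted above can be reproduced approximately from the rounded matrix entries. The sketch below is one possible way of doing this, under several assumptions of ours: the uniform scale is taken as the mean of the two principal scales, only the symmetric part of M is used (any skew is ignored), and the angle is reported for the axis of the larger scale with a sign that depends on the coordinate convention. The function names are hypothetical.

```python
import math

def compose_magnification(scale_a, scale_b, angle_deg):
    """Symmetric matrix scaling by scale_a along the axis at angle_deg
    and by scale_b along the perpendicular axis."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    m11 = scale_a * c * c + scale_b * s * s
    m12 = (scale_a - scale_b) * c * s
    m22 = scale_a * s * s + scale_b * c * c
    return [[m11, m12], [m12, m22]]

def decompose_magnification(M):
    """Split a 2x2 magnification matrix into uniform scale, relative scales,
    anisotropy (difference of the principal scales) and axis angle."""
    a, d = M[0][0], M[1][1]
    b = 0.5 * (M[0][1] + M[1][0])            # symmetric part only
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    major, minor = mean + radius, mean - radius
    angle = 0.5 * math.degrees(math.atan2(2.0 * b, a - d))  # axis of the larger scale
    return {
        "uniform": mean,
        "relative": (minor / mean, major / mean),
        "anisotropy": major - minor,
        "angle_deg": angle,
    }

# Matrix recovered in the final round of the simulated experiment.
result = decompose_magnification([[1.060, -0.032], [-0.032, 0.984]])
print(result)
```

With the three-decimal entries given in the text, the relative scales round to 0.951 and 1.049 and the uniform scale is 1.022; the magnitude of the angle comes out near 20°, with the small difference from 19.939° attributable to the rounding of the matrix entries.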

This experiment shows that the anisotropy of the magnification can be estimated to three significant digits, even from a relatively small number of particles. Since the estimate arises from adding up contributions from all particles, the precision increases with their number.

4. Discussion

Although we previously described a method to estimate and correct for beam-tilt-induced axial coma (Zivanov et al., 2018), no methods to detect and correct for higher-order optical aberrations have been available until now. It is therefore not yet clear how often these aberrations are a limiting factor in cryo-EM structure determination of biological macromolecules. The observation that we have already encountered several examples of strong threefold and fourfold astigmatism on two different types of microscopes suggests that these aberrations may be relatively common.

Our results with the aldolase and 20S proteasome data sets illustrate that when antisymmetrical and/or symmetrical aberrations are present in the data, our methods lead to an important increase in the achievable resolution. Both aldolase and the 20S proteasome could be considered as ‘easy’ targets for cryo-EM structure determination: they have both been used to test the performance of cryo-EM hardware and software (see, for example, Li et al., 2013; Danev & Baumeister, 2016; Herzik et al., 2017; Kim et al., 2018). However, our methods are not limited to standard test samples, and have already been used to obtain biological insights from much more challenging data. Images of brain-derived tau filaments from an ex-professional American football player with chronic traumatic encephalopathy that we recorded on a 300 keV Titan Krios microscope showed severe threefold and fourfold astigmatism. Correction for these aberrations led to an increase in resolution from 2.7 to 2.3 Å, which allowed the visualisation of alternative side-chain conformations and of ordered water molecules inside the amyloid filaments (Falcon et al., 2019).

Titan Krios microscopes come equipped with lenses that can be tuned to correct for threefold astigmatism, although this operation is typically only performed by engineers. The Titan Krios microscope that was used to image the tau filaments from the American football player is part of the UK national cryo-EM facility at Diamond (Clare et al., 2017). After measuring the severity of the aberrations, its lenses were re-adjusted, and no higher-order aberrations have been detected on it since (Peijun Zhang, personal communication). Talos Arctica microscopes do not have lenses to correct for trefoil, and the microscope that was used to collect the aldolase and the 20S proteasome data sets at the Scripps Research Institute continues to yield data sets with fluctuating amounts of aberrations (Gabriel Lander, personal communication). Until the source of these aberrations is determined or better understood, the corrections proposed here will be important for processing of data acquired on these microscopes.

The extent to which higher-order aberrations are limiting will depend on the amount of threefold and fourfold astigmatism, as well as on the target resolution of the reconstruction. We have only observed noticeable increases in resolution for data sets that yielded reconstructions with resolutions beyond 3.0–3.5 Å before the aberration correction. However, the effects of the aberrations are more pronounced for lower-energy electrons. Therefore, our methods may become particularly relevant for data from 100 keV microscopes, the development of which is envisioned to yield better images for thin specimens and to bring down the elevated costs of modern cryo-EM structure determination (Peet et al., 2019; Naydenova et al., 2019).

The effects of anisotropic magnification on cryo-EM structure determination of biological samples have been described previously, and methods to correct for it have been proposed (Grant & Grigorieff, 2015; Yu et al., 2016). Our method bears some resemblance to the exhaustive search algorithm implemented in JSPR (Guo & Jiang, 2014; Yu et al., 2016), in that it compares reference projections with high signal-to-noise ratios and the particle images of an entire data set. However, our method avoids the computationally expensive two-dimensional grid search over the direction and magnitude of the anisotropy in JSPR. In addition, our method is, in principle, capable of detecting and modeling skew components in the magnification.

In addition to modeling anisotropic magnification, our method can also be used for the combination of different data sets with unknown relative magnifications. In cryo-EM imaging, the magnification is often not exactly known. Again, it is possible to accurately measure the magnification using crystalline test specimens with known diffraction geometry, but in practice errors of up to a few percent in the nominal pixel size are often observed. When processing data from a single data set, such errors can be absorbed, to some extent, in the defocus values. This produces a CTF of very similar appearance but at a slightly different scale. Therefore, a small error in pixel size only becomes a problem at the atomic modeling stage, where it leads to overall contracted or expanded models with bad stereochemistry. (Please note that this is no longer true at high spatial frequencies owing to the absolute value of the C_s; e.g. beyond 2.5 Å for non-C_s-corrected 300 kV microscopes.) When data sets from different sessions are combined, however, errors in their relative magnification will affect the 3D reconstruction at much lower resolutions. Our method can directly be used to correct for such errors. In addition, to provide further convenience, our new implementation allows the combination of particle images with different pixel and box sizes into a single refinement. The performance of our methods under these conditions remains to be illustrated. Often, when two or more different data sets are combined, a single data set outperforms the other data sets at the resolution limit of the reconstruction and combination of the data sets does not improve the map.
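The remark about C_s can be made concrete with a rough estimate. The sketch below assumes the commonly used phase function with a defocus term proportional to |k|² and a spherical-aberration term (π/2)C_sλ³|k|⁴ (sign conventions differ between programs), a C_s of 2.7 mm and a pixel-size error of 1%; these values and the function names are our assumptions for illustration. A relative error e rescales |k| by (1 + e): the defocus term can be matched exactly by rescaling the defocus, but a fraction (1 + e)⁴ − 1 of the C_s term is left over.

```python
import math

def electron_wavelength_angstrom(voltage_volts):
    """Relativistic electron wavelength (Angstrom) for a voltage in volts."""
    return 12.2643 / math.sqrt(voltage_volts + 0.97845e-6 * voltage_volts ** 2)

def residual_cs_phase(resolution_angstrom, pixel_error, cs_mm=2.7, voltage_volts=300e3):
    """Phase error (radians) remaining in the C_s term once the defocus
    has absorbed the |k|^2 part of a relative pixel-size error."""
    lam = electron_wavelength_angstrom(voltage_volts)
    cs = cs_mm * 1e7                       # mm -> Angstrom
    k = 1.0 / resolution_angstrom
    cs_phase = 0.5 * math.pi * cs * lam ** 3 * k ** 4
    return cs_phase * ((1.0 + pixel_error) ** 4 - 1.0)

for resolution in (5.0, 3.5, 2.5, 2.0):
    print(resolution, round(residual_cs_phase(resolution, pixel_error=0.01), 3))
```

Under these assumptions the leftover phase is a few hundredths of a radian at 5 Å but a few tenths of a radian at 2.5 Å. A renewed defocus fit could absorb part of this remainder as well, so the numbers should be read as an order-of-magnitude illustration only.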

Our results illustrate that antisymmetrical and symmetrical aberrations, as well as anisotropic magnification, can be accurately estimated and modelled a posteriori from a set of noisy projection images of biological macromolecules. No additional test samples or experiments at the microscope are necessary; all that is needed is a 3D reconstruction of sufficient resolution that the optical effects become noticeable. Our methods could therefore in principle be used in a ‘shoot first, ask questions later’ type of approach, in which the speed of image acquisition is prioritized over exhaustively optimizing the microscope settings. In this context, we caution that while the boundaries of applicability of our methods remain to be explored, it may be better to reserve their use for unexpected effects in data from otherwise carefully conducted experiments.

APPENDIX A

In the following, we show that our formulation of the astigmatic-defocus term as a quadratic form is equivalent to the traditional form as defined in RELION, which in turn was based on the model in CTFFIND (Mindell & Grigorieff, 2003). Let the two defoci be given by Z₁ and Z₂, the azimuthal angle of astigmatism by φ_A and the wavelength of the electron by λ. We then wish to show that

$[\eqalignno {{\bf k}^\top D {\bf k} &= \pi \lambda [Z_\mu + Z_d \cos(2 \delta\varphi_{\bf k})] |{\bf k}|^2, & (41) \cr Z_\mu & = -{{1} \over {2}}(Z_1 + Z_2), & (42)\cr Z_d & = -{{1} \over {2}}(Z_1 - Z_2), & (43)\cr \delta\varphi_{{\bf k}} & = \tan^{-1}\left({{k^{(2)}} \over {k^{(1)}}}\right) - \varphi_A & (44)} ]$

for the astigmatic-defocus matrix D defined as

$[\eqalignno {D &= \pi \lambda Q^\top \Delta Q, & (45) \cr Q &= \left [ \matrix {\cos(\varphi_A) & \sin(\varphi_A) \cr -\sin(\varphi_A) & \cos(\varphi_A)}\right], & (46) \cr \Delta &= \left ( \matrix {-Z_1 & 0 \cr 0 & -Z_2}\right ). & (47)}]$

The multiplication by Q rotates k into the coordinate system of the astigmatism,

$[Q{\bf k} = \left [ \matrix {\cos(\delta\varphi_{\bf k}) \cr \sin(\delta\varphi_{\bf k})} \right ]|{\bf k}|. \eqno (48)]$

Multiplying out the quadratic form and applying the definitions of Z_μ and Z_d yields

$[\eqalignno {{\bf k}^\top D {\bf k} & = \pi \lambda (Q{\bf k})^\top \Delta (Q{\bf k}) & (49) \cr &= -\pi \lambda [Z_1\cos^2(\delta\varphi_{\bf k}) + Z_2\sin^2(\delta\varphi_{\bf k})]|{\bf k}|^2 & (50) \cr &= \pi \lambda [Z_\mu + Z_d\cos^2(\delta\varphi_{\bf k}) - Z_d\sin^2(\delta\varphi_{\bf k})] |{\bf k}|^2. & (51)}]$

By substituting cos(2δφ_k) for cos²(δφ_k) − sin²(δφ_k) we see that this is equivalent to the original formulation.

In order to convert a given D into the traditional formulation, we perform an eigenvalue decomposition of −D/(πλ). The two eigenvalues are then equal to Z₁ and Z₂, respectively, while the azimuthal angle of the eigenvector corresponding to Z₁ is equal to φ_A.
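The equivalence and the conversion back to the traditional parameters can be checked numerically. The sketch below builds D from (45)–(47), compares the quadratic form with the right-hand side of (41) and recovers Z₁, Z₂ and φ_A using the closed-form eigen-decomposition of a symmetric 2 × 2 matrix. The function names and numerical values are hypothetical, and we assume here that Z₁ is the larger of the two defoci so that the ordering of the eigenvalues is unambiguous.

```python
import math

def defocus_matrix(z1, z2, phi_a, wavelength):
    """D = pi * lambda * Q^T * Delta * Q with Delta = diag(-Z1, -Z2)."""
    c, s = math.cos(phi_a), math.sin(phi_a)
    pl = math.pi * wavelength
    d11 = -pl * (z1 * c * c + z2 * s * s)
    d12 = -pl * (z1 - z2) * c * s
    d22 = -pl * (z1 * s * s + z2 * c * c)
    return [[d11, d12], [d12, d22]]

def quadratic_form(D, k):
    """Left-hand side of (41): k^T D k."""
    return D[0][0] * k[0] ** 2 + 2.0 * D[0][1] * k[0] * k[1] + D[1][1] * k[1] ** 2

def traditional_form(k, z1, z2, phi_a, wavelength):
    """Right-hand side of (41): pi*lambda*[Z_mu + Z_d*cos(2*dphi)]*|k|^2."""
    z_mu = -0.5 * (z1 + z2)
    z_d = -0.5 * (z1 - z2)
    dphi = math.atan2(k[1], k[0]) - phi_a
    return math.pi * wavelength * (z_mu + z_d * math.cos(2.0 * dphi)) * (k[0] ** 2 + k[1] ** 2)

def to_traditional(D, wavelength):
    """Recover (Z1, Z2, phi_A) from the eigen-decomposition of -D/(pi*lambda)."""
    pl = math.pi * wavelength
    a, b, d = -D[0][0] / pl, -D[0][1] / pl, -D[1][1] / pl
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    z1, z2 = mean + radius, mean - radius        # assumption: Z1 >= Z2
    phi_a = 0.5 * math.atan2(2.0 * b, a - d)     # azimuth of the eigenvector of Z1
    return z1, z2, phi_a

# Hypothetical values: defoci in Angstrom, angle in radians, wavelength in Angstrom.
D = defocus_matrix(20000.0, 18000.0, 0.6, 0.0197)
k = (0.1, 0.2)
print(quadratic_form(D, k), traditional_form(k, 20000.0, 18000.0, 0.6, 0.0197))
print(to_traditional(D, 0.0197))
```

The two printed phase values agree to rounding error, and the last line returns the input parameters, with φ_A determined up to a multiple of 180°.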

APPENDIX B

Computing T and l explicitly through (38) would require iterating over all particles p in the data set. Since we have already accumulated the terms in S_k and e_k over all p, we can avoid this by instead performing the following summation over all pixels k,

$[\eqalignno {T &= \textstyle \sum\limits_{\bf k} f_{\bf k} \widetilde{S}_{\bf k} \otimes (\widetilde{\bf k} \widetilde{\bf k}^\top), & (52) \cr {\bf l} &= \textstyle\sum \limits_{\bf k} f_{\bf k} \widetilde{\bf e}_{\bf k} \otimes \widetilde{\bf k}, & (53)}]$

where ⊗ indicates element-wise multiplication, and the real symmetrical 4 × 4 matrix $[\widetilde{S}_{\bf k}]$ and the column vectors $[\widetilde{\bf k}]$ and $[\widetilde{\bf e}_{\bf k} \in {\bb R}^4]$ are given by the reshaping of S_k, k and e_k,

$[\eqalignno {\widetilde{S}_{\bf k} &= \left [ \matrix { S_{\bf k}^{(1,1)} & S_{\bf k}^{(1,1)} & S_{\bf k}^{(1,2)} & S_{\bf k}^{(1,2)} \cr S_{\bf k}^{(1,1)} & S_{\bf k}^{(1,1)} & S_{\bf k}^{(1,2)} & S_{\bf k}^{(1,2)} \cr S_{\bf k}^{(2,1)} & S_{\bf k}^{(2,1)} & S_{\bf k}^{(2,2)} & S_{\bf k}^{(2,2)} \cr S_{\bf k}^{(2,1)} & S_{\bf k}^{(2,1)} & S_{\bf k}^{(2,2)} & S_{\bf k}^{(2,2)}}\right ], & (54) \cr \widetilde{\bf k} &= \left [ \matrix {k^{(1)}\cr k^{(2)}\cr k^{(1)}\cr k^{(2)}}\right ], \quad \widetilde{\bf e}_{\bf k} = \left [ \matrix {e_{\bf k}^{(1)}\cr e_{\bf k}^{(1)}\cr e_{\bf k}^{(2)}\cr e_{\bf k}^{(2)}} \right].& (55)}]$
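The reshaping in (54) and (55) can be checked with a small pure-Python sketch. It accumulates T and l by element-wise multiplication as in (52) and (53) and then confirms, for one arbitrary pixel, that mᵀTm and lᵀm reproduce the quadratic form f(Mk)ᵀS(Mk) and the linear form f eᵀ(Mk) when the entries of M are stacked as m = (M₁₁, M₁₂, M₂₁, M₂₂). That ordering is our reading of (54) and (55), and all names and numbers below are hypothetical.

```python
def reshape_S(S):
    """4x4 matrix of (54): each entry of the 2x2 matrix S fills a 2x2 block."""
    return [[S[i // 2][j // 2] for j in range(4)] for i in range(4)]

def reshape_k(k):
    """Column vector of (55): (k1, k2, k1, k2)."""
    return [k[0], k[1], k[0], k[1]]

def reshape_e(e):
    """Column vector of (55): (e1, e1, e2, e2)."""
    return [e[0], e[0], e[1], e[1]]

def accumulate(pixels):
    """Evaluate the sums (52) and (53) over pixels given as (f, S, e, k)."""
    T = [[0.0] * 4 for _ in range(4)]
    l = [0.0] * 4
    for f, S, e, k in pixels:
        St, kt, et = reshape_S(S), reshape_k(k), reshape_e(e)
        for a in range(4):
            l[a] += f * et[a] * kt[a]                    # element-wise product
            for b in range(4):
                T[a][b] += f * St[a][b] * kt[a] * kt[b]  # S~ times (k~ k~^T), element-wise
    return T, l

# One arbitrary pixel: weight f, symmetric 2x2 matrix S, vector e, frequency k.
pixel = (2.0, [[1.0, 2.0], [2.0, 5.0]], [3.0, 4.0], [0.5, -1.0])
T, l = accumulate([pixel])

# Compare with the direct evaluation for an arbitrary matrix M.
M = [[1.0, 2.0], [3.0, 4.0]]
m = [M[0][0], M[0][1], M[1][0], M[1][1]]
f, S, e, k = pixel
Mk = [M[0][0] * k[0] + M[0][1] * k[1], M[1][0] * k[0] + M[1][1] * k[1]]
direct_quadratic = f * sum(Mk[i] * S[i][j] * Mk[j] for i in range(2) for j in range(2))
direct_linear = f * (e[0] * Mk[0] + e[1] * Mk[1])
reshaped_quadratic = sum(m[a] * T[a][b] * m[b] for a in range(4) for b in range(4))
reshaped_linear = sum(l[a] * m[a] for a in range(4))
print(direct_quadratic, reshaped_quadratic, direct_linear, reshaped_linear)
```

Because both forms are sums over pixels, the same agreement holds when many pixels are accumulated, which is what allows the per-particle loop to be avoided.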

Acknowledgements

We thank Rado Danev for providing polished particles for the data set in EMPIAR-10216, and Jake Grimmett and Toby Darling for assistance with high-performance computing.

Funding information

This work was funded by the UK Medical Research Council (MC_UP_A025_1013 to SHWS), the Japan Society for the Promotion of Science (Overseas Research Fellowship to TN) and the Swiss National Science Foundation (SNF; P2BSP2_168735 to JZ).

References

Batson, P. E., Dellby, N. & Krivanek, O. L. (2002). Nature, 418, 617–620.
Campbell, M. G., Veesler, D., Cheng, A., Potter, C. S. & Carragher, B. (2015). eLife, 4, e06380.
Clare, D. K., Siebert, C. A., Hecksel, C., Hagen, C., Mordhorst, V., Grange, M., Ashton, A. W., Walsh, M. A., Grünewald, K., Saibil, H. R., Stuart, D. I. & Zhang, P. (2017). Acta Cryst. D73, 488–495.
Danev, R. & Baumeister, W. (2016). eLife, 5, e13046.
Danev, R., Buijsse, B., Khoshouei, M., Plitzko, J. M. & Baumeister, W. (2014). Proc. Natl Acad. Sci. USA, 111, 15635–15640.
Danev, R., Tegunov, D. & Baumeister, W. (2017). eLife, 6, e23006.
Danev, R., Yanagisawa, H. & Kikkawa, M. (2019). Trends Biochem. Sci. 44, 837–848.
Falcon, B., Zivanov, J., Zhang, W., Murzin, A. G., Garringer, H. J., Vidal, R., Crowther, R. A., Newell, K. L., Ghetti, B., Goedert, M. & Scheres, S. H. W. (2019). Nature, 568, 420–423.
Fernandez-Leiro, R. & Scheres, S. H. W. (2016). Nature, 537, 339–346.
Ferraro, G., Ciambellotti, S., Messori, L. & Merlino, A. (2017). Inorg. Chem. 56, 9064–9070.
Grant, T. & Grigorieff, N. (2015). J. Struct. Biol. 192, 204–208.
Guo, F. & Jiang, W. (2014). Methods Mol. Biol. 1117, 401–443.
Hartley, R. I. (1994). ECCV '94: Proceedings of the Third European Conference on Computer Vision, edited by J.-O. Eklundh, pp. 471–478. Berlin: Springer-Verlag.
Hartley, R. I. & Zisserman, A. (2003). Multiple View Geometry in Computer Vision, 2nd ed. Cambridge University Press.
Hawkes, P. W. & Kasper, E. (1996). Principles of Electron Optics, Vol. 3. New York: Academic Press.
Herzik, M. A. Jr, Wu, M. & Lander, G. C. (2017). Nat. Methods, 14, 1075–1078.
Iudin, A., Korir, P. K., Salavert-Torres, J., Kleywegt, G. J. & Patwardhan, A. (2016). Nat. Methods, 13, 387–388.
Kim, L. Y., Rice, W. J., Eng, E. T., Kopylov, M., Cheng, A., Raczkowski, A. M., Jordan, K. D., Bobe, D., Potter, C. S. & Carragher, B. (2018). Front. Mol. Biosci. 5, 50.
Krivanek, O., Corbin, G., Dellby, N., Elston, B., Keyse, R., Murfitt, M., Own, C., Szilagyi, Z. & Woodruff, J. (2008). Ultramicroscopy, 108, 179–195.
Li, X., Mooney, P., Zheng, S., Booth, C. R., Braunfeld, M. B., Gubbens, S., Agard, D. A. & Cheng, Y. (2013). Nat. Methods, 10, 584–590. Web of Science CrossRef CAS PubMed Google Scholar
Liu, D. C. & Nocedal, J. (1989). Math. Program. 45, 503–528. CrossRef Web of Science Google Scholar
Meyer, R., Kirkland, A. & Saxton, W. (2002). Ultramicroscopy, 92, 89–109. Web of Science CrossRef PubMed CAS Google Scholar
Mindell, J. A. & Grigorieff, N. (2003). J. Struct. Biol. 142, 334–347. Web of Science CrossRef PubMed Google Scholar
Naydenova, K., McMullan, G., Peet, M. J., Lee, Y., Edwards, P. C., Chen, S., Leahy, E., Scotcher, S., Henderson, R. & Russo, C. J. (2019). IUCrJ, 6, 1086–1098. Web of Science CrossRef CAS PubMed IUCr Journals Google Scholar
Nelder, J. A. & Mead, R. (1965). Comput. J. 7, 308–313. CrossRef Web of Science Google Scholar
Peet, M. J., Henderson, R. & Russo, C. J. (2019). Ultramicroscopy, 203, 125–131. Web of Science CrossRef CAS PubMed Google Scholar
Saxton, W. (1995). J. Microsc. 179, 201–213. CrossRef Web of Science Google Scholar
Saxton, W. (2000). Ultramicroscopy, 81, 41–45. Web of Science CrossRef PubMed CAS Google Scholar
St-Jean, M., Lafrance-Vanasse, J., Liotard, B. & Sygusch, J. (2005). J. Biol. Chem. 280, 27262–27270. Web of Science CrossRef PubMed CAS Google Scholar
Yu, G., Li, K., Liu, Y., Chen, Z., Wang, Z., Yan, R., Klose, T., Tang, L. & Jiang, W. (2016). J. Struct. Biol. 195, 207–215. Web of Science CrossRef PubMed Google Scholar
Zernike, F. (1934). Physica, 1, 689–704. CrossRef Google Scholar
Zheng, S. Q., Palovcak, E., Armache, J.-P., Verba, K. A., Cheng, Y. & Agard, D. A. (2017). Nat. Methods, 14, 331–332. Web of Science CrossRef CAS PubMed Google Scholar
Zivanov, J., Nakane, T., Forsberg, B. O., Kimanius, D., Hagen, W. J., Lindahl, E. & Scheres, S. H. W. (2018). eLife, 7, e42166. Web of Science CrossRef PubMed Google Scholar
Zivanov, J., Nakane, T. & Scheres, S. H. W. (2019). IUCrJ, 6, 5–17. Web of Science CrossRef CAS PubMed IUCr Journals Google Scholar