new algorithms workshop
Introducing robustness to maximum-likelihood refinement of electron-microscopy data

Centro Nacional de Biotecnología – CSIC, Darwin 3, Cantoblanco, 28049 Madrid, Spain
*Correspondence e-mail: scheres@cnb.csic.es
An expectation-maximization algorithm for maximum-likelihood refinement of electron-microscopy images is presented that is based on fitting mixtures of multivariate t-distributions. The novel algorithm has intrinsic characteristics for providing robustness against atypical observations in the data, which is illustrated using an experimental test set with artificially generated outliers. Tests on experimental data revealed only minor differences in two-dimensional classifications, while three-dimensional classification with the new algorithm gave stronger elongation factor G density in the corresponding class of a structurally heterogeneous ribosome data set than the conventional algorithm for Gaussian mixtures.

Keywords: electron microscopy; maximum-likelihood refinement; expectation-maximization algorithm; robustness.
1. Introduction
Whereas maximum-likelihood approaches have become a gold standard in many areas of macromolecular X-ray crystallography (e.g. Bricogne, 1997; de La Fortelle & Bricogne, 1997; Read, 2001; Blanc et al., 2004), in single-particle three-dimensional electron microscopy (3D-EM) such statistical approaches have only recently started to be explored. An important characteristic of the maximum-likelihood approach is the natural way in which one may model the experimental noise in the data. Because the noise levels in 3D-EM data are typically extremely high, one would expect that 3D-EM problems could greatly benefit from a proper error model. However, for many years image processing in 3D-EM has been addressed using methods that do not take the noisy character of the experimental data into account in a statistical way (Frank, 2006). Provencher and Vogel performed early work on a statistical model for the noise in 3D-EM data (Provencher & Vogel, 1988; Vogel & Provencher, 1988), but it was only in 1998 that Sigworth introduced a maximum-likelihood algorithm for the alignment of a set of two-dimensional images (Sigworth, 1998). Thereafter, Doerschuk and coworkers used the same principles for the three-dimensional reconstruction of icosahedral viruses (Doerschuk & Johnson, 2000; Yin et al., 2003; Lee et al., 2007), Pascual-Montano and coworkers introduced a maximum-likelihood algorithm for self-organizing maps (Pascual-Montano et al., 2001) and Zeng and coworkers applied this approach to the two-dimensional alignment of crystal images (Zeng et al., 2007).

We were the first to address the problem of simultaneous two-dimensional image alignment and classification using maximum-likelihood principles (Scheres, Valle, Nuñez et al., 2005) and we then extended this methodology to the general case of three-dimensional reconstruction from structurally heterogeneous data (Scheres, Gao et al., 2007). The latter is of special relevance for 3D-EM single-particle analysis, in which many samples constitute large and flexible macromolecular complexes. These complexes typically adopt multiple conformations that are often directly related to their function in living organisms. In principle, provided that one can sort the projections from distinct structures using a computer, multiple three-dimensional reconstructions of the particles in their distinct functional states may be obtained from a single 3D-EM experiment. However, this sorting is strongly intertwined with the orientational assignment of the projections and at present still represents one of the major challenges in single-particle image processing (Leschziner & Nogales, 2007).

We model structurally heterogeneous data as a finite mixture and treat the unavailable information about the orientation and the structural class of each experimental projection as missing data. We then tackle the mixture problem using expectation maximization, which can be shown to converge to the maximum-likelihood estimation of the mixture parameters under relatively mild conditions (Dempster et al., 1977; McLachlan & Peel, 2000). The resulting algorithm is a multi-reference refinement procedure which is similar to conventional approaches in the field (Radermacher, 1994; Penczek et al., 1994). However, the most important difference of the maximum-likelihood approach is that the underlying statistical data model allows one to marginalize over the missing variables. That is, whereas conventional approaches assign a single orientation and class membership to each projection, the maximum-likelihood approach calculates probability-weighted assignments for all possibilities. This provides an intrinsic stabilization of the possibly unstable reconstruction problem. Together with the typical use of relatively small images (also to reduce computational costs) and an early stopping criterion in the underlying algebraic reconstruction algorithm with smooth basis functions, or blobs (Marabini et al., 1998; Scheres, Gao et al., 2007), this yields an algorithm that is stable in practice and that has been shown to be highly effective on multiple occasions (see, for example, Nickell et al., 2007; Cuellar et al., 2008; Julián et al., 2008; Rehmann et al., 2008).

Despite the importance of the underlying data model in statistical approaches, little work has been performed to explore alternative models for maximum-likelihood approaches in 3D-EM. All the approaches mentioned above share the assumption of additive white Gaussian noise in real space. A large part of the noise may result from shot noise owing to the small number of imaging electrons (10–20 per squared angstrom), which would strictly require a multiplicative noise model with a Poisson distribution. However, in practice the additive Gaussian model is a good approximation when each pixel represents many squared angstroms (Sigworth, 2004). The pixel areas for the classifications described in this paper, for example, range from 12 to 30 squared angstroms. Moreover, several additional sources of noise exist, such as structural noise arising from the surrounding ice and detector noise, and the combination of these multiple independent sources of noise has been shown to follow a Gaussian distribution (Sorzano, de la Fraga et al., 2004). The additive character of the Gaussian noise model results in a computationally attractive algorithm, but the assumption of whiteness is known to be a poor one for electron-microscopy projections. Therefore, we recently introduced an alternative data model in Fourier space that allows the modelling of nonwhite, or coloured, Gaussian noise (Scheres, Nunez-Ramirez et al., 2007). Still, in all these models the Gaussian distribution remains a common factor, while in other pattern-recognition fields a notable interest has developed in the use of alternative distributions. For many applied problems the tails of the Gaussian are shorter than required and mixtures of Gaussians may lack robustness in the presence of atypical observations. In particular, the use of multivariate t-distributions has repeatedly been proposed as a more robust alternative. The t-distribution has wider tails and its number of degrees of freedom ν essentially plays the role of rejecting atypical observations. As ν tends to infinity, the t-distribution approaches the Gaussian, so that ν may be viewed as a robustness tuning parameter. Several contributions defining frameworks of expectation-maximization algorithms for mixtures of t-distributions have appeared and mixtures of t-distributions have been successfully applied to a range of different types of data (Lange et al., 1989; McLachlan & Peel, 2000; Wang et al., 2004).

In this contribution, we explore the suitability of modelling structurally heterogeneous 3D-EM data as a mixture of multivariate t-distributions. We derive the corresponding expectation-maximization algorithm in §2. In §3 we illustrate its intrinsic property of providing robustness against outliers and compare the performance of the new algorithm with the conventional algorithm for Gaussian mixtures in two-dimensional and three-dimensional classification. We conclude this paper with a discussion of the potential usefulness of the proposed algorithm in §4.
2. Methods
2.1. The optimization problem
We model two-dimensional images X_1, X_2, …, X_N as follows:

X_i = R_{\Phi_i} V_{\kappa_i} + G_i,    (1)

where Φ_i denotes the (hidden) orientation and translation with which the ith image was formed, κ_i ∈ {1, …, K} is the (hidden) index of the underlying three-dimensional structure from which it was projected, R_{\Phi_i} is the corresponding projection operator and G_i is independent zero-mean additive noise.
The reconstruction problem at hand is to estimate V_1, V_2, …, V_K from the observed data X_1, X_2, …, X_N. We view this estimation problem as a missing data problem, where the missing data associated with the observed data elements X_i are the position Φ_i and the random index κ_i. Thus, the complete data set is

Y = \{(X_i, \Phi_i, \kappa_i)\}_{i=1}^{N}.    (2)
We solve the reconstruction problem by way of maximum-likelihood estimation, where we aim to find those parameters Θ* that maximize the logarithm of the joint probability of observing the entire set of images X_1, X_2, …, X_N:

\mathcal{L}(\Theta) = \sum_{i=1}^{N} \log P(X_i \mid \Theta).    (3)

As described previously (see, for example, the Supplementary Note in Scheres, Gao et al., 2007), we assume that particle picking has left a two-dimensional Gaussian distribution of residual translations q_x and q_y centred at the origin and with standard deviation ξ. Furthermore, we assume an even distribution of the Q_rot sampled in-plane rotations and a discretized distribution of estimated proportions α_kp of the data belonging to the pth projection of the kth underlying three-dimensional structure (with α_kp ≥ 0 and \sum_{k=1}^{K} \sum_{p=1}^{P} \alpha_{kp} = 1). Thereby, P(k, p, q|Θ) is calculated as follows:

P(k, p, q \mid \Theta) = \frac{\alpha_{kp}}{Q_{\mathrm{rot}}} \, \frac{1}{2\pi\xi^2} \exp\!\left(-\frac{q_x^2 + q_y^2}{2\xi^2}\right).    (4)
In contrast to previous contributions where a Gaussian distribution was employed, we calculate P(X_i|k, p, q, Θ) as a multivariate t-distribution with a diagonal covariance matrix with all diagonal elements equal to σ²:

P(X_i \mid k, p, q, \Theta) = \frac{\Gamma[(\nu + J)/2]}{\Gamma(\nu/2)\,(\pi\nu\sigma^2)^{J/2}} \left(1 + \frac{d_{ikpq}^2}{\nu\sigma^2}\right)^{-(\nu + J)/2},    (5)

with J the number of pixels in each image,

d_{ikpq}^2 = \lVert X_i - R_{pq} V_k \rVert^2    (6)

and ||·|| denoting Euclidean distance.
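As an illustration of (5), the following Python sketch (our own minimal example; the function name and vectorized form are not from the paper) evaluates the log-density of such a multivariate t-distribution with diagonal covariance σ²I, working in log space to avoid overflow for the large J typical of images:

    import numpy as np
    from scipy.special import gammaln

    def log_t_density(x, mean, sigma2, nu):
        # Log of eq. (5): multivariate t with diagonal covariance sigma2*I.
        # x, mean: 1-D arrays of length J (flattened images); nu: degrees of freedom.
        J = x.size
        d2 = np.sum((x - mean) ** 2)  # squared Euclidean residual, eq. (6)
        return (gammaln((nu + J) / 2.0) - gammaln(nu / 2.0)
                - 0.5 * J * np.log(np.pi * nu * sigma2)
                - 0.5 * (nu + J) * np.log1p(d2 / (nu * sigma2)))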
Following McLachlan & Peel (2000), we notice that the multivariate t-distribution may be viewed as a weighted average of Gaussian distributions, with the weight given by a Gamma distribution:

P(X_i \mid k, p, q, \Theta) = \int_0^{\infty} n_J(X_i; k, p, q, u_i, \Theta)\, q(u_i)\, \mathrm{d}u_i.    (7)

Here, q(u_i) is the p.d.f. of a Gamma distribution with equal scale and shape parameters, G(ν/2, ν/2), and n_J(X_i; k, p, q, u_i, Θ) is a multivariate Gaussian distribution centred at R_pq V_k, again with a diagonal covariance matrix, which has all diagonal elements equal to σ²/u_i:

n_J(X_i; k, p, q, u_i, \Theta) = \left(\frac{u_i}{2\pi\sigma^2}\right)^{J/2} \exp\!\left(-\frac{u_i\, d_{ikpq}^2}{2\sigma^2}\right).    (8)
Therefore, it is convenient to introduce another set of `missing' variables u_1, …, u_N, which are defined such that

X_i \mid k, p, q, u_i \sim n_J(X_i; k, p, q, u_i, \Theta)    (9)

independently for i = 1, …, N, and all u_i are independently distributed according to

u_i \sim G(\nu/2, \nu/2).    (10)
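The decomposition (7)–(10) can be checked numerically. The short script below (our own sketch; the sizes are purely illustrative) compares a Monte-Carlo average of Gaussians with Gamma-distributed precisions u against SciPy's closed-form multivariate t-density; the two agree to sampling precision:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    nu, sigma2, J = 6.0, 1.0, 4      # illustrative values only
    x = rng.normal(size=J)           # a test point; the mean is taken as zero

    # Average Gaussians n_J(x; 0, (sigma2/u) I) over u ~ G(nu/2, nu/2), as in eq. (7)
    u = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=200_000)
    gauss = (u / (2 * np.pi * sigma2)) ** (J / 2) * np.exp(-0.5 * u * np.sum(x ** 2) / sigma2)
    print(gauss.mean())              # Monte-Carlo estimate of the t-density

    # Closed-form multivariate t-density at the same point, eq. (5)
    print(stats.multivariate_t(loc=np.zeros(J), shape=sigma2 * np.eye(J), df=nu).pdf(x))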
Thus, the complete data set becomes

Y = \{(X_i, \Phi_i, \kappa_i, u_i)\}_{i=1}^{N}    (11)

and the function to be optimized becomes

\mathcal{L}(\Theta) = \sum_{i=1}^{N} \log \sum_{k=1}^{K} \sum_{p=1}^{P} \sum_{q=1}^{Q} P(X_i \mid k, p, q, \Theta)\, P(k, p, q \mid \Theta),    (12)

with P(X_i|k, p, q, Θ) as defined in (5).
In analogy with (3) and with the previously introduced algorithm for Gaussian distributions (Scheres, Gao et al., 2007), the reconstruction problem at hand is to find the parameter set Θ* that maximizes (12). However, Θ now includes an additional parameter ν and the missing data vector has been augmented to include not only positions Φ_i and random indices κ_i but also the variables u_i. In this way, atypical observations in the data (i.e. observations with relatively large residuals) may be accommodated by relatively wide Gaussian distributions (i.e. with small values of u_i) and the additional parameter ν is used to describe the assumed distribution of all u_i according to (10).
2.2. The optimization algorithm
This optimization problem may be solved by expectation maximization (Dempster et al., 1977). This algorithm is used for finding estimates of parameters in probabilistic models that depend on unobserved or hidden variables. Expectation maximization is an iterative method that alternates between expectation (E) and maximization (M) steps. In the E-step one computes the expectation of the likelihood by including the hidden variables as if they were observed. In the M-step that follows, the estimate of the model parameters is computed by maximizing the expected likelihood found in the previous E-step. The parameters found in the M-step are then used to begin another E-step and the process is repeated. As stated above, the missing variables in this case are ui, Φi and κi and the parameters to be estimated are contained in Θ.
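As a generic illustration of this alternation (a self-contained toy of our own, not the algorithm of this paper), the following Python script fits a two-component one-dimensional Gaussian mixture by expectation maximization; the only hidden variable here is the class index:

    import numpy as np

    rng = np.random.default_rng(1)
    # Toy data: two 1-D Gaussian components (a stand-in for the hidden index kappa_i)
    x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.0, 700)])

    mu, alpha, var = np.array([-1.0, 1.0]), np.array([0.5, 0.5]), 1.0
    for _ in range(50):
        # E-step: responsibilities of each component for each observation
        dens = alpha * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        tau = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate the parameters from the responsibility-weighted data
        Nk = tau.sum(axis=0)
        mu = (tau * x[:, None]).sum(axis=0) / Nk
        var = (tau * (x[:, None] - mu) ** 2).sum() / x.size
        alpha = Nk / x.size
    print(mu, alpha, var)   # converges to approximately (-2, 3), (0.3, 0.7), 1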
In the E-step, again following McLachlan & Peel (2000), we calculate the expectation of the complete-data log-likelihood function using the current estimates of the model parameters (Θ^old):

Q(\Theta; \Theta^{\mathrm{old}}) = \sum_{i=1}^{N} \sum_{k=1}^{K} \sum_{p=1}^{P} \sum_{q=1}^{Q} \tau^{\mathrm{old}}_{ikpq} \left[ \log P(k, p, q \mid \Theta) + \frac{J}{2} \log \frac{u^{\mathrm{old}}_{ikpq}}{2\pi\sigma^{2}} - \frac{u^{\mathrm{old}}_{ikpq}\, \lVert X_i - R_{pq} V_k \rVert^{2}}{2\sigma^{2}} \right].    (13)
Here, τ^old_ikpq is the conditional probability distribution of k, p and q given X_i,

\tau^{\mathrm{old}}_{ikpq} = \frac{P(X_i \mid k, p, q, \Theta^{\mathrm{old}})\, P(k, p, q \mid \Theta^{\mathrm{old}})}{\sum_{k'=1}^{K} \sum_{p'=1}^{P} \sum_{q'=1}^{Q} P(X_i \mid k', p', q', \Theta^{\mathrm{old}})\, P(k', p', q' \mid \Theta^{\mathrm{old}})},    (14)
and for the conditional expectation of u_i, given X_i, k, p and q, we obtain

u^{\mathrm{old}}_{ikpq} = \frac{\nu + J}{\nu + \lVert X_i - R_{pq} V^{\mathrm{old}}_k \rVert^{2} / (\sigma^{\mathrm{old}})^{2}}.    (15)
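In matrix form, the E-step quantities (14) and (15) can be computed as in the following sketch (our own vectorized formulation; the array names are hypothetical and the (k, p, q) indices are flattened into a single axis m). The shift by the row maximum guards against underflow of the extremely small image probabilities:

    import numpy as np

    def e_step(log_t, log_prior, d2, sigma2, nu, J):
        # log_t:     (N, M) log P(X_i | m, Theta_old) from eq. (5), m = (k, p, q)
        # log_prior: (M,)   log P(k, p, q | Theta_old) from eq. (4)
        # d2:        (N, M) squared residuals ||X_i - R_pq V_k^old||^2
        log_post = log_t + log_prior                     # unnormalized log posterior
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stabilization
        tau = np.exp(log_post)
        tau /= tau.sum(axis=1, keepdims=True)            # eq. (14)
        u = (nu + J) / (nu + d2 / sigma2)                # eq. (15)
        return tau, u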
In the subsequent M-step of the algorithm, we maximize the lower bound (13) with respect to all model parameters in Θ. Since there exists no closed form for the update of ν^new, we will consider ν to be known (i.e. user-determined). The updates for ξ and the proportions α^new_kp may be calculated independently from the updates of the other model parameters as follows:

\xi^{\mathrm{new}} = \left[ \frac{1}{2N} \sum_{i=1}^{N} \sum_{k=1}^{K} \sum_{p=1}^{P} \sum_{q=1}^{Q} \tau^{\mathrm{old}}_{ikpq} \left(q_x^2 + q_y^2\right) \right]^{1/2},    (16)

\alpha^{\mathrm{new}}_{kp} = \frac{1}{N} \sum_{i=1}^{N} \sum_{q=1}^{Q} \tau^{\mathrm{old}}_{ikpq}.    (17)
For the updates of V and σ, we note that they are a weighted version of the corresponding updates in the case of Gaussian distributions, with standard deviations σ/u_1^{1/2}, …, σ/u_N^{1/2}, and with the weights being the additional missing variables u_1, …, u_N. Therefore, as for the Gaussian case, updating V may be performed by separately solving K least-squares problems, for which we use a modified algebraic reconstruction algorithm (wlsART; see Scheres, Gao et al., 2007). In this case, the least-squares problems are

V^{\mathrm{new}}_k = \arg\min_{V_k} \sum_{i=1}^{N} \sum_{p=1}^{P} \sum_{q=1}^{Q} \tau^{\mathrm{old}}_{ikpq}\, u^{\mathrm{old}}_{ikpq}\, \lVert X_i - R_{pq} V_k \rVert^{2}, \qquad k = 1, \ldots, K,    (18)
and the updated σ is obtained as

(\sigma^{\mathrm{new}})^{2} = \frac{1}{NJ} \sum_{i=1}^{N} \sum_{k=1}^{K} \sum_{p=1}^{P} \sum_{q=1}^{Q} \tau^{\mathrm{old}}_{ikpq}\, u^{\mathrm{old}}_{ikpq}\, \lVert X_i - R_{pq} V^{\mathrm{new}}_k \rVert^{2}.    (19)
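A compact sketch of the closed-form part of the M-step, under the same flattened-index convention as above (again our own formulation, with the sum over q in (17) absorbed into the flattened axis):

    import numpy as np

    def m_step_alpha_sigma2(tau, u, d2, J):
        # tau, u, d2: (N, M) responsibilities, t-weights and squared residuals
        N = tau.shape[0]
        alpha = tau.sum(axis=0) / N            # eq. (17)
        w = tau * u                            # the weights entering eqs (18) and (19)
        sigma2 = (w * d2).sum() / (N * J)      # eq. (19), strict EM version
        return alpha, sigma2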
2.3. Implementation
We implemented a total of four variants of the above-described algorithm in the open-source package XMIPP (Sorzano, Marabini et al., 2004; Scheres et al., 2008). The proposed algorithm for three-dimensional classification can be adapted with only minor changes to a two-dimensional classification algorithm. In this case, instead of optimizing (13) with respect to the three-dimensional structures V_1, …, V_K, one optimizes this function with respect to two-dimensional images A_1, …, A_K. The algorithm remains basically the same, except for the fact that in this case R_pq represents an in-plane transformation (parametrized by a single rotation and two in-plane translations) and the least-squares problem in (18) is replaced by the following update formula:

A^{\mathrm{new}}_k = \frac{\sum_{i=1}^{N} \sum_{p=1}^{P} \sum_{q=1}^{Q} \tau^{\mathrm{old}}_{ikpq}\, u^{\mathrm{old}}_{ikpq}\, R^{-1}_{pq} X_i}{\sum_{i=1}^{N} \sum_{p=1}^{P} \sum_{q=1}^{Q} \tau^{\mathrm{old}}_{ikpq}\, u^{\mathrm{old}}_{ikpq}}.    (20)
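For the two-dimensional variant, the update (20) is thus simply a weighted average of the back-transformed experimental images. A minimal sketch (our own; it assumes the images have already been back-transformed by R_pq^{-1} for every sampled transformation):

    import numpy as np

    def update_class_average(images_aligned, tau_u):
        # images_aligned: (N, M, J) image X_i back-transformed by R_pq^{-1}
        #                 for every sampled transformation m = (p, q)
        # tau_u:          (N, M) products tau_ikpq * u_ikpq for one class k
        numerator = np.einsum('nm,nmj->j', tau_u, images_aligned)
        return numerator / tau_u.sum()         # eq. (20)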
In addition, both the two-dimensional and the three-dimensional variants may also be expressed in Fourier space. In this case, X_1, …, X_N, A_1, …, A_K and V_1, …, V_K represent the Fourier transforms of the observed data and the two-dimensional or three-dimensional models, respectively, G_i is independent zero-mean additive noise in Fourier space and R_pq represents the reciprocal-space equivalent of either a projection operation or an in-plane transformation in real space. In the former model one describes the noise by independent distributions on the real-space pixels, while in the latter the noise is modelled as being spatially stationary, which allows one to describe nonwhite, or coloured, noise. For a more extensive elaboration on these characteristics and their implementation, the reader is referred to Scheres, Nunez-Ramirez et al. (2007).

Finally, we mention that the summations over k, p and q are extremely computing-intensive operations. Therefore, we have implemented three deviations from the strict expectation-maximization algorithm that result in a considerable speed-up of the calculations without hampering the classification performance in practice. The first two deviations were also implemented as such in the algorithms using Gaussian distributions, whereas the third deviation is specific to the t-distribution case: (i) instead of integrating over the entire search space of k and q, we employ a reduced-space approach (Scheres, Valle & Carazo, 2005), (ii) the update of σ is performed using V^old instead of V^new and (iii) following the proposal of McLachlan & Peel (2000), we replace the division by N in (19) by a division by \sum_{i,k,p,q} \tau^{\mathrm{old}}_{ikpq} u^{\mathrm{old}}_{ikpq}.
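Deviation (iii) only changes the denominator of (19). With mock arrays (sizes and distributions purely illustrative; our own sketch), the two variants read:

    import numpy as np

    rng = np.random.default_rng(2)
    N, M, J = 1000, 8, 4096                          # illustrative sizes
    tau = rng.dirichlet(np.ones(M), size=N)          # mock responsibilities
    u = rng.gamma(3.0, 1.0 / 3.0, size=(N, M))       # mock t-weights
    d2 = rng.chisquare(J, size=(N, M))               # mock squared residuals

    sigma2_strict = (tau * u * d2).sum() / (N * J)               # strict eq. (19)
    sigma2_fast = (tau * u * d2).sum() / ((tau * u).sum() * J)   # deviation (iii)
    print(sigma2_strict, sigma2_fast)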
3. Results
3.1. Robustness to outliers
We used a simplified two-dimensional test case to illustrate the potential of the t-distribution in providing robustness to outliers. The test data consisted of 1000 experimental cryo-EM projections of a 70S Escherichia coli ribosome particle in a single orientation. In 50 of the 1000 images, we positioned circles of constant density with radii varying uniformly between 10 and 15 pixels and with centres varying between −15 and 15 pixels from the image origin. The intensity of these circles was set to a constant value of 5, 10, 15 or 20 times the standard deviation of the original experimental images. We then performed two-dimensional real-space refinements with a single reference image for these data sets, comparing the performance of the Gaussian and t-mixture models (Fig. 1). The resulting averages clearly showed the effect of the improved robustness to outliers provided by the t-mixture. For the data sets with the strongest outliers in particular, the averages obtained with the Gaussian mixture showed clear artefacts that were not visible in the averages obtained using the t-mixture model. Analysis of the converged estimates for the standard deviation in the noise indicated that the algorithm for the Gaussian case tries to accommodate the outliers by increasing the widths of the Gaussians. This is much less the case for the t-distribution case, where low values of u^old_ikφ downweight the contribution of the outliers in the calculation of the averages and the standard deviation of the noise. The stronger the outliers, the larger this downweighting effect and the larger the differences between the two algorithms.
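The qualitative effect can be mimicked in a few lines of Python. In this toy (our own construction, much simpler than the test of Fig. 1: a single pass, known σ and a constant offset instead of circles), the weights of (15) nearly remove the bias that the outliers introduce into the plain average:

    import numpy as np

    rng = np.random.default_rng(3)
    J, nu, n_good, n_out = 256, 6.0, 950, 50
    signal = rng.normal(size=J)
    data = signal + rng.normal(scale=1.0, size=(n_good + n_out, J))
    data[-n_out:] += 10.0                      # constant-density 'outlier' offset

    # Plain average (Gaussian-style, all weights equal) versus t-weighted average
    plain = data.mean(axis=0)
    d2 = ((data - plain) ** 2).sum(axis=1)
    u = (nu + J) / (nu + d2 / 1.0)             # eq. (15) weights; sigma^2 = 1 assumed
    weighted = (u[:, None] * data).sum(axis=0) / u.sum()

    print(np.abs(plain - signal).mean(), np.abs(weighted - signal).mean())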
3.2. Performance in two-dimensional classification
To explore the potential of the new algorithm in two-dimensional image classification, we performed maximum-likelihood multi-reference refinements on a cryo-EM data set of MCM top views (Gómez-Llorente et al., 2005) and on a negative-stain data set of G40P top views (Nunez-Ramirez et al., 2006). For each data set we performed four runs, using mixtures of Gaussians or of t-distributions with six degrees of freedom and performing two-dimensional refinements in real or in Fourier space. All four runs were started from identical seeds, which were obtained as average images over three random subsets of the data sets. Fig. 2 shows the resulting refined images of these runs, which show only minor differences between the two types of mixtures either in real or in Fourier space. In all cases, the images refined with the t-mixture look very similar to those obtained with a Gaussian mixture. Not only do the densities for the averaged particles in the centre of the images look very similar, the two mixture types even result in common characteristics in the noise background. The optimization path and the optimal orientation and classification parameters of the individual images upon convergence also showed only small differences (not shown).

3.3. Performance in three-dimensional classification

For three-dimensional classification, we compared the performance of both types of mixtures using a data set of 20 000 ribosome particles. This data set was previously shown to be structurally heterogeneous, as only part of the ribosomes are complexed with elongation factor G (EF-G; see Scheres, Nunez-Ramirez et al., 2007). Refinements with four references typically converge to a single class corresponding to ribosomes in complex with EF-G and three classes of ribosomes without EF-G. We again performed four runs, using real-space or reciprocal-space refinements and using a Gaussian or a t-distribution mixture with six degrees of freedom. The intensity of segmented EF-G density in the class corresponding to ribosomes in complex with EF-G may serve as an indicator of classification quality, since remaining heterogeneity will generally yield lower density levels for EF-G. Fig. 3 shows segmented EF-G densities for the four runs performed. Starting from identical seeds, the t-mixture model in real space gave somewhat higher EF-G densities than the Gaussian mixture and the corresponding classes overlapped by 87%. Refinement in reciprocal space yielded stronger EF-G densities than in real space for both types of mixtures. The differences between the Gaussian and the t-mixture were smaller in this case, as no obvious difference in the intensity of EF-G density was observed and the EF-G-containing classes overlapped by 94%.
4. Discussion
The selection of individual particles from electron micrographs, called particle picking, is typically a difficult task. For cryo-EM data on relatively small particles (200–500 kDa) in particular, automated procedures may have relatively high error rates and the collection of good data is often strongly dependent on the specialized skills of the electron microscopist (Zhu et al., 2004). Therefore, it is relatively common for cryo-EM data sets to contain significant amounts of outliers. Atypical observations that were mistakenly assumed to be a particle of interest may deteriorate the quality of the three-dimensional reconstruction. In the best case scenario they only affect the resolution obtained. In the worst case scenario artefacts introduced by outliers may affect the interpretation of the structure itself. Conventionally, outliers have been dealt with by removing those particles with the lowest cross-correlation coefficients with the reference from the refinement process (Frank, 2006). Although effective in practice, such discrete decisions are hard to accommodate in the statistical framework of maximum-likelihood refinement.
The algorithm proposed in this contribution provides an alternative statistical solution to outlier removal. The problem of structurally heterogeneous projection data is modelled as a finite mixture of multivariate t-distributions with a given number of degrees of freedom. In the resulting expectation-maximization algorithm, images with atypically large residuals contribute relatively little to the model estimates through lower values of u^old_ikφ; see (15), (18) and (19). Note that the residuals used to calculate the weights u^old_ikφ are closely related to the cross-correlation coefficient, but instead of taking discrete decisions the statistical approach applies a continuous downweighting of outliers as their residuals increase. We illustrated this effect for a small experimental test set with artificially generated outliers. In the Gaussian model all particles contribute equally to the model estimates. Consequently, especially in the presence of strong outliers, the average images obtained showed outlier-related artefacts and the variance of the noise was overestimated. In contrast, the t-distribution model resulted in clean average images and reliable noise estimates through an effective downweighting of the outliers.
In practice, however, there are limits to the downweighting of outliers in the proposed algorithm. From (15) and the example in Fig. 1, we can see that even for a few (e.g. six) degrees of freedom, significant downweighting is only achieved for images with squared residuals that exceed the standard deviation of the noise in the images several times. This may restrict the usefulness of the proposed algorithm in identifying aberrant particles. In practice, particles with such large residuals may be easily recognizable at earlier stages of image processing, whereas one would ideally want to downweight any particle that does not correspond to a projection of one of the K reference structures. Analyses of the two-dimensional classifications presented in §3.2 indeed did not reveal an obvious relation between low values of u^old_ikφ and what one would consider atypical images in terms of the underlying signal (results not shown).
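A quick numerical illustration of this limit, evaluating (15) for an assumed 64 × 64-pixel image and ν = 6 (values of our own choosing): the weight only drops well below 1 once the squared residual exceeds Jσ², its expectation for pure noise, several times over.

    import numpy as np

    nu, J, sigma2 = 6.0, 64 * 64, 1.0
    # Residual expressed as a multiple of J*sigma2, the expectation for pure noise
    for factor in (1, 2, 5, 10):
        d2 = factor * J * sigma2
        u = (nu + J) / (nu + d2 / sigma2)      # eq. (15)
        print(f"d2 = {factor:2d} x J*sigma2  ->  u = {u:.2f}")
    # prints u = 1.00, 0.50, 0.20, 0.10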
The number of degrees of freedom ν is a free parameter of the proposed algorithm. Although in theory the optimal value of any free parameter should be tested, we performed all calculations presented in this work with a fixed value of six degrees of freedom. Because 3D-EM images typically contain many pixels, the right-hand side of both the numerator and the denominator in (15) will dominate the calculation of u^old_ikφ when using few degrees of freedom. Together with the observation made above that even for few degrees of freedom the effect of outlier downweighting may be relatively small, this suggests that in practice it may be sufficient to run this algorithm only with few degrees of freedom. This is confirmed by our calculations: when using three, nine or 30 degrees of freedom in the runs shown in Fig. 2, almost identical results were obtained (not shown) compared with using six degrees of freedom.

Improved image classification would be the ultimate aspiration of introducing a novel algorithm for maximum-likelihood refinement of 3D-EM data. Despite the fact that both the MCM and the G40P data sets contained significant amounts of neighbouring particles and other artefacts that were not accounted for in the model, we did not observe any significant improvement in using the t-mixture model over the conventional Gaussian model in our two-dimensional classifications. One could attribute this to the observation that strong downweighting may only be achieved for outliers with very large residuals and that such strong artefacts were not present in these data. However, although the differences in the u^old_ikφ weights may appear to be relatively small in practice, more subtle effects may still play an important role in the complicated convergence process. This may perhaps explain why three-dimensional classification of a structurally heterogeneous ribosome data set with a real-space t-mixture model may have given better classification results than the Gaussian mixture, as hinted at by a stronger signal for the complexed EF-G density.

We have presented too few tests to allow the drawing of general conclusions on the relative suitability of the t-mixture model and the conventional Gaussian model. A continuing application of the proposed algorithms on multiple test cases may provide further insights, but this falls beyond the scope of this contribution. Most probably, the optimal choice of algorithm will depend on the data set at hand. Therefore, we have made all algorithms described in this work accessible to the community by implementing them in our open-source package XMIPP (Sorzano, Marabini et al., 2004; Scheres et al., 2008). Apart from modifications to the classification approach as presented here, we also foresee the exploration of alternative refinement algorithms, such as maximum a posteriori (MAP) estimation, which may offer significant benefits in additional stabilization of the reconstruction problem through the incorporation of prior information.
Acknowledgements
We thank Dr Yacob Gómez-Llorente for providing the MCM data, Dr Rafael Núñez-Ramírez for providing the G40P data, Drs Haixiao Gao and Joachim Frank for providing the ribosome data and the latter for useful comments on an earlier version of this manuscript. We are grateful to the supercomputing centres of Barcelona (BSC-CNS) and Galicia (CESGA) for providing computer resources. Funding was provided by the European Union (FP6-502828), the US National Institutes of Health (HL70472), the Spanish Ministry of Science (CSD2006-00023, BIO2007-67150-C03-1 and -3) and the Spanish Comunidad de Madrid (S-GEN-0166-2006).
References
Blanc, E., Roversi, P., Vonrhein, C., Flensburg, C., Lea, S. M. & Bricogne, G. (2004). Acta Cryst. D60, 2210–2221.
Bricogne, G. (1997). Methods Enzymol. 276, 472–494.
Cuellar, J., Martin-Benito, J., Scheres, S. H. W., Sousa, R., Moro, F., López-Viñas, E., Gomez-Puertas, P., Muga, A., Carrascosa, J. & Valpuesta, J. (2008). Nature Struct. Mol. Biol. 15, 858–864.
Dempster, A., Laird, N. & Rubin, D. (1977). J. R. Stat. Soc. Ser. B, 39, 1–38.
Doerschuk, P. C. & Johnson, J. E. (2000). IEEE Trans. Inf. Theory, 46, 1714–1729.
Frank, J. (2006). Three-Dimensional Electron Microscopy of Macromolecular Assemblies. Oxford University Press.
Gómez-Llorente, Y., Fletcher, R. J., Chen, X. S., Carazo, J. M. & San Martín, C. (2005). J. Biol. Chem. 280, 40909–40915.
Julián, P., Konevega, A. L., Scheres, S. H. W., Lázaro, M., Gil, D., Wintermeyer, W., Rodnina, M. V. & Valle, M. (2008). Proc. Natl Acad. Sci. USA, 105, 16924–16927.
La Fortelle, E. de & Bricogne, G. (1997). Methods Enzymol. 276, 361–423.
Lange, K., Little, R. & Taylor, J. (1989). J. Am. Stat. Assoc. 84, 881–896.
Lee, J., Doerschuk, P. C. & Johnson, J. E. (2007). IEEE Trans. Image Process. 16, 2865–2878.
Leschziner, A. E. & Nogales, E. (2007). Annu. Rev. Biophys. Biomol. Struct. 36, 43–62.
Marabini, R., Herman, G. T. & Carazo, J. M. (1998). Ultramicroscopy, 72, 53–65.
McLachlan, G. & Peel, D. (2000). Finite Mixture Models. New York: John Wiley & Sons.
Nickell, S., Beck, F., Korinek, A., Mihalache, O., Baumeister, W. & Plitzko, J. M. (2007). FEBS Lett. 581, 2751–2756.
Nunez-Ramirez, R., Robledo, Y., Mesa, P., Ayora, S., Alonso, J. C., Carazo, J. M. & Donate, L. E. (2006). J. Mol. Biol. 357, 1063–1076.
Pascual-Montano, A., Donate, L. E., Valle, M., Bárcena, M., Pascual-Marqui, R. D. & Carazo, J. M. (2001). J. Struct. Biol. 133, 233–245.
Penczek, P. A., Grassucci, R. A. & Frank, J. (1994). Ultramicroscopy, 53, 251–270.
Provencher, S. W. & Vogel, R. H. (1988). Ultramicroscopy, 25, 209–221.
Radermacher, M. (1994). Ultramicroscopy, 53, 121–136.
Read, R. J. (2001). Acta Cryst. D57, 1373–1382.
Rehmann, H., Arias-Palomo, E., Hadders, M., Schwede, F., Llorca, O. & Bos, J. (2008). Nature (London), 455, 124–127.
Scheres, S. H. W., Gao, H., Valle, M., Herman, G. T., Eggermont, P. P. B., Frank, J. & Carazo, J. M. (2007). Nature Methods, 4, 27–29.
Scheres, S. H. W., Nunez-Ramirez, R., Gomez-Llorente, Y., San Martin, C., Eggermont, P. P. B. & Carazo, J. M. (2007). Structure, 15, 1167–1177.
Scheres, S. H. W., Nunez-Ramirez, R., Sorzano, C. O. S., Carazo, J. M. & Marabini, R. (2008). Nature Protoc. 3, 977–990.
Scheres, S. H. W., Valle, M. & Carazo, J. M. (2005). Bioinformatics, 21, Suppl. 2, ii243–ii244.
Scheres, S. H. W., Valle, M., Nuñez, R., Sorzano, C. O. S., Marabini, R., Herman, G. T. & Carazo, J. M. (2005). J. Mol. Biol. 348, 139–149.
Sigworth, F. J. (1998). J. Struct. Biol. 122, 328–339.
Sigworth, F. J. (2004). J. Struct. Biol. 145, 111–122.
Sorzano, C. O. S., de la Fraga, L. G., Clackdoyle, R. & Carazo, J. M. (2004). Ultramicroscopy, 101, 129–138.
Sorzano, C. O. S., Marabini, R., Velázquez-Muriel, J., Bilbao-Castro, J. R., Scheres, S. H. W., Carazo, J. M. & Pascual-Montano, A. (2004). J. Struct. Biol. 148, 194–204.
Vogel, R. H. & Provencher, S. W. (1988). Ultramicroscopy, 25, 223–239.
Wang, H., Zhang, Q. & Wei, S. (2004). Pattern Recognit. Lett. 25, 701–710.
Yin, Z., Zheng, Y., Doerschuk, P. C., Natarajan, P. & Johnson, J. E. (2003). J. Struct. Biol. 144, 24–50.
Zeng, X., Stahlberg, H. & Grigorieff, N. (2007). J. Struct. Biol. 160, 362–374.
Zhu, Y. et al. (2004). J. Struct. Biol. 145, 3–14.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.