- 1. Introduction
- 2. Problem formulation and the MMSE orientation estimator
- 3. Numerical methods for MMSE orientation estimation
- 4. Bayesian orientation estimation as part of the volume-reconstruction problem
- 5. Structural heterogeneity analysis
- 6. Discussion and conclusions
- B1. Additive loss function
- B2. Closed-form MMSE rules under squared loss
- F1. Isotropic Gaussian distribution of SO(3) rotations
- F2. Estimating the rotational prior
- References
- 1. Introduction
- 2. Problem formulation and the MMSE orientation estimator
- 3. Numerical methods for MMSE orientation estimation
- 4. Bayesian orientation estimation as part of the volume-reconstruction problem
- 5. Structural heterogeneity analysis
- 6. Discussion and conclusions
- B1. Additive loss function
- B2. Closed-form MMSE rules under squared loss
- F1. Isotropic Gaussian distribution of SO(3) rotations
- F2. Estimating the rotational prior
- References
research papers
Bayesian perspective for orientation determination in cryo-EM with application to structural heterogeneity analysis
aProgram in Applied and Computational Mathematics, Princeton University, Fine Hall, Washington Road, Princeton, NJ 08544, USA, bSchool of Electrical and Computer Engineering, Tel Aviv University, 69978 Tel Aviv, Israel, and cDepartment of Mathematics, Princeton University, Fine Hall, Washington Road, Princeton, NJ 08544, USA
*Correspondence e-mail: [email protected]
This article is part of a focused issue on image-processing methods for electron microscopy of biological specimens.
Accurate orientation estimation is a crucial component of 3D molecular structure reconstruction, both in single-particle cryo-electron microscopy (cryo-EM) and in the increasingly popular field of cryo-electron tomography (cryo-ET). The dominant approach, which involves searching for the orientation that maximizes cross-correlation relative to given templates, is suboptimal, particularly under low signal-to-noise conditions. In this work, we propose a Bayesian framework for more accurate and flexible orientation estimation, with the minimum mean-square error (MMSE) estimator serving as a key example. Through simulations, we demonstrate that the MMSE estimator consistently outperforms the cross-correlation-based method, especially in challenging low signal-to-noise scenarios, and we provide a theoretical framework that supports these improvements. When incorporated into iterative refinement algorithms in the 3D reconstruction pipeline, the MMSE estimator markedly improves reconstruction accuracy, reduces model bias and enhances robustness to the `Einstein from Noise' artifact. Crucially, we demonstrate that orientation-estimation accuracy has a decisive effect on downstream structural heterogeneity analysis. In particular, integrating the MMSE-based pose estimator into frameworks for continuous heterogeneity recovery yields accuracy improvements approaching those obtained with ground-truth poses, establishing MMSE-based pose estimation as a key enabler of high-fidelity conformational landscape reconstruction. These findings indicate that the proposed Bayesian framework could substantially advance cryo-EM and cryo-ET by enhancing the accuracy, robustness and reliability of 3D molecular structure reconstruction, thereby facilitating deeper insights into complex biological systems.
Keywords: orientation estimation; cryo-electron microscopy; Bayesian inference; MMSE estimator; structural heterogeneity.
1. Introduction
Determining the precise three-dimensional (3D) orientation of biological molecules from their noisy two-dimensional (2D) projection images is a fundamental challenge in cryo-electron microscopy (cryo-EM; Bai et al., 2015
; Lyumkis, 2019
; Bendory et al., 2020
). This process, known as orientation estimation, is crucial for various cryo-EM applications, including 3D reconstruction algorithms (Scheres, 2012b
; Punjani et al., 2017
), heterogeneity analysis (Sorzano et al., 2019
; Toader et al., 2023
; Donnat et al., 2022
; Zhong et al., 2021
) and beyond (Mäeots & Enchev, 2022
). For example, Fig. 1
(a) illustrates the role of orientation estimation within the cryo-EM 3D reconstruction workflow.
|
Figure 1
Comparison between maximum cross-correlation ( |
In cryo-electron tomography (cryo-ET), orientation estimation presents an additional challenge in the form of subtomogram averaging. Notably, cryo-ET suffers from higher noise levels compared with single-particle cryo-EM due to the complex and heterogeneous nature of cellular samples and the challenges of capturing data from multiple angles within thicker specimens. Subtomogram averaging offers an effective approach to enhance the signal-to-noise ratio (SNR), ultimately resulting in the reconstruction of high-resolution structures. This technique often involves extracting multiple similar subtomograms containing the target protein complex or macromolecule from a large cryo-electron tomogram reconstructed from all available tilts (typically from −60° to +60°), followed by aligning and averaging them (Zhang, 2019
; Watson & Bartesaghi, 2024
). Unlike traditional cryo-EM, this process typically aligns 3D structures directly, without the direct use of 2D projections (for an illustration, see Fig. 1
b).
Mathematically, the orientation-estimation tasks in cryo-EM and cryo-ET are slightly different. In the process of single-particle cryo-EM, which involves 2D tomographic projections, the mathematical model can be formulated as
where is the observed 2D projection image,
is the underlying 3D molecular structure, Π is the tomographic projection operator, g is the unknown 3D rotation operator of interest, ɛ represents measurement noise and g ○ V(x) ≡ V(g−1x), representing a rotation acting on a volume V with 3D coordinate x. Analogously, the mathematical model in cryo-ET subtomogram averaging, which aligns directly with the 3D structure, can be represented by
where is the observed 3D subtomogram,
is the underlying 3D molecular structure, g is the unknown 3D rotation operator of interest and ɛ represents measurement noise. Then, the goal of orientation estimation is to find the `best' 3D rotation g based on the 2D projection image (in the cryo-EM case; equation 1
) or the 3D subtomogram (in the cryo-ET case; equation 2
) with respect to the 3D reference V. Namely, we aim to estimate the rotation g given the sample y and the 3D structure V.
The actual mathematical models used in cryo-EM and cryo-ET are more intricate than those presented in equations (1
) and (2
), incorporating additional factors such as the contrast transfer function (CTF) and in-plane translations, as elaborated in Appendix A
. Although the framework introduced in this work is capable of addressing the full cryo-EM model, we adopt the simplified models in equations (1
) and (2
) to more effectively communicate the primary insights.
1.1. The gap
The common approach to estimating the rotation of an observation in the models above involves scanning through a pre-defined set of possible rotations and selecting the one that either maximizes the correlation or minimizes the distance to the given reference (2D projection image or 3D molecular structure, depending on the application). Typically, these metrics involve weighted correlations and distances, where the weights account for the noise characteristics (Scheres, 2012b
).
From an estimation theory perspective, this process corresponds to the maximum-likelihood estimator (MLE), which does not incorporate prior information about the rotation distribution, i.e. the distribution of the 3D rotation g appearing in models (1
) or (2
). A natural extension is the maximum a posteriori (MAP) estimator, which combines the data likelihood with a prior distribution on rotations. Notably, when a uniform prior is assumed, indicating that all orientations are equally likely a priori, the MAP estimator reduces to the MLE estimator (Kay, 1993
). A more rigorous treatment can be found in Section 2.3
.
However, Bayesian theory provides a much deeper and richer statistical framework that leads to improvement: replacing the MLE or MAP estimator with the Bayes estimator. The full potential of this estimator, which provides optimal accuracy according to a user-defined loss function and allows the integratation of prior knowledge about the rotation distribution, has remained untapped so far.
The term rotation distribution, used interchangeably with orientation distribution in this paper, refers to the probabilistic law governing how orientations are distributed over all possible rotations in 3D space. In cryo-EM, this concept is critical because molecules are first suspended in a thin aqueous solution and subsequently vitrified into amorphous ice, at which point their orientations become fixed but remain uncontrolled. Ideally, all orientations would be equally likely, corresponding to a uniform distribution. In practice, however, molecules often adopt preferred orientations, typically due to interactions with the air–water interface or other sample-preparation factors (Tan et al., 2017
; Lyumkis, 2019
; Li et al., 2021
). As a result, certain orientations are disproportionately represented, introducing systematic biases that can degrade reconstruction quality and complicate orientation estimation. Accounting for these deviations from uniformity is therefore essential for both methodological development and the interpretation of cryo-EM data.
Currently, the most practical way to obtain orientation estimates is through modules of reconstruction software such as RELION (Scheres, 2012b
) and cryoSPARC (Punjani et al., 2017
). These packages are primarily designed for 3D structure reconstruction and therefore treat pose estimation as a latent-variable subproblem. In particular, during refinement they compute posterior weights over a discrete set of candidate poses for each particle and use these weights for weighted averaging in the reconstruction step (a soft assignment over poses). At the same time, for downstream usage and reporting, the metadata typically provide a single orientation per particle, which is commonly the MLE estimator; that is, the maximizer of the pose posterior. Thus, while these pipelines do incorporate latent-variable pose weighting internally for reconstruction, they do not typically output a Bayesian rotation estimate for each particle.
In practice, researchers frequently use the reported per-particle orientation point estimates for downstream tasks such as heterogeneity analysis (Gilles & Singer, 2025
; Zhong et al., 2021
) and structural validation (Rosenthal & Henderson, 2003
). However, relying solely on MLE-based estimates can be suboptimal, especially in low-SNR regimes where MLE estimators are known to perform poorly. This work therefore explores opportunities to improve per-particle orientation estimation by using Bayes estimators that explicitly average over pose uncertainty and incorporate prior knowledge of the rotation distribution.
1.2. The Bayesian framework
The Bayesian framework has become a powerful and widely adopted tool in cryo-EM, now recognized as the leading method for recovering 3D molecular structures (Scheres, 2012b
; Punjani et al., 2017
; Toader et al., 2023
; Gilles & Singer, 2025
). It effectively addresses challenges such as overfitting and parameter tuning while enhancing interpretability (Scheres, 2012a
). By explicitly modeling uncertainties, Bayesian methods enable more accurate and robust reconstructions of molecular structures, driving significant advances in both resolution and structural flexibility. While the majority of these methods focus on the task of structure reconstruction (for example, see Section 4
) and aim to achieve the MAP estimator for the volume structure (Scheres, 2012b
; Punjani et al., 2017
), the Bayesian framework offers broader possibilities for addressing other problems.
This framework enables the use of any loss function over the rotational group tailored to users' requirements and accommodates a broad range of prior distributions beyond the uniform distribution. While the Bayesian framework can be adapted to different loss functions, this work focuses on the mean-squared error loss, which is equivalent to the chordal distance between 3D rotations. The primary reason for using this loss function is that its corresponding Bayes estimator has a closed-form analytical solution, making it computationally efficient and easy to interpret. We denote this estimator as , where MMSE stands for minimum mean-square error. It is worth noting that for any given loss function and prior distribution the Bayes estimator is optimal among all possible estimators, in the sense that it minimizes the posterior expected loss. Moreover, the computational cost of calculating our
estimator is on the same scale as that of the commonly used MLE and MAP orientation estimators (see Section 3
for more details).
1.3. Applications
Orientation estimation is not only integral to 3D structure reconstruction but also underpins a range of downstream tasks. Below, we highlight several representative applications that demonstrate its broader methodological significance.
One prominent example is heterogeneity analysis, where the goal is to capture structural variability that may arise from differences in composition, discrete states or continuous conformational changes. A substantial portion of state-of-the-art methods (Gilles & Singer, 2025
; Zhong et al., 2021
; Levy et al., 2025
; Luo et al., 2023
; Punjani et al., 2017
; Punjani & Fleet, 2021
) make the simplifying assumption that particle poses are already known and fixed, typically obtained from a consensus refinement procedure; these pose estimates then form the basis for subsequent modeling of structural variability. In this work, we consider continuous conformational heterogeneity and demonstrate in Section 5
that the quality of orientation estimation has a direct impact on the fidelity of the recovered conformational landscapes.
Another important application is 3D volume alignment. Unlike the setting of equation (2
), where an observation is aligned to a known reference volume, this task involves aligning two noisy volumes of the same structure with unknown relative rotations. This scenario arises, for instance, in the computation of the Fourier shell correlation (FSC) curve, a standard tool for estimating the spectral signal-to-noise ratio (SSNR) and resolution. As shown in Appendix B
, this problem can be expressed within the same statistical framework as equation (2
), differing only by an additional noise term, which highlights its close methodological connection to orientation estimation.
A further noteworthy application concerns validation. A classic example is the method of tilt pairs introduced by Rosenthal & Henderson (2003
), in which a small subset of images (for example around ten) is collected at known relative tilt angles. While the absolute particle orientations remain unknown, the relative orientations between tilt pairs are specified by the experimental geometry. The degree of agreement between estimated and known relative rotations then provides a direct validation of orientation assignment accuracy.
These examples are by no means exhaustive, but they underscore that orientation estimation is not merely a technical nuisance in reconstruction workflows, but rather a key methodological component that enables diverse downstream analyses and validation strategies across cryo-EM and cryo-ET.
1.4. Overview of results and contributions
In this work, we introduce a versatile Bayesian framework for orientation estimation with strong statistical guarantees and high flexibility. Section 2
formulates the problem and develops the Bayesian MMSE estimator. We show theoretically that the MMSE estimator coincides with the MLE estimator in high-SNR regimes (Proposition 1
), while consistently outperforming it under low-SNR conditions (Fig. 1), a typical scenario in cryo-EM and cryo-ET (Bendory et al., 2020
).
Section 3
evaluates the MMSE orientation estimator through simulations. We investigate the effect of different prior distributions and the discretization resolution L of the rotation group . Results indicate that non-uniform priors substantially improve estimation accuracy, underscoring the value of incorporating prior knowledge (Fig. 2). We further show that in high-SNR regimes the estimation error scales as L1/3 (Fig. 3), highlighting discretization as the dominant error source, whereas in low-SNR regimes noise dominates.
In Section 4
, we connect the MMSE estimator to 2D image recovery and 3D structure reconstruction, showing that it naturally leads to the expectation–maximization (EM) framework for models without projections (Proposition 3
). Empirical results confirm that the MMSE estimator consistently outperforms the MLE and MAP estimators in both tasks, providing higher accuracy and greater robustness against the `Einstein from Noise' artifact (Balanov et al., 2024
; Figs. 4 and 5).
Finally, Section 5
integrates MMSE-based pose estimation into structural heterogeneity analysis. Following the fixed-pose framework of RECOVAR, we show that the MMSE pose estimator consistently improves the recovery of conformational variability compared with the MLE counterpart. Section 6
concludes with a discussion of further applications of the Bayesian approach to orientation estimation and future research directions.
1.5. Main takeaways
Fig. 1
illustrates the critical role of orientation estimation in the cryo-EM and cryo-ET reconstruction processes, showcasing the superior performance of the MMSE orientation estimator, , compared with the MLE orientation estimator,
. Specifically, the curves show that the MMSE estimator consistently produces more accurate estimates than the MLE estimator, with the performance gap widening as the SNR decreases. Moreover, incorporating prior knowledge of rotation distribution into our MMSE estimator allows even better performance (see Fig. 2
). This performance gap becomes even more pronounced when the orientation estimators are incorporated into a reconstruction algorithm (see Section 4
).
|
Figure 2
Impact of incorporating the prior rotation distribution on estimation accuracy. Simulations are performed for the cryo-ET model (equation 2 |
Beyond reconstruction accuracy, we demonstrate that orientation-estimation quality has a direct and substantial impact on downstream structural heterogeneity analysis. Understanding structural variability is essential for characterizing the dynamic behavior of macromolecular complexes. Methods such as RECOVAR (Gilles & Singer, 2025
), designed to recover structural heterogeneity, typically assume known particle poses (i.e. fixed-pose methods). In reality these poses must be inferred, and MLE-derived estimates are commonly used in practice. We show that replacing MLE-derived poses with MMSE estimates in RECOVAR substantially improves the recovery of the latent conformational manifold, bringing results closer to those obtained using ground-truth poses. These findings establish MMSE orientation estimation not only as a tool that enhances reconstruction fidelity, but also as a facilitator that advances state-of-the-art continuous heterogeneity analysis.
The main takeaway of this paper is the recommendation to adopt the Bayesian MMSE orientation estimator for determining the orientation of each observation, in place of the commonly used MLE estimator. The MMSE estimator demonstrates superior performance even under a uniform rotation distribution and offers further improvements when incorporating prior knowledge of the underlying rotation distribution. Importantly, while current software packages do not directly compute the MMSE estimator, they already calculate all of the necessary components required for its implementation. Therefore, adopting the MMSE approach can be achieved with minimal additional computational cost. The detailed implementation and code are available at https://github.com/AmnonBa/bayesian-orientation-estimation.
2. Problem formulation and the MMSE orientation estimator
In this section, we present a particular Bayes estimator for orientation determination under a squared-error loss, commonly known as the minimum mean-square error (MMSE) estimator. The name reflects that this estimator minimizes the expected posterior mean-squared error, and it is given by the posterior (conditional) mean rather than a pointwise maximization (Kay, 1993
).
We begin by introducing a flexible mathematical model that encompasses various typical applications involving orientation estimation. Following this, we present a couple of metrics designed to assess the quality of our estimators, providing a robust framework for evaluating performance. Finally, we introduce the class of Bayes estimators, with the MMSE estimator serving as a primary example.
Throughout this paper, we use g to denote both the rotation operator and its corresponding matrix representation. For instance, g can be represented by a three by three rotation matrix in 3D. The intended meaning will be clear from the context, and this slight abuse of notation should not cause confusion.
2.1. Mathematical model for orientation estimation
We consider a unified framework for modeling measurement processes, encompassing problems such as 2D template matching, orientation estimation in cryo-EM and subtomogram averaging in cryo-ET. To better focus on the core aspects of our methodology, we omit certain physical effects, such as CTF and in-plane shifts, in this formulation. A complete mathematical model of the cryo-EM imaging process, incorporating these effects, is provided in Appendix A
.
We begin with a continuous-domain formulation. Let denote a reference structure, where n = 2 for 2D images and n = 3 for 3D volumes. For example, when n = 3, V corresponds to the continuous 3D electron density, as commonly used in cryo-EM or cryo-ET. Let
be a compact group of rotations correspondingly, and let
be an unknown transformation drawn from a distribution Λ over
.
We denote the measurement as a function , where m ≤ n accounts for possible dimension reduction due to projection (e.g. m = 2 in cryo-EM). Then, the continuous measurement model is given by
where (g ○ V)(x) := V(g−1x) is the rotated structure, Π is a known linear operator (such as a tomographic projection or identity) and ɛ denotes additive noise in the measurement domain.
In practice, measurements are only available in discretized form due to finite resolution. In the discrete setting, we assume that the operator Π also incorporates sampling onto a grid of size d, i.e. it includes both projection (if applicable) and discretization. The resulting discrete measurement model becomes
where and
is Gaussian noise with a d × d positive-definite covariance matrix Σ. To simplify the notation, we omit Π when it solely represents the discretization sampling operator and use the same symbol y to refer to the discretized measurement throughout the paper, unless stated otherwise.
2.1.1. Applications of the model
We present three typical examples of this model. In all cases, the goal is to estimate g given the sample y, the structure V and the covariance matrix Σ.
2.2. Preliminaries
Before introducing the specific estimator for the model specified in equation (3
), it is instructive to briefly revisit the general Bayesian framework and the concept of the Bayes estimator. This will provide the necessary foundation for understanding the development and analysis of the proposed MMSE estimator, as well as its statistical properties superior to those of the widely used MLE estimator.
2.2.1. Overview of the Bayesian framework and the Bayes estimators
Suppose that we aim to estimate a true rotation drawn from a known prior distribution Λ. Let
be an estimator of g based on a measurement y and let
be a loss function.
The Bayes risk of is defined as
, where the expectation is taken over the data-generation process of y given g and the prior distribution Λ of g. The Bayes estimator with respect to the loss
(chapter 4, Theorem 1.1 in Lehmann & Casella, 2006
) is defined as the estimator that minimizes the Bayes risk among all possible estimators, i.e.
Equivalently, it is the estimator that minimizes the posterior expected loss , where the expectation is taken over the posterior distribution of g given the measurement y, and any other known parameter such as volume V, with the prior Λ. For the case of
, it is given explicitly by
where, by Bayes' law, we have
2.2.2. The posterior distribution of the rotation g given an observation y under model (3
)
To introduce the MMSE estimator corresponding to the model (3
), we first compute the posterior distribution of g given y and all the additional parameters V, and Σ. We obtain the conditional likelihood density
Note that g follows the underlying prior distribution Λ. Applying Bayes' law, the posterior distribution of g given y is
2.2.3. Metrics over SO(3)
As the Bayes estimator is closely related to the given loss function , we present two candidate metrics on the 3D special orthogonal group
. Here, we represent any rotation
in its natural three by three matrix representation.
|
We note that the chordal and geodesic distances are closely related. In particular, if denotes the geodesic rotation angle between g1 and g2 (equation 10
), then
Consequently, dF is a strictly increasing function of dG and, for small angular errors, the two are locally equivalent in the sense that dF(g1, g2) ≈ (2)1/2dG(g1, g2) (see, for example, Hartley et al., 2013
). More generally, our framework is flexible and can accommodate other loss functions over any group . For a comprehensive discussion of metrics on
and their relationships, we refer the reader to Hartley et al. (2013
) and Huynh (2009
). Similar considerations apply to ; we omit a detailed discussion for brevity.
2.3. The MLE, MAP and MMSE orientation estimators
In the following presentation, we focus on independent and identically distributed Gaussian noise with Σ = σ2Id×d. These noise statistics are commonly employed in modern software tools (Kimanius et al., 2021
). While the proposed framework can incorporate more advanced noise models, we restrict attention to this case for clarity of presentation and to simplify the theoretical analysis.
2.3.1. The MLE estimator
Recalling the maximum cross-correlation method we mentioned earlier, we now connect it with the mathematical model in equation (3
) and introduce the corresponding MLE estimator. The rotation g that minimizes the distance between the corresponding projected rotated volume Π(g ○ V) and the observation y is exactly the MLE estimator defined as
where the first equality follows from equation (7
) and the second equality follows from the assumption Σ = σ2Id×d. In other words, maximizes the conditional density (equation 7
) over all possible rotations in . In the absence of the tomographic projection (for example in cryo-ET), the MLE estimator further simplifies to
which corresponds to the rotation g such that the rotated structure maximizes the correlation with y. This estimator is also frequently used in single-particle cryo-EM, under the assumption that the norm of Π(g ○ V) is approximately constant for all g. This estimator can be approximated by performing a search over a pre-defined grid of 3D rotations, selecting the rotation that minimizes the distance between the measurement y and the projected rotated volume Π(g ○ V). This approach forms the basis of standard practices in single-particle cryo-EM and cryo-ET.
2.3.2. The MAP estimator
The MAP estimator extends the MLE estimator by incorporating prior knowledge on the distribution of the rotation in the special group , dΛ(g). Formally, the MAP estimator is defined as
where the second equality follows from Bayes' law (equation 6
) and the third equality from independence of the denominator on g. In other words, maximizes the posterior density (equation 8
) over all possible rotations in . It can be seen that the MAP estimator
(equation 13
) coincides with the MLE estimator (equation 12
) when the prior distribution dΛ(g) is uniform over .
2.3.3. The MMSE estimator
For any rotation distribution Λ and loss function over
, the MLE and MAP estimators can be further improved by the corresponding Bayes estimator which minimizes the posterior expected loss
. In particular, following the definition (equation 5
) of the Bayes estimator, and for the chordal distance dF(·, ·) (equation 9
), with its corresponding squared loss , the Bayes estimator takes the form
where the expectation is taken over the posterior density (equation 8
). Here, we denote the estimator as , since minimizing the squared chordal distance is equivalent to minimizing the entrywise mean-square error between the two 3 × 3 rotation matrices. Alternative distance measures, such as the geodesic distance on
discussed earlier, could certainly be considered. In this work, however, we focus on the chordal distance, primarily because it admits a particularly convenient derivation of the associated Bayes estimator, as shown below, thereby offering both computational efficiency and interpretability. Importantly, these Bayes estimators are generally distinct from both the MLE and MAP estimators, even under a uniform prior, since it minimizes the posterior expected loss rather than selecting the most probable rotation.
To compute , we begin by noting that it is defined over the manifold
. In general, directly solving the optimization problem on a non-Euclidean manifold is challenging. To address this, we first relax the optimization domain to the ambient Euclidean space
, which leads to the following intermediate estimator:
Here, since we naturally represent as a 3 × 3 rotation matrix, the expectation (or integral) can be interpreted directly in this matrix representation, yielding a 3 × 3 matrix as the posterior mean matrix.1 However,
does not, in general, lie in
and hence may not be a valid rotation operator. To obtain a feasible solution on the manifold, we subsequently apply the orthogonal Procrustes procedure to project
back onto
, as discussed next.
Given a matrix A, to obtain a valid rotation matrix we seek that is closest to A in the Frobenius norm:
This optimization problem is known as the orthogonal Procrustes problem in the linear algebra literature. It can be solved efficiently using the singular value decomposition (SVD) of A (Gower & Dijksterhuis, 2004
). Concretely, given a matrix (in our case
), compute its singular value decomposition
The closest rotation matrix in Frobenius norm is then obtained by
Equivalently, if , one flips the sign of the last column of U (or V) before forming
UVT, ensuring
and yielding the Procrustes solution. Thus, applying the orthogonal Procrustes procedure to
projects it back onto
, yielding a valid rotation matrix. The following proposition establishes that this valid rotation matrix coincides with the MMSE estimator
, thereby completing the procedure for obtaining the MMSE estimate. A detailed proof is provided in Appendix C
.
2.3.4. The MMSE, MAP and MLE estimators in the high-SNR regime
We conclude this section by showing the next proposition. It shows that the MMSE and MAP estimators converge in the high-SNR regime (i.e. σ → 0), as demonstrated empirically in Figs. 3
, 4
and 5
. This implies that the MMSE estimator's superior statistical properties are most advantageous under low-SNR conditions, which are common in structural biology applications such as cryo-EM and cryo-ET (Bendory et al., 2020
). In these low-SNR environments, the MMSE estimator consistently outperforms its MAP counterpart.
|
Figure 3
Impact of the sampling-grid size of |
|
|
Figure 4
Comparison of 2D image recovery using the MMSE and MLE rotation estimators. Iterative image-recovery procedures with the MMSE estimator ( |
|
|
Figure 5
Comparison of 3D structure reconstruction in cryo-ET subtomogram averaging using the MMSE and MLE rotation estimators. Iterative structure-reconstruction procedures with the MMSE estimator ( |
Proposition 2
Let Λ be any distribution over with a strictly positive density, bounded below by some constant c > 0. Let
,
be the MLE and MAP rotation estimators as defined in equations (12
) and (13
), respectively, and assume they are unique. Let be the MMSE estimator as defined in equation (14
). Then, we have
The proof of the proposition is provided in Appendix D
. The proof relies on the existing result (Theorem 5.10 in Robert, 1999
) and is valid not only for rotations but also for other group operators, such as translations.
3. Numerical methods for MMSE orientation estimation
This section compares the numerical performance of the MMSE and MLE estimators. We also introduce various types of prior distributions for beyond the uniform distribution and demonstrate how incorporating this prior knowledge can significantly improve the performance of the MMSE rotation estimator. Furthermore, we study the influence of the number of sampling points of the group
of 3D rotations (namely, the number of candidate rotations), denoted as L, on the quality of rotation estimation. For better illustration, we consider the simplified setting of model (3
), where Σ = σ2Id×d, although our method applies to the general covariance setting as well.
3.1. Numerical procedure and sampling of SO(3)
Since the posterior distribution of in equation (8
) is continuous, a numerical discretization of the rotation group is required. In this work, the expectation is approximated using a quadrature rule over . Let
denote a numerical quadrature on
, constructed as a product of a spherical quadrature on
and a circular rule over the Euler angles (Gräf & Potts, 2009
; Gräf, 2012
). The implementation employed here follows the publicly available MATLAB code described in Hoskins et al. (2024
).
We first consider the uniform case, where Λ is the Haar distribution on . For each quadrature node, define
, for 0 ≤ ℓ ≤ L − 1, where
are the quadrature nodes and
are their associated weights. The MMSE estimator can then be expressed as the weighted average
where the posterior quadrature weights are defined as
For a general non-uniform prior Λ(g), the same quadrature can be applied with modified posterior weights:
followed by normalization to ensure .
Similarly, the MLE estimator in equation (12
) relies on the discretization of , which can be approximated through grid search using the uniform rotation samples or pre-defined grid
as
Both estimators require the evaluation of ||y − xℓ|| for all candidate rotations, which naively costs O(d) per candidate and thus O(Ld) overall. This estimate, however, overlooks the structure of . Any 3D rotation can be decomposed into a viewing direction (two degrees of freedom, discretized with about L2/3 samples) and an in-plane rotation about that direction (one degree of freedom, discretized with L1/3 samples). By exploiting FFT-based methods, all L1/3 in-plane rotations can be evaluated simultaneously for each viewing direction. Consequently, the effective complexity becomes
, which improves over the naive bound by a factor of L1/3 (Kostelec & Rockmore, 2008
; Kileel, Marshall et al., 2025
). This reduction is substantial in practice, since L often reaches tens or even hundreds of thousands in high-resolution cryo-EM.
3.2. The impact of non-uniform distributions of rotations
One key advantage of the Bayesian framework is its flexibility in incorporating different prior distributions for the rotations. In practical cryo-EM applications, the distribution of particle orientations is often non-uniform due to preferred particle orientations or sample-preparation artifacts (Tan et al., 2017
; Lyumkis, 2019
). If the rotation distribution can be well estimated in an early stage, the information of the rotation distribution can then be integrated into the Bayes estimator to improve rotation estimation.
To illustrate rotation estimation under different prior distributions on , we replace the uniform distribution on
with an isotropic Gaussian (IG) distribution on
, denoted by
, parameterized by a scalar variance η2. This distribution is frequently used in machine-learning probabilistic models on
(Corso et al., 2022
; Leach et al., 2022
; Jagvaral et al., 2024
). It is worth noting that the IG distribution serves as a typical example, and similar phenomena as observed here extend to other non-uniform distributions on as well.
The IG distribution can be represented in an axis–angle form, with uniformly sampled axes of rotation and a rotation angle ω ∈ [0, π]. The scalar variance η2 controls the distribution of the rotation angle ω: as η → ∞, the
distribution approaches uniformity, whereas as η → 0 ω becomes increasingly concentrated around 0, i.e. the rotation angle around the rotation axis is small. We apply the inverse sampling method to obtain independent and identically distributed (i.i.d.) samples from the IG distribution. Further details of this distribution are provided in Appendix E
.
Fig. 2
illustrates how incorporating a prior distribution over rotations, governed by the variance parameter η of an isotropic Gaussian distribution, affects the accuracy of the MMSE rotation estimator. In all cases, the true underlying rotation distribution is modeled as an isotropic Gaussian distribution
over
. The MLE estimator was computed according to equation (24
). For the MMSE estimators, the estimation process used different prior distributions with variance parameters η = 0.7, 0.5 and 0.1, respectively. Specifically, the candidate rotations g(ℓ) were generated according to these different priors (see equation 21
), highlighting the impact of prior mismatch on estimation accuracy. As the variance decreases (indicating a more concentrated and less uniform distribution closer to the true underlying distribution), the performance of the MMSE estimator improves, particularly under lower SNR conditions. These findings highlight the value of incorporating prior knowledge in rotation estimation, demonstrating its potential to enhance accuracy substantially. In stark contrast, the MLE estimator remains entirely unchanged regardless of the true underlying rotation distribution. As a result, its performance remains suboptimal, particularly in scenarios where the true distribution deviates significantly from uniformity.
We note that in experimental cryo-EM/ET datasets the viewing-angle distribution is typically unknown. Estimating this distribution, and understanding how prior misspecification impacts downstream reconstruction and heterogeneity analysis, remains an important open problem. We discuss possible approaches for estimating the rotational prior in Appendix F
. Recent work has begun to quantify misspecification effects under non-uniform group actions; see, for example, Xu et al. (2025
). In this light, Fig. 2
can be viewed as a controlled prior-mismatch study: the data are generated from a concentrated isotropic Gaussian prior, whereas the MMSE (and MAP) estimators are evaluated using an assumed prior with varying concentration. We observe a graceful degradation under mismatch, and a clear improvement as the assumed prior approaches the true distribution. Importantly, even under the common assumption of a uniform prior, the MMSE estimator improves over MLE in the low-SNR regime, while more accurate prior information yields additional gains when strong preferred orientations are present.
3.3. The impact of sampling-grid size L and SNR on estimation accuracy
To study these effects, we restrict ourselves to the uniform rotation distribution setting. Fig. 3
illustrates the impact of the sampling-grid size of , together with different levels of SNR, on the geodesic distance between the MLE and MMSE rotation estimators relative to the true rotations. Several observations can be made. Firstly, at high SNR, the MLE and MMSE estimators nearly coincide, consistent with Proposition 1
. In this regime, the geodesic error decreases with the grid resolution, scaling empirically as L1/3. This behavior arises because a discretization of involves three angular parameters, so the resolution in each parameter direction grows with L1/3. Secondly, as the SNR decreases, noise dominates and the advantage of refining the grid diminishes: both estimators approach similar performance, and the dependence on L becomes weaker. In the extreme limit σ → ∞, the mean geodesic distance of the MLE and MMSE estimators is indistinguishable and shows no dependence on the grid size L.
3.4. Sampling resolution and spectral information
Another factor that influences the discrepancy between the MAP and MMSE estimators is the effective spectral information available at a given sampling resolution. When comparing MAP and MMSE estimates across different image sizes, it is important to distinguish increasing the number of pixels from increasing usable spectral content. If the underlying signal is effectively bandlimited and the image is already sampled at (or above) the Nyquist rate for that bandlimit, then increasing the sampling resolution mainly amounts to oversampling and should not materially change the likelihood or the posterior over ; consequently, the MAP–MMSE estimators are expected to remain essentially unchanged. In contrast, when higher sampling resolution is accompanied by additional usable high-frequency signal (i.e. non-negligible per-frequency SNR at higher bands), the posterior typically becomes more concentrated and we expect the MAP and MMSE estimates to approach each other in geodesic distance (at a fixed SNR level).
4. Bayesian orientation estimation as part of the volume-reconstruction problem
Thus far, we have introduced the MLE, MAP and MMSE orientation estimators for estimating the rotation between a single noisy observation y and a reference volume V, as defined in equation (3
). An intriguing question that arises is how these estimators can be utilized and influence performance within a 3D volume-reconstruction process, which constitutes the main computational challenge in cryo-EM and cryo-ET.
Structure reconstruction typically follows two primary approaches, hard-assignment or soft-assignment methods, which are generally implemented through iterative refinement. In the hard-assignment approach, each observation is assigned a single orientation based on the highest correlation, and the 3D structure is then reconstructed given the rotations. In contrast, the soft-assignment method assigns probabilities across all possible orientations for each observation, enabling the 3D structure to be recovered as a weighted average of the observations, with weights determined by these probabilities. The iterative application of the soft-assignment procedure aligns with the expectation–maximization (EM) algorithm (Dempster et al., 1977
; Sigworth et al., 2010
), which serves as the core computational method in modern cryo-EM (Scheres, 2012b
; Punjani et al., 2017
).
In the following, we demonstrate that incorporating the MMSE orientation estimator into the volume-reconstruction process in the cryo-ET model (without projections) resembles the EM algorithm, as it accounts for the full distribution of possible outcomes. In contrast, substituting the MLE estimator into the algorithm operates more like a hard-assignment reconstruction method, focusing exclusively on the most likely outcome. This distinction underscores the broader applicability and flexibility of the Bayes estimator in capturing uncertainty and delivering more accurate estimates for structure reconstruction.
4.1. Connection to the EM algorithm in volume reconstruction without projection
Unlike the previous orientation-estimation model (equation 3
), we consider the following simplified model for volume reconstruction (see also Singer, 2018
; Bandeira et al., 2023
; Fan et al., 2024
). We observe M i.i.d. samples taking the form
where V is the 3D volume structure of interest, are unknown latent variables following an i.i.d. uniform distribution, satisfying
, and
is i.i.d isotropic Gaussian noise with variance σ2. For the case where V represents a 2D image,
corresponds to in-plane rotations and the model is used for image recovery (for more, see Section 4.2
; Ma et al., 2019
). The goal is to recover V from the observations , treating the rotations
as latent variables.
To distinguish the model (26
) from (3
), we highlight the key differences as follows.
The most common method for solving this reconstruction problem is the EM algorithm, which applies soft assignment iteratively, as outlined in Algorithm 2. In each iteration, the algorithm uses the volume estimate from the previous iteration, denoted as , to update the volume estimate
, based on the observations
. The following proposition illustrates the relationship between the volume structure update rule at iteration t + 1, and the MMSE orientation estimator introduced in equation (14
).
To state the next result, we first introduce the MMSE back-rotation operator. Given an observation yi and the current reference volume , let
pi(t)(g) denote the posterior weights over
as defined in equation (27). We define
as the posterior average of the inverse action g−1,
which should be interpreted as a linear operator acting on functions (or volumes). Namely, for any test function f on which rotations act,
Proposition 3 (MMSE operator form of the EM M-step)
Let be the (t + 1)th volume estimator in the EM algorithm as described in Algorithm 2, and let
be defined as in equations (29
) and (30
). The M-step update can then be written as
The proof of this proposition is presented in Appendix G
. In words, the proposition shows that the update rule for the volume structure estimation at iteration t + 1, given the volume , is equivalent to aligning each observation
using the associated back-rotation MMSE operator
, computed based on the reference volume
, and then averaging the aligned observations (equation 31
). Thus, the MMSE back-rotation operator is a key ingredient in the EM algorithm for volume reconstruction.
Remark 1
The MMSE back-rotation operator (equation 29
) should be distinguished from the MMSE rotation matrix estimator obtained in equation (14
) by taking a posterior mean in the 3 × 3 matrix embedding and then projecting onto (via orthogonal Procrustes). In the matrix formulation, one forms
as in equation (15
) and then projects to obtain a single element. In contrast, in equation (30
) the inversion is handled inside the posterior average: the expectation is taken over the inverse actions g−1 ○ (·), yielding a linear operator that acts directly on yi [and is not, in general, itself a member of ]. While these two objects differ in representation, they are both posterior-weighted aggregations over rotations; the operator form is the one that naturally appears in the EM M-step for reconstructing the volume.
Despite the subtle differences arising from rotation estimation versus structure reconstruction, the previously introduced MMSE rotation estimator (equation 14
) can still serve as a practical tool for assessing the performance of soft assignments. Concretely, we take the posterior rotation matrix mean as in equation (29
) and then perform an orthogonal Procrustes projection to replace . In this way, the MMSE rotation estimator provides a concrete approximation to the soft assignment in structure reconstruction. In numerical experiments, it is compared with the hard assignment, demonstrating superior performance relative to the maximum-likelihood estimator, as discussed below.
4.1.1. The MLE estimator as part of volume reconstruction
In contrast to the soft-assignment procedure, if we replace the MMSE operator with the corresponding MLE operator, we obtain the structure-reconstruction algorithm by applying hard assignment iteratively. To be more specific, the MLE operator applied to the observations can be viewed as a hard assignment among all possible rotations. In practice, this procedure involves making a hard decision where a single rotation is selected from the rotation grid according to the closest alignment. In the (t + 1)th iteration, similarly to equation (31
), the hard-assignment process can be expressed as follows:
where is defined by
In other words, where
Hence, the hard assignment can be viewed as using the exact inverse of the MLE estimator introduced earlier.
4.2. Empirical results for volume reconstruction and the `Einstein from Noise' phenomenon
We demonstrate volume reconstruction empirically by applying the MLE estimator and the MMSE estimator
as part of the reconstruction problem, as specified in Section 4.1
.
4.2.1. Description of the experiments
We demonstrate the reconstruction processes, which integrates MLE and MMSE rotation estimators as intermediate steps, using the two examples of 2D image recovery (Fig. 4
) and 3D volume reconstruction without projection (Fig. 4
). The iterative reconstruction process, as outlined in Algorithm 2, was performed until convergence (i.e. when the relative difference between consecutive iterations fell below a predefined threshold of 10−3 or until a maximum of 100 iterations was reached).
Figs. 4
and 5
were generated using slightly different methods. For the 2D experiment presented in Fig. 4
we used polar coordinates, while in the 3D experiment, shown in Fig. 5
, we used a standard Cartesian basis. The primary difference lies in the interpolation required for producing Fig. 5
, which utilizes cubic interpolation for each observation based on the estimated rotation. This introduces certain `quantization' errors. Additionally, as detailed in Section 2
, the 3D reconstruction process presented in Fig. 5
requires a `rounding' step which amounts to solving the orthogonal Procrustes procedure. In Fig. 4
, the true and template structures were generated in a polar representation with d = 300 radial points and L = 30 polar angle points. The reconstruction process was performed with M = 5 × 104 observations. The additive noise was added in the polar representation. In Fig. 5
, M = 3000 observations were used, with a rotation group grid size of L = 300 and a volume size of 32 × 32 × 32.
4.2.2. Empirical observations
A few observations can be made from Figs. 4
and 5
. Firstly, in the case of high SNR (i.e., as σ → 0), the volume reconstruction is similar whether using the MLE estimator or the MMSE estimator. This similarity is theoretically supported by Proposition 2
. However, as the SNR decreases the reconstructions diverge, with the volumes reconstructed using the MMSE estimator showing a better correlation with the true volume. Secondly, in scenarios of extremely low SNR, where the structural signal is nearly nonexistent, the phenomenon known as `Einstein from Noise' manifests in both 2D and 3D contexts. This phenomenon pertains to the inherent model bias within the reconstruction procedure, specifically in relation to the initial templates. In such cases, the reconstructed volume exhibits structural similarities to the initial template, even though the observations do not substantiate this outcome. The generation of a structured image from entirely noisy data has attracted considerable attention, particularly during a significant scientific debate regarding the structure of an HIV molecule (Mao, Castillo-Menendez et al., 2013
; Henderson, 2013
; van Heel, 2013
; Subramaniam, 2013
; Mao, Wang et al., 2013
); for a comprehensive description and statistical analysis, see Balanov et al. (2024
), Balanov, Bendory et al. (2025
), Balanov, Huleihel et al. (2025
) and Balanov, Zabatani et al. (2025
). Notably, our empirical evidence suggests that the `Einstein from Noise' phenomenon is more pronounced when adopting the MLE estimator compared with the MMSE estimator, implying that the MMSE approach is less vulnerable to the choice of the initial template. Furthermore, our experiment suggests that the advantage of using the Bayesian MMSE estimator over the MLE estimator is more significant in 3D structure-reconstruction tasks compared with 2D image recovery, where the 3D setting is a problem of greater interest to researchers in structural biology.
5. Structural heterogeneity analysis
Understanding heterogeneity is central to revealing the dynamic behavior of macromolecular complexes in cryo-EM. Unlike traditional 3D classification methods that assume a small number of discrete states, modern approaches aim to capture continuous structural variability by embedding projection images into a latent conformational space. Here, we adopt the RECOVAR framework (Gilles & Singer, 2025
), which performs heterogeneity analysis using principal component analysis (PCA) based on a regularized estimate of the conformational covariance matrix. The key idea is to infer the covariance of 3D volumes directly from noisy 2D projection images, leveraging the fact that although the volumes themselves are unobserved, their projections contain sufficient statistical information.
5.1. Fixed-pose methods
A central computational assumption in RECOVAR is that the poses {ϕi} of all particle images are known and fixed, typically obtained from a consensus refinement procedure. This `fixed-pose' assumption underlies many modern heterogeneity analysis pipelines, including 3D classification in cryoSPARC (Punjani et al., 2017
), 3D variability analysis (3DVA; Punjani & Fleet, 2021
), 3DFlex (Punjani & Fleet, 2023
), cryoDRGN (Zhong et al., 2021
), CryoDRGN-AI-fixed (Levy et al., 2025
), Opus-DSD (Luo et al., 2023
) and RECOVAR (Gilles & Singer, 2025
). These methods are collectively referred to as fixed-pose methods, since they treat poses as known inputs during downstream inference. Among them, RECOVAR has recently emerged as one of the most effective approaches. According to the CryoBench evaluation (Jeon et al., 2024
), which benchmarks multiple methods across diverse synthetic and experimental datasets, RECOVAR consistently achieves state-of-the-art performance in both structural resolution and latent space recovery. Notably, CryoBench evaluations were performed by external users, indicating RECOVAR's strong out-of-the-box performance and robustness to hyperparameter choices. This motivates our choice to build upon the RECOVAR framework.
Despite its strong empirical performance, the fixed-pose assumption merits careful scrutiny. In practice, ground-truth poses are never directly accessible and cannot be recovered exactly, even if the underlying volume is known, due to noise and the intrinsic ill-posedness of the inverse problem. Most existing methods address this by using MLE pose estimates in real data. This gap between assumed and achievable pose accuracy motivates a central question: to what extent can more accurate pose estimation improve downstream heterogeneity recovery? In this work, we address this question by replacing MLE poses with our MMSE estimator within RECOVAR.
5.2. The mathematical model
The forward model (equation 1 in Gilles & Singer, 2025
) used in RECOVAR is expressed in the Fourier domain as
where is the observed Fourier-transformed image (with N denoting the grid size along each spatial axis),
is the (unknown) 3D Fourier volume corresponding to a particular conformation,
is the tomographic projection operator from 3D to 2D after a rigid-body motion ϕi = (gi, ti) with
a rotation and
an in-plane shift, Ci is the contrast transfer function (CTF) and
is additive noise.
This formulation is a discretized and sampled version of our continuous forward model in (39
). Specifically, the operator acting on
corresponds directly to
, where each component reflects the same physical process, namely, in-plane shift, 3D rotation and evaluation on the central slice in Fourier space.
To model structural variability, we treat each underlying volume as a random sample from a distribution over 3D conformations. We assume this distribution has a well defined mean
and covariance
given by
where the expectation is taken over the conformational distribution. The objective of RECOVAR is to estimate both μ and Σ from the observed projections with given poses ϕi, using regularized least-squares minimization (equations 2 and 3 in Gilles & Singer, 2025
).
5.3. Numerical experiments
In our experiments, we consider a synthetic one-dimensional conformational transition, a standard benchmark for continuous heterogeneity analysis methods such as cryoDRGN (Zhong et al., 2021
) and RECOVAR (Gilles & Singer, 2025
). The corresponding ground-truth density maps along the conformational coordinate are shown in Fig. 6
(a). To generate the projection images, the underlying conformational states were uniformly sampled along the one-dimensional transition, and molecular orientations were sampled from a uniform distribution over . Fig. 6
(b) further displays a representative noisy projection under high-noise conditions.
|
Figure 6
MMSE pose estimation improves heterogeneity reconstruction over MLE. (a) Ground-truth density maps from a synthetic dataset simulating a one-dimensional conformational transition, a standard benchmark for methods such as cryoDRGN (Zhong et al., 2021 |
To avoid storing and manipulating the full high-dimensional covariance matrix , RECOVAR first estimates a low-dimensional subspace of rank
in which the covariance is approximated. Following Appendix A.2 in Gilles & Singer (2025
), we estimate via regularized least squares, and then form a rank-d approximation
by selecting a subset of columns using a greedy SNR-based criterion that ensures both high SNR and low inter-column correlation. An orthonormal basis
spanning
is obtained via randomized SVD. The conformational mean
is also estimated at this stage. Each conformational volume is then represented as
where is the estimated conformational mean from the first stage and
are low-dimensional coordinates. In RECOVAR, these coordinates are not explicitly recovered; their covariance matrix
is estimated directly via a reduced least-squares problem (equation 14 in Gilles & Singer, 2025
), yielding a low-rank covariance approximation that characterizes the variability of zi. The eigen-decomposition of then provides the estimated principal components and eigenvalues in the reduced space. The complete procedure is detailed in Algorithm 1 of Gilles & Singer (2025
).
We perform our analyses on a controlled synthetic dataset of 30 000 projection images (N = 128), simulated from a one-dimensional conformational transition discretized into 50 equally spaced states to approximate continuous heterogeneity (Gilles & Singer, 2025
). We first evaluate the effect of orientation estimation on covariance recovery. The study is conducted in the high-noise regime of RECOVAR (Fig. 6
b), where even with ground-truth poses, the top 30 principal components capture only about 40% of the variance due to finite-sample and high-noise effects. We set r = 50 and compare three pose sources, ground truth, MMSE estimates and MLE estimates, keeping all other steps identical. For fixed-pose methods, poses are treated as given, so the computational cost remains unchanged.
The accuracy of the estimated subspace is quantified by the percentage of total variance captured,
where Uk contains the first k estimated principal components, Σ is the ground-truth covariance matrix and Σ1/2 is its matrix square root. Here, is the Schatten 1-norm of matrix A with σi denoting the singular values of A.
Figs. 6
(e) and 6
(f) present the variance-capture results as described and the corresponding eigenvalue recovery, respectively. Fig. 6
(d) visualizes the first five recovered principal components. Overall, MMSE pose estimation yields principal components and eigenvalue spectra that are consistently closer to the ground truth than those obtained from MLE pose estimation.
We further perform a more granular evaluation of structure reconstruction at the level of individual conformational states. For this analysis, we select all projection images corresponding to a specific conformational state and embed them into the latent conformational space (z-space; see equation 37
) using RECOVAR, conditioned on different pose priors: ground truth, MMSE estimates and MLE estimates. For each subset, the reconstruction corresponding to the mean embedding serves as an estimate of that specific conformational state. These reconstructions provide a direct measure of how different orientation estimates affect the accuracy of heterogeneity structure reconstruction. Fig. 6
(c) demonstrates that MMSE orientation estimates produce reconstructed structures closer to the ground-truth conformation than those obtained from MLE, further highlighting the advantage of MMSE pose estimation in high-noise heterogeneity analysis. We further compared the reconstructions obtained using MMSE and MLE orientation estimates against the ground-truth conformational state. The reported local FSC resolution scores (in Å) show that the 50% quantile resolutions are 11.45 and 11.55 Å for MMSE and MLE, respectively, while the 90% quantiles are 18.83 and 21.43 Å (see also Supplementary S.J. in Gilles & Singer, 2025
).
6. Discussion and conclusions
In this work, we have introduced the Bayesian framework for enhancing orientation estimation for various applications in structural biology. The proposed approach offers greater flexibility and improved accuracy compared with existing methods, with the MMSE estimator as a prime example. This technique handles diverse structural conformations and arbitrary rotation distributions across sample sets. Our empirical results establish that the proposed MMSE estimator consistently surpasses the performance of the current methods, particularly in challenging low-SNR environments, as well as when prior information on rotation distribution is available or approximately known. We provide a theoretical foundation to explain these performance gains. As rotation determination is crucial for both 2D and 3D reconstruction processes, we further illustrate how utilizing the MMSE estimator as a soft-assignment step in iterative refinement leads to significant improvements over hard-assignment methods. Moreover, the proposed Bayesian approach empirically offers enhanced resilience against the `Einstein from Noise' phenomenon, effectively reducing model bias and improving the overall reliability of structural reconstructions. Thus, our main recommendation is to adopt the Bayesian MMSE rotation estimator over the MLE estimator in related application scenarios. Already integrated into most software, the Bayesian rotation estimator can be easily implemented with minimal computational cost, offering improved accuracy and resilience for orientation determination and structure reconstruction.
6.1. Future work
The MMSE orientation estimator represents the most natural Bayes estimator in the context of cryo-EM, as it coincides with the M-step reconstruction of the EM algorithm. Nevertheless, alternative Bayes estimators associated with different loss functions could also be explored in future studies. A particularly promising direction is the direct estimation of the rotation distribution from observed images and the incorporation of this information into rotation estimation or EM-based 3D reconstruction algorithms (Janco & Bendory, 2022
; Xu et al., 2025
). Such approaches could substantially improve rotation accuracy, especially in low-SNR regimes, as indicated by our results in Fig. 2
, and consequently enhance volume-reconstruction quality. Further extensions may involve modeling structural uncertainties, which is particularly relevant for flexible or heterogeneous proteins, and addressing general pose-estimation problems that jointly consider rotations and translations. Finally, leveraging the Bayesian framework to derive confidence regions for individual rotations presents an exciting opportunity to improve the interpretability and reliability of rotation estimation in cryo-EM.
APPENDIX A
Full cryo-EM model
The (homogeneous) cryo-EM imaging process can be described by a comprehensive generative model that incorporates the key physical components of image formation, including random orientations, in-plane shifts, tomographic projection and contrast transfer function (CTF).
Formally, each observed 2D image yi is modeled as
where V denotes the underlying 3D volume, gi denotes the random rotation, Π is the tomographic projection operator (i.e. line integration along the x3 axis),
Tti denotes the in-plane shift operator with shift , * denotes the 2D convolution in the image plane, hi is the point-spread function (whose Fourier transform is commonly known as the CTF) and ɛi denotes the measurement noise. Equivalently, by applying the Fourier transform and using the Fourier slice theorem, we obtain the forward model in the Fourier domain,
where are the 2D Fourier transforms of yi, hi, ɛi, respectively, and
is the 3D Fourier transform of V.
In practice, both the images and the underlying volume are discretized and sampled on finite grids. Depending on whether the model is implemented in the spatial or Fourier domain, different but equivalent discretized versions arise. Specifically:
|
Here, N denotes the resolution of the reconstructed grid: each image consists of N × N pixels, and the volume is represented on an N × N × N voxel grid.
In the heterogeneous case, the underlying volume V may itself be a realization drawn from a distribution over conformational states , where
represents a (possibly low-dimensional) distribution capturing the variability in molecular conformations, such as discrete states and continuous manifolds. This setting gives rise to the heterogeneous cryo-EM problem, where the goal is to infer not just a single structure, but the distribution
from noisy 2D projections.
APPENDIX B
Bayes pose estimation with rotations and shifts
To account for in-plane shifts, we model the latent pose as , where g denotes the 3D rotation and t denotes the in-plane translation. Given a volume V, the forward model can be written as
where is the cryo-EM/ET forward operator including both rotation and shift [for example in the Fourier domain the shift acts via the phase factor exp(−2πiω·t); see Appendix A
]. We assume priors Λ(g) on rotations and π(t) on shifts. The joint posterior then takes the form
B1. Additive loss function
To define a Bayes estimator, we specify a loss function on . Throughout, we adopt an additive pose loss,
which penalizes rotation and translation errors separately (optionally with application-dependent weights to reconcile units). Under equation (37), the Bayes pose estimator is defined by
where the expectation is taken with respect to the joint posterior (equation 36). Because the loss is additive, the minimization in equation (38) decomposes into two marginal Bayes estimators:
B2. Closed-form MMSE rules under squared loss
For translations, choosing Euclidean squared loss yields the posterior mean (similarly to equation 15
),
where p(g, t ∣ y) is defined through equation (41)
. For rotations, when g is represented by its 3 × 3 matrix and we take the squared chordal/Frobenius loss , the corresponding Bayes estimator is obtained by first forming the relaxed posterior mean matrix (similarly to equation 15
),
and then projecting it onto via the orthogonal Procrustes map (similarly to equation 19
),
which returns the closest valid rotation matrix (in Frobenius norm) to .
APPENDIX C
Alignment between two noisy volumes
The model described in equation (3
) can be generalized to address the alignment problem between two noisy volumes. In equation (3
), a single observation y was assumed to follow the model, with the volume V considered deterministic and known. For aligning two noisy volumes, we now consider two observations yi for i = 1, 2, given by
where for a positive-definite covariance matrix Σd×d and gi are drawn from a distribution Λ over
. We assume that ε1, ε2, g1, g2 are independent and that the volume V is deterministic but unknown. Furthermore, we assume that the noise statistics are invariant under the group actions of
, meaning the distributions of ε and g ○ ε are identical for any g ∈
. The objective is to determine the `best' alignment
between y1 and y2, according to criteria such as the MLE, MAP or MMSE estimators.
From equation (49
), the unknown volume V can be expressed as
Substituting equation (50
) into equation (49
) yields
where equation (53
) follows from the associativity of group actions, and in equation (53
) we have used . Since the noise statistics are invariant under group actions, it follows that
.
Thus, the problem of estimating the relative rotation between y1 and y2 is similar to estimating y2 under the assumption that V is known. However, the covariance of the additive noise ε in this estimation is doubled compared with the case when V is known.
APPENDIX D
Proof of Proposition 1![[link]](../../../../../../logos/arrows/d_arr.gif)
By the definition of (equation 14
), we have
Then, as are fixed for any
, we have
As a consequence, equation (57
) is equal to the rotation that has the highest correlation with the relaxed estimator
as in equation (15
). It follows that
Thus, is exactly the Procrustes procedure performed on the intermediate estimator
.
APPENDIX E
Proof of Proposition 2![[link]](../../../../../../logos/arrows/d_arr.gif)
We split the proof into two parts. Firstly, we show that
Secondly, we show that
For equation (58
), the argument is straightforward. Since the prior density Λ(g) is bounded below by some positive constant, the term becomes negligible as σ → 0 compared with the other term. Consequently, the MAP estimator converges to the MLE in this limit.
For equation (59
), we apply the following theorem (Corollary 5.11 in Robert, 1999
).
Theorem 1
Consider h a real-valued function defined on a closed and bounded set Θ of . Let π be a positive density on Θ. If there exists a unique solution θ* satisfying
then
provided h is continuous at θ*.
Let h(g) = −||y − Π(g ○ V)||2 and π(g)dg = dΛ(g). We immediately obtain
Finally, we note that is obtained by applying the orthogonal Procrustes procedure to project
back onto
. Since
is itself an element of
, it follows naturally that
APPENDIX F
Non-uniform distribution of rotations
F1. Isotropic Gaussian distribution of SO(3) rotations
The IG distribution , parameterized by the scalar variance η2, can be represented in an axis–angle form, with uniformly sampled axes and a rotation angle ω ∈ [0, π], having the following probability density function (Savjolova, 1985
):
Notably, the uniform distribution on , denoted
, corresponds to uniformly sampled axes and is characterized by the density function f(ω) = (1 − cosω)/π, which represents the limiting case of η → ∞ in equation (64
).
To provide intuition for the spread induced by η, consider the concentrated regime . In this regime, one may use the standard small-angle approximation in which a rotation is represented by a rotation vector
with
so that ω is approximately Maxwell distributed with scale parameter η (Johnson et al., 1995
). Consequently, its standard deviation is
For example, η = 0.1 corresponds to Std(ω) ≈ 0.067 rad ≈ 3.8°.
For η > 1, the series converges quickly, and ℓmax = 5 is typically sufficient for sub-percent accuracy. However, as η decreases, convergence becomes slower, making it inefficient for modeling concentrated distributions. Fortunately, this series has been well studied, and an excellent approximation for η < 1 has been proposed in the literature (Matthies et al., 1988
), given by the following closed-form approximation:
The method for generating samples according to the isotropic Gaussian distribution on relies on the inverse sampling theorem. This technique enables sampling from a specified probability distribution by utilizing its cumulative distribution function (CDF). Specifically, it involves generating a uniform random number between 0 and 1 and then applying the inverse CDF to obtain a sample from the target distribution (Savjolova, 1985
).
F2. Estimating the rotational prior
A key practical ingredient in Bayesian pose estimation is the prior over rotations. From an algorithmic standpoint, two complementary approaches are currently being developed to handle unknown viewing-angle distributions. Firstly, building on EM-type refinement, recent work proposes the introduction of an explicit parameterization of the orientation distribution over and joint optimization of the structure and the viewing-angle distribution in the M-step, effectively recovering the rotational prior from the data (Xu et al., 2025
). This perspective generalizes Bayesian EM refinement pipelines such as RELION by treating the pose prior as an estimable object that can be learned jointly with the structure during refinement (Scheres, 2012b
). Secondly, method-of-moments approaches relax the uniformity assumption and aim to recover both the structure and the unknown viewing-angle distribution from low-order statistics, typically via alternating optimization between the two components (Sharon et al., 2020
; Kileel, Mickelin et al., 2025
).
In practice, we recommend starting with a uniform (or weak) prior when little is known about the rotational prior, and then optionally estimating a smoothed viewing-angle distribution from an initial refinement and plugging it into the MMSE estimator; simple sensitivity checks (uniform versus learned priors, and varying regularization) can help diagnose the effect of preferred orientations. We emphasize, however, that principled estimation of the rotational prior, and a full theory of robustness to prior misspecification in cryo-EM/ET, remains an important open direction for future work.
APPENDIX G
Proof of Proposition 3![[link]](../../../../../../logos/arrows/d_arr.gif)
Following equation (28), the update law (M-step) of the (t + 1)th iteration of the expectation–maximization algorithm is given by
where
As the action by the group element g preserves the norm, we have
Thus, substituting equation (70)
into equation (68)
leads to
As the term ||g ○ yi||2 is independent of V, equation (71)
can be simplified to
Let us denote by h(V) the right-hand side of equation (72)
:
Taking the first-order condition with respect to V gives
As for every i, we can simplify equation (74)
to obtain
Thus, using the definition (30
) we obtain
which proves the proposition.
Acknowledgements
We thank Marc Auréle Gilles for providing guidance with the RECOVAR software. We also thank Marc Auréle Gilles, Eric Verbeke and Ruiyi Yang for their helpful discussions.
Data availability
The detailed implementation and code are available at https://github.com/AmnonBa/bayesian-orientation-estimation.
Funding information
AS and SX are supported in part by AFOSR under grant FA9550-23-1-0249, the Simons Foundation Math+X Investigator Award, NSF under grant DMS 2510039 and NIH/NIGMS under grant R01GM136780-01. TB is supported in part by BSF under grant 2020159, in part by NSF–BSF under grant 2024791, in part by ISF under grant 1924/21 and in part by a grant from The Center for AI and Data Science at Tel Aviv University (TAD).
References
Bai, X.-C., McMullan, G. & Scheres, S. H. W. (2015). Trends Biochem. Sci. 40, 49–57.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Balanov, A., Bendory, T. & Huleihel, W. (2025). IEEE Trans. Inf. Theory, 71, 8871–8898.
Web of Science
CrossRef
Google Scholar
Balanov, A., Huleihel, W. & Bendory, T. (2024). arXiv:2407.05277.
Google Scholar
Balanov, A., Huleihel, W. & Bendory, T. (2025). arXiv:2505.21435.
Google Scholar
Balanov, A., Zabatani, A. & Bendory, T. (2025). arXiv:2507.03951.
Google Scholar
Bandeira, A. S., Blum-Smith, B., Kileel, J., Niles-Weed, J., Perry, A. & Wein, A. S. (2023). Appl. Comput. Harmon. Anal. 66, 236–319.
Web of Science
CrossRef
Google Scholar
Bartesaghi, A., Matthies, D., Banerjee, S., Merk, A. & Subramaniam, S. (2014). Proc. Natl Acad. Sci. USA, 111, 11709–11714.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Bendory, T., Bartesaghi, A. & Singer, A. (2020). IEEE Signal Process. Mag. 37, 58–76.
Web of Science
CrossRef
PubMed
Google Scholar
Corso, G., Stärk, H., Jing, B., Barzilay, R. & Jaakkola, T. (2022). arXiv:2210.01776.
Google Scholar
Dempster, A. P., Laird, N. M. & Rubin, D. B. (1977). J. R. Stat. Soc. Ser. B Stat. Methodol. 39, 1–22.
CrossRef
Web of Science
Google Scholar
Donnat, C., Levy, A., Poitevin, F., Zhong, E. D. & Miolane, N. (2022). J. Struct. Biol. 214, 107920.
Web of Science
CrossRef
PubMed
Google Scholar
Fan, Z., Lederman, R. R., Sun, Y., Wang, T. & Xu, S. (2024). Ann. Stat. 52, 52–77.
Web of Science
PubMed
Google Scholar
Gilles, M. A. & Singer, A. (2025). Proc. Natl Acad. Sci. USA, 122, e2419140122.
Web of Science
CrossRef
PubMed
Google Scholar
Gower, J. C. & Dijksterhuis, G. B. (2004). Procrustes Problems. Oxford University Press.
Google Scholar
Gräf, M. (2012). Adv. Comput. Math. 37, 379–392.
Google Scholar
Gräf, M. & Potts, D. (2009). Numer. Funct. Anal. Optim. 30, 665–688.
Google Scholar
Harpaz, Y. & Shkolnisky, Y. (2023). Biol. Imaging, 3, e8.
CrossRef
PubMed
Google Scholar
Hartley, R., Trumpf, J., Dai, Y. & Li, H. (2013). Int. J. Comput. Vis. 103, 267–305.
Web of Science
CrossRef
Google Scholar
Henderson, R. (2013). Proc. Natl Acad. Sci. USA, 110, 18037–18041.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Hoskins, J., Khoo, Y., Mickelin, O., Singer, A. & Wang, Y. (2024). arXiv:2410.06889.
Google Scholar
Huynh, D. Q. (2009). J. Math. Imaging Vis. 35, 155–164.
Web of Science
CrossRef
Google Scholar
Jagvaral, Y., Lanusse, F. & Mandelbaum, R. (2024). AAAI'24/IAAI'24/EAAI'24: Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence, pp. 12754–12762. New York: ACM.
Google Scholar
Janco, N. & Bendory, T. (2022). IEEE Trans. Signal Process. 70, 3237–3248.
Web of Science
CrossRef
Google Scholar
Jeon, M., Raghu, R., Astore, M., Woollard, G., Feathers, R., Kaz, A., Hanson, S. M, Cossio, P. & Zhong, E. D. (2024). Adv. Neural Inf. Process. Syst., 37, 89468–89512.
Google Scholar
Johnson, N. L., Kotz, S. & Balakrishnan, N. (1995). Continuous Univariate Distributions, Vol. 2, 2nd ed. Hoboken: Wiley.
Google Scholar
Kay, S. M. (1993). Fundamentals of Statistical Signal Processing: Estimation Theory. Englewood Cliffs: Prentice-Hall.
Google Scholar
Kileel, J., Marshall, N. F., Mickelin, O. & Singer, A. (2025). SIAM J. Sci. Comput. 47, A1117–A1144.
Google Scholar
Kileel, J., Mickelin, O., Singer, A. & Xu, S. (2025). arXiv:2511.07438.
Google Scholar
Kimanius, D., Dong, L., Sharov, G., Nakane, T. & Scheres, S. H. W. (2021). Biochem. J. 478, 4169–4185.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Kostelec, P. J. & Rockmore, D. N. (2008). J. Fourier Anal. Appl. 14, 145–179.
Web of Science
CrossRef
Google Scholar
Leach, A., Schmon, S. M., Degiacomi, M. T. & Willcocks, C. G. (2022). Tenth International Conference on Learning Representations 2022.
Google Scholar
Lehmann, E. L. & Casella, G. (2006). Theory of Point Estimation. New York: Springer.
Google Scholar
Levy, A., Raghu, R., Feathers, J. R., Grzadkowski, M., Poitevin, F., Johnston, J. D., Vallese, F., Clarke, O. B., Wetzstein, G. & Zhong, E. D. (2025). Nat. Methods, 22, 1486–1494.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Li, B., Zhu, D., Shi, H. & Zhang, X. (2021). J. Struct. Biol. 213, 107783.
Web of Science
CrossRef
PubMed
Google Scholar
Luo, Z., Ni, F., Wang, Q. & Ma, J. (2023). Nat. Methods, 20, 1729–1738.
Web of Science
CrossRef
PubMed
Google Scholar
Lyumkis, D. (2019). J. Biol. Chem. 294, 5181–5197.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Ma, C., Bendory, T., Boumal, N., Sigworth, F. & Singer, A. (2019). IEEE Trans. Image Process. 29, 1699–1710.
Web of Science
CrossRef
Google Scholar
Mäeots, M.-E. & Enchev, R. I. (2022). Acta Cryst. D78, 927–935.
Web of Science
CrossRef
IUCr Journals
Google Scholar
Mao, Y., Castillo-Menendez, L. R. & Sodroski, J. G. (2013). Proc. Natl Acad. Sci. USA, 110, E4178–E4182.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Mao, Y., Wang, L., Gu, C., Herschhorn, A., Désormeaux, A., Finzi, A., Xiang, S. H. & Sodroski, J. G. (2013). Proc. Natl Acad. Sci. USA, 110, 12438–12443.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Matthies, S., Muller, J. & Vinel, G. W. (1988). Texture Stress Microstruct. 10, 77–96.
CrossRef
Web of Science
Google Scholar
Punjani, A. & Fleet, D. J. (2021). J. Struct. Biol. 213, 107702.
Web of Science
CrossRef
PubMed
Google Scholar
Punjani, A. & Fleet, D. J. (2023). Nat. Methods, 20, 860–870.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Punjani, A., Rubinstein, J. L., Fleet, D. J. & Brubaker, M. A. (2017). Nat. Methods, 14, 290–296.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Robert, C. P. (1999). Monte Carlo Statistical Methods. New York: Springer.
Google Scholar
Rosenthal, P. B. & Henderson, R. (2003). J. Mol. Biol. 333, 721–745.
Web of Science
CrossRef
PubMed
CAS
Google Scholar
Savjolova, T. I. (1985). Metallurgija (Moscow), 4, 6.
Google Scholar
Scheres, S. H. W. (2012a). J. Mol. Biol. 415, 406–418.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Scheres, S. H. W. (2012b). J. Struct. Biol. 180, 519–530.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Sharon, N., Kileel, J., Khoo, Y., Landa, B. & Singer, A. (2020). Inverse Probl. 36, 044003.
Web of Science
CrossRef
PubMed
Google Scholar
Sigworth, F. J., Doerschuk, P. C., Carazo, J.-M. & Scheres, S. H. W. (2010). Methods Enzymol. 482, 263–294.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Singer, A. (2018). Proceedings of the International Congress of Mathematicians (ICM 2018), pp. 3995–4014. Singapore: World Scientific.
Google Scholar
Singer, A. & Yang, R. (2024). Biol. Imaging, 4, e5.
CrossRef
PubMed
Google Scholar
Sorzano, C. O. S., Jiménez, A., Mota, J., Vilas, J. L., Maluenda, D., Martínez, M., Ramírez-Aportela, E., Majtner, T., Segura, J., Sánchez-García, R., Rancel, Y., del Caño, L., Conesa, P., Melero, R., Jonic, S., Vargas, J., Cazals, F., Freyberg, Z., Krieger, J., Bahar, I., Marabini, R. & Carazo, J. M. (2019). Acta Cryst. F75, 19–32.
Web of Science
CrossRef
IUCr Journals
Google Scholar
Subramaniam, S. (2013). Proc. Natl Acad. Sci. USA, 110, E4172–E4174.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Tan, Y. Z., Baldwin, P. R., Davis, J. H., Williamson, J. R., Potter, C. S., Carragher, B. & Lyumkis, D. (2017). Nat. Methods, 14, 793–796.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Toader, B., Sigworth, F. J. & Lederman, R. R. (2023). J. Mol. Biol. 435, 168020.
Web of Science
CrossRef
PubMed
Google Scholar
van Heel, M. (2013). Proc. Natl Acad. Sci. USA, 110, e4175.
Web of Science
CrossRef
PubMed
Google Scholar
Watson, A. J. I. & Bartesaghi, A. (2024). Curr. Opin. Struct. Biol. 87, 102861.
Web of Science
CrossRef
PubMed
Google Scholar
Wong, W., Bai, X., Brown, A., Fernandez, I. S., Hanssen, E., Condron, M., Tan, Y. H., Baum, J. & Scheres, S. H. W. (2014). eLife, 3, e03080.
Web of Science
CrossRef
PubMed
Google Scholar
Xu, S., Zhang, A. Y. & Singer, A. (2025). arXiv:2509.22945.
Google Scholar
Zhang, P. (2019). Curr. Opin. Struct. Biol. 58, 249–258.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
Zhong, E. D., Bepler, T., Berger, B. & Davis, J. H. (2021). Nat. Methods, 18, 176–185.
Web of Science
CrossRef
CAS
PubMed
Google Scholar
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.

journal menu
access



