Reconstructing three-dimensional protein crystal intensities from sparse unoriented two-axis X-ray diffraction patterns

To simulate the signal level of serial microcrystallography experiments at storage ring sources, data frames were collected from a large lysozyme crystal rotated about two orthogonal axes and illuminated by a dim X-ray source. Using the EMC algorithm, this study shows that three-dimensional intensity reconstruction is still feasible even without the knowledge of the crystal orientation in each data frame.


Introduction
The advance of serial femtosecond microcrystallography (SFX) at X-ray free-electron lasers (XFELs) (Chapman et al., 2011;Boutet et al., 2012) allows structure determination with protein crystals whose sizes are too small for conventional crystallography experiments. SFX is based on injecting a sequence of randomly oriented microcrystals to intercept a train of X-ray pulses. The tens of femtoseconds long pulse width enables the photon scattering process to outrun the radiation damage of the crystals, while the ultra-high brightness of the pulses results in a sufficient number of resolvable Bragg peaks collected by a fast-framing detector (Philipp et al., 2008) for indexing. Using this concept of 'diffract before destroy' (Neutze et al., 2000), a complete dataset can be obtained given enough indexed data frames. SFX has the advantage of rapid data collection owing to the high repetition rate of the X-ray pulses and provides a promising means to study proteins that do not readily form large single crystals.
Despite the success of SFX, the paucity of XFEL beamtime has inspired interest in adapting serial microcrystallography (SMX) experiments to storage ring (SR) sources (Gati et al., 2014;Stellato et al., 2014;Heymann et al., 2014;Nogly et al., 2015;Botha et al., 2015;Gruner & Lattman, 2015;Schubert et al., 2016). Radiation damage cannot be outrun in the same way at SR sources, thereby much more strongly limiting the dose per protein that can be tolerated. The limit to the ISSN 1600-5767 smallest usable crystal size in SMX is the ability to index each data frame, since frames collected from very small crystals will have so few diffracted photons that Bragg peaks will not be obviously identifiable. Because successfully indexing a frame usually requires at least 20-30 resolvable Bragg peaks, under conventional processing schemes data frames that are too weak to identify this number of Bragg peaks would be discarded.
Instead of determining the orientation of each data frame individually, the expand-maximize-compress (EMC) algorithm (Loh & Elser, 2009) seeks to reconstruct a consistent three-dimensional intensity model using all the data frames simultaneously. The EMC algorithm treats the orientation of each data frame as a probability distribution conditional on the current model and iteratively updates the model by maximizing the associated likelihood function. The validation of its probabilistic modeling of orientations has been demonstrated in many proof-of-concept experiments (Loh et al., 2010;Philipp et al., 2012;Ayyer et al., 2014Ayyer et al., , 2015Ekeberg et al., 2015;Wierman et al., 2016), even in some cases where the number of collected photons per frame is extremely low. This success has motivated us to apply the EMC algorithm to SMX to push the limit of usable crystal sizes.
This study is the latest of a series of proof-of-concept tabletop experiments with increasing complexity to test the applicability of the EMC algorithm to SMX. We simulated the data frames collected in an SMX experiment by taking a large volume of data frames of very brief exposures from a large hen egg white lysozyme (HEWL) crystal with a dim laboratory X-ray source and a fast-framing mixed-mode pixel array detector (MM-PAD) (Tate et al., 2013). In contrast to our previous work (Wierman et al., 2016), where the crystal was rotated about a single axis, the data frames used in this study were collected from a crystal rotated about two orthogonal axes continuously to sample a greater portion of the rotation space. The crystal intensities were reconstructed using the discrete three-dimensional rotation samples lying in this rotation subset, so our method applies to randomly oriented frames by replacing the rotation subset with the whole threedimensional rotation space. However, like the full exploration of the rotation space, this rotation sampling results in a cubic growth with resolution in the memory and time scaling of the EMC algorithm, which makes the two-axis problem more difficult than its single-axis counterpart. To remedy this problem, we developed computing schemes that greatly reduce the memory usage and computation time. With no input information on the orientation of each data frame, the EMC algorithm successfully reconstructed the Bragg reflections to 2.27 Å resolution. This result further paves the way for EMC-based SMX experiments.
This paper is organized as follows. In x2, we present the details of the experiment, an overview of the EMC algorithm and other aspects of the data processing. In particular, we introduce a local update scheme to speed up the EMC algorithm at high resolution. In x3, we examine the sparsity of the data frames input to the EMC algorithm and the results of the intensity reconstruction. In the appendices, we describe a memory-efficient parallel implementation of the EMC algorithm and quantify the speed-up of the local update scheme in practice.

Setup of rotation axes
In order to sample a greater portion of the rotation space than a single rotation axis does (Wierman et al., 2016), two orthogonal rotation axes were built by fixing the ' rotation stage (Newport URS100) perpendicularly to the rotation stage (Newport UE17CC), as schematically shown in Fig. 1. In the present experiment rotations about the axis bring the ' rotation stage into alignment with the X-ray beam, so the rotation was limited to 18 to avoid blocking the beam and collision with the detector. This limited angular range of is acceptable because, as described in x2.4.2, the solution to the two-axis problem is readily generalized to full three-dimensional rotations even over this limited angular range. Using a microscope, the rotation axes were adjusted to intersect perpendicularly and their intersection was centered within the X-ray beam using a fluorescent fiber, whose position was recorded for the subsequent sample centering.

Sample preparation
The protein crystallization technique followed is similar to that described by Wierman et al. (2016). Lyophilized hen egg white lysozyme powder (Sigma, Saint Louis, MO, USA) was dissolved in deionized water to 25 mg ml À1 without further purification. Crystals were grown at room temperature  A simplified schematic of the experimental setup with two orthogonal rotation axes. The beam incidence is perpendicular to the axis and the MM-PAD, and the main beam is blocked by the beamstop. The crystal is rotated in increments of 0.1 about the axis, with the data frames recorded by the MM-PAD when ' traverses 360 continuously at each value of . The figure is not drawn to scale. capillary (outside diameter = 864 mm, wall thickness = 25.4 mm; Advanced Polymers, Salem, NH, USA), a crystal was then retrieved from the crystallization droplets and positioned approximately 22 mm from one end of the capillary. This provided enough remaining length of the capillary on the opposite end to add a small amount of mother liquor solution near the crystal to maintain hydration via vapor diffusion during data collection. Mother liquor directly surrounding the crystal was carefully removed from the 22 mm end of the capillary with a paper wick (Hampton Research, Aliso Veijo, CA, USA), and the ends of the capillary were sealed with vacuum grease. The capillary containing the crystal was then mounted on a Hampton Research pin base and attached to a goniometer for data collection.

Data collection
The X-ray diffraction patterns were collected from the capillary containing a single HEWL protein crystal centered at the intersection of the two orthogonal rotation axes and illuminated by a 0.5 Â 0.5 mm Cu K X-ray beam (1.54 Å wavelength). The X-rays were generated from a rotating anode machine set to 36 kV and 50 mA (Rigaku RU-H3R) and focused using Ni-coated Franks mirrors placed 1 m from the sample with a divergence of 1 mrad and a flux of 10 7 photons per second. The beam incidence was perpendicular to the axis and the MM-PAD, and the sample-todetector distance was 60 mm. The center of the beam was placed in one corner of the active area of the MM-PAD, giving a resolution of 2.0 Å in the opposite corner. A pin-diode beamstop was used to prevent the direct beam from striking the MM-PAD during data collection.
The capillary and crystal were rotated about the axis from 0 to 17.9 and then from À18.0 to À0.1 in increments of 0.1 . At each value of , the capillary and crystal were rotated by 360 about the ' axis continuously at a constant angular velocity of 0.5 per second. The MM-PAD collected images at a framing rate of 4 ms per frame in each revolution of ', which gave an oscillation angle of 0.002 per frame. After the average dark signal had been subtracted and the chip-to-chip global response adjusted, pixel counts were thresholded to avoid false positives. The thresholded pixel counts were then quantized to photon counts by dividing with a known gain and rounding to the nearest integer. Only counts from pixels with at least one photon hit were recorded during data collection to reduce the file size for storage and allow more images to be recorded with the available disk space.
Owing to radiation damage and possible dehydration of the crystal, we only kept the data frames recorded at ranging from 0 to 15.9 to pass on to processing. We also discarded frames that did not record any photons, which was possibly caused by glitches of the rotating anode. To simulate the signal level of an SMX experiment, we further collapsed every 100 successive frames that did not contain any discarded frames, since they were recorded when the crystal was rotated continuously in ' at a fixed value of . We note that an intensity reconstruction was attempted by collapsing every 30 successive frames, but the Bragg reflections could not be reconstructed beyond 3 Å . The collapse of every 100 successive frames gave us 2:7 Â 10 5 frames with an average of 3000 photons per collapsed frame. These collapsed frames were then passed to the EMC algorithm for intensity reconstruction, though their relative orientations were unknown to the algorithm.
It was discovered after data had been collected and the apparatus disassembled that the crystal was of poor quality. The actual Bragg spot intensities obtained by summing adjacent frames with their known relative orientations cannot be phased to produce a high-resolution structure even though the Bragg peaks do extend to high resolutions. The goal of the experiment, however, was not to solve the well known lysozyme structure but rather to demonstrate that the EMC approach can reconstruct the intensity map in the two-axis case. Because the quality of the reconstructed intensities can be assessed by comparing with the actual intensities, the goal of the experiment could be met even though the crystal was of poor quality for solving a structure.

Intensity reconstruction
2.4.1. EMC algorithm. The unoriented data frames were merged into a three-dimensional intensity map iteratively with the EMC algorithm (Loh & Elser, 2009). Each iteration of the algorithm consists of three steps: expand (E), maximize (M) and compress (C). Consider a reconstruction problem with M pix detector pixels, M rot rotation samples, M data data frames and N average photons per frame. Starting with an initial intensity model WðqÞ, where q denotes the spatial frequency, the E step calculates the average photon number W ij , measured at pixel i from WðqÞ when the crystal has orientation j . With the data frames represented by K ik , the photon count recorded at pixel i in frame k, the matrices W ij and K ik are cross correlated in the M step to evaluate the conditional probability P jk ðWÞ that frame k was measured at crystal orientation j based on the current intensity model W. Assuming Poisson statistics, P jk ðWÞ is given by where w j is the fraction of the continuous rotation group assigned to sample j . The algorithm subsequently maximizes the expectation value of the log-likelihood function over P jk ðWÞ by updating the model according to the rule The C step maps W 0 ij back to the reciprocal space to form a new intensity map W 0 ðqÞ to ensure consistency among all the tomograms W ij calculated in the next iteration. The algorithm then takes W 0 ðqÞ as the initial intensity model WðqÞ of the next iteration and repeats the iterations until WðqÞ ' W 0 ðqÞ.
2.4.2. Reference intensity map, rotation sampling and initial seeding. Although the relative orientations of the data research papers frames were not passed to the EMC algorithm, we can use them to construct a 'reference' intensity map to compare with the reconstructed intensity map. The data frames were mapped to the reciprocal space to form a three-dimensional intensity map according to their relative orientations when recorded. The reciprocal lattice of the crystal is embedded in the intensity map and differs from the laboratory frame by a global rotation R g . We determined R g by segmenting out the Bragg peaks (Wierman et al., 2016) and then applying indexing (Steller et al., 1997) to the peaks. The intensity map was subsequently rotated by R g to align with the laboratory frame, and this aligned intensity map is what we call the reference intensity map.
We generated the discrete rotation samples using quaternions (Loh & Elser, 2009), where the angular resolution ' 0:944=n is specified by the order n ¼ 1; 2; . . . . In this study, we confined the rotation samples to those in the subset of rotation space explored by the rotated crystal. The range of the subset in the laboratory frame was found by applying the global rotation R g obtained above to the relative orientations between the data frames, though we need to stress that the orientation of each data frame within the subset was unknown to the EMC algorithm. This choice of rotation samples makes the solution to the two-axis problem directly applicable to the randomly oriented frames in real SMX experiments, where the rotation subset is replaced with the whole three-dimensional rotation space.
We seeded the initial intensity map with small threedimensional Gaussian peaks of random height at each predicted Bragg position, with the lattice constants given by the indexing process mentioned above. In real SMX experiments with a small beam, this information can be obtained from indexing the pseudo-powder patterns. No symmetry was imposed in either the seeding or the reconstruction process.
2.4.3. Local update scheme. Owing to the exhaustive search in rotations, an EMC reconstruction is usually challenged by its poor time and memory scaling, which are both proportional to M rot . Resolving peaks at high resolution becomes especially difficult, since where q max denotes the highest resolvable spatial frequency.
Here we propose an update scheme to speed up the EMC algorithm at high resolution, and a parallel implementation that alleviates the memory burden is discussed in Appendix A.
To understand how to speed up the EMC algorithm, we first review how an EMC reconstruction converges in qualitative terms. The peaks at low resolution of the intensity map are reconstructed first owing to the strong diffraction signal at low q. These low-resolution peaks hence give each data frame a great preference for certain orientations, and the intensity map is refined about these probable orientations to resolve peaks at higher resolution. With improved signal-to-noise ratio in the intensity map, the convergence gradually proceeds from low q to high q. This observation shows that the intensity reconstruction has a special feature of locality in orientations: each data frame has high probabilities only at a handful of orientations favored by the low-resolution peaks, while the other orientations with negligible probabilities actually do not contribute to the refinement of the intensity map. Restricting the search to the vicinity of the probable orientations on a perframe basis can therefore significantly reduce the computation time.
The computing scheme that we call the local update scheme takes advantage of the locality in orientations to speed up the convergence of the intensity reconstruction, and we hereafter refer to the scheme discussed in x2.4.1 as the standard update scheme. The local update scheme consists of two major parts: the calculation of the probable orientation list and the refinement of the intensity map. Starting with a converged low-resolution intensity model WðqÞ and a coarse rotation sampling f j c g of order n c , the local update scheme calculates the probabilities P j c k ðWÞ according to equation (1). The probable orientation list is represented by the binary matrix where " p is a pre-defined threshold.
In the second part, the intensity map is refined using a fine rotation sampling f j f g of order n f without calculating all the elements of P j f k ðWÞ. For each coarse rotation sample j c , we define its neighborhood as the subset of rotation space that is closer to j c than to any other samples, and assign the fine rotation samples j f that lie in this subset as the neighbors of j c . This mapping is stored as a matrix C j c j f , where The intensity map is then refined in the same way as in the standard update scheme, with the exception that only the entries of P j f k ðWÞ that satisfy the conditions B j c k ¼ 1 and C j c j f ¼ 1 are calculated while the others are set as zero. We hence restrict the calculation of P j f k ðWÞ to the neighbors of the probable coarse rotation samples in each data frame. The probable orientation list, or equivalently the binary matrix B j c k , is only recalculated after the intensity map converges to allow a global search over all the coarse rotation samples. The refinement then continues with the updated matrix B j c k . The whole process terminates when the update of the probable orientation list stops changing the intensity map. Restricting the search in orientations saves a great amount of computation because calculating the probability matrix is the most time-consuming part of the EMC algorithm. A simple estimate (see Appendix B) shows that the local update scheme can achieve a speed-up by tens to hundreds of times in practice. In addition, the matrices B j c k and C j c j f are both sparse, so they barely add any burden to the memory usage. Since the local update scheme places no special focus on the Bragg peaks, it is also applicable to single-particle imaging.
The idea of our local update scheme is similar to the sparse update scheme proposed by Neal & Hinton (1998), which speeds up the expectation maximization algorithm by freezing the probabilities of improbable values in most of the iterations research papers and only updating them once every many iterations. The only difference is the specific property of locality in our intensity reconstruction application, which allows us to search in a finer grid about the probable coarse rotation samples to refine the intensity map at high resolution. Nonetheless, we need to stress that the only reason to adopt the local update scheme is to speed up the reconstruction at high resolution. The likelihood function maximized in each local update iteration cannot exceed its counterpart when the whole rotation group is explored.

Integration
After a converged intensity map had been obtained, the reflections were summed over ellipsoidal windows centered at each Bragg position and aligned with the reciprocal lattice. We used the average of the neighboring voxels outside each ellipsoidal window to estimate the background level of each reflection. Reflections with their ellipsoidal windows intersecting the detector gaps or the boundary of the intensity map were considered as partial peaks and rejected.

Sparsity of data frames
To show the sparsity of the collapsed data frames described in x2.3, we counted the number of peaks per frame with the criterion that a peak has more than two connected pixels and an average of no less than two photons per pixel. As shown in Fig. 2, most of the frames do not have enough peaks to meet the requirements of conventional indexing methods (at least 20-30), even with this generous criterion for peak finding.
Following the calculation by Holton (2009), we also estimated the energy absorbed by the crystal over the exposure of one collapsed frame, assuming that protein crystals have the same mass energy absorption cross section as water. Our calculation showed that an 8 mm 3 crystal would have endured a 0.2 MGy radiation dose if it had scattered the same number of photons as our large HEWL crystal during this period. This dose is within the lifetime of protein crystals at room temperature if the radiation is delivered quickly (Owen et al., 2012), so the signal level in our study should be comparable to that in a real SMX experiment.

Intensity reconstruction
Given the 2:7 Â 10 5 collapsed frames, we started an EMC reconstruction from the randomly seeded model described in x2.4.2 using the standard update scheme and a rotation sampling of order n ¼ 40. Only data up to 3 Å were used at this stage because our goal was to quickly obtain a converged intensity map at low resolution. After the intensity map had converged, we took its probability distribution and assembled all the data frames to form an intensity map using equation (2) to include data up to 2 Å resolution. This intensity map was then used as the initial model of the local update scheme using rotation samplings of orders ðn c ; n f Þ ¼ ð40; 60Þ for refinement. (a) Histogram of the number of peaks per collapsed frame, which is the sum of 100 successive frames in the raw data. A patch with more than two connected pixels and an average of no less than two photons per pixel is identified as a peak. (b) A random selection of the collapsed frames, with identified peaks marked with blue circles. The cross denotes the beam center, and the resolution at the upper right corner is about 2 Å .

Figure 3
The average signal-to-noise ratio of the integrated reflections from the converged intensity maps at different stages of the reconstruction. The increase of hI= I i at high q indicates the reconstruction of high-resolution peaks. The 2.27 Å resolution determined by CC* is marked by the black dashed line. Different pairs of orders ðn c ; n f Þ with increasing angular resolutions were sequentially used in the local update scheme to extend the peak convergence to high resolution. Fig. 3 shows the average signal-to-noise ratio hI= I i of the integrated reflections from the converged intensity maps at different stages of the reconstruction. We first see that hI= I i dropped at low resolution while it remained at similar levels at high resolution when moving from the standard update scheme of n ¼ 40 to the local update scheme of ðn c ; n f Þ = (40, 60). The lack of improvement at high resolution indicates that the current angular resolution of the local update scheme still cannot resolve high-resolution peaks. On the other hand, the inclusion of data beyond 3 Å slightly disrupted the original probability distribution, which in turn reduced hI= I i at low resolution. The improvement of hI= I i when increasing the angular resolutions signals the reconstruction of high-resolution peaks and justifies the local update scheme.
With the converged intensity map from the local update scheme of ðn c ; n f Þ = (60, 150) as our final intensity reconstruction, Fig. 4 compares the slices of the reconstructed and reference intensity maps perpendicular to the k axis of the reciprocal lattice. Although we did not impose any symmetry in the process of seeding or reconstruction, the converged intensity map still follows the reflection condition 0k0 : k ¼ 2n required by the space-group symmetry P4 3 2 1 2 of the HEWL crystal (Hahn, 2006), which demonstrates the success of the EMC reconstruction. We note that the discrepancy between the two intensity maps in high-resolution peaks is consistent with the low signal-to-noise ratio at high resolutions (see Fig. 3). Because the photons contributing to the high-resolution shells were mostly collected by the upper left corner of the MM-PAD (Fig. 1), the resulting lower signal-to-noise ratio made the orientation reconstruction more challenging in this region.
A further comparison is shown in the scatter plot of the integrated reflections from the reconstructed and reference intensity maps (Fig. 5), which excludes the reflections with the signal-to-noise ratio I= I < 2. The linear correlation of the reflections shows the consistency of the two intensity maps. Using where F ref and F reconst are the structure factors calculated from the reference and reconstructed intensity maps, respectively, we quantify the discrepancy between the two sets of integrated reflections as R ¼ 21%, where the reflections with I= I < 2 are also excluded. This larger discrepancy than R ¼ 4:7% obtained by Wierman et al. (2016) could be caused by the adverse influence of the background scatter and the exploration of a much larger rotation subset than a single rotation axis. By summing the total photon counts of both the integrated and the partial peaks, we estimated the fraction of photons coming from the background and diffuse scatter as $90%. We expect to improve the quality of our reconstruction and push the limit to sparser data frames by reducing the background scatter and using a larger detector to gain more information to assist orientation reconstruction. Another way to assess the quality of the reconstruction is through the calculation of CC Ã , the correlation coefficient of the observed reflections with the underlying true signal (Karplus & Diederichs, 2012). We first randomly separated the symmetry-related reflections of each unique reflection into two subsets and then calculated the correlation coefficient CC 1=2 between the average intensities of the two subsets in different resolution shells. Under the assumption that the errors of the two subsets are independent, identically Slices of the reconstructed and reference intensity maps in the hl plane at constant values of k of the reciprocal lattice. Even without imposing any symmetry in the process of seeding or reconstruction, the converged intensity map still follows the reflection condition 0k0 : k ¼ 2n required by the space-group symmetry P4 3 2 1 2 of the HEWL crystal (see insets). The 2.27 Å resolution determined by CC Ã is marked by the arcs in white. The mapping into reciprocal space transforms the detector gaps (Tate et al., 2013) into curves.

Figure 5
Scatter plot comparing the integrated reflections from the reconstructed and reference intensity maps. Reflections with the signal-to-noise ratio I= I < 2 are excluded from the plot. The linear correlation shows the agreement between the two intensity maps. distributed and free from the errors of the true signal, the value of CC Ã can be estimated from The distribution of CC Ã as a function of spatial frequencies is shown in Fig. 6, with the error bars estimated by repeating the random separation of reflections 1000 times. The large error bar in the highest-resolution shell shows the low correlation between the intensities of the two subsets, which is consistent with the low signal-to-noise ratio at high resolution. We determine the resolution of the reconstructed reflections as 2.27 Å by a threshold CC Ã ¼ 0:5. This choice is consistent with the resolution where the average signal-to-noise ratio of our final reconstructed intensity (the black curve in Fig. 3) drops to 2. We note that the value of the correlation coefficient is dominated by the stronger peaks in each resolution shell. Therefore, CC* can still have moderate values at high resolutions even if some low-signal peaks are not resolvable, as indicated by the discrepancy between the two intensity maps in high-resolution peaks in Fig. 4.

Conclusion
The results of this study show that the limit to the usable crystal sizes in current SMX experiments could be relaxed by employing the EMC algorithm. Because the algorithm leverages the data redundancy arising from the common arcs between pairs of diffraction patterns, the intensity reconstruction is feasible even though each frame may not contain sufficient information to be oriented individually. The computing schemes we have developed in this article further alleviate the computational requirements of the EMC reconstruction, which makes EMC-based SMX experiments more practical.
The fact that the EMC algorithm was able to reconstruct the actual X-ray intensity incident on the detector, irrespective of crystal quality, illustrates the generality of the algorithm. The algorithm has no 'knowledge' of what is being reconstructed. Everything in the detected X-ray field -background, diffraction spots, diffuse scatter etc. -is reconstructed.
Several issues remain to be addressed to put EMC-based SMX experiments into practice. In contrast to the randomly sampled crystal orientations in real SMX experiments, the data frames in this study were taken from a crystal rotated continuously. From our past experience, this difference in orientation sampling should not affect the EMC reconstruction as long as the random orientation sampling size is large enough. We estimate that 10 5 -10 6 data frames are required, which amounts to a data collection time of within a day when a 10% single-crystal hit rate and an exposure time of 10 ms per frame are assumed.
In this paper we used a large single crystal in various orientations to emulate the data expected from multiple small crystals. The obvious next step towards practical application of the method is to try the EMC algorithm on data from multiple small crystals. It will be necessary to experimentally determine the severity of difficulties arising from sources including varying crystal diffraction quality and occasional multiple crystals in the beam. We expect to incorporate metrics such as the normalized surprise function (Munke et al., 2016) into the EMC algorithm to estimate the reliability of each frame based on the current intensity model and reject frames containing multiple lattices. To tackle the frame-to-frame crystal size variation, the EMC algorithm also needs to calculate the relative contribution of each frame to the intensity model iteratively. As indicated by Loh et al. (2010), the intensity model could be updated by maximizing the likelihood function with respect to the relative crystal sizes and the crystal orientations alternately, with the cost of doubling the computation time.
The last issue is the background scatter. In principle, the EMC algorithm is able to deal with stable background by modifying the conditional probability calculation. However, background reduction by improving the experiment becomes necessary when the frame-to-frame variation in background is significant. With the above issues considered, the analysis of SMX data would involve first reconstructing a low-resolution intensity map using the standard EMC update scheme, and then refining the high-resolution peaks using the local update scheme because of its computational efficiency.

Memory-efficient parallel implementation
Here we describe a memory-efficient parallel implementation of the EMC algorithm, which is applicable to both the standard and local update schemes. The memory usage of the EMC algorithm is dominated by the matrices P jk ðWÞ and W ij =W 0 ij , with sizes of M rot Â M data and M pix Â M rot , respectively. Since M rot / q 3 max , the required memory rapidly The distribution of CC Ã as a function of spatial frequencies. The resolution of the reflections is determined as 2.27 Å by a threshold CC Ã ¼ 0:5. The error bars are estimated by repeating the random separation of reflections 1000 times, while the ups and downs in CC Ã result from the binning in resolution shells. becomes intractable, for example, hundreds of terabytes (TB), even to resolve peaks at moderate resolutions (Ayyer et al., 2016). However, only a small portion of the entries of P jk ðWÞ are significant according to our discussion in x2.4.3, so we can treat P jk ðWÞ as a sparse matrix to reduce the required memory. This amounts to the fact that a given data frame only has nonnegligible probabilities at a small fraction of the orientations, unless the signal level is as weak as only several photons per frame. In our implementation, we distribute blocks of data frames (ranges in k index) to different processors, each of which holds the same copy of the intensity map WðqÞ. The algorithm strides through the M rot rotations in steps of size M step and calculates W ij and R jk ðWÞ in each step. Here we define R jk ðWÞ as and each processor dynamically updates the value of max j R jk ðWÞ when walking through all the orientations. From the inequality P jk ðWÞ ¼ R jk ðWÞ P M rot j¼1 R jk ðWÞ R jk ðWÞ max j R jk ðWÞ ; the entries of R jk ðWÞ are saved only when the ratio R jk ðWÞ= max j R jk ðWÞ exceeds the pre-defined threshold " p , and this condition is checked for all the saved entries every time the value of max j R jk ðWÞ is updated. After going through all the rotation samples, the algorithm calculates the significant values of P jk ðWÞ by normalizing the saved entries of R jk ðWÞ over orientations. Subsequently, we update the tomograms W 0 ij also in steps of size M step over all the orientations, and map W 0 ij back to the copy of the updated intensity map W 0 ðqÞ held by each processor after each step. Finally, W 0 ðqÞ is reduced among all the processors to complete the iteration.
As a result, the memory scaling of P jk ðWÞ and W ij =W 0 ij is reduced to N p M data and M pix M step , respectively, where N p denotes the average number of probable orientations per frame and is governed by the threshold " p and the signal level. The memory usage can be limited to only tens of gigabytes even when using an extremely fine rotation sampling, since the dominant memory scaling is independent of q max .

APPENDIX B Speed-up of the local update scheme
The most time-intensive part of a standard EMC update is to calculate the probability matrix P jk ðWÞ, which is proportional to M rot . In a local update scheme with M coarse and M fine rotation samples, this proportionality becomes N p M fine =M coarse , where N p is the average number of the probable coarse rotation samples per frame and M fine =M coarse is the average number of neighbors each coarse rotation sample has. Consider a local update scheme that recalculates the most probable orientation lists B j c k every N iter iterations. The speedup is given by where the numerator is the time proportionality for N iter iterations of standard updates using the fine rotation sampling, and M coarse in the first term of the denominator denotes the proportionality for the recalculation of B j c k . The number of rotations that samples the rotation space with order n is given by M rot ¼ 10ð5n 3 þ nÞ. Consider a local update scheme that uses N iter ¼ 10 and rotation samplings of orders ðn c ; n f Þ = (60, 150). Assume that a chosen value of " p leads to N p ' 100. Equation (7) gives a speed-up of 156 times.