research papers
Classifying and assembling two-dimensional X-ray laser diffraction patterns of a single particle to reconstruct the three-dimensional diffraction intensity function: resolution limit due to the quantum noise
aRiken Harima Institute, 1-1-1 Kouto, Sayo-cho, Sayo-gun, Hyogo, 679-5148, Japan, bQuantum Beam Science Directorate, Japan Atomic Energy Agency, 8-1-7 Umemidai, Kidugawa-shi, Kyoto, 619-0215, Japan, and cXFEL Division, Japan Synchrotron Radiation Research Institute, 1-1-1 Kouto, Sayo-cho, Sayo-gun, Hyogo, 679-5198, Japan
*Correspondence e-mail: go.nobuhiro@spring8.or.jp
A new two-step algorithm is developed for reconstructing the three-dimensional diffraction intensity of a globular biological macromolecule from many experimentally measured quantum-noise-limited two-dimensional X-ray laser diffraction patterns, each for an unknown orientation. The first step is classification of the two-dimensional patterns into groups according to the similarity of direction of the incident X-rays with respect to the molecule and an averaging within each group to reduce the noise. The second step is detection of common intersecting circles between the signal-enhanced two-dimensional patterns to identify their mutual location in the three-dimensional wavenumber space. The newly developed algorithm enables one to detect a signal for classification in noisy experimental photon-count data with as low as ∼0.1 photons per effective pixel. The wavenumber of such a limiting pixel determines the attainable structural resolution. From this fact, the resolution limit due to the quantum noise attainable by this new method of analysis as well as two important experimental parameters, the number of two-dimensional patterns to be measured (the load for the detector) and the number of pairs of two-dimensional patterns to be analysed (the load for the computer), are derived as a function of the incident
and quantities characterizing the target molecule.Keywords: biological macromolecules; classification of two-dimensional diffraction patterns; common intersecting circles; attainable structural resolution.
1. Introduction
New, intense X-ray free-electron laser (XFEL) light sources offer a new possibility in imaging single biological macromolecules. The main problems to be solved for realization of this possibility originate from the extreme weakness of the scattered light from a single molecule. One problem is severe damage of a target caused by a single shot of intense X-ray light used to compensate for the weakness. In this respect, a lower intensity of incident X-rays is preferred. Another problem due to the weakness is the quantum noise. Algorithms for
must be developed to process the experimental data immersed in the quantum noise. From this perspective, a higher intensity of incident X-rays is preferred. This paper focuses attention on this latter problem. For this purpose we make a tentative assumption that the damage process can be neglected, and will clarify the mechanism of how the quantum noise sets a limit on the resolution of In other words, we are interested in this paper only in the lower bound of the incident X-ray intensity.The damage problem prevents the possibility of using the same molecule repeatedly as a target. Instead we assume that a target macromolecule assumes a well defined three-dimensional structure and a new molecule from an ensemble of the same molecules is placed repeatedly at the target position, but unfortunately in an unknown random orientation. A macromolecular complex with a definite molecular composition and three-dimensional structure can also be treated. The term `molecule' is used to mean both a biological macromolecule and its complex. Extension of the results of this paper to cases of large-scale conformational fluctuations with a magnitude larger than the resolution of
will be addressed in a future paper. Because of this limitation, we specifically exclude fibrous macromolecules and assume that molecules are globular with a more-or-less spherical shape. Note also that we develop analyses in this paper under idealizing (as compared with probable experimental realizations) assumptions about the state of the target molecule such as (i) ideally random orientations and (ii) the absence of hydrating water molecules. Adaptation to these realistic problems will also be treated in future papers.A measurable two-dimensional diffraction intensity pattern depends on an unknown molecular orientation. This missing orientational information is to be recovered computationally during an analysis of a set of many two-dimensional intensity patterns. Also missing in a two-dimensional intensity pattern is the phase information necessary for derivation of a three-dimensional molecular structure. This missing phase information is also to be recovered computationally by the so-called oversampling method (Fienup, 1982; Elser, 2003).
The methods of single-particle imaging by XFEL can be classified into two paths, depending on which of the two types of missing information is recovered first. In the first, path A, method, a computational procedure is applied to a set of phase-missing two-dimensional intensity patterns to find their mutual locations in the three-dimensional wavenumber space. When a sufficient number of two-dimensional patterns are properly located, a three-dimensional diffraction intensity function can be constructed, to which the oversampling method is applied to recover the missing phase information. Together with this phase information, a three-dimensional real-space structure can be derived by an inverse Fourier transformation. In the second, path B, method, the oversampling method is applied to each of the measured two-dimensional intensity patterns. Together with the recovered phase information, a two-dimensional real-space structure is obtained by an inverse Fourier transformation, which is approximately a projection of a three-dimensional real-space structure along an axis of the incident X-ray beam. Such two-dimensional structures of a minivirus particle as revealed by a single-shot 6.9 Å hard-X-ray free-electron laser have been recently reported (Seibert et al., 2011). From many projected two-dimensional images thus obtained, a three-dimensional real-space structure can be constructed by applying the method of tomography. Such a three-dimensional human chromosome structure as revealed by coherent 2.5 Å X-rays from synchrotron radiation has been reported (Nishino et al., 2009).
Because of the weakness of scattered light from a single molecule, the quantum noise is a serious problem especially at high-angle pixels. The quantum noise appears to limit the resolution of a three-dimensional real-space structure to be obtained at the end, though by different mechanisms in paths A and B. In path A, the quantum noise is expected to set a resolution limit to locating a two-dimensional intensity pattern in the three-dimensional wavenumber space. Because photon-count data at many pixels can be considered integrally in a computational procedure to find a location, effective information seems extractable even from high-noise data at pixels with an expected mean photon count smaller than unity. Even though data at high-angle pixels are very noisy in each of the two-dimensional intensity patterns thus located, data at similar locations can be averaged to reduce the noise. The attainable structural resolution is determined by the wavenumber of a limiting pixel, from the data of which effective information can be extracted. In path B, where the oversampling method is applied directly to each two-dimensional intensity pattern, the quantum noise is expected to set a limit on the applicability of this method. When high-noise data from high-angle pixels with an expected photon count smaller than unity are included, the phase recovery procedure is expected to fail to converge and to cease to work. Therefore a pixel with an expected photon count of unity appears to be a limiting pixel, and its wavenumber appears to give the resolution. From such an analysis we expect that the path A analysis is better in extracting effective information from noisy data and therefore in deriving higher-resolution real-space structures. For this reason we are interested in this paper in developing a path A method. Of course, superior situations of the path B method are conceivable depending on problems of developing detecting devices and sample preparation, and also on the biological significance of the results obtained.
Methods hitherto proposed to find the locations of individual two-dimensional intensity patterns in the three-dimensional wavenumber space in path A methodology can be classified into two groups. In group 1 (Huldt et al., 2003; Bortel & Faigel, 2007; Shneerson et al., 2008; Bortel et al., 2009; Yang et al., 2010), a method of finding similarity between an arbitrary pair of two-dimensional intensity patterns is prepared. Then, a set of two-dimensional patterns are classified into groups of similar patterns according to this similarity, which are then averaged to reduce the quantum noise. Then, for an arbitrary pair of noise-reduced intensity patterns, their mutual location in the three-dimensional wavenumber space is identified by finding an intersecting circle between them. A three-dimensional diffraction intensity function can be constructed when a sufficient number of two-dimensional patterns are properly located in the three-dimensional wavenumber space. In the methods of group 2 (Fung et al., 2009; Loh & Elser, 2009; Elser, 2009; Loh et al., 2010), a tentative three-dimensional diffraction intensity function (or a function of similar mathematical setting) is assumed. Then, each two-dimensional intensity pattern is located so as to best fit in this three-dimensional intensity function. From a set of two-dimensional patterns thus located, the three-dimensional diffraction intensity function is updated. By repeating this cycle of best fitting and updating, an ultimate three-dimensional diffraction intensity function is obtained. Even though the methods of the second group appear promising, the demonstrated abilities of the individual methods proposed so far are limited.
We focus attention in this paper on developing an algorithm beyond existing methods of single-particle imaging belonging to the path A, group 1 methodology. New developments have been attained in two aspects. First, we will develop and improve methods of computational analyses and procedures to arrange a set of many experimentally measurable two-dimensional intensity patterns in the three-dimensional wavenumber space so that a three-dimensional intensity function can be constructed. The newly developed method enables us to attain higher-resolution structures. Second, we will derive explicit theoretical expressions for two main parameters which govern the number Nmeasure of two-dimensional patterns to be measured (the load for the measuring machine) and the number Ncompare of pairs of two-dimensional patterns to be compared (the main computational load for analyses), as well as for the space resolution attainable from the analysis of the data, in terms of (a) the used for the measurement and two types of quantities characterizing a target; (b) the Shannon molecular length (or, simply, molecular length) L (the length of a side of the smallest cubic box that can contain a target globular molecule); and (c) the radial diffraction intensity density function (the average of the the squared modulus of the structure-factor function on a sphere in the wavenumber space). The achievement of the second aspect provides basic information for designing new experiments and experimental instruments.
The results in the two aspects are obtained in this paper by taking advantage of simulated diffraction intensity data for a protein, lysozyme, and a protein complex, HslUV complex, for which structural atomic coordinates are available from the Protein Data Bank (PDB) (Berman et al., 2000). The former (Weaver & Matthews, 1987) (PDB code 2lzm , number of residues 164, molecular length 60 Å) is chosen as a typical small globular protein. The latter (Sousa et al., 2002) (PDB code 1kyi , total number of residues 7416, molecular length 200 Å) is chosen from very large protein complexes in the PDB with more-or-less globular shape. But it is in a sense an atypical complex, because it has a big hollow space inside the structure. Simulations are carried out by assuming the wavelength of incident X-rays 1 Å and for various intensities. The Shannon molecular length L and radial diffraction intensity density function are determined for these two targets from their respective PDB atomic coordinates, and are used to estimate numerical values of the two main experimental parameters, Nmeasure and Ncompare, and also of the attainable resolution as functions of the incident X-ray intensity.
To make the results of this paper more useful for designing new experiments and experimental instruments, the theory must be extended so as to be applicable even for structure-unknown molecules. This objective is studied in a separate paper.
This paper is ordered as follows. In §2, we will discuss a method of finding similarity between an arbitrary pair of two-dimensional diffraction intensity patterns. This method is used to classify two-dimensional patterns into groups of similar patterns. In §3, we will treat the problem of finding relative orientations between groups of similar patterns, which is information to be used for assembling two-dimensional intensity patterns into a three-dimensional diffraction intensity density function. In §4, we will give a somewhat detailed summary. Readers may find it easier to comprehend §§2 and 3 by reading §4 in parallel. Appendices A, B and C describe mathematical derivations of relations used in §§2 and 3.
2. Classification of two-dimensional diffraction intensity patterns
2.1. Two-dimensional diffraction pattern on an Ewald sphere
Here we define notations of pertinent quantities. The experimentally observable diffraction intensity , given in the unit of a number of photons arriving at a pixel of the detector of solid angle , is given, except for a phase factor, by
where Ii is the incident (given, in the following, in the unit of a number of photons per pulse of free-electron laser per mm2), rCE is the classical electron radius, C is a coefficient given by these quantities, is the is the momentum transfer and is the diffraction intensity density. The magnitude of the momentum transfer is given by
where is the X-ray wavelength and is the angle of diffraction. Note that, even though this angle is expressed as in the usual literature, we nevertheless express it as because this quantity, having a meaning as part of a certain polar angle, has an important role in this paper. Even though the 1 / R, it corresponds to the use of the detector pixel size of in solid angle, i.e.
is a continuous function in the wavenumber space, its squared modulus is measured experimentally by a detector with an array of finite-sized pixels. When is discretely sampled at lattice points of a cubic lattice with a lattice constant ofIn this expression, L is the length of a side of the smallest cubic box (Shannon box) that can contain a target globular molecule. We shall refer to this quantity as the Shannon molecular length or simply the molecular length. From the point of view of the oversampling method for phase retrieval (Fienup, 1982; Elser, 2003), the detector pixel size must be chosen so that . The ratio is called the linear oversampling ratio. A pixel with will be referred to as a Shannon pixel.
The quantity of equation (1) is an expected number of photons arriving at a detector pixel. However, in real experiments, what is measured is an integral number of photons, given by the quantum-mechanical probability. In this paper we simulate this probability by replacing the function of equation (1) with a stochastic function which assumes only integral values according to the To distinguish these two functions, let us call the quantity of equation (1) the theoretical diffraction intensity, and the replaced stochastic function the experimental diffraction intensity.
Examples of a simulated two-dimensional experimental and a theoretical diffraction intensity pattern are shown in Figs. 1(a) and 1(b), respectively, for the case of lysozyme. We see that, when theoretical diffraction intensities are much less than unity, experimental diffraction intensity values at most pixels vanish. From such noisy data, we have to guess the correct mean values. These are patterns for which we develop a method of analysis for For this purpose we need some theoretical tools. A mathematical expression of a two-dimensional pattern for a molecule with a given orientation is explored in detail in Appendix A. A molecular orientation is described by a Eulerian angle with its corresponding 3 × 3 orthogonal matrix given by equation (27). The Eulerian angle is defined so that, out of a set of three angles, are polar angles of the direction of the incident X-ray beam with respect to the molecule, and is an angle of rotation of the detector plane around the axis of the incident beam. A two-dimensional pattern is given by the quantity of equation (1) on the surface of an By introducing a polar coordinate on the surface of an the two-dimensional pattern is given from equations (40), (39), (35) and (42) by
This equation means that the Ewald spheres corresponding to and are essentially the same spheres giving the same surface.
2.2. High correlation line in a correlation pattern
As outlined in §1 we adopt in this paper the basic strategy in which, after measurement of a large number of two-dimensional patterns for a molecule in unknown random orientations, we classify them into groups of similar patterns. In the following we consider a pair of Eulerian angles and with corresponding matrices and , and two-dimensional patterns and . For this pair we will be interested in an angle between the two beam directions and , which satisfies
This angle plays the role of a measure of similarity between a pair of two-dimensional patterns. When it is very small, we classify them into one group of similar patterns even for very different and .
Our problem is to judge, for a given pair of experimental two-dimensional patterns such as those shown in Fig. 1(a), whether or not they are realizations of a similar theoretical two-dimensional pattern. The starting point is the calculation of a correlation function as was originally proposed by Huldt et al. (2003). In this treatment we are interested in pixels on a circle with a fixed value of . Let the number of pixels on a circle be . Bortel & Faigel (2007) proposed to pre-normalize the data on a circle to have uniform second moments for the calculation of a correlation function. More recently, they (Bortel et al., 2009) proposed further to pre-normalize so as to have vanishing mean and uniform second moment and also to compare axially rotated patterns. In our method we normalize the data on a circle to have a uniform mean and are interested in the correlation of their deviation from the mean after they are mutually rotated by an angle . This means that we are concerned with the following correlation function:
This quantity can also be expressed as follows:
The correlation function of equation (6) is calculated as a two-dimensional function of and , which will be referred to as a correlation pattern. The merits of using the normalized quantity of equation (6) are to normalize for variations of intensities over different circles and for experimental variations of the pulse intensities of XFELs.
An example of a correlation pattern is shown in Fig. 2(a). In this figure the pattern is shown not as a function of and but as a function of k and , where k is given by equation (2). In a general case where the angle between the two beam directions is not very small, this correlation pattern appears to be a random function of k and . Let us write such as , where BG stands for background. When is very small, there appears a line of high correlation for a certain value of . In the particular case of Fig. 2(a), the two beam directions are identical, and therefore both and vanish.
The correlation pattern of equation (6) is heavily affected by the quantum noise. When the quantum noise is suppressed, this quantity is given approximately by
where is a mean averaged over the quantum noise. To derive this expression we introduced an approximation of taking the means of three factors independently, an approximation which is good when standard deviations of the three factors are significantly smaller than their respective means. Examples of this quantity are shown in Figs. 2(b) and 2(c). Fig. 2(c) shows a pair of diffraction patterns for which the value of is not small. This is an example of . Even though Figs. 2(b) and 2(c) contain no effect of quantum noise, the pattern other than the high correlation line appears to be rather random. Such a behaviour should be a consequence of an appearance of the diffraction intensity density function that can be captured as a stochastic function.
To characterize the diffraction intensity density function from such a point of view, we studied first how values of are distributed on a sphere of around its mean . For this purpose, points are sampled randomly with a uniform probability on each sphere, and the value of is calculated at each sampled point. We confirmed that, except for small values of k, the distribution is given to a very good accuracy by the exponential distribution as was originally discovered by Wilson (1949). Moreover, as shown in Fig. 3, it is found empirically that, except for small values of k, values of on the sphere of are correlated in such a way as to satisfy the following simple relation,
where is an angle between the two vectors, the average is taken over all pairs of vectors with a given value , and the correlation length kC has been found to be independent of the value of k and is given approximately by
Because, as shown by Wilson (1949), the exponential distribution is a consequence of the irregular three-dimensional structures of biopolymers at the atomic level, we shall refer to the empirically observed distribution as a structure irregularity distribution.
To derive the mean behaviour of , we average over the structure irregularity distribution. As shown in Appendix C, we see that vanishes. This is reasonable because when there is no correlation between two points on two circles appearing in equation (6) an average can be taken on each of to yield a vanishing result.
Further examples of a correlation pattern of equation (6) are shown in Fig. 4. Because the mean of the correlation pattern vanishes except for a high correlation line, a mathematical expression of the high correlation line should be obtained as a mean of the correlation pattern over the two distributions, the and the structure irregularity distribution. As shown in Appendix C it is given to a good approximation by the following expression:
where k is a function of through equation (2) and the direction of the high correlation line, , is given to the zeroth order of the small quantity by
It should be noted that k appears in equation (11) as a quantity normalized by kC, or, because of equation (10), as a product kL.
As mentioned earlier, Bortel et al. (2009) proposed to use a quantity pre-normalized to have a vanishing mean and uniform second moment. However, when we apply our analysis to their proposed correlation pattern, an expression similar to equation (11) is obtained but with an additional factor which is approximately and becomes smaller for larger k. Because of this additional factor, the quantity they employed for detecting similarity between two-dimensional patterns is less sensitive for high k values. This explains why our method is more sensitive for higher k values.
2.3. Identifying the high correlation line against the noisy background and attainable resolution
In Fig. 4 we see that, even though the mean value of the correlation pattern should vanish in the background, its actual values become very noisy for larger values of k. This is due to both the quantum noise and the structure irregularity distribution. Identification of the high correlation line would be affected by the noise. (Note that we are here treating the structure irregularity distribution as a part of the noise.) The level of noise in the quantity of equation (6) can be expressed by its standard deviation. As derived in Appendix C, it is given by
where the function g is defined by
and is given by
which means that we are assuming Shannon pixels. Fig. 5 shows the graph of y = g(x ). Fig. 6 is a plot of the standard deviation given by equation (13), which is a globally increasing function of k in the high-k region. When averaged over the two distributions, the and the structure irregularity distribution, the peak value of the high correlation line for a given value of k (or, equivalently ) is according to equation (11), which is a decreasing function of k. It is expected that the actual high correlation line is observable roughly up to k = kmax, where the mean peak value becomes equal to the standard deviation , i.e.
In fact, in Figs. 2(a) and 4(a) for cases of and therefore where the mean peak value should stay unity, the actual high correlation lines are observed for lysozyme up to 0.7 Å−1 and for the HslUV complex up to 0.55 Å−1, where according to Fig. 6. In Fig. 4(b) for the case of the HslUV complex with = 1°, the actual high correlation line is observable up to 0.3 Å−1, where according to Fig. 6 and equation (16) is roughly satisfied.
From equation (16) and Fig. 6, we see that we can derive the value of from the measured length kmax of the high correlation line. The longer the length kmax, the smaller the value of . From the value of , we judge the similarity of a pair of two-dimensional patterns. When judged similar, they are classified into the same group. After the classification, two-dimensional patterns classified into the same group are averaged in order to improve the signal-to-noise ratio. Because this averaging is done for patterns with slightly different directions of the incident beam, the resolution of the resulting three-dimensional structure will be affected. In order to attain the highest possible resolution, we should adopt a strategy in which we classify a pair of two-dimensional patterns into the same group when their high correlation line reaches the highest possible k region. Let us define kN (subscript N for noise) as the lower bound of such a region. This quantity kN, the limiting k value for correlation recognition, plays a central role in the method of single-particle imaging developed in this paper. In the case of Fig. 4(a) we judge that the high correlation line extends up to such a region, where the line can no longer be distinguished from the background. In the case of Fig. 4(b) the line appears to have faded away before reaching such a region. The limiting value kN, to be determined purely operationally in real applications, appears more-or-less well defined. However, we need to interpret the value of kN in a more theoretical setting. Because it defines the lower bound of the noisy region, it should be characterized by its value of . From Figs. 4(a) and 4(b) we see that it should be between 0.6 and 1.0. As a modest estimate, we assume that kN corresponds to the value of k at which . Then, from equation (16) we see that the corresponding value of is estimated to be within
Let us now assume that a classification group of similar two-dimensional patterns is constructed by a group of two-dimensional patterns with within this angle from a certain reference two-dimensional pattern. Note that the average distance (root-mean-square distance) between a pair of two-dimensional patterns in this classification group is also given by . During the procedure of averaging, two-dimensional patterns rotated by around the origin are averaged. The magnitude of displacement in k space by this rotation is given by , with its maximum value being . When this magnitude is smaller than the correlation length kC of the diffraction intensity density, the averaging procedure works to attenuate the effect of the noise. When the product becomes larger than kC, the averaging procedure works to destroy the information in two-dimensional patterns. This means that the structural information is contained in k only up to kR satisfying . Then, from equation (17), we see
A limiting photon count sN, which is an expected number of photons arriving at a limiting pixel, a Shannon pixel at the limiting k value, k = kN, is given by . In equation (13), at k = kN is given by . This expression can be approximated as , because is in most cases at least a few times larger than . In Fig. 5 for a graph of y = g(x ), x and y can also be interpreted as sN and , respectively. Note that the normalized resolution, kRL, is the number of independent structural descriptive elements along the molecular length L. For a method of single-molecule imaging to be useful, this number should be at least 20, hopefully 100. Note that this number is determined mainly by the limiting photon count sN. Fig. 5 shows this dependence. We see that, to attain kRL = 20–100, sN should be in the range of 0.25–0.08. We have to measure and analyse such low-photon-number data. Also this number highlights a high sensitivity of the proposed method of analysis to extracting information from noisy data.
In the above relation between the limiting photon count sN and the normalized resolution kRL, the incident Ii is treated as an implicit variable parameter. To identify a particular value of Ii to attain a resolution kR = kN, we remember the relation [equation (1)]. Then, by defining a function inverse to the function g of equation (14) as x = g - 1(y) = 1/[( 4 + y)1/2 - 3], equation (13) can be transformed to
Fig. 7 shows the resolution kR as a function of intensity Ii for lysozyme and the HslUV complex obtained by using this equation. When there is more than one value of resolution for a given value of Ii, the best value can be obtained. (The high correlation line may become visible again in a high-k but low-noise region after once becoming invisible in a low-k but high-noise region.)
Since the solid angle of the range of one classification group is given by , and the total solid angle of the direction of the incident beam is because of the centrosymmetric property of the three-dimensional diffraction intensity function, the number of classification groups NG is given by
3. Placing two-dimensional patterns in the three-dimensional wavenumber space: a method of finding the relative orientation between two-dimensional patterns
After two-dimensional patterns are classified by the method described in the previous section, patterns classified into the same group are averaged to reduce the noise. When signal-enhanced patterns are obtained, they are to be placed in the three-dimensional wavenumber space by finding their relative orientations. The two-dimensional patterns exist on Ewald spheres. Because all these Ewald spheres have the same radii and their surfaces contain the origin of the wavenumber space, any pair of Ewald spheres either contact at the origin or have a circular intersection, which also contains the origin. Shneerson et al. (2008) studied the problem of placing two-dimensional patterns in the three-dimensional space by paying attention only to an approximately straight portion of the intersecting circles near the origin . Yang et al. (2010) refined the method of finding the tangential direction of the intersecting circle at the origin by explicitly paying attention to the curvature of the intersection. However, for the placement problem they used the method of Singer & Shkolnisky (2011) for cryo-electron microscopy in which only information on tangential directions is used. However, it is obvious that a relative orientation between a pair of Ewald spheres can be determined once their common circle is identified. In this section we first develop a method of identifying an intersecting circle for a given pair of signal-enhanced two-dimensional patterns and , and derive a mathematical expression for the relative orientation. Second, we ask what is the necessary number of patterns to be averaged for possible identification of common circles? The actual construction of a single three-dimensional diffraction intensity function from the data of relative orientations will be treated in a different paper.
We assume that a pair of signal-enhanced two-dimensional patterns and exist on Ewald spheres of as yet unknown orientations, and . Because of the centrosymmetric property of the three-dimensional diffraction intensity function, any B and here we describe only the result. Each of the two common circles exists as a circle of vanishing values in each of two groups of plots, (a) and (b) , where the parameters and generate each of the two groups, respectively. When the plot (a) vanishes on a circle with its centre at for a certain parameter value , the polar coordinates of the centres of intersecting circles are on and on . When the plot (b) vanishes on a circle with its centre at for a certain parameter value , the polar coordinates of the centres of intersecting circles are on and on . Then, the Euler angle of the relative orientation is given by
has its centrosymmetric image characterized by . Therefore, should have an intersecting circle with each of the Ewald spheres and . This means that for any pair of two-dimensional patterns, and , there exist two common circles. A method of finding them is developed in AppendixThus, the same set of Eulerian angles is now determined from each of the intersecting circles between Ewald spheres and , and between Ewald spheres and . Even though the result is redundant, the actual procedures of finding vanishing circles on plots (a) and (b) are much influenced by experimental noise. In this situation, finding the same quantity simultaneously by the two methods is a desirable numerical procedure.
Fig. 8 shows an example of how circles of vanishing values become visible as the number of averaging patterns is increased. In the case of this example for the HslUV complex, in which the limiting photon count sN is 0.13, we see by inspection that averaging over about 61 patterns is necessary for the identification. This process has been done for lysozyme and the HslUV complex both for a series of values of the incident It has been found that a product of the number NA of necessary patterns and the limiting photon count sN (therefore, the intensity Ii of the incident X-ray) is constant in either `molecule'. The analysis described in Appendix C indicates that the product NAsN must be 8 or larger for common circles to be identified. This is exactly the number observed in the case of Fig. 8. Therefore, the necessary number of patterns is given by
Table 1 summarizes the results obtained as applied to the two `molecules'.
|
4. Summary and conclusion
Two aspects of a method of single-particle imaging belonging to the path A, group 1 methodology have been developed. First, a new, improved method has been developed for computational analyses and procedures to arrange a set of many experimentally measurable two-dimensional diffraction intensity patterns in the three-dimensional wavenumber space. Second, explicit theoretical expressions have been derived for important experimental parameters in terms of the incident and two types of quantities characterizing a target.
The number Nmeasure of two-dimensional patterns to be measured is given by the product NANG of the number of classification groups NG and the average number of two-dimensional patterns NA to be averaged in each group for noise reduction. The number Ncompare of pairs of two-dimensional patterns to be analysed is given by NANG2, because the detection of similarity of patterns is to be carried out for each of the pairs, one from patterns representing each group and the other from all measured patterns. We derived theoretical expressions for the two parameters, NG and NA.
Concerning the first aspect, we have improved a hitherto proposed method for judging whether or not an arbitrary pair of two-dimensional patterns are similar enough to belong to the same classification group. Also, we developed methods of finding common intersecting circles between an arbitrary pair of noise-reduced two-dimensional patterns, and thereby relatively locating them in the three-dimensional wavenumber space. After locating many two-dimensional diffraction patterns properly in the wavenumber space, we have to construct a single three-dimensional diffraction intensity function. This problem, as well as the problem of application of the phase retrieval procedure to such a three-dimensional function, will be treated in a different paper.
The judgment of similarity is based on a two-dimensional correlation pattern for each pair of two-dimensional diffraction intensity patterns. For the calculation of correlation patterns, a new normalization of measurable two-dimensional intensity patterns is employed, thereby enabling one to enhance the sensitivity of judgment to high-angle k values, and eventually to improve attainable space resolution.
A two-dimensional intensity pattern depends on the direction of the incident X-ray beam with respect to the molecule-fixed coordinate system and an angle of rotation of the detector plane placed perpendicularly to the beam axis. When an angle between the directions of the incident beam for a pair of two-dimensional intensity patterns is small, a high correlation line is observed in the two-dimensional correlation pattern as a straight line extending radially from the centre. The angle of the line in the correlation pattern gives a relative angle of rotation of the detector plane. The intensity of a high correlation line is unity near k = 0 and becomes weaker at higher angles. The intensity reduces faster for larger values of .
The background of correlation patterns other than the high correlation line is characterized as a pattern of random appearance reflecting the irregular three-dimensional structures of biopolymers at the atomic level superimposed with the quantum noise. Owing to the deliberately adopted, new normalization of measurable two-dimensional intensity patterns, the mean value of the distribution in the background of two-dimensional correlation patterns turns out to vanish. The standard deviation of the distribution around its vanishing mean becomes globally, but not monotonically, larger as k becomes larger. When the standard deviation becomes larger than the intensity of a high correlation line, the latter becomes no longer recognizable. The recognizable length of a high correlation line becomes longer as the value of becomes smaller. The latter can be determined from the former.
When the value of is smaller than a certain value, say, (therefore, when the recognizable length of the corresponding high correlation line becomes longer than a certain value, say, kN), we classify the pair into the same group. To attain the best resolution, we should employ the largest possible value for kN. Operationally we determine the value of kN, the limiting k value for correlation recognition, as the lower bound of the noise-dominant k region in two-dimensional correlation patterns. Such a value kN can be characterized theoretically as the value of k at which the standard deviation of the background distribution is 0.6. An analytic expression, equation (13), for the standard deviation is derived which is approximately a function of the wavenumber normalized by the Shannon length of the target molecule, i.e. kL, and an expected photon count s by a pixel at the position of the wavenumber k. The quantity kN plays a central role in the method of analysis developed in this paper. It is shown that the structural resolution kR attainable by this method is given by kN.
In the method of identification of a high correlation line, upon which judgment of similarity of a pair of two-dimensional intensity patterns is based, effective information is extracted from the very noisy data in the range of wavenumbers up to the limiting value kN where the value of the standard deviation is 0.6. From the analytic expression for , we can derive the limiting photon count sN, an expected photon count at a limiting pixel, approximately as a function of the normalized resolution kRL = kNL (Fig. 5). For a method of to be useful, the value of the normalized resolution should be in the range of 20–100. The corresponding value of the limiting photon count sN turns out to be in the range of 0.25–0.08. The proposed method of analysis is sufficiently sensitive to enable one to extract information from such low-photon-count noisy data. This high sensitivity has been attained by employing a new correlation function. When the molecular length L and the radial diffraction intensity density function are known, the above relation between the normalized resolution kRL and the limiting photon count can be transformed to a relation giving the intensity Ii of the incident beam to be used to attain a resolution kR.
The angle to define a range of classification groups is given by an inverse of the normalized limiting wavenumber, (kNL ) - 1. As a result, the number of classification groups NG is given by 2(kRL )2.
A method of identifying common circles between an arbitrary pair of noise-reduced two-dimensional patterns is developed. For an arbitrary A and B is searched, another common circle is searched at the same time between two-dimensional patterns A and , where is a two-dimensional pattern on an conjugate to the one on which B exists. The average number of two-dimensional patterns NA to be averaged in each group for identification of common circles has been shown to be given in terms of the limiting photon count sN by 8/ sN.
there exists a conjugate which is centrosymmetric with respect to the origin of the wavenumber space. In the proposed method, when a common circle between two-dimensional patternsThe obtained theoretical expressions are used to evaluate values of important parameters for the two sample `molecules' by assuming, respectively, two typical intensities of the incident beam. The results are shown in Table 1. We should note the very low limiting photon counts, highlighting the strength of the method developed here. We should also note that the predicted attainable resolutions are remarkably high. This is partly due to the strength of the method of analysis developed here, but also due to the assumed high intensities of the incident beam. The assumed values of intensity in Fig. 7 and Table 1 are in the range of around 1021 photons pulse−1 mm−2, which is far larger than the peak value of 1.6 × 1016 photons pulse−1 mm−2 reported in the recent experiment (Seibert et al., 2011) carried out at the Linac Coherent Light Source (LCLS). Because the X-ray beam diameter reported in the experiment at LCLS is about 10 µm and a new technology (Mimura et al., 2010) is now available to focus it down to 10 nm, the values assumed in this paper appear realistic. Since we developed the analysis in this paper under a tentative assumption that damage processes can be neglected, the indicated intensity is the lower bound to attain a targeted resolution. We are now carrying out a study of the damage processes to assess the upper bound of employable intensity. The number of two-dimensional patterns to be measured in Table 1 is not small, but appears tractable for real experiments. At the same time we should note that the number of classification calculations is not small.
APPENDIX A
Relations between molecular orientation and the Ewald sphere
We define two right-handed coordinate systems, one fixed to the molecule and the other fixed to the experimental detector. They are defined, respectively, in terms of a set of mutually orthogonal unit vectors fixed to the molecule and in terms of another set of mutually orthogonal unit vectors fixed to the detector. The position of any point in the molecule can be expressed either in terms of coordinates (here the superscript t means transpose) in the molecule-fixed coordinate system (MFCS) as or in terms of coordinates in the detector-fixed coordinate system (DFCS) as . Thus
The relative orientation between the MFCS and DFCS can be described in terms of a 3 × 3 orthogonal matrix as
This is a shorthand notation for
where aji is an element of the matrix . This is an orthogonal matrix, i.e. . By introducing equation (24) into equation (23), we have the following relation between coordinates in the two coordinate systems:
It is convenient to express the orthogonal matrix in terms of a Eulerian angle as
where
The range of variation of the Eulerian angle is taken as follows:
Molecular structure is described by its electron density:
When the molecule is in the orientation described by an orthogonal matrix with respect to the detector, the electron density in the DFCS is given by
The
is calculated in the MFCS asThe
in the DFCS is given byThis equation has the same structure as equation (33), indicating that the electron density and the behave in the same way for rotation. It follows from this that the wavenumber vectors in the MFCS and in the DFCS are related to each other similarly as in equation (26):
Experimentally, the diffracted X-rays from a single molecule in a certain specific orientation are measured by a two-dimensional detector as a continuous diffraction pattern. This diffraction pattern is given by the squared modulus of the
on the following in the wavenumber space in the DFCS,Here is the wavelength of X-rays used in the experiment. The incident X-rays are assumed to proceed to the negative direction of the third axis of the DFCS. The first and second axes are taken on the surface of the detector. Equation (37) describes a sphere in the space. The surface of the sphere contains the origin of the wavenumber space and its centre is located at .
We now introduce a polar coordinate on the surface of this
byThe origin of the polar coordinate corresponds to the origin of the wavenumber space. This polar coordinate system is adopted so that when the detector is viewed from the direction of the incident X-rays, the detector's positive horizontal and vertical axes coincide with the directions of and , respectively. The observable diffraction pattern is given by equation (1) as
Because the orthogonal matrix can be specified by a Eulerian angle , we sometimes write
We also identify various Ewald spheres by their corresponding orthogonal matrix .
Let us now locate the
in the MFCS. It is given byThe last equality means
This equation means that the Ewald spheres corresponding to and are essentially the same spheres giving the same surface. The former is obtained from the latter by rotating the latter anticlockwise around its third axis by angle .
Here, the different ways in which a Eulerian angle appears in the two different coordinate systems may be worth noting. In the DFCS, a set of three angles describes the spatial orientation of the molecule. In the MFCS, the angles are nothing but the polar coordinates of the incident X-ray beam axis and the angle describes the angle of rotation of the detector around this beam axis. Because of these clear meanings of the Eulerian angles in the MFCS, we employ them to describe molecular orientations rather than often mathematically better behaved quaternions.
Equation (42) suggests an experimental procedure of classifying various observable diffraction patterns into groups only with respect to values of and . Those with the same values of and but with different values of are to be classified into the same group.
Because the electron density is a real function, the
satisfies the relationBecause of this relation, the same diffraction pattern is observed on a pair of different Ewald spheres. From equations (40), (39), (35) and (43) we have
The last equation indicates that two Ewald spheres, characterized by rotation matrices and , respectively, exist always in a pair. They occupy positions centrosymmetric to each other with respect to the origin of the wavenumber space, and they are in touch with each other at the origin. When we introduce a right-handed coordinate system to the
with its origin at the centre of the sphere, the corresponding coordinate system of the centrosymmetric is now left handed. This fact is understood as the two Ewald spheres having different handedness or parity.APPENDIX B
Finding the relative orientation between a pair of unknown orientations from their corresponding diffraction intensity patterns
Real, experimentally observable diffraction patterns are subject to severe quantum noise. To cope with such noise, the same patterns should be measured many times and their means should be calculated to reduce the noise. In the following, the quantity of equation (39) is used under the assumption that such averaging has already been done so that the effect of noise can be neglected.
Let the two unknown orientations be given by
The corresponding Ewald spheres and are given, respectively, by
and
Because all these Ewald spheres have the same radii and their surfaces contain the origin of the wavenumber space, they either contact at the origin or have a circular intersection, which also contains the origin.
Let us now study the intersecting circle between Ewald spheres and . Let on the first
and on the second be the same point on the intersecting circle,Therefore, by setting
we have
These equations mean that the polar coordinates on
for points on the intersecting circle with are also polar coordinates on for points on the intersecting circle with . Similarly, the polar coordinates on for points on the intersecting circle with are also polar coordinates on for points on the intersecting circle with .We now calculate the intersecting circle between Ewald spheres and , and that between Ewald spheres and . We assume that is given by equation (27). The centres of the three Ewald spheres, , and , are given in the MFCS by
Therefore
Let the centre of the intersecting circle between Ewald spheres and on
be . Similarly, let the centre of the intersecting circle between Ewald spheres and on be . They are given byWe now proceed to describe a procedure to find the intersecting circles for a given pair of experimental diffraction patterns and . When we move clockwise along the intersecting circle on the
, it appears as an anticlockwise motion along the intersecting circle on the . Therefore, we calculate the following quantity for all assumed values of in the range of :This quantity should vanish on a certain circle for a certain value of . Because the intersecting circle contains the polar origin , it can be uniquely specified by the polar coordinate of its centre. When the quantity of equation (54) vanishes on a circle with its centre at , the polar coordinates of the centres of intersecting circles are on and on . These polar coordinates of the centres are to be compared with equation (53). Therefore,
From these relations we have a half of equation (21) in the main text.
Let us now proceed to study the intersecting circle between Ewald spheres and . By following similar procedures as above we have the following relations from equation (47):
By introducing similar quantities and following similar procedures, we have the following results:
To find the intersecting circle between Ewald spheres and for a given pair of experimental diffraction patterns and , we calculate the following quantity for all assumed values of in the range of :
When we move clockwise along the intersecting circle on , unlike in equation (54), is now plus. When the quantity of equation (58) vanishes on a circle with its centre at , the polar coordinates of the centres of intersecting circles are on and on . These polar coordinates are to be compared with equation (57). Therefore
, it appears as an anticlockwise motion along the intersecting circle on . Because Ewald spheres and have opposite parities, this motion appears as a clockwise motion along its centrosymmetric circle on . This is the reason why the sign in front of in in equation (58)From these relations we have another half of equation (21) in the main text.
APPENDIX C
Derivation of equations (11), (12), (13) and (22)
The mean of the quantities of equations (6) and (7) with respect to the is given by
When the quantity of equation (60) is averaged over the structure irregularity distribution, we have from equation (9) and the relation due to the exponential distribution
where is an angle between the two vectors given explicitly as follows according to equation (41):
When the two vectors are always significantly different, equation (62) reduces to
The average of the quantity of equation (61) over the structure irregularity distribution is given by
By introducing an approximation
we see from equations (64) and (65) that vanishes as mentioned in the main text. Also, from equations (62), (65) and (66) we have
This is an expected expression when all background features having random appearance due to the quantum noise and the structure irregularity distribution are erased from such high correlation lines as observed in Figs. 2(a) and 2(b).
We now proceed to simplify equation (67). At first we calculate , an angle between the two vectors of equation (63). The magnitude of the two vectors is given by equation (2). To calculate the inner product between the two vectors, we introduce the relative Eulerian angle by
Here is the angle between beam directions given by equation (5). In terms of the relative Eulerian angle, we have
The quantity of equation (67) is to be obtained by substituting from this equation. Let us write the result of the integration with respect to in equation (67) by writing three independent variables explicitly as . Here, remember that and k are related by equation (2). Even though it is difficult to derive an analytic form of this quantity, we can calculate it numerically. Fig. 9(a) shows an example of this quantity obtained numerically. For practical applications we want to have an analytic form of this quantity even when it may be somewhat approximate. To derive an approximate but analytic form we at first notice that the integrand of equation (67) has a significant contribution only when and are small. By using this fact we Taylor-expand both sides of equation (69) in terms of , and , and retain only up to the second-order terms to obtain
The quantity of equation (67), obtained numerically by using this expression, was found to have almost no difference from , meaning that equation (71) is a very good approximation to equation (69). Even for this expression of equation (71), the derivation of an analytic form for the integral of equation (67) is difficult. In this situation we introduce a drastic approximation of replacing the quantity of equation (71) by its mean
and use the value of thus obtained in equation (67). Writing the result thus obtained as we have
For practical applications we can approximate this expression further as
Fig. 9(b) is a plot of this quantity. By comparing it with Fig. 9(a), we see that this simple analytic form is a reasonably good approximation.
Equation (74) is equation (11) of the main text. Equation (12) can be obtained from equations (68) and (70) by assuming that and therefore both and are small quantities of the same order.
For the purpose of deriving equation (13), we evaluate the standard deviations of the quantities appearing in equations (6) and (7). After some calculations they are obtained as
During the derivation of this equation we assumed that the values of the diffraction intensity density at the neighbouring Shannon pixels on a circle are not correlated. This means that the distance in the wavenumber space between the neighbouring Shannon pixels must be larger than the correlation length kC of equation (10). Because the correlation fades quickly beyond kC, we assume that equation (75) still holds for this distance. Therefore we have equation (15). The standard deviation of the correlation pattern is then given by
from which we have equation (13).
We now proceed to derive equation (22). We study how the value of the product NAsN is determined for common circles to be identified. At first we note that a sum of Poisson distributions is a Therefore, a sum of NA two-dimensional patterns is equal to one sample of a two-dimensional pattern whose expected photon number is given by , which we shall write as in equation (77) below. We shall also write the mean of on a sphere as . Now we consider the distribution of values of the difference (1) on and (2) off the common circle. (1) On the common circle, D is given by a difference of two integers, both given by a Its mean vanishes of course. The mean of its absolute value may be estimated approximately as
(2) Off the common circle, the values of and are both distributed according to the two distributions, the
and the exponential distribution. The composite distribution is given byFor this distribution, the mean of the absolute value is estimated approximately as follows:
Let us assume that, for clear recognition of | D |on out of | D |off, the following condition must be satisfied up to k = kN: , which means
Acknowledgements
This study has been supported by the `X-ray
Utilization Research Project' of the Ministry of Education, Culture, Sports, Science and Technology of Japan (MEXT). We are heavily indebted to Dr Yasumasa Joti for his help during the revision of the manuscript.References
Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N. & Bourne, P. E. (2000). Nucleic Acids Res. 28, 235–242. Web of Science CrossRef PubMed CAS Google Scholar
Bortel, G. & Faigel, G. (2007). J. Struct. Biol. 158, 10–18. Web of Science CrossRef PubMed CAS Google Scholar
Bortel, G., Faigel, G. & Tegze, M. (2009). J. Struct. Biol. 166, 226–233. Web of Science CrossRef PubMed CAS Google Scholar
Elser, V. (2003). J. Opt. Soc. Am. A, 20, 40–55. Web of Science CrossRef Google Scholar
Elser, V. (2009). IEEE Trans. Inf. Theory, 55, 4715–4722. Web of Science CrossRef Google Scholar
Fienup, J. R. (1982). Appl. Opt. 21, 2758–2769. CrossRef CAS PubMed Web of Science Google Scholar
Fung, R., Shneerson, V., Saldin, D. K. & Ourmazd, A. (2009). Nat. Phys. 5, 64–67. Web of Science CrossRef CAS Google Scholar
Huldt, G., Szoeke, A. & Hajdu, J. (2003). J. Struct. Biol. 144, 219–227. Web of Science CrossRef PubMed CAS Google Scholar
Loh, N. D. et al. (2010). Phys. Rev. Lett. 104, 225501. Web of Science CrossRef PubMed Google Scholar
Loh, N. D. & Elser, V. (2009). Phys. Rev. E, 80, 026705. Web of Science CrossRef Google Scholar
Mimura, H. et al. (2010). Nat. Phys. 6, 122–125. Web of Science CrossRef CAS Google Scholar
Nishino, Y., Takahashi, Y., Imamoto, N., Ishikawa, T. & Maeshima, K. (2009). Phys. Rev. Lett. 102, 018101. Web of Science CrossRef PubMed Google Scholar
Seibert, M. M. et al. (2011). Nature (London), 470, 78–81. Web of Science CrossRef CAS PubMed Google Scholar
Shneerson, V. L., Ourmazd, A. & Saldin, D. K. (2008). Acta Cryst. A64, 303–315. Web of Science CrossRef CAS IUCr Journals Google Scholar
Singer, A. & Shkolnisky, Y. (2011). SIAM J. Imag. Sci. 4, 543–572. Web of Science CrossRef Google Scholar
Sousa, M. C., Kessler, B. M., Overkleeft, H. S. & McKay, D. B. (2002). J. Mol. Biol. 318, 779–785. Web of Science CrossRef PubMed CAS Google Scholar
Weaver, L. H. & Matthews, B. W. (1987). J. Mol. Biol. 193, 189–199. CrossRef CAS PubMed Web of Science Google Scholar
Wilson, A. J. C. (1949). Acta Cryst. 2, 318–321. CrossRef IUCr Journals Web of Science Google Scholar
Yang, C., Wang, Z. & Marchesini, S. (2010). Proc. SPIE, 7800, 78000P1–78000P10. Google Scholar
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.