Received 11 May 2012
Continuous X-ray diffractive field in protein nanocrystallography
(a) ARC Centre of Excellence for Coherent X-ray Science, School of Physics, The University of Melbourne, Melbourne, Victoria 3010, Australia, and (b) ARC Centre of Excellence for Coherent X-ray Science, CSIRO Materials Science and Engineering, and Preventative Health Flagship, Parkville, Victoria 3052, Australia
The recent development of X-ray free-electron laser sources has created new opportunities for the structural analysis of protein nanocrystals. The extremely small sizes of the crystals, as well as imperfections of the crystal structure, result in an interference phenomenon in the diffraction pattern. With decreasing crystallite size the structural imperfections play a role in the formation of the diffraction pattern that is comparable in importance to the size effects and should be taken into account during the data analysis and structure reconstruction processes. There now exists a need to develop new methods of protein structure determination that do not depend on the availability of good-quality crystals and that can treat proteins under conditions close to the active form. This paper demonstrates an approach that is specifically tailored to nanocrystalline samples and offers a unique crystallographic solution.
Growing crystals of membrane proteins to the size suitable for high-resolution X-ray structure analysis has always been a major challenge in structural biology (Garcia-Ruiz, 2003; Malkin & Thorne, 2004; Baker, 2010). Membrane proteins often tend to form nanoscale crystals (Caffrey, 2003; Fromme & Spence, 2011) which diffract weakly even at synchrotron X-ray sources. The need for long exposure times and, as a result, the delivery of high X-ray doses to record high-resolution data lead to extensive damage of the sample. The recent development of extremely bright X-ray free-electron laser (XFEL) sources has created an opportunity for the structure analysis of such protein nanocrystals (Chapman et al., 2011; Boutet et al., 2012; Johansson et al., 2012; Koopmann et al., 2012). The `diffract and destroy' approach of XFEL experiments, in which diffraction data are collected from a stream of protein nanocrystals of varying sizes and shapes in random orientations, requires, however, the development of new methods for the structural analysis of proteins.
The size and quality of the protein crystals used in structural analysis are of particular importance. The implications of structural imperfections for solving the structure of proteins by conventional techniques have been identified and analysed (Faure et al., 1994; Mizuguchi et al., 1994; Eyal et al., 2005; Welberry, 2004; Welberry et al., 2011). It has been noted that conventional protein crystallography relies almost entirely on the analysis of Bragg diffraction data. Scattering between the Bragg reflections, however, contains information in addition to that obtainable from the Bragg peaks and should be included in the structural analysis of proteins.
The situation is much more complicated for nanoscale crystals. The extremely small size of the crystals, as well as imperfections of the crystal structure, result in an interference phenomenon in the diffraction pattern and influence the shape of the Bragg peaks, as well as the scattering between them (Vainshtein, 1966; Welberry, 2004; Rafaja et al., 2000, 2004). The mechanisms that suppress the growth of crystals are poorly understood. Some studies suggest that the incorporation of errors leads to a `poisoning' of the surface (Feher & Kam, 1985; Grant & Saville, 1994). The presence of the lipid or detergent components of the crystallization solution can limit the ideal long-range packing in the formation of crystals. During crystal growth, structural defects and chemical impurities are incorporated until they accumulate to such an extent on the surface of the crystal that further building of a well aligned crystalline lattice becomes energetically unfavourable. If we consider the crystal as a cube of size $L = na$, where $a$ is the unit-cell parameter and $n$ is the number of the unit cells along the $a$ direction, then the ratio of the number of unit cells on the crystal surface, $N_S$, to the total number of unit cells, $N_V = n^3$, can be estimated to be $N_S/N_V = 1 - [1 - 2(a/L)]^3$. Fig. 1 shows that the fraction of unit cells at the surface of the crystal increases rapidly as the crystal size decreases. As examples, 400 nm-size crystals with 100, 200 and 300 Å unit cells contain, respectively, approximately 14, 27 and 39% surface unit cells. It is clear that the effects due to disorder of a large number of surface unit cells cannot be neglected in the structural analysis of the nanoscale crystals. Moreover, structural imperfections play a role in the formation of the diffraction patterns that is comparable in importance with the size effects if the crystallite size is reduced.
Consequently, the extraction of the integrated intensities of Bragg reflections from the diffraction pattern is a major problem in protein nanocrystallography.
Figure 1. $N_S/N_V$ ratio as a function of crystal size for different unit-cell parameters: (1) a = 10.0 nm, (2) a = 20.0 nm and (3) a = 30.0 nm.
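The surface-fraction estimate for a cubic crystal can be checked numerically. The following is a minimal sketch, assuming the cube model stated in the text; the function name `surface_fraction` and its arguments are illustrative and do not come from the original work.

```python
def surface_fraction(crystal_size_nm: float, cell_nm: float) -> float:
    """Fraction of unit cells lying on the surface of a cubic crystal.

    Implements N_S / N_V = 1 - (1 - 2 a / L)^3, where L is the crystal edge
    and a is the unit-cell parameter (same units for both).
    """
    ratio = cell_nm / crystal_size_nm
    # The cube model needs at least two unit cells along each edge.
    if not 0.0 < ratio <= 0.5:
        raise ValueError("expected 0 < a/L <= 0.5")
    return 1.0 - (1.0 - 2.0 * ratio) ** 3


if __name__ == "__main__":
    # 400 nm crystals with 10, 20 and 30 nm (100, 200 and 300 Angstrom) cells.
    for cell in (10.0, 20.0, 30.0):
        print(f"a = {cell:4.1f} nm: {100.0 * surface_fraction(400.0, cell):.1f}% surface cells")
```

For a 400 nm crystal this gives roughly 14, 27 and 39% for the three unit-cell sizes, in line with the values quoted in the text.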
A new method for the structural analysis of protein nanocrystals has recently been demonstrated (Kirian et al., 2010, 2011; Spence et al., 2011). The total intensity of X-rays scattered from the nanocrystals can be described as a simple product of two factors: the form factor of the crystal (CFF) and the molecular form factor (MFF) of the protein. Assuming that the MFF is identical for all crystals but the CFF depends on the size and shape of a crystal, one can derive the electron density of the protein molecule by properly averaging the CFF over all crystals and using iterative phasing methods. The intensities of reflections are integrated within the Wigner-Seitz cell around each Bragg peak, which provides sufficient information for the direct phasing of the MFF.
The method further assumes that crystal size effects dominate in the formation of the diffraction pattern of the protein nanocrystals. It is assumed that the density of structural defects in protein nanocrystals is small and that such effects can, therefore, be neglected or corrected by the static Debye-Waller factor (Kirian et al., 2010). Our analysis suggests, however, that structural imperfections may play an important role in the formation of the diffraction pattern of nanocrystals. Moreover, a reason why crystals may not grow beyond micron size is that they are not ideal, suggesting strongly that the structural imperfections should be included in the structural analysis.
Here we present an approach in which the diffraction pattern of the protein nanocrystals is regarded as a continuous function of the scattering vector, rather than as a discrete set of Bragg reflections. It has been suggested that the diffraction data that are available using XFEL sources offer the possibility of direct phasing of the continuous scattering from single proteins or protein nanocrystals of varying sizes and orientations (Fung et al., 2009; Kirian et al., 2010). With this in mind, one can build a continuous three-dimensional diffraction pattern using a sufficient quantity of two-dimensional diffraction snapshots, collected from a stream of randomly oriented nanocrystals. Coherent diffractive imaging (CDI) methods, which have generally been developed for the analysis of non-periodic finite objects (Fienup, 1982; Miao et al., 1998; Spence, 2004; Chapman et al., 2006; Quiney, 2010) are then applied to reconstruct the electron density of the protein molecule. We will demonstrate that the variation of crystal shapes and structural imperfections can be incorporated into the reconstruction process as an additional source of partial coherence of the X-rays scattered from the sample. It has recently been shown that the explicit incorporation of models of partial coherence into the solution of the phase problem significantly improves the quality of reconstruction in diffractive imaging applications (Whitehead et al., 2009; Chen et al., 2009; Abbey et al., 2011; Quiney & Nugent, 2011).
Diffraction data collected using an XFEL source consist of two-dimensional diffraction patterns (snapshots) of randomly oriented particles (nanocrystals). Each snapshot represents the diffraction pattern from a single particle of finite size and a unique orientation, but each particle possesses its own size, orientation and degree of structural disorder. The structure of a protein molecule is determined from the three-dimensional diffracted intensity distribution by applying the iterative phase retrieval technique, which should be modified to incorporate a priori information pertaining to the experimental conditions. Several critical questions, which influence the validity of our method, are discussed below.
In recent years, considerable attention has been given to the problem of the analysis of new types of data obtained using XFEL sources (Shneerson et al., 2008; Loh & Elser, 2009; Fung et al., 2009; Schwander et al., 2010; Kirian et al., 2010; Spence et al., 2011; Tokuhisa et al., 2012). It has been shown that the orientations of individual particles or even single molecules can be determined from the XFEL diffraction data (Fung et al., 2009; Kirian et al., 2010). In the case of nanocrystals, this process is more straightforward, since it is based on indexing very strong Bragg reflections. It has been shown that the existing auto-indexing programs are capable of determining unit-cell parameters and orientations with high accuracy (Kirian et al., 2010).
The scope of this article is to analyse the influence of structural imperfections on the three-dimensional diffraction pattern obtained from a set of randomly oriented, structurally imperfect nanocrystals of different shapes. In our analysis we follow earlier work (Kirian et al., 2010) which describes techniques for merging two-dimensional snapshots into a three-dimensional intensity distribution volume.
The successful application of any iterative algorithm depends on the availability of an accurate model for the data acquisition process. A common deviation from the ideal conditions arises because of the impact of noise or low photon counts on the diffraction data. Recent work has investigated the impact of these factors on the quality of reconstruction and explored the point at which the iterative scheme fails to converge to a correct solution (Williams et al., 2007). It has been shown that the critical number of scattered photons for the successful reconstruction of a two-dimensional projection is and the signal-to-noise ratio is , where SNR is defined by
In equation (1), $I_k$ is the number of photons collected in pixel $k$, $P_k$ is the signal due to photons scattered into the detector by alien sources, and $B_k$ is the bias level arising from the detector and electronics. We will base our simulations on these critical parameters.
Several factors have a strong influence on the setup of a CDI experiment.
(a) The diffraction pattern should satisfy the oversampling criterion
(b) The far-field condition, $Z \gg d^2/\lambda$, where $d$ is the size of the object (nanocrystal), $\lambda$ is the wavelength of the X-ray source and $Z$ is the object-detector distance.
(c) The discrete Fourier transform relation for the sampling intervals, and , in the object and detector planes, respectively,
where N is the number of pixels along one side of the detector. Meanwhile, the crystallographic resolution is defined as , where is the maximum Bragg angle that can be measured (see Fig. 2a), or
Figure 2. (a) Schematic of the CDI experiment. (b) Vector representation of the position of the atom in the crystal lattice.
(d) Since our approach is based on the analysis of the continuous three-dimensional diffraction pattern, correct sampling of intensities between the Bragg reflections is also important. The number of points between the Bragg reflections can be defined as (see Fig. 2a)
where . Consider the crystal as a cube of size $L = na$, where $a$ is the unit-cell parameter and $n$ is the number of unit cells along the $a$ direction. Consequently, the number of fringes between the (h00) Bragg reflections is equal to $p = n - 2$. Our analysis shows that the critical number of measured points, $m_C$, is given by $m_C = 2p + 1$, corresponding to two measured points for each subsidiary fringe.
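The fringe count p = n - 2 can be verified with the textbook interference function of a finite row of n identical cells, sin^2(pi n h) / sin^2(pi h). The sketch below is illustrative only: it assumes an ideal one-dimensional row, and the helper names (`laue_function`, `count_subsidiary_maxima`, `critical_points`) are hypothetical.

```python
import numpy as np


def laue_function(h, n):
    """Interference function sin^2(pi n h) / sin^2(pi h) for a row of n cells."""
    h = np.asarray(h, dtype=float)
    num = np.sin(np.pi * n * h) ** 2
    den = np.sin(np.pi * h) ** 2
    out = np.full_like(h, float(n * n))  # limiting value at integer h
    mask = den > 1e-20
    out[mask] = num[mask] / den[mask]
    return out


def count_subsidiary_maxima(n, samples=20001):
    """Count local maxima between two neighbouring Bragg peaks (h = 0 and h = 1)."""
    # Sample between the first and last zero, h = 1/n and h = (n - 1)/n,
    # so that the Bragg peaks themselves are excluded.
    h = np.linspace(1.0 / n, (n - 1.0) / n, samples)
    g = laue_function(h, n)
    inner = g[1:-1]
    is_peak = (inner > g[:-2]) & (inner >= g[2:])
    return int(np.count_nonzero(is_peak))


def critical_points(n):
    """Critical number of measured points, m_C = 2 p + 1 with p = n - 2."""
    return 2 * (n - 2) + 1


if __name__ == "__main__":
    for n in (7, 12):
        print(n, count_subsidiary_maxima(n), critical_points(n))
```

For a row of seven cells the count is five fringes, so the criterion in the text asks for at least eleven sampled points between neighbouring (h00) reflections.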
In this section we consider the idealized case of diffraction from a single nanocrystal of finite size. Although the influence of structural imperfection on the diffraction pattern of protein crystals has been analysed previously (Mizuguchi et al., 1994; Rafaja et al., 2000, 2004; Welberry, 2004; Welberry et al., 2011), it is useful to summarize the main features in order to provide necessary information for the analysis that follows.
We consider a cluster of proteins located in the unit cell as the constituent elements of the crystal structure. We call such a group of proteins a `molecular cluster'. In other words, each node of the three-dimensional periodic lattice corresponds to the ideal position of the molecular cluster. We further assume that the molecular cluster consists of identical protein molecules, replicated within the unit cell according to the symmetry of the crystal.
where is the position of the lth unit cell in the crystal or, according to our definitions, the position of the lth molecular cluster in the crystal. The position of the kth atom in the lth molecular cluster is denoted , $f_k(\mathbf{q})$ is the atomic scattering factor of the kth atom, $\mathbf{q}$ is the scattering vector, and summations are performed over all atoms, labelled k, and all unit cells, labelled l. Scattering from the molecular cluster is determined by the so-called MFF defined by
Suppose all molecular clusters in all unit cells are identical and, therefore, for all clusters in the crystal. Using * to signify the complex conjugate, the intensity of X-rays scattered from an ideal crystal is obtained through the relation , so that
In equation (9), represents the interference function produced by the three-dimensional periodic lattice of molecular clusters. For large crystals, the scattered intensity is negligible everywhere except near the positions of the Bragg reflections.
To incorporate structural imperfections into the analysis of the diffraction data we define the positions of atoms as , where is the average position of the kth atom in the molecular cluster, which is identical for all clusters, and denotes the displacement of the kth atom from the average position in the lth molecular cluster (Fig. 2b).
In equation (10b), is the weighting factor which describes the degree of the coherence of two molecular clusters located at positions and . It is equal to unity for an ideal crystal, and equal to zero for the totally incoherent formation of molecular clusters. It is defined by
where angle brackets denote an average over all molecular clusters of the crystal. As one can see from equation (10b), the behaviour of the interference function, , is governed by the form of . The factors describe the structural imperfections, such as the loss of the coherence of molecular clusters due to their misorientations (Rafaja et al., 2004),
Here, $m = |j - l|$, $D_m$ is the distance between two molecular clusters located at positions and , and denotes the `texture' or `preferred orientation' function which characterizes the angular distribution of the molecular clusters. The mean-square displacement (MSD) of the mth neighbouring molecular cluster is denoted , and can be defined as (Vainshtein, 1966) where is the MSD of the nearest-neighbour molecular clusters.
According to equation (10b), the presence of structural imperfections in protein nanocrystals leads to broadening of diffraction peaks in addition to the size effects. Fig. 3 shows the interference function calculated for the ideal and the partially disordered protein nanocrystal. In our analysis, we assumed that , , and , where $d_a$, $d_b$ and $d_c$ are the interplanar distances along the crystallographic directions a, b and c, respectively, and , and are variable parameters. As one can see from Fig. 3(b), the broadening effect is more pronounced at high-q regions of the diffraction pattern. Detailed analysis of the interference function, , however, shows that the appearance of diffraction fringes between the Bragg reflections is extremely sensitive to the function, even at the low-q regions (Fig. 4).
Figure 3. The (hk0) plane of the interference function calculated for (a) the ideal nanocrystal and (b) the partially disordered nanocrystal. The is calculated using equation (12b), assuming that , where $d_{100}$ is the interplanar distance along the (100) direction of the crystal lattice and = 0.005.
Figure 4. The variation of the interference function, , between the (500) and (600) Bragg reflections for the hexagonal unit cell, a = b = 281, c = 165 Å. The is calculated using equation (12b), assuming that , where $d_{100}$ is the interplanar distance along the (100) direction of the crystal lattice. (1) = 0.0, (2) = 0.002, (3) = 0.005, (4) = 0.008.
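The qualitative behaviour described here (peak decay and fringe smearing that grow with q) can be reproduced with a one-dimensional toy model. This is a sketch under stated assumptions rather than the equation used in the paper: the coherence factor is taken to be Gaussian, exp(-q^2 sigma_m^2 / 2), with a mean-square displacement that grows linearly with separation, sigma_m^2 = m sigma_1^2, and sigma_1 = eps * a. The name `interference_1d` and the parameter `eps` are hypothetical.

```python
import numpy as np


def interference_1d(h, n, eps):
    """Interference function of a row of n cells with correlated disorder.

    h   : reciprocal-lattice coordinate (q = 2 pi h / a)
    n   : number of unit cells in the row
    eps : nearest-neighbour r.m.s. displacement in units of the cell length

    Assumed model: G(h) = n + 2 * sum_m (n - m) * gamma_m(h) * cos(2 pi h m),
    with gamma_m(h) = exp(-2 pi^2 h^2 eps^2 m).
    """
    h = np.atleast_1d(np.asarray(h, dtype=float))
    m = np.arange(1, n)  # separations between cells
    gamma = np.exp(-2.0 * np.pi ** 2 * eps ** 2 * np.outer(h ** 2, m))
    phase = np.cos(2.0 * np.pi * np.outer(h, m))
    return n + 2.0 * np.sum((n - m) * gamma * phase, axis=1)


if __name__ == "__main__":
    h = np.linspace(5.0, 6.0, 11)
    for eps in (0.0, 0.002, 0.005, 0.008):
        print(eps, np.round(interference_1d(h, 7, eps), 3))
```

With `eps = 0` the function reduces to the ideal finite-lattice result (n^2 at each Bragg position); for `eps > 0` the peak height drops faster at higher orders, which is the trend the figures illustrate.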
In our analysis, we have assumed that the molecular clusters consist of identical protein molecules. Protein crystals are, however, often grown as complexes of two or more proteins, which introduces so-called conformational disorder. Conformational disorder does not affect the shape of Bragg peaks, but reduces the intensity of reflections by the additional Debye-Waller factor and also leads to continuous diffuse scattering in inter-Bragg regions (Vainshtein, 1966; Welberry, 2004).
The interference phenomenon which can be observed in the diffraction patterns of nanocrystals depends on contributions to (Vainshtein, 1966; Welberry, 2004). As follows from the analysis, the structural imperfections of nanocrystals lead to broadening of the Bragg reflections, to decay of their peak intensities and to modification of the scattering between the Bragg reflections. In conventional protein crystallography, critical assumptions are usually made: the density of structural defects is small and displacements of atoms are mutually uncorrelated. In this case, the static Debye-Waller factor can be used to account for the influence of structural imperfections on the diffraction pattern. The intensity of X-rays scattered from the partially disordered nanocrystal is transformed into the form , where B is the Wilson factor (Giacovazzo, 2011). Generally, however, is a non-symmetrical, continuous function that exists in the entire range of and not just around the Bragg reflections. The averaging of intensities only around the positions of the Bragg reflections ignores most of the effects of the structural imperfections, leading to the loss of important structural information. From this point of view, extracting the integrated intensities of the Bragg reflections from the diffraction pattern is a formidable task.
To overcome this problem one should consider the diffraction pattern from the partially disordered protein nanocrystals not as a discrete set of intensities of the Bragg reflections, but as a continuous function of $\mathbf{q}$. In this case, the approach that is now known as the CDI technique can be applied for structure analysis of the protein nanocrystals. The ability of the CDI approach to reconstruct the object comprised of a small number of periodic elements has recently been demonstrated (Chen et al., 2009). The CDI reconstruction technique is based on Fienup's extensions (Fienup, 1982) of the algorithm first proposed by Gerchberg & Saxton (1972). The Gerchberg-Saxton-Fienup (GSF) algorithm propagates a numerical representation of a scalar wavefield between `object' and `detector' planes using instances of the Fourier transformation. The wavefield in these fixed planes is constrained by the application of a priori information, and the iteration between planes is continued until self-consistency is achieved (Fienup, 1982). The continuous diffractive field produced by the nanocrystals contains all the information necessary for structure analysis and, unlike conventional crystallography, provides a unique solution from a single set of data (Bates, 1982). According to the analysis presented in this section, may be regarded as describing the degree of coherence of the X-rays scattered from two molecular clusters, located at positions and . The structural imperfections in a protein nanocrystal can, therefore, be considered as a source of partially coherent X-rays, scattered from the ensemble of molecular clusters arranged in a three-dimensional periodic lattice. Recent developments in CDI show that the coherent properties (spatial and temporal) of X-ray sources can be incorporated directly into the CDI reconstruction process (Whitehead et al., 2009; Chen et al., 2009; Abbey et al., 2011).
Moreover, the radiation damage caused by the interaction of X-rays of the XFEL source with the protein molecules can also be incorporated into the structural analysis by introducing models of partial coherence (Quiney & Nugent, 2011); we have utilized the same general approach in our analysis. Furthermore, and, therefore, the electron density distributed in the unit cell of the protein nanocrystal can be reconstructed from measured intensities by introducing models of into the algorithm.
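The alternation between object and detector planes can be illustrated with the simplest member of the GSF family, the error-reduction iteration. The sketch below is a generic, fully coherent version on a two-dimensional array; it assumes a real, non-negative object with a known support and is not the partially coherent scheme developed in this paper. The function name `error_reduction` and its arguments are hypothetical.

```python
import numpy as np


def error_reduction(measured_amplitude, support, n_iter=200, seed=0):
    """Error-reduction phase retrieval (illustrative sketch).

    measured_amplitude : square root of the measured intensity (FFT layout)
    support            : boolean array, True where the object may be non-zero
    Returns the reconstructed density and the Fourier-space error per iteration.
    """
    rng = np.random.default_rng(seed)
    # Start from random phases and project onto the object-plane constraints.
    start = measured_amplitude * np.exp(2j * np.pi * rng.random(measured_amplitude.shape))
    density = np.where(support, np.clip(np.fft.ifftn(start).real, 0.0, None), 0.0)

    norm = np.linalg.norm(measured_amplitude)
    errors = []
    for _ in range(n_iter):
        field = np.fft.fftn(density)
        errors.append(np.linalg.norm(np.abs(field) - measured_amplitude) / norm)
        # Detector plane: keep the current phases, impose the measured moduli.
        field = measured_amplitude * np.exp(1j * np.angle(field))
        trial = np.fft.ifftn(field)
        # Object plane: real, non-negative and confined to the support.
        density = np.where(support, np.clip(trial.real, 0.0, None), 0.0)
    return density, errors


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    obj = np.zeros((32, 32))
    obj[12:20, 10:22] = rng.random((8, 12)) + 0.5  # small object, zero padded
    rec, errs = error_reduction(np.abs(np.fft.fftn(obj)), obj > 0, n_iter=100)
    print(f"error: first {errs[0]:.3e}, last {errs[-1]:.3e}")
```

The Fourier-space error of this iteration cannot increase from one cycle to the next, which makes it a convenient baseline before additional constraints, such as a model of partial coherence, are introduced.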
In this section, we assume that sufficient two-dimensional diffraction patterns of randomly oriented particles have been collected to build a three-dimensional intensity distribution volume. Each snapshot represents the diffraction pattern from a single particle of finite size and a unique orientation, but each particle has its own size, orientation and degree of disorder. The orientation of each nanocrystal is determined from the two-dimensional diffraction pattern by the indexing approach presented in Kirian et al. (2010). The diffraction patterns for each orientation are accumulated to achieve the critical number of scattering photons, , from which the three-dimensional diffraction pattern is ultimately constructed. We also assume that all of the criteria described in §2 are satisfied.
We consider diffraction from the set of M distinct nanocrystals of different shapes , where defines the shape of the mth nanocrystal. We assume that each nanocrystal fits a box, , with dimensions , formed by the ideal unit cell of the nanocrystal translated periodically along the lattice vectors , and , shown in Fig. 5. The position of the jth atom of the mth crystal in the lattice is defined as , or , where is the average position of the jth atom in the crystal, is the origin of the lth unit cell (molecular cluster) in , is the ideal position of the kth atom in the lth molecular cluster and denotes the displacement of the jth atom from the average position in the mth crystal (Fig. 2b). Each unit cell of is assigned an occupation factor, , which is equal to unity if and zero otherwise.
Figure 5. Schematic of the crystal structure representation; the (ab) plane of the box is shown.
Here, the summation is performed over all atoms located in , $I_0(t)$ is the time-dependent intensity of the incident pulse, $f_{jm}(\mathbf{q},t)$ is the elastic scattering factor of the jth atom in the mth crystal, and .
The integrated intensity, , of X-rays scattered from a set of nanocrystals of different shapes can then be defined as
Angle brackets denote an average over time, all crystal shapes and all structural distortions. As one can see from equations (14) and (15), all of the structural information, specified by the ideal nuclear positions of atoms, is contained in , while all structural imperfections are combined within . One may determine one of these quantities from the measurements of if the other is known. The goal of our presentation is the structural analysis of proteins to determine . Since is a continuous function of , one can also characterize the structural imperfections from measured intensities, , in crystals with known ideal (average) structure. We consider potential applications in the following examples.
We assume that the density of structural defects is small and, therefore, the influence of such defects on the diffraction pattern can be neglected or taken into account by the static Debye-Waller factor, $\exp(-Bq^2)$. The total intensity, , can be rewritten in the form
where and is the average CFF. Since the CFF is the same in the neighbourhood of every Bragg reflection for a given crystal, equation (17) can be transformed into the same form discussed in Spence et al. (2011) and Kirian et al. (2010).
The integrated intensity, , can be rewritten in the form
This case is the target of our analysis. We consider only sources of structural disorder, such as mutual displacement or misorientations of molecular clusters. The current analysis does not include conformational disorders, though these are readily incorporated into the analysis. The integrated intensity, , is represented in the form
where . The elements of the matrix describe the degree of coherence of two waves, scattered from two molecular clusters separated by the distance . Consequently, the structure of the protein crystal can be derived from by introducing models of coherency, .
According to the coherent mode formulation of coherence theory, proposed by Wolf (1982), the relationship between the coherent properties of molecular clusters and a far-field scattered intensity can be represented in terms of a modal expansion of the form
where are real, non-negative numbers representing the occupancy of the mode and M is the number of such modes. The modes are themselves mutually incoherent. It has recently been demonstrated (Whitehead et al., 2009; Chen et al., 2009; Quiney & Nugent, 2011; Abbey et al., 2011) that the incorporation of models of partial coherence into CDI analysis significantly improves the quality of the reconstruction; we have utilized the same approach in our analysis here. We consider only one type of structural imperfection, namely a type of distortion of the crystal lattice caused by correlated displacements of the molecular clusters from their ideal positions [equation (12b)]. We used this type of distortion as an illustrative example and as a tool to explore the generic properties of the approach.
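The idea behind a modal expansion can be shown on a small numerical example: a pairwise coherence matrix is diagonalized, and the intensity computed from the full double sum is recovered as an incoherent sum over modes weighted by their occupancies. This is a minimal one-dimensional sketch; the exponential form of the coherence matrix and all names (`coherence_matrix`, `intensity_direct`, `intensity_modal`, `decay`) are assumptions made for illustration.

```python
import numpy as np


def coherence_matrix(n, decay):
    """Assumed pairwise coherence factors, gamma_jl = exp(-decay * |j - l|)."""
    idx = np.arange(n)
    return np.exp(-decay * np.abs(idx[:, None] - idx[None, :]))


def intensity_direct(q, positions, gamma):
    """Double sum over clusters: sum_jl gamma_jl exp[i q (R_l - R_j)]."""
    a = np.exp(1j * q * positions)
    return float(np.real(np.conj(a) @ gamma @ a))


def intensity_modal(q, positions, gamma):
    """Same intensity written as an incoherent sum of coherent modes."""
    occupancies, modes = np.linalg.eigh(gamma)  # real symmetric matrix
    a = np.exp(1j * q * positions)
    amplitudes = modes.T @ a  # far-field amplitude of each mode
    return float(np.sum(occupancies * np.abs(amplitudes) ** 2))


if __name__ == "__main__":
    positions = 281.0 * np.arange(7)  # cluster positions in Angstrom
    gamma = coherence_matrix(7, 0.05)
    q = 2.0 * np.pi * 5.3 / 281.0
    print(intensity_direct(q, positions, gamma), intensity_modal(q, positions, gamma))
```

The two routines agree to rounding error, and the eigenvalues returned by `numpy.linalg.eigh` play the role of the real, non-negative mode occupancies.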
The protein molecule photosystem I (Jordan et al., 2001) was used as a target for the structure reconstruction. We simulated both ideal and disordered nanocrystals (200 nm linear size) of photosystem I containing 7 × 7 × 12 unit cells. Each unit cell consists of six protein molecules (the molecular cluster) according to the space group $P6_3$.
The electron-density distribution of the crystal, , can be represented by the convolution
The summation is performed over all L lattice translations (unit cells) of the crystal. In equation (22), denotes the lth translational position of the three-dimensional lattice, $p_l$ is the occupancy factor and is the Dirac delta function; the occupancy factor equals unity when the position is fully occupied by the molecular cluster and zero if the position is vacant. Starting with the central position, , all subsequent translational positions, , of the molecular cluster are , where is the elementary translation vector and denotes the displacement of the molecular cluster from the ideal position; for the ideal crystal . To simulate the partially disordered nanocrystal, a set of 100 random vectors, , is generated for each position using a Gaussian distribution with a mean of , from which one displacement vector is randomly selected. The number of unit cells, $L_m$, for each crystal was randomly generated, with a Gaussian distribution corresponding to a mean length of 200 nm and with a standard deviation of 10%. The three-dimensional intensity distribution volume was calculated as follows. The MFF function, , was calculated using
where N is the number of atoms in the protein molecule, K is the number of symmetry operations, $f_j(\mathbf{q})$ is the atomic scattering factor of the jth atom, is the position of the jth atom and is the kth symmetry operation matrix. The interference function for each crystal, , was calculated for each nanocrystal using
where denotes the convolution operator. The intensity scattered from the crystal was calculated using
Finally, the total intensity was calculated as
where M is the total number of three-dimensional diffraction patterns simulated. The two-dimensional cross section of the three-dimensional intensity distribution, , for the ideal and imperfect photosystem I nanocrystals is shown in Fig. 6.
Figure 6. Simulated diffraction patterns of the photosystem I nanocrystal [(ab) plane]: (a) diffraction from the ideal crystal; (b) diffraction from the set of disordered nanocrystals. (c) A magnified view of a segment from the diffraction patterns of the ideal (left) and the disordered (right) nanocrystals.
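The ensemble simulation can be sketched in one dimension. The code below is an illustrative reduction, not the authors' three-dimensional calculation: each crystal is a row of point-like clusters, successive positions are built by adding the cell length plus a Gaussian displacement to the previous position (an assumed reading of the construction described in the text), and the number of cells per crystal is drawn from a Gaussian with a 10% relative spread. All function and parameter names are hypothetical.

```python
import numpy as np


def lattice_positions(n_cells, cell, sigma, rng):
    """Positions of n_cells clusters; each step is cell + N(0, sigma^2)."""
    steps = cell + rng.normal(0.0, sigma, size=n_cells - 1)
    return np.concatenate(([0.0], np.cumsum(steps)))


def interference(q, positions):
    """Interference function |sum_l exp(i q R_l)|^2 of one crystal."""
    amplitude = np.exp(1j * np.outer(q, positions)).sum(axis=1)
    return np.abs(amplitude) ** 2


def ensemble_interference(q, n_crystals, mean_cells, cell, sigma, rng, rel_spread=0.10):
    """Average the interference function over crystals of varying size and disorder."""
    total = np.zeros_like(q, dtype=float)
    for _ in range(n_crystals):
        drawn = rng.normal(mean_cells, rel_spread * mean_cells)
        n_cells = max(2, int(round(drawn)))  # at least two cells per crystal
        total += interference(q, lattice_positions(n_cells, cell, sigma, rng))
    return total / n_crystals


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cell = 281.0  # Angstrom, a axis of the hexagonal cell quoted in the text
    q = 2.0 * np.pi * np.linspace(4.5, 6.5, 9) / cell
    print(np.round(ensemble_interference(q, 50, 7, cell, 0.005 * cell, rng), 2))
```

Multiplying such an averaged interference function by a molecular form factor would give a one-dimensional analogue of the total intensity; the full calculation additionally requires atomic coordinates and symmetry operations, which are not reproduced here.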
where , is the distribution function which describes the lattice distortion and is the crystal form distribution function, which is the self-convolution of the crystal shape function (§4). Consequently, equation (27) can be written in the form (Vainshtein, 1966)
In order to acquire information about the average shape and dimensions of the crystal, we first applied the conventional CDI algorithm to reconstruct its image assuming that for all j and k. Fig. 7 shows the reconstructed images in the (001) planes for both the ideal (Fig. 7a) and disordered protein nanocrystals (Figs. 7b, 7c), with a resolution of 12 Å. Since the conventional CDI algorithm does not take into account the imperfections of the crystal structure, an excellent quality image of the ideal crystal is reconstructed, while the image of the imperfect crystal shows a pronounced blurring effect. Nevertheless, the shape and the dimensions of the crystal projection can be correctly identified from Figs. 7(b), 7(c). Considering the diversity of individual nanocrystals, however, the average form of the nanocrystal defines the function for further CDI reconstructions.
Figure 7. The (001) projection of the electron density for the ideal (a) and the disordered (b), (c) photosystem I protein crystals, reconstructed at a resolution of 12 Å without incorporation of the partial coherency model into the reconstruction process.
We also generated two data sets of integrated intensities of the Bragg reflections for ideal, $I_i(hkl)$, and disordered, $I_d(hkl)$, protein nanocrystals, using the intensity integration approach presented in Kirian et al. (2010), to analyse them using conventional crystallography. The crystal disorder was modelled by the static Debye-Waller factor, , where $B = 40.0$ Å$^2$. We first conducted a molecular replacement with PHASER (McCoy et al., 2007) using the published photosystem I structure (Protein Data Bank ID 1jb0) as an initial model against the generated structure factors, $F_i(hkl) = [I_i(hkl)]^{1/2}$ and $F_d(hkl) = [I_d(hkl)]^{1/2}$, respectively.
In both cases, the analysis showed the correct position and orientation of the photosystem I molecule in the unit cell. This is the expected result: the position, orientation and shape of the molecule can be determined solely from the low-q data, whereas it is the high-q data that are most sensitive to the structural imperfections. Subsequent constrained structural refinement with REFMAC (Murshudov et al., 1997) failed, however, with extremely high thermal parameters in the range of 40-120 Å$^2$ for most of the atoms and over 100% R/Rfree factors for the disordered data set (Table 3). This is also in good agreement with the results obtained using the conventional CDI algorithm.
Finally, we reconstructed the electron density of the molecular clusters by incorporating models of structural imperfections into the reconstruction process. The common `recipe' adopted for the CDI reconstruction using the partially coherent models has been described in several publications (Whitehead et al., 2009; Chen et al., 2009; Abbey et al., 2011; Quiney & Nugent, 2011). For structure analysis of the partially disordered protein nanocrystals we devised the following iterative procedure:
(i) The reconstruction process starts from the initial guess of a uniform distribution of the electron density in the unit cell, $\rho_0(\mathbf{r}) = Z/V$, where V is the unit-cell volume and Z is the total number of electrons in the unit cell. We use the unit cell of the crystal as the initial support function. Later, during the reconstruction process, the low-resolution envelope of the molecular cluster is used as a support.
(ii) The MFF function is calculated using .
(v) Parameters are optimized by minimizing the error cost function, , using the steepest descent method, where is the measured (simulated) intensity. We used 1 = 2 = 0.002 and 3 = 0.001 as the initial values for .
(vi) The modulus constraint is imposed using .
(vii) With the amplitude updated, the complex functions, , are transformed into real space and the support function is applied to obtain the updated electron density of the molecular cluster distributed in the unit cell of the crystal, .
Steps (ii)-(vii) are repeated and the progress of the reconstruction monitored using an error metric defined as
where K is the scale factor.
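The overall shape of such an iterative loop can be sketched with the standard error-reduction algorithm (Fienup, 1982), which alternates a modulus constraint in reciprocal space with a support constraint in real space. This is a fully coherent, generic version: it does not include the partial-coherence model or the parameter optimization described above. The error metric, the fixed support mask, the test object and all names are assumptions made for illustration.

```python
import numpy as np


def error_reduction(intensity, support, n_iter=50, scale=1.0):
    """Generic error-reduction phase retrieval (fully coherent case).

    intensity : measured (or simulated) diffraction intensity on an FFT grid
    support   : boolean mask, True where the density may be non-zero
    scale     : scale factor K between calculated and measured amplitudes
    Returns the final density and the list of error-metric values.
    """
    target = np.sqrt(intensity)
    # Uniform starting density Z/V; Z is read from the zero-frequency amplitude
    rho = np.full(intensity.shape, target.flat[0] / intensity.size)
    errors = []
    for _ in range(n_iter):
        f_calc = np.fft.fftn(rho)
        # Assumed error metric: normalised misfit between the amplitudes
        misfit = np.sum((target - scale * np.abs(f_calc)) ** 2)
        errors.append(float(np.sqrt(misfit / np.sum(intensity))))
        # Modulus constraint: keep calculated phases, impose measured amplitudes
        f_new = target * np.exp(1j * np.angle(f_calc))
        # Back to real space, then apply support and positivity constraints
        rho_new = np.fft.ifftn(f_new).real
        rho = np.where(support & (rho_new > 0.0), rho_new, 0.0)
    return rho, errors


# Hypothetical test object: two Gaussian blobs inside a 32 x 32 cell
y, x = np.mgrid[0:32, 0:32]
true_rho = (np.exp(-((x - 12) ** 2 + (y - 14) ** 2) / 8.0)
            + 0.6 * np.exp(-((x - 19) ** 2 + (y - 18) ** 2) / 5.0))
support = true_rho > 1e-3
true_rho = np.where(support, true_rho, 0.0)
intensity = np.abs(np.fft.fftn(true_rho)) ** 2

rho, errors = error_reduction(intensity, support, n_iter=100)
print(errors[1], errors[-1])
```

With K fixed at 1, the error metric of this simple scheme cannot increase once the density satisfies the support constraint, which makes it a convenient monitor of progress. Plain error reduction can stagnate, however, so the sketch shows only the structure of the loop and not its convergence behaviour on real data.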
Fig. 8 shows the electron density of the photosystem I molecular clusters reconstructed at a resolution of 4.1 Å. The resulting images clearly show structural details, including the envelopes of photosystem I protein molecules and transmembrane α-helices. As one can see from Fig. 8, the method enables a reconstruction to be obtained from data that are intractable to conventional crystallographic analysis.
Figure 8
The projections of the electron density of the photosystem I molecular cluster, reconstructed with a resolution of 4.1 Å by incorporating models of the partial coherency: (a) (ab) crystallographic plane, (b) (ac) crystallographic plane. (c) and (d) models of the photosystem I molecule, (ab) and (ac) projections, respectively.
The broadening of diffraction peaks due to imperfections of protein nanocrystals determines the resolution achievable in the reconstructed image. The resolution limit is defined by the qmax value corresponding to the largest measured angle for which the diffraction pattern provides prominent reflections. If the Bragg peak becomes so broad that its half-width is greater than q = D, where D is the distance between the pair of nearest-neighbour Bragg reflections, and overlaps significantly with its neighbours, then the interference function becomes almost constant (Fig. 9). This defines the coherence length of the partially disordered nanocrystal that is the boundary between the crystal-type scattering (molecular clusters are scattering coherently) and `gas'-type scattering (molecular clusters are scattering incoherently). For the classes of structural imperfection considered here, the coherence length, LC, is defined by Vainshtein (1966),
where d is the distance between two nearest molecular clusters and . According to equation (30), LC is governed by the degree of disorder, . The `gas'-type scattering for which the function exhibits no maximum occurs if the degree of disorder is given by (Vainshtein, 1966).
Figure 9
The one-dimensional interference function calculated for the nanocrystal with structural disordering.
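The qualitative behaviour of a one-dimensional interference function under increasing disorder can be sketched as follows. The example assumes an infinite one-dimensional lattice with cumulative (second-kind) Gaussian disorder in the nearest-neighbour distance, for which the interference function has a well known closed form; it is not necessarily the model used to compute Fig. 9. The spacing d = 100 Å, the relative disorder values and the function name are hypothetical.

```python
import math


def interference_1d(q, d, sigma):
    """Interference function of an infinite 1D lattice with cumulative
    Gaussian disorder in the nearest-neighbour distance (valid for q > 0).

    q     : scattering-vector magnitude, in units of 2*pi/length
    d     : mean distance between neighbouring molecular clusters
    sigma : standard deviation of that distance
    """
    h = math.exp(-0.5 * (q * sigma) ** 2)  # damping of the neighbour correlation
    return (1.0 - h * h) / (1.0 - 2.0 * h * math.cos(q * d) + h * h)


d = 100.0  # Å, hypothetical mean spacing
for g in (0.05, 0.15, 0.35):  # relative disorder sigma/d
    # Peak heights at the first three Bragg positions q = 2*pi*n/d
    peaks = [interference_1d(2.0 * math.pi * n / d, d, g * d) for n in (1, 2, 3)]
    print(g, [round(p, 2) for p in peaks])
```

As the relative disorder grows, the peak heights fall towards unity, starting with the higher orders, so that the function becomes nearly constant at high q.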
Considering the diffraction pattern of the protein nanocrystal as a continuous function of the scattering vector, q, rather than as a discrete set of Bragg reflections, yields a more complete extraction of structural information from diffraction data in the structural analysis of partially disordered nanocrystals. Our approach is particularly suited to crystals obtained by in vivo crystallization, as this intrinsically leads to the formation of partially disordered crystals (Hempelmann & Marques, 1994; Wolf et al., 1999; Frenkiel-Krispin & Minsky, 2002). The boundary between the continuous and discrete approaches to structural analysis of diffraction by protein nanocrystals is determined primarily by the ratio $N_S/N_V$ (§1). The conventional approach can be applied if the fraction of unit cells adjacent to the surface of the crystal does not exceed 10% of the total number of unit cells, suggesting an effective size limit of 1 µm for a protein crystal with an average unit-cell parameter of 100 Å.
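The order of magnitude of this size limit can be checked with a short calculation. The sketch assumes a cube-shaped crystal built from cubic unit cells and counts the outermost layer of cells as the surface; the crystal shape and the function names are illustrative assumptions.

```python
def surface_fraction(n):
    """Fraction of unit cells in the outermost layer of an n x n x n crystal."""
    if n <= 2:
        return 1.0
    return 1.0 - ((n - 2) / n) ** 3


def min_cells_per_edge(max_fraction=0.10):
    """Smallest n for which the surface fraction does not exceed max_fraction."""
    n = 1
    while surface_fraction(n) > max_fraction:
        n += 1
    return n


cell_edge_angstrom = 100.0  # average unit-cell parameter from the text
n = min_cells_per_edge(0.10)
size_um = n * cell_edge_angstrom * 1e-4  # 1 Å = 1e-4 µm
print(n, size_um)
```

Under these assumptions the 10% threshold is reached at 58 unit cells per edge, a crystal of about 0.6 µm, which is of the same order as the 1 µm limit quoted above.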
We have shown that structural imperfections, such as mutual displacement or misalignment of proteins, particularly at the surface of the nanocrystals, as well as the shape and dimensions of the nanocrystals, play an important role in the formation of a diffraction pattern. We have also shown that an algorithm that accommodates these structural imperfections is able to extract information in cases for which existing conventional methods fail. The analysis proposed here, based on continuous diffraction patterns, offers a unique solution for the structure without the need for additional assumptions or additional data. This may lead to the determination of molecular structures without recourse to molecular replacement strategies, which introduce a natural bias in favour of existing trial structures. Such an approach is a step in the direction of truly ab initio structure determination, which may be made possible by XFEL sources.
The authors acknowledge the support of the Australian Research Council through its Centres of Excellence and Federation Fellowship programmes, and from the CSIRO Preventative Health Flagship, Neurodegenerative Diseases Theme.
Abbey, B., Whitehead, L. W., Quiney, H. M., Vine, D. J., Cadenazzi, G. A., Henderson, C. A., Nugent, K. A., Balaur, E., Putkunz, C. T., Peele, A. G., Williams, G. J. & McNulty, I. (2011). Nat. Photonics, 5, 420-424.
Baker, M. (2010). Nat. Methods, 7, 429-434.
Bates, R. H. T. (1982). Optik, 61, 247-262.
Boutet, S. et al. (2012). Science, 337, 362-364.
Caffrey, M. (2003). J. Struct. Biol. 142, 108-132.
Chapman, H. N., Barty, A., Marchesini, S., Noy, A., Hau-Riege, S. P., Cui, C., Howells, M. R., Rosen, R., He, H., Spence, J. C. H., Weierstall, U., Beetz, T., Jacobsen, C. & Shapiro, D. (2006). J. Opt. Soc. Am. A, 23, 1179-1200.
Chapman, H. N. et al. (2011). Nature (London), 470, 73-77.
Chen, B., Dilanian, R. A., Teichmann, S., Abbey, B., Peele, A. G., Williams, G. J., Hannaford, P., Dao, L. V., Quiney, H. M. & Nugent, K. A. (2009). Phys. Rev. A, 79, 023809.
Eyal, E., Gerzon, S., Potapov, V., Edelman, M. & Sobolev, V. (2005). J. Mol. Biol. 351, 431-442.
Faure, P. et al. (1994). Nat. Struct. Biol. 1, 124-128.
Feher, G. & Kam, Z. (1985). Methods Enzymol. 114, 77-112.
Fienup, J. R. (1982). Appl. Opt. 21, 2758-2769.
Frenkiel-Krispin, D. & Minsky, A. (2002). ASM News, 68, 277-283.
Fromme, P. & Spence, J. C. H. (2011). Curr. Opin. Struct. Biol. 21, 509-516.
Fung, R., Shneerson, V., Saldin, D. K. & Ourmazd, A. (2009). Nat. Phys. 5, 64-67.
Garcia-Ruiz, J. M. (2003). J. Struct. Biol. 142, 22-31.
Gerchberg, R. W. & Saxton, W. O. (1972). Optik, 34, 275-284.
Giacovazzo, C. (2011). Editor. Fundamentals of Crystallography, 3rd ed. Oxford University Press.
Grant, M. L. & Saville, D. A. (1994). J. Phys. Chem. 98, 10358-10367.
Hempelmann, E. & Marques, H. M. (1994). J. Pharmacol. Toxicol. Methods, 32, 25-30.
Johansson, L. C. et al. (2012). Nat. Methods, 9, 263-265.
Jordan, P., Fromme, P., Witt, H. T., Klukas, O., Saenger, W. & Krauss, N. (2001). Nature (London), 411, 909-917.
Kirian, R. A., Wang, X., Weierstall, U., Schmidt, K. E., Spence, J. C. H., Hunter, M., Fromme, P., White, T., Chapman, H. N. & Holton, J. (2010). Opt. Express, 18, 5713-5723.
Kirian, R. A., White, T. A., Holton, J. M., Chapman, H. N., Fromme, P., Barty, A., Lomb, L., Aquila, A., Maia, F. R. N. C., Martin, A. V., Fromme, R., Wang, X., Hunter, M. S., Schmidt, K. E. & Spence, J. C. H. (2011). Acta Cryst. A67, 131-140.
Koopmann, R. et al. (2012). Nat. Methods, 9, 259-262.
Loh, N. D. & Elser, V. (2009). Phys. Rev. E, 80, 026705.
McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C. & Read, R. J. (2007). J. Appl. Cryst. 40, 658-674.
Malkin, A. J. & Thorne, R. E. (2004). Methods, 34, 273-299.
Miao, J., Sayre, D. & Chapman, H. N. (1998). J. Opt. Soc. Am. A, 15, 1662-1669.
Mizuguchi, K., Kidera, A. & Go, N. (1994). Proteins, 18, 34-48.
Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Acta Cryst. D53, 240-255.
Quiney, H. M. (2010). J. Mod. Opt. 57, 1109-1149.
Quiney, H. M. & Nugent, K. A. (2011). Nat. Phys. 7, 142-146.
Rafaja, D., Klemm, V., Schreiber, G., Knapp, M. & Kuzel, R. (2000). Phys. Rev. B, 61, 16144-16153.
Rafaja, D., Klemm, V., Schreiber, G., Knapp, M. & Kuzel, R. (2004). J. Appl. Cryst. 37, 613-620.
Schwander, P., Fung, R., Phillips, G. N. Jr & Ourmazd, A. (2010). New J. Phys. 12, 035007.
Shneerson, V. L., Ourmazd, A. & Saldin, D. K. (2008). Acta Cryst. A64, 303-315.
Spence, J. C. H. (2004). Science of Microscopy, edited by P. W. Hawkes & J. C. H. Spence. New York: Springer.
Spence, J. C. H., Kirian, R. A., Wang, X., Weierstall, U., Schmidt, K. E., White, T., Barty, A., Chapman, H. N., Marchesini, S. & Holton, J. (2011). Opt. Express, 19, 2866-2873.
Tokuhisa, A., Taka, J., Kono, H. & Go, N. (2012). Acta Cryst. A68, 366-381.
Vainshtein, B. K. (1966). Diffraction of X-rays by Chain Molecules. New York: American Elsevier Publishing Co., Inc.
Welberry, T. R. (2004). Diffuse X-ray Scattering and Models of Disorder. Oxford University Press.
Welberry, T. R., Heerdegen, A. P., Goldstone, D. C. & Taylor, I. A. (2011). Acta Cryst. B67, 516-524.
Whitehead, L. W., Williams, G. J., Quiney, H. M., Vine, D. J., Dilanian, R. A., Flewett, S., Nugent, K. A., Peele, A. G., Balaur, E. & McNulty, I. (2009). Phys. Rev. Lett. 103, 243902.
Williams, G., Pfeifer, M., Vartanyants, I. & Robinson, I. (2007). Acta Cryst. A63, 36-42.
Wolf, E. (1982). J. Opt. Soc. Am. 72, 343-351.
Wolf, S. G., Frenkiel, D., Arad, T., Finkel, S. E., Kolter, R. & Minsky, A. (1999). Nature (London), 400, 83-85.