research papers
Accounting for partiality in serial crystallography using ray-tracing principles
(a) Crystal and Structural Chemistry, Bijvoet Center for Biomolecular Research, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands, and (b) M4I Division of Nanoscopy, Maastricht University, PO Box 616, 6200 MD Maastricht, The Netherlands
*Correspondence e-mail: l.m.j.kroon-batenburg@uu.nl
Serial crystallography generates `still' diffraction data sets that are composed of single diffraction images obtained from a large number of crystals arbitrarily oriented in the X-ray beam. Estimation of the reflection partialities, which accounts for the expected observed fractions of diffraction intensities, has so far been problematic. In this paper, a method is derived for modelling the partialities by making use of the ray-tracing diffraction-integration method EVAL. The method estimates partialities based on crystal mosaicity, beam divergence, crystal size and the interference function, accounting for crystallite size. It is shown that modelling of each reflection by a distribution of interference-function weighted rays yields a `still' Lorentz factor. Still data are compared with a conventional rotation data set collected from a single lysozyme crystal. Overall, the presented still integration method improves the data quality markedly. The R factor of the still data compared with the rotation data decreases from 26% using a Monte Carlo approach to 12% after applying the Lorentz correction, to 5.3% when estimating partialities by EVAL and finally to 4.7% after post-refinement. The merging Rint factor of the still data improves from 105 to 56% but remains high. This suggests that the accuracy of the model parameters could be further improved. However, with a multiplicity of around 40 and an Rint of ∼50% the merged still data approximate the quality of the rotation data. The presented integration method suitably accounts for the partiality of the observed intensities in still diffraction data, which is a critical step to improve data quality in serial crystallography.
Keywords: serial crystallography; EVAL; partiality of still data.
1. Introduction
X-ray free-electron lasers and high-brilliance undulator beamlines at synchrotrons have been used to perform serial (femtosecond) crystallography, collecting diffraction data from a large number (thousands up to millions) of micrometre-sized or nanometre-sized crystals (Chapman et al., 2011; Boutet et al., 2012; Redecke et al., 2013; Gati et al., 2014; Demirci et al., 2013). Individual crystals may be hit by an X-ray pulse, thereby producing a diffraction pattern within 10–50 fs before being vaporized by the transferred energy. This principle of `diffraction before destruction' has been demonstrated by experiments on the Linac Coherent Light Source (LCLS) hard X-ray free-electron laser (Chapman et al., 2011). Since the X-ray pulses are shorter than the time it takes for radiation-induced structural changes to occur, this approach of serial crystallography overcomes radiation damage, which has become a major problem at highly brilliant synchrotron sources (Weik et al., 2000; Ravelli & McSweeney, 2000; Burmeister, 2000) when using conventional rotation methods of collecting data from one or very few larger crystals. The diffraction images in serial crystallography are single snapshots of nonrotating crystals: so-called still images. As opposed to conventional rotation data, the reflections are not fully integrated but are partials, except possibly when using future pink XFEL beams (Dejoie et al., 2015). The particular orientation of the crystal determines the extent of this partiality, which is a great unknown in the data-reduction process.
The specific challenges in data processing are the indexing of the stills, the reconstruction of full intensities and the merging of data obtained from different crystals, in addition to the handling of huge amounts of data. Three software packages are available to process serial X-ray diffraction patterns: CrystFEL (White et al., 2012, 2013; White, 2014), cctbx.xfel from the Computational Crystallographic Toolbox (Sauter et al., 2013; Hattne et al., 2014) and nXDS (Kabsch, 2014). For indexing, rotation-method indexing packages such as MOSFLM (Leslie & Powell, 2007), DirAx (Duisenberg, 1992) and LABELIT (Sauter et al., 2004) are being used. In 2010, Kirian and coworkers proposed a Monte Carlo integration method that, by averaging large numbers of diffraction spots, averages out the unknown partialities as well as differences in crystal size, beam and the incident spectrum (Kirian et al., 2010). Thousands of diffraction images are needed for this method to converge (Boutet et al., 2012). It is generally believed (White, 2014) that estimation of partialities could reduce the number of images needed for the Monte Carlo integration method and could improve the data quality. Three approaches have been proposed to estimate partialities. All three use post-refinement to improve the partiality correction factors and scale factors for each image. Kabsch (2014) derived an analytical expression for partiality from a Gaussian mosaic spread function. Comparison of ultrafine-sliced rotation images treated as stills or as normal rotation images gave satisfactory results. Kabsch includes a Lorentz factor for still data explicitly. The still data processing is not as good as one would expect, according to Kabsch. He concludes that this may be caused by two-dimensional rather than three-dimensional profile fits and the lack of other unimplemented corrections. 
White (2014) considers the overlap of reciprocal reflection volumes with a nest of Ewald spheres and calculates partialities from the distance of reciprocal-lattice points to the two limiting Ewald spheres. Using modelled data, White shows that the partiality estimation improves the data, with significant improvement of the statistics upon post-refinement. Most recently, Sauter (2015) and Uervirojnangkoorn et al. (2015) presented a partiality model that is implemented in cctbx.xfel. They calculated the intersection with the Ewald sphere of a spherical reciprocal-lattice point, where the radii of the lattice points are determined by mosaic spread and (asymmetric) beam divergence. Sauter (2015) also includes a parameter for the coherently scattering volume of mosaic blocks. Using this approach on XFEL data with post-refinement of crystal orientations, scale factors and beam parameters, the data are improved in quality as judged from molecular-replacement scores and from structural and anomalous difference maps (Uervirojnangkoorn et al., 2015). Moreover, they show that reliable structures can be obtained with a lower number of images. Unfortunately, these authors do not mention merging R factors. Another correction that potentially improves Monte Carlo integration convergence for nanocrystals is explored by estimation of the crystal sizes and their corresponding diffraction power, as described by Qu et al. (2014). They show that the geometric correction factor, solely based on the maximum of the Laue interference function for each crystal with size Nx × Ny × Nz, is superior to Monte Carlo integration for simulated data. Although the above efforts were made to improve processed serial crystallography still data, many questions still need to be addressed. Why do the data not improve rigorously with the current partiality-correction models? What factors exactly determine the partiality? Which errors dominate the partiality-estimation schemes?
Here, we describe an extension of the EVAL profile-prediction algorithm to process still images. EVAL is a data-reduction method designed for integrating reflection intensities through profile fitting using ray-tracing simulations (Duisenberg et al., 2003; Schreurs et al., 2010). We derived a general interference function that is valid for crystals of any size and effectively includes the shape transform. The diffraction process is simulated by typically 10 000 rays, which are diffracted by an equal number of reciprocal-lattice vectors. In the rotation method, we bring reciprocal-lattice vectors onto the Ewald sphere by rotation around the spindle axis. However, in the still diffraction method we calculate the deviation from the exact Bragg condition for each ray and estimate its contribution to the total diffracted intensity using the interference function. By summation, the partiality of a reflection is obtained and, as we will show, also the still Lorentz factor. To test the approach, we used two still data sets collected on our in-house diffractometer using a single lysozyme crystal: one consisting of consecutive, stepwise stills and one consisting of stills from arbitrary orientations. Both were compared with conventional rotation data collected under the same conditions. We show that for these data sets the reflection partialities can be estimated by the ray-tracing simulation method and that the presented approach significantly improves the mean intensities of the observed reflections.
2. Diffraction theory
Reflection profiles from a crystal in EVAL are simulated by generating ray traces. We consider a crystal to be built up from small crystallites, obtained by dividing the crystal on a three-dimensional grid (sampled from a distribution K), that can have random orientations taken from a mosaic distribution (M). Incident X-rays are emitted from a virtual focus (e.g. a square area F) in direction k0 with respect to the crystallite and with wavelength λ (sampled from a distribution L). A crystallite with a given orientation of the reciprocal-lattice vector d* gives rise to a diffracted ray in direction k1 as determined by the Ewald construction. For F, L, K and M several statistical distributions are available (Schreurs et al., 2010). In the simulation of rotation data the d* vectors are rotated around the spindle axis so as to match the Bragg condition and then touch the Ewald sphere (Fig. 1).
In the case of still diffraction experiments with one particular orientation of the crystal, none of the crystallite lattice vectors are exactly on the Ewald sphere. However, some vectors are within a certain tolerance in angular deviation (∊) of the Bragg angle θ and may give rise to diffracted intensity that is a function of ∊ (see below). The integrated intensity of all vectors from the various crystallites depends on the mosaic spread, the beam size and divergence, the crystal size and the crystallite size itself. The latter corresponds to the coherently diffracting volume of the mosaic blocks, and the total reflected intensity of the crystal is the incoherent sum of all diffracted rays.
2.1. Still diffraction images
The scattered intensity of a crystal at the Bragg angle θ can be thought of as the coherent sum of scattering by s layers of thickness d, according to what we call the James–Buerger theory (James, 1958; Buerger, 1960). The scattered intensity of a single layer is
where F² is the squared structure-factor amplitude, I0 is the incident intensity, p is the polarization factor owing to the reflection, e²/mc² is the Thomson scattering length of one electron and n is the number of unit cells per unit volume. A crystal is made up of tiny crystallites with associated reciprocal-lattice vectors that are spread over an angular range μ, and each of them may not be perfectly oriented to be in the Bragg condition. The s layers within such a crystallite then scatter slightly out of phase and their scattered intensity is given by the interference function
where ∊ is the deviation from the Bragg angle θ, B = 2πdcosθ/λ and Vcrystallite is the reflecting volume of the crystallite.
The James–Buerger theory can be extended by writing the total diffracted intensity as an integral over all possible orientations of the crystallite vectors that make angles of 90° − η with the incident X-ray beam (90° − θ in the Bragg condition) and replacing Vcrystallite by the volume of the crystal V and ∊ by θ − η (see Fig. 2). This results in
where P(η|η0) is the probability distribution of η angles given the angle η0 of the central reciprocal-lattice vector d*(0) of the crystal as obtained from the unit-cell matrix.
The mosaic spread, the divergence of incident rays, the wavelength variations and the crystallite positions being slightly off-centre in a larger crystal are the cause of deviations ∊ for the individual rays. For the discussion here, we will concentrate on the mosaic spread, but the other parameters are accounted for as well in our ray-tracing simulations.
The distribution function can take several forms. Suppose that P(η|η0) is uniform and much wider than the interference function; then the integral over dη in (3) reduces to sπ/B and
where C = I0F²p(e²/mc²)²λ³n². The last term in (4) is familiar: it is the Lorentz factor for rotation in the equatorial plane and is equal to the powder Lorentz factor. Thus, for still images the powder Lorentz factor applies. When P(η|η0) is a normal or an otherwise monotonous distribution, we should explicitly include it in the calculation of (3). However, it cannot be reduced to a simple trigonometric function nor to an erf (see Kabsch, 2014) because of the presence of the sinc function in the integral. Instead, it can be evaluated numerically.
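The numerical evaluation mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the EVAL implementation: the interference function is assumed to have the squared-sinc form sin²(sB∊)/(B∊)², P is taken as a normal distribution with standard deviation `sigma`, and the prefactors C and V are set to 1. The function names and the numerical values (d = 3 Å, Cu Kα wavelength, s = 200, sigma = 0.5°) are hypothetical choices for illustration only.

```python
import numpy as np

def interference(eps, s, B):
    """Assumed interference function sin^2(s*B*eps) / (B*eps)^2, equal to s^2 at eps = 0."""
    x = B * np.asarray(eps, dtype=float)
    # np.sinc(t) is sin(pi*t)/(pi*t), so (s * sinc(s*x/pi))^2 = sin^2(s*x) / x^2
    return (s * np.sinc(s * x / np.pi)) ** 2

def trapezoid(y, x):
    """Plain trapezoidal rule, kept local to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def total_interference(s, B, half_range=2.0, n=400_001):
    """Integral of the interference function over a wide range of eps (radians)."""
    eps = np.linspace(-half_range, half_range, n)
    return trapezoid(interference(eps, s, B), eps)

def still_intensity(eps0, s, B, sigma, n=200_001, span=8.0):
    """Integral of interference(eps) * P(eps | eps0) for a normal P of width sigma."""
    eps = np.linspace(eps0 - span * sigma, eps0 + span * sigma, n)
    p = np.exp(-0.5 * ((eps - eps0) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return trapezoid(interference(eps, s, B) * p, eps)

# Illustrative numbers only (assumptions, not values fitted in the paper)
wavelength, d = 1.5418, 3.0                       # Angstrom
theta = np.arcsin(wavelength / (2.0 * d))         # Bragg angle
B = 2.0 * np.pi * d * np.cos(theta) / wavelength  # B as defined in the text
s, sigma = 200, np.deg2rad(0.5)

print(total_interference(s, B), s * np.pi / B)    # wide-range integral versus s*pi/B
print(still_intensity(0.0, s, B, sigma))          # on-Bragg still intensity, normal P
```

In this sketch the wide-range integral approaches sπ/B, and for a distribution that is broad compared with the interference function the on-Bragg still intensity approaches P(0)sπ/B, which is the limit used in the text.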
2.2. Rotation diffraction images
Rotation images can be regarded as a superposition of many stills separated by an infinitesimal rotation angle ω. The integrated intensity for these is
In a sufficiently large ω scan each d* vector makes a complete pass through the Ewald sphere, so that we can write
where L′ is the duration component of the Lorentz factor L for the rotation experiment (equal to the reflection range in Kabsch, 2014). The rotation Lorentz factor may alternatively be written as 1/[d*(k0 × ω)] (Milch & Minor, 1974). In case of rotation in the equator, (6) reduces to (4). In rotation data, therefore, the specific distribution function P(η) is irrelevant to the integrated intensity and complete reflections are obtained.
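The statement that the distribution function does not affect the integrated rotation intensity can be checked numerically with a small sketch. It assumes the equatorial case, in which a rotation step changes the off-Bragg angle of the central vector by the same amount, a squared-sinc interference function and a normal mosaic distribution; all names and numbers are hypothetical.

```python
import numpy as np

def interference(eps, s, B):
    """Assumed interference function sin^2(s*B*eps) / (B*eps)^2."""
    x = B * np.asarray(eps, dtype=float)
    return (s * np.sinc(s * x / np.pi)) ** 2

def still_curve(eps0, s, B, sigma, h=5e-4):
    """Still intensity versus offset eps0: interference function averaged over a normal P."""
    t = np.arange(-6.0 * sigma, 6.0 * sigma + h / 2.0, h)  # mosaic offsets
    p = np.exp(-0.5 * (t / sigma) ** 2)
    p /= p.sum()                                           # discrete normalisation of P
    return (interference(eps0[:, None] + t[None, :], s, B) * p).sum(axis=1)

def scan_integral(s, B, sigma, half_range=0.5, h=5e-4):
    """Sum of stills over a scan of the central off-Bragg angle (rotation-like)."""
    eps0 = np.arange(-half_range, half_range + h / 2.0, h)
    return float(still_curve(eps0, s, B, sigma, h).sum() * h)

B, s = 11.8, 50                   # illustrative values only
narrow = scan_integral(s, B, sigma=0.002)
wide = scan_integral(s, B, sigma=0.010)
print(narrow, wide, s * np.pi / B)
```

Under these assumptions both scans integrate to approximately sπ/B, although the individual stills differ strongly in peak height, which mirrors the argument that complete reflections are obtained in rotation data irrespective of P(η).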
2.3. Implementation of the interference function in EVAL
In EVAL a large number of vectors d*(i) are generated from a Gaussian or Lorentzian two-dimensional mosaic distribution (M) and combined with vectors k0 from the F, L and K distributions. The contribution of each of these to the scattered intensity is calculated with
Summing all contributions gives the total scattered normalized intensity (i.e. C = 1.0; see text below equation 4), which is effectively an integral over dη, dk0 and, to a minor extent, dλ, because our beam is almost monochromatic. This normalized intensity is stored in the parameter `partiality' after correction for the still Lorentz factor, i.e. the partiality is . The only new parameter introduced is the number of unit cells in the crystallite Ncell, where s = Ncell(|h| + |k| + |l|), the number of reflecting planes, while the crystallite size equals sdhkl.
Every d*(i) produces its own impact on detector pixel coordinates (x, y) and is weighted by its contribution Ii. All impacts together build the two-dimensional reflection profile that is used as the model profile in the EVAL least-squares fit to obtain the observed integrated intensity for each diffraction spot on an image. Both the observed intensity and the summed interference function (7) contain the still Lorentz factor, and by dividing one by the other we extract F². We also correct for the polarization and apply possible incidence corrections.
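The ray summation described in this section can be illustrated with a simplified sketch. It is not the EVAL code: only a one-dimensional Gaussian mosaic spread is sampled (the F, L and K distributions are ignored), the interference function is assumed to be sin²(sB∊)/(B∊)², and neither the Lorentz factor nor the detector impacts are modelled. The helper names `n_planes`, `bragg_B` and `ray_partiality` are hypothetical.

```python
import numpy as np

def n_planes(ncell, h, k, l):
    """Number of reflecting planes s = Ncell (|h| + |k| + |l|), as defined in the text."""
    return ncell * (abs(h) + abs(k) + abs(l))

def bragg_B(d, wavelength):
    """B = 2 pi d cos(theta) / lambda, with theta taken from Bragg's law."""
    theta = np.arcsin(wavelength / (2.0 * d))
    return 2.0 * np.pi * d * np.cos(theta) / wavelength

def ray_partiality(eps0, s, B, mosaic_sigma, n_rays=10_000, seed=0):
    """Mean interference-function weight of n_rays rays (unnormalised, relative value).

    eps0 is the off-Bragg angle of the central reciprocal-lattice vector; each
    ray receives an additional Gaussian mosaic offset (an assumed distribution).
    """
    rng = np.random.default_rng(seed)
    eps = eps0 + rng.normal(0.0, mosaic_sigma, n_rays)   # off-Bragg angle per ray
    weights = (s * np.sinc(s * B * eps / np.pi)) ** 2    # sin^2(s B eps) / (B eps)^2
    return float(weights.mean())

# Illustrative reflection: hkl = (3, -4, 5), d = 3 A, Cu K-alpha, sigma = 0.5 degrees
s = n_planes(25, 3, -4, 5)
B = bragg_B(3.0, 1.5418)
sigma = np.deg2rad(0.5)
on_bragg = ray_partiality(0.0, s, B, sigma)
off_bragg = ray_partiality(3.0 * sigma, s, B, sigma)
print(s, on_bragg, off_bragg)
```

Evaluating `ray_partiality` over a range of `eps0` values gives a relative partiality curve; in this simplified setting the curve simply follows the assumed mosaic distribution, whereas the full ray tracing also includes beam divergence, focus size and crystal size.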
2.4. Laue interference function
In this paper, we follow the James and Buerger approach, as explained in §2.1, for deriving diffracted intensities by crystals. The resulting interference function only depends on the deviation ∊ from the Bragg angle θ and the number of unit cells contained in the crystal. An alternative is to use the three Laue conditions, and the squared sinc function in (3) is replaced by
Here, Δk = k1 − k0 and N1, N2 and N3 are the numbers of unit cells in the three periodic axis directions. In Appendix A, we show that the two approaches are exactly the same.
(8) is often referred to as the shape transform of the crystal (Kirian et al., 2010; Spence et al., 2011).
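For orientation, the conventional textbook form of the Laue interference function can be written as a product of three terms sin²(πNjhj)/sin²(πhj), where hj are continuous (fractional) Miller indices. The sketch below uses this conventional form as an assumption; the notation of the equation in the paper may differ, and the function names are hypothetical.

```python
import numpy as np

def laue_1d(h, n):
    """sin^2(pi n h) / sin^2(pi h) for fractional index h, with limit n^2 at integer h."""
    h = np.asarray(h, dtype=float)
    num = np.sin(np.pi * n * h)
    den = np.sin(np.pi * h)
    at_node = np.isclose(den, 0.0, atol=1e-12)   # integer h: use the analytic limit
    safe_den = np.where(at_node, 1.0, den)
    return np.where(at_node, float(n) ** 2, (num / safe_den) ** 2)

def laue(h, k, l, n_cells):
    """Product over the three lattice directions (the shape transform of a block crystal)."""
    n1, n2, n3 = n_cells
    return laue_1d(h, n1) * laue_1d(k, n2) * laue_1d(l, n3)

peak = laue(0.0, 0.0, 0.0, (10, 10, 10))         # maximum, (N1 N2 N3)^2
first_zero = laue_1d(0.1, 10)                    # first minimum at h = 1/N
near_peak = laue_1d(0.001, 10)
sinc_form = (np.sin(np.pi * 10 * 0.001) / (np.pi * 0.001)) ** 2
print(float(peak), float(first_zero), float(near_peak), sinc_form)
```

Close to a reciprocal-lattice point the denominator sin²(πh) is well approximated by (πh)², so each factor reduces to a squared-sinc form like the one used in the layer description.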
2.5. Impact positions and refinement
Peak-position refinement in PEAKREF (Schreurs, 1999) minimizes the peak-position residuals and the deviation ∊0 of the central reciprocal-lattice vector either using peak maxima from the peak search or using optimized profile centroids from the EVAL profile fit. Inclusion of ∊0 in the unit-cell matrix refinement avoids divergence of unit-cell orientations through rotations perpendicular to the incident beam, as discussed by Sauter et al. (2014). Similarly, Kabsch (2014) uses the angular deviation τ divided by the mosaic spread σM in the target function for peak refinement. All three approaches use the β axis, defined for each reflection as the axis perpendicular to the incident and diffracted beams, to calculate the deviation from the Bragg angle θ (Schutt & Winkler, 1977). For each still image, the following target function was minimized to refine the unit-cell matrix,
where Δxi and Δyi are the differences of observed and calculated peak positions. PEAKREF can optimize many instrumental parameters such as detector-offset positions, primary beam direction and crystal position, which in the current analysis were fixed in the still data and based on the rotation data (see below).
We found that the peak-position residuals from the post-refinement were much smaller than from the peak maxima found on a single still image, despite the much larger number of peaks. This was caused by a shift in the observed θ value for the partial reflections with large ∊0. For large mosaic crystals such as our lysozyme crystal measured with a divergent beam, these shifts in θ occur because only distinct directions of the primary rays or distinct points on the crystal are active dependent on the deviation ∊0 (Fig. 3). A negative value for ∊0 results in an apparent larger θ and a positive ∊0 in an apparent smaller θ. This θ-divergence effect has to be taken into account when the cell matrix is determined and refined from peak positions. We introduced a parameter `flex' in PEAKREF that is jointly refined and takes account of this shift. The `flex' parameter turned out to have a constant value for all still images and it appears to be a property typical for the crystal and the beam divergence of the particular experiment.
2.6. Post-refinement
We implemented a post-refinement procedure in which both the peak positions from the EVAL integration and the partialities could be refined. For this purpose, we calculate the mean intensity of all equivalent reflections h as
Weights are obtained from the standard deviations from the EVAL profile fit (Schreurs et al., 2010) and are given by we = 1/σe2. The partialities pe arise from the EVAL ray-tracing simulation, and the image scale factors sf are determined in ANY (Schreurs, 2007), assuming a constant sum of Bragg intensities in each frame. The summation in (10) runs over all equivalents of reflection h (Nh) in the data set. In PEAKREF image scale factors sf′ and unit-cell parameters and crystal orientation angles are refined using the target function
We specifically include peak positions in this refinement step in order to avoid unwanted divergence from peak position-derived unit-cell parameters and orientations. In this post-refinement step the EVAL ray-tracing procedure is not repeated to obtain partialities; instead, we use a fitted partiality versus ∊ curve with a single Gaussian. The parameters in the Gaussian were kept fixed in the refinement. ∊ changes with the unit-cell parameters, from which we recalculate the partiality (pi′).
3. Materials and methods
3.1. Crystal preparation
Hen egg-white lysozyme (Sigma–Aldrich, Schnelldorf, Germany) was crystallized using the hanging-drop vapour-diffusion method with a protein concentration of 75 mg ml−1 in 0.1 M sodium acetate buffer pH 4.8. The precipitant consisted of 0.1 M sodium acetate buffer pH 4.8, 10–15%(w/v) sodium chloride, 30%(v/v) ethylene glycol (Sutton et al., 2013). Drops of 4 µl were set up with a 1:1 protein:precipitant ratio.
3.2. Data collection
A crystal of dimensions 250 × 250 × 150 µm was vitrified in a cold N2-gas stream from an Oxford Instruments 700 series jet operated at 100 K. Data were collected on a Bruker–AXS X8 Proteum in-house source with Cu Kα radiation. The rotating anode was operated at 45 kV and 60 mA. The reference rotation data set was collected by rotating over 190° in φ in 0.5° steps per frame. Data were recorded on a PLATINUM135 CCD detector with a crystal-to-detector distance of 52 mm. 380 still images were collected with identical angular settings as the starting angles for each of the rotation frames; thus, 380 still images were collected at 0.5° intervals. An additional 394 stills were recorded by random selections from ω scans 0–7° in ω apart at 15 different ω, κ and φ goniometer settings. The exposure time for all images was 5 s.
3.3. Data processing and analysis
VIEW was used for image display and peak search (Schreurs, 1998). Both the rotation images and stills were indexed using DirAx (Duisenberg, 1992). Almost all stills could be indexed without manual intervention. Constraints were applied and the unit cells were made congruent (using the goniometer positions), ensuring a consistent choice of unit-cell axes. The unit-cell matrix and detector positions were refined from 649 peak positions in the rotation data. For the still peak positions we used different options. In the first approach we made use of our knowledge of the relative positions of the goniometer axes, so that a global single unit-cell matrix could be refined against 10 728 peak positions. In the second approach, we determined and refined a unit-cell matrix for each image from 300 peak positions, as would be the normal procedure in serial crystallography. The detector-offset positions were taken from the peak-position refinement of the rotation data. The unit-cell matrix was refined against the observed peak positions, using the `flex' parameter to account for apparent shifts in θ, simultaneously minimizing the off-Bragg angle ∊0 (9). Using the unit-cell matrix, we extracted three-dimensional and two-dimensional reflection boxes for rotation and still images, respectively, and processed these with EVAL. For every reflection, 10 000 rays were simulated and the impacts were collected in pixels contained in the box. In the case of still data every individual ray is associated with a reciprocal-lattice vector d*(i) with a small angular deviation ∊i from the Bragg angle and is weighted by the interference function (7). The impact position on the detector is given by the direction of the shortest distance of the reciprocal-lattice vector to the Ewald sphere. The divergence effects are accounted for in the ray tracing and thus the profiles are generated correctly at deviating positions in θ (i.e. without the need for a `flex' parameter as used at the peak-refinement stage).
The parameters for crystal size, mosaic spread and beam divergence were optimized automatically in the reflection profile fitted to ∼50 reflections with I/σ(I) > 20 using a simplex method (see Schreurs et al., 2010). For comparison, identical values of the parameters in the ray-tracing simulations were used for both types of data sets, although a similar optimization can be performed for still images. In addition, for the still images we used Ncell = 25 in the interference function (the number of unit cells in a crystallite as described in §2.3). Sampling of the interference function converges much faster with low values of Ncell, typical for nano-sized crystals. The current data imply a larger value of Ncell, which in the current implementation would require many more rays (up to 10⁶ instead of 10 000) to sample reflection profiles smoothly. The integrated intensities are obtained by a least-squares fit of the three-dimensional and two-dimensional model profiles to the observed pixel intensities for the rotation and still reflections, respectively. EVAL then delivers the profile-fitted, Lorentz- and polarization-corrected intensity values in an XML-type datafile that is further processed in ANY (Schreurs, 2007). In this program, we determine image scale factors, correct for the partiality factor and output the intensities and standard deviations to an hkl- or mtz-type file. Many of the graphical plots and statistical analyses are made using ANY.
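Why larger Ncell values need more rays can be illustrated with a rough sampling estimate. The sketch below is an assumption-laden illustration rather than the EVAL sampler: rays are drawn from a Gaussian mosaic distribution only, the mosaic spread of 0.5° is interpreted as a standard deviation, and the interference function is taken as sin²(sB∊)/(B∊)². The relative variance of the per-ray weights then sets the number of rays needed for a given precision of the summed intensity; the names `relative_variance` and `rays_needed` are hypothetical.

```python
import numpy as np

def relative_variance(s, B, sigma, n_rays=200_000, seed=1):
    """Variance of the per-ray interference weights divided by their squared mean."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, n_rays)               # off-Bragg angles of the rays
    w = (s * np.sinc(s * B * eps / np.pi)) ** 2        # sin^2(s B eps) / (B eps)^2
    return float(w.var() / w.mean() ** 2)

def rays_needed(s, B, sigma, target_rel_error=0.02):
    """Rays required so that the standard error of the mean weight meets the target."""
    return int(np.ceil(relative_variance(s, B, sigma) / target_rel_error ** 2))

B = 11.8                         # illustrative value of 2 pi d cos(theta) / lambda
sigma = np.deg2rad(0.5)          # assumed Gaussian standard deviation of the mosaic spread
rv_small = relative_variance(100, B, sigma)
rv_large = relative_variance(400, B, sigma)
print(rv_small, rv_large, rays_needed(100, B, sigma), rays_needed(400, B, sigma))
```

In this simplified picture the interference function narrows as s grows, fewer rays fall inside its central maximum and the required number of rays increases roughly in proportion to s. Smooth two-dimensional profiles require more rays than a single summed intensity, so the numbers printed here should not be read as the ray counts quoted in the text.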
All still images were also processed with the CrystFEL software suite v.0.5.1 (White et al., 2012). Structural refinements were carried out with REFMAC5 (Murshudov et al., 2011) and scaling between data sets with SCALEIT (Howell & Smith, 1992), both from the CCP4 suite (Winn et al., 2011). ANODE (Thorn & Sheldrick, 2011) was used to calculate anomalous difference densities.
4. Results
We collected rotation and still diffraction data from one lysozyme crystal and formed three data sets for analysis: a 190° rotation data set collected in 380 images in ranges of 0.5° for reference, a consecutive still data set of 380 images collected in steps of 0.5°, and this consecutive data set combined with 15 wedges of separate arbitrary orientations totalling 774 images (Table 1). Indexing and peak refinement by PEAKREF/EVAL and CrystFEL yielded unit-cell dimensions that varied by ∼0.3–0.7% between the separate stills. Initially, the average residuals in peak positions for the still data were significantly larger than for the rotation data. Introduction of the `flex' parameter, which takes into account the apparent shift in θ owing to divergence effects (see §2.5), reduced the positional residuals of peak maxima on still images significantly: from 0.13 to 0.07–0.08 mm on average. This deviation is only slightly larger than that observed for the rotation data, which was 0.06 mm. The residual in rotation angle for the rotation data was 0.039° (for 0.5° scan width). For the still data the deviation from the Bragg angle, ∊0, was 0.18°, which is consistent with a mosaic spread of 0.5°. Relaxing the unit cells by a separate matrix for each image lowered the ∊0 residuals to 0.14° (compare `single' versus `unit cell per image' in Table 1). In our setup the orientation of each matrix was known because the crystal orientation was set using a goniometer. The r.m.s. deviations between the set and refined orientations of the reciprocal-lattice vectors a*, b* and c* were 0.03° for the consecutive still data and increased to ∼0.10° using all still data (consecutive and random orientations). Overall, the number of observations taken into account by EVAL was 106 × 10³ for the rotation data set, 325 × 10³ for the consecutive still data and 657 × 10³ for all still data, whereas CrystFEL took 733 × 10³ into account for all still data (Table 2). All processed sets resulted in ∼8300 unique reflections.
The multiplicity of the consecutive still data was roughly three times that of the rotation data, indicating that reflections were, on average, sliced through three times in our still data-collection experiment.
‡Average angular deviation of the central reciprocal-lattice vector d* from the Ewald sphere (see text for explanation).
‡Still data are not scaled by SADABS like the rotation data and no error model is determined for σ. In the merging step σ is determined from the internal standard deviation. §In Rint the summation runs over all N unique reflections h and their equivalents.
The statistics for the integration and merging of data for the rotation and still data are shown in Table 2. Processing of the reference rotation data yielded an internal merging Rint of 3.8% with an 〈I/σ(I)〉 after merging of 47.7. Processing of the still diffraction data without correction, referred to as Monte Carlo averaging, produced Rint values exceeding 100% and 〈I/σ(I)〉 values that were about fourfold lower than that for the rotation data using the same number of images. Application of the still Lorentz correction (4) slightly increased the Rint (Table 2).
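As an illustration of how uncorrected partials can give merging R factors above 100%, the sketch below computes Rint using the conventional definition, Σh Σi |Ii(h) − ⟨I(h)⟩| / Σh Σi Ii(h). This definition is an assumption here, since the exact expression used for Table 2 is not reproduced in the text, and the function name and the toy intensities are hypothetical.

```python
from collections import defaultdict

def r_int(observations):
    """Conventional merging R factor for (unique_hkl, intensity) pairs.

    Reflections observed only once are skipped because they carry no
    information on internal consistency (a common, but not universal, choice).
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(float(intensity))
    numerator = denominator = 0.0
    for values in groups.values():
        if len(values) < 2:
            continue
        mean = sum(values) / len(values)
        numerator += sum(abs(v - mean) for v in values)
        denominator += sum(values)
    return numerator / denominator if denominator else float("nan")

# Well-behaved equivalents versus strongly varying partials (toy numbers)
consistent = [((1, 0, 0), 90.0), ((1, 0, 0), 110.0), ((2, 0, 0), 50.0), ((2, 0, 0), 50.0)]
partials = [((1, 0, 0), 0.0), ((1, 0, 0), 0.0), ((1, 0, 0), 0.0), ((1, 0, 0), 400.0)]
print(r_int(consistent))   # 20 / 300
print(r_int(partials))     # 600 / 400
```

The second toy set shows that a reflection recorded mostly far from the Bragg condition and once close to it yields an Rint of 150% with this definition, even though its mean intensity may still be meaningful after averaging many observations.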
To estimate partialities, we determined the parameters for mosaic spread, divergence of the incident beam, crystal size and Ncell by optimizing two-dimensional profile fits using figures of merit (Schreurs et al., 2010) on a subset of reflections in EVAL. Mosaic spread was set to 0.5°, beam divergence to 8.6 mrad, crystal size to 130 × 130 × 130 µm (although we estimated a slightly larger size when selecting the crystal under the microscope) and Ncell to 25. The ray-tracing procedure yielded partialities which showed a Gaussian-like distribution with ∊0 (Fig. 4a). Notably, the computed still partialities are not normalized and can exceed a value of 1, and hence are used as relative scale factors. In rotation data the partiality is defined up to 1 for a fully observed reflection (Rossmann & Beek, 1999); in contrast, the partiality in still diffraction is determined by the angular width of the intersection with the Ewald sphere, which depends on various instrumental and crystal parameters such as those given by (3). Lorentz-corrected still and (Lorentz-corrected) rotation reflections on average give the same absolute intensities. Fig. 5 shows that some still partialities are larger than 1.0 and the still intensities scatter around the rotation intensity. Further, to illustrate that the partialities depend strongly on the precise ray-tracing model parameters, Fig. 4(b) shows the partialities as a function of ∊0 in the case of a long focus for the incident beam, which results in two Gaussian-like curves superimposed. This implies that a simple Gaussian model for the partiality is not always correct. When divided into ∊0 bins, the observed average intensities correlate well with the estimated partialities (Fig. 6). Application of the partiality model resulted in average I/〈I〉 values that varied around the ideal value of 1.0. Subsequent merging of these data, i.e. with both Lorentz and partiality corrections applied, reduced the Rint values to 57 and 63% for the data sets with consecutive and all stills, respectively.
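A single-Gaussian description of the partiality as a function of ∊0 can be obtained with a simple fit. The sketch below is a hypothetical illustration and not the procedure implemented in ANY or PEAKREF: it assumes p(∊0) = A exp(−∊0²/2σ²) and fits ln p linearly against ∊0², which only works for strictly positive partialities and a single-peaked curve. The synthetic input values are invented for the example.

```python
import numpy as np

def fit_gaussian_partiality(eps0, partiality):
    """Fit p = A * exp(-eps0^2 / (2 sigma^2)) by linear least squares on ln(p) versus eps0^2."""
    eps0 = np.asarray(eps0, dtype=float)
    p = np.asarray(partiality, dtype=float)
    keep = p > 0.0                                   # the logarithm needs positive values
    slope, intercept = np.polyfit(eps0[keep] ** 2, np.log(p[keep]), 1)
    if slope >= 0.0:
        raise ValueError("partiality does not decrease with |eps0|; Gaussian model unsuitable")
    return float(np.exp(intercept)), float(np.sqrt(-0.5 / slope))

def gaussian_partiality(eps0, amplitude, sigma):
    """Evaluate the fitted curve; the amplitude may exceed 1 for relative partialities."""
    return amplitude * np.exp(-0.5 * (np.asarray(eps0, dtype=float) / sigma) ** 2)

# Synthetic partialities (degrees); amplitude and width are made-up test values
eps0 = np.linspace(-0.4, 0.4, 41)
synthetic = 1.3 * np.exp(-0.5 * (eps0 / 0.18) ** 2)
amplitude, width = fit_gaussian_partiality(eps0, synthetic)
print(amplitude, width)
```

For a partiality curve composed of two superimposed Gaussian-like components, as in the long-focus case described above, such a single-Gaussian fit would be a poor model and the ray-traced partialities themselves should be preferred.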
Next, the effects of Lorentz and partiality correction were evaluated by comparing the data with the reference rotation data set. The uncorrected and the Lorentz-corrected intensities have high internal Rint values of 104.9 and 106.5%, respectively, consistent with the scattering in Fig. 5. The Lorentz- and partiality-corrected intensities have an Rint of 63.8%. Upon merging the data to unique reflections the agreement with the rotation data improved dramatically; the scatter diagrams in Fig. 7(a) and 7(b) reflect the improvement corresponding to the uncorrected (Monte Carlo) and corrected (Lorentz and partiality) data. The effects from the still data corrections are more clearly demonstrated by the R factors with respect to the reference rotation data, which we refer to as Rcomp (Table 3). Rcomp (on intensities) was 26% using Monte Carlo averaging. Application of the Lorentz correction alone decreased the Rcomp to 12%. Application of both Lorentz and partiality corrections yielded an Rcomp of 5.3%.
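A comparison R factor of this kind can be sketched as follows. The definition used here, Σ|Iref − kI| / ΣIref over common reflections with a single overall scale k, is an assumption chosen for illustration; SCALEIT may apply a different scaling model, and the function name and toy intensities are hypothetical.

```python
def r_comp(reference, other):
    """R factor on intensities between two merged data sets keyed by hkl.

    A single overall scale k = sum(I_ref) / sum(I_other) over the common
    reflections is applied to `other` before the comparison.
    """
    common = sorted(set(reference) & set(other))
    if not common:
        raise ValueError("no common reflections")
    k = sum(reference[h] for h in common) / sum(other[h] for h in common)
    numerator = sum(abs(reference[h] - k * other[h]) for h in common)
    denominator = sum(reference[h] for h in common)
    return numerator / denominator

rotation = {(1, 0, 0): 100.0, (2, 0, 0): 200.0}
stills = {(1, 0, 0): 55.0, (2, 0, 0): 95.0, (3, 0, 0): 10.0}   # (3, 0, 0) has no partner
print(r_comp(rotation, stills))   # k = 2, R = 20 / 300
```

Because the statistic is computed on merged intensities, it measures accuracy with respect to the reference data rather than the internal spread of the individual observations that enters Rint.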
‡From SCALEIT: overall scale.
Although the Lorentz and partiality corrections significantly improved the quality of the merged data, the merging Rint value remained high (i.e. 63.8% for all still data). To improve the partialities, we performed post-refinement of the image scale factor, unit-cell parameters and orientations, minimizing the target function of (11). Post-refinement of the `all stills' data gave scale factors of 0.84–1.35 (additional to the scale factor sf used in equation 10) and sharpened the distribution of unit-cell dimensions, with virtually no effect on the variation of crystal orientations (Table 1). These adjustments resulted in a significant, but modest, reduction of Rint from 63.8 to 55.7% (Table 2). The progress in the precision of processing the data is reflected by the distributions I(hkl)/〈I(hkl)〉 shown in Fig. 8. Ideally, I(hkl)/〈I(hkl)〉 values form a sharp distribution around 1 (as a reference, we depict the distribution resulting from the rotation data in Fig. 8e). Figs. 8(b) and 8(c) reflect the striking improvement obtained by modelling the partiality in EVAL and subsequent post-refinement. Fig. 8(d) shows that mainly the weak data do not profit from the post-refinement. Comparison of the merged data sets shows that the improvement in precision is matched by an improvement in accuracy. Post-refinement reduced the Rcomp from 5.3 to 4.7% (Table 3).
To illustrate the data quality, we refined the lysozyme structure 193l (Vaney et al., 1996) against the reflection data using REFMAC, and we observed similar Rwork and Rfree values for the differently processed data (Table 4). Significant differences between the methods were observed for the resulting average isotropic B factors. Monte Carlo averaging of the data in CrystFEL and EVAL yielded increased B factors (21–25 Å²) compared with the reference defined by the rotation data set (〈B〉 = 13.8 Å²). The Lorentz correction had a large effect on the B factors and produced an average B factor of 11.8 Å²; this large effect is explained by the comparable fall-off with θ of the Lorentz factor and the temperature factor. When the Lorentz and partiality corrections were both applied, the B factors became more similar to those obtained when using the rotation data (13.2 versus 13.8 Å²). Anomalous differences are much more sensitive to the accuracy of the data than structure refinement. We generated anomalous difference densities from the processed data sets with ANODE, using phases from the refined structure. For the methionine sulfur positions the anomalous density from the rotation data gave a peak height of 13.3σ. The uncorrected, Monte Carlo averaged still data yielded a weak anomalous signal: a 4.2σ peak for methionine S, corresponding to 32% of the peak height obtained using the rotation data. Lorentz correction improved the methionine S signal to 35%, whereas including partiality corrections resulted in 47% of the signal. Finally, this signal improved to 54% after post-refinement. This shows that both Lorentz and partiality correction improved the intensities deduced from the still data.
‡ANODE with data merged in 422. Averaged densities over similar atom types (two for Met SD, eight for Cys SG, 14 for Cl− and one for Na+). §Selected by SHELXC.
We tested the effect of data-set size by limiting the still data to 60 images (Table 5). For the reduced `consecutive still' data we used images 250–310. For the `random still' data 60 images from three different wedges were used. For these limited data sets (91.7 and 97.3% completeness, respectively), the Rfree factors show that the structure quality deteriorated. Furthermore, the anomalous signal is largely lost. For both structure and anomalous density analyses the Lorentz and partiality-corrected data outperform the noncorrected Monte Carlo processed data.
5. Discussion and conclusions
We used our ray-tracing profile-prediction methods to model partialities of the observed reflections in still diffraction data and adapted the programs PEAKREF and EVAL to process still diffraction images. By taking experimental conditions into account, we compute 10 000 rays generated from focus, crystal grid points, wavelength spectrum and mosaic distributions, and calculate the interference-function weighted contribution to an observed reflection and hence derive its partiality. Our formalism implicitly accounts for the Lorentz factor, reproducing its contribution to the observed intensities. Our approach differs fundamentally from other still data-processing methods. Kabsch (2014) defined an analytical erf function for the partiality, which is the integral over a Gaussian mosaic function. It is equivalent to our integral in (3) for an infinitely sharp sinc function (implying that integration over this function is complete within a solid angle smaller than the pixel size of the detector), while ignoring broadening effects other than the mosaic spread. Kabsch explicitly corrects for the still Lorentz factor. White (2014) and Sauter (2015) use reciprocal-lattice point volumes for calculating the partiality. White (2014) accounts for spectral width and beam divergence by calculating the overlap of a reciprocal-lattice volume with a nest of Ewald spheres. Sauter (2015) and Uervirojnangkoorn et al. (2015) use a single Ewald sphere and calculate its intersection with a spherical reciprocal-lattice volume, the size of which is determined by beam divergence, mosaic spread and spectral dispersion. Both approaches account for the increase of the reciprocal diffracting volume with resolution, and in this way for the wider range of acceptable off-Bragg angles (d∊; see Appendix A). However, both approaches lack the reflectivity part of the Lorentz factor (dΩ; see Appendix A).
If the spectral width of the beam becomes large, an additional Lorentz factor needs to be accounted for, as used in the Laue method (Zachariasen, 1945). Uervirojnangkoorn et al. (2015) very recently presented their results on XFEL data. They showed that the Rwork and Rfree of refined structures improved and part of the anomalous signal was retrieved. Unfortunately, they do not provide merging Rint or a comparison to a rotation data set, i.e. Rcomp, to evaluate the resulting quality of the data more directly. In our approach, the integration of (3) is achieved by simulation of the rays that contribute to an observed reflection spot. Because of the simulation, the derivation of analytical functions for the various effects is not needed and the Lorentz effect is implicitly taken into account. Moreover, the interference function can be taken into account in our approach.
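The idea of deriving a partiality from interference-function weighted rays can be illustrated with a deliberately simplified, one-dimensional toy model in Python. This is not the EVAL implementation: the sketch lumps all broadening effects into a single hypothetical Gaussian spread `sigma` of the off-Bragg deviation, uses a normalized sinc² interference function for `n_layers` diffracting layers, and takes the mean weight over the rays as a toy partiality. All names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def toy_partiality(delta_l0, sigma, n_layers, n_rays=10_000, seed=0):
    """Average sinc^2 weight over randomly sampled rays (toy model).

    delta_l0 : offset of the reflection from the exact Bragg condition,
               in units of the reciprocal-lattice index (dimensionless)
    sigma    : standard deviation of the Gaussian spread of the rays,
               same units (hypothetical lumped mosaicity/divergence)
    n_layers : number of diffracting layers of the crystallite
    """
    rng = np.random.default_rng(seed)
    # Each ray sees its own deviation from the Bragg condition.
    delta_l = delta_l0 + sigma * rng.standard_normal(n_rays)
    # np.sinc(x) = sin(pi x)/(pi x), so this is the interference function
    # sin^2(pi N dl) / (pi N dl)^2, normalised to 1 at the Bragg condition.
    weights = np.sinc(n_layers * delta_l) ** 2
    return weights.mean()

for offset in (0.0, 0.01, 0.03):
    p = toy_partiality(offset, sigma=0.01, n_layers=50)
    print(f"offset {offset:.2f}: toy partiality {p:.3f}")
```

In this toy model the weight decreases as the reflection moves away from the Bragg condition, and a smaller `n_layers` broadens the interference function so that more rays contribute. A realistic treatment would sample the focus, the crystal grid points, the wavelength spectrum and the mosaic distribution separately, as described in the text.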
For the initial development of the method, we used an experimental setup that allowed a direct comparison to the conventional rotation method. Our analysis showed a dramatic improvement in data quality after partiality and Lorentz correction. Both data processing and structure refinement showed that Lorentz correction is important and that omission of the Lorentz correction strongly affects the temperature factor. The anomalous sulfur densities increased 1.7-fold upon Lorentz and partiality correction of the still data. Overall, our approach markedly improved the Rcomp factor between the intensities derived from rotation and still data from 26% to a final value of 4.7% after Lorentz and partiality correction and post-refinement.
Concurrent with the improvement in the final data quality upon Lorentz and partiality correction in EVAL, the internal merging Rint decreased from 105 to 64%. When we were developing the method, we hoped that post-refinement of the parameters would improve the final unique intensity data as well as further reduce the internal merging Rint factor. Post-refinement improved the precision of the modelled unit-cell dimensions and scale factor per image, although the error in modelled crystal orientations remained ∼0.1°. These more precise parameters indeed improved the resulting intensities (Rcomp decreased from 5.3 to 4.7%). The internal statistics improved as well (Rint decreased from 64 to 56%); however, the final Rint factor remained high. This high Rint could be owing to features that were not included in our ray-tracing model, such as possible asymmetry in the focus, (anisotropic) mosaic spread or crystal form, or absorption by the crystal. Notably, crystal absorption may have a significant effect on the presented data because a relatively large crystal was used in this experiment. Crystal absorption is likely to be negligible when data are collected from microcrystals or nanocrystals, as is the case in serial crystallography. Obviously, further development of our approach is needed to account for the experimental conditions of serial (femtosecond) crystallography using XFEL or synchrotron sources. Automated schemes will be needed to model, for example, the large number of single-crystal diffraction images and fluctuations in beam spectra.
In general, comprehensive modelling of the relevant experimental conditions should improve both the internal merging statistics and the resulting intensities. Not modelling significant effects that are present in the data can only be overcome by collecting more data to allow the averaging out of these effects by the Monte Carlo approach. In a real-case scenario the rotation data will not be available to evaluate the data quality, and an Rint of ∼50% may possibly be a practical metric to judge the resulting data quality.
Overall, we have shown that ray tracing can produce reliable partialities that improve the resulting data quality originating from still diffraction images. Moreover, our method is versatile and allows the modelling of a wide variety of effects, including those that yield non-Gaussian, asymmetric effects on the diffraction spot. In particular, the approach can take the interference function into account, which will be critical for processing data obtained from nanocrystals. Thus, in this paper we have presented the theoretical framework and demonstrated the potential of the ray-tracing methodology for processing still diffraction data.
The rotation and still diffraction images are available at https://rawdata.chem.uu.nl/c003.
APPENDIX A
Comparison with Laue interference function
The diffracted intensity reflected by a small crystal bathed in an incident monochromatic beam is proportional to the shape transform of the crystal. The reflected intensity received by the detector in a small cone of solid angle dΩ, while the reciprocal-lattice vector has a small deviation ∊ from the Bragg angle θ, is given by the Laue interference function (Laue, 1936; James, 1958) and is used in papers by Kirian et al. (2010) and White et al. (2012),
N1, N2 and N3 are the numbers of unit cells in the three dimensions of the parallelepiped crystal. ξ is the scalar product Δk · a and in near-Bragg condition it is equal to h + Δh, and likewise for the other directions. As we are only interested in the diffracted intensity close to the Bragg condition, we introduce a local reciprocal axis system and replace ξ by the nonperiodic Δh. The terms in the denominator of (12) can then be written as (πΔh)² because they concern only small numbers,
It is more convenient to choose the reciprocal-axis system such that Δl is along the reciprocal-lattice vector and Δh and Δk are parallel to the diffracting Bragg plane hkl (Authier, 2001). Such a transformation can be carried out because the normal to the Bragg plane is always a reciprocal-lattice vector. (Note that Δh, Δk and Δl are dimensionless.) The Jacobians of the transformation of the integration variables are dΩ = V*cell (dhkl/sinθ) λ² d(Δh) d(Δk) and d∊ = [λ/(2dhkl cosθ)] d(Δl) (Authier, 2001), leading to
By integration over Δh and Δk the diffracted intensity for a given value of ∊ is obtained,
The equivalence of this expression to that of James (1958) and Buerger (1960), as we use in EVAL, is shown by the following. How does Δl depend on a small deviation Δθ = ∊? We write Δl = d(l)/dθ = d(c/dhkl)/dθ = d(cd*hkl)/dθ = 2c cosθ/λ. cN3 can be written as ldhklN3, and further we use the property that the volume of the crystal Vcrystal = N1N2N3Vcell and B = 2πdhkl cosθ/λ. Writing (14) in terms of d∊ gives
Using N1N2/Vcell = (Vcrystal/Vcell²)(1/N3) (James, 1958, p. 43) and dhkl = λ/(2sinθ), we can write
Equation (17) is exactly the equation used in EVAL, (2), as the number of layers s = N3l, and using n = N1N2N3/Vcrystal we can write Vcrystal/Vcell² = n²Vcrystal.
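Two steps of this appendix can be checked numerically with a short Python sketch: the replacement of the sin² denominator of one factor of the Laue interference function by (πΔh)² near the Bragg condition, and the standard result that one such factor integrates to N over a full reciprocal-lattice period. The function names and the values N = 20 and Δh = 0.01 are arbitrary choices for illustration and do not come from the text.

```python
import numpy as np

def laue_exact(delta_h, n):
    """sin^2(pi N dh) / sin^2(pi dh): one factor of the Laue function."""
    return np.sin(np.pi * n * delta_h) ** 2 / np.sin(np.pi * delta_h) ** 2

def laue_small_angle(delta_h, n):
    """Same factor with the denominator replaced by (pi dh)^2."""
    return np.sin(np.pi * n * delta_h) ** 2 / (np.pi * delta_h) ** 2

n_cells = 20

# 1. Close to the Bragg condition the two forms agree closely.
dh = 0.01
exact = laue_exact(dh, n_cells)
rel_diff = abs(exact - laue_small_angle(dh, n_cells)) / exact
print(f"relative difference at dh = {dh}: {rel_diff:.1e}")

# 2. Integrated over one reciprocal-lattice period the exact factor gives N.
#    A midpoint grid with an even number of points never hits dh = 0.
m = 4096
grid = (np.arange(m) + 0.5) / m - 0.5
integral = laue_exact(grid, n_cells).mean()  # interval length is 1
print(f"integral over one period: {integral:.6f} (N = {n_cells})")
```

The second check suggests why integrating over Δh and Δk leaves factors of the form N1N2, as in the relation quoted above from James (1958).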
Acknowledgements
We gratefully acknowledge Thomas White and Henry Chapman (CFEL, Hamburg) for discussions. This work was supported by the Council for Chemical Sciences of the Netherlands Organization for Scientific Research (NWO-CW) grant No. 175.010.2007.013.
References
Authier, A. (2001). Dynamical Theory of X-ray Diffraction. Oxford University Press.
Boutet, S. et al. (2012). Science, 337, 362–364.
Buerger, M. J. (1960). Crystal Structure Analysis. New York: John Wiley & Sons.
Burmeister, W. P. (2000). Acta Cryst. D56, 328–341.
Chapman, H. N. et al. (2011). Nature (London), 470, 73–77.
Dejoie, C., Smeets, S., Baerlocher, C., Tamura, N., Pattison, P., Abela, R. & McCusker, L. B. (2015). IUCrJ, 2, 361–370.
Demirci, H. et al. (2013). Acta Cryst. F69, 1066–1009.
Duisenberg, A. J. M. (1992). J. Appl. Cryst. 25, 92–96.
Duisenberg, A. J. M., Kroon-Batenburg, L. M. J. & Schreurs, A. M. M. (2003). J. Appl. Cryst. 36, 220–229.
Gati, C., Bourenkov, G., Klinge, M., Rehders, D., Stellato, F., Oberthür, D., Yefanov, O., Sommer, B. P., Mogk, S., Duszenko, M., Betzel, C., Schneider, T. R., Chapman, H. N. & Redecke, L. (2014). IUCrJ, 1, 87–94.
Hattne, J. et al. (2014). Nature Methods, 11, 545–548.
Howell, P. L. & Smith, G. D. (1992). J. Appl. Cryst. 25, 81–86.
James, R. W. (1958). The Optical Principles of the Diffraction of X-rays. London: G. Bell & Sons.
Kabsch, W. (2014). Acta Cryst. D70, 2204–2216.
Kirian, R. A., Wang, X., Weierstall, U., Schmidt, K. E., Spence, J. C. H., Hunter, M., Fromme, P., White, T., Chapman, H. N. & Holton, J. (2010). Opt. Express, 18, 5713–5723.
Laue, M. von (1936). Ann. Phys. 41, 971–988.
Leslie, A. G. W. & Powell, H. R. (2007). Evolving Methods for Macromolecular Crystallography, edited by R. J. Read & J. L. Sussman, pp. 41–51. Dordrecht: Springer.
Milch, J. R. & Minor, T. C. (1974). J. Appl. Cryst. 7, 502–505.
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
Qu, K., Zhou, L. & Dong, Y.-H. (2014). Acta Cryst. D70, 1202–1211.
Ravelli, R. B. G. & McSweeney, S. M. (2000). Structure, 8, 315–328.
Redecke, L. et al. (2013). Science, 339, 227–230.
Rossmann, M. G. & van Beek, C. G. (1999). Acta Cryst. D55, 1631–1640.
Sauter, N. K. (2015). J. Synchrotron Rad. 22, 239–248.
Sauter, N. K., Grosse-Kunstleve, R. W. & Adams, P. D. (2004). J. Appl. Cryst. 37, 399–409.
Sauter, N. K., Hattne, J., Brewster, A. S., Echols, N., Zwart, P. H. & Adams, P. D. (2014). Acta Cryst. D70, 3299–3309.
Sauter, N. K., Hattne, J., Grosse-Kunstleve, R. W. & Echols, N. (2013). Acta Cryst. D69, 1274–1282.
Schreurs, A. M. M. (1998). VIEW. Utrecht University, The Netherlands.
Schreurs, A. M. M. (1999). PEAKREF. Utrecht University, The Netherlands.
Schreurs, A. M. M. (2007). ANY. Utrecht University, The Netherlands.
Schreurs, A. M. M., Xian, X. & Kroon-Batenburg, L. M. J. (2010). J. Appl. Cryst. 43, 70–82.
Schutt, N. K. & Winkler, F. K. (1977). The Rotation Method in Crystallography, edited by U. W. Arndt & A. J. Wonacott, pp. 173–186. Amsterdam: North Holland.
Spence, J. C. H., Kirian, R. A., Wang, X., Weierstall, U., Schmidt, K. E., White, T., Barty, A., Chapman, H. N., Marchesini, S. & Holton, J. (2011). Opt. Express, 19, 2866–2873.
Sutton, K. A., Black, P. J., Mercer, K. R., Garman, E. F., Owen, R. L., Snell, E. H. & Bernhard, W. A. (2013). Acta Cryst. D69, 2381–2394.
Thorn, A. & Sheldrick, G. M. (2011). J. Appl. Cryst. 44, 1285–1287.
Uervirojnangkoorn, M., Zeldin, O. B., Lyubimov, A. Y., Hattne, J., Brewster, A. S., Sauter, N. K., Brunger, A. T. & Weis, W. I. (2015). eLife, 4, e05421.
Vaney, M. C., Maignan, S., Riès-Kautt, M. & Ducruix, A. (1996). Acta Cryst. D52, 505–517.
Weik, M., Ravelli, R. B. G., Kryger, G., McSweeney, S., Raves, M. L., Harel, M., Gros, P., Silman, I., Kroon, J. & Sussman, J. L. (2000). Proc. Natl Acad. Sci. USA, 97, 623–628.
White, T. A. (2014). Philos. Trans. R. Soc. B Biol. Sci. 369, 20130330.
White, T. A., Barty, A., Stellato, F., Holton, J. M., Kirian, R. A., Zatsepin, N. A. & Chapman, H. N. (2013). Acta Cryst. D69, 1231–1240.
White, T. A., Kirian, R. A., Martin, A. V., Aquila, A., Nass, K., Barty, A. & Chapman, H. N. (2012). J. Appl. Cryst. 45, 335–341.
Winn, M. D. et al. (2011). Acta Cryst. D67, 235–242.
Zachariasen, W. H. (1945). Theory of X-ray Diffraction in Crystals. New York: Dover.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.