Multiparametric scaling of diffraction intensities
A novel and general approach to scaling diffraction intensities is presented. The method minimizes the disagreement among multiple measurements of symmetry-related reflections using a stable refinement procedure. The scale factors are described by a flexible exponential function that allows different scaling corrections to be chosen and combined according to the needs of the experiment. The scaling model presented here includes: scale and temperature factor per batch of data; temperature factor as a continuous function of the radiation dose; absorption in the crystal; uneven exposure within a single diffraction image; and corrections for phenomena that depend on the diffraction peak position on the detector. This scaling model can be extended to include additional corrections for various instrumental and data-collection problems.
A natural starting point in the complex analysis of measurements is to define the data model that predicts observations based on the physics of the experiment. The purpose of data analysis is to solve the inverse problem – how to obtain the parameters of the physical model from the measurements. In crystallography, the inverse problem has many steps, and this article addresses one of them – determination of the scaling model in the diffraction data reduction process.
where I_b is the flux density of the primary beam; r_e = e^2/m_e c^2 is the classical electron radius (2.818 × 10^-15 m); λ is the wavelength; the Lorentz factor L arises from the cross product between the diffraction vector and the projection of the crystal rotation-speed vector on the plane perpendicular to the primary beam; P is the polarization factor (Azároff, 1955); T is the transmission [absorbance is defined as A = -ln T, i.e. T = exp(-A)]; v_u is the volume of a primitive crystal unit cell; V is the volume of the crystal exposed by the beam; |F_hkl|^2 is the square of the structure-factor amplitude for a given reflection hkl; D_A is the X-ray absorption in the detector's active material; and D_S is the detector's response to a single absorbed X-ray photon, which may depend on the wavelength, the position on the detector, the incidence angle etc.
where I is the intensity and K is the total scale factor.
In principle, all scaling components in (1) can be calculated from non-diffraction measurements and calibration of the data-collection system. The total scale factor of a particular observation can be expressed as
Typically, only relative scale factors, kr, are calculated and an overall scale factor ko, common to all measurements, is determined by comparing the scaled data to the squared structure-factor amplitudes predicted from an atomic model. The overall scale factor describes the typical lack of knowledge about the absolute calibration of the whole system (Evans, 1993).
We always assume that some components of scale factors are known from detector calibration, beam monitoring and the diffraction geometry; kc represents their product. The scaling procedure described here determines only the remaining part of the relative scale factors, ks:
In many cases, quantities needed to describe ks are calculated based on the knowledge from subsequent stages of crystallographic analysis. In such cases, scaling and subsequent steps should be done iteratively, with reasonable guesses for starting values (see §§4.1 and 4.3).
The method presented here is a generalization of scaling methods in which parameterized scaling corrections were applied, for example a scale and temperature factor per batch of data (Fox & Holmes, 1966; Stuart & Walker, 1979; Evans, 1993). It applies the concept of exponential modeling (Della-Pietra et al., 1997) to crystallographic scaling. Scale factors for individual measurements are calculated using a set of parameters pi:
Each fi is a pre-defined modeling function of the experimental conditions for a given observation, and the pi are the a priori unknown parameters. The index i is shorthand for a hierarchical index that describes both the type of correction for a given physical effect and the indices of the parameters of the functions describing that correction. Some of the functions fi in (6) can be derived directly from the description of a particular physical effect, for example an error in individual amplifier gain in multi-readout-channel CCD detectors (see §2.1.2). To represent an unknown but smooth function, for example one describing absorption in the crystal (see §2.1.3), a combination of basis functions fi with a group of parameters can be used. In that case it is the values of the smooth function, rather than the individual pi coefficients, that can be interpreted in physical terms.
The advantage of the exponential modeling approach is its flexibility when dealing with correlations among parameters and a uniform description of the parameter optimization process for various scaling models.
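As a concrete illustration, the exponential model of equation (6) can be sketched in a few lines; the parameter and basis-function values below are hypothetical, chosen only to show the mechanics:

```python
import numpy as np

def scale_factor(params, basis_values):
    """Exponential scaling model of equation (6): k_s = exp(sum_i p_i * f_i).

    params       -- 1-D array of refined parameters p_i
    basis_values -- 1-D array of modeling-function values f_i evaluated
                    for one observation's experimental conditions
    """
    return np.exp(np.dot(params, basis_values))

# Example with two corrections active for one observation
p = np.array([0.05, -0.02])   # hypothetical refined parameters
f = np.array([1.0, 0.3])      # hypothetical basis-function values
k = scale_factor(p, f)        # exp(0.05*1.0 - 0.02*0.3) = exp(0.044)
```

Because the model is a product of exponentials, adding a new correction type only appends terms to the sum in the exponent, which is what makes the parameterization composable.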
Scale factors are used to describe crystal diffraction and the experimental system, and to correct for potential approximation errors introduced during the integration step.
On the crystal diffraction level, the main problem is absorption in the crystal, before and after diffraction (Kopfmann & Huber, 1968; Huber & Kopfmann, 1969; Stuart & Walker, 1979; Schutt & Evans, 1985).
Typical instrumental problems are: beam fluctuations (Stupakov & Heifets, 2002); the error in the position of the rotation axis; inaccuracy of detector calibration (Tate et al., 2000); mechanical problems resulting in non-uniform crystal rotation and shutter errors (Evans, 1993).
Integration programs make certain assumptions about diffraction spot shape and size. Crystal mosaicity and changing spot shape may result in systematic underestimation of the diffraction intensity owing to reflection tails extending past the assumed shape. Imperfect profile shape predictions also result in systematic effects (Diamond, 1969; Leslie, 1999).
Parameterization of a particular source of an effect described by a scale factor may be quite effective in describing other physical phenomena or approximation errors, and for this reason refined values of parameters may not have a simple physical interpretation. For example, beam intensity corrections may also correct for changes in exposed volume of the crystal.
An example of the application of a modeling function is correcting for a small discrepancy between the actual and assumed directions of the crystal rotation axis. The inaccuracy results in an error in the calculated value of the Lorentz factor. The scale factor to correct this error can be described using a parameter p_l, whose value represents the small angular error; the corresponding function is
Detector-specific corrections are important for measurements of a very weak phasing signal. A very small correction, of about 1% in magnitude, can be critical for phasing from weak anomalous scatterers. Even if the detector sensitivity has been properly calibrated, it may have changed with time. For detectors with multichannel read-out, the most likely source of the above problem is a relative change in amplifier gains. The amplifier corrections are represented by characteristic functions corresponding to CCD areas read out by separate channels:
The parameters pda,j are logarithms of relative changes in amplifier gains.
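The indicator-function form of this correction can be sketched as follows; the 2 × 2 channel layout, the detector size and the gain values are hypothetical assumptions, not properties of any particular detector:

```python
import numpy as np

# Hypothetical CCD read out by a 2x2 grid of amplifier channels,
# each covering one quadrant of a 2048x2048 detector.
DET = 2048

def channel_index(x, y):
    """Which amplifier channel reads out the pixel (x, y)."""
    return 2 * (y >= DET // 2) + (x >= DET // 2)

def amplifier_basis(x, y, n_channels=4):
    """Characteristic functions f_j: 1 for the channel containing the
    spot, 0 elsewhere (the indicator form of the correction)."""
    f = np.zeros(n_channels)
    f[channel_index(x, y)] = 1.0
    return f

# The scale contribution is exp(p_j) for the active channel:
p_gain = np.array([0.000, 0.012, -0.007, 0.003])  # hypothetical log-gain changes
f = amplifier_basis(x=300, y=1500)                # upper-left quadrant: channel 2
k_amp = np.exp(np.dot(p_gain, f))                 # exp(-0.007)
```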
Other detector corrections require a smooth function that can be described by a series of basis functions. One possibility is to use two-dimensional Fourier–Bessel series (Weissman, 1982); alternatively, one can use two-dimensional Chebyshev polynomials or two-dimensional cosines (Boyd, 1991). The last two options require transforming the detector coordinates (x, y) onto the square [-1, 1] × [-1, 1]. For the transformed coordinates, the modeling function is
The above functions have been used to effectively correct for the signal decay during the imaging plate read-out. To correct for differences between plates in two-plate scanners, an additional index is necessary to describe separate corrections for individual plates.
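A minimal sketch of the two-dimensional Chebyshev option, assuming a simple linear mapping of pixel coordinates onto [-1, 1] × [-1, 1] and a small maximum order (both assumptions, not prescriptions of the article):

```python
import numpy as np
from numpy.polynomial.chebyshev import chebval

def to_square(x, y, nx, ny):
    """Map detector coordinates onto [-1, 1] x [-1, 1]."""
    return 2.0 * x / nx - 1.0, 2.0 * y / ny - 1.0

def chebyshev_basis(x, y, nx, ny, max_order=2):
    """Products T_n(x') T_m(y') for n, m = 0..max_order,
    one basis function f_i per (n, m) pair."""
    xs, ys = to_square(x, y, nx, ny)
    # chebval with coefficient vector e_n evaluates the single polynomial T_n
    tx = [chebval(xs, [0] * n + [1]) for n in range(max_order + 1)]
    ty = [chebval(ys, [0] * m + [1]) for m in range(max_order + 1)]
    return np.array([a * b for a in tx for b in ty])
```

At the detector center the mapped coordinates are (0, 0), so only the even-order products are non-zero, e.g. T_0 = 1 and T_2(0) = -1.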
Smooth functions dependent on the spot position might also correct for other phenomena, e.g. the non-rotating component of absorption, an incorrect polarization correction and systematic errors in integration. In diffraction data collected with a large oscillation range from mosaic crystals, reflections diffracting along the rotation axis have a distorted spot shape. Such spot-shape variability results in an integration error that decreases with the distance from the projection of the rotation axis onto the detector. A function that was found useful in correcting for this effect is
In the case of anisotropic mosaicity, this correction can be made asymmetric with respect to the sign of f_l (a left/right distinction for a horizontal rotation axis). This can be accomplished by using two functions instead of the one in (10):
where step is a step function with zero value for negative arguments and one for positive.
Owing to a different geometry of diffraction for every reflection, the absorption in the crystal is not uniform. A frequently used approximation of crystal absorbance is the average of absorbances in the incoming- and diffracted-beam directions (Kopfmann & Huber, 1968). The crystal absorption can be parameterized by real spherical harmonics in the rotating crystal coordinate system (Katayama, 1986; Blessing, 1995):
where a_s,lm and a_c,lm are parts of the hierarchical index i from (6); l, m are indices of the spherical harmonics; P_lm are associated Legendre polynomials; (θ_i, φ_i) and (θ_o, φ_o) are the polar coordinates of the incoming (index i) and outgoing (index o) beam directions in the crystal coordinate system.
Owing to absorption being the same in opposite directions, the odd-order spherical harmonics should have zero coefficients and can be omitted in the description of pure absorption. Nevertheless, owing to correlation with other effects, the scaling results can be improved by the inclusion of low-order odd harmonics.
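A sketch of evaluating such basis functions for one observation, hand-coding a few low-order real spherical harmonics and averaging over the incoming and outgoing beam directions; the omission of normalization constants and the particular set of (l, m) orders are simplifying assumptions:

```python
import numpy as np

def real_harmonic(l, m, theta, phi):
    """A few low-order real spherical harmonics (unnormalized),
    enough to sketch the absorption surface described in the text."""
    c = np.cos(theta)
    if (l, m) == (0, 0):
        return 1.0
    if (l, m) == (1, 0):
        return c
    if (l, m) == (2, 0):
        return 0.5 * (3.0 * c**2 - 1.0)
    if (l, m) == (2, 1):
        return np.sin(theta) * c * np.cos(phi)
    raise NotImplementedError("order not included in this sketch")

def absorption_basis(l, m, dir_in, dir_out):
    """Average of the harmonic values for the incoming and outgoing
    beam directions (theta, phi), following the mean-of-absorbances
    approximation of Kopfmann & Huber (1968)."""
    return 0.5 * (real_harmonic(l, m, *dir_in) + real_harmonic(l, m, *dir_out))
```

Restricting the refinement to even-order harmonics, as the paragraph above suggests for pure absorption, amounts to dropping the odd (l, m) branches from the basis.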
This scale factor describes the product of different physical effects: beam intensity, illuminated crystal volume and average absorption. It has a simple modeling function:
The parameter psj is the logarithm of the scale factor for batch j.
Resolution-dependent crystal decay can be described by a temperature factor that is a function of the accumulated radiation dose. This dependence can be stated explicitly:
where f_pb for n = 1 describes a linear dependence, and higher-order terms can be added to describe more complex radiation-decay behavior. Alternatively, the traditional approach is to apply temperature factors separately to batches of data:
Parameter pbj is the temperature factor of batch j.
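The dose-dependent temperature factor enters the exponent of equation (6) linearly in the parameters. The sketch below assumes the contribution to log k_s has the form sum_n p_b,n d^n (sin θ/λ)^2, with the sign convention absorbed into p_b,n; the numerical values are hypothetical:

```python
import numpy as np

def dose_temperature_term(p_b, dose, s2):
    """Dose-dependent temperature-factor contribution to log k_s.

    p_b  -- coefficients p_b,n of the dose polynomial, n = 1, 2, ...
    dose -- accumulated radiation dose for this observation
    s2   -- (sin(theta)/lambda)**2 for the reflection

    Returns sum_n p_b,n * dose**n * s2; a single coefficient (n = 1)
    gives linear radiation decay.
    """
    return sum(p * dose ** (n + 1) for n, p in enumerate(p_b)) * s2

# Hypothetical numbers: one linear decay coefficient
log_k = dose_temperature_term([-0.8], dose=0.5, s2=0.04)  # -0.8*0.5*0.04 = -0.016
k = np.exp(log_k)
```

Because the term is linear in p_b,n, it fits the same one-shot refinement machinery as every other correction in the exponential model.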
The rotation method (Arndt & Wonacott, 1977) assumes that the exposure is constant in the angular range. The factors that can result in non-uniform exposure are: gear misalignment in the crystal goniostat, beam fluctuations and X-ray shutter timing errors. Uneven exposure can be reproducible from image to image and may also have a random component. The correction for the reproducible part of exposure fluctuations is well determined and its description needs relatively few parameters. The correction for the random component can be problematic owing to a large number of parameters needed to describe it. Random fluctuations should be eliminated rather than corrected for by scaling. The main value of correcting for random changes may lie in its usefulness as a powerful diagnostic tool to detect under-recognized instrumental problems.
Uneven exposure can be described by a function series based on the sine and cosine of the goniostat angle corresponding to a diffraction condition for the center of a Bragg peak. The correction of uneven exposure has to average the exposure variations over the angular range in which a particular reflection diffracts. The angular range depends not only on mosaicity but also on the geometry of a reflection crossing the Ewald sphere.
The reproducible unevenness of exposure due to mechanical backlash or a shutter error can be described by a series of frequencies g = nt, where n = 1, 2, ... and t is the fundamental frequency determined by the rotation range. For a particular frequency g, the modeling functions are:
For a Gaussian mosaicity profile, the averaging factor A should be a Gaussian function that depends on the coefficient c. The expression for A approximates the Gaussian function effectively, but it can also approximate the averaging resulting from other mosaicity shapes. The averaging resulting from the mosaicity shape function is described by two parameters, a and b, which tend to have values close to one and zero, respectively. For unusual shapes of intensity profiles, a and b may need to be optimized. To describe random fluctuations, one can use terms of the same form with g = nt/2, where n goes from 1 to a maximum value. The parameters p_u,g,c,j are independent for every frame of data j. The term corresponding to n = 0 is the same as the scale factor per batch of data (see §2.1.4).
The consequence of averaging of the exposure variations is the reduced impact of the fluctuations with very high frequencies. For example, a stepper motor creates fluctuations corresponding to the step size of the motor divided by the gear ratio, typically 0.01° or less. For crystals of typical mosaicity, the consequences of such high-frequency fluctuations are negligible. Fluctuations of very low frequencies are corrected by the scale factor per batch j. The fluctuations of intermediate angular scale are the ones most affected by (16).
The scaling procedure determines the parameters pi [equation (6)] by minimizing a χ²-like target function. The target function describes the agreement between symmetry-related reflections, weighted by the expected measurement errors. Two similar target functions can be constructed, depending on whether the intensities or their logarithms are compared.
A function that works very well in practice and whose optimization always converges in one cycle is obtained from comparing the logarithms of scaled measurements (Rae, 1965). First, we define r as the logarithm of the unknown scale factor ks:
then Pm is the logarithm of the measured intensity Im for a given observation:
where Ps is the logarithm of a scaled observation.
The intensity error estimate can be used to calculate the error estimate of the logarithm of the intensity and the corresponding weight u in the least-squares method:
where n indexes the symmetry-related observations of a unique hkl. In the target function, we minimize the squared difference between the logarithm of each scaled intensity and the logarithm of the scaled geometric average of the symmetry-related measurements:
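This target can be sketched directly from the definition; the grouping of observations by unique hkl and the least-squares weights u are inputs. Note that the weighted mean of the logarithms is exactly the logarithm of a weighted geometric average:

```python
import numpy as np

def log_target(r, log_I, weights, groups):
    """Chi-square-like target on logarithms: for every unique hkl,
    the weighted sum of squared deviations of the scaled log
    intensities P_m + r from their weighted mean (the log of a
    weighted geometric average of the symmetry mates).

    r       -- log scale factor per observation
    log_I   -- log of measured intensity P_m per observation
    weights -- least-squares weight u per observation
    groups  -- unique-hkl index per observation
    """
    p = log_I + r
    total = 0.0
    for g in np.unique(groups):
        sel = groups == g
        mean = np.average(p[sel], weights=weights[sel])
        total += np.sum(weights[sel] * (p[sel] - mean) ** 2)
    return total
```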
The minimum of the target function can be found by using multivariate Newton methods. The first and second derivatives are straightforward to calculate. The first derivative is calculated with respect to r and then, using the chain rule, the derivative with respect to parameters pi is calculated:
The second derivative with respect to r is
Because r is a linear function of unknown parameters pi:
the first derivative of the target function with respect to pi is
It is important to note that the second derivative with respect to pi for the logarithmic target function is constant – independent of the parameters pi [equations (23) and (25)]. In such a case, the solution is given by a single matrix equation, and there is no need to make initial guesses for parameter values.
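Because the exponent is linear in the parameters, the whole refinement reduces to one weighted least-squares solve. The sketch below centers both the log intensities and the basis-function rows on their weighted group means and solves the normal equations; a least-squares solver with pseudo-inverse behavior is used because the overall scale is indeterminate (adding a constant to all r leaves the target unchanged). The design matrix F, one row of basis values f_i per observation, is a hypothetical parameterization:

```python
import numpy as np

def solve_log_scaling(log_I, weights, groups, F):
    """One-shot solution of the logarithmic scaling problem:
    r = F @ p is linear in p and the Hessian is constant, so the
    minimum follows from a single normal-equations solve."""
    Ft = np.empty_like(F)
    yt = np.empty_like(log_I)
    for g in np.unique(groups):
        sel = groups == g
        w = weights[sel]
        # center observations and basis rows on weighted group means
        yt[sel] = log_I[sel] - np.average(log_I[sel], weights=w)
        Ft[sel] = F[sel] - np.average(F[sel], axis=0, weights=w)
    W = np.diag(weights)
    return np.linalg.lstsq(Ft.T @ W @ Ft, -(Ft.T @ W @ yt), rcond=None)[0]

# Two symmetry mates measured in two hypothetical batches:
# observation 0 has basis value 0, observation 1 has basis value 1.
p = solve_log_scaling(np.array([1.0, 0.7]), np.ones(2),
                      np.array([0, 0]), np.array([[0.0], [1.0]]))
```

In this toy case the solver recovers p = 0.3, the log-scale shift that makes the two measurements agree.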
3.2. Alternative refinement of scale factors
The traditional approach to finding scale factors is based on an equation where the inverse rather than the logarithm of the scale factor appears and where we use the arithmetic rather than the geometric average of intensities:
where k_s is the scale factor, σ_m is the estimated error of I_m and ⟨I⟩ is the mean intensity of the symmetry-related reflections for a given hkl.
The calculation of the first and then the second derivative with respect to ks follows the same logic as in the case of a logarithmic function. First derivative:
and second derivative:
However, the formula
when used as a second derivative in the Newton method, gives better convergence because it better approximates the second derivative at the minimum of the target function.
Since in this case the second derivative is not constant with respect to ks, the Newton method has to be used iteratively.
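A generic sketch of such an iteration for a single scale factor, using a Gauss–Newton-style approximate second derivative (the term containing the residual is dropped) and holding the mean intensity fixed; this illustrates the approximate-Hessian idea rather than reproducing the article's exact formulas:

```python
import numpy as np

def refine_batch_scale(I, w, I_mean, k0=1.0, n_iter=10):
    """Iterative Newton refinement of one scale factor k_s,
    minimizing sum w * (I/k_s - <I>)**2 with <I> held fixed.
    The approximate Hessian keeps only the squared-Jacobian term,
    which better matches the true Hessian at the minimum."""
    k = k0
    for _ in range(n_iter):
        resid = I / k - I_mean
        jac = -I / k**2                    # d(resid)/dk
        grad = 2.0 * np.sum(w * resid * jac)
        hess = 2.0 * np.sum(w * jac**2)    # residual term dropped
        k -= grad / hess
    return k
```

Unlike the logarithmic solve, this update must be repeated: each step changes k_s and therefore the derivatives.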
It is not clear if or when scale factors determined by logarithmic or traditional approaches are better, but even in the case when one prefers the traditional approach, it is preferable to run the first cycle with the logarithmic function owing to its superior convergence.
For reflections with no redundancy, scale factors are extrapolated from other measurements. Generally, without additional restraints, the extrapolation is unstable for highly correlated parameters. In the method presented here, the restraints can be obtained from the prior knowledge about the reasonable magnitude of refined parameters. The logic behind it is the same as that behind the restraints in the atomic refinement. For example, one can assume that logarithms of scale factors typically do not fluctuate more than ws between frames, where expectation about ws is a function of the data-collection stability (beam stability, goniostat and/or crystal vibrations). This knowledge is described by adding a penalty term to the functions being optimized:
where psj is a logarithm of the scale factor for batch j.
We can similarly treat the expectation about the magnitude of the absorption coefficients. For a smooth absorption, with the expectation of decreasing magnitude of the parameters for high orders of the spherical harmonics [equation (21)], a reasonable penalty term added to the target function is
If we do not want to penalize high-order terms more, we can use the following term:
In a similar fashion, penalty terms can be created and used to restrain all the other scaling parameters.
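The frame-to-frame restraint on the logarithms of the scale factors can be sketched as a quadratic penalty added to the target; p_s and w_s follow the notation above:

```python
import numpy as np

def smoothness_penalty(p_s, w_s):
    """Quadratic restraint on frame-to-frame changes of the log
    scale factors p_s,j: sum_j ((p_s,j+1 - p_s,j) / w_s)**2,
    where w_s encodes the expected stability of the data collection."""
    return float(np.sum((np.diff(p_s) / w_s) ** 2))
```

A change of exactly w_s between consecutive frames contributes one unit to the target, so w_s directly sets the scale at which fluctuations start to be penalized.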
The main criteria of scaling effectiveness come from the subsequent stages of analysis (merging, phasing etc.), described elsewhere. During scaling itself, we can only judge the convergence and the value of the target function, the overall χ².
The method described above, based on the logarithmic target function, always converges in one cycle. Thus, the next step, merging, can always be accomplished, even with problematic data. A high value of the overall χ² does not in itself indicate whether it is due to anomalous signal, low data quality or an unreasonable error model. Comparing scaled symmetry-related data allows various statistics to be calculated in order to identify potential problems with data quality or the error model, or the existence of non-isomorphism in the diffraction.
Potentially, merging analysis may separate sources of variability in the structure factors owing to anomalous and dispersive effects, radiation-induced changes, non-isomorphism among crystals, and pseudosymmetry. In the case of structure factors not being constant, it is usually better to ignore this variability during determination of the scale factors and to scale the reflections together but separate them during merging (Evans, 1993; Otwinowski & Minor, 2000). Redetermination of the error model, postrefinement and rejection of outliers may impact the assumptions about the data to be scaled, so it may be worthwhile to redo the scaling after the execution of these steps.
In the following example, the structure factors were stable and the measurement errors were lower than in most experiments. Traditional scaling with scale and temperature factor per batch of data resulted in better than average merging statistics, but still with a rather low value of the significance ratio for anomalous scattering from intrinsic sulfur. Application of the above-described scaling corrections improved the merging statistics and resulted in a significant anomalous signal for medium resolution (Table 1).
Global scaling can be followed by local scaling (Matthews & Czerwinski, 1975). Local scaling is mostly applied to calculate differences of the phasing signal, where it is assumed that a group of measurements, for example those close together in the reciprocal space or in the detector space, should be on a similar scale. A flexible parameterization by the exponential modeling allows for a good description of all kinds of smooth corrections. Local scaling is much more limited in terms of what type of smooth variation is being corrected for, so it is unlikely to provide additional benefit to the general scaling method described here.
The basic equation (1) assumes a crystal with a level of imperfection that results in negligible extinction. More perfect crystals require a non-linear correction for extinction. However, crystals often are more than `ideally imperfect' (Guinier, 1994). The departure from crystallinity may result in diffuse scattering, extra modulation (incommensurate structure), stacking disorders, large mosaicity that produces spot overlap, twinning etc. The sample may also be contaminated with other crystals or ice. These problems may require further analysis with already scaled data, but with results that can impact the assumptions about the data to be scaled. As in the case of changes in the error model or outlier rejection, iterations may be needed.
One of the purposes of scaling is to correct for a reasonable level of imperfection in the measurement system. Such a procedure may also attempt to correct severe problems resulting in non-obvious behavior of the determined scale factors. For example, an incorrect or missing description of the beamstop produces similar effects to a very large absorption in an area of reciprocal space. The smoothly varying absorption correction will produce a high estimate of absorption not only for the area behind the beamstop but also for reflections at some distance from the beamstop. Such problems should not be corrected by scaling, but rather by applying the proper experimental procedure.
The purpose of scale determination can be the calibration of the instrument rather than structure determination. For this purpose, one can use a diffraction pattern from a very high quality crystal, but it is even better to use artificial patterns. In particular, grid masks can be used not only for distortion calibration but also for the calibration of the detector sensitivity, in addition to the flood-field method. Sensitivity calibration can also be defined as a problem of finding a multiplicative scale function, so it can be parameterized and determined in an analogous fashion to that described above. This may help to solve problems in detector calibration discussed previously in the literature (Tate et al., 2000).
The structure determination from a single-crystal diffraction experiment requires scale factors that relate measured and calculated intensities. We describe here a scaling procedure based on exponential modeling that is very effective in correcting for factors that are difficult to calculate from the physical description of the experiment. Because such factors always exist in macromolecular crystallography owing to the complex geometry of absorption around the crystal and owing to radiation decay, this kind of scaling is in practice necessary. It is also effective in small-molecule crystallography.
The power of the described method comes from the absolute stability of the refinement for virtually any description of experimental problems. All previously proposed parameterizations (Fox & Holmes, 1966; Rossmann et al., 1979; Evans, 1993) can be accommodated within the exponential approach. The parameterizations described here are being implemented in Scalepack (Otwinowski & Minor, 1997, 2000). This program corrects for a large group of problems that have been observed to be significant, though typically not all problems arise in every case. The development of the parameterization was driven by the recognition of problems encountered in practice, and this process is likely to continue. Demonstrating the success of the corrections discussed in this article requires a description of the problems encountered in particular experiments, which is the subject of separate publications.
The parameters in the scaling model should be chosen on the basis of the aim of the experiment and the understanding of what is the most significant scaling problem likely to be encountered. The choice of parameterization is particularly important when obtaining the phasing signal is the target of the experiment. For typically sized crystals, errors in the measurements of strong reflections are mostly multiplicative, so in principle an appropriate scaling model can correct them. Carefully conducted experiments with proper scaling should allow for SAD structure solution from sulfur and other weak scatterers to be more than a rare occurrence.
This work was supported by grant GM53163 from the National Institutes of Health. The authors would like to thank Janet Smith and MinSun LeeSong for providing diffraction data presented here, Halszka Czarnocka for her help with the preparation of the manuscript and Mischa Machius for his extensive comments and stimulating discussions.
Arndt, U. W. & Wonacott, A. J. (1977). The Rotation Method in Crystallography. Amsterdam: North-Holland.
Azároff, L. (1955). Acta Cryst. 8, 701–704.
Blessing, R. H. (1995). Acta Cryst. A51, 33–38.
Boyd, J. P. (1991). Chebyshev and Fourier Spectral Methods. Mineola, NY: Dover.
Della-Pietra, S., Della-Pietra, V. & Lafferty, J. (1997). IEEE Trans. Pattern Anal. Mach. Intell. 19, 380–393.
Diamond, R. (1969). Acta Cryst. A25, 43–55.
Evans, P. R. (1993). Data Reduction: Data Collection and Processing, edited by L. Sawyer, N. Isaacs & S. Bailey, Proceedings of the Daresbury CCP4 Study Weekend, pp. 114–123. Daresbury Laboratory, Warrington, England.
Fox, G. C. & Holmes, K. C. (1966). Acta Cryst. 20, 886–891.
Guinier, A. (1994). X-ray Diffraction in Crystals, Imperfect Crystals, and Amorphous Bodies. New York: Dover.
Huber, R. & Kopfmann, G. (1969). Acta Cryst. A25, 143–152.
Katayama, C. (1986). Acta Cryst. A42, 19–23.
Kopfmann, G. & Huber, R. (1968). Acta Cryst. A24, 348–351.
Leesong, M., Henderson, B. S., Gillig, J. R., Schwab, J. M. & Smith, J. L. (1996). Structure, 4, 253–264.
Leslie, A. G. W. (1999). Acta Cryst. D55, 1696–1702.
Matthews, B. W. & Czerwinski, E. W. (1975). Acta Cryst. A31, 480–487.
Otwinowski, Z. & Minor, W. (1997). Processing of X-ray Diffraction Data Collected in Oscillation Mode. In Methods in Enzymology, Vol. 276, edited by C. W. Carter Jr & R. M. Sweet. New York: Academic Press.
Otwinowski, Z. & Minor, W. (2000). Denzo and Scalepack. In International Tables for Crystallography, Vol. F, edited by M. G. Rossmann & E. Arnold. Dordrecht: Kluwer.
Rae, A. (1965). Acta Cryst. 19, 683–684.
Rossmann, M. G., Leslie, A. G. W., Abdel-Meguid, S. S. & Tsukihara, T. (1979). J. Appl. Cryst. 12, 570–581.
Schutt, C. E. & Evans, P. R. (1985). Acta Cryst. A41, 568–570.
Stuart, D. & Walker, N. (1979). Acta Cryst. A35, 925–933.
Stupakov, G. & Heifets, S. (2002). Phys. Rev. ST Accel. Beams, 5, 054402.
Tate, M. W., Eikenberry, E. F. & Gruner, S. M. (2000). CCD Detectors. In International Tables for Crystallography, Vol. F, edited by M. G. Rossmann & E. Arnold. Dordrecht: Kluwer.
Weissman, L. (1982). Strategies for Extracting Isomorphous and Anomalous Signals. In Computational Crystallography, edited by D. Sayre. Oxford: Clarendon Press.
© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.