Simple algorithm for a maximum-likelihood SAD function

McCoy, A.J.; Storoni, L.C.; Read, R.J.

doi:10.1107/S0907444904009990


ISSN: 2059-7983

Volume 60| Part 7| July 2004| Pages 1220-1228

https://doi.org/10.1107/S0907444904009990

Airlie J. McCoy,^a Laurent C. Storoni^a and Randy J. Read^a^*

^aUniversity of Cambridge, Department of Haematology, Cambridge Institute for Medical Research, Wellcome Trust/MRC Building, Hills Road, Cambridge CB2 2XY, England
^*Correspondence e-mail: rjr27@cam.ac.uk

(Received 18 December 2003; accepted 23 April 2004)

Recently, the multivariate complex normal distribution has been used to develop a maximum-likelihood probability function for single-wavelength anomalous diffraction phasing and refinement of heavy-atom parameters [Pannu & Read (2004), Acta Cryst. D60, 22–27]. The function accounts explicitly for the correlations between the observed and calculated Friedel mates and their errors. However, the method of derivation of the equation described by Pannu & Read (2004) leads to a complicated likelihood expression that suffers from a number of algorithmic limitations. Here, an alternative derivation of the P_SAD function is described that leads to simplified algorithmic requirements and that allows an intuitive understanding of the expression.

Keywords: single-wavelength anomalous diffraction; multivariate complex normal; maximum likelihood; experimental phasing.

1. Introduction

The availability of tuneable synchrotron sources allowed the development of multiple-wavelength anomalous diffraction (MAD; Hendrickson, 1991) phasing experiments, which today underpin many high-throughput structural biology efforts around the world. With improvements in synchrotron sources, cryocooling of crystals and increased detector sensitivity, phasing by single-wavelength anomalous diffraction (SAD) has become not only feasible, but in some cases preferable to phasing by MAD, particularly where radiation damage is significant (Rice et al., 2000; Dodson, 2003) or where the absorption edge for the anomalous scatterer is not accessible (e.g. sulfur, xenon). However, until recently technical improvements in the SAD experiment had not been matched by corresponding improvements in the theory for obtaining phases from SAD.

A maximum-likelihood treatment of the SAD phasing problem describes the probability distribution P_SAD of the (unphased) observed structure factors F⁺ and F⁻ given the (phased) calculated heavy-atom structure factors H⁺ and H⁻,

$[P_{\rm SAD} = P({F}^{+}, {F}^ - |{\bf H}^{+}, {\bf H}^{-*}),]$

where $[F^+ = |{\bf F}^+|]$ and $[F^- = |{\bf F}^-|]$. F⁺ and F⁻ are highly correlated and so P_SAD cannot be approximated by a product of independent probabilities for the two observations F⁺ and F⁻. Also highly correlated are the substructure-model errors contributing to the conditional probability distribution of F⁺ and F⁻, since they are generated by the same set of anomalous scatterers. These correlations must be included in the probability distribution for a complete analysis.

Traditional methods for SAD phasing have avoided the complication of including the correlations by using the mean F and the Bijvoet difference (F and ΔF^±) rather than F⁺ and F⁻, as these are relatively independent and have relatively independent errors. In these treatments, the distribution of Bijvoet differences has been assumed to be Gaussian (North, 1965; Matthews, 1966; de La Fortelle & Bricogne, 1997). More recently, joint probability distributions for F⁺ and F⁻ have been described that go some way towards addressing the problem (Hauptman, 1982; Giacovazzo, 1983; Burla et al., 2002; Giacovazzo & Siliqi, 2001a,b; Terwilliger & Eisenberg, 1987), but it was not until Pannu & Read (2004) that a P_SAD function was described that accounted explicitly for the correlations in the SAD experiment,

$[\eqalignno {P_{\rm SAD} & = {{2F^+ F^- |\Sigma _2|} \over {\pi|\Sigma _4|}} \exp [-a_{11} F^{+2} - a_{22} F^{-2} \cr &\ \quad -\ (a_{33} - c_{33})H^{+2} - (a_{44} - c_{44})H^{-2}] \cr &\ \quad {\times}\ \exp\{ - 2H^+ H^- [(a_{34} - c_{34})\cos (\alpha_H^+ - \alpha_H^-) \cr &\ \quad -\ (b_{34} - d_{34})\sin (\alpha_H^+ - \alpha_H^-)] \} \cr &\ \quad {\times}\ \textstyle \int\limits_0^{2\pi} \big(\exp \{- 2F^- H^+[a_{23} \cos(\alpha^- - \alpha_H^+) \cr &\ \quad -\ b_{23} \sin (\alpha^- - \alpha_H^+)]\} \cr &\ \quad {\times}\ \exp \{ - 2F^- H^- [a_{24} \cos(\alpha^- - \alpha_H^-)\cr &\ \quad -\ b_{24} \sin (\alpha^- - \alpha_H^-)]\} I_0 (\xi^{1/2}) \,\,{\rm d}\alpha^-\big), & (1)}]$

$[\eqalign {\xi & = 4 F^{+2} [a_{12}F^ - \cos(\alpha^-) + b_{12}F^- \sin(\alpha^-) \cr &\ \quad +\ a_{13}H^+ \cos(\alpha_H^+) + b_{13}H^+ \sin (\alpha_H^+) \cr &\ \quad +\ a_{14} H^- \cos (\alpha_H^-) + b_{14}H^- \sin(\alpha_H^-)]^2 \cr &\ \quad +\ 4F^{+2}[a_{12}F^- \sin(\alpha^-) - b_{12}F^- \cos(\alpha^-) \cr &\ \quad +\ a_{13}H^+ \sin (\alpha_H^+) - b_{13}H^+ \cos(\alpha_H^+) \cr &\ \quad +\ a_{14}H^- \sin(\alpha_H^-) - b_{14}H^- \cos(\alpha_H^-)]^2 },]$

$[\Sigma_4^{ - 1} = \left(\matrix{ a_{11} & a_{12} + ib_{12} & a_{13} + ib_{13} & a_{14} + ib_{14} \cr a_{12} - ib_{12} & a_{22} & a_{23} + ib_{23} & a_{24} + ib_{24} \cr a_{13} - ib_{13} & a_{23} - ib_{23} & a_{33} & a_{34} + ib_{34} \cr a_{14} - ib_{14} & a_{24} - ib_{24} & a_{34} - ib_{34} & a_{44} } \right),]$

$[\Sigma _2^{ - 1} = \left(\matrix{ c_{11} & c_{12} + id_{12} \cr c_{12} - id_{12} & c_{22} }\right),]$

where Σ₄ is the (Hermitian) covariance matrix of the tetravariate complex Gaussian distribution P(F⁺, F^−*, H⁺, H^−*), Σ₂ is the (Hermitian) covariance matrix of the bivariate Gaussian complex distribution P(H⁺, H^−*) and α⁻, $[\alpha_{H}^{+}]$ and $[\alpha_{H}^{-}]$ are the phases of F^−*, H⁺ and H^−*, respectively. It is assumed that the reflections are independent, so the total likelihood is the product of the reflection likelihoods.

The complexity of (1) is immediately apparent. There are 20 different coefficients arising from the inverse of the covariance matrices Σ₄ (ten real, six imaginary) and Σ₂ (three real, one imaginary). During refinement Σ₄ and Σ₂ must be kept positive definite and in the implementation of the P_SAD function described by Pannu & Read (2004) this was performed by setting negative eigenvalues to zero during calculation of their inverses by singular value decomposition. The derivatives of the function become even more verbose. In the implementation described by Pannu & Read (2004), derivatives were not calculated analytically. Instead, an automatic differentiation method (ADOLC; Griewank et al., 1996) was used to obtain the gradient vectors. The complex functional form of (1) makes it difficult to get an intuitive feel for the effects of the different parameters or the physical meaning of the terms.

Here, we present an alternative derivation of a maximum-likelihood P_SAD function that has only three unique error parameters, does not involve matrix inversion, allows analytic derivatives to be calculated easily and provides an intuitive understanding of the SAD experiment.

2. Results

2.1. SAD likelihood function

Equation (1) was derived by finding the expression for P(F⁺, F^−*, H⁺, H^−*), integrating out the unknown phases to obtain the joint probability P(F⁺, F⁻, H⁺, H^−*) and then fixing the calculated structure factors and renormalizing to obtain the desired conditional probability P(F⁺, F⁻|H⁺, H^−*). If, instead, the order of the operations is reversed and the conditional probability P(F⁺, F^−*|H⁺, H^−*) is formed before integrating out the unknown phases, we obtain (Appendix A) the expression

$[\eqalignno {P_{\rm SAD} & = {{2F^ + F^-} \over {\pi \varepsilon ^2 (1 - D_\Phi ^2)\sigma_\Delta^4 }}{\textstyle \int\limits_0^{2\pi} }\exp \biggr [- {{|F^- \exp(i\alpha^-) - D{\bf H}^{-*}|^2 } \over {\varepsilon \sigma _\Delta ^2 }} \cr &\ \quad -\ {{F^{+2} + F_C^{+2}} \over {\varepsilon (1 - D_\Phi^2)\sigma _\Delta ^2 }} \biggr] I_0 \left [{{2F^+ F_C^+} \over {\varepsilon (1 - D_\Phi^2)\sigma _\Delta ^2 }} \right]\,\,{\rm d}\alpha^-,&(2)}]$

where

$[F_C^+ = |D{\bf H}^+ + D_\Phi \exp(i\alpha_\Phi) [F^- \exp(i\alpha^-) - D{\bf H}^{-*}]|.]$

This equation contains three error parameters derived from the initial covariance matrix (σ_Δ, D_Φ and α_Φ). Again, it is assumed that the reflections are independent so that the total likelihood is the product of the reflection probabilities.

Equation (2) was derived by integrating out the phase α⁺ analytically, leaving the integration over α⁻ to be performed numerically. Equivalently, the phase α⁻ could have been integrated out analytically, leaving the integration over α⁺ to be performed numerically. Numerical integration tests comparing these two forms of the equation confirm that they give the same values for P_SAD (data not shown).
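The numerical integration over α⁻ in (2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and is not the PHASER implementation: the function names, the uniform angular grid and the numerical values for the single reflection are hypothetical, and the Bessel function is evaluated in exponentially scaled form from its integral representation so that only the standard library is required. The reversed form assumes that exchanging the roles of the Friedel mates replaces α_Φ by −α_Φ, as follows from the Hermitian symmetry of the conditional covariance matrix in Appendix A.

```python
import cmath
import math


def ei0(x, n=256):
    """exp(-x) * I0(x) for x >= 0, as the mean of exp(x*(cos t - 1)) over n angles."""
    step = 2.0 * math.pi / n
    return sum(math.exp(x * (math.cos(j * step) - 1.0)) for j in range(n)) / n


def rice(f_obs, f_calc, var):
    """Rice density 2F/var * exp(-(F^2 + Fc^2)/var) * I0(2*F*Fc/var), written with eI0."""
    x = 2.0 * f_obs * f_calc / var
    return 2.0 * f_obs / var * math.exp(-((f_obs - f_calc) ** 2) / var) * ei0(x)


def p_sad(f_plus, f_minus, h_plus, h_minus_conj,
          sigma_delta2, d_phi, alpha_phi, d=1.0, eps=1.0, n_alpha=360):
    """Equation (2): integrate over the phase of F-* on a uniform grid."""
    var_sim = eps * sigma_delta2
    var_ano = eps * (1.0 - d_phi ** 2) * sigma_delta2
    shift = d_phi * cmath.exp(1j * alpha_phi)
    step = 2.0 * math.pi / n_alpha
    total = 0.0
    for j in range(n_alpha):
        delta = cmath.rect(f_minus, j * step) - d * h_minus_conj
        sim = f_minus / (math.pi * var_sim) * math.exp(-abs(delta) ** 2 / var_sim)
        f_c_plus = abs(d * h_plus + shift * delta)
        total += sim * rice(f_plus, f_c_plus, var_ano)
    return total * step


# Hypothetical values for a single reflection (arbitrary units).
h_plus = cmath.rect(30.0, 0.4)
h_minus_conj = cmath.rect(28.0, 0.5)
forward = p_sad(110.0, 100.0, h_plus, h_minus_conj, 1.0e4, 0.98, 0.05)
# Roles of the Friedel mates reversed: alpha+ is now the numerical variable.
reverse = p_sad(100.0, 110.0, h_minus_conj, h_plus, 1.0e4, 0.98, -0.05)
print(forward, reverse)
```

Under these assumptions the two orders of integration should agree to within the accuracy of the quadrature, since both evaluate the same double integral over α⁺ and α⁻.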

2.2. Phase probabilities and maps

P_SAD is obtained by integrating P(F⁺, F⁻, α⁻|H⁺, H^−*) over α⁻. The conditional probability distribution of α⁻ can be obtained by fixing F⁺ and F⁻ in the joint distribution P(F⁺, F⁻, α⁻|H⁺, H^−*) and renormalizing to obtain P(α⁻|F⁺, F⁻, H⁺, H^−*). In other words, the probability distribution for this phase is proportional to the integrand in (2). The roles of F⁺ and F⁻ can be reversed to obtain the probability distribution for α⁺.
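As an illustration of the statement that the phase probability is proportional to the integrand in (2), the following Python sketch samples the integrand on a uniform grid of α⁻, normalizes it and forms the figure of merit and centroid phase. The function names, grid size and reflection values are assumptions made for this example; constant prefactors are dropped because they cancel on normalization.

```python
import cmath
import math


def ei0(x, n=256):
    """exp(-x) * I0(x) for x >= 0, as the mean of exp(x*(cos t - 1)) over n angles."""
    step = 2.0 * math.pi / n
    return sum(math.exp(x * (math.cos(j * step) - 1.0)) for j in range(n)) / n


def phase_distribution(f_plus, f_minus, h_plus, h_minus_conj,
                       sigma_delta2, d_phi, alpha_phi, d=1.0, eps=1.0, n_alpha=360):
    """Sample the integrand of (2) over the phase of F-* and normalize it to sum to one."""
    var_sim = eps * sigma_delta2
    var_ano = eps * (1.0 - d_phi ** 2) * sigma_delta2
    shift = d_phi * cmath.exp(1j * alpha_phi)
    angles = [2.0 * math.pi * j / n_alpha for j in range(n_alpha)]
    weights = []
    for alpha in angles:
        delta = cmath.rect(f_minus, alpha) - d * h_minus_conj
        f_c_plus = abs(d * h_plus + shift * delta)
        # Sim-like term and Gaussian-like part of the Rice term, as exponents.
        log_w = -abs(delta) ** 2 / var_sim - (f_plus - f_c_plus) ** 2 / var_ano
        weights.append(math.exp(log_w) * ei0(2.0 * f_plus * f_c_plus / var_ano))
    total = sum(weights)
    return angles, [w / total for w in weights]


def centroid(angles, probs):
    """Figure of merit and centroid phase from sum_j p_j * exp(i*alpha_j)."""
    z = sum(p * cmath.exp(1j * a) for a, p in zip(angles, probs))
    return abs(z), cmath.phase(z)


# Hypothetical reflection (arbitrary units).
angles, probs = phase_distribution(110.0, 100.0, cmath.rect(30.0, 0.4),
                                   cmath.rect(28.0, 0.5), 1.0e4, 0.98, 0.05)
fom, best_phase = centroid(angles, probs)
print(fom, best_phase)
```

With no heavy-atom model (H⁺ = H^−* = 0) the integrand in this sketch no longer depends on α⁻, so the distribution is flat and the figure of merit falls to zero.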

For building an atomic model into electron density one is generally most interested in the map representing the normal (real) scattering component, although the map representing the imaginary component is often useful as well. When the relative contribution of the imaginary component of the anomalous scatterers is small, a map computed using either the centroid (figure-of-merit-weighted) estimate of F⁺ or the centroid estimate of F^−* (making the usual assumption in the map calculation that Friedel's law applies) will differ little from the map representing the real component of the electron density. However, in the presence of very strong anomalous scatterers the phases of F⁺ and F^−* will differ significantly. Therefore, for generality it is better either to compute a complex electron-density map by providing separate coefficients for F⁺ and F⁻ or to compute separate real and imaginary electron-density maps with coefficients obtained from figure-of-merit-weighted (F⁺ + F^−*)/2 and exp(−πi/2)(F⁺ − F^−*)/2, respectively.
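The separation into real and imaginary map coefficients can be checked on a toy example. The Python sketch below assumes a hypothetical one-dimensional structure with made-up scattering factors, and it uses error-free phased structure factors, so the figure-of-merit weighting mentioned above is omitted.

```python
import cmath
import math


def structure_factors(atoms, h):
    """Return (F+, F-) for atoms given as (f, f_dd, x): real part, f'' and fractional x."""
    f_plus = sum((f + 1j * f_dd) * cmath.exp(2j * math.pi * h * x) for f, f_dd, x in atoms)
    f_minus = sum((f + 1j * f_dd) * cmath.exp(-2j * math.pi * h * x) for f, f_dd, x in atoms)
    return f_plus, f_minus


def map_coefficients(f_plus, f_minus_conj):
    """Coefficients for the real and imaginary components of the electron density."""
    real_part = (f_plus + f_minus_conj) / 2.0
    imag_part = cmath.exp(-1j * math.pi / 2.0) * (f_plus - f_minus_conj) / 2.0
    return real_part, imag_part


# Hypothetical atoms: (f, f'', x); the values are illustrative only.
atoms = [(6.0, 0.0, 0.11), (7.0, 0.0, 0.37), (16.0, 0.56, 0.62), (34.0, 3.8, 0.85)]
h = 3
f_plus, f_minus = structure_factors(atoms, h)
real_coeff, imag_coeff = map_coefficients(f_plus, f_minus.conjugate())

# Structure factors computed directly from the real and imaginary scattering alone.
direct_real = sum(f * cmath.exp(2j * math.pi * h * x) for f, _, x in atoms)
direct_imag = sum(f_dd * cmath.exp(2j * math.pi * h * x) for _, f_dd, x in atoms)
print(abs(real_coeff - direct_real), abs(imag_coeff - direct_imag))
```

In this error-free case the two coefficients reproduce the structure factors of the normal and anomalous (f'') scattering components exactly, which is why they are suitable for separate real and imaginary maps.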

2.3. Implementation and test cases

The P_SAD function described above, with slight modifications for numerical stability and the inclusion of the effect of experimental errors (Appendix B), was implemented in the program PHASER. Analytic derivatives were used to calculate the gradients. Optimal anomalous scatterer and error parameters were found by minimizing the minus log-likelihood.

Results of the implementation in PHASER were compared with results from the programs MLPHARE (version 4.0; Otwinowski, 1991; Collaborative Computational Project, Number 4, 1994), SOLVE (version 2.02; Terwilliger & Berendzen, 1997) and SHARP (version 2.0; de La Fortelle & Bricogne, 1997). Tests were performed with the two publicly available data sets used by Pannu & Read (2004): the 90° and the 360° pass data sets of a Z-form DNA hexamer duplex phased on ten intrinsic P atoms (Dauter & Adamiak, 2001). The results (Table 1) for MLPHARE and SOLVE were comparable to those reported by Pannu & Read (2004), but the results for SHARP were significantly better, as instead of using the default refinement protocol, the refinement protocol was customized to the test case. Statistics for the implementation of P_SAD in PHASER were not significantly different from those reported for the P_SAD function implemented in Pannu & Read (2004), confirming that when the parameters have been optimized (1) and (2) give very similar final phase distributions.

Table 1
Statistics for SAD refinement and phasing of a Z-form DNA hexamer duplex

	MLPHARE†	SOLVE‡	SHARP§	PHASER¶
360° pass
Map correlation††	0.607	0.588	0.722	0.723
Reported figure of merit††	0.587	0.492	0.575	0.650
Mean cos(phase error)††	0.500	0.553	0.634	0.643
Mean phase error††	53.53	50.52	42.90	41.64
90° pass
Map correlation††	0.500	0.487	0.643	0.649
Reported figure of merit††	0.405	0.352	0.443	0.561
Mean cos(phase error)††	0.416	0.484	0.548	0.568
Mean phase error††	59.67	55.23	49.49	47.55

†Coordinates and isotropic B factors were refined. Occupancies were not refined.
‡Coordinates, isotropic B factors and occupancies were refined. The minimum allowed B factor was zero.
§Coordinates, isotropic B factors and the global and local imperfection parameters on anomalous differences were refined.
¶Coordinates, isotropic B factors, occupancies and variance parameters were refined.
††Statistics calculated with SFTOOLS (B. Hazes, unpublished work; Collaborative Computational Project, Number 4, 1994). Map correlation compared the figure-of-merit-weighted map from experimental phasing with the figure-of-merit-weighted SIGMAA (Read, 1986) map calculated with phases from the final model.

3. Discussion

The P_SAD expression described in (2) is simpler than that in (1). It has several algorithmic advantages: the parameterization is compact, refinement of heavy-atom parameters does not involve the inversion of covariance matrices, and analytic derivatives can be determined easily. It is thus likely to be much more robust when applied to a wide range of SAD data sets.

In general, a maximum-likelihood approach in crystallography is of greatest benefit when the data and the model are poor. This is clearly seen in the test cases, where including the correlations has a significant influence on the determination of the figure of merit in the poorer (90°) data set, but little effect in the better (360°) data set. The figure of merit reported by PHASER for the poorer (90°) data set is closer to the mean cosine of the phase error than that produced by the other three programs. This suggests that the P_SAD function gives better phase probability distribution estimates for use in density modification (required to break the phase ambiguity present in SAD phasing) when the phasing is marginal.
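The comparison between the reported figure of merit and the mean cosine of the phase error can be reproduced directly from Table 1. The short Python sketch below uses only the values transcribed from the table; the dictionary layout and function name are choices made for this illustration.

```python
# Values transcribed from Table 1: (reported figure of merit, mean cos(phase error)).
TABLE_1 = {
    "360 degree pass": {
        "MLPHARE": (0.587, 0.500), "SOLVE": (0.492, 0.553),
        "SHARP": (0.575, 0.634), "PHASER": (0.650, 0.643),
    },
    "90 degree pass": {
        "MLPHARE": (0.405, 0.416), "SOLVE": (0.352, 0.484),
        "SHARP": (0.443, 0.548), "PHASER": (0.561, 0.568),
    },
}


def calibration_gaps(rows):
    """Absolute difference between figure of merit and mean cos(phase error)."""
    return {program: abs(fom - mean_cos) for program, (fom, mean_cos) in rows.items()}


closest = {}
for data_set, rows in TABLE_1.items():
    gaps = calibration_gaps(rows)
    closest[data_set] = min(gaps, key=gaps.get)
    print(data_set, {program: round(gap, 3) for program, gap in gaps.items()})
```

A figure of merit is an estimate of the expected cosine of the phase error, so a small gap indicates well-calibrated phase probabilities.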

The P_SAD function can also be used for the refinement of models containing anomalous scatterers (Garib Murshudov, personal communication). In model refinement, fast calculation of the target function is of key importance as other aspects of the algorithm are already time-consuming given the large number of atomic parameters (e.g. the structure-factor calculation). The reduced parameterization for P_SAD should also be helpful for this application.

The new formulation of P_SAD also allows a more intuitive understanding of the SAD likelihood function. As shown in the appendices, P_SAD can be expressed as the integral of the product of two functions,

$[\eqalignno {P_{\rm SAD} & = P(F_O^+, F_O^- |{\bf H}^+, {\bf H}^{-*})& (3) \cr & = \textstyle\int\limits_0^{2\pi} P(F_O^-, \alpha^- |{\bf H}^+, {\bf H}^{-*})P(F_O^+ |F_O^-, \alpha^-, {\bf H}^+, {\bf H}^{-*}) \,\, {\rm d}\alpha^-, }]$

where

$[\eqalign {P(F_O^-, \alpha^- |{\bf H}^+, {\bf H}^{-*}) & = P(F_O^-, \alpha^- |{\bf H}^{-*})\cr & = {{F_O^-} \over {\pi \Sigma^- }} \exp \left [{{ -|F_O^- \exp(i\alpha^-) - D{\bf H}^{-*}|^2 } \over {\Sigma^- }} \right]},]$

$[\eqalign{P(F_O^ + |F_O^-, \alpha^-, &{\bf H}^+, {\bf H}^{-*}) = \cr &{{2F_O^+} \over {\Sigma^+ }} \exp \left[- {{(F_O^+ - F_C^+)^2 } \over {\Sigma^+ }}\right]eI_0 \left({{2F_O^+ F_C^+} \over {\Sigma^+}} \right)},]$

$[F_C^+ = |D{\bf H}^+ + D_\Phi \exp(i\alpha_\Phi)[F_O^- \exp(i\alpha^-) - D{\bf H}^{-*}]|.]$

In this version of the expression for P_SAD, the variances Σ⁺ and Σ⁻ have been inflated (as discussed in Appendix B) to account for the effect of experimental error. The first distribution in the product expresses what is known about one observation, F_O^-, when only the corresponding calculated structure factor H⁻ is given; accordingly, its variance Σ⁻ accounts for what is left unexplained by H⁻. (Once H⁻ is known, no further information about F_O^- is added by the knowledge of H⁺, so this part of the distribution does not depend on H⁺.) The second distribution expresses what is known about the second observation, F_O⁺, when F_O^- (phased by some value of the variable of integration, α⁻) and both calculated structure factors are given; accordingly, its variance Σ⁺ accounts for what is left unexplained by the value of F⁺ predicted from the other three structure factors. To a good approximation, the first distribution provides a 'Sim factor' to account for the information given by the partial structure (primarily normal scattering), while the second distribution takes account of the anomalous difference. While the mathematical details differ considerably, the SAD phasing function presented recently by Giacovazzo et al. (2003) also combines a term arising from anomalous differences with a Sim-like term. Note that when expressed using the exponential Bessel function (eI₀), the second distribution in (3) has the same exponential term as a Gaussian distribution. The exponential Bessel function will tend to be flatter than the Gaussian component and so the Gaussian component will dominate the shape of the distribution. This resemblance to a Gaussian distribution explains why the Gaussian approximation, comparing the calculated and observed anomalous differences, is reasonably successful.

The influence of the two components of P_SAD is shown in Figs. 1 and 2. Fig. 1 illustrates the situation characteristic of SAD phasing, in which the model consists of only the strong anomalous scatterers. In this case, the model of the normal scattering component is very incomplete, so the first (Sim) distribution is very broad and serves primarily to break the phase ambiguity of the second (anomalous difference) distribution. By contrast, Fig. 2 illustrates the situation that would occur in full model refinement against SAD data, where the model of the normal scattering component is nearly complete so the Sim distribution will tend to dominate, while the anomalous difference distribution will provide a weak bimodal indication of the correct phase.

Figure 1
Schematic illustration of P_SAD for the case of SAD phasing. The three contour plots (a)–(c) are shown as a function of the assumed complex value of F^−*; in each contour plot, the cross indicates the origin and the black circle indicates the measured value of F_O^- for which the function values shown in (d) are taken. (a) The first (Sim) component of P_SAD, P(F^−*|H^−*), is shown in blue contours centred on H^−* (blue arrow). (b) The second component of P_SAD, P(F_O⁺|F^−*, H⁺, H^−*), is shown in red contours centred on the expected vector difference between F⁺ and F^−* (tail of red arrow). (c) The product of the two components of P_SAD is shown in magenta contours. P_SAD is given by the integral of this surface under the black circle. (d) The components of P_SAD are shown as a function of the assumed value of α⁻, with P(F_O^-, α⁻ |H⁺, H^−*) shown in blue, P( F_O⁺| F_O^-, α⁻, H⁺, H^−*) shown in red and their product in black. The three distributions have been normalized to place them on a common scale.

Figure 2
Schematic illustration of P_SAD for the case of model refinement against the SAD function. The three contour plots (a)–(c) are shown as a function of the assumed complex value of F^−*; in each contour plot the cross indicates the origin and the black circle indicates the measured value of F_O^- for which the function values shown in (d) are taken. (a) The first (Sim) component of P_SAD, P(F^−*|H^−*), is shown in blue contours centred on H^−* (blue arrow). Compared with the SAD phasing case, the full scattering model is more complete, which increases the magnitude of H^−* and decreases the variance in this distribution. (b) The second component of P_SAD, P(F_O⁺|F^−*, H⁺, H^−*), is shown in red contours centred on the expected vector difference between F⁺ and F^−* (tail of red arrow). (c) The product of the two components of P_SAD is shown in magenta contours. P_SAD is given by the integral of this surface under the black circle. (d) The components of P_SAD are shown as a function of the assumed value of α⁻, with P( F_O^-, α⁻ |H⁺, H^−*) shown in blue, P( F_O⁺| F_O^-, α⁻, H⁺, H^−*) shown in red and their product in black. The three distributions have been normalized to place them on a common scale.

Equation (3) bears a close resemblance to the phased MLHL target (Pannu et al., 1998) for model refinement, so one would expect refinement of a full model against the MLHL target (if appropriately implemented) to yield similar results to refinement against the SAD target. In the MLHL target, an integration over possible phases in the Sim probability distribution is weighted by prior knowledge of the phase probability distribution. If no significant improvement were made in the anomalous scatterer model, the second (anomalous difference) component of (3) would not change during the course of refinement, so it could be used as a constant source of prior phase information in the MLHL target. Note, however, that it would not be appropriate to provide prior phase information to MLHL in the form of the full phase probability distribution obtained by normalizing the integrand of P_SAD, because the normal scattering from the anomalous scatterers would then appear twice, in both the Sim component of P_SAD and the Sim component of MLHL. When the imaginary ($[f'']$) contribution to the structure factor is weak compared with the real (f + $[f']$) contribution, the amplitude of the real scattering component can be approximated reasonably well by the mean of F_O⁺ and F_O^-. Typically, such a mean amplitude would be used in the MLHL target. However, in the presence of very strong anomalous scatterers this approximation breaks down. By analogy with (3), the Sim component of the MLHL target should then compare the observed value of one of the Friedel mates with its corresponding calculated value (including the imaginary contribution). Compared with such an implementation of the MLHL target, any improvement from using the SAD function for model refinement would only arise through improvements in the anomalous scatterer model during the course of refinement. The model of strong anomalous scatterers is unlikely to change substantially during subsequent full model refinement, so the main potential for improvement with the SAD function will come from accounting for partially occupied sites and the weak anomalous scattering from the rest of the structure, such as C, N and O atoms.

APPENDIX A

Derivation of SAD likelihood function

A1. General SAD likelihood function

For our maximum-likelihood P_SAD function we obtain first the probability of the true F⁺ and F⁻ (unphased) given the heavy-atom structure factors H⁺ and H^−* (phased). (A correction for the effect of measurement error will be introduced later; see Appendix B). We derive this expression from the probability of the true phased structure factors F⁺ and F^−* given the calculated heavy-atom structure factors H⁺ and H^−* and then integrate out the phases. Complex conjugates are used for F⁻ and H⁻ because these are much more highly correlated with their Friedel mates, F⁺ and H⁺,

$[\eqalignno {P(F^+,F^-|{\bf H}^+, &{\bf H}^{-*}) = \cr & \textstyle \int\limits_0^{2\pi} \textstyle \int\limits_0^{2\pi} P(F^+, \alpha^+, F^-, \alpha^- |{\bf H}^+, {\bf H}^{-*})\,\,{\rm d}\alpha^+ \,\, {\rm d}\alpha^-. & (4)}]$

The conditional probability within the integral can be expressed as a product of two conditional probabilities, only one of which is dependent on α⁺,

$[\eqalignno {P(F^+, \alpha^+, &F^-, \alpha^- |{\bf H}^+, {\bf H}^{-*}) &(5)\cr &= P(F^-, \alpha^- |{\bf H}^+, {\bf H}^{-*}) P(F^+, \alpha^+ |F^-, \alpha^-, {\bf H}^+, {\bf H}^{-*}). }]$

Substituting (5) into (4) we obtain

$[\eqalignno {P(F^+, F^-& |{\bf H}^ +, {\bf H}^{-*}) = \textstyle \int\limits_0^{2\pi} P(F^-, \alpha^- |{\bf H}^+, {\bf H}^{-*}) \cr & \times \left [\textstyle\int\limits_0^{2\pi} P(F^+,\alpha^+ |F^-, \alpha^-, {\bf H}^ +, {\bf H}^{-*})\,\,{\rm d}\alpha^+\right] \,\,{\rm d}\alpha^-. &(6)}]$

The integral within the square brackets can be performed analytically to obtain a Rice distribution (§A4). The integration over α⁻ must be performed numerically,

$[\eqalignno {P(F^+, &F^- |{\bf H}^+, {\bf H}^{-*})& (7)\cr & = \textstyle \int\limits_0^{2\pi} P(F^-,\alpha^- |{\bf H}^+, {\bf H}^{-*})P(F^+ |F^-, \alpha^-, {\bf H}^+,{\bf H}^{-*})\,\,{\rm d}\alpha^-.}]$

A2. Multivariate complex normal distribution of {F⁺, F^−*, H⁺, H^−*}

In order to obtain the probability functions in (7), we start from a multivariate complex normal distribution of structure factors {F⁺, F^−*, H⁺, H^−*}. There is no prior information before fixing the heavy-atom model and so the expected values are zero.

$[\eqalignno{P({\bf F}^+, {\bf F}^{-*}, &{\bf H}^+, {\bf H}^{-*}) \cr = &{1 \over {| \pi \Sigma _{FFHH}|}}\exp \left [- \left(\matrix{ {\bf F}^ + \cr {\bf F}^{-*} \cr {\bf H}^ + \cr {\bf H}^{-*}} \right)^H \Sigma _{FFHH}^{ - 1} \left(\matrix{ {\bf F}^+ \cr {\bf F}^{-*} \cr {\bf H}^+ \cr {\bf H}^{-*} } \right) \right], & (8)}]$

where

$[\Sigma_{FFHH} = \left (\matrix{ \Sigma _{11} & \Sigma _{12} \cr \Sigma _{21} & \Sigma _{22}}\right)]$

and

$[\Sigma_{11} = \left(\matrix{\langle {\bf F}^+ {\bf F}^{+*}\rangle & \langle {\bf F}^+ {\bf F}^- \rangle \cr \langle {\bf F}^+ {\bf F}^- \rangle^* & \langle {\bf F}^- {\bf F}^{-*}\rangle} \right),]$

$[\Sigma_{12} = \left(\matrix{\langle {\bf F}^+ {\bf H}^{+*}\rangle & \langle {\bf F}^+ {\bf H}^- \rangle \cr \langle {\bf F}^- {\bf H}^+ \rangle^* & \langle {\bf F}^{-*} {\bf H}^-\rangle} \right),]$

$[\Sigma_{21} = \Sigma_{12}^H,]$

$[\Sigma_{22} = \left(\matrix{\langle {\bf H}^+ {\bf H}^{+*}\rangle & \langle {\bf H}^+ {\bf H}^-\rangle \cr \langle {\bf H}^+ {\bf H}^- \rangle^* & \langle {\bf H}^- {\bf H}^{-*} \rangle} \right).]$

The covariance matrix Σ_FFHH is shown in terms of submatrices (Σ₁₁, Σ₁₂, Σ₂₁ and Σ₂₂) that will be manipulated when the conditional variables are fixed. A superscript H is used here and elsewhere to denote the Hermitian transpose of a matrix.

In defining F⁺, F^−*, H⁺ and H^−*, we use f and g to represent atomic scattering factors and x and y to represent coordinates for the corresponding crystal and model. In general, the scattering factors are complex to allow for the effects of anomalous scattering so that, for instance, $[{\bf f}_k = f_k + if_k'']$. For simplicity, the model can be considered to contain all the atoms present in the crystal (N), but with zero scattering factor for atoms that are not present in the model. The sums can then be divided into contributions from unmodelled (NU atoms) and modelled atoms.

$[\eqalign {{\bf F}^+ & = \textstyle\sum\limits_{k = 1}^N {\bf f}_k \exp (2\pi i{\bf h} \cdot {\bf x}_k) \cr & = \textstyle \sum\limits_{k = 1}^{NU} {\bf f}_k \exp(2\pi i{\bf h} \cdot {\bf x}_k) + \sum\limits_{k = NU + 1}^N {\bf f}_k \exp (2\pi i{\bf h} \cdot {\bf x}_k), \cr {\bf F}^{-*} & = \textstyle \sum\limits_{k = 1}^N {\bf f}_k^* \exp (2\pi i{\bf h} \cdot {\bf x}_k) \cr & = \textstyle \sum\limits_{k = 1}^{NU} {\bf f}_k^* \exp (2\pi i{\bf h} \cdot {\bf x}_k) + \sum\limits_{k = NU + 1}^N {\bf f}_k^* \exp (2\pi i{\bf h} \cdot {\bf x}_k),}]$

$[\eqalignno {{\bf H}^+ & = \textstyle \sum\limits_{k = 1}^N {\bf g}_k \exp (2\pi i{\bf h} \cdot {\bf y}_k) \cr & = \textstyle \sum\limits_{k = NU + 1}^N {\bf g}_k \exp (2\pi i{\bf h} \cdot {\bf y}_k), \cr {\bf H}^{-*} & = \textstyle \sum\limits_{k = 1}^N {\bf g}_k^* \exp (2\pi i{\bf h} \cdot {\bf y}_k) \cr & = \textstyle \sum\limits_{k = NU + 1}^N {\bf g}_k^* \exp (2\pi i{\bf h} \cdot {\bf y}_k). & (9)}]$

Following the reasoning outlined in Pannu et al. (2003), the submatrix Σ₂₂ can be filled in as follows:

$[\Sigma_{22} = \left(\matrix{ \varepsilon \Sigma _H & \varepsilon \sigma _{{\bf H}^+ {\bf H}^-} \cr \varepsilon \sigma _{{\bf H}^+ {\bf H}^{-}}^* & \varepsilon \Sigma _H} \right), \eqno (10)]$

where

$[\eqalign {\Sigma_H & = \textstyle \sum\limits_{k = NU + 1}^N | {\bf g}_k |^2, \cr \sigma_{{\bf H}^+ {\bf H}^-} & = \textstyle \sum\limits_{k = NU + 1}^N {\bf g}_k^2 = \sum\limits_{k = NU + 1}^N g_k^2 - g_k''^{2} + 2ig_k g''_k.}]$

The factor ε accounts for the statistical effect of symmetry. The submatrix Σ₁₁ is completed similarly,

$[\Sigma_{11} = \left(\matrix{ \varepsilon \Sigma _N & \varepsilon \sigma _{{\bf F}^+ {\bf F}^-} \cr \varepsilon \sigma _{{\bf F}^+ {\bf F}^ -} ^* & \varepsilon \Sigma _N } \right), \eqno (11)]$

where

$[\eqalign {\Sigma _N & = \textstyle \sum\limits_{k = 1}^N | {\bf f}_k|^2, \cr \sigma_{{\bf F}^ + {\bf F}^-} & = \textstyle \sum\limits_{k = 1}^N {\bf f}_k^2 = \textstyle \sum\limits_{k = 1}^N f_k^2 - f''^{2}_k + 2if_k f''_k }.]$

The submatrix Σ₁₂ includes the effects of coordinate error and of differences between the true and modelled atomic scattering factors. In a fashion similar to that described in Read (2003), in the context of multiple isomorphous replacement, the elements of Σ₁₂ can be described in terms of the elements of Σ₂₂. Consider one element of the matrix Σ₁₂,

$[\langle {\bf F}^+ {\bf H}^ - \rangle = \varepsilon \textstyle \sum\limits_{k = NU + 1}^N \langle {\bf f}_k {\bf g}_k \exp[2\pi i({\bf x}_k - {\bf y}_k)]\rangle = \varepsilon D\sigma _{{\bf H}^+ {\bf H}^- }. \eqno (12)]$

Here, it is assumed that differences in position are uncorrelated with differences in scattering factor. The factor D accounts for the overall effect of the phase-shift term arising from coordinate errors and absorbs any overall difference in scale between f and g. The same considerations apply to other elements of Σ₁₂, so that

$[\Sigma _{12} = D\Sigma _{22}.]$

As discussed in Read (2003), after the maximum-likelihood refinement of occupancies and B factors, the model atomic scattering factors g_k should be approximately equal to f_k〈exp[2πi(x_k − y_k)]〉, so that the phase shift and scale components of D will cancel and D will be equal to one.

A3. Conditional distribution P(F⁺, F^−*|H⁺, H^−*)

The conditional distribution P(F⁺, F^−*|H⁺, H^−*) has a mean and covariance matrix given by standard manipulation (Johnson & Wichern, 1998) of the above covariance elements,

$[\eqalignno {P({\bf F}^+, {\bf F}^{-*}|{\bf H}^+, {\bf H}^{-*}) &= {1 \over {| \pi \Sigma _{FF}|}}\exp \biggr\{ - \left [\left(\matrix{{\bf F}^+ \cr {\bf F}^{-*}}\right) - {\boldmu}_{FF} \right]^H \cr &\ \quad {\times}\ \Sigma _{FF}^{-1} \left[\left(\matrix{{\bf F}^+ \cr {\bf F}^{-*}} \right) - {\boldmu}_{FF}\right] \biggr\} & (13)}]$

where

$[{\boldmu}_{FF} = \Sigma _{12} \Sigma _{22}^{ - 1} \left(\matrix{ {\bf H}^+ \cr {\bf H}^{-*}} \right) = D\left(\matrix{ {\bf H}^+ \cr {\bf H}^{-*}} \right)]$

and

$[\eqalign {\Sigma _{FF} & = \Sigma _{11} - \Sigma _{12} \Sigma _{22}^{ - 1} \Sigma _{21} \cr & = \Sigma _{11} - D^2 \Sigma _{22} \cr & = \left(\matrix{ \varepsilon \sigma _\Delta ^2 & \varepsilon \sigma _\Phi \cr \varepsilon \sigma _\Phi ^* & \varepsilon \sigma _\Delta ^2} \right), \cr \sigma _\Delta ^2 & = \Sigma _N - D^2 \Sigma _H, \cr \sigma _\Phi & = \sigma _{{\bf F}^ + {\bf F}^- } - D^2 \sigma _{{\bf H}^ + {\bf H}^-}.}]$

The phase component of σ_Φ arises both from errors in the model of anomalous scatterers and from the (perhaps weak) anomalous scattering from atoms not included in the model. It represents the systematic phase shift between the parts of F⁺ and F^−* that are not explained by the model. If the model includes most of the significant anomalous scatterers, the phase shift will be very small and could probably be ignored.
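The reduction of the conditional covariance to Σ₁₁ − D²Σ₂₂ can be verified numerically. The Python sketch below builds the 2 × 2 submatrices from (10) and (11) with Σ₁₂ = DΣ₂₂ and compares the general expression with the closed form; all numerical values and helper names are hypothetical and were chosen only so that the matrices are positive definite.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def inverse(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def hermitian(m):
    """Hermitian (conjugate) transpose of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]


def conditional_covariance(s11, s12, s22):
    """Sigma_FF = Sigma_11 - Sigma_12 Sigma_22^-1 Sigma_21, with Sigma_21 = Sigma_12^H."""
    correction = matmul(matmul(s12, inverse(s22)), hermitian(s12))
    return [[s11[i][j] - correction[i][j] for j in range(2)] for i in range(2)]


# Hypothetical covariance elements (arbitrary units).
eps, d = 1.0, 0.9
sigma_n, sigma_ff = 1000.0, 950.0 + 40.0j
sigma_h, sigma_hh = 200.0, 180.0 + 30.0j

s11 = [[eps * sigma_n, eps * sigma_ff], [eps * sigma_ff.conjugate(), eps * sigma_n]]
s22 = [[eps * sigma_h, eps * sigma_hh], [eps * sigma_hh.conjugate(), eps * sigma_h]]
s12 = [[d * value for value in row] for row in s22]

sigma_cond = conditional_covariance(s11, s12, s22)
sigma_delta2 = sigma_n - d * d * sigma_h      # closed form for the diagonal
sigma_phi = sigma_ff - d * d * sigma_hh       # closed form for the off-diagonal
print(sigma_cond[0][0], eps * sigma_delta2)
print(sigma_cond[0][1], eps * sigma_phi)
```

Because D is real and Σ₂₂ is Hermitian, Σ₂₁ = DΣ₂₂ as well, which is why no matrix inversion survives in the final expression.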

A4. Conditional distributions P(F⁻, α⁻|H⁺, H^−*) and P(F⁺, α⁺|F⁻, α⁻, H⁺, H^−*)

Again, with standard manipulations (including a change of variable from complex to polar coordinates) we can form the two conditional distributions in (7). For convenience in notation, we define F^−* = F⁻exp(iα⁻), i.e. α⁻ is the phase of the complex conjugate of F⁻,

$[\eqalignno {P(F^-, \alpha^- |{\bf H}^+, {\bf H}^{-*}) & = P(F^-, \alpha^- |{\bf H}^{-*}) &(14)\cr & = {{F^-} \over {\pi \varepsilon \sigma _\Delta ^2 }}\exp \left[{{-|F^- \exp(i\alpha^-) - D{\bf H}^{-*}|^2 } \over {\varepsilon \sigma _\Delta ^2 }} \right], }]$

$[\eqalignno {P(F^+, \alpha^+ &|F^-, \alpha^-, {\bf H}^+, {\bf H}^{-*}) \cr &= {{F^+} \over {\pi \varepsilon \left(\sigma _\Delta ^2 - {{|\sigma _\Phi|^2} \over {\sigma _\Delta ^2}} \right)}} \exp \left [{{ - |F^ + \exp(i\alpha ^+) - {\bf F}_C^+|^2 } \over {\varepsilon \left(\sigma _\Delta^2 - {{| \sigma _\Phi |^2 } \over {\sigma _\Delta ^2}} \right)}} \right], & (15)}]$

where

$[{\bf F}_C^ + = D{\bf H}^+ + {{\sigma _\Phi } \over {\sigma _\Delta ^2 }}[F^- \exp(i\alpha^-) - D{\bf H}^{-*}].]$

The phase α⁺ can be integrated out analytically to obtain the Rice distribution, which appears frequently in the crystallographic literature (e.g. Sim, 1959),

$[\eqalignno {P(F^+ |F^-, \alpha^-, &{\bf H}^+, {\bf H}^{-*}) = {{2F^+} \over {\varepsilon \left(\sigma _\Delta ^2 - {{|\sigma _\Phi|^2 } \over {\sigma _\Delta ^2 }} \right)}}&(16)\cr & \times \exp \left[- {{F^{+2} + F_C^{+2}} \over {\varepsilon \left(\sigma _\Delta ^2 - {{|\sigma _\Phi|^2 } \over {\sigma _\Delta ^2 }} \right)}} \right]I_0 \left [{{2F^ + F_C^+ } \over {\varepsilon \left(\sigma _\Delta ^2 - {{|\sigma _\Phi|^2 } \over {\sigma _\Delta ^2 }} \right)}} \right]. }]$
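The analytical integration over α⁺ can be confirmed numerically. The sketch below, which is illustrative only, integrates a distribution of the form (15) over the phase on a uniform grid and compares the result with the Rice distribution (16). Here `var` stands for the variance term ε(σ_Δ² − |σ_Φ|²/σ_Δ²); the function names, the series evaluation of I₀ and the grid size are assumptions made for this example.

```python
import cmath
import math


def bessel_i0(x):
    """Modified Bessel function I0 from its power series (moderate x only)."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def phased_pdf(f, alpha, fc_complex, var):
    """Form of equation (15): joint density of amplitude f and phase alpha."""
    diff = f * cmath.exp(1j * alpha) - fc_complex
    return f / (math.pi * var) * math.exp(-abs(diff) ** 2 / var)


def rice_pdf(f, fc, var):
    """Form of equation (16): density of the amplitude alone."""
    return (2.0 * f / var) * math.exp(-(f * f + fc * fc) / var) \
        * bessel_i0(2.0 * f * fc / var)


def integrate_over_phase(f, fc_complex, var, n=720):
    """Uniform-grid integration over alpha; the integrand is periodic."""
    step = 2.0 * math.pi / n
    return step * sum(phased_pdf(f, k * step, fc_complex, var)
                      for k in range(n))
```

For example, `integrate_over_phase(3.0, 2.0 * cmath.exp(0.7j), 4.0)` and `rice_pdf(3.0, 2.0, 4.0)` should agree to within rounding error, independent of the phase assigned to the calculated structure factor.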

A5. Conditional distribution P(F⁺, F⁻|H⁺, H^−*)

Using the probabilities (14) and (16) in (7) and making the substitution

$[\sigma _\Phi = \sigma _\Delta ^2 D_\Phi \exp(i\alpha _\Phi)\,\,{\rm where }\,\,0 \le D_\Phi \le 1,]$

we obtain (2) as presented above,

$[\eqalignno {P_{\rm SAD} & = {{2F^ + F^-} \over {\pi \varepsilon ^2 (1 - D_\Phi ^2)\sigma_\Delta^4 }}{\textstyle \int\limits_0^{2\pi} }\exp \bigg[- {{|F^- \exp(i\alpha^-) - D{\bf H}^{-*}|^2 } \over {\varepsilon \sigma _\Delta ^2 }} \cr &\ \quad -\ {{F^{+2} + F_C^{+2}} \over {\varepsilon (1 - D_\Phi^2)\sigma _\Delta ^2 }} \bigg] I_0 \left [{{2F^+ F_C^+} \over {\varepsilon (1 - D_\Phi^2)\sigma _\Delta ^2 }} \right]\,\,{\rm d}\alpha^-,}]$

where

$[F_C^+ = |D{\bf H}^+ + D_\Phi \exp(i\alpha_\Phi) [F^- \exp(i\alpha^-) - D{\bf H}^{-*}]|.]$
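The effect of the substitution can be written out term by term. The following lines are only a worked restatement of quantities already defined in (14), (15) and (16), showing how the variance, the weight on the phased difference and the normalization factor take the forms that appear in P_SAD:

$[\eqalign {\varepsilon \left(\sigma _\Delta ^2 - {{|\sigma _\Phi|^2} \over {\sigma _\Delta ^2}} \right) & = \varepsilon \left(\sigma _\Delta ^2 - {{\sigma _\Delta ^4 D_\Phi ^2} \over {\sigma _\Delta ^2}} \right) = \varepsilon (1 - D_\Phi ^2)\sigma _\Delta ^2, \cr {{\sigma _\Phi} \over {\sigma _\Delta ^2}} & = D_\Phi \exp(i\alpha _\Phi), \cr {{F^-} \over {\pi \varepsilon \sigma _\Delta ^2}} \times {{2F^+} \over {\varepsilon (1 - D_\Phi ^2)\sigma _\Delta ^2}} & = {{2F^+ F^-} \over {\pi \varepsilon ^2 (1 - D_\Phi ^2)\sigma _\Delta ^4}}.}]$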

APPENDIX B

Implementation of SAD likelihood function

For numerical stability it is convenient to express (2) in terms of the exponential Bessel function eI₀(x) = exp(−x)I₀(x) (Cody & Stoltz, 1989). During refinement of the heavy-atom parameters, the D values are absorbed by the occupancies and B factors of the heavy atoms and are therefore not included. The term $[(1 - D_\Phi^2)\sigma_\Delta^2]$ can be problematic during refinement because D_Φ and σ_Δ are on very different scales (D_Φ is very close to 1, while σ_Δ is large) and $[(1 - D_\Phi^2)]$ must remain positive (i.e. $[D_\Phi^2]$ must remain between 0 and 1). In order to avoid these problems, we introduce a parameter σ₊ to replace this term. This removes the problem of scale and simplifies the constraint to one in which σ₊ must remain positive.
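The benefit of the exponential Bessel function can be illustrated with a short sketch. It is not the algorithm of Cody & Stoltz (1989); instead it assumes the integral representation I₀(x) = (1/π)∫₀^π exp(x cos θ) dθ, so that eI₀(x) is the mean of exp[x(cos θ − 1)], an integrand that never exceeds 1 for x ≥ 0. The function name and grid size are hypothetical.

```python
import math


def exp_scaled_i0(x, n=4096):
    """eI0(x) = exp(-x) * I0(x) for x >= 0.

    Midpoint rule applied to (1/pi) * integral_0^pi exp(x (cos t - 1)) dt.
    The exponent is never positive, so no intermediate value can overflow.
    The fixed grid resolves the peak near t = 0 only while 1/sqrt(x) is
    well above pi/n, i.e. up to x of the order of 1e4 for n = 4096.
    """
    step = math.pi / n
    total = sum(math.exp(x * (math.cos((k + 0.5) * step) - 1.0))
                for k in range(n))
    return total / n
```

By contrast, a direct evaluation of exp(−x)I₀(x) fails once I₀(x) exceeds the floating-point range (in double precision, for x somewhat above 700), even though the product itself is of order (2πx)^(−1/2).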

Up to this point, we have derived the function in terms of the true values of F⁺ and F⁻. If we use the experimental observations of their values, F_O⁺ and F_O⁻, we need to consider the experimental errors, which will be described by variance parameters $[\sigma_{F_O^+}^2]$ and $[\sigma_{F_O^-}^2]$. In the case of MIR phasing, the effect of measurement error in the observed amplitude can be approximated by inflating the corresponding variance element of the covariance matrix (Pannu et al., 2003), as suggested by others (Green, 1979; de La Fortelle & Bricogne, 1997; Murshudov et al., 1997). The increment to the variance ends up in the variance of the Rice distribution for each observed amplitude. However, if this approach is taken for the SAD function, the variances for the component distributions of P_SAD become unnecessarily complicated. Rather than inflating the diagonal elements of the covariance matrix, we have chosen instead to inflate the variances of the conditional distributions for each observation that are the components of P_SAD. The variance term for P(F_O⁻, α⁻|H⁺, H^−*) only needs to account for errors in the measurement of F_O⁻, but the variance term for P(F_O⁺, α⁺|F_O⁻, α⁻, H⁺, H^−*) needs to account for errors in both measurements, as the expected value of F_O⁺ is computed using the measured value of F_O⁻, weighted by D_Φ. The magnitude of D_Φ will typically be very close to one, so the weighting factor on the variance of F_O⁻ can be ignored; the very small systematic decrease in the contribution from the experimental error in F_O⁻ owing to D_Φ can be absorbed by σ₊. Numerical simulations show that this approximation to the effect of measurement error gives almost identical results to those obtained by inflating the diagonal elements of the covariance matrix.

The target function for anomalous scatterer refinement in PHASER is thus given by

$[\eqalignno {- \ln (P_{\rm SAD}) = &- \ln \biggr\{ {{2F_O^ + F_O^ - } \over {\pi \Sigma ^ + \Sigma ^ - }} {\textstyle \int\limits_0^{2\pi}} \exp \biggr[- {{| F_O^ - \exp(i\alpha^ -) - {\bf H}^{-*}|^2 } \over {\Sigma ^ - }} \cr &\ \quad -\ {{(F_O^ + - F_C^ +)^2 } \over {\Sigma ^ + }} \biggr] eI_0 \left({{2F_O^ + F_C^ + } \over {\Sigma ^ + }} \right)\,\,{\rm d}\alpha^ - \biggr \}, & (17)}]$

where

$[\eqalign {\Sigma^- & = \varepsilon \sigma_\Delta^2 + \sigma_{F_O^-}^2, \cr \Sigma^+ & = \varepsilon \sigma_+ + \sigma_{F_O^+}^2 + \sigma_{F_O^-}^2, \cr F_C^+ & = |{\bf H}^+ + D_\Phi \exp(i\alpha_\Phi)[F_O^- \exp(i\alpha^-) - {\bf H}^{-*}]|.}]$
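As an illustration of how (17) might be evaluated for a single reflection, the following sketch integrates over α⁻ on a uniform grid. It is not the PHASER implementation: the function and argument names, the grid sizes and the integral-based evaluation of eI₀ are all assumptions made for this example.

```python
import cmath
import math


def exp_scaled_i0(x, n=512):
    """eI0(x) = exp(-x) I0(x) by midpoint integration (moderate x only)."""
    step = math.pi / n
    return sum(math.exp(x * (math.cos((k + 0.5) * step) - 1.0))
               for k in range(n)) / n


def minus_log_psad(fo_plus, fo_minus, sig_fo_plus, sig_fo_minus,
                   h_plus, h_minus_conj, eps, sigma_delta_sq, sigma_plus,
                   d_phi, alpha_phi, n_phase=360):
    """-ln(P_SAD) of equation (17) for one reflection.

    h_plus and h_minus_conj are the complex values H+ and H-* (D already
    absorbed into the heavy-atom parameters, as described in the text).
    """
    var_minus = eps * sigma_delta_sq + sig_fo_minus ** 2
    var_plus = eps * sigma_plus + sig_fo_plus ** 2 + sig_fo_minus ** 2
    weight = d_phi * cmath.exp(1j * alpha_phi)
    step = 2.0 * math.pi / n_phase
    integral = 0.0
    for k in range(n_phase):
        # Difference between the phased observation of F- and H-*.
        delta = fo_minus * cmath.exp(1j * k * step) - h_minus_conj
        fc_plus = abs(h_plus + weight * delta)
        exponent = (-abs(delta) ** 2 / var_minus
                    - (fo_plus - fc_plus) ** 2 / var_plus)
        integral += math.exp(exponent) * exp_scaled_i0(
            2.0 * fo_plus * fc_plus / var_plus)
    integral *= step
    prefactor = 2.0 * fo_plus * fo_minus / (math.pi * var_plus * var_minus)
    return -math.log(prefactor * integral)
```

A useful consistency check on such a sketch is the limit D_Φ = 0, in which F_C⁺ no longer depends on α⁻ and P_SAD factorizes into the product of two independent Rice distributions, one for each Friedel mate.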

Initial estimates for $[\sigma_\Delta^2]$ can be obtained for each resolution shell by subtracting the mean value of |H⁻|² from the mean value of (F_O⁻)². Initial estimates for σ₊ could in principle be obtained as a weighted average of (F_O⁺ − F_C⁺)² over the phase integral, weighted by the phase probability distribution. In practice, σ₊ will be comparable in size to the contributions from measurement errors and can be readily refined from an initial estimate given by the mean value of $[\sigma_{F_O^+}^2]$.
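The initial estimates described above amount to simple averages over the reflections in a resolution shell. A minimal sketch follows, assuming that the reflections have already been grouped into shells; the function names are hypothetical, and no expected intensity factor or guard against a negative difference is applied because the text does not specify one.

```python
def initial_sigma_delta_sq(fo_minus, h_minus):
    """Mean of (F_O-)^2 minus mean of |H-|^2 for one resolution shell.

    fo_minus: observed amplitudes F_O-; h_minus: complex calculated H-.
    """
    mean_fo_sq = sum(f * f for f in fo_minus) / len(fo_minus)
    mean_h_sq = sum(abs(h) ** 2 for h in h_minus) / len(h_minus)
    return mean_fo_sq - mean_h_sq


def initial_sigma_plus(sig_fo_plus):
    """Mean measurement variance of F_O+ for one resolution shell.

    sig_fo_plus: standard deviations of the F_O+ measurements.
    """
    return sum(s * s for s in sig_fo_plus) / len(sig_fo_plus)
```

Both values serve only as starting points and would subsequently be refined together with the other parameters.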

Acknowledgements

We thank Z. Dauter and M. Turkenburg for the diffraction data sets used in the test cases. This work was funded by NIH/NIGMS under grant No. 1P01GM063210 and by a Principal Research Fellowship from the Wellcome Trust (RJR).

References

Burla, M. C., Carrazzini, B., Cascarano, G. L., Giacovazzo, C., Polidori, G. & Siliqi, G. (2002). Acta Cryst. D58, 928–935.
Cody, W. J. & Stoltz, L. (1989). ACM TOMS, 15, 41–48.
Collaborative Computational Project, Number 4 (1994). Acta Cryst. D50, 760–763.
Dauter, Z. & Adamiak, D. A. (2001). Acta Cryst. D57, 990–995.
Dodson, E. (2003). Acta Cryst. D59, 1958–1965.
Giacovazzo, C. (1983). Acta Cryst. A39, 585–592.
Giacovazzo, C., Ladisa, M. & Siliqi, D. (2003). Acta Cryst. A59, 262–265.
Giacovazzo, C. & Siliqi, D. (2001a). Acta Cryst. A57, 40–46.
Giacovazzo, C. & Siliqi, D. (2001b). Acta Cryst. A57, 414–419.
Green, E. A. (1979). Acta Cryst. A35, 351–359.
Griewank, A., Juedes, D., Mitev, H., Utke, J., Vogel, O. & Walther, A. (1996). ACM TOMS, 22, 131–167.
Hauptman, H. (1982). Acta Cryst. A38, 632–641.
Hendrickson, W. A. (1991). Science, 254, 51–58.
Johnson, R. A. & Wichern, D. W. (1998). Applied Multivariate Statistical Analysis, 4th ed. New Jersey: Prentice–Hall.
La Fortelle, E. de & Bricogne, G. (1997). Methods Enzymol. 276, 472–494.
Matthews, B. W. (1966). Acta Cryst. 20, 82–86.
Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Acta Cryst. D53, 240–255.
North, A. C. T. (1965). Acta Cryst. 18, 212–216.
Otwinowski, Z. (1991). Proceedings of the CCP4 Study Weekend. Isomorphous Replacement and Anomalous Scattering, edited by W. Wolf, P. R. Evans & A. G. W. Leslie, pp. 80–86. Warrington: Daresbury Laboratory.
Pannu, N. S., McCoy, A. J. & Read, R. J. (2003). Acta Cryst. D59, 1801–1808.
Pannu, N. S., Murshudov, G. N., Dodson, E. J. & Read, R. J. (1998). Acta Cryst. D54, 1285–1294.
Pannu, N. S. & Read, R. J. (2004). Acta Cryst. D60, 22–27.
Read, R. J. (1986). Acta Cryst. A42, 140–149.
Read, R. J. (2003). Acta Cryst. D59, 1891–1902.
Rice, L. M., Earnest, T. N. & Brünger, A. T. (2000). Acta Cryst. D56, 1413–1420.
Sim, G. A. (1959). Acta Cryst. 12, 813–815.
Terwilliger, T. C. & Berendzen, J. (1997). Acta Cryst. D53, 571–579.
Terwilliger, T. C. & Eisenberg, D. (1987). Acta Cryst. A43, 6–13.

© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.
