- 1. Introduction
- 2. Results
- 3. Discussion
- A1. General SAD likelihood function
- A2. Multivariate complex normal distribution of {F+, F−*, H+, H−*}
- A3. Conditional distribution P(F+, F−*|H+, H−*)
- A4. Conditional distributions P(F−, α−|H+, H−*) and P(F+, α+|F−, α−, H+, H−*)
- A5. Conditional distribution P(F+, F−|H+, H−*)
- References

- 1. Introduction
- 2. Results
- 3. Discussion
- A1. General SAD likelihood function
- A2. Multivariate complex normal distribution of {F+, F−*, H+, H−*}
- A3. Conditional distribution P(F+, F−*|H+, H−*)
- A4. Conditional distributions P(F−, α−|H+, H−*) and P(F+, α+|F−, α−, H+, H−*)
- A5. Conditional distribution P(F+, F−|H+, H−*)
- References

## research papers

## Simple algorithm for a

SAD function^{a}University of Cambridge, Department of Haematology, Cambridge Institute for Medical Research, Wellcome Trust/MRC Building, Hills Road, Cambridge CB2 2XY, England^{*}Correspondence e-mail: rjr27@cam.ac.uk

Recently, the multivariate complex normal distribution has been used to develop a ), *Acta Cryst.* D**60**, 22–27]. The function accounts explicitly for the correlations between the observed and calculated Friedel mates and their errors. However, the method of derivation of the equation described by Pannu & Read (2004) leads to a complicated likelihood expression that suffers from a number of algorithmic limitations. Here, an alternative derivation of the *P*_{SAD} function is described that leads to simplified algorithmic requirements and that allows an intuitive understanding of the expression.

Keywords: single-wavelength anomalous diffraction; multivariate complex normal; maximum likelihood; experimental phasing.

### 1. Introduction

The availability of tuneable synchrotron sources allowed the development of multiple-wavelength anomalous diffraction (MAD; Hendrickson, 1991) phasing experiments, which today underpin many high-throughput structural biology efforts around the world. With improvements in synchrotron sources, cryocooling of crystals and increased detector sensitivity, phasing by single-wavelength anomalous diffraction (SAD) has become not only feasible, but in some cases preferable to phasing by MAD, particularly where radiation damage is significant (Rice *et al.*, 2000; Dodson, 2003) or where the for the anomalous scatterer is not accessible (*e.g.* sulfur, xenon). However, until recently technical improvements in the SAD experiment had not been matched by corresponding improvements in the theory for obtaining phases from SAD.

A *P*_{SAD} of the (unphased) model structure factors *F*^{+} and *F*^{−} given the (phased) calculated heavy-atom structure factors **H**^{+} and **H**^{−},

where *F*^{+} = |**F**^{+}| and *F*^{−} = |**F**^{−}|. *F*^{+} and *F*^{−} are highly correlated and so *P*_{SAD} cannot be approximated by a product of independent probabilities for the two observations *F*^{+} and *F*^{−}. Also highly correlated are the substructure-model errors contributing to the conditional probability distribution of *F*^{+} and *F*^{−}, since they are generated by the same set of anomalous scatterers. These correlations must be included in the probability distribution for a complete analysis.

Traditional methods for SAD phasing have avoided the complication of including the correlations by using the mean *F* and the Bijvoet difference (*F* and Δ*F*^{±}) rather than *F*^{+} and *F*^{−}, as these are relatively independent and have relatively independent errors. In these treatments, the distribution of Bijvoet differences has been assumed to be Gaussian (North, 1965; Matthews, 1966; de La Fortelle & Bricogne, 1997). More recently, joint probability distributions for *F*^{+} and *F*^{−} have been described that go some way towards addressing the problem (Hauptman, 1982; Giacovazzo, 1983; Burla *et al.*, 2002; Giacovazzo & Siliqi, 2001*a*,*b*; Terwilliger & Eisenberg, 1987), but it was not until Pannu & Read (2004) that a *P*_{SAD} function was described that accounted explicitly for the correlations in the SAD experiment,

where Σ_{4} is the (Hermitian) covariance matrix of the tetravariate complex Gaussian distribution *P*(**F**^{+}, **F**^{−*}, **H**^{+}, **H**^{−*}), Σ_{2} is the (Hermitian) covariance matrix of the bivariate Gaussian complex distribution *P*(**H**^{+}, **H**^{−*}) and α^{−}, and are the phases of **F**^{−*}, **H**^{+} and **H**^{−*}, respectively. It is assumed that the reflections are independent, so the total likelihood is the product of the reflection likelihoods.

The complexity of (1) is immediately apparent. There are 20 different coefficients arising from the inverse of the covariance matrices Σ_{4} (ten real, six imaginary) and Σ_{2} (three real, one imaginary). During Σ_{4} and Σ_{2} must be kept positive definite and in the implementation of the *P*_{SAD} function described by Pannu & Read (2004) this was performed by setting negative eigenvalues to zero during calculation of their inverses by singular value decomposition. The derivatives of the function become even more verbose. In the implementation described by Pannu & Read (2004), derivatives were not calculated analytically. Instead, an automatic differentiation method (*ADOLC*; Griewank *et al.*, 1996) was used to obtain the gradient vectors. The complex functional form of (1) makes it difficult to get an intuitive feel for the effects of the different parameters or the physical meaning of the terms.

Here, we present an alternative derivation of a *P*_{SAD} function that has only three unique error parameters, does not involve matrix inversion, allows analytic derivatives to be calculated easily and provides an intuitive understanding of the SAD experiment.

### 2. Results

#### 2.1. SAD likelihood function

Equation (1) was derived by finding the expression for *P*(**F**^{+}, **F**^{−*}, **H**^{+}, **H**^{−*}), integrating out the unknown phases to obtain the joint probability *P*(*F*^{+}, *F*^{−}, **H**^{+}, **H**^{−*}) and then fixing the calculated structure factors and renormalizing to obtain the desired conditional probability *P*(*F*^{+}, *F*^{−}|**H**^{+}, **H**^{−*}). If, instead, the order of the operations is reversed and the conditional probability *P*(**F**^{+}, **F**^{−*}|**H**^{+}, **H**^{−*}) is formed before integrating out the unknown phases, we obtain (Appendix *A*) the expression

where

This equation contains three error parameters derived from the initial covariance matrix (σ_{Δ}, *D*_{Φ} and α_{Φ}). Again, it is assumed that the reflections are independent so that the total likelihood is the product of the reflection probabilities.

(2) was derived by integrating out the phase α^{+} analytically, leaving the integration over α^{−} to be performed numerically. Equivalently, the phase α^{−} could have been integrated out analytically, leaving the integration over α^{+} to be performed numerically. Numerical integration tests comparing these two forms of the equation confirm that they give the same values for *P*_{SAD} (data not shown).

#### 2.2. Phase probabilities and maps

*P*_{SAD} is obtained by integrating *P*(*F*^{+}, *F*^{−}, α^{−}|**H**^{+}, **H**^{−*}) over α^{−}. The conditional probability distribution of α^{−} can be obtained by fixing *F*^{+} and *F*^{−} in the joint distribution *P*(*F*^{+}, *F*^{−}, α^{−}|**H**^{+}, **H**^{−*}) and renormalizing to obtain *P*(α^{−}|*F*^{+}, *F*^{−}, **H**^{+}, **H**^{−*}). In other words, the probability distribution for this phase is proportional to the integrand in (2). The roles of *F*^{+} and *F*^{−} can be reversed to obtain the probability distribution for α^{+}.

For building an atomic model into electron density one is generally most interested in the map representing the normal (real) scattering component, although the map representing the imaginary component is often useful as well. When the relative contribution of the imaginary component of the anomalous scatterers is small, a map computed using either the centroid (figure-of-merit-weighted) estimate of **F**^{+} or the centroid estimate of **F**^{−*} (making the usual assumption in the map calculation that Friedel's law applies) will differ little from the map representing the real component of the electron density. However, in the presence of very strong anomalous scatterers the phases of **F**^{+} and **F**^{−*} will differ significantly. Therefore, for generality it is better either to compute a complex electron-density map by providing separate coefficients for **F**^{+} and **F**^{−} or to compute separate real and imaginary electron-density maps with coefficients obtained from figure-of-merit-weighted (**F**^{+} + **F**^{−*})/2 and exp(−π*i*/2)(**F**^{+} − **F**^{−*})/2, respectively.

#### 2.3. Implementation and test cases

The *P*_{SAD} function described above, with slight modifications for numerical stability and the inclusion of the effect of experimental errors (Appendix *B*), was implemented in the program *PHASER*. Analytic derivatives were used to calculate the gradients. Optimal anomalous scatterer and error parameters were found by minimizing the minus log-likelihood.

Results of the implementation in *PHASER* were compared with results from the programs *MLPHARE* (version 4.0; Otwinowski, 1991; Collaborative Computational Project, Number 4, 1994), *SOLVE* (version 2.02; Terwilliger & Berendzen, 1997) and *SHARP* (version 2.0; de La Fortelle & Bricogne, 1997). Tests were performed with the two publicly available data sets used by Pannu & Read (2004): the 90° and the 360° pass data sets of a Z-form DNA hexamer duplex phased on ten intrinsic P atoms (Dauter & Adamiak, 2001). The results (Table 1) for *MLPHARE* and *SOLVE* were comparable to those reported by Pannu & Read (2004), but the results for *SHARP* were significantly better, as instead of using the default protocol, the protocol was customized to the test case. Statistics for the implementation of *P*_{SAD} in *PHASER* were not significantly different from those reported for the *P*_{SAD} function implemented in Pannu & Read (2004), confirming that when the parameters have been optimized (1) and (2) give very similar final phase distributions.

B factors were refined. Occupancies were not refined. ‡Coordinates, isotropic B factors and occupancies were refined. The minimum allowed B factor was zero. §Coordinates, isotropic B factors and the global and local imperfection parameters on anomalous differences were refined. ¶Coordinates, isotropic B factors, occupancies and variance parameters were refined. ††Statistics calculated with SFTOOLS (B. Hazes, unpublished work; Collaborative Computational Project, Number 4, 1994). Map correlation compared the figure-of-merit-weighted map from experimental phasing with the figure-of-merit-weighted SIGMAA (Read, 1986) map calculated with phases from the final model. |

### 3. Discussion

The *P*_{SAD} expression described in (2) is simpler than that in (1). It has several algorithmic advantages: the parameterization is compact, of heavy-atom parameters does not involve the inversion of covariance matrices, and analytic derivatives can be determined easily. It is thus likely to be much more robust when applied to a wide range of SAD data sets.

In general, a *PHASER* for the poorer (90°) data set is closer to the mean cosine of the phase error than that produced by the other three programs. This suggests that the *P*_{SAD} function gives better phase probability distribution estimates for use in density modification (required to break the phase ambiguity present in SAD phasing) when the phasing is marginal.

The *P*_{SAD} function can also be used for the of models containing anomalous scatterers (Garib Murshudov, personal communication). In model fast calculation of the target function is of key importance as other aspects of the algorithm are already time-consuming given the large number of atomic parameters (*e.g.* the structure-factor calculation). The reduced parameterization for *P*_{SAD} should also be helpful for this application.

The new formulation of *P*_{SAD} also allows a more intuitive understanding of the SAD likelihood function. As shown in the appendices, *P*_{SAD} can be expressed as the integral of the product of two functions,

where

In this version of the expression for *P*_{SAD}, the variances Σ^{+} and Σ^{−} have been inflated (as discussed in Appendix *B*) to account for the effect of experimental error. The first distribution in the product expresses what is known about one observation, *F*_{O}^{-}, when only the corresponding calculated **H**^{−} is given; accordingly, its variance Σ^{−} accounts for what is left unexplained by **H**^{−}. (Once **H**^{−} is known, no further information about *F*_{O}^{-} is added by the knowledge of **H**^{+}, so this part of the distribution does not depend on **H**^{+}.) The second distribution expresses what is known about the second observation, *F*_{O}^{+}, when *F*_{O}^{-} (phased by some value of the variable of integration, α^{ −}) and both calculated structure factors are given; accordingly, its variance Σ^{+} accounts for what is left unexplained by the value of **F**^{+} predicted from the other three structure factors. To a good approximation, the first distribution provides a `Sim factor' to account for the information given by the partial structure (primarily normal scattering), while the second distribution takes account of the anomalous difference. While the mathematical details differ considerably, the SAD phasing function presented recently by Giacovazzo *et al.* (2003) also combines a term arising from anomalous differences with a Sim-like term. Note that when expressed using the exponential Bessel function (*eI*_{0}), the second distribution in (3) has the same exponential term as a Gaussian distribution. The exponential Bessel function will tend to be flatter than the Gaussian component and so the Gaussian component will dominate the shape of the distribution. This resemblance to a Gaussian distribution explains why the Gaussian approximation, comparing the calculated and observed anomalous differences, is reasonably successful.

The influence of the two components of *P*_{SAD} is shown in Figs. 1 and 2. Fig. 1 illustrates the situation characteristic of SAD phasing, in which the model consists of only the strong anomalous scatterers. In this case, the model of the normal scattering component is very incomplete, so the first (Sim) distribution is very broad and serves primarily to break the phase ambiguity of the second (anomalous difference) distribution. By contrast, Fig. 2 illustrates the situation that would occur in full model against SAD data, where the model of the normal scattering component is nearly complete so the Sim distribution will tend to dominate, while the anomalous difference distribution will provide a weak bimodal indication of the correct phase.

(3) bears a close resemblance to the phased MLHL target (Pannu *et al.*, 1998) for model so one would expect of a full model against the MLHL target (if appropriately implemented) to yield similar results to against the SAD target. In the MLHL target, an integration over possible phases in the Sim probability distribution is weighted by prior knowledge of the phase probability distribution. If no significant improvement were made in the anomalous scatterer model, the second (anomalous difference) component of (3) would not change during the course of so it could be used as a constant source of prior phase information in the MLHL target. Note, however, that it would not be appropriate to provide prior phase information to MLHL in the form of the full phase probability distribution obtained by normalizing the integrand of *P*_{SAD}, because the normal scattering from the anomalous scatterers would then appear twice, in both the Sim component of *P*_{SAD} and the Sim component of MLHL. When the imaginary () contribution to the is weak compared with the real (*f* + ) contribution, the amplitude of the real scattering component can be approximated reasonably well by the mean of *F*_{O}^{+} and *F*_{O}^{-}. Typically, such a mean amplitude would be used in the MLHL target. However, in the presence of very strong anomalous scatterers this approximation breaks down. By analogy with (3), the Sim component of the MLHL target should then compare the observed value of one of the Friedel mates with its corresponding calculated value (including the imaginary contribution). Compared with such an implementation of the MLHL target, any improvement from using the SAD function for model would only arise through improvements in the anomalous scatterering model during the course of The model of strong anomalous scatterers is unlikely to change substantially during subsequent full model so the main potential for improvement with the SAD function will come from accounting for partially occupied sites and the weak from the rest of the structure, such as C, N and O atoms.

### APPENDIX A

### Derivation of SAD likelihood function

#### A1. General SAD likelihood function

For our *P*_{SAD} function we obtain first the probability of the true *F*^{+} and *F*^{−} (unphased) given the heavy-atom structure factors **H**^{+} and **H**^{−*} (phased). (A correction for the effect of measurement error will be introduced later; see Appendix *B*). We derive this expression from the probability of the true phased structure factors **F**^{+} and **F**^{−*} given the calculated heavy-atom structure factors** H**^{+} and **H**^{−*} and then integrate out the phases. Complex conjugates are used for **F**^{−} and **H**^{−} because these are much more highly correlated with their Friedel mates, **F**^{+} and **H**^{+},

The conditional probability within the integral can be expressed as a product of two conditional probabilities, only one of which is dependent on α^{+},

Substituting (5) into (4) we obtain

The integral within the square brackets can be performed analytically to obtain a Rice distribution (§*A*4). The integration over *α*^{−} must be performed numerically,

#### A2. Multivariate complex normal distribution of {*F*^{+}, *F*^{−*}, **H**^{+}**,** **H**^{−*}}

In order to obtain the probability functions in (2), we start from a multivariate complex normal distribution of structure factors {**F**^{+}, **F**^{−*}, **H**^{+}, **H**^{−*}}. There is no prior information before fixing the heavy-atom model and so the expected values are zero.

where

and

The covariance matrix Σ_{FFHH} is shown in terms of submatrices (Σ_{11}, Σ_{12}, Σ_{21} and Σ_{22}) that will be manipulated when the conditional variables are fixed. A superscript *H* is used here and elsewhere to denote the Hermitian transpose of a matrix.

In defining **F**^{+}, **F**^{−*}, **H**^{+} and **H**^{−*}, we use **f** and **g** to represent atomic scattering factors and **x** and **y** to represent coordinates for the corresponding crystal and model. In general, the scattering factors are complex to allow for the effects of so that, for instance, **f**_{k} = *f*_{k} + *i*. For simplicity, the model can be considered to contain all the atoms present in the crystal (*N*), but with zero scattering factor for atoms that are not present in the model. The sums can then be divided into contributions from unmodelled (*NU* atoms) and modelled atoms.

Following the reasoning outlined in Pannu *et al.* (2003), the submatrix Σ_{22} can be filled in as follows:

where

The factor ∊ accounts for the statistical effect of symmetry. The submatrix Σ_{11} is completed similarly,

where

The submatrix Σ_{12} includes the effects of coordinate error and of differences between the true and modelled atomic scattering factors. In a fashion similar to that described in Read (2003), in the context of multiple the elements of Σ_{12} can be described in terms of the elements of Σ_{22}. Consider one element of the matrix Σ_{12},

Here, it is assumed that differences in position are uncorrelated with differences in scattering factor. The factor *D* accounts for the overall effect of the phase-shift term arising from coordinate errors and absorbs any overall difference in scale between **f** and **g**. The same considerations apply to other elements of Σ_{12}, so that

As discussed in Read (2003), after the of occupancies and *B* factors, the model atomic scattering factors **g**_{k} should be approximately equal to **f*** _{k}*〈exp[2πi(

**x**

*−*

_{k}**y**

*)]〉, so that the phase shift and scale components of*

_{k}*D*will cancel and

*D*will be equal to one.

#### A3. Conditional distribution *P*(**F**^{+}, **F**^{−*}|**H**^{+}, **H**^{−*})

The conditional distribution *P*(**F**^{+}, **F**^{−*}|**H**^{+}, **H**^{−*}) has a mean and covariance matrix given by standard manipulation (Johnson & Wichern, 1998) of the above covariance elements,

where

and

The phase component of σ_{Φ} arises both from errors in the model of anomalous scatterers and from the (perhaps weak) from atoms not included in the model. It represents the systematic phase shift between the parts of **F**^{+} and **F**^{−*} that are not explained by the model. If the model includes most of the significant anomalous scatterers, the phase shift will be very small and could probably be ignored.

#### A4. Conditional distributions *P*(*F*^{−}, α^{−}|**H**^{+}, **H**^{−*}) and *P*(*F*^{+}, α^{+}|*F*^{−}, α^{−}, **H**^{+}, **H**^{−*})

Again, with standard manipulations (including a change of variable from complex to polar coordinates) we can form the two conditional distributions in (7). For convenience in notation, we define **F**^{−*} = *F*^{−}exp(*i*α^{−}), *i.e.* α^{−} is the phase of the complex conjugate of **F**^{−},

where

The phase α^{+} can be integrated out analytically to obtain the Rice distribution, which appears frequently in crystallographic literature (*e.g.* Sim, 1959),

### APPENDIX B

### Implementation of SAD likelihood function

For numerical stability it is convenient to express (2) in terms of the exponential Bessel function *eI*_{0}(*x*) = exp(−*x*)*I*_{0}(*x*) (Cody & Stoltz, 1989). During of the heavy-atom parameters, the *D* values are absorbed by the occupancies and *B* factors of the heavy atoms and are therefore not included. The term (1 − ) can be problematic during because *D*_{Φ} and σ_{Δ} are on very different scales (*D*_{Φ} is very close to 1, while σ_{Δ} is large) and (1 − ) must remain positive (*i.e.* must remain between 0 and 1). In order to avoid these problems, we introduce a parameter σ_{+} to replace this term. This removes the problem of scale and simplifies the constraint to one in which σ_{+} must remain positive.

Up to this point, we have derived the function in terms of the true values of *F*^{+} and *F*^{−}. If we use the experimental observations of their values, *F*_{O}^{+} and *F*_{O}^{-}, we need to consider the experimental errors, which will be described by variance parameters and . In the case of MIR phasing, the effect of measurement error in the observed amplitude can be approximated by inflating the corresponding variance element of the covariance matrix (Pannu *et al.*, 2003), as suggested by others (Green, 1979; de La Fortelle & Bricogne, 1997; Murshudov *et al.*, 1997). The increment to the variance ends up in the variance of the Rice distribution for each observed amplitude. However, if this approach is taken for the SAD function, the variances for the component distributions of *P*_{SAD} become unnecessarily complicated. Rather than inflating the diagonal elements of the covariance matrix, we have chosen instead to inflate the variances of the conditional distributions for each observation that are the components of *P*_{SAD}. The variance term for *P*( *F*_{O}^{-}, α^{−}|**H**^{+}, **H**^{−}*) only needs to account for errors in the measurement of *F*_{O}^{-}, but the variance term for *P*( *F*_{O}^{+}, α^{+}| *F*_{O}^{-}, α^{−}, **H**^{+}, **H**^{−}*) needs to account for errors in both measurements, as the expected value of *F*_{O}^{+} is computed using the measured value of *F*_{O}^{-}, weighted by *D*_{Φ}. The magnitude of *D*_{Φ} will typically be very close to one, so the weighting factor on the variance of *F*_{O}^{-} can be ignored; the very small systematic decrease in the contribution from the experimental error in *F*_{O}^{-} owing to *D*_{Φ} can be absorbed by σ_{+}. Numerical simulations show that this approximation to the effect of measurement error gives almost identical results to those obtained by inflating the diagonal elements of the covariance matrix.

The target function for anomalous scatterer *PHASER* is thus given by

where

Initial estimates for can be obtained for each resolution shell by subtracting the mean value of |**H**^{−}|^{2} from the mean value of *F*_{O}^{-2}. Initial estimates for σ_{+} could in principle be obtained as a weighted average of ( *F*_{O}^{+} - *F*_{C}^{+})^{2} over the phase integral, weighted by the phase probability distribution. In practice, σ_{+} will be comparable in size to the contributions from measurement errors and can be readily refined from an initial estimate given by the mean value of .

### Acknowledgements

We thank Z. Dauter and M. Turkenburg for the diffraction data sets used in the test cases. This work was funded by NIH/NIGMS under grant No. 1P01GM063210 and by a Principal Research Fellowship from the Wellcome Trust (RJR).

### References

Burla, M. C., Carrazzini, B., Cascarano, G. L., Giacovazzo, C., Polidori, G. & Siliqi, G. (2002). *Acta Cryst.* D**58**, 928–935. Web of Science CrossRef CAS IUCr Journals Google Scholar

Cody, W. J. & Stoltz, L. (1989). *ACM TOMS*, **15**, 41–48. CrossRef Web of Science Google Scholar

Collaborative Computational Project, Number 4 (1994). *Acta Cryst.* D**50**, 760–763. CrossRef IUCr Journals Google Scholar

Dauter, Z. & Adamiak, D. A. (2001). *Acta Cryst.* D**57**, 990–995. Web of Science CrossRef CAS IUCr Journals Google Scholar

Dodson, E. (2003). *Acta Cryst.* D**59**, 1958–1965. Web of Science CrossRef CAS IUCr Journals Google Scholar

Giacovazzo, C. (1983). *Acta Cryst.* A**39**, 585–592. CrossRef CAS Web of Science IUCr Journals Google Scholar

Giacovazzo, C., Ladisa, M. & Siliqi, D. (2003). *Acta Cryst.* A**59**, 262–265. Web of Science CrossRef CAS IUCr Journals Google Scholar

Giacovazzo, C. & Siliqi, D. (2001*a*). *Acta Cryst.* A**57**, 40–46. Web of Science CrossRef CAS IUCr Journals Google Scholar

Giacovazzo, C. & Siliqi, D. (2001*b*). *Acta Cryst.* A**57**, 414–419. Web of Science CrossRef CAS IUCr Journals Google Scholar

Green, E. A. (1979). *Acta Cryst.* A**35**, 351–359. CrossRef CAS IUCr Journals Web of Science Google Scholar

Griewank, A., Juedes, D., Mitev, H., Utke, J., Vogel, O. & Walther, A. (1996). *ACM TOMS*, **22**, 131–167. CrossRef Web of Science Google Scholar

Hauptman, H. (1982). *Acta Cryst.* A**38**, 632–641. CrossRef CAS Web of Science IUCr Journals Google Scholar

Hendrickson, W. A. (1991). *Science*, **254**, 51–58. CrossRef PubMed CAS Web of Science Google Scholar

Johnson, R. A. & Wichern, D. W. (1998). *Applied Multivariate Statistical Analysis*, 4th ed. New Jersey: Prentice–Hall. Google Scholar

La Fortelle, E. de & Bricogne, G. (1997). *Methods Enzymol.* **276**, 472–494. Google Scholar

Matthews, B. W. (1966). *Acta Cryst.* **20**, 82–86. CrossRef IUCr Journals Web of Science Google Scholar

Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). *Acta Cryst.* D**53**, 240–255. CrossRef CAS Web of Science IUCr Journals Google Scholar

North, A. C. T. (1965). *Acta Cryst.* **18**, 212–216. CrossRef IUCr Journals Web of Science Google Scholar

Otwinowski, Z. (1991). *Proceedings of the CCP4 Study Weekend. Isomorphous Replacement and Anomalous Scattering*, edited by W. Wolf, P. R. Evans & A. G. W. Leslie, pp. 80–86. Warrington: Daresbury Laboratory. Google Scholar

Pannu, N. S., McCoy, A. J. & Read, R. J. (2003). *Acta Cryst.* D**59**, 1801–1808. Web of Science CrossRef CAS IUCr Journals Google Scholar

Pannu, N. S., Murshudov, G. N., Dodson, E. J. & Read, R. J. (1998). *Acta Cryst.* D**54**, 1285–1294. Web of Science CrossRef CAS IUCr Journals Google Scholar

Pannu, N. S. & Read, R. J. (2004). *Acta Cryst.* D**60**, 22–27. Web of Science CrossRef CAS IUCr Journals Google Scholar

Read, R. J. (1986). *Acta Cryst.* A**42**, 140–149. CrossRef CAS Web of Science IUCr Journals Google Scholar

Read, R. J. (2003). *Acta Cryst.* D**59**, 1891–1902. Web of Science CrossRef CAS IUCr Journals Google Scholar

Rice, L. M., Earnest, T. N. & Brünger, A. T. (2000). *Acta Cryst.* D**56**, 1413–1420. Web of Science CrossRef CAS IUCr Journals Google Scholar

Sim, G. A. (1959). *Acta Cryst.* **12**, 813–815. CrossRef IUCr Journals Web of Science Google Scholar

Terwilliger, T. C. & Berendzen, J. (1997). *Acta Cryst.* D**53**, 571–579. CrossRef CAS Web of Science IUCr Journals Google Scholar

Terwilliger, T. C. & Eisenberg, D. (1987). *Acta Cryst.* A**43**, 6–13. CrossRef CAS Web of Science IUCr Journals Google Scholar

© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited. For more information, click here.