CCP4 study weekend

BIOLOGICAL CRYSTALLOGRAPHY
ISSN: 1399-0047

Is it jolly SAD?

Department of Chemistry, University of York, Heslington, York YO10 5DD, England
*Correspondence e-mail: e.dodson@ysbl.york.ac.uk

(Received 4 June 2003; accepted 22 September 2003)

Examples of phasing macromolecular crystal structures based on single-wavelength anomalous dispersion (SAD) have demonstrated that this approach may have general applications in structural biology. With better data-collection facilities and cryogenic techniques, combined with powerful data-processing, phasing and density-modification programs, the SAD approach may prove simpler than phasing from multi-wavelength (MAD) measurements. It can be performed at any wavelength where anomalous scattering can be observed, in many cases using laboratory X-ray sources. However, there is still a need for accurate data, successful phase improvement and a certain amount of luck. This paper extends the discussion of Jolly SAD in Dauter et al. [Dauter, Z., Dauter, M. & Dodson, E. (2002), Acta Cryst. D58, 494–506].

1. Introduction

Structure determination is now recognized as one of the most effective methods for acquiring biological insight and there is pressure on crystallographers to provide accurate information as quickly and painlessly as possible. The diffraction experiment only gives the intensities arising from the atomic distribution of the molecule within a lattice, so in order to produce an interpretable image it is necessary to determine the associated phases. The most unbiased approach to finding these is by modifying the observations in some predictable way. Formally, an experimental phase for any reflection can be uniquely estimated from three measurements of associated amplitudes, provided that the vector describing the differences can be calculated. This is easy if the coordinates of those atoms generating the differences, described as the substructure, are known. Once phases and error estimates have been obtained, there are powerful ways to refine them using known properties of the protein electron density, e.g. flattening the density in the solvent region, modifying the density within the protein or averaging the density for different copies of a molecule.

1.1. Historical background

The method of multiple isomorphous replacement (MIR) uses related crystals where the substructure consists of additional atoms, usually heavy metals, soaked into the crystal from appropriate salt solutions. There is always a problem with isomorphism with respect to the native crystal; the salts often cause other rearrangements within the lattice apart from introducing the heavy atoms.

This approach is augmented by exploiting the anomalous dispersion differences between F(hkl) and F(−h, −k, −l) resulting from the resonant scattering of the substructure within the derivative(s) (the acronyms MIRAS or SIRAS stand for multiple or single isomorphous replacement with anomalous scattering). Such differences are not affected by non-isomorphism, but are usually much weaker than the isomorphous differences and therefore harder to measure accurately.

1.2. Current practice

Once a model of the substructure is obtained, it must be refined in order to improve its ability to predict the observed differences. Simultaneously, it is used to deduce protein phases from these differences and from the calculated heavy-atom model structure factors (Blundell & Johnson, 1976[Blundell, T. L. & Johnson, L. N. (1976). Protein Crystallography. London: Academic Press.]; Drenth, 1999[Drenth, J. (1999). Principles of Protein X-ray Crystallography, 2nd ed. Heidelberg: Springer.]). Although formally a phase can be determined from three observations and the appropriate model, the errors in both measurements and models mean that it is essential to use a probabilistic approach to assign an appropriate weight to the phase. In addition, the methods used for finding the positions of the model atoms cannot distinguish the hand of the solution and the phasing geometry is equally well described by the model or by its mirror image. If there is only one partial structure (as for SAD, MAD or SIRAS experiments), the correct enantiomer can only be chosen by assessing which hand generates the better electron-density map. The underlying theory is reviewed in this volume and described in many classic texts, e.g. Blow & Rossmann (1961[Blow, D. M. & Rossmann, M. G. (1961). Acta Cryst. 14, 1195-1202.]), North (1965[North, A. C. T. (1965). Acta Cryst. 18, 212-216.]), Matthews (1966[Matthews, B. W. (1966). Acta Cryst. 20, 230-239.]), Dodson & Vijayan (1971[Dodson, E. J. & Vijayan, M. (1971). Acta Cryst. B27, 2402-2411.]), Fourme et al. (1996[Fourme, R., Shepard, W. & Kahn, R. (1996). Prog. Biophys. Mol. Biol. 64, 167-199.]), Blundell & Johnson (1976[Blundell, T. L. & Johnson, L. N. (1976). Protein Crystallography. London: Academic Press.]), Drenth (1999[Drenth, J. (1999). Principles of Protein X-ray Crystallography, 2nd ed. Heidelberg: Springer.]).

1.3. Using the anomalous signal alone

The multi-wavelength anomalous diffraction (MAD) method uses only the wavelength-dependence of the atomic structure factor of the anomalously scattering atoms for solving the phase problem (Phillips & Hodgson, 1980[Phillips, J. C. & Hodgson, K. O. (1980). Acta Cryst. A36, 856-864.]; Karle, 1980[Karle, J. (1980). Int. J. Quant. Chem. 7, 357-367.]; Hendrickson, 1991[Hendrickson, W. A. (1991). Science, 254, 51-58.], 1999[Hendrickson, W. A. (1999). J. Synchrotron. Rad. 6, 845-851.]). In this approach, several data sets are collected at various wavelengths around the absorption edge of the anomalous scatterer present in the crystal and the differences in the [f'] and [f''] contributions are utilized for phase calculation. Such MAD experiments are possible only at synchrotron X-ray sources, where the X-ray wavelength can be tuned to the desired values. The anomalous scatterer used for MAD phasing may be inherently contained in the metalloprotein (e.g. Zn, Cu, Fe), introduced by soaking (classic heavy atoms, e.g. Hg, Pt, Au compounds or halide ions in the solvent shell) or by metabolic or chemical modification, such as those used to incorporate selenomethionine in proteins or bromouracil in DNA (Boggon & Shapiro, 2000[Boggon, T. J. & Shapiro, L. (2000). Structure, 8, R143-R149.]). If conditions are favourable, the phasing power is excellent.

However, it has been proposed (González et al., 1999[González, A., Pedelacq, J. D., Sola, M., Gomis-Rüth, F.-X., Coll, M., Samama, J. P. & Benini, S. (1999). Acta Cryst. D55, 1449-1458.]) and demonstrated that sufficiently good phase estimates may be obtained by collecting more accurate data at fewer wavelengths. In some cases, data collected at one wavelength have been sufficient to determine the phases of both test and novel structures, as demonstrated by the solution of the structure of crambin (Hendrickson & Teeter, 1981[Hendrickson, W. A. & Teeter, M. M. (1981). Nature (London), 290, 107-113.]) and advocated by Wang (1985[Wang, B.-C. (1985). Methods Enzymol. 115, 90-112.]). Dauter et al. (2002[Dauter, Z., Dauter, M. & Dodson, E. (2002). Acta Cryst. D58, 494-506.]) give many examples where this approach of single-wavelength anomalous dispersion (SAD) coupled with increasingly powerful phasing and density-modification algorithms (La Fortelle & Bricogne, 1997[La Fortelle, E. de & Bricogne, G. (1997). Methods Enzymol. 276, 472-494.]; Hauptman, 1996[Hauptman, H. A. (1996). Acta Cryst. A52, 490-496.]; Langs et al., 1999[Langs, D. A., Blessing, R. H. & Guo, D. Y. (1999). Acta Cryst. A55, 755-760.]; Cowtan, 1999[Cowtan, K. (1999). Acta Cryst. D55, 1555-1567.]; Terwilliger, 2000[Terwilliger, T. C. (2000). Acta Cryst. D56, 965-972.]) can solve the phase problem for macromolecular structures.

2. Background of phase determination

Phase determination is covered elsewhere in this volume and only a brief outline is presented here. X-rays are diffracted by atoms positioned within a crystal lattice. Most diffraction arises from the electrons surrounding the atomic nucleus and since this electron cloud has a radius comparable to the X-ray wavelength, the contribution falls off at higher diffraction angles, i.e. at higher resolution. This is represented by the atomic form factor. Such a signal from the whole atom is isotropic and can be treated as a real number, [f^{0}(\theta)].

If X-rays can excite those electrons that are able to jump from lower to higher energy shells, an auxiliary resonant anomalous signal is observed and its contribution to the atomic form factor can be expressed as a complex number [f'] + i[f'']. Generally, [f''] is proportional to the atomic absorption of the X-rays and to their fluorescence and [f'] follows the derivative of this function, according to the Kramers–Kronig transformation (James, 1958[James, R. W. (1958). The Optical Principles of the Diffraction by X-rays. London: Bell & Sons.]). In contrast to the normal atomic scattering factor [f^{0}], the anomalous dispersion corrections [f'] and [f''] depend only on the wavelength λ of the X-rays used for the diffraction experiment and do not diminish with the diffraction angle. The full atomic form factor is

[f(\theta, \lambda) = f^0(\theta) + f' (\lambda) + if''(\lambda).]

In macromolecules, most of the atoms have negligible [f'](λ) and [f''](λ) and there are only a few anomalous scatterers, so that the total anomalous dispersion generates only small differences in intensity. The diffraction data must be measured very accurately to allow these differences to be utilized for phasing.

When all atomic form factors are real with zero [f''] contribution Friedel's law holds, so that F(hkl) and F(−h, −k, −l) have the same magnitude and φ(hkl) = −φ(−h, −k, −l). However, when the form factor contains an imaginary contribution i[f''], the reflections F(hkl) and F(−h, −k, −l) have different intensities and their phases are no longer complementary. In the MAD technique, where several data sets are measured at different wavelengths λi with different values for the dispersive difference [f'] and the anomalous difference [f''], two associated but different measurements of the amplitudes are obtained for each wavelength. Once the positions of the anomalous scatterers are known and the magnitudes of [f'] and [f''] for this wavelength have been estimated, the protein phases for the reflections can formally be derived in an analogous way to the MIRAS approach.
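The breakdown of Friedel's law can be checked numerically. The following is a minimal sketch using only the Python standard library; the three-atom toy structure, its coordinates and scattering factors, and the function names are illustrative assumptions and are not taken from any of the test structures discussed here.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """Sum f * exp(2*pi*i*(hx + ky + lz)) over all atoms.

    atoms: list of (f, (x, y, z)), where f is the complex form factor
    f0 + f' + i*f'' and x, y, z are fractional coordinates.
    """
    h, k, l = hkl
    total = 0j
    for f, (x, y, z) in atoms:
        total += f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
    return total

def bijvoet_pair(hkl, atoms):
    """Return the amplitudes (|F(hkl)|, |F(-h,-k,-l)|)."""
    h, k, l = hkl
    plus = structure_factor((h, k, l), atoms)
    minus = structure_factor((-h, -k, -l), atoms)
    return abs(plus), abs(minus)

# Hypothetical toy structure: two light atoms and one heavier scatterer.
light = [(6.0 + 0j, (0.10, 0.20, 0.30)), (7.0 + 0j, (0.40, 0.15, 0.70))]
heavy_real = [(34.0 + 0j, (0.25, 0.60, 0.05))]          # f'' = 0
heavy_anom = [(34.0 - 1.0 + 4.0j, (0.25, 0.60, 0.05))]  # f' = -1, f'' = 4 (illustrative)

# With purely real form factors the Friedel mates have equal amplitudes.
f_plus, f_minus = bijvoet_pair((1, 2, 3), light + heavy_real)
# With an imaginary component the two amplitudes differ.
g_plus, g_minus = bijvoet_pair((1, 2, 3), light + heavy_anom)
```

Comparing `f_plus` with `f_minus` and `g_plus` with `g_minus` shows that the Bijvoet difference appears only when one of the form factors carries an imaginary part.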

Once again, the procedure has two independent stages. Firstly, the positions of the anomalous scatterers have to be deduced from Patterson or direct-methods searches using coefficients derived from either dispersive or anomalous differences or from a combination of both; secondly, the positions of the partial structure and the precise values of [f'] and [f''] need to be refined in order to maximize the ability of the model to predict the observed differences.

3. Single-wavelength phasing

It is not formally possible to evaluate a protein phase exactly if there are only two experimental measurements, e.g. when the data are restricted to one wavelength (SAD) with only a single anomalous difference available or in the SIR case when only the native and one derivative data set are measured. Even assuming that the measured protein amplitudes, F+ and F−, and the calculated amplitude and phase contributions of the anomalous partial structure, Fa and φA, are error-free, there is a twofold ambiguity in the estimation of the protein phase (Ramachandran & Raman, 1956[Ramachandran, G. N. & Raman, S. (1956). Curr. Sci. 25, 348-351.]). Fig. 1 shows that for the SAD case, where all the anomalous scatterers are of the same kind, the two possible phase values of the protein structure factor, φT, are symmetrically oriented around (φA − 90°). There is a phase error for either solution of (φT − φA + 90°), with an associated figure of merit of cos(φT − φA + 90°). Note that all centric reflections where (φT − φA) must be either 0 or 180° have figures of merit of zero. (Analogously, for the SIR case the two possible values of the protein phase are symmetrically oriented about the heavy-atom phase, φH.) Thus, a unique protein phase could only be determined if the protein and anomalous scatterer phases differ by 90°, when the two solutions would coincide. (These reflections also have the maximum possible Bijvoet difference.)

[Figure 1]
Figure 1
Part of the Argand diagram showing various contributions to the scattering factors. The measured amplitudes of both Friedel mates and their mean (Ft+, Ft- and Ft) are shown in black and green, those of the anomalous scatterers (Fa and [F''_{a}]) in red and the resultant contribution of the normally scattering atoms (Fp) in blue. (The likely contribution of Fa has been grossly exaggerated to clarify the figure.) The magnitudes of Ft+ and Ft- are known and once the anomalous substructure has been positioned, the red vectors can be calculated. Two solutions for Ft are then possible, with their phase, φT, symmetrically placed on either side of φA − 90°. The contribution of the normal scatterers, Fp, will be different in the two cases.

The relation between the Bijvoet difference, ΔF±, the phase of the protein, φT, and that of the anomalous substructure, φA, can be deduced from Fig. 1,

[F^{+2} - F^{-2} = 4F_{t}F''_{a}\sin(\varphi_{T} - \varphi_{A}).]

If the contribution of the anomalous scattering to the total diffracting power of the crystal is small, Fa ≪ Ft, then (|F+| + |F−|)/2 ≃ Ft and

[\Delta F^{\pm} = |F^{+}| - |F^{-}| \simeq 2F''_{a} \sin (\varphi_{T} - \varphi_{A}).]
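The step from the Argand diagram to these two relations can be written out explicitly. The following is a sketch of the algebra, assuming that the imaginary component of the anomalous scatterers, of magnitude F''a, leads their real contribution by 90° (so that its phase is φA + 90°) and that Ft denotes the total structure factor without this imaginary component.

```latex
\begin{align*}
F^{+} &= F_{t}e^{i\varphi_{T}} + iF''_{a}e^{i\varphi_{A}}, \qquad
(F^{-})^{*} = F_{t}e^{i\varphi_{T}} - iF''_{a}e^{i\varphi_{A}},\\
|F^{\pm}|^{2} &= F_{t}^{2} + F''^{2}_{a} \pm 2F_{t}F''_{a}\sin(\varphi_{T} - \varphi_{A}),\\
|F^{+}|^{2} - |F^{-}|^{2} &= 4F_{t}F''_{a}\sin(\varphi_{T} - \varphi_{A}),\\
|F^{+}| - |F^{-}| &= \frac{|F^{+}|^{2} - |F^{-}|^{2}}{|F^{+}| + |F^{-}|}
 \simeq \frac{4F_{t}F''_{a}\sin(\varphi_{T} - \varphi_{A})}{2F_{t}}
 = 2F''_{a}\sin(\varphi_{T} - \varphi_{A}).
\end{align*}
```

The last line uses the approximation |F+| + |F−| ≃ 2Ft, which holds when the anomalous contribution is small compared with the total amplitude.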

Defining θ = cos⁻¹(ΔF±/2[F''_{a}]) and since sin(φT − φA) = sin(180° − φT + φA),

[\varphi_{T} - \varphi_{A} = 90^{\circ} + \theta \,\,{\rm or}\,\,90^{\circ} - \theta.]

Except when θ = 0° or 180°, i.e. when |ΔF±| = 2[F''_{a}] and the two solutions coincide, the ambiguity follows.
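The two candidate phases can be evaluated directly from the relations above. This is a minimal sketch under the approximate Bijvoet relation ΔF± ≃ 2F''a sin(φT − φA); the function name and the clamping of the ratio to the range −1 to 1 (to cope with measurement error) are assumptions made for illustration and do not describe any particular program.

```python
import math

def sad_phase_candidates(delta_f, f2_a, phi_a_deg):
    """Return the two phases (degrees) consistent with one Bijvoet difference.

    delta_f   : observed |F+| - |F-|
    f2_a      : calculated F''a amplitude of the anomalous substructure
    phi_a_deg : calculated substructure phase, in degrees
    """
    # Measurement error can push the ratio outside [-1, 1]; clamp it.
    ratio = max(-1.0, min(1.0, delta_f / (2.0 * f2_a)))
    theta = math.degrees(math.acos(ratio))
    return ((phi_a_deg + 90.0 + theta) % 360.0,
            (phi_a_deg + 90.0 - theta) % 360.0)

# Illustrative values only: delta_f = 1.0, F''a = 1.0, phi_A = 30 degrees.
candidates = sad_phase_candidates(1.0, 1.0, 30.0)
```

Substituting either candidate back into 2F''a sin(φT − φA) reproduces the input difference, which is a convenient check on the sign convention in use.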

The probability of phase distribution resulting from anomalous scattering (Hendrickson, 1979[Hendrickson, W. A. (1979). Acta Cryst. A35, 245-247.]) can be expressed,

[P_{\rm anom}(\varphi) = N \exp \{-[\Delta F^{\pm} + 2F''_{a}\sin (\varphi_{T} - \varphi_{A})]^{2}/2E^{2}\},]

where N is the normalizing factor and E is the estimated standard error.

However, since the anomalous scatterers are part of the structure, φT will be correlated with φA and of the two possibilities resulting from the sine ambiguity, there is a slightly higher probability that the protein phase, φT, has the value closer to φA. Sim (1959[Sim, G. A. (1959). Acta Cryst. 12, 813-815.]) derived the statistical probability of the protein phase estimated from the known partial structure as

[P_{\rm par}(\varphi_{T}) = N\exp[2(|F_{t}||F_{a}|/F_{u}^{2})\cos(\varphi_{T} - \varphi_{A})],]

where [F_{u}^{2}] is the contribution of the normally scattering (unknown) atoms. The total phase probability is obtained from a combination of the Sim-weighted estimate and that derived from the SAD equation. Modern programs such as SHARP (de La Fortelle & Bricogne, 1997[La Fortelle, E. de & Bricogne, G. (1997). Methods Enzymol. 276, 472-494.]), SOLVE (Terwilliger & Berendzen, 1999[Terwilliger, T. C. & Berendzen, J. (1999). Acta Cryst. D55, 849-861.]), BP3 (Pannu et al., 2003[Pannu, N. S., McCoy, A. J. & Read, R. J. (2003). Acta Cryst. D59, 1801-1808.]) and MLPHARE (Otwinowski, 1991[Otwinowski, Z. (1991). Proceedings of the CCP4 Study Weekend. Isomorphous Replacement and Anomalous Scattering, edited by Wolf, P. R. Evans & A. G. W. Leslie, pp. 80-86. Warrington: Daresbury Laboratory.]) endeavour to provide realistic starting probabilistic estimates of the initial phases and figures of merit.
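The way in which the two distributions combine can be illustrated on a phase grid. The sketch below is not the algorithm of any of the programs named above; it simply multiplies a Gaussian anomalous term by the Sim term and reports the centroid phase and figure of merit. The sign of the calculated difference is taken from the approximate Bijvoet relation ΔF± ≃ 2F''a sin(φT − φA) and should be flipped if the opposite convention is used. The function name, the parameter `sim_weight` (standing for 2|Ft||Fa|/Fu²) and the numerical inputs are assumptions made for illustration.

```python
import cmath
import math

def sad_phase_distribution(delta_f, f2_a, phi_a_deg, err, sim_weight, step_deg=1.0):
    """Combine anomalous and Sim phase probabilities on a regular phase grid.

    Returns (list of (phase_deg, probability), centroid phase_deg, figure of merit).
    """
    n_steps = int(round(360.0 / step_deg))
    phases = [i * step_deg for i in range(n_steps)]
    log_p = []
    for phi in phases:
        d = math.radians(phi - phi_a_deg)
        # Lack of closure between observed and calculated Bijvoet difference.
        lack_of_closure = delta_f - 2.0 * f2_a * math.sin(d)
        log_p.append(-lack_of_closure ** 2 / (2.0 * err ** 2)
                     + sim_weight * math.cos(d))
    top = max(log_p)  # subtract the maximum to avoid overflow in exp()
    weights = [math.exp(v - top) for v in log_p]
    total = sum(weights)
    probs = [w / total for w in weights]
    centroid = sum(p * cmath.exp(1j * math.radians(phi))
                   for p, phi in zip(probs, phases))
    best_phase = math.degrees(cmath.phase(centroid)) % 360.0
    return list(zip(phases, probs)), best_phase, abs(centroid)

# Illustrative inputs: delta_f = 1.0, F''a = 1.0, phi_A = 30 deg, E = 0.2,
# and a weak Sim contribution of 0.2.
grid, best_phase, fom = sad_phase_distribution(1.0, 1.0, 30.0, 0.2, 0.2)
```

With these inputs the distribution is bimodal, the peak nearer to φA is slightly higher because of the Sim term, and the centroid phase lies between the two peaks with a correspondingly reduced figure of merit.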

4. Phase-improvement techniques

The problem of resolving the SAD phase ambiguity for reflections has been tackled by various methods: resolved anomalous phasing, used originally by Hendrickson & Teeter (1981[Hendrickson, W. A. & Teeter, M. M. (1981). Nature (London), 290, 107-113.]) for the solution of crambin, the iterative single-wavelength anomalous scattering (ISAS) approach, introduced by Wang (1985[Wang, B.-C. (1985). Methods Enzymol. 115, 90-112.]), and direct-methods applications as proposed by Hauptman (1982[Hauptman, H. A. (1982). Acta Cryst. A38, 289-294.], 1996[Hauptman, H. A. (1996). Acta Cryst. A52, 490-496.]) or by Fan et al. (1990[Fan, H.-F., Hao, Q., Gu, Y.-X., Qian, J.-Z., Zheng, C.-D. & Ke, H. (1990). Acta Cryst. A46, 935-939.]).

However, the most powerful approach to improving the phase distributions uses density-modification procedures such as those programmed in SOLOMON (Abrahams & Leslie, 1996[Abrahams, J. P. & Leslie, A. G. W. (1996). Acta Cryst. D52, 30-42.]), DM (Cowtan, 1999[Cowtan, K. (1999). Acta Cryst. D55, 1555-1567.]) and RESOLVE (Terwilliger, 2000[Terwilliger, T. C. (2000). Acta Cryst. D56, 965-972.]). The methods all modify the initial density, use this to generate a new set of phases which are combined with the experimental ones and then repeat the cycle. Providing the phase errors are properly estimated and the solvent boundary correctly outlined, this is extremely effective. Since the method depends on the recognition and enhancement of interpretable features in the electron density and the maps based on phases derived from the two enantiomorphs differ in quality, this procedure should also select the correct enantiomorph.
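The cycle described above can be sketched for the simplest modification, solvent flattening, on a one-dimensional toy cell. This is a deliberately reduced illustration: it performs one unweighted cycle and omits the combination of the modified phases with the experimental phase probabilities that the programs named above carry out. The function names, the grid, the density values and the sign convention of the transform are assumptions made for illustration.

```python
import cmath
import math

def dft(values):
    """Forward transform: density samples -> structure factors."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * h * x / n) for x, v in enumerate(values))
            for h in range(n)]

def idft(coeffs):
    """Inverse transform: structure factors -> density samples."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * math.pi * h * x / n) for h, c in enumerate(coeffs)) / n
            for x in range(n)]

def solvent_flatten_cycle(amplitudes, phases, protein_mask):
    """One unweighted cycle; phases in radians, mask is True inside the protein."""
    coeffs = [a * cmath.exp(1j * p) for a, p in zip(amplitudes, phases)]
    density = [value.real for value in idft(coeffs)]
    solvent = [d for d, inside in zip(density, protein_mask) if not inside]
    level = sum(solvent) / len(solvent)  # assumes at least one solvent point
    # Keep the protein density, replace the solvent by its mean value.
    flattened = [d if inside else level for d, inside in zip(density, protein_mask)]
    # New phases come from the modified map; observed amplitudes are kept by the caller.
    return [cmath.phase(c) for c in dft(flattened)]

# Hypothetical 1D cell: 8 protein grid points followed by 8 flat solvent points.
true_density = [1.0, 3.0, 2.0, 5.0, 4.0, 2.0, 3.0, 1.0] + [0.5] * 8
mask = [True] * 8 + [False] * 8
true_f = dft(true_density)
amplitudes = [abs(f) for f in true_f]
new_phases = solvent_flatten_cycle(amplitudes, [cmath.phase(f) for f in true_f], mask)
```

Starting from the true phases of a map whose solvent is already flat, the cycle leaves the structure factors unchanged, which is the fixed point that the iteration aims for; with erroneous starting phases the mask must be good enough for the flattening to be a meaningful constraint.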

After this procedure, automated model building, cycled with maximum-likelihood weighted refinement of the partial model to further improve the phasing, can lead to a near-complete model in a very short time. The method is available in software packages such as ARP/wARP (Perrakis et al., 1999[Perrakis, A., Morris, R. & Lamzin, V. S. (1999). Nature Struct. Biol. 6, 458-463.]) and RESOLVE.

5. Finding the positions of the substructure atoms

These have to be deduced from Patterson or direct-methods searches using coefficients derived from isomorphous, dispersive or anomalous differences or from a combination of these. The methodology has been described in detail in many places, e.g. Weeks et al. (2003[Weeks, C. M., Adams, P. D., Berendzen, J., Brunger, A. T., Dodson, E. J., Grosse-Kunstleve, R. W., Schneider, T. R., Sheldrick, G. M., Terwilliger, T. C., Turkenburg, M. & Usón, I. (2003). In the press.]) and references therein. To summarize, for isomorphous differences

[|F_{PH}| - |F_{P}| \simeq |F_H| \cos (\varphi_{T} - \varphi_{H})]

and for anomalous differences

[|F^+| - |F^-| \simeq 2 F''_{A}\sin (\varphi_{T} - \varphi_{A}).]

Thus, in principle, the positions of anomalous scatterers can be found from the Bijvoet differences for single-wavelength data. There are several powerful automated search procedures such as those programmed in SnB (Miller et al., 1994[Miller, R., Gallo, S. M., Khalak, H. G. & Weeks, C. M. (1994). J. Appl. Cryst. 27, 613-621.]), SOLVE (Terwilliger & Berendzen, 1999[Terwilliger, T. C. & Berendzen, J. (1999). Acta Cryst. D55, 849-861.]), SHELXD (Schneider & Sheldrick, 2002[Schneider, T. R. & Sheldrick, G. M. (2002). Acta Cryst. D58, 1772-1779.]) or ACORN (Foadi et al., 2000[Foadi, J., Woolfson, M. M., Dodson, E. J., Wilson, K. S. & Yao, J.-X. (2000). Acta Cryst. D56, 1137-1147.]) which are usually successful, even with incomplete difference data extending to a resolution sufficient to separate the sites, perhaps to 3.5 Å. However, all depend primarily on the large differences and are likely to fail if there are even a few overestimated outliers. Collecting multiple observations seems to be almost essential for detecting and removing such rogue reflections. Fortunately, Patterson search methods work well when there are only a few substructure sites, whereas direct-methods procedures work best when there are many sites scattered throughout the unit cell, as is often the case for Se substitution. If the differences are reliable, it is possible to find many sites.

6. Estimation of the amount of anomalous signal in diffraction data

The mean ratio of the Bijvoet difference to the total protein amplitude is

[\langle \Delta F^{\pm}\rangle/\langle F \rangle = 2^{1/2}(N_{A}^{1/2}f''_{A})/[N_{P}^{1/2}f_{\rm eff}(\theta)],]

where [f_{\rm eff} = (1/N)\textstyle \sum f_{i}] is the effective scattering of an average atom at diffraction angle θ. The anomalous scattering signal [f''] does not depend on the resolution, but [f_{\rm eff}] decreases with increasing resolution and thus the percentage of anomalous signal could be expected to increase at high resolution, especially if the temperature factors of the anomalous scatterers are lower than the average value for all atoms of the macromolecule. However, weak intensities, which are more likely at high resolution, are measured with lower accuracy, spoiling the practical advantage of these effects. The true ΔF± values are often of the same order as the measurement errors, leading to a seriously overestimated 〈ΔF±〉. If δ represents the measurement error, ΔF[^{\pm}_{\rm obs}] = ΔF[^{\pm}_{\rm true}] ± δ, so 〈ΔF[^{\pm}_{\rm obs}]〉 = ([\Delta F^{\pm2}_{\rm true}] + δ²)^{1/2}.
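The expected signal and its inflation by measurement error are easily evaluated before an experiment. The sketch below assumes an effective scattering of 6.7 electrons per average protein atom at zero scattering angle, a commonly used round figure, and a hypothetical count of about 700 non-H atoms for a 90-residue fragment; both numbers and the function names are assumptions and not values reported in this paper.

```python
import math

def expected_bijvoet_ratio(n_anom, f2_anom, n_protein_atoms, f_eff=6.7):
    """Mean Bijvoet difference divided by mean amplitude (a fraction, not %)."""
    return (math.sqrt(2.0) * math.sqrt(n_anom) * f2_anom
            / (math.sqrt(n_protein_atoms) * f_eff))

def observed_rms_difference(true_diff, error):
    """Apparent difference when the true signal and the error add in quadrature."""
    return math.hypot(true_diff, error)

# Nine S atoms with f'' = 0.5 in a protein of ~700 non-H atoms (assumed count).
ratio = expected_bijvoet_ratio(9, 0.5, 700)
# A true difference of 1.0 measured with an error of 1.0 (arbitrary units).
inflated = observed_rms_difference(1.0, 1.0)
```

Under these assumptions the ratio is about 1.2%, and an error equal in size to the true difference inflates the apparent difference by a factor of about 1.4.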

The pattern of the average ratios of anomalous difference to total amplitude, 〈ΔF±〉/〈F〉, for some of the test structures is shown as a function of resolution in Fig. 2.

[Figure 2]
Figure 2
The calculated and observed 〈ΔF±〉/〈F〉 ratio as a function of resolution for (a) Cbm27(i), (b) ProtE and (c) Cel5A. Cbm27(i) and ProtE were successfully phased; although the high-resolution differences were overestimated, they provided some phase information which was then improved by density modification. The very weak signal from the S atoms in Cel5A was not sufficient to trigger useful phasing.

In practice, the significance of the anomalous signal contained in the measured set of intensities can be roughly estimated at the data-merging stage. If Friedel mates are treated as equivalent, the true differences between the intensities of the Friedel-related reflections will lead to increased merging R factors and distorted normal probability plots compared with the results obtained when the Friedel mates are kept separate. Also, the list of potential outliers should reveal significant and consistent differences between some of the Bijvoet-related intensities.

An elegant method for assessing data quality, suggested by Schneider & Sheldrick (2002[Schneider, T. R. & Sheldrick, G. M. (2002). Acta Cryst. D58, 1772-1779.]), is to verify that the anomalous signal from different sets of measurements is correlated. The data sets can either come from different wavelength measurements or the data can be arbitrarily partitioned. It is of course necessary to have some level of multiplicity; i.e. anomalous pairs must have been measured more than once. An illustration of this is given in Fig. 3. They found in practice that once the correlation falls below 0.25 the differences are too unreliable to be useful in placing the substructure or in estimating phases.

[Figure 3]
Figure 3
The correlation between the anomalous signal for three data sets collected for Cbm27(i). If there was no error, the correlation would be 100%. When it falls below 25% there will be little useful phasing and below 40% it is not useful to position the substructure. In this case, only data to 3.5 Å were used to find the Se sites.
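The correlation test can be sketched as a Pearson correlation of the anomalous differences from two independent sets of measurements, evaluated in resolution shells. This is an illustration only and not the implementation used in any particular program; the function names, the data layout and the small synthetic data set are assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def anomalous_cc_by_shell(pairs, shells):
    """Correlate anomalous differences from two sets in resolution shells.

    pairs  : list of (d_spacing, delta_f_set1, delta_f_set2) per reflection
    shells : list of (d_max, d_min) in Angstrom; d_min <= d < d_max
    Returns a list of (d_max, d_min, n_reflections, cc or None).
    """
    results = []
    for d_max, d_min in shells:
        in_shell = [(a, b) for d, a, b in pairs if d_min <= d < d_max]
        if len(in_shell) < 3:  # too few reflections for a meaningful value
            results.append((d_max, d_min, len(in_shell), None))
            continue
        cc = pearson([a for a, _ in in_shell], [b for _, b in in_shell])
        results.append((d_max, d_min, len(in_shell), cc))
    return results

# Synthetic example: consistent differences at low resolution, noise at high.
pairs = [(6.0, 1.0, 1.1), (5.5, -2.0, -1.9), (5.0, 3.0, 2.8), (4.5, -1.0, -1.2),
         (3.0, 0.5, -0.4), (2.8, -0.5, 0.1), (2.5, 0.2, 0.3), (2.2, -0.3, 0.2)]
cc_table = anomalous_cc_by_shell(pairs, [(100.0, 4.0), (4.0, 2.0)])
```

The shell in which the correlation drops towards the thresholds quoted above gives a practical resolution limit for the differences used in the substructure search.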

7. Test examples to assess success or failure

There is often a wide gap between theory and practice and this is especially so with SAD phasing. There are many successful applications discussed in Dauter et al. (2002[Dauter, Z., Dauter, M. & Dodson, E. (2002). Acta Cryst. D58, 494-506.]). In most of these cases, the data were of extremely high quality. The following examples have been chosen to examine the power of the technique with more 'normal' data sets and to allow us to pinpoint the reasons for success or failure as a prerequisite for designing better protocols. In fact, like molecular replacement, applications seem often to be either trivial or impossible! The method certainly works well if data quality is excellent, if there are reasonable experimental phases extending to 2 Å, if the solvent content is greater than 50% or if the diffraction data extend to 1.5 Å or beyond.

The statistics of diffraction data and phasing for each example data set are given in Tables 1 and 2. The amount of anomalous signal in several of the data sets is illustrated in Fig. 2, where the average ratio of anomalous difference to total amplitude, 〈ΔF±〉/〈F〉, is given as a function of resolution. The results reported here are for models phased using MLPHARE and with the density-modification steps performed with DM. All models were refined with REFMAC (Murshudov et al., 1997[Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Acta Cryst. D53, 240-255.]) and the phase comparisons and map correlations are all performed against these models. All this software is available within the CCP4 suite (Collaborative Computational Project, Number 4, 1994[Collaborative Computational Project, Number 4 (1994). Acta Cryst. D50, 760-763.]). The failures, Cbm27(ii) and Cel5A, were also phased using the more sophisticated procedures coded in SHARP and SOLOMON, but without success.

Table 1
Statistics of X-ray data

Values in parentheses are for the highest resolution shell.

Crystal                       Cbm27(i)       ProtE(i)       ProtE(ii)      Cbm27(ii)      Lipase         Cel5A
Space group                   P41212         I4122          I4122          P212121        P43212         P212121
Unit-cell parameters (Å)
  a (Å)                       69.9           70.2           70.2           31.7           92.2           55.3
  b (Å)                       69.9           70.2           70.2           48.9           92.2           69.7
  c (Å)                       229.5          71.9           71.9           96.9           299.4          76.9
Wavelength (Å)                0.98           1.54/0.93      0.93           0.98           0.98           1.54
Resolution (Å)                2.00           1.7/1.3        1.3            1.78           2.10/2.78      1.68
Measured reflections          422308         162960         110389         97781          -              184654
Unique reflections            38522          11210          22383          15281          75988/33124    33982
Multiplicity                  5.2            13             5.0            7.2            -              5.3
Rmerge (%)                    7.2 (34.7)     4.5 (34.8)     4.2 (47.0)     6.6 (26.9)     -              3.7 (12.5)
I/σ(I)                        10.5 (2.7)     23.6 (3.2)     28.2 (5.3)     6.4 (2.2)      -              40.7 (13.8)
Completeness (%)              100.0 (100.0)  99.9 (99.8)    99.9 (89.6)    96.0 (76.0)    99.9 (—)       98.3 (95.9)
Completeness (anomalous) (%)  99.1 (99.4)    70.4 (69.2)    76.8 (68.9)    96.0 (71.0)    -              95.7 (91.2)
[R_{\rm merge} = \textstyle \sum_{h}\sum_{i} |I_{i} - \langle I \rangle|/\sum_{h}\sum_{i}|I_{i}|].

Table 2
Details of SAD phasing

FOM, overall figure of merit after MLPHARE or DM. CC, correlation coefficient between (Fobs, φcalc) Fourier map and the map calculated after DM. Δφ, average difference between phases calculated from the refined model and those obtained from MLPHARE or DM.

Crystal Cbm27(i) ProtE(i) ProtE(ii) Cbm27(ii) Lipase Cel5A
Wavelength (Å) 0.98 1.54 0.93 0.98 0.98 1.54
Resolution (Å) 2.00 1.7 1.3 1.78 2.10 (2.78) 1.68
Protein size (kDa) 80 (2 mols) 9 9 40 45 34
Solvent fraction 0.55 0.52 0.52 0.34 0.55 0.42
Substructure 2 Se 9 S 9 S 1 Se 2 U, 4 Au 8 S, 1 Br
[f''] (electron units) 4 0.5 0.2 4 0.5
〈ΔF±〉/〈F〉 est. (%) ∼4 ∼1.5 ∼4 3.0 ∼1
FOM MLPHARE 0.1 0.1 0.3 0.4 (0.34) 0.03
FOM DM 0.80 0.55 0.65 0.75/0.35 (0.75/0.55) 0.7
CC DM 0.86 0.52 0.51 (ACORN, 0.58) 0.37 0.79 (0.5) 0.25
Δφ MLPHARE (°) 75.4 75.1 65.0 (2.8 Å) 68 (72) 78
Δφ DM (°) 43.0 61.9 70.0 (2.8 Å) 71 (50) 78

7.1. Success – Cbm27(i) crystal form A: anomalous signal of Se

These well diffracting crystals have two copies of the molecule in the asymmetric unit, with 55% solvent. Se-containing protein was prepared and a MAD phasing experiment was carried out at the ESRF (Boraston et al., 2003[Boraston, A. B., Revett, T. J., Boraston, C. M., Nurizzo, D. & Davies, G. J. (2003). In the press.]). Two sites were positioned from the Patterson, with two more somewhat disordered ones positioned later from difference Fouriers, and the structure was phased in a straightforward manner using MLPHARE and DM with and without averaging. The phasing protocol was repeated using data from only one wavelength, which also gave excellent maps of comparable quality. The comparisons of the relative phase errors after the different procedures are illustrated in Fig. 4.

[Figure 4]
Figure 4
The phase differences for Cbm27(i) between those calculated from the final model and those derived from different phasing procedures. The initial phase error after SAD phasing was almost 10° greater than that for the MAD phasing, but the density-modification errors were very similar. Averaging the density for the two molecules in the asymmetric unit gave a further improvement.

7.2. Partial success – ProtE (E-fragment of human fibrinogen): anomalous signal of S

Excellent crystals of ProtE, a 90-residue fragment of human fibrinogen, were available, with one molecule in the asymmetric unit and 52% solvent (Brzozowski, 2003[Brzozowski, M. (2003). In preparation.]). The fragment contained nine S atoms, one methionine and four disulfide bridges and therefore seemed an ideal case for SAD phasing from the S signal alone. Highly redundant data were collected in-house to 1.7 Å. In fact, the structure was solved by molecular replacement before the experimental phasing was completed, but the phasing exercise was also successful. The sites for the four disulfides and the Met S atom were difficult to find and in somewhat special positions. We carried out many unsuccessful searches with different resolution limits and exclusion criteria. The correct set was found using carefully screened anomalous differences to 2.5 Å as input to SHELXD. All observations less than 3σ were excluded and any differences greater than four times the mean value for the resolution range were omitted. (This is equivalent to only using E values of less than 4 for the direct-methods step in SHELXD.) It is hard to judge success or failure at this stage, except by trying the phasing from many solutions. The 'anomalous Cullis R factor', the ratio of the anomalous lack of closure to observed anomalous difference plotted by the MLPHARE program, is a useful criterion. This gives |Δ[F^{\pm}_{\rm obs}] − Δ[F^{\pm}_{\rm calc}]|/|Δ[F^{\pm}_{\rm obs}]| as a function of resolution. In our experience, for correct solutions it should be less than 0.65 for at least the low-resolution bins. Anisotropic refinement of the correct sites to 1.7 Å using MLPHARE indicated how to split the disulfides. The SAD FOM fell off rapidly after 3 Å, but provided sufficient information for the density modification to improve the phases dramatically. The overall map correlation leapt from 0.3 for the SAD phases to 0.55. This is illustrated in Fig. 5.

An interesting extension of the structure-solution method was provided by ACORN. Synchrotron data were available to 1.3 Å and this was sufficient for ACORN to generate an excellent phase set starting from the S positions alone.
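The screening of the differences and the Cullis criterion described above can be sketched as follows. The interpretation is an assumption: the 3σ test is applied to the anomalous difference itself, the shell mean is taken over the differences that survive that test, and the lack of closure is summed over reflections. The function names and the example numbers are hypothetical, and the code does not reproduce the internals of SHELXD or MLPHARE.

```python
from collections import defaultdict

def screen_anomalous_differences(reflections, sigma_cut=3.0, max_ratio=4.0):
    """Filter anomalous differences before a substructure search.

    reflections: list of (shell_index, delta_f, sigma_delta_f).
    Drops |delta_f| < sigma_cut * sigma, then drops any difference larger
    than max_ratio times the mean |delta_f| of its resolution shell.
    """
    significant = [r for r in reflections if abs(r[1]) >= sigma_cut * r[2]]
    by_shell = defaultdict(list)
    for shell, delta_f, _ in significant:
        by_shell[shell].append(abs(delta_f))
    shell_mean = {s: sum(v) / len(v) for s, v in by_shell.items()}
    return [r for r in significant if abs(r[1]) <= max_ratio * shell_mean[r[0]]]

def anomalous_cullis_r(delta_obs, delta_calc):
    """Summed anomalous lack of closure over summed observed difference."""
    closure = sum(abs(o - c) for o, c in zip(delta_obs, delta_calc))
    return closure / sum(abs(o) for o in delta_obs)

# Hypothetical shell containing one weak difference (0.3) and one outlier (12.0).
values = [1.0, -1.2, 0.9, -1.1, 1.0, -0.8, 1.1, -1.0, 12.0, 0.3]
kept = screen_anomalous_differences([(0, v, 0.2) for v in values])
r_cullis = anomalous_cullis_r([1.0, -1.2], [0.8, -1.0])
```

In this example the weak difference fails the 3σ test and the single large outlier is removed by the shell-mean test, leaving eight differences for the search.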

[Figure 5]
Figure 5
The map correlations for ProtE for the main-chain residues of fragment B. The best agreement is for the 1.3 Å ACORN-generated set. The density modification after SAD phasing to 1.7 Å also gave greatly improved agreement with the final model.

7.3. Total failure – Cbm27(ii) crystal form B: anomalous signal of Se

A second crystal form of Cbm27 with one molecule in the asymmetric unit and 34% solvent was obtained which diffracted better than form A. Again, Se-containing protein was prepared and a SAD phasing experiment carried out at the ESRF. The major sites were positioned from the Patterson and the structure phased as before. However, this time, with such low solvent content, the density modification did not give any significant phase improvement, the overall phase error stuck at more than 60° and it would have been difficult to interpret the resultant maps. The structure was easily solved by molecular replacement using one of the Cbm27 form A molecules as a model.

7.4. Partial failure – a lipase solved by isomorphous phasing

This example, provided by Jan Dohnalek (private communication), is included only to illustrate that whilst density modification is a powerful phase-improvement technique, it is much more effective if the starting phase set is of reasonable quality and extends to higher resolution. This structure has a solvent content of 55% and was solved by isomorphous replacement. Native data were available to 2.1 Å and U- and Au-derivative data were collected, at first only to 2.8 Å. The sites were easily obtained from a Patterson search, but the substitution was incomplete and the average figure of merit was 0.31. The map was not sufficiently clear to allow the tracing of an accurate solvent boundary and density-modification procedures failed either to improve the experimental phases or to extend the phase set to higher resolution. Once a second set of derivatives was prepared with more concentrated solutions, resulting in higher substitution, and derivative data were collected to 2.1 Å, the figure of merit increased by about 0.15 in all resolution ranges. The same density-modification procedure now rapidly improved these MIRAS phases. The overall map correlation to the final model increased from 0.45 to 0.7, with very few breaks in the chain density. It demonstrated that the initial phase set must be good enough to allow the solvent boundary to be recognized.

7.5. Phasing failure – Cel5A complex

As a final test of the power of SAD phasing from S atoms, we investigated a complex of Cel5A (Varrot, 2000). This had been solved very easily by molecular replacement to investigate the substrate binding, but since in-house anomalous data to 1.68 Å were available and the structure contained eight S atoms and a bromine, we used it to test whether SAD phasing from these sites with such an unoptimized data set would have been possible. However, this proved to be a complete failure: the FOM from the SAD phasing was only 0.1 at 3 Å and fell to near zero at 1.6 Å. The density modification gave no improvement, demonstrating that to exploit such a weak signal, considerable care and time must be given at the data-collection stage.

8. Conclusions

The introduction of more accurate automatic detectors, as well as the use of crystal cryoprotection techniques, has now made it possible to collect diffraction intensities very accurately. Precise control of the wavelength of synchrotron radiation allows both anomalous and dispersive differences to be varied. At the same time, phasing software has improved dramatically. These developments have contributed to the popularity of the MAD method of phasing.

However, it has been demonstrated in many examples that it is now feasible to obtain interpretable electron-density maps from intensities containing the anomalous signal within a single data set recorded using only one X-ray wavelength (SAD). The accuracy of measurements required for successful SAD phasing seems to be comparable with that in routine MAD experiments, but in contrast to the MAD method, the wavelength need not be so finely tuned and indeed it is possible to use Cu Kα radiation for many problems. At this wavelength, S and Br both have a detectable signal (\(f''_{A} \simeq 0.5\)) and most metals have a very significant \(f''_{A}\).
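To give a feeling for how small a signal an \(f''_{A}\) of about 0.5 represents, the sketch below evaluates the commonly quoted low-angle estimate of the Bijvoet ratio, \(\langle|\Delta F^{\pm}|\rangle/\langle|F|\rangle \simeq 2^{1/2} N_{A}^{1/2} f''_{A} / (N_{P}^{1/2} Z_{\rm eff})\), usually attributed to Hendrickson & Teeter (1981). The value \(Z_{\rm eff} = 6.7\) electrons for an average protein atom, the protein size of 2500 non-H atoms and the function name are all assumptions made for illustration and are not taken from the examples in this paper.

```python
import math

Z_EFF = 6.7  # assumed effective number of electrons per average protein atom


def bijvoet_ratio(n_anom: int, f_dd: float, n_protein_atoms: int,
                  z_eff: float = Z_EFF) -> float:
    """Rough expected <|F+ - F-|>/<F> at low angle for n_anom equal scatterers."""
    return (math.sqrt(2.0) * math.sqrt(n_anom) * f_dd
            / (math.sqrt(n_protein_atoms) * z_eff))


# Hypothetical case: eight S atoms (f'' taken as 0.5) in 2500 protein atoms.
ratio = bijvoet_ratio(n_anom=8, f_dd=0.5, n_protein_atoms=2500)
print(f"expected Bijvoet ratio: {100 * ratio:.2f}%")
```

For these assumed numbers the expected ratio is about 0.6%, which indicates why very accurate, highly redundant measurements are needed to exploit the signal from light atoms.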

This flexibility in wavelength used makes the experiment much less demanding; small fluctuations in the wavelength are not disastrous since the data do not need to be recorded at the peak of the anomalous scattering. Cross-correlation of the anomalous signal to assess the resolution limit for useful differences, routinely used in MAD data sets, can still be used by partitioning the observations randomly. This feature is already incorporated in the CCP4 program SCALA (Evans, 1997).
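The idea of correlating anomalous differences between random partitions of a single data set can be sketched as follows. This is an illustration on simulated numbers, not a description of how SCALA implements the statistic: each observation is assumed to be an already-formed Bijvoet difference, and all function names and noise levels are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def half_set_anomalous_cc(observations, rng):
    """Split each reflection's differences randomly in two and correlate the means.

    observations: dict mapping a reflection label to a list of measured
    anomalous differences (assumed input format).
    """
    half1, half2 = [], []
    for diffs in observations.values():
        if len(diffs) < 2:
            continue  # a single observation cannot be partitioned
        shuffled = list(diffs)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        first, second = shuffled[:mid], shuffled[mid:]
        half1.append(sum(first) / len(first))
        half2.append(sum(second) / len(second))
    return pearson(half1, half2)


def simulate(n_refl, n_obs, signal_sd, noise_sd, rng):
    """Simulated differences: one true value per reflection plus Gaussian noise."""
    data = {}
    for h in range(n_refl):
        true_diff = rng.gauss(0.0, signal_sd)
        data[h] = [true_diff + rng.gauss(0.0, noise_sd) for _ in range(n_obs)]
    return data


rng = random.Random(0)
cc_signal = half_set_anomalous_cc(simulate(2000, 8, 1.0, 1.0, rng), rng)
cc_noise = half_set_anomalous_cc(simulate(2000, 8, 0.0, 1.0, rng), rng)
print(f"CC with signal: {cc_signal:.2f}; CC for pure noise: {cc_noise:.2f}")
```

Evaluated in resolution shells on real data, a correlation that falls towards zero would suggest the limit beyond which the anomalous differences carry little useful information.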

However, the quality of a phase estimate from only two measurements is inevitably limited, with a near-bimodal probability distribution, no matter how accurate the data are. Success therefore depends crucially on the density-modification step, which moves the probability distribution towards a unimodal form and gives phase estimates for centric reflections, where the SAD experiment provides virtually no phasing information at all. Density-modification procedures are most successful when there is limited phase information across the whole range of resolution, so in some ways the SAD (or MAD) experiment is ideal when there is no perfect isomorphism. However, if the initial phasing is poor, the solvent content is too low or the sites are in special positions (e.g. all sites in a polar space group at the same height along the rotation axis), the procedure can fail.
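The bimodality can be made concrete with a small numerical sketch. It assumes the textbook small-signal approximation \(\Delta F^{\pm} \simeq 2F''_{A}\sin(\varphi - \varphi_{A})\) and a Gaussian error of standard deviation \(\sigma\) on the measured difference. The symbols, function names and numerical values are illustrative assumptions and do not reproduce the likelihood functions used by actual phasing programs.

```python
import math


def sad_phase_probability(delta_f, f_a_dd, phi_a_deg, sigma, step_deg=1):
    """Normalized P(phi) on a grid, given one measured anomalous difference.

    Assumes delta_f ~ 2 * f_a_dd * sin(phi - phi_a) with Gaussian error sigma.
    """
    phases = list(range(0, 360, step_deg))
    weights = []
    for phi in phases:
        calc = 2.0 * f_a_dd * math.sin(math.radians(phi - phi_a_deg))
        weights.append(math.exp(-((delta_f - calc) ** 2) / (2.0 * sigma ** 2)))
    total = sum(weights)
    return phases, [w / total for w in weights]


def centroid(phases, probs):
    """Figure of merit and centroid phase (degrees) of the distribution."""
    re = sum(p * math.cos(math.radians(phi)) for phi, p in zip(phases, probs))
    im = sum(p * math.sin(math.radians(phi)) for phi, p in zip(phases, probs))
    return math.hypot(re, im), math.degrees(math.atan2(im, re)) % 360.0


# Hypothetical reflection: the two phase solutions lie at 70 and 190 degrees.
phases, probs = sad_phase_probability(delta_f=10.0, f_a_dd=10.0,
                                      phi_a_deg=40.0, sigma=2.0)
fom, phi_best = centroid(phases, probs)
print(f"figure of merit {fom:.2f}, centroid phase {phi_best:.0f} deg")
```

The two peaks are equally probable and lie symmetrically about \(\varphi_{A} + 90°\), so the centroid phase falls between them and the figure of merit is limited to roughly the cosine of half their separation, however small \(\sigma\) becomes. Breaking this ambiguity is what the density-modification step has to achieve.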

It is also worth remembering that all crystallography becomes easier the higher the data resolution and in favorable cases it might prove a more effective strategy to collect one set of atomic resolution data and restrict the high-redundancy data sets required for experimental phasing techniques to a more limited resolution.

Acknowledgements

All workers in this field owe an immense debt of gratitude to Z. Dauter who demonstrated convincingly the value of SAD phasing in routine experiments. Members of the York Structural Biology Laboratory, in particular Marek Brzozowski, Jan Dohnalek, Didier Nurizzo and Annabelle Varrot, have provided both data and valuable discussions. The Wellcome Trust provided ED with funding.

References

Abrahams, J. P. & Leslie, A. G. W. (1996). Acta Cryst. D52, 30–42.
Blow, D. M. & Rossmann, M. G. (1961). Acta Cryst. 14, 1195–1202.
Blundell, T. L. & Johnson, L. N. (1976). Protein Crystallography. London: Academic Press.
Boggon, T. J. & Shapiro, L. (2000). Structure, 8, R143–R149.
Boraston, A. B., Revett, T. J., Boraston, C. M., Nurizzo, D. & Davies, G. J. (2003). In the press.
Brzozowski, M. (2003). In preparation.
Collaborative Computational Project, Number 4 (1994). Acta Cryst. D50, 760–763.
Cowtan, K. (1999). Acta Cryst. D55, 1555–1567.
Dauter, Z., Dauter, M. & Dodson, E. (2002). Acta Cryst. D58, 494–506.
Dodson, E. J. & Vijayan, M. (1971). Acta Cryst. B27, 2402–2411.
Drenth, J. (1999). Principles of Protein X-ray Crystallography, 2nd ed. Heidelberg: Springer.
Evans, P. R. (1997). CCP4 Newslett. Protein Crystallogr. 33, 22–24.
Fan, H.-F., Hao, Q., Gu, Y.-X., Qian, J.-Z., Zheng, C.-D. & Ke, H. (1990). Acta Cryst. A46, 935–939.
Foadi, J., Woolfson, M. M., Dodson, E. J., Wilson, K. S. & Yao, J.-X. (2000). Acta Cryst. D56, 1137–1147.
Fourme, R., Shepard, W. & Kahn, R. (1996). Prog. Biophys. Mol. Biol. 64, 167–199.
González, A., Pedelacq, J. D., Sola, M., Gomis-Rüth, F.-X., Coll, M., Samama, J. P. & Benini, S. (1999). Acta Cryst. D55, 1449–1458.
Hauptman, H. A. (1982). Acta Cryst. A38, 289–294.
Hauptman, H. A. (1996). Acta Cryst. A52, 490–496.
Hendrickson, W. A. (1979). Acta Cryst. A35, 245–247.
Hendrickson, W. A. (1991). Science, 254, 51–58.
Hendrickson, W. A. (1999). J. Synchrotron Rad. 6, 845–851.
Hendrickson, W. A. & Teeter, M. M. (1981). Nature (London), 290, 107–113.
James, R. W. (1958). The Optical Principles of the Diffraction by X-rays. London: Bell & Sons.
Karle, J. (1980). Int. J. Quant. Chem. 7, 357–367.
La Fortelle, E. de & Bricogne, G. (1997). Methods Enzymol. 276, 472–494.
Langs, D. A., Blessing, R. H. & Guo, D. Y. (1999). Acta Cryst. A55, 755–760.
Matthews, B. W. (1966). Acta Cryst. 20, 230–239.
Miller, R., Gallo, S. M., Khalak, H. G. & Weeks, C. M. (1994). J. Appl. Cryst. 27, 613–621.
Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Acta Cryst. D53, 240–255.
North, A. C. T. (1965). Acta Cryst. 18, 212–216.
Otwinowski, Z. (1991). Proceedings of the CCP4 Study Weekend. Isomorphous Replacement and Anomalous Scattering, edited by W. Wolf, P. R. Evans & A. G. W. Leslie, pp. 80–86. Warrington: Daresbury Laboratory.
Pannu, N. S., McCoy, A. J. & Read, R. J. (2003). Acta Cryst. D59, 1801–1808.
Perrakis, A., Morris, R. & Lamzin, V. S. (1999). Nature Struct. Biol. 6, 458–463.
Phillips, J. C. & Hodgson, K. O. (1980). Acta Cryst. A36, 856–864.
Ramachandran, G. N. & Raman, S. (1956). Curr. Sci. 25, 348–351.
Schneider, T. R. & Sheldrick, G. M. (2002). Acta Cryst. D58, 1772–1779.
Sim, G. A. (1959). Acta Cryst. 12, 813–815.
Terwilliger, T. C. (2000). Acta Cryst. D56, 965–972.
Terwilliger, T. C. & Berendzen, J. (1999). Acta Cryst. D55, 849–861.
Varrot, A. (2000). PhD thesis. University of York, England.
Wang, B.-C. (1985). Methods Enzymol. 115, 90–112.
Weeks, C. M., Adams, P. D., Berendzen, J., Brunger, A. T., Dodson, E. J., Grosse-Kunstleve, R. W., Schneider, T. R., Sheldrick, G. M., Terwilliger, T. C., Turkenburg, M. & Usón, I. (2003). In the press.

© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.
