Jolly SAD

© 2002 International Union of Crystallography. Printed in Denmark; all rights reserved.

Examples of phasing macromolecular crystal structures based on single-wavelength anomalous dispersion (SAD) show that this approach is more powerful and may have more general application in structural biology than was anticipated. Better data-collection facilities and cryogenic techniques, coupled with powerful programs for data processing, phasing, density modification and automatic model building, mean that the SAD approach may gain wide popularity owing to its simplicity, less stringent wavelength requirements and faster data collection and phasing than the multi-wavelength (MAD) approach. It can be performed at any wavelength where anomalous scattering can be observed, in many cases using laboratory X-ray sources.

Received 14 September 2001. Accepted 21 January 2002.


Introduction
Formally, an experimental phase for any reflection can be uniquely estimated from three measurements of the amplitudes, provided that the coordinates of a model describing the differences can be determined. The method of multiple isomorphous replacement (MIR) uses related crystals where additional atoms, usually heavy metals, are soaked into the crystal from appropriate salt solutions. There is always a problem with isomorphism with respect to the native crystal; the salts often cause other rearrangements within the lattice apart from introducing the heavy atoms. This approach was augmented by exploiting the anomalous dispersion differences between $F(h, k, l)$ and $F(-h, -k, -l)$ resulting from the resonant scattering of the heavy atoms within a derivative (single isomorphous replacement with anomalous scattering; SIRAS). Such differences were not affected by non-isomorphism, but were usually much weaker than the isomorphous differences and therefore harder to determine accurately.
The multi-wavelength anomalous dispersion (MAD) method uses only the wavelength dependence of the atomic structure factor of the anomalously scattering atoms for solving the phase problem (Phillips & Hodgson, 1980; Karle, 1980; Hendrickson, 1991, 1999). In this approach, usually three or four data sets are collected at various wavelengths around the absorption edge of the anomalous scatterer present in the crystal and the differences in the $f'$ and $f''$ contributions are utilized for phase calculation. Such MAD experiments are possible only at synchrotron X-ray sources, where the X-ray wavelength can be tuned to the desired values. Ideally, all measurements should be made from a single crystal, which must therefore be robust and able to survive lengthy irradiation. This almost always requires cryogenic techniques. The anomalous scatterer for MAD may be inherently contained in the metalloprotein (e.g. Zn, Cu or Fe), introduced by soaking (classic heavy atoms, e.g. Hg, Pt, Au compounds or Br ions introduced into the ordered solvent shell) or by genetic or chemical modification, as for selenomethionine in proteins or bromouracil in DNA (Boggon & Shapiro, 2000). If conditions are favorable, the phasing power is excellent and MAD is presently the method of choice for solving new macromolecular crystal structures.
In the present era of high-throughput structure determination, new ideas are being put forward to make more efficient use of synchrotron time. It has been proposed (González et al., 1999) that good phase estimates may be obtained by collecting more accurate data at fewer wavelengths. In some cases, data collected at one wavelength were sufficient to determine the phases of both test and novel structures, as demonstrated by the solution of the structure of crambin (Hendrickson & Teeter, 1981) and as advocated by B. C. Wang in his classic work (Wang, 1985). In this paper, we demonstrate that this approach of single-wavelength anomalous dispersion (SAD), coupled with increasingly powerful phasing and density-modification algorithms (de La Fortelle & Bricogne, 1997; Hauptman, 1996; Langs et al., 1999; Cowtan, 1999; Terwilliger, 2000), can solve the phase problem for macromolecular structures and that the methodology is generally useful.

Background of phase determination
This is covered in several reviews (e.g. Blundell & Johnson, 1976; Drenth, 1999) and only a brief outline is presented here. X-rays are diffracted by atoms positioned within a crystal lattice. Most diffraction arises from the electrons surrounding the atomic nucleus and since this electron cloud has a radius comparable to the X-ray wavelength, the contribution falls off at higher diffraction angles, i.e. at higher resolution. This is represented by the atomic form factor. Such a signal from the whole atom is isotropic and can be treated as a real number, $f^0(\theta)$.
If X-rays can excite those electrons which are able to jump from a lower to a higher energy shell, an auxiliary resonant anomalous signal is observed and the atomic form factor can be expressed as a complex number $f' + if''$. Generally, $f''$ is proportional to the atomic absorption of the X-rays and to their fluorescence and $f'$ follows the derivative of this function, according to the Kramers-Kronig transformation (James, 1958). In contrast to the normal atomic scattering factor $f^0$, the anomalous dispersion corrections $f'$ and $f''$ depend only on the wavelength $\lambda$ of the X-rays used for the diffraction experiment and do not diminish with the diffraction angle. The full atomic form factor is

$$f = f^0(\theta) + f'(\lambda) + i f''(\lambda).$$

The anomalous effect increases both with the atomic number of the scatterer and as the X-ray energy (inversely proportional to the wavelength) moves to the point where it corresponds to the resonant energy for the excitation of electrons from particular orbital levels. As the X-ray energy moves below such an absorption edge, the absorption and anomalous scattering effects diminish abruptly. The absorption edges are classified according to the electron energy shells as $K$, $L_{\mathrm{I}}$, $L_{\mathrm{II}}$, $L_{\mathrm{III}}$ or $M$ edges. Moreover, for some atom types such as the lanthanides the $f''$ spectrum near $L$ edges shows additional features, often increasing significantly before dropping down abruptly beyond the edge. These white lines in $f''$ can be significantly higher than the predicted level. In most macromolecules, there are few anomalous scatterers and their anomalous dispersion generates only small differences in intensity, so the diffraction data have to be measured very accurately to allow these differences to be used for phasing.
When the atomic form factor is real, i.e. the $f''$ contribution is zero, it is clear from the structure-factor equation that $F(h, k, l)$ and $F(-h, -k, -l)$ will have the same magnitude and that $\varphi(h, k, l) = -\varphi(-h, -k, -l)$. This is known as Friedel's law. However, when the form factor contains an imaginary contribution $if''$, the reflections $F(h, k, l)$ and $F(-h, -k, -l)$ will have different intensities and their phases are no longer complementary. In the MAD technique, several data sets are measured at different wavelengths $\lambda_i$, with different values for the dispersive difference $f'$ and the anomalous difference $f''$. This results in differences between both the reflection amplitudes at the different wavelengths and between the amplitudes of the Friedel-related reflections measured at the same wavelength (Bijvoet differences). Once the positions of the anomalous scatterers are known, the protein phases for the reflections can be derived from these amplitude differences in an analogous way to the well established MIRAS approach.
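The breakdown of Friedel's law can be checked numerically. The following is a minimal sketch, assuming a hypothetical three-atom acentric model and illustrative values of $f'$ and $f''$ (none of these numbers come from the data sets in this paper); the function name `structure_factor` is introduced here for illustration only.

```python
import numpy as np

def structure_factor(hkl, xyz, f):
    """F(hkl) = sum_j f_j * exp(2*pi*i * (h*x_j + k*y_j + l*z_j))."""
    phase = 2.0 * np.pi * (np.asarray(xyz, dtype=float) @ np.asarray(hkl, dtype=float))
    return np.sum(np.asarray(f, dtype=complex) * np.exp(1j * phase))

# Hypothetical acentric three-atom model in fractional coordinates.
xyz = [[0.10, 0.20, 0.30], [0.40, 0.15, 0.70], [0.65, 0.80, 0.25]]
hkl, minus_hkl = (1, 2, 3), (-1, -2, -3)

# Real form factors only: Friedel's law holds.
f_real = [6.0, 7.0, 16.0]
Fp = structure_factor(hkl, xyz, f_real)
Fm = structure_factor(minus_hkl, xyz, f_real)

# Third atom scatters anomalously: f = f0 + f' + i f'' (illustrative values).
f_anom = [6.0, 7.0, 16.0 + 0.3 + 0.56j]
Fp_a = structure_factor(hkl, xyz, f_anom)
Fm_a = structure_factor(minus_hkl, xyz, f_anom)

print(abs(Fp) - abs(Fm))      # essentially zero: equal magnitudes
print(abs(Fp_a) - abs(Fm_a))  # non-zero Bijvoet difference
```

With real form factors the two magnitudes agree and the phases are opposite; adding an imaginary component to a single atom is enough to make the Friedel mates differ in amplitude.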
The procedure has two independent stages. Firstly, the positions of anomalous scatterers have to be deduced from Patterson or direct-methods searches using coefficients derived from either dispersive or anomalous differences or from a combination of both. In principle, the positions of anomalous scatterers can be found from the Bijvoet differences for single-wavelength data (Rossmann, 1961; Mukherjee et al., 1989).
Once a model of anomalous scatterers is obtained, the partial structure needs to be refined to improve its ability to predict the observed differences. Simultaneously, it is used to deduce protein phases from these differences and from the calculated anomalous model phases (Blundell & Johnson, 1976; Drenth, 1999). Formally, a phase can be determined from three observations and the appropriate models for MAD, MIR or SIRAS experiments, but the errors in both measurements and models mean that it is essential to use a probabilistic approach to obtain appropriate weights for the phase information. If there is only one partial structure model (as is the case for SAD, MAD or SIRAS experiments), the correct enantiomer can usually only be chosen by assessing which

Single-wavelength phasing
It is formally not possible to evaluate the protein phases exactly if only two experimental measurements are available. This is the case when the data are restricted to one wavelength (SAD), with only the Bijvoet difference available, or in the SIR case, when only the native and one derivative data set are measured. Even assuming that the measured protein amplitudes, $F^+$ and $F^-$, and the calculated amplitude and phase contributions of the anomalous partial structure, $F_A$ and $\varphi_A$, respectively, are error-free, there will be a twofold ambiguity in the estimation of the protein phase (Ramachandran & Raman, 1956). Fig. 1 shows that for the SAD case with all anomalous scatterers of the same kind, the two possible phase values of the protein structure factor $\varphi_T$ are symmetrically oriented around $(\varphi_A - 90°)$. The phase error for either solution is $(\varphi_T - \varphi_A + 90°)$ and the figure of merit is $\cos(\varphi_T - \varphi_A + 90°)$.
Analogously, for the SIR case the two possible values of the protein phase are symmetrically oriented about the heavy-atom phase $\varphi_H$. The protein phase is only determined uniquely when the protein and anomalous phases differ by 90° and both solutions coincide. This corresponds to the maximum possible Bijvoet difference.
The relation between the Bijvoet difference $\Delta F^{\pm}$, the phase of the protein $\varphi_T$ and that of the anomalous substructure $\varphi_A$ can be deduced from Fig. 1 (following Hendrickson, 1979). If the contribution of the anomalous scattering to the total diffracting power of the crystal is small, $F_A \ll F_T$, then $(|F^+| + |F^-|)/2 \approx F_T$ and

$$\Delta F^{\pm} = |F^+| - |F^-| \approx 2F''_A \sin(\varphi_T - \varphi_A).$$

When $\varphi_T \neq \varphi_A \pm 90°$ there is an ambiguity, since $\sin(\varphi_T - \varphi_A) = \sin(180° - \varphi_T + \varphi_A)$ and, following Ramachandran & Raman (1956), [...]. However, if the anomalous scatterers make up a substantial part of the structure, $\langle F_A\rangle$ is comparable in magnitude to $\langle F_T\rangle$ and $\varphi_T$ will be correlated with $\varphi_A$. In the limiting case, when the whole structure consists of identical anomalous scatterers, $\varphi_T = \varphi_A$, $|F^+| = |F^-|$ and no Bijvoet differences are observed. The probability of phase distribution resulting from anomalous scattering (Hendrickson, 1979) can be expressed as

$$P_{\mathrm{anom}}(\varphi) = N \exp\{-[\Delta F^{\pm} - 2F''_A \sin(\varphi_T - \varphi_A)]^2 / 2E^2\},$$

where $N$ is the normalizing factor and $E$ the standard error estimate.
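The twofold ambiguity can be made concrete by inverting the sine relation between the Bijvoet difference and the phase difference. The sketch below assumes the small-substructure approximation ΔF± ≈ 2F″A sin(φT − φA); the function name `sad_phase_choices` and the clamping of the sine argument are choices made here for illustration, not part of any program cited in the text.

```python
import math

def sad_phase_choices(delta_f, f2_a, phi_a_deg):
    """Return the two protein phases (degrees) that satisfy
    delta_f = 2 * f2_a * sin(phi_T - phi_A).

    delta_f   : Bijvoet difference |F+| - |F-|
    f2_a      : amplitude of the imaginary anomalous contribution F''_A
    phi_a_deg : phase of the anomalous substructure, in degrees
    """
    x = delta_f / (2.0 * f2_a)
    x = max(-1.0, min(1.0, x))  # measurement error can push |x| above 1
    theta = math.degrees(math.asin(x))
    near = (phi_a_deg + theta) % 360.0          # solution closer to phi_A
    far = (phi_a_deg + 180.0 - theta) % 360.0   # its mirror image
    return near, far

print(sad_phase_choices(1.0, 1.0, 30.0))  # two distinct candidate phases
print(sad_phase_choices(2.0, 1.0, 30.0))  # maximal difference: both coincide
```

The two candidates are mirror images about the axis perpendicular to φA, and they merge into one when the Bijvoet difference reaches its maximum possible value, 2F″A.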
Of the two possibilities resulting from the sine ambiguity, there is a slightly higher probability that the protein phase $\varphi_T$ has the value closer to $\varphi_A$. This is the basis for the proposal of Ramachandran & Raman (1956) that the value of $\varphi_T$ closer to $\varphi_A$ should be selected for initial phasing. Sim (1959) derived the statistical probability of the protein phase estimated from the known anomalous partial structure,

$$P_{\mathrm{par}}(\varphi) = N \exp[2F_T F_A \cos(\varphi_T - \varphi_A)/\Sigma_U],$$

where $\Sigma_U$ is the contribution of the normally scattering (unknown) atoms.
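How the partial-structure information tips the balance between the two SAD solutions can be sketched by multiplying the two probability distributions on a phase grid. This is an illustration only: it assumes a Gaussian form for the anomalous term, with the sign chosen to agree with ΔF± ≈ 2F″A sin(φT − φA), and the standard Sim form exp[2 F_T F_A cos(φT − φA)/Σ_U] for the partial-structure term. The function name and all numerical values are hypothetical.

```python
import numpy as np

def sad_phase_probability(delta_f, f2_a, phi_a_deg, e, f_t, f_a, sigma_u, step=1.0):
    """Return (phi grid in degrees, normalized P_anom, normalized P_anom * P_par)."""
    phi = np.arange(0.0, 360.0, step)
    d = np.radians(phi - phi_a_deg)
    # Anomalous term: bimodal, peaks where 2 F''_A sin(d) matches delta_f.
    p_anom = np.exp(-(delta_f - 2.0 * f2_a * np.sin(d)) ** 2 / (2.0 * e ** 2))
    # Sim-type partial-structure term: unimodal, centred on phi_A.
    p_par = np.exp(2.0 * f_t * f_a * np.cos(d) / sigma_u)
    p_comb = p_anom * p_par
    return phi, p_anom / p_anom.sum(), p_comb / p_comb.sum()

phi, p_anom, p_comb = sad_phase_probability(
    delta_f=1.0, f2_a=1.0, phi_a_deg=30.0, e=0.2,
    f_t=20.0, f_a=5.0, sigma_u=400.0)
best = phi[np.argmax(p_comb)]
print(best)
```

In this example the anomalous term alone gives two equally likely phases (60° and 180°); the combined distribution favours the one nearer to φA, which is the choice proposed by Ramachandran & Raman (1956).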
The above relations are illustrated for a few test data sets in Fig. 2, where the differences of phases calculated for the whole model and from the anomalous scatterers only are shown as a function of experimentally measured Bijvoet differences. The sinusoidal dependence between $\Delta F^{\pm}$ and $(\varphi_T - \varphi_A)$ shows that when the partial structure of anomalous scatterers constitutes a larger fraction of the total scattering, as for Li₂SO₄ ($Z_S^2/\sum Z_i^2$ = 38%) or ferredoxin (21%), the protein phases ($\varphi_T$) tend to be closer to the anomalous scatterers' phase ($\varphi_A$) than when they amount to only a small fraction, as for lysozyme (9%).

Figure 1. The Argand diagram showing various contributions to the scattering factors. The measured amplitudes of both Friedel mates ($F^+$ and $F^-$) are shown in blue, the total normal scattering ($F_T$) in black, the contribution of normally scattering atoms ($F_N$) in green and that of the anomalous scatterers ($F_A$, $F'_A$ and $F''_A$) in red. If the anomalous substructure has been solved, the red vectors are known in length and phase. Two solutions are then possible, giving the same Bijvoet difference $|F^+| - |F^-|$, with the total phase, $\varphi_T$, symmetrically placed on both sides of $\varphi_A - 90°$. These two solutions have a different contribution of the normal scatterers, $F_N$.
Several approaches for breaking the phase ambiguity have been used for solving crystal structures by SAD. Earlier methods included the following.
(i) Resolved anomalous phasing, as used originally by Hendrickson & Teeter (1981) for the solution of crambin. One of the alternative phases is selected according to the combination of probabilities resulting from the partial structure, $P_{\mathrm{par}}$, and from anomalous scattering, $P_{\mathrm{anom}}$, except when there is a strong unimodal distribution of $P_{\mathrm{anom}}$ for the largest Bijvoet differences.
(ii) Iterative single-wavelength anomalous scattering (ISAS) approach, introduced by Wang (1985). This method utilizes the 'noise-filtering' procedure in direct space by iteratively smoothing the electron density within the solvent region in the macromolecular crystal, which enhances the meaningful features within the protein region until they become interpretable.
(iii) Direct-methods applications as proposed by Hauptman (1982, 1996) or by Fan et al. (1990), e.g. implemented in the program OASIS (Hao et al., 2000). In this approach, the initial phases estimated from the anomalous differences are refined on the basis of probabilistic relations between large normalized structure factors.
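The direct-space filtering idea in (ii) can be illustrated with a toy one-dimensional cycle of solvent flattening. This is a deliberately simplified sketch and not Wang's ISAS procedure: it assumes the solvent mask is already known (ISAS derives it from a smoothed map), it replaces the solvent density by its mean, and it omits the recombination with the experimental phase probabilities. All names and numbers are hypothetical.

```python
import numpy as np

def solvent_flatten_cycle(f_obs, phases, solvent_mask):
    """One cycle: map from (|F_obs|, phases) -> flatten solvent -> new phases."""
    rho = np.fft.ifftn(f_obs * np.exp(1j * phases)).real
    rho_mod = rho.copy()
    rho_mod[solvent_mask] = rho[solvent_mask].mean()  # featureless solvent
    new_phases = np.angle(np.fft.fftn(rho_mod))       # keep only the phases
    return new_phases, rho_mod

rng = np.random.default_rng(0)
n = 64
rho_true = np.zeros(n)
rho_true[10:30] = rng.random(20) + 1.0        # toy "protein" region
solvent_mask = np.ones(n, dtype=bool)
solvent_mask[10:30] = False                   # everything else is solvent

f_obs = np.abs(np.fft.fftn(rho_true))         # "observed" amplitudes
# Noisy starting phases, taken from a perturbed real density so that the
# Friedel symmetry of the structure factors is preserved.
phases = np.angle(np.fft.fftn(rho_true + 0.3 * rng.standard_normal(n)))

for _ in range(10):
    phases, rho_mod = solvent_flatten_cycle(f_obs, phases, solvent_mask)
```

Each cycle keeps the measured amplitudes fixed and lets the flat-solvent constraint act only on the phases, which is the essence of density modification in direct space.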
The modern approach uses carefully weighted probabilistic methods to determine the initial phases and their reliability. Maximum likelihood-based phasing is programmed in SHARP (de La Fortelle & Bricogne, 1997), taking into account various effects such as intensity measurement uncertainty. MLPHARE (Otwinowski, 1991) integrates the phase probabilities around the phase circle. These phase distributions are improved by density-modification procedures such as those programmed in DM (Cowtan, 1999), SOLOMON (Abrahams & Leslie, 1996) and RESOLVE (Terwilliger, 2000). Provided the phase errors are properly estimated, this is an extremely powerful tool. There is no difference in the formal error estimate for the phase probabilities derived from the two enantiomorphs, but the maps differ in quality and only one will show interpretable structural features.

Estimation of the amount of anomalous signal in diffraction data
The mean ratio of the Bijvoet difference to the total protein amplitude is

$$\langle\Delta F^{\pm}\rangle/\langle F\rangle = 2^{1/2}\,(N_A^{1/2} f''_A)/(N_P^{1/2} f_{\mathrm{eff}}),$$

where $N_A$ anomalous scatterers with imaginary contribution $f''_A$ are present among $N_P$ atoms with effective scattering factor $f_{\mathrm{eff}}$. This formula is analogous to that given by Hendrickson & Teeter (1981). The anomalous contribution $f''$ does not diminish with resolution, but $f_{\mathrm{eff}} = (1/N)\sum f_i$ reduces with resolution and thus the percentage of anomalous signal could be expected to increase at high resolution. In addition, this effect may be enhanced if the temperature factors of the anomalous scatterers are lower than the average value for all atoms of the macromolecule. However, the weak intensities at high resolution are measured with lower accuracy, spoiling the practical advantage of these effects. The true $\Delta F^{\pm}$ values are often of the same order as the measurement errors, leading to seriously overestimated $\langle\Delta F^{\pm}\rangle$. If $\varepsilon$ represents the measurement error, $\Delta F^{\pm}_{\mathrm{obs}} = \Delta F^{\pm}_{\mathrm{true}} \pm \varepsilon$ and $(\Delta F^{\pm}_{\mathrm{obs}})^2 = (\Delta F^{\pm}_{\mathrm{true}})^2 + \varepsilon^2 \pm 2\varepsilon\,\Delta F^{\pm}_{\mathrm{true}}$. In practice, the significance of the anomalous signal contained in the measured set of intensities can be roughly estimated at the data-merging stage. If Friedel mates are treated as equivalent, the true differences between the intensities of the Friedel-related reflections will lead to increased merging R factors and distorted normal probability plots compared with the results obtained when the Friedel mates are kept separate. Also, the list of potential outliers should reveal significant and consistent differences between some of the Bijvoet-related intensities.

Figure 2. Dependence of the difference between the total protein phase and the phase of anomalous scatterers (both calculated from the model) on the measured Bijvoet difference for (a) lysozyme, (b) Clostridium acidurici ferredoxin with two Fe₄S₄ clusters, (c) lithium sulfate. The distribution follows the sinusoidal relationship $\Delta F^{\pm} = 2F''_A \sin(\varphi_T - \varphi_A)$ and shows that if the anomalous scatterers constitute a large part of the structure, the total protein phase $\varphi_T$ tends to be closer to the anomalous phase $\varphi_A$ of the two possible values.
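The expected signal can be evaluated with a few lines of code. The sketch below assumes the zero-angle effective scattering factor of 6.7 electrons per average protein atom used by Hendrickson & Teeter (1981) in place of a resolution-dependent f_eff; the function name is introduced here for illustration. The inputs are the atom counts and f″ values quoted for 2Zn insulin and ferredoxin in the test examples of this paper.

```python
import math

Z_EFF = 6.7  # assumed effective scattering of an average protein atom at zero angle

def expected_bijvoet_ratio(n_anom, f2, n_total, f_eff=Z_EFF):
    """<dF>/<F> = sqrt(2) * sqrt(N_A) * f'' / (sqrt(N_P) * f_eff)."""
    return math.sqrt(2.0) * math.sqrt(n_anom) * f2 / (math.sqrt(n_total) * f_eff)

# 2Zn insulin: two Zn atoms (f'' = 2.26) per about 2400 atoms.
insulin = expected_bijvoet_ratio(2, 2.26, 2400)
# Ferredoxin: eight Fe atoms (f'' = 1.25) among 396 non-H atoms.
ferredoxin = expected_bijvoet_ratio(8, 1.25, 396)

print(f"insulin    {100 * insulin:.2f}%")
print(f"ferredoxin {100 * ferredoxin:.2f}%")
```

Under these assumptions the estimates come out close to the 1.4% and 3.8% quoted for the two proteins, which suggests the quoted values refer to the low-resolution limit of the formula.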

Test examples on previously known structures
We present here the results of phasing of various diffraction data collected from crystals of known structures and solved again by the SAD approach. The statistics of diffraction data and phasing for each data set are given in Tables 1 and 2, respectively. The amount of anomalous signal in each data set is illustrated in Fig. 4, where the average ratio of anomalous difference to total amplitude, $\langle\Delta F^{\pm}\rangle/\langle F\rangle$, is given as a function of resolution. All models were refined with REFMAC

A small structure with very weak anomalous signal: lithium sulfate monohydrate
This salt crystallizes in space group $P2_1$, with a unit-cell volume of only 207 Å³. Diffraction data from a crystal of Li₂SO₄·H₂O were collected at the NSLS synchrotron beamline X9B with the ADSC Quantum-4 CCD detector. The wavelength used was 0.98 Å, where sulfur has very small values of $f'$ = 0.183 and $f''$ = 0.234 electrons as calculated by CROSSEC (Cromer, 1983). The expected value of $\langle\Delta F^{\pm}\rangle/\langle F\rangle$ is about 3% and the observed ratio is 2.85%. Two exposure passes were recorded and 3282 intensity measurements were merged to give 99 centric and 326 acentric reflections (for which Friedel mates were kept separate) to a resolution of 0.80 Å, corresponding to a complete copper-radiation sphere of reflections. $R_{\mathrm{merge}}$ was 3.5% and $R_{\mathrm{anom}}$ was 2.0%.
The anomalous difference Patterson synthesis (Fig. 3a) clearly showed the position of sulfur. Starting from the single atom of sulfur, a few cycles of refinement and difference Fourier synthesis revealed the whole structure, including both water H atoms. In fact, the ($F_{\mathrm{obs}}$, $\varphi_{\mathrm{calc}}(\mathrm{S})$) map with phases calculated solely from sulfur showed the complete image of the structure superimposed on its centrosymmetric representation. The subsequent refinement was performed against $F^2$ with Friedel mates treated as separate reflections. It converged with a final value of R1 = 2.37% and the Flack parameter x = −0.15 (±0.12). The inverted enantiomer gave an R1 of 2.73% and x = 1.14 (±0.14). The Flack absolute structure parameter x should refine to 0.0 for correct chirality and to 1.0 for inverted chirality (Flack, 1983).

Table 1. X-ray data statistics. Values in parentheses are for the highest resolution shell.
This example shows that even a very small but accurately measured anomalous scattering contribution can be quite meaningful. In this case, the anomalous signal was used to locate the S atom and to distinguish between two alternative enantiomers after refinement. The single atom of sulfur provides 43% of the total scattering of this crystal.

Anomalous signal of sulfurs and chlorines: lysozyme
The results of single-wavelength phasing of the tetragonal HEW lysozyme have been published previously (Dauter et al., 1999). X-ray data to 1.55 Å resolution were collected from a native crystal grown from 'standard' conditions containing 1 M NaCl, using a wavelength of 1.54 Å. SHELXD (Sheldrick, 1998), using the observed anomalous differences as input data, found 17 sites corresponding to all ten S atoms in the protein and to seven chloride ions located within the ordered solvent shell around the lysozyme molecule. Phasing using the program SHARP followed by density modification using the program DM produced a clear electron-density map of the protein with a correlation coefficient with the $F_{\mathrm{obs}}$ map of 0.79.
These data were collected in a standard way, with the crystal in arbitrary orientation, without employing inverse-beam or mirror strategies. However, the four data-collection passes with different exposure times led to a high redundancy of measurements, which contributed to the accuracy of the estimated intensities and anomalous differences. The expected amount of anomalous signal was 1.5% and indeed the calculated $\langle\Delta F^{\pm}\rangle/\langle F\rangle$ approached this value in the lower resolution shells (Fig. 4b).
The presence of chloride ions positioned around the surface of the lysozyme molecule held only by hydrophobic van der Waals or polar hydrogen-bond interactions suggested the possibility of using the anomalous signal of heavier halides, bromides or iodides, soaked into protein crystals for phasing (Dauter & Dauter, 2001).

DNA oligomer phased on phosphorus
Diffraction data from the crystal of d(CGCGCG)₂ DNA hexamer duplex in the Z-form were collected using a wavelength of 1.54 Å to 1.5 Å resolution in three passes with different exposures (Dauter & Adamiak, 2001). The orthorhombic crystal in space group $P2_12_12_1$ is densely packed, with about 25% solvent content. It contains 12 nucleosides connected by ten phosphates, one molecule of spermine and one magnesium ion in the asymmetric unit. Only P atoms display a significant anomalous signal at this wavelength, with an $f''$ value of 0.43 electron units. The diffraction data were collected with high redundancy and quality. The expected amount of anomalous signal is about 2%, which is a higher value than that for the sulfur signal in crambin or lysozyme (about 1.5%). Such signal can be expected for all DNA or RNA structures since, in contrast to the variable sulfur content in proteins, there is always one phosphate per nucleotide. The $\langle\Delta F^{\pm}\rangle/\langle F\rangle$ ratio for measured data is shown in Fig. 4(b).
The direct-methods programs SHELXS (Sheldrick, 1986), SHELXD or SnB [...] against data processed with different redundancies are given elsewhere (Dauter & Adamiak, 2001).

Figure 4. The $\langle\Delta F^{\pm}\rangle/\langle F\rangle$ ratio as a function of resolution for all data quoted in §5 and §6. (a) Atomic resolution data sets, extending beyond 1.3 Å; (b) data extending beyond 1.7 Å resolution; (c) data sets not exceeding 1.7 Å. The theoretically predicted curves, taking into account the atomic scattering factors diminishing with resolution, are drawn in brown.

Small protein with weak anomalous signal: 2Zn insulin
The results obtained on the 2Zn insulin with 102 amino acids in the asymmetric unit are somewhat unexpected and strongly underline the role of density modification in obtaining a good quality electron-density map from rather poor initial phases, especially at high resolution. Diffraction data to 1.0 Å resolution from a rhombohedral 2Zn insulin crystal were collected in order to carry out atomic resolution refinement. The conditions were optimized for recording high-resolution reflections, not the anomalous dispersion signal, so that although the completeness of unique reflections was 99%, only 72% (31 475 out of 43 476) had both Friedel mates measured. There are no centric reflections in this space group (R3). The wavelength of synchrotron radiation used, 0.927 Å, was distant from the absorption edge of zinc (1.28 Å) and $f''$ is 2.26 electrons. The expected $\langle\Delta F^{\pm}\rangle/\langle F\rangle$ ratio for two Zn atoms per about 2400 atoms in the protein is 1.4%. The $\langle\Delta F^{\pm}\rangle/\langle F\rangle$ ratio calculated from the data as a function of resolution is shown in Fig. 4(a). It is much higher than the expected value, a result of the inaccurate estimates of Bijvoet differences.
The positions of the two Zn atoms were easily obtained from the anomalous difference Patterson. They lie on the threefold axis, about 16 Å away from each other. This constellation of two atoms in R3 is centrosymmetric, with the center of symmetry placed on the threefold axis at the midpoint between two zinc sites. SHARP luckily chose the proper handedness by refining the temperature factors of the two Zn atoms to slightly different values, B(Zn1) = 2.607 and B(Zn2) = 2.619 Å², consistent with the correct enantiomer. The initial phasing gave an overall figure of merit (FOM) of 0.23. The subsequent density modification by DM (in OMIT PERTURB mode) decisively broke the centrosymmetry and resulted in a phase set with an FOM = 0.81 and an interpretable map (Fig. 5a).

Fe and Zn rhombohedral rubredoxins
Two data sets on R3 Clostridium pasteurianum rubredoxin were collected for refinement using synchrotron radiation of 0.92 Å wavelength; the first to 1.1 Å resolution from a crystal containing one FeS₄ cluster and 55 amino-acid residues and the second from the Zn-substituted variant to 1.2 Å resolution. In both data sets only 77% of all reflections had both Friedel mates measured. At this wavelength the anomalous $f''$ contribution of Fe is 1.35 and that of Zn is 2.23 electrons. The theoretical amount of anomalous signal $\langle\Delta F^{\pm}\rangle/\langle F\rangle$ is 1.41% for Fe-substituted and 2.32% for Zn-substituted proteins, whereas the data show much higher values, 5.4 and 6.9% on average, suggesting that the Bijvoet differences contain significant errors.
Anomalous difference Pattersons revealed clearly the M-M peaks at the 6.6σ (Fe) and 14.2σ (Zn) levels. Phasing based on these metal sites by SHARP and DM produced easily interpretable electron-density maps for both proteins.
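The size of the errors implied by these numbers can be estimated roughly. The sketch below assumes that measurement errors are uncorrelated with the true Bijvoet differences, so that the two add in quadrature, and it treats the quoted mean ratios as if they were r.m.s. values; both are simplifications made here for illustration, and the function name is hypothetical.

```python
import math

def implied_error_pct(observed_pct, expected_pct):
    """Error contribution (in % of <F>) if observed^2 = expected^2 + error^2."""
    return math.sqrt(max(observed_pct ** 2 - expected_pct ** 2, 0.0))

fe_error = implied_error_pct(5.4, 1.41)  # Fe rubredoxin
zn_error = implied_error_pct(6.9, 2.32)  # Zn rubredoxin

print(f"Fe: {fe_error:.2f}%  Zn: {zn_error:.2f}%")
```

On this crude estimate the error term is several times larger than the true signal in both data sets, yet the metal-metal Patterson peaks remained clear and the phasing succeeded.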

Ferredoxin with two Fe₄S₄ clusters
Data to a resolution of 0.94 Å from a crystal of C. acidurici ferredoxin (CauFd) were collected with a wavelength of 0.88 Å and used to refine the structure at atomic resolution (Dauter et al., 1997). However, inspection of this data set has shown a considerable amount of anomalous signal (Fig. 4a). The $f''$ value of Fe at this wavelength is 1.25 electrons, so that eight Fe atoms in the 55-residue protein give the expected $\langle\Delta F^{\pm}\rangle/\langle F\rangle$ of 3.8%.
All direct-methods attempts to find the positions of Fe atoms based on Bijvoet differences have failed. However, the iron substructure was easily solved by SHELXD run against the native non-anomalous data, with a 95% success rate of phasing trials. SHARP phasing based on anomalous differences followed by DM led to a clearly interpretable map.
Eight Fe atoms constitute 21% of the total scattering of the 396 non-H atoms in this molecule and, according to the Sim phase-probability distribution (Sim, 1959), the protein phases are on average close to the anomalous substructure phases (Fig. 2b). It is possible to partially interpret the electron-density map calculated with protein amplitudes and substructure phases and build the complete protein in this way after a few iterations. This can be achieved completely automatically by the application of wARP in the 'warp_solve' mode. CauFd therefore behaves like a typical 'small structure' and the anomalous diffraction signal is not necessary to solve it. A final analysis of the phase errors against the final model indicated two interesting features in this case: the SHARP phases were excellent but the figures of merit were much too low and the DM phases were distorted away from the Fe phases by the assumptions in the default histogram matching applied to the electron density, which effectively 'flattened' the Fe contribution.

Subtilisin phased on three calcium ions
Subtilisins are known to have calcium-binding sites important for protein stability. Diffraction data from an orthorhombic crystal of subtilisin from Bacillus lentus (Betzel et al., 1988) grown in the presence of 1 M CaCl₂ were collected using 1.54 Å synchrotron radiation. A total of 270° rotation was covered in a single pass to the rather modest resolution limit of 1.75 Å and the exposure time was adjusted to avoid overloaded detector pixels. The data merged with $R_{\mathrm{merge}}$ = 2.9% and $R_{\mathrm{anom}}$ = 1.5%. The expected value of $\langle\Delta F^{\pm}\rangle/\langle F\rangle$ for three Ca²⁺ ions, three methionine sulfurs and four chlorides per 270 residues is 1.35%, close to the observed value (Fig. 4c). The Harker sections of the anomalous difference Patterson synthesis (Fig. 3b) clearly revealed three calcium sites and some additional peaks resulting from sulfur sites.
SHELXS run against Bijvoet differences gave a clear solution, where the ten highest peaks corresponded to three calcium ions (peaks 1, 2 and 3; $f''_{\mathrm{Ca}}$ = 1.28), three methionine sulfurs (peaks 4, 5 and 8; $f''_{\mathrm{S}}$ = 0.56) and four chloride ions (peaks 6, 7, 9 and 10; $f''_{\mathrm{Cl}}$ = 0.70). Table 3(a) lists the SHELXS E-map peaks and B factors of all anomalous scatterers in the refined structure. There is a clear correlation between SHELXS rank and the atomic anomalous scattering and B factor. In fact, the high temperature factors of the chloride ions may take into account their partial occupancies, so that sulfurs show up as higher peaks than chlorides.
Analogous behavior of sulfurs and chlorides has been observed in lysozyme (Dauter et al., 1999). Only three calcium sites were used for phasing with SHARP, which gave an FOM of 0.338 and of 0.769 after density modification by DM. The resulting electron-density map was easily interpretable.
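When several kinds of anomalous scatterer are present, their contributions to the expected signal can be combined as a sum of squares. The sketch below uses the f″ values quoted above for Ca, S and Cl. It assumes about 1900 non-H atoms for the 270-residue protein (roughly seven per residue, a figure not given in the text) and an effective scattering factor of 6.7 electrons per average atom; the function name is hypothetical.

```python
import math

def expected_ratio_mixed(scatterers, n_total, f_eff=6.7):
    """<dF>/<F> = sqrt(2 * sum(N_i * f''_i^2)) / (sqrt(N_P) * f_eff).

    scatterers : iterable of (count, f'') pairs, one per atom type
    n_total    : total number of atoms in the asymmetric unit (assumed)
    """
    sum_sq = sum(n * f2 ** 2 for n, f2 in scatterers)
    return math.sqrt(2.0 * sum_sq) / (math.sqrt(n_total) * f_eff)

# Three Ca (f'' = 1.28), three Met S (0.56) and four Cl (0.70).
subtilisin = expected_ratio_mixed([(3, 1.28), (3, 0.56), (4, 0.70)], n_total=1900)
print(f"{100 * subtilisin:.2f}%")
```

With these assumptions the estimate is close to the 1.35% quoted above; the three calcium ions account for most of the sum of squares, consistent with their appearing as the top three peaks.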

Subtilisin phased on Lu³⁺ ions
Another crystal of subtilisin grown from 1 M NaCl was soaked in a 1 M solution of LuCl₃ for a short time and MAD data to 1.75 Å were collected at four wavelengths in the vicinity of the lutetium $L_{\mathrm{III}}$ absorption edge (1.34 Å), at the peak, inflection, high-energy and low-energy remote points of the fluorescence spectrum, each with a total rotation of 180° (Table 1). SHELXD run against peak, inflection or high-energy remote data clearly identified four Lu sites which had various fractional occupancies. Only one of the Lu sites (the third strongest) corresponded to a calcium site in the Ca-substituted subtilisin. The strongest Lu site was sufficient for successful phasing of the protein by MAD based on all four wavelength data and also for phasing the single data sets at peak, inflection and remote wavelengths (Table 4). According to expectation, the low-energy remote data did not contain enough anomalous signal to provide meaningful phasing power in the SAD mode. Also, the phasing from the high-energy remote data using only one Lu site did not generate an interpretable map.

Glucose isomerase: a 44 kDa protein phased on a single Mn 2+ ion
Glucose isomerase from Streptomyces rubiginosus (Carrell et al., 1989) was supplied by Hampton Research. The I222 crystals were grown from 11% MPD, 0.1 M MgCl 2 , 0.05 M Tris buffer pH 7.0 with a protein concentration of 15 mg ml À1 . The glucose isomerase monomer has 388 amino acids and contains two Mn 2+ ions as a natural cofactor.
Diffraction data were collected at a synchrotron beamline to a resolution of 1.5 A Ê using a wavelength of 1.34 A Ê in two exposure passes. The crystal was rotated through 360 , which ensured high multiplicity. The data set was very complete and accurate (Table 1). For subsequent comparisons with reliable calculated phases, the available enzyme model was re®ned by REFMAC and wARP against these data to an R factor of 18.3%. The anomalous difference Fourier synthesis using these phases showed clear peaks for one metal (at 81'), a second metal (24') and nine sulfurs (18±12', eight Met SD and one Cys SG). Only the highly disordered N-terminal methionine sulfur was not visible. The strong metal site proved to be a fully occupied Mn 2+ ion and the second metal site is highly substituted by Mg 2+ , which is present in the crystallization medium. At this wavelength the f HH values are 2.23 electrons for Mn and 0.43 electrons for S; Mg does not show any signi®cant anomalous scattering effect. The theoretically expected amount of anomalous signal is about 0.9%. The hÁF AE i/hFi ratio in the diffraction data shown in Fig. 4(b) approximates this value almost to the highest resolution limit.
The anomalous difference Patterson revealed one prominent peak corresponding to the stronger metal site. The direct-methods program SHELXS using Bijvoet differences gave a clear solution with ten anomalous scatterers correctly positioned (Table 3b).
The single strongest metal site positioned from the anomalous difference Patterson was input to SHARP and the resulting phases, with a very low FOM of 0.15, were submitted to density modification by DM, resulting in a phase set with FOM = 0.78 and an excellent electron-density map (Fig. 5b).
According to the estimation of Wang (1985), an anomalous signal of about 0.6%, corresponding to one disulfide group in a 12.5 kDa protein, can lead to structure solution. The glucose isomerase data are close to this limit and indeed suggest that a signal smaller than 1% could be successfully used for phasing, provided the differences are measured sufficiently accurately.

dUTPase: a single-site Hg derivative
The structure of Escherichia coli dUTPase in space group R3 was originally solved by the MIRAS approach (Cedergren-Zeppezauer et al., 1992; Dauter et al., 1998) using a single-site mercury and a second, somewhat less substituted, platinum derivative. The data collected from the Hg-derivatized crystal extended to 2.0 Å and contained a significant amount of anomalous signal, but were not measured with high redundancy and only 85% (9924 of 11 687) of the reflections had both Friedel mates.
The Hg atom identified from the anomalous difference Patterson was input to SHARP which, after density modification by DM, led to an interpretable electron-density map (Fig. 5c).
This structure was recently successfully phased against newly collected atomic resolution Hg-derivative data by the wARP approach (González et al., 2001).

Examples of novel structures solved by SAD
Several new structures were recently solved using SAD data collected at the X9B beamline of NSLS, Brookhaven National Laboratory, with the ADSC Quantum-4 CCD detector. All data were processed with HKL2000 (Otwinowski & Minor, 1997) and their statistics are given in Table 1.

Oligopeptide antibiotic phased on chlorines
Decaplanin, a cyclic nonapeptide antibiotic composed of unusual amino-acid and glycosyl residues, contains one covalently bound Cl atom. Small polypeptides are notoriously difficult to derivatize. Their conformational flexibility means that it is also not easy to apply molecular replacement, even if a similar structure is known. Direct methods can be applied only if crystals diffract to atomic resolution.
The P6₁22 crystal form of decaplanin was recently solved (Lehmann, 2000) based on the anomalous signal within the data set collected to 1.6 Å resolution with a wavelength of 1.54 Å, where chlorine has an f″ of 0.70 electrons. The data were of high quality, characterized by an R_merge of 5.6%, a redundancy of 10 for individual Friedel mates and an I/σ(I) of 54 overall and 14 in the highest resolution shell.
Table 4. Phasing of Lu subtilisin against various SAD and MAD data.
It was not known a priori how many antibiotic molecules could be expected in the asymmetric unit of the crystal, since such compounds can crystallize with a wide range of packing densities. SHELXD gave good solutions (in 20% of phase sets) with eight independent anomalous scatterer sites and eight molecules could be accommodated in the asymmetric unit of the crystal. SHARP gave an FOM of 0.39 and a subsequent DM run assuming 40% solvent in the crystal (corresponding to eight independent molecules) gave an FOM of 0.63. Surprisingly, the map showed only four independent molecules, each with one bound Cl atom, and an additional four chloride ions, each hydrogen bonded by three main-chain NH groups of the antibiotic. The DM run repeated with the solvent content appropriate for four molecules, 70%, produced phases with an FOM of 0.89 and an excellent map (Fig. 6a). The expected ⟨ΔF±⟩/⟨F⟩ ratio for eight independent Cl atoms is 2.0% and the amount of the anomalous signal in the data is close to that value (Fig. 4b). Four molecules of decaplanin contain 448 atoms and this structure is larger than rubredoxin or ferredoxin.
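The expected Bijvoet ratio quoted for decaplanin can be reproduced with the commonly used estimate of Hendrickson & Teeter (1981), in which the ratio is approximately √2·√(Σ f″²)/(√N_P·Z_eff). The sketch below is only an illustration: the function name is hypothetical, Z_eff = 6.7 electrons is the customary assumed average for light atoms, and the text does not state that exactly this expression produced the quoted value.

```python
import math

def expected_bijvoet_ratio(scatterers, n_atoms, z_eff=6.7):
    """Estimate mean |dF| / mean F at zero scattering angle.

    scatterers: list of (count, f_double_prime) pairs, one per atom type.
    n_atoms:    number of non-H atoms in the asymmetric unit.
    z_eff:      assumed mean atomic number of the light atoms (6.7 e).
    """
    anomalous = math.sqrt(sum(n * f2 ** 2 for n, f2 in scatterers))
    return math.sqrt(2.0) * anomalous / (math.sqrt(n_atoms) * z_eff)

# Decaplanin values from the text: eight Cl sites with f'' = 0.70 e
# and 448 atoms in the four independent molecules.
ratio = expected_bijvoet_ratio([(8, 0.70)], 448)
print(f"expected Bijvoet ratio: {100 * ratio:.1f}%")
```

With the numbers from the text this evaluates to roughly 2.0%, in line with the value given above. Passing several (count, f″) pairs allows mixed scatterer types to be treated in the same way.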

Serine-carboxyl proteinase: cryosoaked bromides
The structure of the 44 kDa Pseudomonas serine-carboxyl proteinase (PSCP) has recently been solved (Dauter et al., 2001) using single-wavelength data collected from a crystal soaked in a cryosolution containing 1 M NaBr. The PSCP crystals resisted all derivatization attempts with classic heavy-atom reagents. The three-wavelength MAD data to 1.8 Å were collected around the bromine absorption edge, but only the peak-wavelength data were used for structure solution.
Nine sites from the SHELXD solution of the anomalous scatterers were selected for subsequent protein phasing. The 1.8 Å SHARP phases were extended by DM to the full resolution of the native data (1.4 Å) with an FOM of 0.74. The resulting map, with a correlation coefficient (CC) to an F_obs map of 0.62, was interpretable and was subjected to automatic model building by wARP. The protein model thus obtained was almost complete and required only a small number of side chains to be built by hand.
Since the MAD data were available, it was possible to compare the results of single-wavelength and three-wavelength phasing, which was performed a posteriori in an analogous way using SHARP and DM. The resulting three-wavelength DM map was superior, with a CC to the F_obs map of 0.76, but the wARP procedure led to a comparable result. However, this procedure required significantly more time both for data collection and for phasing (SHARP took 75 h for three-wavelength data and 15 h for single-wavelength data).
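The map correlation coefficients quoted for PSCP compare an experimentally phased map with an F_obs map point by point. A minimal sketch of such a comparison follows, assuming both maps are sampled on the same grid and flattened to lists of density values. The function name and the toy values are hypothetical, and this is not the code used by the programs named in the text.

```python
import math

def map_correlation(rho_a, rho_b):
    """Pearson correlation between two maps sampled on an identical grid."""
    if len(rho_a) != len(rho_b) or len(rho_a) < 2:
        raise ValueError("maps must share a grid and hold at least two points")
    n = len(rho_a)
    mean_a = sum(rho_a) / n
    mean_b = sum(rho_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(rho_a, rho_b))
    var_a = sum((a - mean_a) ** 2 for a in rho_a)
    var_b = sum((b - mean_b) ** 2 for b in rho_b)
    return cov / math.sqrt(var_a * var_b)

# Toy density values (hypothetical), not taken from any real map.
reference = [0.1, 0.9, 1.4, 0.3, -0.2, 0.8]
trial = [0.3, 0.6, 1.1, 0.5, 0.1, 0.4]
cc = map_correlation(reference, trial)
```

A value of 1 means the two maps agree up to scale and offset, so figures such as 0.62 and 0.76 indicate how closely each phase set reproduces the reference map.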

Acyl-protein thioesterase: phasing on cryosoaked bromide ions
The structure of human acyl-protein thioesterase (APT) was also solved recently (Devedjiev et al., 2000) using single-wavelength data collected from a crystal soaked for 20 s in cryosolution containing 1.0 M NaBr. No separate native data were available. The bromide sites were found and promising preliminary phasing results were obtained shortly after diffraction images from the peak wavelength had been processed, so no further data collection with additional wavelengths was attempted.
There are two molecules of the protein, 56 kDa in total, in the asymmetric unit of the monoclinic P2₁ cell. The 1.8 Å data were measured at the Br fluorescence peak wavelength in 360 images of 0.5° rotation each. The measured intensities were strong, with an I/σ(I) of 30 overall and of 8 in the highest resolution shell. The R_merge was 2.8% with Friedel mates not merged (average redundancy 1.9) and 4.8% with Friedel equivalents merged (average redundancy 3.8). The bromide sites were obtained easily both from SnB, with 309 solutions with R_min in the range 0.29–0.35 against 691 unsuccessful phase sets with R_min in the range 0.50–0.75, and from SHELXD, which gave 1000 successful solutions with the CC figure-of-merit above 0.30 per 1000 attempted phase sets.
However, as usual with a number of halide sites having different occupancies, there was no clear demarcation in the peak heights to suggest which sites are meaningful. The eight strongest sites were used for the first round of SHARP phasing and the set was expanded to 22 sites after inspecting residual peaks in two more rounds of SHARP. 18 of these sites could be grouped into two identical constellations, clearly confirming the presence of two independent molecules in the asymmetric unit of the crystal related by a noncrystallographic twofold axis. The density modification by DM increased the FOM from 0.40 to 0.85 and gave a map of high quality (Fig. 6b).
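The two R_merge values quoted for the APT data (2.8% with Friedel mates kept separate, 4.8% with them merged) differ because merging I(hkl) with I(−h,−k,−l) counts the anomalous difference as if it were measurement error. The sketch below illustrates the effect with the usual definition R_merge = Σ|I_i − ⟨I⟩| / Σ I_i. The helper name and the toy intensities are hypothetical and are not data from the paper.

```python
from collections import defaultdict

def r_merge(observations):
    """R_merge over (group_key, intensity) pairs; each key is one merged reflection."""
    groups = defaultdict(list)
    for key, intensity in observations:
        groups[key].append(intensity)
    numerator = 0.0
    denominator = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        numerator += sum(abs(v - mean) for v in values)
        denominator += sum(values)
    return numerator / denominator

# Toy observations (hypothetical): (hkl, Friedel sign, intensity).
toy = [
    ((1, 2, 3), +1, 1040.0), ((1, 2, 3), +1, 1060.0),
    ((1, 2, 3), -1, 950.0), ((1, 2, 3), -1, 970.0),
    ((2, 0, 4), +1, 510.0), ((2, 0, 4), +1, 490.0),
    ((2, 0, 4), -1, 530.0), ((2, 0, 4), -1, 550.0),
]

# Friedel mates kept apart: the key includes the sign.
r_separate = r_merge([((hkl, sign), i) for hkl, sign, i in toy])
# Friedel mates treated as equivalent: the key is hkl only.
r_merged = r_merge([(hkl, i) for hkl, sign, i in toy])
```

In this toy set the merged value is larger than the separate one, which is the same qualitative behaviour as in the APT statistics and is a simple indicator that a real anomalous signal is present.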

Mitomycin C resistance protein (MRD) phased on selenomethionines
The diffraction data were collected on a variant of MRD containing three SeMet residues in a 15 kDa protein. The wavelength was set to the peak of the selenium fluorescence spectrum at 0.979 Å. The data extended to a resolution of 1.3 Å and scaled with an R_merge of 3.8% and an R_anom of 6.8%, suggesting a strong anomalous signal, despite each Friedel mate being measured only twice on average. For f″(Se) = 4.5, the estimated value of ⟨ΔF±⟩/⟨F⟩ is 5.2%.
SHELXD run against Bijvoet differences gave the correct solution for three selenium sites in 100% of the trials. SHARP produced a phase set with an average FOM of 0.57. After DM (FOM = 0.69 at 1.3 Å) the map was very clear and wARP automatically built 110 out of 130 residues with most of the side chains. The details of the structure solution and refinement will be given elsewhere (Martin, Devedjiev, Dauter, He, Sherman, Derewenda & Derewenda, in preparation).

Discussion
Traditionally in macromolecular crystallography the anomalous signal of heavy atoms has been used as an auxiliary source of phasing in the MIR method of solving new crystal structures, with the isomorphous differences in intensities between the native and derivative data serving as the main source of phasing power. There were difficulties in accurately measuring the small anomalous differences using photographic film. However, the potential power of the anomalous scattering signal was recognized in the early days of macromolecular crystallography (Ramachandran & Raman, 1956; Green et al., 1954; Rossmann, 1961). The use of anomalous scattering of sulfur for phasing crambin (Hendrickson & Teeter, 1981) and the simulations by Wang (1985) showed the great phasing potential of the anomalous signal even with single-wavelength data.
The introduction of more accurate automatic detectors, such as multiwire chambers, imaging plates and CCDs, as well as the use of crystal-freezing techniques, has now made it possible to collect diffraction intensities very accurately. These developments contributed to the popularity of the MAD method of phasing, which is currently the method of choice for solving novel crystal structures of macromolecules. The MAD method completely alleviates all problems of nonisomorphism of derivatives characteristic of the classic MIR approach. However, it requires precise control of the synchrotron-radiation wavelength for the proper estimation of both the anomalous and dispersive differences.
Several examples quoted above suggest that with the current availability of state-of-the-art hardware (synchrotron beamlines, detectors) and software (data-processing programs, phasing and density-modification algorithms) it is feasible to obtain interpretable electron-density maps from intensities containing the anomalous signal within a single data set recorded using only one X-ray wavelength. In contrast to the MAD method, the wavelength need not correspond to the maximum anomalous effect near the absorption edge of the corresponding anomalous scatterer.
The accuracy of measurements required for successful phasing by SAD seems to be comparable to that acquired in routine MAD experiments. In favorable cases, satisfactory results can be obtained with data of relatively poor quality with low completeness and redundancy, as in the case of 2Zn insulin or dUTPase. It should, however, be stressed that excellent data quality, achieved mainly by recording high multiplicity of measurements, leads to more accurate phases and as a result more interpretable electron-density maps. Inspection of Tables 1 and 2 confirms that the most accurate phases were obtained for data collected with the highest redundancy of measurements, e.g. lysozyme, DNA, Ca subtilisin, glucose isomerase and decaplanin.
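Why multiplicity matters can be seen from a simple error-propagation argument. If each amplitude measurement carried an independent random relative error ε, the mean of n measurements of each Friedel mate would give a Bijvoet difference with relative error ε·√(2/n). The sketch below estimates the multiplicity needed to bring this error down to the level of the signal. It is an idealized illustration under stated assumptions (purely random, uncorrelated errors; hypothetical function name and error level) and not a procedure taken from the paper.

```python
import math

def multiplicity_needed(bijvoet_ratio, rel_error, target_snr=1.0):
    """Measurements per Friedel mate for signal/noise of dF to reach target_snr.

    Assumes independent random errors, so sigma(dF)/F = rel_error * sqrt(2 / n).
    """
    n = 2.0 * (target_snr * rel_error / bijvoet_ratio) ** 2
    return max(1, math.ceil(n))

# A 0.9% signal (the glucose isomerase estimate) combined with an
# assumed, hypothetical 3% random error per individual measurement.
n_needed = multiplicity_needed(0.009, 0.03)
```

Because the required n grows with the square of the error-to-signal ratio, halving the anomalous signal quadruples the multiplicity needed under these assumptions. Systematic errors do not average out in this way, so real requirements may differ.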
The resolution of the diffraction data is less important for SAD phasing than the data quality. In the examples discussed above as well as those available in the literature [e.g. crambin (Hendrickson & Teeter, 1981; Wang, 1985), neurophysin (Chen et al., 1991), IF3 (Biou et al., 1995), rusticyanin (Harvey et al., 1998), psoriasin (Brodersen et al., 2000) and obelin (Liu et al., 2000)], the resolution limits of the data have varied from 1.05 to 3.0 Å. The importance of data accuracy, which is best achieved by multiple measurements of equivalent reflections, is clearly shown by phasing a DNA oligonucleotide against a series of data sets of varying redundancy (Dauter & Adamiak, 2001).
The SAD phasing procedure generally consists of three stages: (i) finding anomalous scatterers, (ii) evaluation of initial phases and (iii) phase improvement by density modification. The first stage can utilize the anomalous difference Patterson vector search, direct methods or a combination of both. The evaluation of experimental phases can be achieved in various ways, for example by maximum-likelihood estimation. The third stage involves phase-modification procedures such as solvent flattening and histogram matching.
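Stage (ii) on its own leaves a twofold phase ambiguity for each reflection, which is one reason why stage (iii) carries so much weight in SAD. In the standard textbook approximation ΔF± ≈ 2F″_A sin(φ_T − φ_A), where F″_A is the amplitude of the imaginary (f″) contribution of the anomalous substructure and φ_A its phase, a measured Bijvoet difference is compatible with two values of the total phase φ_T. The sketch below computes both. The function name and the numbers are hypothetical, and maximum-likelihood phasing programs treat the problem probabilistically rather than in this simplified way.

```python
import math

def sad_phase_candidates(delta_f, f2_a, phi_a_deg):
    """Return the two phases (degrees) with delta_f = 2 * f2_a * sin(phi_T - phi_A)."""
    s = delta_f / (2.0 * f2_a)
    s = max(-1.0, min(1.0, s))  # clamp: measurement error can push |s| above 1
    alpha = math.degrees(math.asin(s))
    first = (phi_a_deg + alpha) % 360.0
    second = (phi_a_deg + 180.0 - alpha) % 360.0
    return first, second

# Hypothetical reflection: dF = 6.0, F''_A = 5.0, substructure phase 40 degrees.
phi_1, phi_2 = sad_phase_candidates(6.0, 5.0, 40.0)
```

The two candidates lie symmetrically about φ_A + 90°. Density modification is one way of resolving the choice between them.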
In all the examples presented above, a similar protocol and programs were used for identification of anomalous sites, phasing and density modification to ensure compatibility of results. However, other available programs were also tried and gave positive results, such as SOLVE (Terwilliger & Berendzen, 1999) and ACORN (Foadi et al., 2000) for searching for anomalous sites and phasing, MLPHARE (Otwinowski, 1991) for initial phasing or SOLOMON (Abrahams & Leslie, 1996) for density modification.
In some cases, the full three-wavelength data were collected according to the principle of the MAD technique, but only one data set was eventually used for successful structure solution by SAD (§5.2). In these cases, the common MAD anomalous scatterers such as Se or Br were used and diffraction data were collected in a standard way, with crystals in arbitrary orientation and without using the inverse-beam approach. The redundancy was not high enough to ensure the full completeness of anomalous data. In addition, the intensities were not collected to absolutely the highest potentially achievable resolution; instead, attention was directed to appropriate estimation of the lowest resolution strongest reflections, which play the most important role in the Patterson search, direct methods and phasing and are important for the correct appearance of electron-density maps. In these cases the anomalous signal amounted to about 4–5% of the total scattering and all structure-solution steps proceeded smoothly; easily interpretable electron-density maps were obtained when protein models were constructed in an automatic way by wARP.
A few of the examples quoted above are based on data collected without optimization of the anomalous signal in mind. During data collection of 2Zn insulin or rubredoxins, no effort was directed towards enhancing the anomalous signal contained in the reflection intensities. The data-collection protocol was instead optimized to achieve the highest possible resolution and completeness within the asymmetric unit of the reciprocal lattice. Nevertheless, even the rather weak anomalous signal present in these incomplete data led to successful phasing of the crystal structure. In these examples, the weakness of the anomalous signal has been compensated by the very high resolution of the diffraction data.

Conclusions
The theoretical potential of the anomalous scattering signal as a sole source of phasing information has been known since the early days of macromolecular crystallography (Ramachandran & Raman, 1956). The practical utilization of SAD by Hendrickson & Teeter (1981) has shown that it can indeed lead to successful phasing, provided the data can be measured accurately. Wang (1985) estimated that the anomalous signal at a level lower than 1% of the total scattering may be sufficient. To measure diffraction intensities at such levels of accuracy is a challenge, but the progress in data-collection techniques observed in recent years (cryofreezing, two-dimensional detectors and data-reduction programs) makes such tasks feasible. On the other hand, the progress and automation in phasing and density-modification algorithms makes the SAD structure solution easier and faster.
A limited number of novel structures have been solved by the SAD approach since crambin. The examples presented in this paper suggest that the strength of SAD phasing is greater than has been anticipated. In many cases of MAD phasing a successful result can be obtained from a single data set (Rice et al., 2000), which requires much less time and effort for data collection and handling and minimizes the crystal radiation decay. It can be expected that the SAD approach will gain wider applicability, especially in the current era of high-throughput structural projects.
The anomalous diffraction data discussed here are available on request from ZD.