The magic triangle goes MAD: experimental phasing with a bromine derivative

5-Amino-2,4,6-tribromoisophthalic acid is used as a phasing tool for protein structure determination by MAD phasing. It is the second representative of a novel class of compounds for heavy-atom derivatization that combine heavy atoms with amino and carboxyl groups for binding to proteins.

Experimental phasing is an essential technique for the solution of macromolecular structures. Since many heavy-atom ion soaks suffer from nonspecific binding, a novel class of compounds has been developed that combines heavy atoms with functional groups for binding to proteins. The phasing tool 5-amino-2,4,6-tribromoisophthalic acid (B3C) contains three functional groups (two carboxylate groups and one amino group) that interact with proteins via hydrogen bonds. Three Br atoms suitable for anomalous dispersion phasing are arranged in an equilateral triangle and are thus readily identified in the heavy-atom substructure. B3C was incorporated into proteinase K and a multiwavelength anomalous dispersion (MAD) experiment at the Br K edge was successfully carried out. Radiation damage to the bromine-carbon bond was investigated. A comparison with the phasing tool I3C that contains three I atoms for single-wavelength anomalous dispersion (SAD) phasing was also carried out.

Introduction
Experimental phasing is vital for the determination of three-dimensional protein structures using single-crystal X-ray diffraction. Although about two-thirds of newly deposited structures in the Protein Data Bank were solved using molecular replacement, experimental phasing suffers less from model bias and is required for samples that do not have any structurally related entries.
Methods for experimental phasing based on the anomalous scattering of certain atoms, SAD (single-wavelength anomalous dispersion) and MAD (multi-wavelength anomalous dispersion), have largely replaced traditional methods such as MIR (multiple isomorphous replacement). With the latest advances in synchrotron hardware, e.g. improved detectors, phasing with the weak anomalous signal from intrinsic scatterers has become a possible option, although high-quality data are required. These are more readily obtained when high symmetry enables a high data redundancy to be achieved, but low-symmetry examples have also proved successful (Lakomek et al., 2009).
In the case of suboptimal data quality or the absence of suitable intrinsic scatterers, external anomalously scattering atoms have to be introduced into the protein crystal. The main route for solving novel protein structures is without doubt the use of a selenomethionine derivative (Hendrickson, 1999). Similarly, the chemical incorporation of brominated nucleobases has become an important technique for nucleic acid structure determination (Peterson et al., 1996). Heavy-atom soaks traditionally have a low success rate, but systematic heavy-atom screening with conventional heavy-metal ions has been performed using gel electrophoresis (Boggon & Shapiro, 2000), mass spectrometry (Agniswamy et al., 2008) or a database approach (Sugahara et al., 2005).
It has also been shown that a more comprehensive treatment of anomalous scattering can yield further phase information. The anisotropy of anomalous scattering can be exploited to enhance the phase information present in the collected data (Schiltz & Bricogne, 2008). The combination of phase information from a partial molecular-replacement solution and weak experimental phases has also led to a number of successes (e.g. Tereshko et al., 2008; Schuermann & Tanner, 2003; Roversi et al., 2010), and is also included in the Auto-Rickshaw server (Panjikar et al., 2009).
Since most of these are specialized applications and are not generally applicable, a quick and easy approach towards experimental phasing would still be desirable for the derivatization of protein crystals and experimental phase determination.

The magic triangle I3C
Often, heavy-atom derivatives suffer from nonspecific binding. This results in low occupancy of the heavy-atom sites or in derivatization failing completely. We have developed a new class of compounds that combine heavy atoms for phasing with functional groups for specific interaction with biological macromolecules.
The first representative of this novel class of compounds is 5-amino-2,4,6-triiodoisophthalic acid (hereafter referred to as I3C; Fig. 1a). The three I atoms, which are arranged in an equilateral triangle (with a side of 6 Å), give rise to a strong anomalous signal using in-house Cu Kα radiation. I3C has been incorporated into three test proteins (lysozyme, thaumatin and porcine elastase) either by cocrystallization or soaking. The three functional groups of I3C interact with the protein via hydrogen bonds. The amino group interacts with hydrogen acceptors such as the carbonyl O atom of asparagine or glutamine residues and, most importantly, with the carbonyl O atom of the protein backbone. The carboxylate groups interact with the hydrogen-donor groups found in serine, threonine, lysine or tyrosine. The most prominent interaction of the carboxylate group is its interaction with arginine. Since the three I atoms in I3C form an equilateral triangle, a successful substructure solution can readily be identified when inspecting the heavy-atom positions.
I3C has also been used to solve a novel protein structure. Experimental phases could be derived for the 35 kDa protein Mh-p37, which had resisted other phasing attempts (Sippel et al., 2008). I3C was introduced by soaking the protein crystals at a low pH value (in contrast to pH values of 6-8 for the test proteins). The derivatization was successful using a lower I3C concentration (40 mM instead of 0.25 or 0.5 M as for the test proteins).
The strong anomalous signal of the I atoms renders I3C a powerful phasing tool for both in-house and synchrotron data. However, the I K edge (λ = 0.374 Å) and L_I edge (λ = 2.39 Å) are not commonly accessible on synchrotron beamlines. Therefore, I3C can only be used for SAD or SIRAS phasing.

The MAD triangle B3C
Here, we report on the new phasing tool 5-amino-2,4,6-tribromoisophthalic acid (hereafter referred to as B3C; Fig. 1b). Analogously to I3C, the bromine compound B3C has three functional groups for hydrogen bonding. The three Br atoms arranged in an equilateral triangle (with a side of 5.65 Å) are suitable for MAD experiments since the Br K edge (λ = 0.920 Å) falls within the normal energy range of a macromolecular crystallography beamline.
B3C was incorporated into crystals of proteinase K and a four-wavelength MAD experiment was carried out. Radiation damage was also investigated since it is known that radiolysis of the anomalous scatterers, e.g. the Br atom in brominated nucleotides, can prevent structure solution (Ennifar et al., 2002). We also compared the phasing ability of B3C with that of I3C for proteinase K.

Crystallization and protein derivatization
In this study, the incorporation of B3C and I3C into proteinase K was investigated. Synthesis and crystallographic characterization of B3C have been described elsewhere (Beck, Herbst-Irmer et al., 2009). Practical advice on the incorporation of B3C (and I3C) into protein crystals is also available (Beck, da Cunha et al., 2009). [...] two carboxyl groups.² Protein crystals were obtained using the sitting-drop vapour-diffusion method.
Proteinase K (279 residues, 28.9 kDa; obtained from Sigma-Aldrich) was crystallized at 293 K by mixing 2.5 µl of 20 mg ml⁻¹ protein solution with an equal volume of precipitant containing 0.1 M Tris pH 7.2 and 1.28 M ammonium sulfate. Crystals appeared within one week. Protein crystals were soaked for about 10 s in 0.5 M B3C or I3C solution which also contained the same salt and buffer concentrations as the crystallization drop. The crystals were back-soaked for 5 s in a cryosolution containing the same salt and buffer concentrations with 30% glycerol but no heavy-atom compound. The crystals were then flash-cooled in liquid nitrogen.

Figure 1
The phasing tools I3C and B3C. (a) The magic triangle I3C (5-amino-2,4,6-triiodoisophthalic acid) with the equilateral triangle of I atoms (side of 6.0 Å) shown. (b) The MAD triangle B3C (5-amino-2,4,6-tribromoisophthalic acid) with the equilateral triangle of Br atoms (side of 5.65 Å) shown.

Data collection and processing
Data were collected at 106 K on beamline X10SA (PXII) at SLS, Villigen, Switzerland. For the B3C data, a fluorescence scan was performed to locate the Br K edge. Interestingly, the spectrum showed two peaks at the Br edge (Fig. 2). Therefore, two peak data sets were collected, followed by high-energy remote and inflection-point data sets (see Table 1 for details). For the I3C data only one data set was collected, at the same wavelength as in-house Cu Kα (1.54178 Å; Table 1). Data collection for all data sets was carried out with 0.5 s exposure per frame and 1° frame width.
To investigate radiation damage to B3C, a series of experiments was carried out. The same crystal that had been used for the MAD experiment was exposed to full photon flux (1.82 × 10¹² photons s⁻¹ at 13.473 keV), followed by data collection at the same energy (corresponding to the wavelength at peak2; Table 1). Full beam exposure was maintained for 2 s and was increased to 30 and 60 s for the last two burn exposures, respectively. A total of six data sets were collected from the same crystal following this procedure.
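The photon energies and wavelengths quoted in this section are related by λ = hc/E. The following minimal sketch shows the conversion; the function names and the rounded constant are illustrative choices and are not part of the original data-collection procedure.

```python
# h*c expressed in keV·Å, derived from the CODATA values of h, c and e.
HC_KEV_ANGSTROM = 12.398419843


def kev_to_angstrom(energy_kev: float) -> float:
    """Convert a photon energy in keV to a wavelength in Å."""
    return HC_KEV_ANGSTROM / energy_kev


def angstrom_to_kev(wavelength_angstrom: float) -> float:
    """Convert a wavelength in Å to a photon energy in keV."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Burn/collect energy used for the radiation-damage series.
print(f"13.473 keV -> {kev_to_angstrom(13.473):.4f} Å")
# Wavelength used for the I3C data set.
print(f"1.54178 Å  -> {angstrom_to_kev(1.54178):.3f} keV")
```

Under this conversion, 13.473 keV corresponds to a wavelength of about 0.920 Å, in line with the Br K edge wavelength quoted above.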

Substructure solution and data analysis
For B3C, four data sets (two peak, one high-energy remote and one inflection point) were used for MAD phasing (see Fig. 3 for anomalous data statistics). Data sets were prepared with XPREP (Bruker AXS Inc., Madison, Wisconsin, USA). Substructure solution was carried out with SHELXD (Schneider & Sheldrick, 2002). Inspection of the heavy-atom sites revealed the presence of equilateral triangles (with side lengths of about 5.6 Å).
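The inspection of heavy-atom sites for equilateral triangles can be sketched as a simple distance search. This is an illustration only and not the procedure used by SHELXD: it assumes that the sites are already available as Cartesian coordinates in Å, it ignores symmetry-equivalent positions, and the function name, the tolerance and the example coordinates are all hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def find_equilateral_triangles(sites, side, tol=0.3):
    """Return index triples whose three pairwise distances all lie within tol Å of side."""
    triangles = []
    for i, j, k in combinations(range(len(sites)), 3):
        edges = (
            dist(sites[i], sites[j]),
            dist(sites[j], sites[k]),
            dist(sites[i], sites[k]),
        )
        if all(abs(edge - side) <= tol for edge in edges):
            triangles.append((i, j, k))
    return triangles


# Hypothetical Cartesian coordinates in Å (not taken from the proteinase K data):
# three sites forming a triangle with the Br...Br distance of B3C, plus one isolated site.
BR_SIDE = 5.65
example_sites = [
    (0.0, 0.0, 0.0),
    (BR_SIDE, 0.0, 0.0),
    (BR_SIDE / 2, BR_SIDE * sqrt(3) / 2, 0.0),
    (20.0, 20.0, 20.0),
]
print(find_equilateral_triangles(example_sites, BR_SIDE))
```

For an iodine derivative the same search would be run with a side of about 6 Å. A real substructure would additionally require symmetry-equivalent sites to be taken into account.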
The two peaks in the fluorescence spectrum (Fig. 2) can be rationalized by the anisotropy of the anomalous signal, although fluorescence scans at different orientations of the crystal (not carried out) would be required to confirm this. Similar effects have been observed for proteins containing selenomethionine and brominated nucleotides (Schiltz & Bricogne, 2008). For I3C, SAD phasing was carried out using data collected at 1.54178 Å (f″ for iodine is 6.85 e at this wavelength; see Fig. 3b for anomalous data statistics). Substructure solution with SHELXD resulted in heavy-atom sites that also formed equilateral triangles (with a side length of about 6 Å).
Free variables were introduced to refine the occupancy of each B3C or I3C site and subsequently also the occupancy of each single halogen atom separately in B3C or I3C. For the occupancy refinement, thermal displacement parameters were kept fixed for the halogen atoms (at B = 15.8 Å² for both bromine and iodine). The final model from the MAD data-set refinement (peak2) was used for further refinement of the radiation-damage data sets.
A comparison of the results from the MAD and the SAD phasing experiments can be found in Table 3.

Figure 2
Fluorescence scan of proteinase K with B3C incorporated. The data-collection energies of peak1, peak2 and the inflection point (see Table 1) are marked by vertical lines. The two peaks close to the Br K edge are clearly visible.
² The pKa values of the two carboxyl groups in B3C could not be determined experimentally by potentiometric titration (probably owing to the fact that the molecule forms a zwitterion in solution); the pKa values of the nonbrominated compound have not been reported in the literature. To obtain the stock solution, the compound cannot be dissolved directly in water since it has a poor solubility. After adding double equimolar amounts of base, the two carboxyl groups are mostly deprotonated, producing a salt with high solubility. The pH of the stock solution is usually in the range 7-8.

Figure 3
Anomalous and dispersive data statistics. (a) Anomalous correlation coefficient (CC) between the MAD data sets. HREM, high-energy remote; INFL, inflection point (data-collection wavelengths can be found in Table 1). The correlation coefficients do not depend on the (estimated) σ values. Data were truncated at 2.5 Å for heavy-atom substructure solution (where CC falls below 30% for all data sets). (b) Anomalous signal for the four bromine MAD data sets and the iodine SAD data set (red); pure noise would correspond to d″/sig ≈ 0.798. Here, the cutoff for the MAD data sets cannot be determined easily. The iodine data set shows a strong anomalous signal (data collected at 1.5418 Å). Data were truncated at 2.0 Å for heavy-atom substructure solution for the iodine data set (where d″/sig falls below 1.2). (c) Dispersive differences between the MAD data sets. HREM, high-energy remote; INFL, inflection point (data-collection wavelengths can be found in Table 1). Radiation damage could have caused the apparent differences in d′ values; because one wavelength was completed before the next wavelength was measured, systematic errors could be introduced that could also lead to an overestimation of the dispersive signal.
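The noise level of 0.798 quoted in (b) can be rationalized with a short calculation. The sketch below assumes that d″/sig is the mean of |ΔF|/σ(ΔF) and that anomalous differences consisting purely of noise follow a Gaussian distribution with zero mean and standard deviation σ; under these assumptions the expected value is √(2/π). The function names and the simulation parameters are illustrative only.

```python
import math
import random


def expected_noise_ratio() -> float:
    """Mean of |x| for a zero-mean, unit-variance Gaussian: sqrt(2/pi)."""
    return math.sqrt(2.0 / math.pi)


def simulated_noise_ratio(n: int = 200_000, seed: int = 0) -> float:
    """Estimate the mean of |x| from n pseudo-random Gaussian samples."""
    rng = random.Random(seed)
    return sum(abs(rng.gauss(0.0, 1.0)) for _ in range(n)) / n


print(f"analytical: {expected_noise_ratio():.3f}")
print(f"simulated:  {simulated_noise_ratio():.3f}")
```

Values of d″/sig well above this level therefore indicate anomalous signal beyond what would be expected from noise alone, provided that the σ estimates are realistic.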

B3C and I3C binding sites
Four binding sites for B3C (Fig. 4) and three binding sites for I3C were observed in proteinase K. Two sites coincide for both derivatives; one of these is the main site with the highest occupancy shown in Fig. 5. The common occupancies for all three halogen atoms per site were refined with SHELXL to 0.42, 0.13, 0.10 and 0.09 for B3C (refinement against peak2 data set) and 0.72, 0.17 and 0.14 for I3C. Interestingly, the occupancy of the main site differs significantly although similar soaking conditions were used. The difference might be a consequence of different soaking times or crystal properties or may be attributed to the different chemical properties of the two compounds (containing either I or Br atoms).
The interactions of the small molecules in proteinase K are very similar to those previously reported for lysozyme, thaumatin and elastase. The three functional groups of the phasing tools form hydrogen bonds to side chains or the main chain of the protein. Interactions for the main site of B3C (corresponding to site 1 in Fig. 4) are shown in Fig. 5. One carboxylate group interacts with a serine residue and the amide H atom of the protein backbone. The other carboxylate group interacts with the amino group of an asparagine through hydrogen bonding. Interactions of the amino group with the protein or other interactions of the carboxylate groups are also observed.
Restraints for the refinement were derived from the small-molecule crystal structures of I3C and B3C (Beck, Herbst-Irmer et al., 2009). Model and restraints files for REFMAC (CIF format) and SHELX are available by email request from TB.

Radiation damage
The effect of irradiation is depicted in Fig. 6. Owing to changes in the experimental setup after the MAD experiment (the crystal-to-detector distance was changed from 160 to 200 mm and the crystal was re-centred after deicing), the occupancies from the refinement against peak2 (see above) deviate from the occupancies obtained from the radiation-damage experiments and are therefore not depicted in Fig. 6. A loss of about 15% in the occupancy of the bromine sites can be observed (Fig. 6a) after the MAD experiment and six consecutive radiation-damage experiments (burn-collect). The program RADDOSE (Murray et al., 2004) was used for dose calculations. Interestingly, not all bromine-carbon bonds suffered to the same extent on irradiation (Fig. 6b). Although the bromine-carbon bonds were cleaved, the MAD experiment was still successful. Further MAD experiments with B3C will show whether radiation damage can actually obstruct structure solution with B3C.
Preliminary results from radiation-damage experiments with I3C (results not shown here) indicate that I3C is at least as susceptible as B3C to radiation and probably even more. Similar results have been reported for halogenated nucleotides (Oliéric et al., 2009).

Figure 4
B3C in proteinase K: substructure density calculated with SHELXE (using F_A and α derived from the dispersive and anomalous differences) is contoured at 4σ and covers the whole asymmetric unit. Clear density for the four B3C molecules can be seen.

Figure 5
B3C (site 1) in proteinase K at the interface of two proteinase K molecules. Substructure density calculated with SHELXE (using F_A and α) is shown for B3C at 4σ. Hydrogen bonds are depicted as dashed lines and distances are given in Å. The two carboxylate groups interact with Asn270 and Ser45.

[...] of choice for in-house data collection since the I atoms give rise to a strong anomalous signal with Cu Kα radiation (SAD or SIRAS phasing). If diffraction is too weak for in-house phasing, B3C is suitable for MAD data collection at a synchrotron beamline, taking advantage of the additional phase information from data collection at different wavelengths.
The fixed geometrical arrangement of the heavy atoms facilitates structure solution since the triangles can readily be identified in the heavy-atom substructure. A new version of SHELXD that takes this information into account is currently being tested. B3C and I3C could serve as test candidates for further investigations of the anisotropy of anomalous scattering. The molecular arrangement of the anomalous scatterers provides additional information for refinement of the parameters that describe the anisotropy of the anomalous scattering.
The effect of radiation damage on B3C and its phasing capabilities has been investigated. Although the bromine-carbon bond in B3C suffers considerably from radiolysis, a MAD experiment could still be carried out successfully. In order to take advantage of the radiolysis of the anomalous scatterers, it is recommended that the inflection data set is collected at the end of the MAD experiment. The lower occupancy of the bromine sites arising from radiation damage caused by previous data collection results in an increase in the dispersive signal. The possibility of carrying out radiation-induced phasing (RIP; Ravelli et al., 2003) with the compounds presented here cannot be excluded, but was not further investigated within the scope of this study.
We find that B3C binds to the surface of the protein at the periphery. Interestingly, although several modes of hydrogen-bonding interactions have been observed for B3C and I3C, no aromatic interactions have been found in the crystals investigated to date. The bulky halogen atoms and the carboxylate groups arranged perpendicular to the benzene ring may hinder interactions with aromatic side chains.
In this study, the phasing tools were incorporated by soaking. There is a recent example in which B3C was incorporated into protein crystals by means of cocrystallization, resulting in a high occupancy of the B3C site (Beck, da Cunha et al., 2009). Currently, new phasing tools with different functional groups are being synthesized and tested. It has been noted previously that small molecules similar to B3C and I3C can promote crystal growth. A crystallization screen with a set of I3C/B3C-like molecules having different functional groups should shed some more light on this issue.
Regine Herbst-Irmer is thanked for advice regarding the refinement. Vincent Oliéric and the beam staff at SLS, Villigen, Switzerland are thanked for support during data collection. Clemens Vonrhein is thanked for providing PDB statistics regarding the importance of MAD/SAD versus MIR/SIR. Kathrin Meindl and Julian Henn are thanked for advice regarding the preparation of figures. Financial support from the International Centre for Diffraction Data (Ludo Frevel Scholarship Award 2009 to TB), the German Research Foundation (DFG International Research Training Group 1422 Biometals) and the German Academic Foundation is greatly appreciated.

Figure 6
Radiation damage for B3C site 1 in proteinase K. The absorbed dose for the crystal is given in MGy (1 Gy = 1 J kg⁻¹). (a) Refined occupancies (SHELXL) for the three Br atoms and the mean value (black) of site 1. The first data point is the occupancy after MAD data collection and a single burn and the subsequent five points show the change in occupancy with increasing dose (see §4.2). A drop in the refined occupancies of about 15% is observed after a dose of 12.9 MGy. (b) Electron density at site 1 contoured at 1σ. The serine residue is used as a reference since its density is not affected by irradiation. The density at the Br atoms clearly decreases with dose, but not all three Br atoms are affected equally.