Molecular placement of experimental electron density: a case study on UDP-galactopyranose mutase
(a) Centre for Biomolecular Sciences, The University, St Andrews KY16 9ST, Scotland, and (b) Joint Structural Biology Group, BP-220, ESRF, Grenoble CEDEX F-38043, France
*Correspondence e-mail: naismith@st-andrews.ac.uk
The structure of UDP-galactopyranose mutase, the enzyme responsible for the conversion of UDP-galactopyranose to UDP-galactofuranose, has been solved. The structure solution required the use of two crystal forms and a selenomethionine variant. Crystal form P21 was used to collect a complete MAD data set, a native data set and a single-wavelength non-isomorphous selenomethionine data set. A starting set of MAD phases was then improved by non-crystallographic averaging and cross-crystal averaging of all P21 data. The initial maps were of such low quality that transformation matrices between cells could not be determined. It was therefore assumed that although there were large changes in unit-cell parameters, the molecule occupied the same position in each cell. This starting assumption was allowed to refine during the averaging procedure and did so satisfactorily. Despite a visible increase in the quality of the map allowing some secondary-structural elements to be located, the overall structure could not be traced and refined. The rediscovery of the second crystal form, P212121, allowed the collection of a native data set to 2.4 Å. Molecular placement of electron density was used to determine the relationship between the two unit cells. In this study, only the already averaged P21 experimental density could be placed in the P212121 map. Less extensively density-modified maps did not give a clear solution. The study suggests even poor non-isomorphous data can be used to significantly improve map quality. The relationship between P21 and P212121 could then be used in a final round of cross-crystal averaging to generate phases. The resulting map was easily traced and the structure has been refined. The structure sheds important light on a novel mechanism and is also a therapeutic target in the treatment of tuberculosis.
Keywords: molecular replacement; tuberculosis; sugar nucleotide; contractase.
1. Introduction
We have previously reported the purification and crystallization of mutase from Escherichia coli (McMahon et al., 1999). Mutase initially crystallized in two different forms, only one of which was reproducible. These crystals (in the P21 space group) were small two-dimensional diamond-shaped crystals that proved very difficult to work with, as they diffracted poorly and were highly mosaic. This paper describes the rediscovery of the second crystal form of mutase (in the P212121 space group). We detail the approach required to solve the novel structure of E. coli mutase, utilizing the CCP4 programs (Collaborative Computational Project, Number 4, 1994) AMoRe (Navaza, 1994), DM and DM_Multi (Cowtan, 1994) to place electron density into the new cell and determine the phases.
Our laboratory has studied this protein because it is vital to the formation of the cell wall and/or the capsule in many pathogenic bacteria. The biosynthesis of the cell wall is an attractive target for structure-assisted design of novel antibacterial drugs, mainly because the cell wall contains a large number of compounds that are not found in mammals and other higher organisms. One such compound is galactofuranose (Galf). Galf is a component of the LPS in many Gram-negative bacteria and is found in the O antigens of many species, including Klebsiella pneumoniae (Koplin et al., 1997) and E. coli (Nassau et al., 1996). Galactofuranosyl residues are also an important component of the arabino-galactan complex that forms part of the mycolitic layer in the cell walls of mycobacteria (Weston et al., 1998). Galf is incorporated into cell walls from UDP-Galf, formed by the enzyme UDP-galactopyranose mutase (mutase) from UDP-galactopyranose (Galp) (Fig. 1; Weston et al., 1998). The chemistry involved in this ring-contraction mechanism is completely unprecedented and is the source of great interest.
2. Results
2.1. Protein purification
The purification of E. coli mutase has been described previously (McMahon et al., 1999). Production of the SeMet derivative protein was carried out by transfection of the Met− strain B834 (Novagen) with the plasmid containing the E. coli gene (Nassau et al., 1996); cell growth and induction of protein production were carried out in essentially the same manner as for the native protein. Purification of the SeMet protein was as described for the native protein, with 2 mM fresh DTT in all buffers and solutions.
2.2. Crystallization
2.2.1. P21 crystal form
Crystals in the P21 form were grown in 20% PEG 4K, 12% 2-propanol, 0.01 M L-cysteine, 100 mM HEPES pH 7.6 and 8 mg ml−1 protein as described previously (McMahon et al., 1999). SeMet protein crystals were grown in an identical manner to the wild-type protein, yielding smaller crystals than the wild type. A key point was that several seeding steps were required to obtain crystals of sufficient size for diffraction. This protocol was extremely reproducible.
2.2.2. Re-obtaining the P212121 crystal form
A solitary single crystal was obtained in the P212121 form and this was originally described in McMahon et al. (1999). We had been unable to reproduce this crystal. The initial crystal was grown from an early purification run of mutase and had impurities in it. Subsequent purifications resulted in homogeneous protein. Deliberately `contaminated' protein did not crystallize. However, by the streak-seeding of P21 crystals into drops containing 10–20% PEG 6K, 100 mM HEPES pH 7.0, 0.01 M L-cysteine, 2 mM DTT and 2–3 mg ml−1 protein, either pre-equilibrated overnight (for the 10% drops) or seeded immediately (for the 20% drops), we were successful in growing crystals in the P212121 form. These crystals grew overnight to 0.2 × 0.2 × 0.1 mm in size. Repeated seedings and macroseeding with single crystals generally failed to increase the size.
2.3. Data collection
2.3.1. P21 data
The SeMet-1 data set to 3.0 Å was collected as described previously (McMahon et al., 1999). The native and SeMet-2 data sets for the P21 crystal form were collected on beamline ID14-4 at the ESRF synchrotron in Grenoble. Although crystals were generally highly variable in diffraction and had high mosaic spread, several crystals of sufficient quality were found to collect the necessary data sets. Owing to the fragile nature of the crystals, we used a poor-quality larger crystal to scan in order to determine the three wavelengths for the MAD data set. The native data set consisted of 180 non-overlapping 1° images collected for 30 s to 2.7 Å and the SeMet data set was to 2.8 Å resolution for two wavelengths and 3.2 Å for the third (Table 1), each consisting of 180 1° non-overlapping images collected for 15 s. Data were processed using the programs MOSFLM (Leslie, 1992) and SCALA (Evans, 1997), treating the Bijvoet pairs as independent measurements for the SeMet data set.
‡Anomalous, completeness, redundancy and Rmerge calculations for MAD data sets treat F+ and F− as separate observations. §$R_{\mathrm{merge}} = \sum |I(h) - \langle I(h) \rangle| / \sum I(h)$, where $I(h)$ is the measured diffraction intensity and the summation includes all observations. ¶R is the R factor, $R = \sum \bigl| |F_o| - |F_c| \bigr| / \sum |F_o|$. Rfree is the R factor calculated using 5% of the data that were excluded from the refinement.
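As an illustration of the merging statistic named in the footnote above, the following is a minimal Python sketch, not the SCALA implementation. It assumes the observations have already been reduced to unique indices and scaled; the function name `r_merge` and the input layout are hypothetical. For MAD data, F+ and F− would simply be given distinct keys so that they are treated as separate observations.

```python
from collections import defaultdict


def r_merge(observations):
    """Return sum|I(h) - <I(h)>| / sum I(h) over all observations.

    observations: iterable of (hkl, intensity) pairs, where hkl is any
    hashable unique-reflection key (hypothetical layout for this sketch).
    Singly measured reflections add nothing to the numerator here; real
    programs may exclude them altogether.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(float(intensity))

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


if __name__ == "__main__":
    # Made-up intensities, for illustration only.
    demo = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
            ((0, 1, 0), 50.0), ((0, 1, 0), 50.0)]
    print(f"Rmerge = {r_merge(demo):.4f}")
```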
2.3.2. P212121 data
The data for the P212121 crystal form were collected on beamline ID14-1 at the ESRF synchrotron in Grenoble. A single crystal of about 0.15 × 0.15 × 0.075 mm in size had to be cryoprotected in a relatively complex manner. The crystal was moved (using a glass capillary ∼0.2 mm size) into a stabilizing buffer (25% PEG 4K, 15% 2-propanol, 100 mM HEPES pH 7.6) and after ∼1 min was moved rapidly through stabilizing buffer plus 7.5% glycerol into stabilizing buffer plus 15% glycerol, in which it was then frozen. The initial image of this crystal showed high mosaicity. The mosaicity was significantly reduced by a round of reannealing/flash-freezing by blocking the Cryostream for 5 s. A data set was collected to 2.4 Å resolution, consisting of 220 0.5° non-overlapping images collected for 30 s each. Data were processed using MOSFLM and SCALA (details are given in Table 1).
2.4. Structure solution
2.4.1. P21 solution
The program SOLVE 1.17 (Terwilliger & Berendzen, 1999) was used to locate the selenium sites in the P21 crystal form. Nine of a possible 14 sites were found (Z score = 20, FOM = 0.43). Eight of these sites were consistent with a twofold rotation suggested by the self-rotation map. The initial map was improved by solvent flattening with DM (Cowtan, 1994) with phase extension to 2.7 Å. As shown in Fig. 2(b), the map was very poor and no convincing evidence of secondary structure was found. Close analysis of the data indicated that the anomalous signal from the SeMet data was very weak. One indicator of the poor quality of the anomalous signal is the ratio Δano/σ(Δano), which in the case of the mutase data was significantly less than one for all three wavelengths (0.67, 0.62 and 0.58 for the peak, inflection and remote wavelengths, respectively). SOLVE also indicated the same problem. Some additional `improvement' of the phases was made by placing a 50 Å sphere as a mask centred at the centroid of the selenium positions. This mask was improved by manually rebuilding the map to fill areas of density through visual inspection.
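The anomalous-signal indicator quoted above can be sketched in a few lines of Python. This is a hedged illustration only: it assumes the ratio is taken as the mean absolute Bijvoet difference divided by the mean of its propagated standard uncertainty, which may differ from the exact definition used by SOLVE or by the authors. The function name and array arguments are hypothetical.

```python
import numpy as np


def anomalous_signal_ratio(f_plus, sig_plus, f_minus, sig_minus):
    """Return <|F+ - F-|> / <sigma(F+ - F-)> for paired Bijvoet mates.

    Assumes uncorrelated errors, so sigma(delta) = sqrt(sig+^2 + sig-^2).
    Values well below one suggest the differences are dominated by noise.
    """
    f_plus = np.asarray(f_plus, dtype=float)
    f_minus = np.asarray(f_minus, dtype=float)
    sig_plus = np.asarray(sig_plus, dtype=float)
    sig_minus = np.asarray(sig_minus, dtype=float)

    delta = np.abs(f_plus - f_minus)
    sigma_delta = np.sqrt(sig_plus ** 2 + sig_minus ** 2)
    return float(delta.mean() / sigma_delta.mean())


if __name__ == "__main__":
    # Made-up amplitudes and uncertainties, for illustration only.
    ratio = anomalous_signal_ratio([100.0, 200.0], [3.0, 4.0],
                                   [98.0, 203.0], [3.0, 4.0])
    print(f"dano/sigma(dano) = {ratio:.2f}")
```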
MLPHARE (Otwinowski, 1991) was used to generate SIROAS phases using the peak wavelength for SeMet-2 against the native data set. Consistent with the weak signal, the anomalous occupancy tended to zero when refined in MLPHARE. We then began cross-crystal averaging between the native (SIROAS phases), SeMet-2 (MAD phases) and SeMet-1 data. A monomer mask derived from the earlier 50 Å mask was used as the NCS mask. The translation matrices for these data sets were assumed to be unity and allowed to refine in the program. This is despite effectively complete non-isomorphism between SeMet-1 and the two other data sets; the average mean fractional isomorphous difference was 0.525. Attempts to improve the matrices prior to DMMULTI by density-placement calculations (Cowtan, 1994) gave worse results (as measured by correlation coefficients and visual examination of maps). The results from this DMMULTI run were promising, with a final correlation coefficient for monomer A between native and SeMet-1 of 0.935 and an FOM of 0.66 for 2.8 Å P21 native data. This map was then used for an initial model-building step. Clear secondary-structure elements (mostly α-helices with a few β-strands) were visible for the first time in the map. The map is shown in Fig. 2(c). A polyalanine model consisting of 300 residues was traced into this map, possible secondary-structure elements were assigned and this model was used to search DALI (Holm & Sanders, 1993) and DEJAVU (Kleywegt & Jones, 1997) for similar proteins; however, no solutions were found. An attempt was made to refine this model using CNS (Brunger et al., 1998), but the model did not refine.
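Two of the quantities used above to judge the averaging, the mean fractional isomorphous difference and the density correlation coefficient, can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the amplitudes are taken to be already on a common scale and matched reflection by reflection, the isomorphous difference is taken as Σ||Fa| − |Fb||/Σ|Fa| (one common convention, not necessarily the one used here), and the correlation is a plain Pearson coefficient over grid points inside an optional mask. Function names are hypothetical and do not correspond to DMMULTI internals.

```python
import numpy as np


def mean_fractional_isomorphous_difference(f_a, f_b):
    """Sum||Fa| - |Fb|| / Sum|Fa| for matched, pre-scaled amplitudes."""
    f_a = np.abs(np.asarray(f_a, dtype=float))
    f_b = np.abs(np.asarray(f_b, dtype=float))
    return float(np.abs(f_a - f_b).sum() / f_a.sum())


def map_correlation(rho_a, rho_b, mask=None):
    """Pearson correlation between two density grids of identical shape.

    mask: optional boolean array selecting the grid points to compare
    (for example the monomer mask); all points are used if omitted.
    """
    rho_a = np.asarray(rho_a, dtype=float)
    rho_b = np.asarray(rho_b, dtype=float)
    if mask is not None:
        selection = np.asarray(mask, dtype=bool)
        rho_a, rho_b = rho_a[selection], rho_b[selection]
    return float(np.corrcoef(rho_a.ravel(), rho_b.ravel())[0, 1])


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(mean_fractional_isomorphous_difference([100, 200, 300],
                                                 [110, 180, 330]))
    grid = np.arange(27, dtype=float).reshape(3, 3, 3)
    print(map_correlation(grid, 2.0 * grid + 1.0))
```

A value near 0.5 for the first quantity, as reported above, means the amplitude differences are of the same order as the amplitudes themselves, which is why the SeMet-1 data could only contribute through averaging rather than as an isomorphous derivative.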
2.4.2. P212121 solution
At the point where the P21 data failed to render a solution, the P212121 crystal form was rediscovered and the data were collected. The transformation matrix between the two crystal forms was generated by using molecular placement of the cross-crystal averaged electron density from P21. The P21 model was used to create a mask of the monomer inside the P21 cell (using MAPMASK; Collaborative Computational Project, Number 4, 1994). This mask was then applied to the `final' map from the same P21 solution (NCS, cross-crystal averaged) (using MAPMASK and OVERLAPMASK; Jones & Stuart, 1991). The region outside of the mask was flattened and the resulting density was transformed to a P1 symmetry cell (EXTEND; Collaborative Computational Project, Number 4, 1994). This map was then back-transformed using SFALL (Collaborative Computational Project, Number 4, 1994) to generate the structure factors, which were normalized to E values (ECALC; Collaborative Computational Project, Number 4, 1994) and ordered for use in AMoRe (Navaza, 1994) by CAD (Collaborative Computational Project, Number 4, 1994). The calculated data were sorted and tabled in AMoRe and then run against the P212121 data to find solutions for the monomer (see Table 2). The matrix between the two monomers in the P212121 cell was derived by applying the AMoRe monomer solution to the positions of the Se atoms determined from SOLVE. The relationship between the two sets of Se atoms determined the matrix between the monomers. We based this on the assumption that the position of the two monomers did not change between the two different crystal forms. We then used cross-crystal averaging (DMMULTI; Cowtan, 1994) using this transformation and four data sets [P21 native with the `final' phases, P21 SeMet-2 peak wavelength with phases from SOLVE/DM (SIROAS phases), SeMet-1 data set and the P212121 data set].
The rotation matrix for the different P21 data sets was assumed to be unity and all of the matrices were allowed to be refined by DMMULTI. Table 3 shows the starting and finishing correlation coefficients and FOMs from the final DMMULTI run. Fig. 3 shows the density of the experimental phases (prior to refining) at the position of refined FAD.
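The step in which the relationship between two sets of Se positions defines the matrix between the monomers can be illustrated with a least-squares superposition. The sketch below uses the standard Kabsch algorithm in Python; it is offered as one plausible way to carry out such a calculation and is not a description of the programs actually used. It assumes the Se sites of the two monomers have already been paired in the same order and expressed in orthogonal ångström coordinates; the function names and the coordinates in the demonstration are hypothetical.

```python
import numpy as np


def apply_operator(rotation, translation, coords):
    """Apply x' = R x + t to an (N, 3) array of orthogonal coordinates."""
    return np.asarray(coords, dtype=float) @ rotation.T + translation


def superpose(source, target):
    """Least-squares R, t such that target ~ R @ source + t (Kabsch).

    source, target: (N, 3) arrays of paired sites, N >= 3, not collinear.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    src_centroid = source.mean(axis=0)
    tgt_centroid = target.mean(axis=0)

    # Covariance of the centred coordinate sets.
    covariance = (source - src_centroid).T @ (target - tgt_centroid)
    u, _, vt = np.linalg.svd(covariance)

    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = tgt_centroid - rotation @ src_centroid
    return rotation, translation


if __name__ == "__main__":
    # Made-up Se coordinates for monomer A, related to monomer B by a
    # twofold about z plus a shift (illustrative values only).
    se_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0],
                     [0.0, 0.0, 3.0], [1.0, 1.0, 1.0]])
    twofold = np.diag([-1.0, -1.0, 1.0])
    shift = np.array([5.0, -2.0, 1.0])
    se_b = apply_operator(twofold, shift, se_a)
    rot, trans = superpose(se_a, se_b)
    print(np.round(rot, 3), np.round(trans, 3))
```

In the setting described above, `apply_operator` would first carry the P21 Se sites into the P212121 cell using the placement solution, and `superpose` would then be run on the two transformed sets to give a starting NCS operator for refinement during averaging.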
At this point, a polyalanine model was built into the experimental density map, allowing tracing of 321 residues (of 367); the FAD was also found. The resulting model and known NCS operators were then used for real-space averaging (RAVE). This allowed further identification of 20 residues and also allowed the joining of different segments of the structure. This model was run through DALI (Holm & Sanders, 1993) and DEJAVU (Kleywegt & Jones, 1997). This confirmed that the FAD-binding domain was similar to a number of other proteins; however, the second domain and linking region were novel. Refinement was begun using CNS (Brunger et al., 1998), starting with simulated annealing, rigid-body and positional/B-factor refinement. This allowed the location of the final 26 residues and joining of the last segments (Fig. 4). Further refinement proceeded smoothly and resulted in a model with final statistics as listed in Table 4.
‡Rfree is calculated on 5% of data excluded during refinement.
3. Discussion
The solution of the E. coli mutase structure has proven to be exceedingly difficult owing to the small size and high degree of mosaicity of the crystals. The P21 crystal form with native and SeMet derivative (MAD) data sets proved insufficient to solve the structure, most likely owing to the very poor anomalous signal in the MAD data sets. The poor anomalous signal was probably the result of the high mosaicity and concomitant poor diffraction. Attempts to obtain heavy-metal derivatives of the P21 crystal form were hampered by our inability to screen crystals in-house. Limited synchrotron time also prevented extensive screening and no useful information was obtained with the putative platinum derivative collected at Grenoble (McMahon et al., 1999).
The discovery that the P212121 crystals could be grown by seeding with the P21 crystal form proved instrumental in solving the structure. The initial P212121 crystal reported earlier was probably grown in a drop containing enough impurities to provide a nucleation site; however, subsequent attempts to return to this condition were unsuccessful owing to, ironically, greater attention to sample purity. Micro/streak seeding of P21 crystals into these drops was enough to allow the P212121 form to be grown. These crystals proved to be much better for diffraction purposes, despite a much more rigorous freezing protocol.
Owing to the fact that there is apparently a mathematical relationship between the two cells (as in Table 1: aP212121 = bP21, bP212121 = cP21, cP212121 = 2aP21), we began deriving the rotation matrix between the two crystal forms mathematically. The rapid success of using density meant that we never completed the mathematical transformation.
During a post-mortem analysis we tried to place electron density from the initial P21 MAD data directly obtained from SOLVE. No solution was found, although careful examination of the rotation-function solutions did show one peak with approximately the correct rotation/translation solution, apart from a quite different z-axis translation (solution = 42.07, 87.17, 332.57, 0.21, 0.18, 0.38). The same test of the density-modified map (Fig. 2b) gave similarly poor results. Experience in the laboratories suggests that a good indicator of a correct solution can be found by viewing the min/max density of the rotation map. The ratio of (max. density/r.m.s. deviation):(min. density/r.m.s. deviation) gives a fairly reliable indicator of the height of peaks above the noise level. In our example, the successful solution gave a ratio of 8.3:(−3.9) compared with 4.9:(−0.9) and 5.0:(−3.9) for the results from SOLVE and DM (Cowtan, 1994), respectively.
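The min/max indicator described above is simple to compute from a gridded rotation map. The following Python sketch takes the description literally, dividing the maximum and minimum map values by the r.m.s. deviation of the map about its mean; whether the original calculation subtracted the mean from the extrema is not stated, so this is an assumption. The function name and the demonstration values are hypothetical.

```python
import numpy as np


def peak_to_noise(rotation_map):
    """Return (max/rms, min/rms) for a rotation-function map.

    rms is the r.m.s. deviation of all grid values about their mean.
    A large positive first value paired with a modest negative second
    value suggests a peak standing well above the noise.
    """
    values = np.asarray(rotation_map, dtype=float)
    rms = values.std()
    return float(values.max() / rms), float(values.min() / rms)


if __name__ == "__main__":
    # Made-up map values, for illustration only.
    high, low = peak_to_noise([3.0, -1.0, -1.0, -1.0])
    print(f"ratio = {high:.2f}:({low:.2f})")
```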
Our results confirm the adage that while correlation coefficients and FOM are useful guides toward determining whether the phases derived are correct, visual examination of maps and the appearance of such easily discernible features as α-helices are better guides towards structure solution. We have also demonstrated that placement of electron density and cross-crystal averaging are very powerful tools for solving structures using very poor initial phases. The post-mortem analysis shows that only by extracting the maximum phase information from the SeMet-2 data (by manual adjustment of the mask and SIROAS phases from the peak wavelength) and cross-crystal averaging against the non-isomorphous SeMet-1 data set (which was of very poor quality) could good enough density be obtained for placement in the P212121 cell.
Footnotes
‡Current address: Department of Chemistry, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0314, USA.
Acknowledgements
We thank Garry Taylor for assistance with the placement of density, Kevin Cowtan for help with DM/DM_Multi and Eleanor Dodson for many helpful suggestions and comments on solving the structure using many different approaches.
References
Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J.-S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T. & Warren, G. L. (1998). Acta Cryst. D54, 905–921.
Collaborative Computational Project, Number 4 (1994). Acta Cryst. D50, 760–763.
Cowtan, K. (1994). Jnt CCP4/ESF–EACBM Newsl. Protein Crystallogr. 31, 34–38.
Engh, R. A. & Huber, R. (1991). Acta Cryst. A47, 392–400.
Esnouf, R. M. (1997). J. Mol. Graph. Model. 15, 132–134.
Evans, P. R. (1997). Jnt CCP4/ESF–EACBM Newsl. Protein Crystallogr. 33, 22–24.
Holm, L. & Sanders, C. (1993). J. Mol. Biol. 233, 123–138.
Jones, T. A., Zou, J.-Y., Cowan, S. W. & Kjeldgaard, M. (1991). Acta Cryst. A47, 110–119.
Jones, Y. & Stuart, D. I. (1991). Proceedings of the CCP4 Study Weekend. Isomorphous Replacement and Anomalous Scattering, edited by W. Wolf, P. R. Evans & A. G. W. Leslie, pp. 39–48. Warrington: Daresbury Laboratory.
Kleywegt, G. J. & Jones, T. A. (1997). Methods Enzymol. 277, 525–545.
Koplin, R., Brisson, J.-R. & Whitfield, C. (1997). J. Biol. Chem. 272, 4121–4128.
Leslie, A. G. W. (1992). Jnt CCP4/ESF–EACBM Newsl. Protein Crystallogr. 26.
McMahon, S. A., Leonard, G. L., Buchanan, L. V., Giraud, M.-F. & Naismith, J. H. (1999). Acta Cryst. D55, 399–402.
Nassau, P. M., Martin, S. L., Brown, R. E., Weston, A., Monsey, D., McNeil, M. R. & Duncan, K. (1996). J. Bacteriol. 178, 1047–1052.
Navaza, J. (1994). Acta Cryst. A50, 157–163.
Otwinowski, Z. (1991). Proceedings of the CCP4 Study Weekend. Isomorphous Replacement and Anomalous Scattering, edited by W. Wolf, P. R. Evans & A. G. W. Leslie, pp. 80–86. Warrington: Daresbury Laboratory.
Terwilliger, T. C. & Berendzen, J. (1999). Acta Cryst. D55, 849–861.
Weston, A., Stern, R. J., Lee, R. E., Nassau, P. M., Monsey, D., Martin, S. L., Scherman, M. S., Besra, G. S., Duncan, K. & McNeil, M. R. (1998). Tubercle Lung Dis. 78, 123–131.
© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.