Cloning, purification and preliminary crystallographic analysis of the Bacillus subtilis GTPase YphC–GDP complex

Crystals of a selenomethionine-incorporated YphC–GDP complex have been grown using the hanging-drop vapour-diffusion method and polyethylene glycol as a precipitating agent.


Introduction
Members of the GTPase superfamily are critical components of many signalling pathways, in which the cycling between 'on' (GTP-bound), 'off' (GDP-bound) and apo states plays an important role in regulatory processes including cell division, cell cycling, signal transduction, mRNA translation and hormone signalling (Bourne et al., 1991; Vetter & Wittinghofer, 2001). Despite low sequence identity across the GTPase superfamily, their structures share a common fold comprising a central β-sheet flanked by α-helices, with the five regions (G1-G5) that show sequence similarity between the different GTPase subfamilies being involved in nucleotide binding (Leipe et al., 2002). In different GTPases the cycling between the GTP-bound 'on' state and the GDP-bound 'off' state following GTP hydrolysis has been seen to result in conformational changes in two distinct regions termed the switch I and switch II regions, which include motifs G2 and G3, respectively (Bourne et al., 1991; Knudsen et al., 2001).
Genome sequence data have suggested that bacteria possess 11 universally conserved GTPases, many of which have been proposed to interact with the ribosome (Caldon et al., 2001). Amongst these, the members of the EngA family are unique in that they contain two GTPase domains joined by a variable-length acidic linker (Caldon et al., 2001; Hwang & Inouye, 2001). EngA-family proteins are thought to act as cellular messengers by forming interactions with the ribosome, with the overexpression of EngA in Escherichia coli restoring the growth of null mutants of an rRNA methyltransferase (RrmJ) which modifies the 23S rRNA in intact 50S ribosomal subunits (Tan et al., 2002). This family appears to be restricted to bacteria and a number of important parasites such as Plasmodium and Eimeria, and to be absent from man, yeast and fungi. Studies in Bacillus subtilis and Neisseria have shown that the EngA homologues in these organisms are essential for bacterial survival, with knockouts in the former displaying an increase in cell length, nucleoid condensation and abnormally curved cell shape (Morimoto et al., 2002; Mehr et al., 2000). The essentiality of EngA suggests that it might form a promising target for antimicrobial agents.
Structural studies of an EngA homologue in Thermotoga maritima (TmDer) have revealed a domain architecture in which the two GTPase domains flank a C-terminal domain which adopts a fold reminiscent of an RNA-binding KH-domain (Robinson et al., 2002). In addition, although not added during crystallization, a GDP nucleotide remained bound to the second GTPase domain, whilst in the first GTPase domain two phosphates could be observed whose positions are approximately equivalent to those expected for the β and γ phosphates of GTP. This has led to the suggestion that in the structure of TmDer the first GTPase domain mimics a GTP-bound form of the enzyme (Robinson et al., 2002). In order to better understand how this important family of GTPases carries out its function when cycling between GTP- and GDP-bound states, we have cloned, overexpressed, purified and crystallized the B. subtilis EngA homologue YphC in complex with GDP and collected a MAD data set to 2.5 Å resolution.

Cloning and overexpression
The coding sequence of yphC was amplified from genomic DNA of the 168 strain of B. subtilis using Pwo DNA polymerase (Roche) and the primers ATGGGTAAACCTGTCGTAGCC (forward) and TTATTTTCTAGCTCTTGCAAATATTTTG (reverse). The resulting yphC gene was ligated into a pETBlue-1 vector using an AccepTor vector kit (Novagen), creating an expression vector pMAT1 which was subsequently extracted and transformed into Escherichia coli Tuner (DE3) (Novagen). In order to produce SeMet-incorporated YphC protein, the transformed E. coli Tuner was grown in LB minimum medium containing 10.5 g l⁻¹ K₂HPO₄, 1 g l⁻¹ (NH₄)₂PO₄, 4.5 g l⁻¹ KH₂PO₄, 0.5 g l⁻¹ trisodium citrate·2H₂O, 5 g l⁻¹ glycerol, 0.5 g l⁻¹ adenine, guanosine, thymine and uracil, 1 ml l⁻¹ MgSO₄·7H₂O, 4 mg l⁻¹ thiamine, 40 mg l⁻¹ selenomethionine and 100 mg l⁻¹ of the amino acids Lys, Phe and Thr in addition to 50 mg l⁻¹ Ile, Leu and Val. Growth was carried out at 310 K with vigorous aeration until an OD₆₀₀ of 0.6 was reached, at which point overexpression of YphC was induced by the addition of 1 mM IPTG; growth then continued at 310 K for 5 h, after which the cells were harvested by centrifugation at 4000g for 20 min at 277 K.
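As a quick sanity check on the primer pair quoted above, the following minimal Python sketch confirms that the forward primer begins with a start codon and that the reverse primer, once reverse-complemented onto the coding strand, ends with a stop codon. The function names are illustrative only, and the reverse primer is written here without the line-break hyphen that appears in the extracted text.

```python
# Primer sequences as quoted in the text (reverse primer written without the
# line-break hyphen). Function names are illustrative, not from the paper.
FORWARD = "ATGGGTAAACCTGTCGTAGCC"
REVERSE = "TTATTTTCTAGCTCTTGCAAATATTTTG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_summary(forward: str, reverse: str) -> dict:
    """Collect simple checks on a forward/reverse primer pair."""
    coding_end = reverse_complement(reverse)  # 3' end of the coding strand
    return {
        "forward_length": len(forward),
        "reverse_length": len(reverse),
        "forward_starts_with_ATG": forward.startswith("ATG"),
        "coding_strand_3prime_end": coding_end,
        "coding_strand_ends_with_stop": coding_end[-3:] in STOP_CODONS,
    }


if __name__ == "__main__":
    for key, value in primer_summary(FORWARD, REVERSE).items():
        print(f"{key}: {value}")
```

The check only inspects the primer ends; it says nothing about annealing temperature or specificity, which would need a dedicated primer-design tool.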

Purification
Cells containing the overexpressed SeMet-incorporated YphC were disrupted by sonication in buffer A (50 mM Tris-HCl pH 8.0) and the cell debris was removed by centrifugation at 43 000g for 10 min. Analysis of the soluble fraction by SDS-PAGE showed a large overexpression band corresponding to the expected molecular weight of YphC of approximately 48 kDa. The supernatant was collected and loaded onto a DEAE-Sepharose Fast Flow column (Amersham Biosciences) and YphC was eluted with a linear gradient of 0-0.5 M NaCl in buffer A. The fractions containing the highest concentration of YphC [estimated by the method of Bradford (1976) using the Bio-Rad protein-assay reagent] were combined and 4.0 M (NH₄)₂SO₄ was added to a final concentration of 1.7 M, at which concentration YphC is soluble. The precipitated protein was then removed by centrifugation and the supernatant was loaded onto a Phenyl-ToyoPearl 650S (Tosoh) column and eluted with a reverse gradient of (NH₄)₂SO₄ from 1.2 to 0 M in buffer A. The sample was subsequently subjected to gel filtration using a Hi-Load Superdex 200 column (Amersham Biosciences) equilibrated with 0.5 M NaCl in buffer A and eluted with the same buffer. Peak fractions containing YphC were concentrated to 15 mg ml⁻¹ in a VivaSpin 10 000 Da molecular-weight cutoff concentrator and the buffer exchanged to 10 mM Tris-HCl pH 8.0, which contained no antioxidants. The purity of the SeMet protein was checked by SDS-PAGE and estimated to be over 95%.
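The ammonium sulfate step above amounts to a simple dilution calculation. The sketch below estimates how much 4.0 M stock must be added to a pooled sample to reach 1.7 M, assuming the pooled fractions initially contain no ammonium sulfate and that volumes are additive. The 100 ml sample volume is a hypothetical value chosen for illustration; the pooled volume is not stated in the text.

```python
def stock_volume_to_add(sample_volume_ml: float,
                        stock_molar: float,
                        target_molar: float) -> float:
    """Volume of stock (ml) to add so the mixture reaches target_molar.

    Assumes the sample starts with none of the solute and that volumes
    are additive: stock * v / (sample + v) = target.
    """
    if not 0 <= target_molar < stock_molar:
        raise ValueError("target must be non-negative and below the stock concentration")
    return sample_volume_ml * target_molar / (stock_molar - target_molar)


if __name__ == "__main__":
    sample_ml = 100.0  # hypothetical pooled-fraction volume, not from the paper
    added = stock_volume_to_add(sample_ml, stock_molar=4.0, target_molar=1.7)
    final = 4.0 * added / (sample_ml + added)
    print(f"add {added:.1f} ml of 4.0 M stock; final concentration {final:.2f} M")
```

Real ammonium sulfate solutions are not perfectly volume-additive, so the result should be read as an estimate rather than an exact recipe.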

Crystallization and preliminary X-ray analysis
Crystals of SeMet-incorporated YphC were grown using the hanging-drop vapour-diffusion technique, mixing 2.0 µl protein solution (15 mg ml⁻¹ YphC in 10 mM Tris-HCl pH 8.0, 5 mM GDP and 5 mM MgCl₂) with 2.0 µl reservoir solution at 290 K. Initial screening of crystallization conditions was conducted using Crystal Screen 1, Crystal Screen 2 and the PEG/Ion Screen (Hampton Research); the most promising hit was produced in PEG/Ion Screen solution No. 1 [0.2 M sodium fluoride and 20%(w/v) PEG 3350]. This condition was subsequently refined to achieve an optimal reservoir solution of 0.4 M sodium fluoride and 14%(w/v) PEG 3350, producing crystals which took approximately one week to grow. For data collection, crystals of the YphC–GDP complex were flash-cooled to 100 K in a cryoprotectant solution consisting of 0.4 M sodium fluoride, 16%(w/v) PEG 3350 and 20%(w/v) glycerol.
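Because the drop is made from equal volumes of protein and reservoir solution, every component starts at half its stock concentration before vapour diffusion begins. The following sketch works this out for the optimized condition; the dictionary keys and the function name are illustrative, and only the equal-volume ratio and the stock concentrations are taken from the text.

```python
def initial_drop_concentrations(protein_solution: dict,
                                reservoir_solution: dict,
                                v_protein: float,
                                v_reservoir: float) -> dict:
    """Concentrations immediately after mixing, before any equilibration.

    Each dict maps a component label (with its unit) to its concentration
    in that stock; volumes only need to share the same unit.
    """
    total = v_protein + v_reservoir
    drop = {}
    for name, conc in protein_solution.items():
        drop[name] = drop.get(name, 0.0) + conc * v_protein / total
    for name, conc in reservoir_solution.items():
        drop[name] = drop.get(name, 0.0) + conc * v_reservoir / total
    return drop


if __name__ == "__main__":
    protein = {"YphC (mg/ml)": 15.0, "Tris-HCl (mM)": 10.0,
               "GDP (mM)": 5.0, "MgCl2 (mM)": 5.0}
    reservoir = {"NaF (M)": 0.4, "PEG 3350 (% w/v)": 14.0}
    drop = initial_drop_concentrations(protein, reservoir, 2.0, 2.0)
    for name, conc in drop.items():
        print(f"{name}: {conc:g}")
```

As water leaves the drop during equilibration against the reservoir, these concentrations rise towards those of the stocks, which is what drives the protein towards supersaturation.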
A multiple-wavelength anomalous diffraction (MAD) experiment was carried out on a single crystal of selenomethionine-incorporated YphC–GDP complex on station 10.1 at the Daresbury Synchrotron Radiation Source. The three wavelengths for the MAD experiment (peak, inflection and remote) were chosen near the selenium absorption edge based on the fluorescence absorption spectrum obtained from a frozen crystal at 100 K. For each wavelength, a total of 180 images were collected to 2.5 Å using a 1° oscillation width on a MAR CCD 165 detector (Fig. 1).
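The data-collection strategy can be summarized with a little arithmetic. The sketch below computes the rotation range covered per wavelength and the total number of images, assuming an oscillation width of 1° per image, and converts a photon energy to a wavelength. The selenium K-edge energy of about 12.658 keV is a textbook value used here as an assumption; the actual peak, inflection and remote wavelengths used in the experiment are not given in the text.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, keV * Angstrom


def rotation_range_deg(n_images: int, oscillation_deg: float) -> float:
    """Total crystal rotation covered by a contiguous sweep of images."""
    return n_images * oscillation_deg


def wavelength_angstrom(energy_kev: float) -> float:
    """Convert photon energy in keV to wavelength in Angstrom."""
    if energy_kev <= 0:
        raise ValueError("energy must be positive")
    return HC_KEV_ANGSTROM / energy_kev


if __name__ == "__main__":
    n_wavelengths = 3           # peak, inflection, remote
    images_per_wavelength = 180
    oscillation = 1.0           # degrees per image (assumed from the text)

    sweep = rotation_range_deg(images_per_wavelength, oscillation)
    print(f"rotation per wavelength: {sweep:.0f} degrees")
    print(f"total images: {n_wavelengths * images_per_wavelength}")

    se_k_edge_kev = 12.658      # textbook Se K-edge energy, an assumption here
    print(f"Se K-edge wavelength: {wavelength_angstrom(se_k_edge_kev):.4f} Angstrom")
```

In practice the edge position shifts slightly with the chemical environment of the selenium, which is why the wavelengths are chosen from a fluorescence scan of the crystal itself.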

Results and discussion
the asymmetric unit giving a V_M value of 2.3 Å³ Da⁻¹, which is within the range observed by Matthews for protein crystals (Matthews, 1977). The data were subsequently processed using the MOSFLM (Leslie, 1992) and SCALA (Collaborative Computational Project, Number 4, 1994) packages and analysis of the pattern of systematic absences is consistent with the correct space group being P2₁2₁2₁. Data-collection and processing statistics are presented in Table 1. Given the quality of the derivative data and in order to minimize any potential bias, we have chosen to proceed with the structure determination using the MAD method. Ultimately, it is hoped that a complete solution of the YphC–GDP complex structure will lead to a better understanding of the EngA family and reveal conformational changes between the different nucleotide-bound forms of this important family of GTPases.
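The Matthews coefficient quoted above is the unit-cell volume divided by the total protein mass in the cell, V_M = V_cell / (Z × n × M), where Z is the number of asymmetric units per cell (4 for P2₁2₁2₁), n is the number of molecules per asymmetric unit and M is the molecular weight in Da. A minimal sketch follows, together with the commonly used solvent-content estimate 1 − 1.23/V_M. The cell edges in the example are an arbitrary placeholder and are not the YphC cell, whose dimensions are not reproduced in this section.

```python
def matthews_coefficient(a: float, b: float, c: float,
                         z: int, n_mol: int, mw_da: float) -> float:
    """V_M in cubic Angstrom per Da for an orthorhombic cell (V = a * b * c)."""
    if min(a, b, c, z, n_mol, mw_da) <= 0:
        raise ValueError("all arguments must be positive")
    return (a * b * c) / (z * n_mol * mw_da)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction from V_M using the 1.23 rule of thumb."""
    return 1.0 - 1.23 / v_m


if __name__ == "__main__":
    # Values taken from the text: V_M = 2.3, molecular weight about 48 kDa.
    v_m_reported = 2.3
    mw = 48_000.0
    print(f"solvent content: {solvent_fraction(v_m_reported):.1%}")
    print(f"crystal volume per molecule: {v_m_reported * mw:.0f} cubic Angstrom")

    # Arbitrary placeholder cell (NOT the YphC cell), only to show the call.
    toy = matthews_coefficient(100.0, 100.0, 100.0, z=4, n_mol=1, mw_da=mw)
    print(f"toy-cell V_M: {toy:.2f}")
```

With the reported V_M of 2.3 Å³ Da⁻¹ this gives a solvent content of roughly 47%, a typical value for protein crystals.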