Received 14 April 2004
Cloning, expression, purification, crystallization and preliminary X-ray characterization of the full-length single-stranded DNA-binding protein from the hyperthermophilic bacterium Aquifex aeolicus
David J. Clarke,a Christopher G. Northey,b Lynsey A. Mack,b Iain W. McNae,b Dmitriy Alexeev,b Lindsay Sawyerb and Dominic J. Campopianoa*
Single-stranded DNA-binding (SSB) proteins stabilize single-stranded DNA, which is exposed by separation of the duplex during DNA replication, recombination and repair. The SSB protein from the hyperthermophile Aquifex aeolicus has been overexpressed in Escherichia coli, purified and characterized and crystals of the full-length protein (147 amino acids; Mr 17 131.20) have been grown by vapour diffusion from ammonium sulfate pH 7.5 in both the absence and presence of ssDNA [dT(pT)68]. All crystals diffract to around 2.9 Å resolution and those without bound DNA (native) belong to space group P21, with two tetramers in the asymmetric unit and unit-cell parameters a = 80.97, b = 73.40, c = 109.76 Å, β = 95.11°. Crystals containing DNA have unit-cell parameters a = 108.65, b = 108.51, c = 113.24 Å and could belong to three closely related space groups (I222, I212121 or I41) with one tetramer in the asymmetric unit. Electrospray mass spectrometry of the crystals confirmed that the protein was intact. Molecular replacement with a truncated E. coli SSB structure has revealed the position of the molecules in the unit cell and refinement of both native and DNA-bound forms is under way.
Single-stranded DNA-binding (SSB) proteins have been shown to play an essential role in many aspects of DNA metabolism (Chase & Williams, 1986). They preferentially bind and protect vulnerable single-stranded DNA (ssDNA), which is formed transiently during DNA replication, recombination and repair. SSB proteins are characterized by the presence of a conserved OB-fold motif (oligonucleotide/oligosaccharide/oligopeptide-binding fold), which is typically 100 amino acids in length (Murzin, 1993).
SSB proteins can be divided into two distinct groups based on their quaternary structure. Eukaryotic SSB proteins, known as replication protein A (RPA), are exemplified by human RPA, which has a heterotrimeric structure comprising three subunits: RPA70, RPA32 and RPA14 (of molecular weights 70, 32 and 14 kDa, respectively; Wold, 1997). The RPA complex contains six OB folds, four of which bind DNA: three on RPA70 and one on RPA32 (Bastin-Shanower & Brill, 2001). An N-terminal domain on RPA70 has also been shown to be involved in protein-protein interactions (Jacobs et al., 1999). In contrast, bacterial SSB proteins form homotetramers, with each subunit containing one DNA-binding domain (Raghunathan et al., 1997). These DNA-binding domains are located at the N-termini of the individual SSB protein subunits and form the characteristic OB folds. While the N-terminus of each subunit binds ssDNA and contains the homotetramer interface, it is thought that the C-terminal domain is involved in interactions with other protein components of DNA metabolism. The C-terminal domain of bacterial SSB proteins exhibits low sequence homology across species, with the exception of the terminal six residues, which form a highly conserved negatively charged DDDIPF motif. This motif is essential for the function of Escherichia coli SSB protein in vivo (Curth et al., 1996) and has been shown to interact directly with the 3'-5' ssDNA-degrading exonuclease I (Genschel et al., 2000). The tails of both the E. coli and the Sulfolobus solfataricus SSB proteins are not involved in DNA binding, but are thought to play roles in mediating protein-protein interactions with other subunits within the DNA polymerase complex (Bruck et al., 2002). There is also evidence that a mutually exclusive interaction between the C-terminal domain of E. coli SSB protein, DNA polymerase and primase is utilized as a three-point switch to initiate the exchange of places of these two proteins on DNA (Yuzhakov et al., 1999). Furthermore, a recent report suggests that the interaction between the DNA polymerase and SSB from RB69 (a T4-like bacteriophage) results in an increase in the overall affinity of the SSB protein for ssDNA (Sun & Shamoo, 2003). Finally, Gulbis and coworkers have recently proposed a positively charged patch on the χ subunit of Pol III holoenzyme which may interact with the C-terminal acidic region of SSB (Gulbis et al., 2004).
The DNA-binding domains and OB folds from SSB proteins have been well studied and structural information is available from a variety of organisms spanning all three kingdoms of life (Bochkareva et al., 2001; Bochkarev et al., 1997, 1999; Webster et al., 1997; Raghunathan et al., 2000; Yang et al., 1997; Kerr et al., 2003). However, crystallization of a full-length bacterial protein has proved problematic and most studies have used proteolytic N-terminal fragments of SSB proteins; consequently, little is known about the structure of the C-terminal domain. Efforts to crystallize the intact E. coli SSB tetramer resulted in autolysis during crystallization and the structure determination omitted 30 amino acids from the C-terminus (Matsumoto et al., 2000). It has been postulated that the C-terminal domain of the E. coli SSB is cleaved to decrease unfavourable interactions for crystallization which result from its high glutamine content. The most recent crystallographic study by Kerr and coworkers presents a 1.2 Å structure of a trypsin-cleaved fragment of the SSB from the crenarchaeote S. solfataricus, missing some 28 amino-acid residues from the C-terminal tail (Fig. 1).
Figure 1
Sequence alignment of the SSB proteins from A. aeolicus (SSB Aae), E. coli (SSB Eco) and S. solfataricus (SSB Sso). The asterisk (*) under residue L112 of E. coli SSB (SSB Eco) denotes the limit of the resolution of the chymotryptic fragment (SSBc, residues 1-135, represented by a $ sign under W135) used to determine the structure of the native and DNA-bound SSB complex described by Raghunathan et al. (1997, 2000) (PDB codes 1kaw and 1eyg, respectively). The hash symbol (#) under residue R119 indicates the C-terminus of the tryptic fragment of S. solfataricus SSB (SSB Sso) crystallized by Kerr et al. (2003) (residues 1-119; PDB code 1o7i). The addition sign (+) under N145 of SSB Eco denotes the limit of structure determination from the autolytic fragment crystallized by Matsumoto et al. (2000) (residues 1-145; PDB code 1qvc). Notice the glutamate-rich C-terminus of SSB Aae in comparison to the glutamine-rich tail of SSB Eco.
Extensive investigations of the binding mode for ssDNA to SSB have revealed a complex series of protein-protein and protein-DNA interactions (Lohman & Ferrari, 1994; Raghunathan et al., 2000). Different binding modes [referred to as (SSB)35 and (SSB)65] and cooperativities have been observed that are dependent upon oligonucleotide length, salt and protein concentration. For example, E. coli (SSB)35 binds about 35 nucleotides and in this case only two of the four subunits in the tetramer bind to the DNA. In contrast, in the (SSB)65 binding mode all four subunits of the tetramer are involved in DNA binding, although it appears that there is a 'limited' type of intertetramer cooperativity. Thus, using dT(pT)69 various combinations with ratios of one SSB subunit binding to one or two DNA oligomers or two SSB subunits to one DNA are possible depending on the conditions. However, the use of a high (>0.2 M NaCl) salt concentration appears to favour formation of one SSB tetramer binding to one dT(pT)69.
To investigate the structure of a full-length SSB, we report here the cloning, overexpression, crystallization and initial data collection for crystals of the SSB protein (147 amino acids; Mr 17 131.20; Fig. 1) from the hyperthermophilic bacterium A. aeolicus (SSB Aae) in both the free and the DNA-bound forms. In contrast to E. coli SSB, primary structure analysis of the SSB Aae reveals a polyglutamic acid region at its C-terminus and an EDEIPF motif (Fig. 1). We hope that the crystal structure of the A. aeolicus SSB protein will facilitate the study of the complex protein-protein interactions mediated through the C-terminus of bacterial SSB proteins and the data may also reveal the structural basis for the increased stability of this SSB at elevated temperatures. Further, the DNA-bound structure may reveal details of the (SSB)65 binding mode.
The ssb gene was identified from the complete A. aeolicus genome sequence (Deckert et al., 1998), amplified by polymerase chain reaction and the resulting 451 bp fragment was subsequently inserted into the pET-23a expression vector (Novagen) using NdeI/HindIII restriction sites. The fidelity of the construct, pET23a/ssb, was verified by DNA sequencing before transformation of E. coli BL21(DE3)/pLysS (Novagen). Cells were grown in 2YT growth media supplemented with ampicillin (100 µg ml-1) in shake flasks at 310 K and 250 rev min-1 to OD600 = 0.8 prior to induction with 1.0 mM isopropyl-1-thio-β-D-galactopyranoside (IPTG). After a further 4 h of growth, cells were collected by centrifugation and stored at 253 K.
The cell pellet was resuspended in 10 ml buffer A (50 mM Tris-HCl pH 7.0) per gram of cell paste, disrupted by sonication (15 pulses of 30 s at 30 s intervals) at 277 K and the cell lysate was centrifuged (30 min, 35 000g). The supernatant was filtered (0.45 µm) before being applied to a 26/10 Q-Sepharose anion-exchange column (Amersham Biosciences) and eluted with a linear NaCl gradient (0-1.0 M) in the same buffer. SDS-PAGE analysis revealed that SSB eluted between 460 and 510 mM NaCl. It is interesting to note that the protein migrates with an apparent weight of 23 kDa on SDS-PAGE compared with its theoretical weight of 17.1 kDa, an anomaly which could be a consequence of the high number of acidic residues in the C-terminus, also observed by Bruck et al. (2002) (Fig. 1 and inset in Fig. 2). Fractions containing SSB were pooled and concentrated to 5 ml by ultrafiltration (Vivaspin) using a 10 kDa cutoff membrane (Vivascience).
Figure 2
Analysis of the crystallized DNA-bound SSB protein by ESI-MS. The main figure shows the deconvoluted mass of 17 127.3 (obtained using Transform software; Micromass UK), consistent with the theoretical value of 17 131.2; right inset, ion envelope of crystallized SSB protein; left inset, SDS-PAGE analysis of crystallized SSB protein results in a single band running at an anomalous weight of 23 kDa.
The concentrated sample was further purified by size-exclusion chromatography using a previously calibrated Hiload 26/60 Superdex 200 column (Amersham Biosciences) equilibrated and eluted in buffer C (50 mM Tris-HCl, 100 mM NaCl pH 7.0). The SSB protein had a retention time similar to that of bovine serum albumin (66 kDa), suggesting that the 17.1 kDa protein forms a stable homotetramer in solution: a property characteristic of bacterial SSB proteins. The purity of the SSB protein preparation was judged to be >98% by SDS-PAGE and this optimized protocol consistently yielded 10-15 mg of purified protein per litre of cell culture. The purified protein was a single species of 17 131.0 ± 1.50 Da by ESI mass spectrometry, which is in agreement with the theoretical weight of full-length A. aeolicus SSB protein (17 131.2 Da).
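The oligomeric-state inference above can be checked with a short calculation. The following is a minimal sketch, assuming that elution position tracks molecular mass alone (in practice it also depends on molecular shape); the function name `closest_oligomer` is hypothetical and only the two masses are taken from the text.

```python
# Masses quoted in the text (Da).
MONOMER_MR = 17131.2   # theoretical mass of full-length A. aeolicus SSB
BSA_MR = 66000.0       # bovine serum albumin, the co-eluting calibration standard


def closest_oligomer(apparent_mr, monomer_mr, max_subunits=8):
    """Return the subunit count n for which n * monomer_mr is nearest to apparent_mr.

    Illustrative helper (not from the paper): it treats the apparent mass
    from size-exclusion chromatography as a pure function of molecular mass.
    """
    return min(
        range(1, max_subunits + 1),
        key=lambda n: abs(n * monomer_mr - apparent_mr),
    )


n_subunits = closest_oligomer(BSA_MR, MONOMER_MR)
print(f"closest oligomer: {n_subunits} subunits, "
      f"{n_subunits * MONOMER_MR / 1000:.1f} kDa")
```

A tetramer of 17.1 kDa subunits (about 68.5 kDa) is the assembly closest in mass to the 66 kDa standard, which is the basis of the homotetramer assignment.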
Initial crystals were obtained using Molecular Dimensions Structure Screens 1 and 2 and the sitting-drop vapour-diffusion method at 290 K. The drop consisted of 5 µl protein solution (7 mg ml-1 in buffer C) and 5 µl precipitant. Over two weeks, small crystals of native protein were observed under three different conditions, with the best quality obtained using 100 mM HEPES pH 7.5, 2%(v/v) PEG 400, 2.0 M (NH4)2SO4 pH 7.5 as the precipitant. After refining the crystallization conditions, larger crystals were obtained after four weeks using the same precipitant at a pH of 7.0. Co-crystallization of the DNA-bound protein was achieved by mixing 7.5 mg ml-1 protein in a 1:1 molar ratio (tetramer:ssDNA) with 69-mer dT(pT)68 (MWG Biotech). The complex was incubated on ice for 60 min and centrifuged (10 min, 35 000g) prior to crystallizations being set up. Each crystallization drop comprised 1.5 µl protein in 50 mM Tris pH 7.0, 0.1 M NaCl and 1.5 µl precipitant. All were set up at 290 K. Crystals grew within one week; the best quality crystals were obtained using 100 mM HEPES pH 7.5, 2.3 M (NH4)2SO4.
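As a rough guide to the 1:1 (tetramer:ssDNA) mixing described above, the sketch below converts the protein mass concentration into a tetramer molarity and then into the matching mass concentration of oligonucleotide. It assumes the theoretical monomer mass of 17 131.2 Da and a value of 20 927 for the Mr of the 69-mer (the figure used later in the text for the VM calculation); the function name and the neglect of dilution on mixing are simplifications made here, not part of the published protocol.

```python
MONOMER_MR = 17131.2   # Da, theoretical mass of one SSB subunit (from the text)
DNA_MR = 20927.0       # Da, Mr of dT(pT)68 as used in the text's VM calculation


def dna_for_molar_ratio(protein_mg_per_ml, dna_per_tetramer=1.0, subunits=4):
    """Return (tetramer concentration in µM, DNA concentration in mg/ml).

    mg/ml equals g/l, so dividing by Mr (g/mol) gives mol/l directly.
    Dilution on mixing is ignored (simplifying assumption).
    """
    tetramer_molar = protein_mg_per_ml / (subunits * MONOMER_MR)   # mol/l
    dna_mg_per_ml = tetramer_molar * dna_per_tetramer * DNA_MR     # g/l == mg/ml
    return tetramer_molar * 1e6, dna_mg_per_ml


tetramer_uM, dna_mg_ml = dna_for_molar_ratio(7.5)
print(f"tetramer: {tetramer_uM:.1f} µM, ssDNA needed: {dna_mg_ml:.2f} mg/ml")
```

Under these assumptions, 7.5 mg ml-1 protein corresponds to roughly 0.11 mM tetramer, so an equimolar amount of the 69-mer is a little over 2 mg ml-1.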
Crystals of native protein of approximate dimensions 0.1 × 0.2 × 0.2 mm were flash-cooled in a 20% glycerol well solution and X-ray data for the native SSB were collected at 100 K (Cryostream cooler; Oxford Cryosystems, Oxford, England) on a MAR Research 345 imaging plate mounted on an Enraf-Nonius FR591 rotating-anode generator, λ = 1.5418 Å, fitted with Osmic mirrors and operating at 40 kV, 110 mA. Crystals of similar dimensions were obtained for the DNA-bound form and data were also collected at 100 K on station 14.2 (λ = 0.978 Å) at the SRS, CLRC Daresbury Laboratory. Analysis of the diffraction data for both crystals using MOSFLM/SCALA (Leslie, 1992; Collaborative Computational Project Number 4, 1994) produced the data shown in Table 1 and allowed the assignment of the native crystals to space group P21. For the DNA-bound data, similar processing statistics were obtained with space groups I222, I212121 and I41.
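† $R_{\mathrm{merge}} = \sum_h \sum_i |I_i(h) - \langle I(h) \rangle| / \sum_h \sum_i I_i(h)$, where $I_i(h)$ is the $i$th measurement of reflection $h$ and $\langle I(h) \rangle$ is the mean intensity of the $i$ symmetry-equivalent reflections.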
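For readers unfamiliar with the merging statistic in the footnote, the sketch below evaluates the conventional definition, Rmerge = Σh Σi |Ii(h) − ⟨I(h)⟩| / Σh Σi Ii(h), on a toy set of intensities. The data are invented purely for illustration, the function name is hypothetical, and skipping reflections measured only once is an assumed convention; this is not the SCALA implementation.

```python
def r_merge(groups):
    """Compute R_merge from groups of symmetry-equivalent intensity measurements.

    `groups` is a list of lists; each inner list holds the repeated
    measurements I_i(h) of one unique reflection h. Reflections measured
    only once are skipped (assumed convention), since they contribute no
    deviation from their own mean.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in groups:
        if len(intensities) < 2:
            continue
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0.0:
        raise ValueError("no multiply measured reflections with non-zero intensity")
    return numerator / denominator


# Toy data (not from the paper): two unique reflections plus one singleton.
toy_groups = [[100.0, 110.0, 90.0], [50.0, 50.0], [70.0]]
toy_r_merge = r_merge(toy_groups)
print(f"R_merge = {toy_r_merge:.3f}")
```

In this toy case the only disagreement comes from the first reflection (deviations of 0, 10 and 10 about a mean of 100), giving 20/400 = 0.05.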
For the native SSB data set, a model of the SSB from Escherichia coli (PDB code 1qvc; Matsumoto et al., 2000) was used to search for an initial solution using MOLREP (Vagin & Teplyakov, 1997). The search molecule was trimmed of its flexible loops and amino-acid side chains to produce a tetrameric polyalanine structure. The top rotation-function solution produced a satisfactory translation-function solution that was then used to aid location of the second tetramer. No solution was obtained using the S. solfataricus structure as a search model. For the DNA-bound SSB, a multicopy search with MOLREP using the partially refined SSB Aae dimer (R = 0.272, Rfree = 0.308) provided solutions, the best of which contained two dimers per asymmetric unit in each of the three space groups. Twenty cycles of rigid-body refinement were followed by ten cycles of restrained refinement. The statistics for this process are also shown in Table 1. Refinement of both crystal forms is currently in progress while attempts are being made to improve the diffraction quality of the crystals.
To ensure that no autolysis of the protein had occurred, a single crystal of DNA-bound SSB was dissolved in 10 mM HEPES pH 8.1 and analysed by SDS-PAGE and ESI-MS, which revealed no significant degradation of the protein. SDS-PAGE analysis produced a single band around 23 kDa in keeping with the observed anomalous mobility of the native SSB. Only one major species was observed by ESI-MS with a mass of 17 127.3 ± 2.7 (Fig. 2), in good agreement with the predicted weight. No LCMS data could be obtained from the dissolved native SSB crystal.
It is clear from the sequence alignment of the SSB proteins that the A. aeolicus and the E. coli proteins are more closely related to each other than either is to S. solfataricus SSB (Fig. 1). This is borne out by the fact that a molecular-replacement solution using the E. coli structure was obtained relatively easily, whilst no satisfactory solution could be obtained with the S. solfataricus structure. The very high resolution of S. solfataricus SSB reveals why this should be so in that the actual molecular structure is much more closely related to the eukaryotic SSB fold than that of E. coli SSB (Kerr et al., 2003). Consequently, despite a modest sequence identity, the structures are distinct.
For the DNA-bound crystals reported here, there is an ambiguity as regards the space group. Given an SSB tetramer in the asymmetric unit, using the monomer Mr of 17 100 and that of the DNA as 20 927, the VM can be calculated to be 2.43, 2.11, 1.86 or 1.51 Å3 Da-1 for zero, 0.5, one or two bound DNA 69-mers per tetramer. The expected 1:1 complex requiring one DNA oligomer per asymmetric unit corresponds to a VM of 1.86 Å3 Da-1, which is within the range found by Matthews (1968), albeit quite close to the lower limit. The physiological tetramer as observed in the native structure sits on a crystallographic dyad in both I41 and I212121, whereas the tetramer sits on a screw dyad axis in I222. It is impossible for there to be exact twofold symmetry for the SSB tetramer with a single DNA oligomer bound, although a pseudo-twofold arrangement is possible. Given the limited resolution of the present X-ray data and the current state of the refinement, such a situation cannot yet be ruled out. However, if the DNA oligomer is shared between two tetramers in some fashion, this could permit the DNA-bound tetramer to lie upon a crystallographic dyad, while maintaining four subunits and a single DNA molecule in the asymmetric unit (Ferrari et al., 1994). The initial electron-density maps in each of the three space groups all show extra electron density near regions of the protein expected to bind DNA (Raghunathan et al., 2000). Examination of the maps together with the statistics shown in Table 1 leads us to prefer I222 as the space group, but we are continuing to refine all three possibilities. These refinements should clarify this uncertainty and also allow us to estimate the occupancy of the DNA.
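The VM values quoted above follow from VM = V / (Z × Mr), where V is the unit-cell volume, Z the number of asymmetric units per cell and Mr the mass of the asymmetric-unit contents. The sketch below is a minimal recalculation under stated assumptions: Z = 8 for each of the three body-centred candidates (I222, I212121, I41), a monomer Mr of 17 100 and a DNA Mr of 20 927 as in the text. The function names are hypothetical. The native P21 value (Z = 2, two tetramers per asymmetric unit) is not quoted in the text and is included only for comparison.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit-cell volume (Å^3) from cell lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def matthews_coefficient(volume, z, mr_per_asu):
    """VM in Å^3 per Da: cell volume divided by total mass in the cell."""
    return volume / (z * mr_per_asu)


MONOMER_MR = 17100.0   # Da, value used in the text
DNA_MR = 20927.0       # Da, Mr of the 69-mer, value used in the text
Z_BODY_CENTRED = 8     # assumption: 8 asymmetric units per cell for I222, I212121, I41

# DNA-bound crystal: one tetramer plus 0, 0.5, 1 or 2 DNA oligomers per asymmetric unit.
v_dna = cell_volume(108.65, 108.51, 113.24)
vm_by_dna_copies = {
    n_dna: matthews_coefficient(v_dna, Z_BODY_CENTRED, 4 * MONOMER_MR + n_dna * DNA_MR)
    for n_dna in (0.0, 0.5, 1.0, 2.0)
}
for n_dna, vm in vm_by_dna_copies.items():
    print(f"{n_dna:g} DNA per tetramer: VM = {vm:.2f} Å^3/Da")

# Native crystal (not quoted in the text): P21, Z = 2, two tetramers per asymmetric unit.
v_native = cell_volume(80.97, 73.40, 109.76, beta=95.11)
vm_native = matthews_coefficient(v_native, 2, 8 * MONOMER_MR)
print(f"native P21: VM = {vm_native:.2f} Å^3/Da")
```

With these inputs the calculation agrees with the quoted values of 2.43, 2.11, 1.86 and 1.51 Å3 Da-1 to within about 0.01; the small differences presumably reflect rounding.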
In summary, our expression and purification strategy has produced full-length SSB from the hyperthermophile A. aeolicus with no autolysis observed by mass spectrometry and SDS-PAGE. The flexible C-terminal tail is present in the crystals reported here, unlike the truncated SSB used in both the E. coli and S. solfataricus structure determinations. Our initial refinement of the structures of both forms has allowed the clear assignment of the electron density to residues 1-38 and 41-108 and we are currently refining the models in an effort to distinguish the C-terminal residues.
We wish to thank Professors Karl Stetter and Robert Huber (University of Regensburg) for the kind gift of A. aeolicus chromosomal DNA and Professor Jim Naismith and Dr Iain Kerr (University of St Andrews) for kindly providing the coordinates of the SSB fragment from S. solfataricus prior to publication. The Biotechnology and Biological Sciences Research Council UK and the University of Edinburgh supported this work (DJC, LAM).
Bastin-Shanower, S. A. & Brill, S. J. (2001). J. Biol. Chem. 276, 36446-36453.
Bochkarev, A., Bochkareva, E., Frappier, L. & Edwards, A. M. (1999). EMBO J. 18, 4498-4504.
Bochkarev, A., Pfuetzner, R. A., Edwards, A. M. & Frappier, L. (1997). Nature (London), 385, 176-181.
Bochkareva, E., Belegu, V., Korolev, S. & Bochkarev, A. (2001). EMBO J. 20, 612-618.
Bruck, I., Yuzhakov, A., Yurieva, O., Jeruzalmi, D., Skangalis, M., Kuriyan, J. & O'Donnell, M. (2002). J. Biol. Chem. 277, 17334-17348.
Chase, J. W. & Williams, K. R. (1986). Annu. Rev. Biochem. 55, 103-106.
Collaborative Computational Project, Number 4 (1994). Acta Cryst. D50, 760-763.
Curth, U., Genschel, J., Urbanke, C. & Greipel, J. (1996). Nucleic Acids Res. 24, 2706-2711.
Deckert, G., Warren, P. V., Gaasterland, T., Young, W. G., Lenox, A. L., Graham, D. E., Overbeek, R., Snead, M. A., Keller, M., Aujay, M., Huber, R., Feldman, R. A., Short, J. M., Olsen, G. J. & Swanson, R. V. (1998). Nature (London), 392, 353-358.
Ferrari, M. E., Bujalowski, W. & Lohman, T. M. (1994). J. Mol. Biol. 236, 106-123.
Genschel, J., Curth, U. & Urbanke, C. (2000). Biol. Chem. 381, 183-192.
Gulbis, J. M., Kazmirski, S. L., Finkelstein, J., Kelman, Z., O'Donnell, M. & Kuriyan, J. (2004). Eur. J. Biochem. 271, 439-449.
Jacobs, D. M., Lipton, A. S., Isern, N. G., Daughdrill, G. W., Lowry, D. F., Gomes, X. & Wold, M. S. (1999). J. Biomol. NMR, 14, 321-331.
Kerr, I. D., Wadsworth, R. I. M., Cubeddu, L., Blankenfeldt, W., Naismith, J. H. & White, M. F. (2003). EMBO J. 22, 2561-2570.
Leslie, A. G. W. (1992). Jnt CCP4/ESF-EACMB Newsl. Protein Crystallogr. 26.
Lohman, T. M. & Ferrari, M. E. (1994). Annu. Rev. Biochem. 63, 527-570.
Matsumoto, T., Morimoto, Y., Shibata, N., Kinebuchi, T., Shimamoto, N., Tsukihara, T. & Yasuoka, N. (2000). J. Biochem. 127, 329-335.
Matthews, B. W. (1968). J. Mol. Biol. 33, 491-497.
Murzin, A. G. (1993). EMBO J. 12, 861-867.
Raghunathan, S., Kozlov, A. G., Lohman, T. M. & Waksman, G. (2000). Nature Struct. Biol. 7, 648-652.
Raghunathan, S., Ricard, C. S., Lohman, T. M. & Waksman, G. (1997). Proc. Natl Acad. Sci. USA, 94, 6652-6657.
Savvides, S. N., Raghunathan, S., Futterer, K., Kozlov, A. G., Lohman, T. M. & Waksman, G. (2004). Protein Sci. 13, 1942-1947.
Sun, S. & Shamoo, Y. (2003). J. Biol. Chem. 278, 3876-3881.
Vagin, A. & Teplyakov, A. (1997). J. Appl. Cryst. 30, 1022-1025.
Webster, G., Genschel, J., Curth, U., Urbanke, C., Kang, C. & Hilgenfeld, R. (1997). FEBS Lett. 411, 313-316.
Wold, M. S. (1997). Annu. Rev. Biochem. 66, 61-92.
Yang, C., Curth, U., Urbanke, C. & Kang, C. (1997). Nature Struct. Biol. 4, 153-157.
Yuzhakov, A., Kelman, Z. & O'Donnell, M. (1999). Cell, 96, 153-163.