structural communications
The RpfC (Rv1884) atomic structure shows high structural conservation within the resuscitation-promoting factor catalytic domain
aCrystallography, Institute for Structural and Molecular Biology, Department of Biological Sciences, Birkbeck, University of London, Malet Street, London WC1E 7HX, England, bCentre de Biochimie Structurale, CNRS UMR 5048, 29 Rue de Navacelles, 34090 Montpellier, France; INSERM U1054, Université Montpellier I, Montpellier, France, cDepartment of Microbial Diseases, UCL–Eastman Dental Institute, University College London, 256 Gray's Inn Road, London WC1X 8LD, England, and dThe Advanced Centre for Biochemical Engineering, Department of Biochemical Engineering, University College London, Torrington Place, London WC1E 7JE, England
*Correspondence e-mail: n.keep@mail.cryst.bbk.ac.uk, martin@cbs.cnrs.fr
The first structure of the catalytic domain of RpfC (Rv1884), one of the resuscitation-promoting factors (RPFs) from Mycobacterium tuberculosis, is reported. The structure was solved using molecular replacement once the space group had been correctly identified as twinned P21 rather than the apparent C2221 by searching for anomalous scattering sites in P1. The structure displays a very high degree of structural conservation with the previously published structures of the catalytic domains of RpfB (Rv1009) and RpfE (Rv2450). This structural conservation highlights the importance of the versatile domain composition of the RPF family.
PDB reference: RpfC, 4ow1
1. Introduction
Resuscitation-promoting factors (RPFs) have attracted much interest since their discovery in the late 1990s. These proteins resuscitate bacteria that have entered a dormant state, allowing them to proliferate normally. Despite some key advances since the first protein identification and characterization, their precise mechanism of action remains elusive. The protein was first isolated in Micrococcus luteus, where a heat-labile, non-dialysable and trypsin-sensitive factor present in culture supernatants was able to resuscitate non-growing cells (Mukamolova et al., 1998). The factor was identified as a protein and named resuscitation-promoting factor. In this same seminal study, corresponding genes in other GC-rich Gram-positive bacteria, most notably in Mycobacterium tuberculosis, were also identified. The resuscitating function was later confirmed in M. tuberculosis (Mukamolova et al., 2002). This is an important finding, as one third of the human population is latently infected with M. tuberculosis in a dormant form. This represents a large population reservoir for reactivation of tuberculosis and also a potential novel therapeutic avenue for treating tuberculosis.
Sequence analysis coupled with homology modelling led to the hypothesis that the conserved RPF domain could be a transglycosidase belonging to the family of c-type lysozymes (Cohen-Gonsaud, Keep et al., 2004). The prediction was confirmed by the first solution structure of the RPF domain of RpfB from M. tuberculosis, which showed that the domain is a short version of the c-type lysozyme lacking the first helix (Cohen-Gonsaud et al., 2005). Later, various experiments unambiguously demonstrated that the RPF domain is a hydrolase (Mukamolova et al., 2006; Telkov et al., 2006).
Five RPF paralogues are present in M. tuberculosis (rpfA–E). They contain a conserved catalytic domain, but the domain composition shows variability that is also found in other species (Ravagnani et al., 2005). The mycobacterial RPF proteins share a common 70-amino-acid RPF domain and N-terminal signal sequences, suggesting that the proteins are translocated to an extracellular location. RpfC (176 amino acids), RpfD (154 amino acids) and RpfE (172 amino acids) consist almost solely of the RPF domain and signal sequence and are thought to have a paracrine function. RpfB (362 amino acids) possesses a G5 domain that may be involved in binding (Ruggiero et al., 2009) and a prokaryotic membrane lipoprotein lipid-attachment site that may confer a juxtacrine function on it, while RpfA (407 amino acids) possesses a low compositional complexity domain that may confer an autocrine function (Mukamolova et al., 1998).
Initial studies showed that deletion of individual rpf genes had no significant phenotypic consequences (Downing et al., 2004; Tufariello et al., 2004). This suggests that the mycobacterial RPF proteins are functionally redundant. The entire mycobacterial rpf gene family is also dispensable for growth (Kana et al., 2008). However, phenotypic alterations appear on the deletion of three or more rpf genes and reveal a functional hierarchy of the mycobacterial Rpf proteins that has been reviewed elsewhere (Kana & Mizrahi, 2010).
The question arises as to the functional specificity of the various RPF paralogues. Is specificity based on small changes within the RPF domain structure itself or on the domain organization? The solution structure (Cohen-Gonsaud et al., 2005) and various X-ray structures of RpfB (Ruggiero et al., 2009, 2013; Squeglia et al., 2013) and, very recently, the structure of RpfE have been published (Mavrici et al., 2014). In this paper, we describe the X-ray structure of the RpfC catalytic domain. Despite the presence of multiple copies in the asymmetric unit, twinning and strong noncrystallographic translation, we succeeded in solving the structure using molecular replacement. The structure highlights the high degree of structural conservation within the RPF domains, which could explain why the mycobacterial paralogues are functionally redundant.
2. Methods
2.1. Protein preparation and crystallogenesis
The sequence coding for the catalytic domain of RpfC (residues Gly68–Lys159 of UniProt RPFC_MYCTU) was cloned into the NdeI and NheI sites of the pET15-TEV plasmid to generate a recombinant protein containing an N-terminal six-histidine tag that is cleavable by Tobacco Etch Virus (TEV) protease (Cohen-Gonsaud, Barthe et al., 2004). The N-terminus after cleavage corresponds to the first amino acid of the mature RpfC after predicted cleavage of the signal peptide. The experimentally determined start codon is residue 34 of the UniProt entry (RPFC_MYCTU; Raman et al., 2004) and the first 34 residues (34–67 of the UniProt entry) are the signal peptide. Therefore, we number the protein structure from residue Gly1, which is Gly68 in the UniProt entry. The last 17 residues of the protein were predicted to be disordered from the RpfB structure and were excluded from this construct.
Protein expression was carried out in Escherichia coli Rosetta2 (DE3) strain grown in ZYM5052 auto-induction medium at 25°C for 36 h (Studier, 2005). Cells were harvested and lysed by sonication in 100 mM Tris pH 7.5, 2 mM β-mercaptoethanol (BME) (buffer A). The lysate was cleared by centrifugation at 48 000g for 1 h at 4°C. The supernatant was loaded onto a nickel–NTA column (GE Healthcare) equilibrated with buffer A and was eluted with buffer A supplemented with 300 mM imidazole (buffer B). The eluted protein fraction was dialysed (3 kDa cutoff) against 20 mM Tris pH 7.5, 2 mM BME (buffer C) overnight at 4°C in the presence of TEV protease. The cleaved protein was further purified by gel filtration on a HiLoad Superdex 75 column (GE Healthcare, Amersham, England) equilibrated in buffer C before being concentrated for crystallization trials. Crystals grew readily in 22 of the 96 conditions of The Classics Suite (Qiagen, Hilden, Germany), but all belonged to the same space group, with the condition 0.1 M sodium citrate pH 5, 20%(w/v) PEG 6000 giving the best crystals. Some optimization of this condition was carried out and a slight improvement was achieved using 0.1 M sodium citrate pH 5, 22%(w/v) PEG 6000. The crystals were cryoprotected in the crystallization condition supplemented with 20% ethylene glycol.
2.2. Data collection, processing and phasing
Default processing of data sets using either XDS (Kabsch, 2010) or iMosflm (Powell et al., 2013) always gave the space group C2221. Data sets were reprocessed in P21 (Table 1), with care taken to use an Rfree selection in which all pseudo-equivalent reflections were assigned together to either the working set or the free set. A thin-shell Rfree file was obtained using SFTOOLS from CCP4 (Winn et al., 2011) from an RpfC data set indexed in C2221 with unit-cell parameters a = 65.12, b = 142.88, c = 88.93 Å, α = β = γ = 90°. The initial file was expanded to the lowest symmetry, P1. The file was then modified to match the unit-cell parameters of the integrated P21 data. The first reindexing set the angle to β = 114° using the transformation matrix (100, 001, −110), giving unit-cell parameters a = 65.12, b = 88.93, c = 157.02 Å. Finally, REINDEX from CCP4 was used with settings h = h, k = k, l = l/2 to give the correct unit-cell parameters a = 65.12, b = 88.93, c = 78.51 Å, α = γ = 90°, β = 114.50°. The free set was then reduced to P21 symmetry and used as the source of free reflection flags for all other data sets.
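The reindexing arithmetic can be checked numerically. The following is a minimal Python sketch and not the procedure run in SFTOOLS or REINDEX: it assumes that each row of the transformation matrix (100, 001, −110) expresses one new axis in terms of the old a, b and c axes, and the function name `transform_cell` is hypothetical.

```python
import math

def transform_cell(cell_lengths, matrix):
    """Apply a basis transformation to an orthogonal (all angles 90 deg) cell.

    Assumption: each row of `matrix` gives one new axis as integer
    multiples of the old a, b, c axes.
    Returns (a, b, c, alpha, beta, gamma) in angstroms and degrees.
    """
    a, b, c = cell_lengths
    old_axes = [(a, 0.0, 0.0), (0.0, b, 0.0), (0.0, 0.0, c)]
    # Build each new axis vector as a linear combination of the old axes.
    new_axes = [
        tuple(sum(row[i] * old_axes[i][k] for i in range(3)) for k in range(3))
        for row in matrix
    ]

    def length(v):
        return math.sqrt(sum(x * x for x in v))

    def angle(u, v):
        cosine = sum(x * y for x, y in zip(u, v)) / (length(u) * length(v))
        return math.degrees(math.acos(cosine))

    na, nb, nc = new_axes
    return (length(na), length(nb), length(nc),
            angle(nb, nc), angle(na, nc), angle(na, nb))

# C2221 cell quoted in the text (angstroms).
c2221 = (65.12, 142.88, 88.93)
a, b, c, alpha, beta, gamma = transform_cell(
    c2221, [(1, 0, 0), (0, 0, 1), (-1, 1, 0)]
)
half_c = c / 2.0  # second step: l -> l/2 corresponds to halving the c axis

print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f} A, beta = {beta:.2f} deg")
print(f"after halving c: c = {half_c:.2f} A")
```

Under these assumptions the long axis is the diagonal of the C-centred ab face, and halving it gives the primitive monoclinic cell, consistent with the P21 unit-cell parameters quoted in the text.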
Initial phasing was carried out by MrBUMP (Keegan & Winn, 2008) using the crystal structure of the catalytic domain of RpfB (PDB entry 3e05; Ruggiero et al., 2009). A solution with four copies in the asymmetric unit was found in C2221 using MOLREP (Vagin & Teplyakov, 2010) but would not refine below an Rfree of 0.500. However, two copies of this model were found in P21 and refined with the use of twinning to a final Rfree of 0.236 using REFMAC5 (Murshudov et al., 2011; see Table 2). There is a noncrystallographic translation of (0.554, 0.0, 0.109) in fractional coordinates, with a Patterson peak height of 50% of the origin peak. With the improvements in molecular replacement, including the treatment of noncrystallographic translation, since this work was originally carried out, current versions of Phaser (McCoy et al., 2007) and MOLREP can solve this structure more routinely from a single RpfB chain.
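To give a feel for the size of this noncrystallographic translation, the fractional vector can be converted to ångströms. The sketch below is illustrative only: it assumes the final P21 unit-cell parameters from Section 2.2 and a common orthogonalization convention for a monoclinic cell (a along x, b along y, c in the xz plane). The function name is hypothetical.

```python
import math

def fractional_to_cartesian_monoclinic(frac, a, b, c, beta_deg):
    """Convert fractional coordinates to Cartesian (angstroms).

    Assumed convention: a along x, b along y (unique axis), c in the xz plane.
    """
    u, v, w = frac
    beta = math.radians(beta_deg)
    x = a * u + c * w * math.cos(beta)
    y = b * v
    z = c * w * math.sin(beta)
    return (x, y, z)

# Translation and P21 cell as quoted in the text.
ncs_translation = (0.554, 0.0, 0.109)
vector = fractional_to_cartesian_monoclinic(
    ncs_translation, a=65.12, b=88.93, c=78.51, beta_deg=114.50
)
shift_length = math.sqrt(sum(x * x for x in vector))

print("Cartesian shift (A):", tuple(round(x, 2) for x in vector))
print(f"length = {shift_length:.2f} A")
```

With these parameters the shift is roughly 33 Å and lies almost entirely along the a axis, close to half of its length. The length of the vector does not depend on the orthogonalization convention chosen.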
3. Results and discussion
3.1. Structure-solution problems
Many data sets were collected from crystals of RpfC or the point mutants RpfC_E13A or RpfC_E13M, with and without potential substrates and including selenomethionine-substituted RpfC_E13M, at the ESRF, SLS, SOLEIL and Diamond synchrotrons. The automatic space-group assignment for all data sets gave the space group C2221, with unit-cell parameters of around a = 66, b = 141, c = 90 Å, α = β = γ = 90°. The resolutions of the data sets ranged from 3.0 to 1.9 Å. This would predict four copies of the RpfC chain in the asymmetric unit. We failed to obtain a molecular-replacement solution using our NMR structure (PDB entry 1xsf; Cohen-Gonsaud et al., 2005). Slightly better solutions were found using the crystal structure of the catalytic domain of RpfB, with R and Rfree of around 0.45 and 0.50, respectively, but these would not refine further. Attempts at Se or S SAD also did not give solutions. However, anomalous site searching using the method of Dumas & van der Lee (2008), which works in P1, indicated that the data were probably in P21, as eight sites could be found using the SeMet RpfC_E13M data in this space group. This data set did not yield a useable map, probably owing to twinning (a twin fraction of 0.41 from a Britton plot) and the presence of only weak anomalous signal, which extended to only around 3.8 Å as assessed by phenix.xtriage (Zwart et al., 2005) and CTRUNCATE from CCP4. However, molecular replacement with the C2221 solution from the crystal structure of RpfB (Ruggiero et al., 2009) in P21 (unit-cell parameters a = 65, b = 88, c = 78 Å, α = γ = 90°, β = 114.50°) to give eight copies in the asymmetric unit, followed by refinement with twin operators h, k, l and -h, -k, h + l, allowed refinement to acceptable R and Rfree values on carefully selecting the free set (see Table 2). Twinning was not apparent from the L-test (Padilla & Yeates, 2003) or the moments of E, but the twin fraction was estimated for the final data set as 0.41 from the H-test (Yeates, 1988) and 0.45 in a Britton plot (Fisher & Sweet, 1980) as tested by CTRUNCATE. Other data sets gave similar twin fractions. The final refined twin fraction in REFMAC5 for the deposited structure was 0.463 for -h, -k, h + l.
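The H-test estimate of the twin fraction can be illustrated with a toy calculation. This is a sketch on synthetic data and not the CTRUNCATE implementation: it assumes ideal acentric intensities (exponentially distributed, with no measurement error, no noncrystallographic translation and no crystallographic symmetry), for which the mean of H = |I1 − I2|/(I1 + I2) over twin-related pairs equals 1/2 − α, where α is the twin fraction. The function names and the simulated data are hypothetical.

```python
import random

def twin_mate(hkl):
    """Index related to (h, k, l) by the twin operator -h, -k, h + l."""
    h, k, l = hkl
    return (-h, -k, h + l)

def simulate_twinned(alpha, n=12, seed=0):
    """Synthetic twinned intensities on a cube of indices (toy data only)."""
    rng = random.Random(seed)
    true = {
        (h, k, l): rng.expovariate(1.0)  # ideal acentric intensity distribution
        for h in range(-n, n + 1)
        for k in range(-n, n + 1)
        for l in range(-n, n + 1)
    }
    twinned = {}
    for hkl, j1 in true.items():
        j2 = true.get(twin_mate(hkl))
        if j2 is not None:  # keep only reflections whose twin mate was generated
            twinned[hkl] = (1.0 - alpha) * j1 + alpha * j2
    return twinned

def twin_fraction_from_h(intensities):
    """Estimate the twin fraction as 1/2 - <H> over twin-related pairs."""
    h_values = []
    for hkl, i1 in intensities.items():
        mate = twin_mate(hkl)
        if mate <= hkl or mate not in intensities:
            continue  # count each pair once; skip self-mapped or unpaired indices
        i2 = intensities[mate]
        if i1 + i2 > 0:
            h_values.append(abs(i1 - i2) / (i1 + i2))
    return 0.5 - sum(h_values) / len(h_values)

estimate = twin_fraction_from_h(simulate_twinned(alpha=0.41))
print(f"estimated twin fraction: {estimate:.3f}")
```

On such ideal data the estimate recovers the simulated twin fraction closely. Real data depart from these assumptions, which may be one reason why different tests give somewhat different values for the same data set.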
Despite soaking and co-crystallizing with a range of substrates and substrate fragments, for example N-acetylglucosamine (NAG), polymers of up to five repeats of N-acetylglucosamine and NAG-N-acetylmuramic acid, and peptidoglycan fragments that are generated by a number of enzymes, we never obtained clear density for substrates in the active site. We have therefore deposited the structure of the wild-type RpfC (PDB entry 4ow1).
3.2. Structure analysis
The asymmetric unit consists of eight copies of the RpfC chain. A set of four copies is generated by two twofold axes perpendicular to the crystallographic twofold; a single translation of (0.554, 0.0, 0.109) then generates the second set of four copies (Fig. 1a). Coupled with the twinning, the twofolds give rise to the pseudo-C2221 symmetry.
Chains A, E and S have the most residues modelled into electron density (Gly1–Lys86), with an extra helix beyond the end of the conserved domain (Gly78). Chain B has the fewest modelled residues (Pro4–Gly78); the other chains are between these limits. We have modelled an ethylene glycol (the cryoprotectant) where a benzamidine molecule is present in the RpfB structures with PDB codes 4kpm (Squeglia et al., 2013) and 4emn (Ruggiero et al., 2013). As for the benzamidine in 4kpm, this is only seen in one of the similar interfaces. Benzamidine and ethylene glycol are not particularly similar, but this observation indicates that this region in RPFs prefers binding small organic molecules to water. This region is part of the predicted binding site of a hexasaccharide based on superposition of the lysozyme-cleaved hexasaccharide complex with PDB code 1lzs (Song et al., 1994). The crystal packing of the two adjacent chains close to the benzamidine/ethylene glycol site is almost perfectly conserved in our RpfC structure and in the RpfB structures, despite there being no evidence of this contact being physiological. The two pairs of chains superimpose with an r.m.s.d. of 1.1 Å over 149 residues using SSM (Krissinel & Henrick, 2004), which is not much larger than that for the single chains (see below). The RPF domains are sufficiently close to clash with the superposed disaccharide in this region. The trisaccharide in 4kpm coincides with the other part of the cleaved saccharide in 1lzs (Fig. 1b).
As expected, the structural conservation between the new RpfC structure that we have determined in this study and the extensively studied RpfB domain is high. The calculated Cα r.m.s.d. between the two structures (our structure versus PDB entry 4kl7; Squeglia et al., 2013) is only 0.90 Å for 76 residues aligned by SSM with 52% sequence identity over the domain (Figs. 2a and 2b). Compared with the recent RpfE structure (PDB entry 4cge; Mavrici et al., 2014), the calculated Cα r.m.s.d. is even lower at 0.82 Å for 77 residues with 62% sequence identity (Figs. 1c and 2a). Most of the backbone geometry is conserved, including the connecting loops between the helices. This is in accordance with the first NMR structure that we determined, where the 30 calculated structures shared a low r.m.s.d. of 0.57 Å, low thermal motion as shown by nuclear Overhauser effect (NOE) ratios (Cohen-Gonsaud, Barthe et al., 2004) and a well ordered fold for the RPF domain. The only difference observed is located within a short sequence insertion that is present in the RpfB RPF domain compared with the other four M. tuberculosis RPF proteins (Figs. 1c and 2b). In RpfC two residues display an elongated conformation (42GVGN45), very similar to RpfE (137GSGS140), to connect α-helices 2 and 3, while a 310-helix (321GLRYAPR327) is present in RpfB. This small change within the secondary-structure composition does not change the relative orientation of α-helices 2 and 3 within the RPF fold (Fig. 1c). The variation in surface charge between RpfB and RpfE has previously been noted (Mavrici et al., 2014). RpfC has two lysines, Lys26 and Lys33, on one side of the sugar-binding cleft. The residue equivalent to Lys26 is a tyrosine in RpfA, RpfB and RpfD and a leucine in RpfE, while the residue equivalent to Lys33 is a serine or threonine in RpfA, RpfD and RpfE and an aspartate in RpfB (Fig. 2b). This leads to a different charge distribution around the ligand-binding pocket, which may have a role in specificity (Fig. 2c). Mavrici et al. (2014) suggested that Arg126 may play a role in binding the peptide part of the peptidoglycan, conferring specificity on RpfE.
4. Conclusion
The RpfC structure displays a high degree of structural conservation with the other members of the mycobacterial resuscitation-promoting factor family. Based on the structure that we have solved, we propose that the five RPFs from M. tuberculosis have similar substrates, although variation in charge around the active site may give rise to small variations in the specificity for different peptidoglycan modifications. The high degree of conservation of the RPF domain explains why the proteins are functionally redundant, but most importantly shows that the auxiliary domain composition is mainly responsible for the functional variability.
Footnotes
‡Current address: Virology Department, Structural Virology Unit, Institut Pasteur, 28 Rue du Docteur Roux, 75724 Paris CEDEX 15, France.
Acknowledgements
This work was supported by grants from the French Infrastructure for Integrated Structural Biology (FRISBI) ANR-10-INSB-05-01 (MC-G), an X-TB Struct EU grant (MC-G and NHK), MRC grant G0401038 (GR, NHK, BH and JW), a Bloomsbury studentship to F-XC and a Commonwealth Studies Commission studentship MYCS-2011-252 to DHXQ.
References
Cohen-Gonsaud, M., Barthe, P., Bagnéris, C., Henderson, B., Ward, J., Roumestand, C. & Keep, N. H. (2005). Nature Struct. Mol. Biol. 12, 270–273.
Cohen-Gonsaud, M., Barthe, P., Pommier, F., Harris, R., Driscoll, P. C., Keep, N. H. & Roumestand, C. (2004). J. Biomol. NMR, 30, 373–374.
Cohen-Gonsaud, M., Keep, N. H., Davies, A. P., Ward, J., Henderson, B. & Labesse, G. (2004). Trends Biochem. Sci. 29, 7–10.
Downing, K. J., Betts, J. C., Young, D. I., McAdam, R. A., Kelly, F., Young, M. & Mizrahi, V. (2004). Tuberculosis, 84, 167–179.
Dumas, C. & van der Lee, A. (2008). Acta Cryst. D64, 864–873.
Edgar, R. C. (2004). Nucleic Acids Res. 32, 1792–1797.
Fisher, R. G. & Sweet, R. M. (1980). Acta Cryst. A36, 755–760.
Kabsch, W. (2010). Acta Cryst. D66, 125–132.
Kana, B. D., Gordhan, B. G., Downing, K. J., Sung, N., Vostroktunova, G., Machowski, E. E., Tsenova, L., Young, M., Kaprelyants, A., Kaplan, G. & Mizrahi, V. (2008). Mol. Microbiol. 67, 672–684.
Kana, B. D. & Mizrahi, V. (2010). FEMS Immunol. Med. Microbiol. 58, 39–50.
Keegan, R. M. & Winn, M. D. (2008). Acta Cryst. D64, 119–124.
Krissinel, E. & Henrick, K. (2004). Acta Cryst. D60, 2256–2268.
Mavrici, D., Prigozhin, D. M. & Alber, T. (2014). Protein Sci. 23, 481–487.
McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C. & Read, R. J. (2007). J. Appl. Cryst. 40, 658–674.
McNicholas, S., Potterton, E., Wilson, K. S. & Noble, M. E. M. (2011). Acta Cryst. D67, 386–394.
Mukamolova, G. V., Kaprelyants, A. S., Young, D. I., Young, M. & Kell, D. B. (1998). Proc. Natl Acad. Sci. USA, 95, 8916–8921.
Mukamolova, G. V., Murzin, A. G., Salina, E. G., Demina, G. R., Kell, D. B., Kaprelyants, A. S. & Young, M. (2006). Mol. Microbiol. 59, 84–98.
Mukamolova, G. V., Turapov, O. A., Young, D. I., Kaprelyants, A. S., Kell, D. B. & Young, M. (2002). Mol. Microbiol. 46, 623–635.
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
Padilla, J. E. & Yeates, T. O. (2003). Acta Cryst. D59, 1124–1130.
Powell, H. R., Johnson, O. & Leslie, A. G. W. (2013). Acta Cryst. D69, 1195–1203.
Raman, S., Hazra, R., Dascher, C. C. & Husson, R. N. (2004). J. Bacteriol. 186, 6605–6616.
Ravagnani, A., Finan, C. L. & Young, M. (2005). BMC Genomics, 6, 39.
Ruggiero, A., Marchant, J., Squeglia, F., Makarov, V., De Simone, A. & Berisio, R. (2013). J. Biomol. Struct. Dyn. 31, 195–205.
Ruggiero, A., Tizzano, B., Pedone, E., Pedone, C., Wilmanns, M. & Berisio, R. (2009). J. Mol. Biol. 385, 153–162.
Song, H., Inaka, K., Maenaka, K. & Matsushima, M. (1994). J. Mol. Biol. 244, 522–540.
Squeglia, F., Romano, M., Ruggiero, A., Vitagliano, L., De Simone, A. & Berisio, R. (2013). Biophys. J. 104, 2530–2539.
Studier, F. W. (2005). Protein Expr. Purif. 41, 207–234.
Telkov, M. V., Demina, G. R., Voloshin, S. A., Salina, E. G., Dudik, T. V., Stekhanova, T. N., Mukamolova, G. V., Kazaryan, K. A., Goncharenko, A. V., Young, M. & Kaprelyants, A. S. (2006). Biochemistry (Mosc.), 71, 414–422.
Tufariello, J. M., Jacobs, W. R. Jr & Chan, J. (2004). Infect. Immun. 72, 515–526.
Vagin, A. & Teplyakov, A. (2010). Acta Cryst. D66, 22–25.
Winn, M. D. et al. (2011). Acta Cryst. D67, 235–242.
Yeates, T. O. (1988). Acta Cryst. A44, 142–144.
Zwart, P. H., Grosse-Kunstleve, R. W. & Adams, P. D. (2005). CCP4 Newsl. Protein Crystallogr. 43, contribution 7.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.