Using a partial atomic model from medium-resolution cryo-EM to solve a large crystal structure
aInstitute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10, 08028 Barcelona, Spain, bInstitut de Biologia Molecular de Barcelona (IBMB–CSIC), Baldiri Reixac 10, 08028 Barcelona, Spain, cCentro Nacional de Biotecnología (CNB–CSIC), Darwin 3, 28049 Madrid, Spain, and dCentro de Investigaciones Biológicas (CIB–CSIC), Ramiro de Maeztu 9, 28040 Madrid, Spain
*Correspondence e-mail: miquel.coll@irbbarcelona.org
Medium-resolution cryo-electron microscopy maps, in particular when they include a significant number of α-helices, may allow the building of partial models that are useful for molecular-replacement searches in large crystallographic structures when the structures of homologs are not available and experimental phasing has failed. Here, as an example, the solution of the structure of a bacteriophage portal using a partial 30% model built into a 7.8 Å resolution cryo-EM map is shown. Inspection of the self-rotation function allowed the correct oligomeric state to be determined, and density-modification procedures using rotation matrices and a mask based on the cryo-EM structure were critical for solving the structure. A workflow is described that may be applicable to similar cases and this strategy is compared with direct use of the cryo-EM map for molecular replacement.
Keywords: molecular replacement; cryo-EM; density modification; bacteriophage portal.
1. Introduction
X-ray crystallography is the technique that has provided the most high-resolution information in the field of structural biology. Although it is now considered to be a well established technique, solving the structures of certain samples, such as large complexes, continues to be a challenge. Experimental phasing strategies by isomorphous replacement and anomalous diffraction may be time-consuming or may fail when well diffracting crystals or their derivatives are difficult to obtain. In many cases, molecular replacement (MR) becomes the best, or even the only, option for solving these types of structures. This method requires the availability of a structurally similar model (Rossmann & Blow, 1962).
In recent years, single-particle cryo-electron microscopy (cryo-EM) has experienced a resolution revolution (see Nogales, 2016, and references cited therein). The development of direct electron detectors and the availability of new processing programs, such as MotionCor2 (Zheng et al., 2017) and RELION (Scheres, 2016), have paved the way to obtaining atomic models which can be built directly into high-resolution cryo-EM maps. Indeed, cryo-EM has some methodological advantages when compared with X-ray crystallography: it requires smaller amounts of sample, it avoids the crystallization bottleneck and it is able to deal with heterogeneous samples. However, the process of obtaining a high-resolution cryo-EM structure may still be quite arduous, including steps that are hard to automate, such as grid preparation and data processing (Doerr, 2016). In many cases, achieving map resolutions that allow the full tracing of atomic models is not straightforward. Consequently, challenging projects remain stuck in the intermediate-resolution range. In difficult cases where one technique alone is not able to succeed, combining data from cryo-EM and X-ray crystallography may be an effective strategy.
The structural study of viral capsids with icosahedral symmetry, in which intermediate-resolution cryo-EM reconstructions were used as phasing models, has contributed significant methodological advances (Stuart & Abrescia, 2013). This approach takes advantage of the icosahedral symmetry present in the samples and uses symmetrized maps. After the cryo-EM resolution revolution, similar procedures applicable to samples without such high orders of symmetry have also been developed (Xiong, 2008; Jackson et al., 2015; Zeng et al., 2018). In these approaches, the cryo-EM map was used directly as an initial model for density modification.
In this article, we show a case example in which a combination of the X-ray crystallography and cryo-EM techniques can be used in a different way, using a partial cryo-EM atomic model instead of the cryo-EM map for MR. Both strategies appear to be equally valid in our example, with that described here being a possible alternative in the case of the failure of direct use of the cryo-EM map.
The bacteriophage portal protein (also named connector) is found at a unique vertex of the viral capsid and is essential for procapsid assembly, genome encapsidation, tail assembly and genome ejection. Its overall architecture corresponds to a ring-like hollow cylindrical dodecamer (Cuervo & Carrascosa, 2012). However, portals have also been reported to assemble as undecameric or tridecameric complexes after overexpression, although they are only incorporated into the procapsids as dodecamers. This heterogeneity implies an additional difficulty in their structural characterization (Sun et al., 2015). The portal protein of the T7 bacteriophage is coded by the gp8 gene and has a predicted molecular weight of 59 kDa, which would give a multimeric complex of 650–770 kDa, depending on its oligomeric state.
2. X-ray crystallographic preliminary studies
2.1. Crystallization
The gp8 protein was expressed, purified and crystallized as described previously (Cuervo et al., 2019). Data set 1, which is presented here and yielded PDB entry 6tjp, was obtained from a bar-shaped crystal grown by the hanging-drop vapour-diffusion technique at 293 K using a 4.4 mg ml−1 protein sample in the following conditions: 0.2 M CaCl2, 0.1 M HEPES pH 7.5, 18%(w/v) PEG 400. The crystal was cooled in the same condition with 30%(w/v) PEG 400 as a cryoprotectant and kept in liquid nitrogen until X-ray data collection. Data set 2 yielded the gp8closed structure (PDB entry 6qx5), and details of its crystallization and data-collection statistics have previously been published (Cuervo et al., 2019).
2.2. Data collection and analysis
For data set 1, X-ray data were collected on beamline ID14-4 at the European Synchrotron Radiation Facility (ESRF), Grenoble, France. Diffraction data were indexed and integrated with XDS and scaled, reduced and merged using XSCALE (Kabsch, 2010). Although a total of 270 images were collected, the statistics improved significantly when considering only the first 125 images (Table 1). All of the following X-ray data analyses were carried out using the CCP4 suite of crystallographic programs (Winn et al., 2011).
2.2.1. Oligomeric state and self-rotation function
Matthews coefficient (VM) calculations on both data sets suggested one portal oligomer per asymmetric unit, but these calculations were not conclusive for determining the number of protomers in each portal oligomer. Nevertheless, self-rotation function (SRF) calculations performed with MOLREP (Vagin & Teplyakov, 2010) indicated that the T7 portal was composed of 13 protomers in the data set 1 crystal. Figs. 1(a), 1(b), 1(c) and 1(d) show different stereographic projections of the SRF at χ = 180°, χ = 30°, χ = 27.7° and χ = 25.7°. Comparing the peaks at χ = 30°, χ = 27.7° and χ = 25.7°, which would correspond to the presence of dodecameric, tridecameric or tetradecameric NCS, respectively, the highest peak was found to be in the χ = 27.7° section. Consistent with this observation, there were 13 peaks in the χ = 180° section perpendicular to the 13-fold axis. Therefore, the complex present in the crystal was a tridecamer, with one ring per asymmetric unit and with a solvent content of 49%.
A similar analysis performed with data set 2 showed that it corresponded to a dodecameric form of gp8 (Figs. 1e, 1f, 1g and 1h). In this case comparison of the SRF peaks at χ = 32.7°, χ = 30° and χ = 27.7° revealed the highest peak in the χ = 30° section, which corresponds to a 12-fold NCS axis. Moreover, there were 12 peaks at χ = 180°, which were perpendicular to the 12-fold axis. Thus, data set 2 consisted of a single dodecameric ring per asymmetric unit, with a solvent content of 57%.
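The χ sections inspected above follow directly from the order of the candidate NCS axis, since an n-fold axis corresponds to a rotation of 360°/n. The following minimal Python sketch tabulates this angle, together with the expected mass of each oligomer from the predicted 59 kDa monomer. The function and variable names are illustrative only and are not part of MOLREP or any CCP4 program.

```python
MONOMER_KDA = 59.0  # predicted gp8 monomer mass quoted in the text


def ncs_chi(n_fold):
    """Rotation angle (degrees) of an n-fold axis, i.e. the SRF chi section to inspect."""
    return 360.0 / n_fold


def oligomer_table(n_values, monomer_kda=MONOMER_KDA):
    """Map each candidate oligomer order to (chi rounded to 0.1 deg, mass in kDa)."""
    return {n: (round(ncs_chi(n), 1), n * monomer_kda) for n in n_values}


# Candidate orders considered in the text: 11-, 12-, 13- and 14-fold.
table = oligomer_table(range(11, 15))
for n, (chi, mass) in table.items():
    print(f"{n}-mer: chi = {chi:.1f} deg, mass = {mass:.0f} kDa")
```

The angles for n = 11 to 14 reproduce the 32.7°, 30°, 27.7° and 25.7° sections compared in the text. The 11-mer and 13-mer masses bracket the 650–770 kDa range given in the Introduction.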
3. Structure solution
Experimental phasing was attempted extensively both with selenomethionine-derivative protein crystals and heavy-atom or cluster soaking. However, well diffracting derivative crystals could not be obtained. Moreover, no model with sufficient similarity to the gp8 protein to perform MR was available. Therefore, a new strategy was planned, which consisted of structural characterization of the sample by cryo-EM.
3.1. Using a medium-resolution cryo-EM map to obtain an initial model
Single-particle cryo-EM data were initially collected using a Talos Arctica microscope (Cuervo et al., 2019; Table 2). Data processing was challenging owing to the heterogeneity of the sample, which contained a mixture of different oligomeric states, and the lack of lateral orientations. Eventually, the 3D classification of a subset of 1200 particles with RELION (Scheres, 2016) yielded a map of a tridecameric portal at 7.8 Å resolution. Coot (Emsley et al., 2010) was then used to interpret the map and to build a preliminary partial gp8 monomeric model. Built as polyalanine chains, the model consisted of nine α-helices. It contained 194 residues of the total of 536 amino acids present in the gp8 monomer. The sequence of the residues could not be established because the connectivity between the α-helices was not clear and their direction was difficult to determine. Once the partial monomeric model had been built, a tridecameric partial model was constructed, applying rotation matrices and a translation vector to account for the 13-fold axis running along the centre of the particle channel. We used standard rotation matrices with an angle of 2π/13 around the model axis and the corresponding translation to keep the model centred on the EM volume. This calculation was implemented in a short gawk script that also updated the chain names. The 13-fold ring model was then real-space refined against the 7.8 Å resolution cryo-EM map with Phenix (Fig. 2; Afonine et al., 2018).
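The ring-building step can be illustrated with a short Python sketch. This is not the gawk script used in this work, which is not reproduced here. It assumes that the symmetry axis is parallel to z and passes through a known centre, and the coordinates, centre and function names are hypothetical. Each copy is generated as x′ = Rx + t, with t = c − Rc, so that points on the axis stay fixed, and chains are renamed A to M.

```python
import math
import string


def rotation_and_translation(k, n_fold, centre):
    """Return (R, t) for copy k so that x' = R x + t rotates by 2*pi*k/n_fold
    about an axis parallel to z passing through `centre` (an assumption)."""
    angle = 2.0 * math.pi * k / n_fold
    c, s = math.cos(angle), math.sin(angle)
    R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    # t = centre - R * centre keeps every point on the axis in place.
    t = [centre[i] - sum(R[i][j] * centre[j] for j in range(3)) for i in range(3)]
    return R, t


def build_ring(monomer, n_fold=13, centre=(0.0, 0.0, 0.0)):
    """Replicate a monomer, given as a list of (atom_name, (x, y, z)), into an n-fold ring.
    Returns a list of (chain_id, atom_name, (x, y, z)) with chains A, B, C, ..."""
    ring = []
    for k in range(n_fold):
        R, t = rotation_and_translation(k, n_fold, centre)
        chain = string.ascii_uppercase[k]
        for atom_name, xyz in monomer:
            new_xyz = tuple(
                sum(R[i][j] * xyz[j] for j in range(3)) + t[i] for i in range(3)
            )
            ring.append((chain, atom_name, new_xyz))
    return ring


# Hypothetical single-atom "monomer", 10 A from a hypothetical axis position.
monomer = [("CA", (110.0, 100.0, 42.0))]
ring = build_ring(monomer, n_fold=13, centre=(100.0, 100.0, 0.0))
for chain, name, (x, y, z) in ring[:3]:
    print(chain, name, f"{x:.2f} {y:.2f} {z:.2f}")
```

For an axis in a general orientation, the model would first have to be rotated so that the axis lies along z, or R would have to be replaced by a rotation about an arbitrary unit vector.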
3.2. Molecular replacement
The resulting tridecameric partial model was used for MR with Phaser (McCoy et al., 2007) against data set 1. A unique solution was found with a positive log-likelihood gain (LLG) of 96 and a final translation-function Z-score value of 12.2, which indicated that the structure had been solved. The orientation of the symmetry axis of order 13 agreed with the outcome of the SRF. The peak in the χ = 27.7° section appeared at θ = 70°, φ = 0°, which indicated that the NCS axis of order 13 was located on the XZ (or ac) plane, inclined 70° from the Z (c) axis (Fig. 3). The model was then subjected to rigid-body refinement by protomer using REFMAC5 (Murshudov et al., 2011), which moved all of the protomers significantly (3.5 Å) towards the central channel, shrinking the particle diameter and central channel. This observation led us to suspect that the pixel size given for the cryo-EM data was not accurate, as was confirmed later by the microscope facility. We were initially given a pixel size of 1.42 Å per pixel, while a later calibration of the instrument gave 1.37 Å per pixel. Such a magnification-factor error could be a serious issue when using cryo-EM data for MR. In our case, the error translated into a shift of more than 6.5 Å in the diameter of an object of approximately 180 Å. However, phasing with the partial model was still successful, probably because the built helices were mostly not located at the edge of the particle.
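The effect of the pixel-size error on the particle dimensions can be checked with simple arithmetic. The sketch below assumes that the approximately 180 Å diameter refers to the correctly scaled particle and that the miscalibrated map inflates all distances by the ratio of the two pixel sizes. The function name is illustrative only.

```python
def apparent_size(true_size, nominal_px, calibrated_px):
    """Size measured in a map whose voxels are labelled with nominal_px
    although their real spacing is calibrated_px (all values in Angstrom)."""
    return true_size * nominal_px / calibrated_px


NOMINAL_PX = 1.42     # A per pixel, value first supplied (from the text)
CALIBRATED_PX = 1.37  # A per pixel, after recalibration (from the text)
DIAMETER = 180.0      # A, approximate particle diameter (from the text)

scale = NOMINAL_PX / CALIBRATED_PX
diameter_shift = apparent_size(DIAMETER, NOMINAL_PX, CALIBRATED_PX) - DIAMETER
radial_shift = diameter_shift / 2.0

print(f"scale factor   : {scale:.4f}")
print(f"diameter shift : {diameter_shift:.2f} A")
print(f"radial shift   : {radial_shift:.2f} A")
```

Under these assumptions the change in diameter is slightly above 6.5 Å. The corresponding radial displacement is of the same order as the inward movement of the protomers observed during rigid-body refinement.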
3.3. Density modification and model building
After rigid-body refinement, the MR map showed that the partial α-helical model fitted well into the electron density, but features further away from the initial model were not interpretable (Fig. 4, upper panel). Density-modification (DM) procedures were critical in order to improve the map (Cowtan, 2010). Solvent flattening, histogram matching and NCS averaging were applied with masks generated from the cryo-EM map. A mask of the whole complex was used for solvent flattening, while a slice of it comprising 27.7° of the portal map was used for NCS averaging. The same rotation matrices as used to build the cryo-EM atomic tridecamer were applied. A number of variations were made to the setting parameters and the best conditions were identified in terms of final average NCS correlation: a phase-extension protocol by resolution steps, starting at 7.9 Å, with solvent and averaging masks updated every 50 and 20 cycles, respectively, and a total of 104 cycles. The correlation between NCS-related regions of the map was 0.849. This DM procedure yielded a fully interpretable electron-density map of the particle (Fig. 4, lower panel). A model of the tridecameric protein could be built and refined. During model building, NCS-averaged maps calculated with Coot proved to be critical for the correct interpretation of the maps. Refinement of the structure yielded a tridecameric model containing 481 amino acids per monomer (Fig. 5, Table 3; PDB entry 6tjp).
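The idea behind NCS averaging and the NCS correlation used to judge it can be illustrated with a deliberately simplified one-dimensional toy in Python. It is not the DM algorithm: real density modification works on three-dimensional maps with masks and rotation matrices and is iterated together with solvent flattening and histogram matching, and the statistic reported by DM may be defined differently from the plain Pearson correlation used here. All names and numerical values are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def split_copies(ring, n_fold):
    """Cut a periodic density profile into its n_fold consecutive NCS copies."""
    step = len(ring) // n_fold
    return [ring[k * step:(k + 1) * step] for k in range(n_fold)]


def mean_ncs_correlation(ring, n_fold):
    """Mean correlation between the first copy and every other copy."""
    copies = split_copies(ring, n_fold)
    return sum(pearson(copies[0], c) for c in copies[1:]) / (n_fold - 1)


def ncs_average(ring, n_fold):
    """Replace every copy by the point-wise average over all copies."""
    copies = split_copies(ring, n_fold)
    unit = [sum(values) / n_fold for values in zip(*copies)]
    return unit * n_fold


random.seed(0)
N_FOLD, SAMPLES = 13, 24  # 13 copies, 24 samples per copy (toy values)
true_unit = [
    math.sin(2 * math.pi * i / SAMPLES) + 0.5 * math.cos(4 * math.pi * i / SAMPLES)
    for i in range(SAMPLES)
]
# Noisy "map": 13 identical copies plus independent noise on every sample.
noisy_ring = [v + random.gauss(0.0, 0.5) for v in true_unit * N_FOLD]
averaged_ring = ncs_average(noisy_ring, N_FOLD)

cc_before = mean_ncs_correlation(noisy_ring, N_FOLD)
cc_after = mean_ncs_correlation(averaged_ring, N_FOLD)
print(f"NCS correlation before averaging: {cc_before:.3f}")
print(f"NCS correlation after averaging : {cc_after:.3f}")
```

In this toy, a single averaging step makes all copies identical. In a real phase-extension run the averaged map is back-transformed and recombined with the observed amplitudes in every cycle, so the correlation rises gradually and remains below 1.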
4. Comparison with the direct use of a cryo-EM map for MR
The direct use of medium-resolution cryo-EM maps (such as that described in Agirrezabala et al., 2005) for phasing crystallographic data had previously been attempted without success. However, after solving the structure using the partial model, MR was again tried as an exercise with the new 'post-resolution-revolution' cryo-EM map, both with the inaccurate pixel size used during model building and with the corrected pixel size. In both cases a correct MR solution was obtained, with TFZ and LLG values of 16.0 and 81, respectively, in the first case and 26.7 and 546, respectively, in the second case. These values are better than when a partial model is used, in particular when the scale-factor error is corrected, but it has to be noted that less than a third of the structure is used in the partial atomic model case.
In addition, the final maps obtained by phasing with the partial model and with the cryo-EM map were also compared, showing that both of them have a similar level of detail in all of the domains, which would allow the building of the final model (Fig. 6). Thus, both strategies appear to give, in our example, the correct solution.
5. Solution of the physiological dodecameric gp8 atomic model
The monomeric structure of gp8 was used to solve the structure of the protein in its physiological dodecameric form using data set 2 and placing 12 copies of the monomer by MR (Cuervo et al., 2019). All structures were refined with REFMAC5 and Phenix (Liebschner et al., 2019), first applying tight NCS restraints, which were progressively relaxed on the side chains. All models were validated with MolProbity (Williams et al., 2018).
6. Discussion
Cryo-EM maps have successfully been used to directly phase X-ray data by MR (Wynne et al., 1999; Chandran et al., 2009; Song et al., 2015). Here, we describe an alternative protocol that also combines X-ray crystallography and cryo-EM data for solving macromolecular structures. Information from the cryo-EM experiment is incorporated into the workflow in two specific steps: MR and DM.
The workflow we present here is based on building a partial model de novo, and therefore no high-resolution information such as a previous atomic model is required. When compared with the direct use of the cryo-EM map for MR, this strategy avoids the map-preparation steps required in those protocols (Jackson et al., 2015). The extra step of building a partial model was fast using the automatic helix-building option in Coot, and thus did not add significantly to the overall time.
We suggest the following protocol as an alternative to the direct use of the cryo-EM map to phase crystallographic structures (Fig. 7).
It is important to note that for DM calculation procedures the accurate orientation and location of the NCS axis, as well as an accurate mask from the cryo-EM volume, are necessary to allow the rapid and significant improvement of the crystallographic electron-density map.
Using this protocol, we managed to phase X-ray data with a partial model of the protein containing only 30% of the residues built as polyalanine chains. Although we also managed to subsequently solve the structure by performing MR with the cryo-EM map as an initial model, we present this workflow as an alternative, which in our case yielded the correct solution in a short time.
7. Conclusions
Despite its spectacular advances, cryo-EM may not always provide maps of sufficient resolution to allow the building and refinement of a full atomic model, depending on the behaviour of the sample. However, medium-resolution cryo-EM maps can be obtained rapidly and usually show clear secondary-structure features, in particular α-helices. On the other hand, large X-ray crystallographic structures can sometimes be difficult to determine because of a lack of derivatives or the long experimental procedures that are required before well diffracting crystals that are useful for phasing are obtained. As structural biology projects become more challenging, dealing with heterogeneous and large complexes, combining data from cryo-EM and X-ray crystallography emerges as an advantageous strategy. This can be performed by directly using the cryo-EM map as an initial model for MR or, as an alternative, by using a partial model built on the cryo-EM map as shown here.
Acknowledgements
The authors wish to thank the personnel of the following facilities for help during X-ray and cryo-EM data acquisition: beamline ID14-4 of the European Synchrotron Radiation Facility (ESRF), Grenoble, France and the Cryo-EM Facility of the Centro Nacional de Biotecnología–Centro de Investigaciones Biológicas (CNB–CIB), Madrid, Spain.
Funding information
The following funding is acknowledged: Ministry of Science, Innovation and Universities of Spain (grant Nos. BFU2014-53550-P and BFU2017-83720-P to Miquel Coll; grant No. BFU2014-54181 to José L. Carrascosa; contract No. SEV-2013-0347 to Ana Cuervo; contract No. RYC-2011-09071 to Cristina Machon; award No. SEV-2015-0500 to IRB Barcelona; award No. MDM-2014-0435 to IBMB Structural Biology Unit); Catalan Government CERCA Programme (grant to IRB Barcelona).
References
Afonine, P. V., Poon, B. K., Read, R. J., Sobolev, O. V., Terwilliger, T. C., Urzhumtsev, A. & Adams, P. D. (2018). Acta Cryst. D74, 531–544.
Agirrezabala, X., Martín-Benito, J., Valle, M., González, J. M., Valencia, A., Valpuesta, J. M. & Carrascosa, J. L. (2005). J. Mol. Biol. 347, 895–902.
Chandran, V., Fronzes, R., Duquerroy, S., Cronin, N., Navaza, J. & Waksman, G. (2009). Nature, 462, 1011–1015.
Cowtan, K. (2010). Acta Cryst. D66, 470–478.
Cuervo, A. & Carrascosa, J. L. (2012). Curr. Opin. Biotechnol. 23, 529–536.
Cuervo, A., Fàbrega-Ferrer, M., Machón, C., Conesa, J. J., Fernández, F. J., Pérez-Luque, R., Pérez-Ruiz, M., Pous, J., Vega, M. C., Carrascosa, J. L. & Coll, M. (2019). Nat. Commun. 10, 3746.
Doerr, A. (2016). Nat. Methods, 13, 23.
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486–501.
Jackson, R. N., McCoy, A. J., Terwilliger, T. C., Read, R. J. & Wiedenheft, B. (2015). Nat. Protoc. 10, 1275–1284.
Kabsch, W. (2010). Acta Cryst. D66, 125–132.
Liebschner, D., Afonine, P. V., Baker, M. L., Bunkóczi, G., Chen, V. B., Croll, T. I., Hintze, B., Hung, L.-W., Jain, S., McCoy, A. J., Moriarty, N. W., Oeffner, R. D., Poon, B. K., Prisant, M. G., Read, R. J., Richardson, J. S., Richardson, D. C., Sammito, M. D., Sobolev, O. V., Stockwell, D. H., Terwilliger, T. C., Urzhumtsev, A. G., Videau, L. L., Williams, C. J. & Adams, P. D. (2019). Acta Cryst. D75, 861–877.
McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C. & Read, R. J. (2007). J. Appl. Cryst. 40, 658–674.
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
Nogales, E. (2016). Nat. Methods, 13, 24–27.
Rossmann, M. G. & Blow, D. M. (1962). Acta Cryst. 15, 24–31.
Scheres, S. H. W. (2016). Methods Enzymol. 579, 125–157.
Song, W., Wang, J., Han, Z., Zhang, Y., Zhang, H., Wang, W., Chang, J., Xia, B., Fan, S., Zhang, D., Wang, J., Wang, H.-W. & Chai, J. (2015). Nat. Struct. Mol. Biol. 22, 782–787.
Stuart, D. I. & Abrescia, N. G. A. (2013). Acta Cryst. D69, 2257–2265.
Sun, L., Zhang, X., Gao, S., Rao, P. A., Padilla-Sanchez, V., Chen, Z., Sun, S., Xiang, Y., Subramaniam, S., Rao, V. B. & Rossmann, M. G. (2015). Nat. Commun. 6, 7548.
Vagin, A. & Teplyakov, A. (2010). Acta Cryst. D66, 22–25.
Williams, C. J., Headd, J. J., Moriarty, N. W., Prisant, M. G., Videau, L. L., Deis, L. N., Verma, V., Keedy, D. A., Hintze, B. J., Chen, V. B., Jain, S., Lewis, S. M., Arendall, W. B., Snoeyink, J., Adams, P. D., Lovell, S. C., Richardson, J. S. & Richardson, D. C. (2018). Protein Sci. 27, 293–315.
Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., Emsley, P., Evans, P. R., Keegan, R. M., Krissinel, E. B., Leslie, A. G. W., McCoy, A., McNicholas, S. J., Murshudov, G. N., Pannu, N. S., Potterton, E. A., Powell, H. R., Read, R. J., Vagin, A. & Wilson, K. S. (2011). Acta Cryst. D67, 235–242.
Wynne, S. A., Crowther, R. A. & Leslie, A. G. W. (1999). Mol. Cell, 3, 771–780.
Xiong, Y. (2008). Acta Cryst. D64, 76–82.
Zeng, L., Ding, W. & Hao, Q. (2018). IUCrJ, 5, 382–389.
Zheng, S. Q., Palovcak, E., Armache, J.-P., Verba, K. A., Cheng, Y. & Agard, D. A. (2017). Nat. Methods, 14, 331–332.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.