research communications
Structure of an RNA helix with pyrimidine mismatches and cross-strand stacking
aDepartment of Biochemistry, University of Wisconsin–Madison, Madison, WI 53706, USA, and bDepartment of Chemistry, University of Illinois Urbana–Champaign, Urbana, IL 61801, USA
*Correspondence e-mail: sebutcher@wisc.edu
The structure of a 22-base-pair RNA helix with mismatched pyrimidine base pairs is reported. The helix contains two symmetry-related CUG sequences: a triplet-repeat motif implicated in myotonic dystrophy type 1. The CUG repeat contains a U–U mismatch sandwiched between Watson–Crick pairs. Additionally, the center of the helix contains a dimerized UUCG motif with tandem pyrimidine (U–C/C–U) mismatches flanked by U–G wobble pairs. This region of the structure is significantly different from previously observed structures that share the same sequence and neighboring base pairs. The tandem pyrimidine mismatches are unusual and display sheared, cross-strand stacking geometries that locally constrict the helical width, a type of stacking previously associated with purines in internal loops. Thus, pyrimidine-rich regions of RNA have a high degree of structural diversity.
Keywords: RNA; myotonic dystrophy type 1; pyrimidine mismatches.
PDB reference: dimerized UUCG motif, 6e7l
1. Introduction
Cellular transcriptomes are large with myriad important biological functions. However, only a few percent of the structural coordinates in the Worldwide Protein Data Bank correspond to RNA, and some of these are redundant. At the time of writing, there are 1213 unique RNA-containing structures at a resolution of 3 Å or higher in the RNA 3D Motif Atlas (Petrov et al., 2013). From these data it is apparent that RNA molecules are structurally diverse. Whereas many structural motifs have been described for RNA (Butcher & Pyle, 2011), it is likely that new motifs will be discovered as more structures are solved.
Myotonic dystrophy type 1 (DM1) is a heritable disease caused by the expansion of genomically encoded CUG repeats in the 3′ untranslated region of the dystrophia myotonica protein kinase (DMPK) gene (Mirkin, 2007). The CUG repeats are thought to form hairpin stem-loop structures that sequester the splicing factor muscleblind-like protein 1 (MBNL1), resulting in splicing defects (Miller et al., 2000). Crystal structures of RNAs containing CUG repeats have previously been determined (Coonrod et al., 2012; Kiliszek et al., 2009; Kumar et al., 2011; Mooers et al., 2005). In these structures, the CUG repeats are composed of C–G base pairs that sandwich U–U mismatches. These structures have shown that the U–U mismatches can adopt heterogeneous conformations, with zero, one or two hydrogen bonds.
The UUCG tetraloop is one of the most stable and commonly occurring RNA loop sequences (Cheong et al., 1990). Structures of the UUCG tetraloop have been determined (Allain & Varani, 1995; Ennifar et al., 2000; Nichols et al., 2018; Nozinovic et al., 2010). It has previously been observed that during crystallization RNA hairpins containing UUCG tetraloops can dimerize into double helices in which the UUCG sequence forms non-Watson–Crick base pairs (Berger et al., 2019; Cruse et al., 1994; Holbrook et al., 1991). The two previous crystal structures of dimerized UUCG sequences contain U–G wobble pairs flanking mismatched U–C pairs that are bridged by an intervening water molecule.
Here, we report the crystal structure of an RNA that contains a CUG repeat and a UUCG sequence. The RNA was designed to form a hairpin with an isolated CUG repeat [Fig. 1(a)] to provide a platform for analyzing compounds designed to target the CUG repeat sequence (Arambula et al., 2009). Instead, the RNA crystallized into a duplex in which the two CUG repeats are related by twofold symmetry and form a U–U mismatch flanked by C–G pairs. In the CUG repeat, the U–U base pair has two hydrogen bonds. The dimerized UUCG sequence displays novel cross-strand stacking of pyrimidine pairs, with inter-strand hydrogen bonds between the uracil nucleobase on one strand and the uracil ribose 2′ O atom of the opposite strand.
2. Materials and methods
2.1. RNA production
A putative RNA hairpin (5′-GGGCUGCACUUCGGUGCUGCCC-3′) was purchased from Integrated DNA Technologies. The synthesized RNA was resuspended in anion-exchange buffer (300 mM NaCl, 20 mM potassium phosphate pH 6.5, 1 mM EDTA, 1 mM sodium azide) and immobilized on a 1 ml HiTrap Q column (GE Healthcare). The column was washed with ten volumes of buffer prior to step elution in anion-exchange buffer supplemented with 2 M NaCl. The resulting eluate was concentrated using centrifugal filters with a 3 kDa cutoff (Amicon) and then iteratively diluted tenfold and reconcentrated three times using a buffer containing only 20 mM deuterated bis-Tris pH 6.5. The RNA was then concentrated to 150 µM. A small aliquot of this RNA was resolved on an analytical nondenaturing polyacrylamide gel, which showed a trace amount (∼5%) of RNA migrating as a slower species that is presumed to be an intermolecular dimer (data not shown).
The RNA was subsequently concentrated to approximately 1.5 mM (∼10 mg ml−1) prior to monitoring its association with the compound 'JFA' (Arambula et al., 2009) via 1H NMR (data not shown). For this process, JFA was dissolved in 100% DMSO and added stepwise to a final approximately twofold stoichiometric excess, resulting in a 300 µl sample containing approximately 800 µM RNA, 1600 µM JFA, 5% DMSO, 5% D2O and 20 mM deuterated bis-Tris pH 6.5. The RNA with JFA was finally concentrated using 3 kDa cutoff centrifugal filters (Amicon) to a volume of approximately 100 µl without additional treatment before crystallization screening.
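As a consistency check on the concentrations quoted above, the following minimal sketch estimates the molecular weight of the 22-nucleotide strand and converts millimolar to mg ml−1. The residue masses are approximate average values for the free-acid form, and the terminal correction assumes 5′-OH and 3′-OH ends, as is typical for a chemically synthesized strand; both are assumptions of this sketch, not values reported by the authors. The function names are illustrative only.

```python
# Approximate average masses (Da) of RNA residues within a chain
# (nucleoside monophosphate minus water, free-acid form). Assumed values.
RESIDUE_MASS = {"A": 329.21, "C": 305.18, "G": 345.21, "U": 306.17}

# Assumed termini: 5'-OH and 3'-OH (no terminal phosphate).
# Add one water (18.02 Da) and remove one HPO3 (79.98 Da).
TERMINAL_CORRECTION = 18.02 - 79.98

SEQUENCE = "GGGCUGCACUUCGGUGCUGCCC"  # sequence given in Section 2.1


def molecular_weight(seq: str) -> float:
    """Estimate the molecular weight (Da) of a single RNA strand."""
    return sum(RESIDUE_MASS[base] for base in seq) + TERMINAL_CORRECTION


def mg_per_ml(conc_mM: float, mw: float) -> float:
    """Convert a millimolar concentration to mg/ml for a given molecular weight."""
    # mmol/L * g/mol = mg/L; divide by 1000 to obtain mg/ml.
    return conc_mM * mw / 1000.0


if __name__ == "__main__":
    mw = molecular_weight(SEQUENCE)
    print(f"Estimated molecular weight: {mw:.0f} Da")
    print(f"1.5 mM corresponds to about {mg_per_ml(1.5, mw):.1f} mg/ml")
    print(f"JFA:RNA molar ratio in the NMR sample: {1600 / 800:.1f}")
```

Under these assumptions the strand comes out at roughly 7.0 kDa, so 1.5 mM corresponds to about 10.5 mg ml−1, in line with the ∼10 mg ml−1 stated in the text.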
2.2. Crystallization, data collection and refinement
High-throughput crystallization screening was performed by sitting-drop vapor diffusion in 96-well plates at 4°C using 0.2 µl RNA solution, 0.2 µl crystallization reagent and a reservoir volume of 50 µl with a Mosquito crystallization robot (TTP Labtech). After a few weeks, several small crystals (∼10 × 50 µm) were obtained using a crystallization reagent consisting of 0.1 M HEPES pH 7.4, 20% PEG 3350, 20% glycerol, 10% MPD. Crystals were harvested with 100 µm LD MicroLoops (MiTeGen) and vitrified via rapid immersion in liquid nitrogen.
Diffraction data were collected on NE-CAT beamline 24-ID-E at the Advanced Photon Source using an MD2 diffractometer and an EIGER 16M detector. All scientific software was managed through a local SBGrid client (Morin et al., 2013). The data were integrated using XDS (Kabsch, 2010). Initial point-group estimation and scaling were performed in POINTLESS (Evans, 2011) and AIMLESS (Evans & Murshudov, 2013), respectively. Xtriage (Adams et al., 2010) was used to assay potential twinning in the diffraction data after identification of the correct space group (see below).
Initial phases were determined by molecular replacement using Phaser (McCoy et al., 2007) with ideal A-form duplex RNA as the initial search model. Molecular replacement was attempted in all possible space groups within the P4 point group. A single solution in P4₁2₁2 yielded an initial map of sufficient quality to determine that the RNA was in the form of an intermolecular dimer rather than the anticipated hairpin structure. Manual model building was performed in Coot (Emsley et al., 2010), with subsequent automated refinement and model validation in PHENIX (Afonine et al., 2012) and REFMAC (Murshudov et al., 2011; Winn et al., 2011). All figures were prepared with PyMOL (https://www.pymol.org). Coordinates and structure factors have been deposited in the Protein Data Bank under accession code 6e7l; diffraction images are available from the SBGrid Data Bank at https://doi.org/10.15785/SBGRID/712.
3. Results
The 22-nucleotide RNA strand contains two CUG repeats and a UUCG sequence, and is capable of forming a hairpin or a duplex conformation [Fig. 1(a)]. The crystals diffracted X-rays to 2.59 Å resolution [Fig. 1(b) and Table 1]. The electron density was well resolved for the entire RNA, which formed an intermolecular duplex in the crystal with the two strands related by twofold symmetry; thus, only one strand of the duplex is present in the crystallographic asymmetric unit [Fig. 1(c)]. For the purposes of discussion, we give one strand in the duplex the numbering 1–22 and the other 1′–22′. The RNA adopts an A-form geometry throughout, except in the UUCG sequence regions, which are involved in crystal contacts [Figs. 1(d) and 2(a)].
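The pairing register of the symmetric duplex follows directly from the sequence: in an antiparallel homodimer of a 22-nucleotide strand, residue i faces residue 23 − i of the partner strand. The sketch below enumerates these positions and labels each as Watson–Crick, G–U wobble or mismatch. It is a purely sequence-based illustration and assumes a fully aligned register with no bulges; the function names are hypothetical, and the output says nothing about the hydrogen bonding actually observed in the crystal.

```python
from collections import Counter

SEQUENCE = "GGGCUGCACUUCGGUGCUGCCC"  # sequence given in Section 2.1

WATSON_CRICK = {frozenset("GC"), frozenset("AU")}
WOBBLE = {frozenset("GU")}


def classify_pair(a: str, b: str) -> str:
    """Label two opposing bases by sequence alone."""
    pair = frozenset((a, b))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


def duplex_pairs(seq: str):
    """Pair residue i (1-based) with residue n + 1 - i of an identical, antiparallel strand."""
    n = len(seq)
    return [
        (i + 1, seq[i], n - i, seq[n - 1 - i], classify_pair(seq[i], seq[n - 1 - i]))
        for i in range(n)
    ]


if __name__ == "__main__":
    pairs = duplex_pairs(SEQUENCE)
    for pos, base, partner_pos, partner_base, kind in pairs:
        print(f"{base}{pos:<2d} - {partner_base}{partner_pos}'  {kind}")
    print(Counter(kind for *_, kind in pairs))
```

With this register, U5 opposes U18′ within each CUG repeat, U10 opposes G13′, and U11 opposes C12′, which matches the U–U mismatch and the central U–C/C–U mismatches flanked by U–G wobble pairs described in the text.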
All ribose sugar puckers are C3′-endo, with the exception of U11 and U11′, which are C2′-endo. The UUCG region forms an unusual structure, with two U–C base pairs that are cross-strand stacked [Fig. 2(b)]. The U–C base pairs form a hydrogen bond between the uracil O2 and the cytosine N3 amino group. An additional inter-strand hydrogen bond is formed between the uracil N3 and the uracil ribose O2′. This conformation is significantly different from previous structures of the same sequence, which lacked cross-strand stacking (Berger et al., 2019; Cruse et al., 1994; Holbrook et al., 1991) [Fig. 2(c)]. The cross-strand stacked U–C base pairs are flanked by U–G wobble pairs. The U–G wobble-pair region is involved in helical packing within the crystal [Fig. 1(d)], mediated by minor-groove interactions that are stabilized by inter-helical hydrogen bonds involving 2′ hydroxyl groups, similar to 'ribose-zipper' interactions (Tamura & Holbrook, 2002).
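Sugar puckers such as C3′-endo and C2′-endo are commonly assigned from the pseudorotation phase angle P of the ribose ring. The sketch below uses the standard Altona–Sundaralingam expression, tan P = [(ν4 + ν1) − (ν3 + ν0)] / [2ν2(sin 36° + sin 72°)], where ν0 to ν4 are the endocyclic torsion angles. It is offered as an illustration of the definition, not as the procedure used by the authors; the test torsions are idealized values generated from the model itself (with an assumed amplitude of 38°), not measurements from 6e7l.

```python
import math


def pseudorotation_phase(nu0, nu1, nu2, nu3, nu4):
    """Pseudorotation phase angle P in degrees (0-360) from ring torsions in degrees.

    nu0 = C4'-O4'-C1'-C2', nu1 = O4'-C1'-C2'-C3', nu2 = C1'-C2'-C3'-C4',
    nu3 = C2'-C3'-C4'-O4', nu4 = C3'-C4'-O4'-C1'.
    """
    numerator = (nu4 + nu1) - (nu3 + nu0)
    denominator = 2.0 * nu2 * (math.sin(math.radians(36.0)) + math.sin(math.radians(72.0)))
    # atan2 resolves the quadrant, so no separate 180-degree correction is needed.
    return math.degrees(math.atan2(numerator, denominator)) % 360.0


def pucker_label(phase_deg):
    """Name the two puckers discussed in the text; all other ranges are 'other'."""
    if 0.0 <= phase_deg < 36.0:
        return "C3'-endo"
    if 144.0 <= phase_deg < 180.0:
        return "C2'-endo"
    return "other"


def ideal_torsions(phase_deg, amplitude_deg=38.0):
    """Idealized torsions nu_j = amplitude * cos(P + 144 * (j - 2)), for j = 0..4."""
    return tuple(
        amplitude_deg * math.cos(math.radians(phase_deg + 144.0 * (j - 2)))
        for j in range(5)
    )


if __name__ == "__main__":
    for phase in (18.0, 162.0):
        recovered = pseudorotation_phase(*ideal_torsions(phase))
        print(f"P = {recovered:.1f} deg: {pucker_label(recovered)}")
```

In practice the five torsions would be measured from the refined coordinates; the idealized generator here only serves to confirm that the phase calculation recovers the input value.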
The two CUG regions are symmetry-related, with identical structures. The CUG repeat structure is composed of a Watson–Crick C–G pair, a noncanonical U–U pair with two hydrogen bonds and a Watson–Crick G–C pair. The U–U base pair has hydrogen bonds between the imino N atoms and the O2 and O4 atoms (Fig. 3). This type of U–U base pair has previously been termed a 'type V' pair (Fig. 3; Coonrod et al., 2012).
4. Discussion
(CUG)N repeats in RNA (where N is the number of repeats) form helices with U–U mismatches that display heterogeneous base-pairing patterns (Coonrod et al., 2012; Kiliszek et al., 2009; Kumar et al., 2011; Mooers et al., 2005). The base-paired 5′-CUG-3′ sequences in the structure reported here are symmetry-related and form a 'type V' base pair (Fig. 3), which has been observed previously by crystallography (Kumar et al., 2011) and NMR (Parkesh et al., 2011). The CUG repeat is predominantly A-form, with a small degree of cross-strand overlap that places the central uridine within van der Waals radius of the guanosine on the opposite strand. This slight degree of cross-strand stacking has been noted previously in the structure of (CUG)6 (Mooers et al., 2005). The geometry of the U–U wobble places the O2 and O4 ketone O atoms in close proximity. While we do not observe associated cations in this structure, the close approach of ketone O atoms in G–U wobble pairs is known to create a cation-binding site, which can be utilized for phasing (Keel et al., 2007).
Cross-strand stacking in RNA typically involves purines (Chen et al., 2005; Correll et al., 1997; Gautheret et al., 1994; Lee et al., 2006; SantaLucia et al., 1990). To our knowledge, the dimerized UUCG structure reported here is a very rare example of a pyrimidine-only interaction with extensive cross-strand stacking. One other known example of cross-strand pyrimidine stacking in RNA occurs in the low-pH structure of the i-motif, which involves intercalated and cross-strand stacked cytidines (Snoussi et al., 2001). Thus, the unusual structure reported here helps to expand our general knowledge of RNA conformational space.
Supporting information
PDB reference: dimerized UUCG motif, 6e7l
Diffraction images: https://doi.org/10.15785/SBGRID/712
Acknowledgements
We thank Dr Craig Bingman for helpful suggestions. Use of the Advanced Photon Source, an Office of Science User Facility operated for the US Department of Energy (DOE) Office of Science by Argonne National Laboratory, was supported by the US DOE under Contract No. DE-AC02-06CH11357. Use of NE-CAT was supported by National Institutes of Health (NIH) grants P41 GM103403 and S10 RR029205.
Funding information
This work was funded by NIH/NIGMS grant R35 GM118131 to SEB and NIH/NIAMS grant R01 AR069645 to SCZ. LDH is a member of the NIH Chemistry-Biology Interface Training Grant (NRSA 1-T-32-GM070421).
References
Adams, P. D., Afonine, P. V., Bunkóczi, G., Chen, V. B., Davis, I. W., Echols, N., Headd, J. J., Hung, L.-W., Kapral, G. J., Grosse-Kunstleve, R. W., McCoy, A. J., Moriarty, N. W., Oeffner, R., Read, R. J., Richardson, D. C., Richardson, J. S., Terwilliger, T. C. & Zwart, P. H. (2010). Acta Cryst. D66, 213–221.
Afonine, P. V., Grosse-Kunstleve, R. W., Echols, N., Headd, J. J., Moriarty, N. W., Mustyakimov, M., Terwilliger, T. C., Urzhumtsev, A., Zwart, P. H. & Adams, P. D. (2012). Acta Cryst. D68, 352–367.
Allain, F. H. & Varani, G. (1995). J. Mol. Biol. 250, 333–353.
Arambula, J. F., Ramisetty, S. R., Baranger, A. M. & Zimmerman, S. C. (2009). Proc. Natl Acad. Sci. USA, 106, 16068–16073.
Berger, K. D., Kennedy, S. D. & Turner, D. H. (2019). Biochemistry, 58, 1094–1108.
Butcher, S. E. & Pyle, A. M. (2011). Acc. Chem. Res. 44, 1302–1311.
Chen, G., Znosko, B. M., Kennedy, S. D., Krugh, T. R. & Turner, D. H. (2005). Biochemistry, 44, 2845–2856.
Cheong, C., Varani, G. & Tinoco, I. Jr (1990). Nature (London), 346, 680–682.
Coonrod, L. A., Lohman, J. R. & Berglund, J. A. (2012). Biochemistry, 51, 8330–8337.
Correll, C. C., Freeborn, B., Moore, P. B. & Steitz, T. A. (1997). Cell, 91, 705–712.
Cruse, W. B., Saludjian, P., Biala, E., Strazewski, P., Prangé, T. & Kennard, O. (1994). Proc. Natl Acad. Sci. USA, 91, 4160–4164.
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486–501.
Ennifar, E., Nikulin, A., Tishchenko, S., Serganov, A., Nevskaya, N., Garber, M., Ehresmann, B., Ehresmann, C., Nikonov, S. & Dumas, P. (2000). J. Mol. Biol. 304, 35–42.
Evans, P. R. (2011). Acta Cryst. D67, 282–292.
Evans, P. R. & Murshudov, G. N. (2013). Acta Cryst. D69, 1204–1214.
Gautheret, D., Konings, D. & Gutell, R. R. (1994). J. Mol. Biol. 242, 1–8.
Holbrook, S. R., Cheong, C., Tinoco, I. Jr & Kim, S.-H. (1991). Nature (London), 353, 579–581.
Kabsch, W. (2010). Acta Cryst. D66, 125–132.
Keel, A. Y., Rambo, R. R., Batey, R. T. & Kieft, J. S. (2007). Structure, 15, 761–772.
Kiliszek, A., Kierzek, R., Krzyzosiak, W. J. & Rypniewski, W. (2009). Nucleic Acids Res. 37, 4149–4156.
Kumar, A., Park, H., Fang, P., Parkesh, R., Guo, M., Nettles, K. W. & Disney, M. D. (2011). Biochemistry, 50, 9928–9935.
Lee, J. C., Gutell, R. R. & Russell, R. (2006). J. Mol. Biol. 360, 978–988.
McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C. & Read, R. J. (2007). J. Appl. Cryst. 40, 658–674.
Miller, J. W., Urbinati, C. R., Teng-Umnuay, P., Stenberg, M. G., Byrne, B. J., Thornton, C. A. & Swanson, M. S. (2000). EMBO J. 19, 4439–4448.
Mirkin, S. M. (2007). Nature (London), 447, 932–940.
Mooers, B. H., Logue, J. S. & Berglund, J. A. (2005). Proc. Natl Acad. Sci. USA, 102, 16626–16631.
Morin, A., Eisenbraun, B., Key, J., Sanschagrin, P. C., Timony, M. A., Ottaviano, M. & Sliz, P. (2013). Elife, 2, e01456.
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
Nichols, P. J., Henen, M. A., Born, A., Strotz, D., Güntert, P. & Vögeli, B. (2018). Commun. Biol. 1, 61.
Nozinovic, S., Fürtig, B., Jonker, H. R. A., Richter, C. & Schwalbe, H. (2010). Nucleic Acids Res. 38, 683–694.
Parkesh, R., Fountain, M. & Disney, M. D. (2011). Biochemistry, 50, 599–601.
Petrov, A. I., Zirbel, C. L. & Leontis, N. B. (2013). RNA, 19, 1327–1340.
SantaLucia, J. Jr, Kierzek, R. & Turner, D. H. (1990). Biochemistry, 29, 8813–8819.
Snoussi, K., Nonin-Lecomte, S. & Leroy, J. L. (2001). J. Mol. Biol. 309, 139–153.
Tamura, M. & Holbrook, S. R. (2002). J. Mol. Biol. 320, 455–474.
Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., Emsley, P., Evans, P. R., Keegan, R. M., Krissinel, E. B., Leslie, A. G. W., McCoy, A., McNicholas, S. J., Murshudov, G. N., Pannu, N. S., Potterton, E. A., Powell, H. R., Read, R. J., Vagin, A. & Wilson, K. S. (2011). Acta Cryst. D67, 235–242.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.