

research papers
of 18-mer DNA structures
aInstitute of Biotechnology of the Czech Academy of Sciences, BIOCEV, Průmyslová 595, 252 50 Vestec, Czech Republic, and bFaculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, Břehová 7, 115 19 Prague 1, Czech Republic
*Correspondence e-mail: bohdan.schneider@ibt.cas.cz
Nine new crystal structures of CG-rich DNA 18-mers with the sequence 5′-GGTGGGGC-XZ-GCCCCACC-3′, which are related to the bacterial repetitive extragenic palindromes, are reported. 18-mers with the central XZ dinucleotide systematically mutated to all 16 sequences show complex behavior in solution, but all ten 18-mers that have been crystallized so far form A-form duplexes. The refinement protocol benefited from the recurrent use of geometries of the dinucleotide conformer (NtC) classes as restraints in regions of poor electron density. The restraints are automatically generated at the dnatco.datmos.org web service and are available for download. This NtC-driven protocol significantly helped to stabilize the structure refinement. The NtC-driven refinement protocol can be adapted to other low-resolution data such as cryo-EM maps. To test the quality of the final structural models, a novel validation method based on comparison of the electron density and conformational similarity to the NtC classes was employed.
Keywords: DNA structure; dnatco.datmos.org; structure validation; structure refinement; base pairing.
1. Introduction
The ability to form pairs between nitrogenous bases is fundamental to the biological functions of nucleic acids (Iyer et al., 2006). In DNA, the formation of non-Watson–Crick pairs influences the duplex geometry by deflecting it from its optimum compatible with the canonical Watson–Crick pairs (Kunz et al., 2009), and the ability of DNA duplexes to incorporate these pairs depends to a large part on the plasticity of the DNA backbone. The formation of non-Watson–Crick pairs can be stabilized by stacking of the aromatic rings of the bases that are isosteric with the Watson–Crick pairs and therefore compliant with the helical architecture (Westhof, 2014).
Duplex destabilization by the formation of non-Watson–Crick pairs increases its structural flexibility, which can lead to the formation of multiple molecular species in solution. Competition of these species in the crystallization batch influences the process of crystallization and may decrease the quality of the resulting crystals or even preclude crystal formation. This fact, and a broader issue of the emergence of DNA and especially RNA structures with low crystallographic resolutions around 3 Å, drew our attention to the process of the refinement of nucleic acid structures and to the dinucleotide conformer (NtC) classes (Černý, Božíková, Svoboda et al., 2020). The known geometries of the NtC classes provide the possibility to use them as restraints in refinement protocols. The effectiveness of this process needs to be rigorously tested; in this work, we are making the first step.
This paper builds on our previous studies of CG-rich DNA oligonucleotides (Charnavets et al., 2015; Kolenko et al., 2020), where we have explained the biological relevance of these sequences. The CD spectra of REP-related 18-mers of sequence 5′-GGTGGGGC-XZ-GCCCCACC-3′, in which we mutated the central XZ dinucleotide, indicated that these 16 oligonucleotides behave differently in solution. Therefore, we wanted to further analyze the structural differences of these 18-mers by X-ray crystallography, especially from the point of view of the differences between XZ dinucleotides that are capable and incapable of forming Watson–Crick pairs. We succeeded in the crystallization of nine new 18-mers and present their crystal structures here (Table 1). The limited quality of the diffraction data of these 18-mers, with resolutions between 2.5 and 3.0 Å, called for a new approach to refinement, and we used our knowledge of the NtC classes to restrain the dinucleotide geometries in some of the refined structures. We believe that the modifications to the refinement protocol and the new validation criteria presented here may be beneficial for other lower resolution nucleic acid structures.
2. Methods and materials
2.1. Crystallization experiments
We studied CG-rich sequences related to bacterial REP elements (Bertels & Rainey, 2011) from Cardiobacterium hominis. Oligonucleotides were synthesized by and purchased from Sigma–Aldrich and Generi Biotech with standard desalting purification.
All crystallized sequences can be written as 5′-GGTGGGGC-XZ-GCCCCACC-3′ and we succeeded in crystallization of 18-mers where XZ were the dinucleotides AT, AC, AG, CC, CG, GC, GT, TA and TC; they are further named Chom-18-AT, Chom-18-AC, Chom-18-AG etc. All oligonucleotides were dissolved in 50 mM Tris pH 8 to a final concentration of 1 mM and stored in the freezer (−18°C). Prior to crystallization, the oligonucleotides were thawed at 20°C, heated to 95°C in a thermoblock for 5 min and cooled to 20°C. Initial screening was performed with the Natrix screen from Hampton Research. The most promising conditions, F2 and F4, were further optimized. Condition F2 consisted of 80 mM NaCl (salt), 12 mM KCl (salt), 20 mM MgCl2 (salt), 0.04 M sodium cacodylate pH 6.5 (buffer), 30%(v/v) MPD (precipitant) and 12 mM spermine·(HCl)4 (additive) and condition F4 consisted of 80 mM SrCl2 (salt), 0.04 M sodium cacodylate pH 6.5 (buffer), 35%(v/v) MPD (precipitant) and 12 mM spermine·(HCl)4 (additive). Optimization was performed in a hanging-drop vapor-diffusion setup. The final crystallization conditions are listed in Supplementary Table S1; the volume of the drops was 3 µl, with a 2:1 or 1:1 ratio of DNA stock:reservoir solution, and the reservoir volume was 1000 µl. The variants crystallized within one to four days. Microseeding greatly improved the efficiency of the crystal growth of the Chom-18-AG variant. Crystallization attempts at 10°C failed. Photographs of several crystals are depicted in Supplementary Fig. S1.
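As a worked illustration of the drop setup described above, the following minimal sketch computes the volumes of DNA stock and reservoir solution in a 3 µl drop and the initial DNA concentration in the drop, before any vapor-diffusion equilibration. The function name and its arguments are hypothetical; only the 3 µl drop volume, the 2:1 and 1:1 ratios and the 1 mM stock concentration are taken from the text.

```python
def drop_composition(drop_volume_ul, dna_parts, reservoir_parts, dna_stock_mm):
    """Return (DNA volume in ul, reservoir volume in ul, initial DNA concentration in mM)."""
    total_parts = dna_parts + reservoir_parts
    dna_ul = drop_volume_ul * dna_parts / total_parts
    reservoir_ul = drop_volume_ul - dna_ul
    # Dilution of the DNA stock by the reservoir solution at the moment of mixing.
    dna_mm = dna_stock_mm * dna_ul / drop_volume_ul
    return dna_ul, reservoir_ul, dna_mm


if __name__ == "__main__":
    for dna_parts, reservoir_parts in ((2, 1), (1, 1)):
        dna_ul, reservoir_ul, dna_mm = drop_composition(3.0, dna_parts, reservoir_parts, 1.0)
        print(f"{dna_parts}:{reservoir_parts} -> {dna_ul:.2f} ul DNA, "
              f"{reservoir_ul:.2f} ul reservoir, {dna_mm:.2f} mM DNA")
```

The concentrations in the drop change as the drop equilibrates against the reservoir, so these values describe only the starting point.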
The optimized crystallization conditions for all 18-mers contained Sr2+ cations. Crystals of several variants initially grew in conditions with a lower concentration of Sr2+ or even without the cation, but these conditions produced twinned or small needle-like crystals that were not suitable for diffraction measurements. Further optimization of these conditions that included various metal and nonmetal cations only led to improved crystal quality with solutions containing Sr2+ cations. Therefore, we conclude that the interaction of DNA with Sr2+ was important for the formation of acceptably well diffracting crystals. We also did not observe the formation of crystals in other conditions without sodium cacodylate, MPD and spermine.
2.2. Data collection
Diffraction data were collected at the BESSY II synchrotron operated by the Helmholtz-Zentrum Berlin (Mueller et al., 2015) and on a D8 Venture (Bruker) diffractometer at the Center of Molecular Structure, Institute of Biotechnology of the Czech Academy of Sciences. Crystals were flash-cooled in liquid nitrogen and data were collected at 100 K. During data collection for Chom-18-AG, we tried lowering the humidity with an HCLab (Arinax). This procedure did not yield better diffraction images. Due to the presence of sufficient amounts of MPD (∼20%) in the crystallization batch, no additional cryoprotective procedure was necessary. Inspection of the diffraction images did not show radiation damage. Mosaicity values were in the range 0.19–0.57°. Diffraction images were processed in XDS and AIMLESS (Kabsch, 2010; Agirre et al., 2023). Raw diffraction images of the best diffracting crystals have been deposited with Zenodo. Data-collection statistics and Zenodo links are given in Table 1.
2.3. Refinement protocol using the NtC classes
MOLREP (Vagin & Teplyakov, 2010; Agirre et al., 2023) showed that the structure solution for all the 18-mers is almost identical to the structure with PDB code 6ros, with one 18-mer strand in the asymmetric unit (Kolenko et al., 2020), and we therefore proceeded with rigid-body refinement. Refinement was carried out in an NtC-enhanced local fork of phenix.refine version 1.19.2 (Liebschner et al., 2019); the statistics are listed in Table 2. Approximately 5% of all reflections were used as a control (free) set.
Despite the involvement of model rebuilding with Coot (Emsley et al., 2010), structure refinement of Chom-18-AT, Chom-18-CG, Chom-18-GC, Chom-18-GT, Chom-18-TC and especially Chom-18-AG remained unstable. Therefore, we decided to restrain the dinucleotide geometries in these structures to the known geometries of the NtC classes. We used the Chom-18-AC variant as the starting reference model because it has the highest resolution and most of its dinucleotides were assigned to NtC classes.
The partial reference model was built from et al., 2002) and are now available.
The initial model and structure factors were uploaded to the DNATCO web service at https://dnatco.datmos.org. After the coordinate file has been uploaded in mmCIF or PDB format, the user is presented with automatically generated NtC restraints for Phenix and CCP4 (Agirre et al., 2023) as well as commands for MacroMolecule Builder (MMB; Flores & Altman, 2010). The restraints are generated automatically only for model dinucleotides whose root-mean-square deviation (r.m.s.d.) from the atoms of the closest NtC is within 0.5 Å. The limit of 0.5 Å was determined empirically to restrain only those parts of the structure that are close to the known conformations as defined by the NtC classes. While the default restraints perform well in most cases, the DNATCO web service allows finer tuning of the NtC restraint parameters. Users are intuitively guided to choose an alternative NtC based on the provided `similarity' and `connectivity' plots; at the same time, the fit of the newly proposed dinucleotide geometry to the electron density is calculated. Weights of restraint parameters controlling the width (sigma) of the energy function used by the refinement software are assigned automatically. Advanced users can, however, use DNATCO to modify the overall weight or even to assign per-dinucleotide weights, allowing tighter control over the refinement. The automatically generated and optional user-tuned restraint files can be downloaded in the tab under the respective choice of Phenix, REFMAC or MMB software.
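The 0.5 Å criterion described above can be illustrated with a minimal sketch. It assumes that the model dinucleotide and the NtC reference have already been superimposed and that their atoms are listed in the same order; the superposition step and the actual DNATCO implementation are not described here, and the function names are hypothetical.

```python
import math

RMSD_LIMIT = 0.5  # in angstroms; the empirical limit quoted in the text


def rmsd(coords_a, coords_b):
    """R.m.s.d. between two equally ordered, pre-superimposed lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def should_restrain(model_atoms, ntc_atoms, limit=RMSD_LIMIT):
    """True when the dinucleotide is close enough to the NtC reference to be restrained."""
    return rmsd(model_atoms, ntc_atoms) <= limit


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    shifted = [(x + 0.3, y, z) for x, y, z in reference]
    print(should_restrain(shifted, reference))
```

Dinucleotides that fail the test are simply left unrestrained in this sketch, which mirrors the intent of restraining only those parts of the structure that are already close to a known conformation.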
NtC restraint files contain the corresponding combinations of torsional and pseudo-bond parameters for the sugar-phosphate backbone torsions, including torsions in the (deoxy)ribose moieties. Below is an excerpt from the phenix.refine restraint file for Chom-18-AG:
This excerpt modifies one of 22 mostly torsional parameters for the first dinucleotide step in the model. The action keywords ntc_delete and ntc_change are introduced because phenix.refine automatically generates a partial set of torsion restraints that are inconsistent with NtC definitions; these restraints are first removed and NtC-derived values are assigned. The base pairs in positions 8–11 were left unrestrained. The restraint file downloaded from dnatco.datmos.org is then edited to add other parameters such as the number of cycles, strategy etc. The refinement is then run: phenix.refine coordinates.pdb data.mtz dnatco_refine.params, where dnatco_refine.params is the input file. However, this approach currently requires the patched version of phenix.refine available from the DNATCO website, which is available upon request.
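For scripted use, the command quoted above can be assembled and launched from Python. This is a minimal sketch: the wrapper functions are hypothetical, the three file names are those given in the text, and the run is skipped when no phenix.refine executable is found on the search path.

```python
import shutil
import subprocess


def build_refine_command(model="coordinates.pdb", data="data.mtz",
                         params="dnatco_refine.params"):
    """Assemble the phenix.refine call quoted in the text as an argument list."""
    return ["phenix.refine", model, data, params]


def run_refinement(command):
    """Run the command if the executable is available; return its exit code or None."""
    if shutil.which(command[0]) is None:
        return None  # phenix.refine is not installed or not on PATH
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    print(" ".join(build_refine_command()))
```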
The restraint file for REFMAC works with the current version of the software (Murshudov et al., 2011). The software was used to independently check the convergence of the refinement process. Below is part of the REFMAC restraint file used to refine Chom-18-AG using a tighter sigma for the energy function:
In cases where the selected target NtC differs significantly from the initial model (r.m.s.d. of >0.5 Å as discussed above) or when the patched version of phenix.refine is not available, a viable option is to download NtC-related commands for the MMB software instead and create a model, which is then used as a reference model in phenix.refine. In the case where the actual model geometry and the geometry targeted by the restraints are distant, the target values can be omitted from the refinement. Larger rearrangements of the model using only NtC restraints can thus fail and MMB or Coot intervention or the use of the `idealized' model from DNATCO is needed.
In summary, the NtC classes guide the refinement. The supporting information describes the practical steps of the refinement of Chom-18-AG (PDB entry 7z82), the structure with the worst quality diffraction, Chom-18-TC (PDB entry 7z81) and Chom-18-CG (PDB entry 7z7u).
3. Results and discussion
3.1. NtC-driven refinement
NtC classes provided valuable help with building the initial models of six of the nine 18-mers, i.e. Chom-18-AT, Chom-18-CG, Chom-18-GC, Chom-18-GT, Chom-18-TC and Chom-18-AG. The rationale for using the geometries of the NtC classes as restraints is that NtC classes, which are defined by the most probable combinations of 12 sugar-phosphate backbone geometric parameters, represent the most probable dinucleotide structures (Černý, Božíková, Svoboda et al., 2020). In a broader context, NtC classes correspond to dinucleotide local energy minima. It is therefore logical to use them as guides for fitting and refining low-resolution electron densities, where the stress on parametrization of refinement is more consequential.
We illustrate the NtC-supported refinement on Chom-18-AG (PDB entry 7z82). The crystals of Chom-18-AG diffracted to the lowest crystallographic resolution of ∼3 Å. The refinement was initially unstable, producing several negative peaks along the sugar-phosphate backbone in the map and unsatisfactory Rwork and Rfree values of 0.305 and 0.431, respectively. To improve the results of refinement, we generated the NtC restraints, which provided torsional parameters for phenix.refine. This decreased the Rwork and Rfree values to final values of 0.297 and 0.322, respectively (Table 2). In addition to this improvement, we also noticed cleaner electron-density maps with fewer diffraction minima along the sugar-phosphate backbone compared with the starting phases of the refinement cycle. As expected, closeness to the NtC standards increased substantially. The average confal score (the confal score quantifies the agreement between the analyzed dinucleotides and the NtC-defining conformers; for details, see Schneider et al., 2018) for the entire structure increased significantly from 46 to 64, corresponding to a shift from the 47th to the 84th percentile with respect to all nucleic acid structures in the PDB. The number of unassigned (NANT) dinucleotide steps changed from five to four (Table 3 and Supplementary Table S2).
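The effect of the restraints on Chom-18-AG is perhaps easiest to see in the difference between Rfree and Rwork, which is commonly used as an indicator of overfitting. The short sketch below only re-uses the four R values and the two confal scores quoted above; the variable and function names are illustrative.

```python
def r_gap(r_work, r_free):
    """Difference between Rfree and Rwork."""
    return r_free - r_work


# Values for Chom-18-AG quoted in the text: (Rwork, Rfree).
BEFORE_NTC = (0.305, 0.431)
AFTER_NTC = (0.297, 0.322)

gap_before = r_gap(*BEFORE_NTC)
gap_after = r_gap(*AFTER_NTC)
confal_gain = 64 - 46  # average confal score after minus before

if __name__ == "__main__":
    print(f"Rfree - Rwork before NtC restraints: {gap_before:.3f}")
    print(f"Rfree - Rwork after NtC restraints:  {gap_after:.3f}")
    print(f"Increase in average confal score:    {confal_gain}")
```

With these numbers the gap narrows from about 0.126 to about 0.025, while Rwork itself changes by less than 0.01.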
While the NtC-unrestrained Chom-18-CG model had two dinucleotides that were unassigned to NtC classes, all dinucleotides are assigned to NtC classes in the restrained model (Table 3 and Supplementary Table S2). Although the number of unassigned steps remained the same in the Chom-18-TC structure, the average confal score and the average RSCC improved. The r.m.s.d.s between the NtC-restrained and nonrestrained models were 0.45 and 0.57 Å for the Chom-18-TC and Chom-18-CG structures, respectively (Table 3). Additionally, fewer sessions and cycles of refinement and manual rebuilding were necessary compared with refinement unrestrained by NtCs. Application of the NtC geometrical restraints improved the fit to the electron density marginally; the RSCCs of the restrained and nonrestrained models remained approximately the same (Table 3 and Fig. 1). In the case of the other structures, the use of NtCs decreased the Rwork and Rfree values marginally, but the geometrical closeness to the NtC classes increased (Supplementary Table S2).
Figure 1. Comparison of NtC-nonrestrained (a, c) and restrained (b, d) refinement of residue DG4 of Chom-18-CG (a, b) and residue DC18 of Chom-18-TC (c, d). The 2mFo − DFc electron density is contoured in gray at the 1σ level and the mFo − DFc electron density is contoured in green for positive and in red for negative at the 3σ level. Images were drawn with CCP4MG (McNicholas et al., 2011).
To summarize, our first experience with NtC-restrained refinement indicates that it makes the process more robust for lower quality diffraction data and improves the fit to the electron density, and at the same time improves the agreement with the known conformations, as represented here by dinucleotide NtC fragments.
3.2. General features of the 18-mer DNA structures
All 18-mers crystallized as isomorphic tetragonal crystals with one strand of a right-handed antiparallel duplex in the asymmetric unit (Supplementary Table S2). The structures can be characterized overall as deformed A-form duplexes (Fig. 2a). Four of the new structures describe palindromic duplexes potentially with all Watson–Crick pairs: Chom-18-AT (PDB entry 7z7k), Chom-18-CG (PDB entry 7z7u), Chom-18-GC (PDB entry 7z7w) and Chom-18-TA (PDB entry 7z7z). Six 18-mers have sequences with the two central nucleotides forming non-Watson–Crick pairs: Chom-18-AC (PDB entry 7z7l), Chom-18-CC (PDB entry 7z7m), Chom-18-GT (PDB entry 7z7y), Chom-18-TC (PDB entry 7z81), Chom-18-TT (PDB entry 6ros; Kolenko et al., 2020) and Chom-18-AG (PDB entry 7z82). Detailed analysis of base pairing and backbone geometry is provided below.
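The statement that exactly four of the 16 possible 18-mers are palindromic can be checked directly from the sequence 5′-GGTGGGGC-XZ-GCCCCACC-3′. The sketch below enumerates all XZ variants and keeps those that equal their own reverse complement; the function names are illustrative.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
LEFT_ARM = "GGTGGGGC"
RIGHT_ARM = "GCCCCACC"


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def all_18mers():
    """Map each central dinucleotide XZ to the full 18-mer sequence."""
    return {x + z: LEFT_ARM + x + z + RIGHT_ARM for x, z in product("ACGT", repeat=2)}


def palindromic_variants():
    """Central dinucleotides whose 18-mer is self-complementary."""
    return sorted(xz for xz, seq in all_18mers().items() if seq == reverse_complement(seq))


if __name__ == "__main__":
    print(palindromic_variants())
```

Because the two arms are reverse complements of each other, the 18-mer is self-complementary exactly when the central XZ dinucleotide is, which gives AT, CG, GC and TA.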
Figure 2. The architecture and crystal packing of ten analyzed DNA 18-mers. (a) The duplexes have the overall shape of the A-form. The two symmetry-related strands are colored blue and red, the two central nucleotides are depicted in green and the yellow spheres are Sr2+ cations. (b) The crystal packing. Two duplexes whose atoms are closer than 4.0 Å are highlighted in red and blue; all duplexes in gray are further than 4.0 Å from these two duplexes. Images were drawn for Chom-18-AC (PDB entry 7z7l) using ChimeraX (Pettersen et al., 2021).
The crystal packing of all structures is virtually identical. The strand in the asymmetric unit contacts symmetry-related strands; the list of contacts for Chom-18-AC (PDB entry 7z7l) is given in Supplementary Table S3; the smallest number of contacts shorter than 4.0 Å (21) is observed in Chom-18-GT (PDB entry 7z7y) and the largest number (28) is observed in Chom-18-AC (PDB entry 7z7l). The touching duplexes are highlighted in color in Fig. 2(b). The two central variable dinucleotides do not directly participate in crystal packing; the distances of their atoms to the atoms of symmetry-related duplexes are greater than 6.5 Å. As we have already discussed (Kolenko et al., 2020), this packing arrangement is reminiscent of that observed in octamers, for instance d(GGGGCCCC)2 (PDB entry 2ana; McCall et al., 1985), and in d(GCGGGCCCGC)2 decamers (PDB entries 137d and 138d; Ramakrishnan & Sundaralingam, 1993), where two neighboring sugar rings of one strand stack on the first base pair of a symmetry-related duplex. In all three cited cases, the hydrophobic surfaces of the terminal base pairs stack on the sugar ring edges and may form a few direct or water-bridged (PDB entries 136d and 137d) hydrogen bonds. It is notable that similar packing interactions occur for duplexes of different lengths of eight, ten and 18 nucleotides. All of these duplexes crystallized in different space groups.
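Counting packing contacts with a distance cutoff, as used above, can be sketched as follows. This is only an illustration: it counts atom pairs, whereas the exact convention behind the numbers in Supplementary Table S3 is not stated here, and the generation of symmetry-related coordinates is assumed to have been done beforehand.

```python
import math


def count_contacts(atoms_a, atoms_b, cutoff=4.0):
    """Number of atom pairs (one atom from each set) closer than the cutoff in angstroms."""
    return sum(
        1
        for atom_a in atoms_a
        for atom_b in atoms_b
        if math.dist(atom_a, atom_b) < cutoff
    )


if __name__ == "__main__":
    strand = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    symmetry_mate = [(3.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
    print(count_contacts(strand, symmetry_mate))
```

The same function with a cutoff of 6.5 Å can be used to confirm that a chosen group of atoms, such as the central dinucleotide, makes no contacts with symmetry-related duplexes.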
All ten analyzed structures have most of the dinucleotides in A-like conformers. The AA00 class describing the canonical A-form prevails, while the less populated A-like NtC classes (AA08, AA04 and AA01) occur more in the central region (Table 4). Only Chom-18-CG, Chom-18-GC and Chom-18-AC have all dinucleotides assigned (classified as NtC classes AA##); unassigned dinucleotides (NtC class NANT) are mostly localized near the central base pair. Most dinucleotides with deformed backbones and unassigned dinucleotides are observed in Chom-18-YY and especially Chom-18-AG, pointing to a highly deformed backbone.
Despite the overall similarity of the duplexes, the central region with a variable dinucleotide sequence shows a trend depending on the central dinucleotide. When we measure the distances between the C1′ atom of nucleotide 9 and C1′ of its symmetry-related base-paired nucleotide 10 in all 18-mers, the order from the shortest to the longest is TT (7.9 Å) < CC < TC < GT < GC < AC < AT < TA < CG < AG (12.1 Å). This trend follows the size of the pyrimidine–pyrimidine (Y–Y), pyrimidine–purine (Y–R/R–Y) and purine–purine (R–R) pairs regardless of the type of base pair involved. The same pattern is observed for P–P distances across the strand (data not shown). The poor quality of the Chom-18-AG crystals and the unsuccessful crystallization of the three R–R 18-mers with central GG, GA and AA dinucleotides may indicate that the central pairs of these R–R 18-mers are becoming too large to be accommodated in the same helical architecture. The observation that the crystal packing can accommodate relatively small changes in the molecular shape has been made previously on a set of Dickerson–Drew dodecamer structures (Dickerson et al., 1994).
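The trend in the C1′–C1′ distances can be checked against the composition of the central pair with a few lines of code. In this sketch each central dinucleotide XZ is ranked by its number of purines (0 for Y–Y, 1 for Y–R or R–Y, 2 for R–R) and the ranks are confirmed to be non-decreasing along the order of distances reported above; the names are illustrative.

```python
PURINES = {"A", "G"}

# Central dinucleotides ordered from the shortest to the longest C1'-C1' distance (from the text).
DISTANCE_ORDER = ["TT", "CC", "TC", "GT", "GC", "AC", "AT", "TA", "CG", "AG"]


def pair_size_rank(dinucleotide):
    """Number of purines in the central pair: 0 = Y-Y, 1 = Y-R or R-Y, 2 = R-R."""
    return sum(base in PURINES for base in dinucleotide)


ranks = [pair_size_rank(xz) for xz in DISTANCE_ORDER]

if __name__ == "__main__":
    print(ranks)
    print("monotonic:", ranks == sorted(ranks))
```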
All reported structures co-crystallized with the Sr2+ cation located between nucleotides 6 and 7 and (by symmetry) 12 and 13. Sr2+ cations interact with the keto O6 atoms of guanines 6 and 7. A second Sr2+ cation is observed in Chom-18-GT and Chom-18-TT (PDB entries 7z7y and 6ros, respectively). Chom-18-TT also contains a third Sr2+ cation observed at the twofold axis between the central pairs 9 and 10.
As in our previous studies of REP-related oligonucleotides (Charnavets et al., 2015; Kolenko et al., 2020), we investigated the behavior of the DNA in solution by CD spectroscopy. The spectra of all ten analyzed 18-mers show complex sequence-dependent features that are described in the supporting information and Supplementary Fig. S3.
3.3. Validation by correlation between electron density and geometry
The annotation of nucleic acid structures by NtC classes opens a way to a simple yet powerful validation of the structure quality by correlating the geometries of analyzed dinucleotides and their fit to the experimental electron density. For each dinucleotide, we performed the following.
Fig. 3 displays the RSCC–r.m.s.d. scattergrams calculated for dinucleotides of ten analyzed structures (red dots) and, as gray contours, values calculated for a curated ensemble of 497 chains of sequentially nonredundant uncomplexed DNA from crystal structures with crystallographic resolution higher than 2.6 Å selected according to Biedermannová et al. (2022).
Figure 3. Scattergrams of real-space correlation coefficients (RSCCs) and root-mean-square deviations (r.m.s.d.s) of dinucleotides that are (a) assigned and (b) unassigned to NtC classes. The gray contours denote values for 99%, 95%, 50% and 5% of values in the data set of a curated ensemble of 497 chains of sequentially nonredundant and uncomplexed DNA from previously selected crystal structures with crystallographic resolution higher than 2.6 Å (Biedermannová et al., 2022).
In Fig. 3, we show two scattergrams, the first displaying the relationship between RSCC and r.m.s.d. for all dinucleotides assigned to NtC classes and the second displaying the same relationship for unassigned dinucleotides (formally class NANT). The data in the pictures are divided into four rectangles by the vertical line separating dinucleotides whose model and experimental electron densities correlate at 80% and the horizontal line for r.m.s.d. values of 1.0 Å. The data in the rectangles are interpreted as follows.
The difference between the scattergrams for assigned and unassigned dinucleotides is evident. The assigned dinucleotides (Fig. 3a) have a large majority of dinucleotides in rectangle (i) (`good' structures), but a significant fraction of dinucleotides are still `over-refined' in rectangle (ii). The distributions of the template ensemble (gray contours) and the analyzed structures (red dots) are about the same. A large fraction of over-refined dinucleotides can be interpreted as the fitting of geometrically well known fragments into inconclusively shaped electron density.
In contrast, the RSCC–r.m.s.d. scattergram looks different for unassigned dinucleotides (Fig. 3b). The values of the reference ensemble of structures are scattered in all four rectangles, with significant fractions of over-refined (20%), unique (14%) and even poor (6%) dinucleotide geometries. The unassigned dinucleotides from ten analyzed structures are distributed evenly between the good and over-refined rectangles. The distributions of the reference and analyzed dinucleotides are different because the underlying structures are different: while the reference set contains variable structures with potentially uniquely shaped dinucleotides [upper right quadrant (iii)], the dinucleotides in the analyzed structures are all part of conventional double helices that do not depart from conventional A-like conformations close to the NtC classes AA##. In such a case, refinement does not call for a radical departure from the known conformations and converges in the over-refined quadrant (ii).
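The division of the scattergrams into four rectangles can be expressed as a simple classifier. The sketch below uses the two limits given above (RSCC of 0.80 and r.m.s.d. of 1.0 Å); the mapping of the rectangles to the labels `good', `over-refined', `unique' and `poor' is our reading of the discussion in this section, and the treatment of values lying exactly on a dividing line is an assumption.

```python
RSCC_LIMIT = 0.80  # vertical line: correlation of model and experimental density
RMSD_LIMIT = 1.0   # horizontal line: r.m.s.d. to the closest NtC, in angstroms


def classify_dinucleotide(rscc, rmsd):
    """Assign a dinucleotide to one of the four rectangles of the RSCC-r.m.s.d. scattergram."""
    well_fitted = rscc >= RSCC_LIMIT    # assumed: the boundary counts as well fitted
    close_to_ntc = rmsd <= RMSD_LIMIT   # assumed: the boundary counts as close
    if well_fitted and close_to_ntc:
        return "good"
    if close_to_ntc:
        return "over-refined"  # known geometry placed into inconclusive density
    if well_fitted:
        return "unique"        # supported by density but far from known conformers
    return "poor"


if __name__ == "__main__":
    for rscc, rmsd in ((0.95, 0.3), (0.60, 0.3), (0.90, 1.5), (0.50, 1.8)):
        print(rscc, rmsd, classify_dinucleotide(rscc, rmsd))
```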
3.4. Base pairing
All central base pairs in the ten analyzed structures form base pairs by Watson–Crick edges (Leontis & Westhof, 2001). Fig. 4 summarizes the assignments of these base pairs in the Saenger notation (Saenger, 1984) as archived by the PDB in mmCIF files as _ndb_struct_na_base_pair.hbond_type_28.
Figure 4. Topologies of the central base pairs (residue 9 and symmetry-related residue 10) in ten analyzed structures. The numbers in the insets indicate the Saenger base-pair notation (Saenger, 1984).
3.4.1. 18-mers with all nucleotides able to form Watson–Crick pairs
All four Chom-18-mers with two central nucleotides (residues 9 and 10) able to form Watson–Crick pairs were crystallized. Base pairs A–T and T–A are classified as Watson–Crick pairs, while the C–G pair adopts a specific orientation characterized by a large value of one of the base parameters, shear (2.97 Å), and is not classified. The topology of the G–C pair is compatible with the Watson–Crick pair, but it was not classified as a pair because its atoms do not comply with the hydrogen-bond geometry.
3.4.2. 18-mers with two central nucleotides not able to form canonical base pairs
Of the four YY 18-mers, only Chom-18-CT could not be crystallized. Both Chom-18-TC and Chom-18-CC have high propeller twist; its extreme value in Chom-18-CC precludes assignment of the base-pair category. The geometry of the base pair in Chom-18-TT is different due to the interaction of the thymine O4 major-groove O atoms with the Sr2+ cation.
Two of the four YR and RY variants unable to form Watson–Crick pairs were crystallized, Chom-18-AC and Chom-18-GT, but their base-pairing topology was not assigned.
Finally, only one of the four RR variants, Chom-18-AG, was crystallized. The A–G pair is strongly nonplanar; despite this, the pair is classified. Two successive voluminous A–G base `pairs' were observed in a decamer (PDB entry 1d8x; Gao et al., 1999). In analogy to Chom-18-AG, the bases of PDB entry 1d8x are moved from their common plane; this effect is called `sheared bases' in the original paper.
Structures of Chom-18-mers with central dinucleotides that are capable and incapable of forming Watson–Crick pairs are not distinguishable by any single geometrically interpretable feature of the backbone such as NtC class (Table 4 and Supplementary Table S4) or base parameters (supporting information and Supplementary Fig. S4); their backbone geometries are locked in the A-form duplex (Fig. 2).
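The crystallization outcome discussed in this section can be tallied by the type of the central pair. The sketch below groups the 16 central dinucleotides into those able to form Watson–Crick pairs and the Y–Y, mixed (Y–R or R–Y) and R–R mismatches, and counts how many in each group are among the ten crystallized variants named in the text; the function and group names are illustrative.

```python
from itertools import product

WATSON_CRICK = {"AT", "TA", "CG", "GC"}
PURINES = {"A", "G"}
# The nine new structures plus the previously reported Chom-18-TT (PDB entry 6ros).
CRYSTALLIZED = {"AT", "AC", "AG", "CC", "CG", "GC", "GT", "TA", "TC", "TT"}


def pair_group(dinucleotide):
    """Group of the central pair formed by residue 9 and symmetry-related residue 10."""
    if dinucleotide in WATSON_CRICK:
        return "WC"
    purine_count = sum(base in PURINES for base in dinucleotide)
    return {0: "YY", 1: "YR/RY", 2: "RR"}[purine_count]


def crystallization_tally():
    """Map each group to (number of variants, number of crystallized variants)."""
    tally = {}
    for x, z in product("ACGT", repeat=2):
        group = pair_group(x + z)
        total, crystallized = tally.get(group, (0, 0))
        tally[group] = (total + 1, crystallized + int(x + z in CRYSTALLIZED))
    return tally


if __name__ == "__main__":
    for group, (total, crystallized) in crystallization_tally().items():
        print(f"{group}: {crystallized} of {total} crystallized")
```

The tally reproduces the counts given above: all four Watson–Crick variants, three of four Y–Y, two of four mixed and one of four R–R variants, ten in total.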
4. Conclusions
Of the 16 permutations of the central dinucleotide in the 18-mer 5′-GGTGGGGC-XZ-GCCCCACC-3′, we crystallized ten. The possible XZ combinations are indicated in Fig. 4. Nine structures are reported here (Tables 1 and 2) and we analyze them together with our previously reported structure with PDB code 6ros (Kolenko et al., 2020). All crystallized as isomorphic A-form duplexes (Fig. 2) despite their circular-dichroism spectra showing complex structural behavior, which is likely to be caused by conformational heterogeneity in solution.
The diffraction data for the analyzed structures were of limited resolution between 2.5 and 3.0 Å and the use of the NtC restraints improved the convergence of the refinement, improved the fit to the electron density and decreased the Rfree values. The restraints are automatically generated by the dnatco.datmos.org web service and are available for download. The refinement protocol benefited significantly from the recurrent use of geometries of the NtC classes as restraints because it stabilized the final models especially in regions of diffuse electron density. The proposed protocol is general and can be applied to other crystal structures. Its applicability to cryo-EM data of nucleic acid structures needs to be tested.
The structures of Chom-18-mers with a central dinucleotide capable and incapable of forming Watson–Crick pairs are not distinguishable by any single geometrically interpretable feature. The local geometric distortions from the A-form as described by the NtC classes are not reflected immediately at the central mismatched nucleotides (Supplementary Fig. S4).
To validate structural qualities, we employed our previously developed analysis using two-dimensional scattergrams of RSCC and r.m.s.d. (Černý, Božíková, Malý et al., 2020; Fig. 3). The scattergrams provide an easy visual indication of potentially incorrectly refined structural fragments and thus help in quick validation regardless of the size and complexity of the structure.
5. Data availability
The presented data are available from the Protein Data Bank as PDB entries 7z7l (Chom-18-AC), 7z82 (Chom-18-AG), 7z7k (Chom-18-AT), 7z7m (Chom-18-CC), 7z7u (Chom-18-CG), 7z7w (Chom-18-GC), 7z7y (Chom-18-GT), 7z7z (Chom-18-TA) and 7z81 (Chom-18-TC). Diffraction images have been deposited with the Zenodo server (see Table 1).
6. Related literature
The following references are cited in the supporting information for this article: Hoogsteen (1963), Jaumot et al. (2002), Kim et al. (1993), Li et al. (2019), Neidle (2008), Nikolova et al. (2011), Skelly et al. (1993), del Villar-Guerra et al. (2018) and Vorlíčková et al. (2012).
Supporting information
Link https://10.5281/zenodo.6333817
Raw diffraction images for PDB entry 7z7k.
Link https://10.5281/zenodo.6336683
Raw diffraction images for PDB entry 7z7l.
Link https://10.5281/zenodo.6336722
Raw diffraction images for PDB entry 7z7m.
Link https://10.5281/zenodo.6336839
Raw diffraction images for PDB entry 7z7u.
Link https://10.5281/zenodo.6337128
Raw diffraction images for PDB entry 7z7w.
Link https://10.5281/zenodo.6597387
Raw diffraction images for PDB entry 7z7y.
Link https://10.5281/zenodo.6597824
Raw diffraction images for PDB entry 7z7z.
Link https://10.5281/zenodo.6336683
Raw diffraction images for PDB entry 7z81.
Link https://10.5281/zenodo.6336707
Raw diffraction images for PDB entry 7z82.
Supplementary text and Supplementary Tables and Figures. DOI: https://doi.org/10.1107/S2059798323004679/rr5230sup1.pdf
Diffraction data were collected on BL14.2 at the BESSY II electron storage ring operated by the Helmholtz-Zentrum Berlin (Mueller et al., 2015). We would particularly like to acknowledge the help and support of Thomas Hauss during the experiment. We acknowledge CIISB, Instruct-CZ center (CF Biophysics, CF Cryst, CF Diff) supported by MEYS Czech Republic (LM2018127) and European Regional Development Fund-Project `UP CIISB' (CZ.02.1.01/0.0/0.0/18_046/0015974).
Funding information
This research was funded by project INTER-ACTION (LTAUSA18197) from MEYS Czech Republic, by an institutional grant to the Czech Academy of Sciences (grant RVO 86652036) and by project CAAS CZ.02.1.01/0.0/0.0/16_019/0000778 from MEYS Czech Republic.
Afonine, P. V., Poon, B. K., Read, R. J., Sobolev, O. V., Terwilliger, T. C., Urzhumtsev, A. & Adams, P. D. (2018). Acta Cryst. D74, 531–544. Web of Science CrossRef IUCr Journals Google Scholar
Agirre, J., Atanasova, M., Bagdonas, H., Ballard, C. B., Baslé, A., Beilsten-Edmands, J., Borges, R. J., Brown, D. G., Burgos-Mármol, J. J., Berrisford, J. M., Bond, P. S., Caballero, I., Catapano, L., Chojnowski, G., Cook, A. G., Cowtan, K. D., Croll, T. I., Debreczeni, J. É., Devenish, N. E., Dodson, E. J., Drevon, T. R., Emsley, P., Evans, G., Evans, P. R., Fando, M., Foadi, J., Fuentes-Montero, L., Garman, E. F., Gerstel, M., Gildea, R. J., Hatti, K., Hekkelman, M. L., Heuser, P., Hoh, S. W., Hough, M. A., Jenkins, H. T., Jiménez, E., Joosten, R. P., Keegan, R. M., Keep, N., Krissinel, E. B., Kolenko, P., Kovalevskiy, O., Lamzin, V. S., Lawson, D. M., Lebedev, A. A., Leslie, A. G. W., Lohkamp, B., Long, F., Malý, M., McCoy, A. J., McNicholas, S. J., Medina, A., Millán, C., Murray, J. W., Murshudov, G. N., Nicholls, R. A., Noble, M. E. M., Oeffner, R., Pannu, N. S., Parkhurst, J. M., Pearce, N., Pereira, J., Perrakis, A., Powell, H. R., Read, R. J., Rigden, D. J., Rochira, W., Sammito, M., Sánchez Rodríguez, F., Sheldrick, G. M., Shelley, K. L., Simkovic, F., Simpkin, A. J., Skubak, P., Sobolev, E., Steiner, R. A., Stevenson, K., Tews, I., Thomas, J. M. H., Thorn, A., Valls, J. T., Uski, V., Usón, I., Vagin, A., Velankar, S., Vollmar, M., Walden, H., Waterman, D., Wilson, K. S., Winn, M. D., Winter, G., Wojdyr, M. & Yamashita, K. (2023). Acta Cryst. D79, 449–461. CrossRef IUCr Journals Google Scholar
Berman, H. M., Battistuz, T., Bhat, T. N., Bluhm, W. F., Bourne, P. E., Burkhardt, K., Feng, Z., Gilliland, G. L., Iype, L., Jain, S., Fagan, P., Marvin, J., Padilla, D., Ravichandran, V., Schneider, B., Thanki, N., Weissig, H., Westbrook, J. D. & Zardecki, C. (2002). Acta Cryst. D58, 899–907.
Bertels, F. & Rainey, P. B. (2011). Mob. Genet. Elements, 1, 262–301.
Biedermannová, L., Černý, J., Malý, M., Nekardová, M. & Schneider, B. (2022). Acta Cryst. D78, 1032–1045.
Černý, J., Božíková, P., Malý, M., Tykač, M., Biedermannová, L. & Schneider, B. (2020). Acta Cryst. D76, 805–813.
Černý, J., Božíková, P., Svoboda, J. & Schneider, B. (2020). Nucleic Acids Res. 48, 6367–6381.
Charnavets, T., Nunvar, J., Nečasová, I., Völker, J., Breslauer, K. J. & Schneider, B. (2015). Biopolymers, 103, 585–596.
Dickerson, R. E., Goodsell, D. S. & Neidle, S. (1994). Proc. Natl Acad. Sci. USA, 91, 3579–3583.
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486–501.
Flores, S. C. & Altman, R. B. (2010). RNA, 16, 1769–1778.
Gao, Y.-G., Robinson, H., Sanishvili, R., Joachimiak, A. & Wang, A. H.-J. (1999). Biochemistry, 38, 16452–16460.
Hoogsteen, K. (1963). Acta Cryst. 16, 907–916.
Iyer, R. R., Pluciennik, A., Burdett, V. & Modrich, P. L. (2006). Chem. Rev. 106, 302–323.
Jaumot, J., Escaja, N., Gargallo, R., González, C., Pedroso, E. & Tauler, R. (2002). Nucleic Acids Res. 30, e92.
Kabsch, W. (2010). Acta Cryst. D66, 125–132.
Kim, J. L., Nikolov, D. B. & Burley, S. K. (1993). Nature, 365, 520–527.
Kolenko, P., Svoboda, J., Černý, J., Charnavets, T. & Schneider, B. (2020). Acta Cryst. D76, 1233–1243.
Kumar, K. S. D., Gurusaran, M., Satheesh, S. N., Radha, P., Pavithra, S., Thulaa Tharshan, K. P. S., Helliwell, J. R. & Sekar, K. (2015). J. Appl. Cryst. 48, 939–942.
Kunz, C., Saito, Y. & Schär, P. (2009). Cell. Mol. Life Sci. 66, 1021–1038.
Leontis, N. B. & Westhof, E. (2001). RNA, 7, 499–512.
Li, S., Olson, W. K. & Lu, X.-J. (2019). Nucleic Acids Res. 47, W26–W34.
Liebschner, D., Afonine, P. V., Baker, M. L., Bunkóczi, G., Chen, V. B., Croll, T. I., Hintze, B., Hung, L.-W., Jain, S., McCoy, A. J., Moriarty, N. W., Oeffner, R. D., Poon, B. K., Prisant, M. G., Read, R. J., Richardson, J. S., Richardson, D. C., Sammito, M. D., Sobolev, O. V., Stockwell, D. H., Terwilliger, T. C., Urzhumtsev, A. G., Videau, L. L., Williams, C. J. & Adams, P. D. (2019). Acta Cryst. D75, 861–877.
McCall, M., Brown, T. & Kennard, O. (1985). J. Mol. Biol. 183, 385–396.
McNicholas, S., Potterton, E., Wilson, K. S. & Noble, M. E. M. (2011). Acta Cryst. D67, 386–394.
Mueller, U., Förster, R., Hellmig, M., Huschmann, F. U., Kastner, A., Malecki, P., Pühringer, S., Röwer, M., Sparta, K., Steffien, M., Ühlein, M., Wilk, P. & Weiss, M. S. (2015). Eur. Phys. J. Plus, 130, 141.
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
Neidle, S. (2008). Principles of Nucleic Acid Structure. London: Academic Press.
Nikolova, E. N., Kim, E., Wise, A. A., O'Brien, P. J., Andricioaei, I. & Al-Hashimi, H. M. (2011). Nature, 470, 498–502.
Pettersen, E. F., Goddard, T. D., Huang, C. C., Meng, E. C., Couch, G. S., Croll, T. I., Morris, J. H. & Ferrin, T. E. (2021). Protein Sci. 30, 70–82.
Ramakrishnan, B. & Sundaralingam, M. (1993). Biochemistry, 32, 11458–11468.
Saenger, W. (1984). Principles of Nucleic Acid Structure. New York: Springer-Verlag.
Schneider, B., Božíková, P., Nečasová, I., Čech, P., Svozil, D. & Černý, J. (2018). Acta Cryst. D74, 52–64.
Skelly, J. V., Edwards, K. J., Jenkins, T. C. & Neidle, S. (1993). Proc. Natl Acad. Sci. USA, 90, 804–808.
Vagin, A. & Teplyakov, A. (2010). Acta Cryst. D66, 22–25.
Villar-Guerra, R. del, Trent, J. O. & Chaires, J. B. (2018). Angew. Chem. Int. Ed. 57, 7171–7175.
Vorlíčková, M., Kejnovská, I., Bednářová, K., Renčiuk, D. & Kypr, J. (2012). Chirality, 24, 691–698.
Westhof, E. (2014). FEBS Lett. 588, 2464–2469.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.