research papers

STRUCTURAL BIOLOGY
ISSN: 2059-7983

A structural role for tryptophan in proteins, and the ubiquitous Trp Cδ1—H⋯O=C (backbone) hydrogen bond

aDepartment of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22908-0736, USA, and bDepartment of Chemistry and Biochemistry, Utah State University, Logan, Utah, USA
*Correspondence e-mail: zsd4n@virginia.edu

Edited by E. Chrysina, National Hellenic Research Foundation, Greece (Received 12 March 2024; accepted 9 June 2024; online 28 June 2024)

Tryptophan is the most prominent amino acid found in proteins, with multiple functional roles. Its side chain is made up of the hydrophobic indole moiety, with two groups that act as donors in hydrogen bonds: the Nɛ1—H group, which is a potent donor in canonical hydrogen bonds, and a polarized Cδ1—H group, which is capable of forming weaker, noncanonical hydrogen bonds. C—H⋯O hydrogen bonds are ubiquitous in macromolecules, albeit contingent on the polarization of the donor C—H group by adjacent electron-withdrawing moieties. Consequently, Cα—H groups (adjacent to the carbonyl and amino groups of flanking peptide bonds), as well as the Cɛ1—H and Cδ2—H groups of histidines (adjacent to imidazole N atoms), are known to serve as donors in hydrogen bonds, for example stabilizing parallel and antiparallel β-sheets. However, the nature and the functional role of interactions involving the Cδ1—H group of the indole ring of tryptophan are not well characterized. Here, data mining of high-resolution (r ≤ 1.5 Å) crystal structures from the Protein Data Bank was performed and ubiquitous close contacts between the Cδ1—H groups of tryptophan and a range of electronegative acceptors were identified, specifically main-chain carbonyl O atoms immediately upstream and downstream in the polypeptide chain. The stereochemical analysis shows that most of the interactions bear all of the hallmarks of proper hydrogen bonds. At the same time, their cohesive nature is confirmed by quantum-chemical calculations, which reveal interaction energies of 1.5–3.0 kcal mol−1, depending on the specific stereochemistry.

1. Introduction

Tryptophan (Trp) is the largest amino acid, with important functional roles in proteins. It is often found at protein–protein interfaces, such as antibody–antigen interfaces, accounting for tight interactions and specificity (Samanta & Chakrabarti, 2001[Samanta, U. & Chakrabarti, P. (2001). Protein Eng. Des. Sel. 14, 7-15.]), and is ubiquitous in the ligand/substrate-binding sites of, for example, lectins and various enzymes (Zhang et al., 2004[Zhang, Y., Deshpande, A., Xie, Z., Natesh, R., Acharya, K. R. & Brew, K. (2004). Glycobiology, 14, 1295-1302.]; Spier & Lummis, 2000[Spier, A. D. & Lummis, S. C. R. (2000). J. Biol. Chem. 275, 5620-5625.]). It is also enriched on the surface of membrane proteins embedded in the lipid membrane, where its hydrophobic indole moiety interacts intimately with the lipid phase (Khemaissa et al., 2021[Khemaissa, S., Sagan, S. & Walrant, A. (2021). Crystals, 11, 1032.]). The multiple functions of Trp are contingent on the conformation that it adopts in the active site or at the interface. Consequently, understanding the nature of the forces stabilizing the discrete conformations of this amino acid is essential in structural biology and drug discovery.

The structure of Trp is defined by four dihedral angles (Fig. 1[link]): the backbone Ramachandran φ and ψ angles and the two side-chain dihedral angles χ1 and χ2. The first, χ1, is a rotameric angle with minimum energies at −60° (g−, or m), +60° (g+, or p) and 180° (trans, or t). In contrast, χ2, which involves the sp2 γ-carbon, should in theory only assume values of −90° or +90°. However, it was noted early on that a significant cohort of Trp residues in proteins exhibit an unfavourable m0 conformation (Lovell et al., 2000[Lovell, S. C., Word, J. M., Richardson, J. S. & Richardson, D. C. (2000). Proteins, 40, 389-408.]; we follow the notation introduced by Lovell and coworkers here, where the letter m, p or t is followed by the value of the χ2 angle). Recent results reaffirm that m0 constitutes ∼10% of the Trp conformers in proteins, while m95 and t-105 dominate the conformational space, with a combined frequency of 65.7% (Hameduh et al., 2023[Hameduh, T., Mokry, M., Miller, A. D., Heger, Z. & Haddad, Y. (2023). J. Chem. Inf. Model. 63, 4405-4422.]). The question that arises is which noncovalent interactions are responsible for stabilizing the conformations of Trp, especially the noncanonical conformations. In the first attempt to address this question, Petrella & Karplus (2004[Petrella, R. J. & Karplus, M. (2004). Proteins, 54, 716-726.]) studied 25 protein crystal structures determined at a resolution of 2.0 Å or higher. Based on observed stereochemistry and molecular-dynamics calculations, they concluded that C—H⋯O hydrogen bonds, including those with TrpCδ1—H as a donor, were involved in stabilizing the m0 conformation. In contrast, a subsequent more comprehensive study of nonredundant protein crystal structures determined to better than 2.5 Å resolution concluded that the Cδ1—H group does not appear to impact the local stereochemistry, perhaps due to a low energy of the interactions (Nanda & Schmiedekamp, 2008[Nanda, V. & Schmiedekamp, A. (2008). Proteins, 70, 489-497.]).

[Figure 1]
Figure 1
The four conformational dihedral angles defining the structure of a tryptophan residue within a polypeptide.
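
For readers who want to reproduce the angles in Fig. 1 from atomic coordinates, the following is a minimal sketch of a signed dihedral-angle calculation. It assumes the standard atom quadruples (χ1: N, Cα, Cβ, Cγ; χ2: Cα, Cβ, Cγ, Cδ1); the function name `dihedral` and the tuple-based coordinates are illustrative choices, not part of the original analysis.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees, in the range (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    s0 = _dot(b0, b1)
    s2 = _dot(b2, b1)
    v = (b0[0] - s0 * b1[0], b0[1] - s0 * b1[1], b0[2] - s0 * b1[2])
    w = (b2[0] - s2 * b1[0], b2[1] - s2 * b1[1], b2[2] - s2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

With coordinates for N, Cα, Cβ and Cγ of a Trp residue, `dihedral(n, ca, cb, cg)` would give χ1; shifting the quadruple by one atom along the side chain gives χ2.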

The existence of hydrogen bonds in which a polarized C—H group can serve as a donor was initially invoked in 1937 to explain the physical properties of mixtures of chloroform with acetone (Glasstone, 1937[Glasstone, S. (1937). Trans. Faraday Soc. 33, 200-207.]). Subsequently, such hydrogen bonds have been independently postulated based on the stereochemistry of selected intermolecular interactions observed in the crystal structures of organic compounds (Sutor, 1962[Sutor, D. J. (1962). Nature, 195, 68-69.], 1963[Sutor, D. J. (1963). J. Chem. Soc. 1963, 1105-1110.]; Taylor & Kennard, 1982[Taylor, R. & Kennard, O. (1982). J. Am. Chem. Soc. 104, 5063-5070.]). More recent spectroscopic (for example infrared and NMR) and computational studies provided detailed insights into the nature of this class of interactions (Hobza & Havlas, 2000[Hobza, P. & Havlas, Z. (2000). Chem. Rev. 100, 4253-4264.]; Joseph & Jemmis, 2007[Joseph, J. & Jemmis, E. D. (2007). J. Am. Chem. Soc. 129, 4620-4632.]; Majerz & Olovsson, 2012[Majerz, I. & Olovsson, I. (2012). RSC Adv. 2, 2545-2552.]; Driver et al., 2016[Driver, R. W., Claridge, T. D. W., Scheiner, S. & Smith, M. D. (2016). Chem. A Eur. J. 22, 16513-16521.]; Shi & Min, 2023[Shi, L. X. & Min, W. (2023). J. Phys. Chem. B, 127, 3798-3805.]; Gilli et al., 1994[Gilli, P., Bertolasi, V., Ferretti, V. & Gilli, G. (1994). J. Am. Chem. Soc. 116, 909-915.]; Gilli & Gilli, 2000[Gilli, G. & Gilli, P. (2000). J. Mol. Struct. 552, 1-15.]; Isaacs et al., 1999[Isaacs, E. D., Shukla, A., Platzman, P. M., Hamann, D. R., Barbiellini, B. & Tulk, C. A. (1999). Phys. Rev. Lett. 82, 600-603.], 2000[Isaacs, E. D., Shukla, A., Platzman, P. M., Hamann, D. R., Barbiellini, B. & Tulk, C. A. (2000). J. Phys. Chem. Solids, 61, 403-406.]; Derewenda, 2023[Derewenda, Z. S. (2023). Int. J. Mol. Sci. 24, 13165.]). As a result, the current definition of a hydrogen bond endorsed by IUPAC includes C—H groups as donors (Arunan et al., 2011[Arunan, E., Desiraju, G. R., Klein, R. A., Sadlej, J., Scheiner, S., Alkorta, I., Clary, D. C., Crabtree, R. H., Dannenberg, J. J., Hobza, P., Kjaergaard, H. G., Legon, A. C., Mennucci, B. & Nesbitt, D. J. (2011). Pure Appl. Chem. 83, 1637-1641.]).

Although generally regarded as being significantly weaker than canonical hydrogen bonds, the interaction energy of C—H⋯O bonds is enhanced if the C—H group is polarized by an adjacent electron-withdrawing moiety, such as nitrogen in heterocyclic compounds next to a methine group, i.e. =CH—. Biological macromolecules, i.e. proteins and nucleic acids, contain a number of such groups capable of forming C—H⋯O bonds. The occurrence and significance of these interactions have been the subject of several comprehensive reviews (Scheiner, 2006a[Scheiner, S. (2006a). Hydrogen Bonding: New Insights, edited by S. J. Grabowski, pp. 263-292. Dordrecht: Springer.]; Gu et al., 1999[Gu, Y. L., Kar, T. & Scheiner, S. (1999). J. Am. Chem. Soc. 121, 9411-9422.]; Horowitz & Trievel, 2012[Horowitz, S. & Trievel, R. C. (2012). J. Biol. Chem. 287, 41576-41582.]; Derewenda, 2023[Derewenda, Z. S. (2023). Int. J. Mol. Sci. 24, 13165.]). In DNA and RNA, methine groups in nitrogen bases are involved in base-pairing and base–pentose interactions (Beiranvand et al., 2021[Beiranvand, N., Freindorf, M. & Kraka, E. (2021). Molecules, 26, 2268.]; Yurenko et al., 2011[Yurenko, Y. P., Zhurakivsky, R. O., Samijlenko, S. P. & Hovorun, D. M. (2011). J. Biomol. Struct. Dyn. 29, 51-65.]; Balaceanu et al., 2017[Balaceanu, A., Pasi, M., Dans, P. D., Hospital, A., Lavery, R. & Orozco, M. (2017). J. Phys. Chem. Lett. 8, 21-28.]). In proteins, the side chain of histidine contains highly polarized Cɛ1—H and Cδ2—H groups (particularly in the protonated, i.e. imidazolium, state) which are often involved in hydrogen bonds (Steinert et al., 2022[Steinert, R. M., Kasireddy, C., Heikes, M. E. & Mitchell-Koch, K. R. (2022). Phys. Chem. Chem. Phys. 24, 19233-19251.]), including functionally important groups in the active sites of enzymes such as serine hydrolases (Derewenda et al., 1994[Derewenda, Z. S., Derewenda, U. & Kobos, P. M. (1994). J. Mol. Biol. 241, 83-93.]). 
The main-chain Cα—H group is another example of a polarized bond, despite the sp3 hybridization of carbon, owing to the adjacent electron-withdrawing peptide linkages. These groups are directly involved in stabilizing the β secondary structure via Cα—H⋯O=C interstrand bonds in both parallel and antiparallel sheets (Derewenda et al., 1995[Derewenda, Z. S., Lee, L. & Derewenda, U. (1995). J. Mol. Biol. 252, 248-262.]; Scheiner, 2005[Scheiner, S. (2005). J. Phys. Chem. B, 109, 16132-16141.], 2006b[Scheiner, S. (2006b). J. Phys. Chem. B, 110, 18670-18679.], 2010[Scheiner, S. (2010). Curr. Org. Chem. 14, 106-128.]).

Tryptophan contains a polarized methine group within the indole moiety. The Nɛ1 atom polarizes the adjacent Cδ1—H bond, making it suitable to serve as a hydrogen-bond donor. Ab initio calculations showed the energy of such a hydrogen bond to a water molecule to be −2.1 kcal mol−1, with a C⋯O distance of 3.35 Å (Scheiner et al., 2002[Scheiner, S., Kar, T. & Pattanayak, J. (2002). J. Am. Chem. Soc. 124, 13257-13264.]). Given the dramatic increase in the number of protein structures determined at high resolution, particularly during the Structural Genomics Initiative (Standley et al., 2022[Standley, D. M., Nakanishi, T., Xu, Z., Haruna, S., Li, S., Nazlica, S. A. & Katoh, K. (2022). Biophys. Rev. 14, 1247-1253.]), we decided to revisit the question of the role of the TrpCδ1—H group in protein structures and its possible role in Trp side-chain stereochemistry. Using a subset of nonredundant protein structures from the PDB, with a conservative resolution cutoff of 1.5 Å, we discovered that Cδ1—H groups have a high propensity to interact with main-chain carbonyl O atoms, specifically with those located nearby in the polypeptide chain. Our stereochemical analysis is consistent with the notion that these interactions have all of the properties of hydrogen bonds, and quantum-mechanical calculations of interaction energies corroborate this conclusion. The presence of hydrogen bonds involving the TrpCδ1—H group correlates with hitherto uncharacterized discrete structural motifs, with important implications for protein structure and function.

2. Methods

2.1. Data mining in the Protein Data Bank and stereochemical analysis

A subset of crystal structures determined to a resolution of 1.5 Å or better was extracted from the Protein Data Bank. Redundancy was reduced by using a maximum 95% amino-acid identity cutoff. This resulted in a database of 7911 structures. The vast majority did not contain H atoms; those that did had various C—H distances depending on the refinement program used. Notably, Phenix uses 0.93 Å, which is significantly shorter than the actual value of the Cδ1—H distance in indole/tryptophan. It is well established from spectroscopy that the C—H distance shortens in the ethane/ethene/ethyne series, from 1.099 to 1.091 and 1.070 Å, respectively, although the difference may not stem from hybridization but from the coordination number of carbon (Vermeeren et al., 2021[Vermeeren, P., Wolters, L. P., Paragi, G. & Fonseca Guerra, C. (2021). ChemPlusChem, 86, 812-819.]). Inspection of the crystal structures of multiple Trp derivatives in the Cambridge Structural Database shows a variation from 0.93 to 1.13 Å, a range of ∼20% (data not shown). The most accurate measurements of the C—H bonds in crystals are from neutron diffraction. They show that sp3 and sp2 C—H bonds shorten to 1.092 and 1.081 Å, respectively (Lu et al., 2021[Lu, N., Elakkat, V., Thrasher, J. S., Wang, X. P., Tessema, E., Chan, K. L., Wei, R. J., Trabelsi, T. & Francisco, J. S. (2021). J. Am. Chem. Soc. 143, 5550-5557.]). As PyMOL adds riding H atoms to Cδ1 of Trp at 1.09 Å, we used its algorithm to add them to all investigated structures, thus replacing the existing atoms.
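
The idea of a riding H atom at a fixed 1.09 Å from Cδ1 can be illustrated with a purely geometric sketch. This is not PyMOL's algorithm, which the text does not describe; it simply assumes that the H atom of an sp2 carbon lies in the plane of the carbon and its two bonded neighbours (Cγ and Nɛ1 for Trp Cδ1), along the direction pointing away from both. The function name and argument names are hypothetical.

```python
import math

def place_sp2_hydrogen(carbon, neighbour_a, neighbour_b, bond_length=1.09):
    """Place an H atom on an sp2 carbon that has two bonded heavy-atom neighbours.

    Assumption of this sketch: the C-H bond bisects the exterior angle,
    i.e. it points along the sum of the unit vectors running from each
    neighbour towards the carbon.
    """
    def unit(v):
        n = math.sqrt(sum(x * x for x in v))
        return tuple(x / n for x in v)

    away_a = unit(tuple(c - a for c, a in zip(carbon, neighbour_a)))
    away_b = unit(tuple(c - b for c, b in zip(carbon, neighbour_b)))
    direction = unit(tuple(p + q for p, q in zip(away_a, away_b)))
    return tuple(c + bond_length * d for c, d in zip(carbon, direction))
```

Replacing any deposited H atoms with positions generated in this uniform way removes the dependence of the measured H⋯O distances on the C—H distance used by a particular refinement program.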

This database was searched for any contacts between the H atoms of the Cδ1—H groups and O atoms, with a maximum dHO distance of 2.86 Å (the sum of the van der Waals radii) and a minimum αH of 110° (recommended as a minimum hydrogen-bond angle by IUPAC). In this study, we relied on a set of van der Waals radii that differ from those introduced by Bondi (1964[Bondi, A. (1964). J. Phys. Chem. 68, 441-451.]), which are still routinely used. A recent reassessment of the atomic values of van der Waals radii (Chernyshov et al., 2020[Chernyshov, I. Y., Ananyev, I. V. & Pidko, E. A. (2020). ChemPhysChem, 21, 359.]) noted that Bondi's values consistently underestimate the position of the energy minima by 0.3–0.4 Å. Using a new concept of line-of-sight and also taking chemical context into account, Chernyshov et al. (2020[Chernyshov, I. Y., Ananyev, I. V. & Pidko, E. A. (2020). ChemPhysChem, 21, 359.]) provided a revised set of values. They suggest values of 1.21 Å for hydrogen in the context of C—H⋯X contacts (where X is not hydrogen) and 1.65 Å for an sp2 oxygen in a neutral carbonyl group. The sum, 2.86 Å, is the value we use rather than 2.72 Å, which would reflect Bondi's values. Similarly, we note that the new sum of van der Waals radii for sp2 carbon and sp2 carbonyl oxygen is 3.56 Å rather than 3.22 Å, as previously inferred from Bondi's values. Importantly, the estimate of 3.56 Å is more in line with the observed C⋯O distances in C—H⋯O bonds, established theoretically as 3.35 Å for TrpCδ1—H⋯water (Scheiner et al., 2002[Scheiner, S., Kar, T. & Pattanayak, J. (2002). J. Am. Chem. Soc. 124, 13257-13264.]) and experimentally as 3.34 Å between methine in theophylline and oxygen in formaldehyde (Southern & Bryce, 2022[Southern, S. A. & Bryce, D. L. (2022). Solid State Nucl. Magn. Reson. 119, 101795.]).
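
The two search criteria can be written as a short geometric test. The sketch below is illustrative only: the cutoffs (2.86 Å and 110°) and the radii (1.21 Å for H, 1.65 Å for carbonyl O) are taken from the text, while the function names and the tuple-based coordinate format are assumptions. It requires Python 3.8 or later for `math.dist` and multi-argument `math.hypot`.

```python
import math

R_H = 1.21               # Å, H in a C-H...X contact (Chernyshov et al., 2020)
R_O = 1.65               # Å, sp2 O of a neutral carbonyl group
MAX_D_HO = R_H + R_O     # 2.86 Å
MIN_ALPHA_H = 110.0      # degrees, minimum C-H...O angle

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, measured at b."""
    ba = [x - y for x, y in zip(a, b)]
    bc = [x - y for x, y in zip(c, b)]
    cos = sum(p * q for p, q in zip(ba, bc)) / (math.hypot(*ba) * math.hypot(*bc))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def is_close_contact(c_atom, h_atom, o_atom):
    """True if the H...O distance and the C-H...O angle satisfy both cutoffs."""
    d_ho = math.dist(h_atom, o_atom)
    alpha_h = angle_deg(c_atom, h_atom, o_atom)
    # A small tolerance guards against floating-point noise at the cutoff.
    return d_ho <= MAX_D_HO + 1e-9 and alpha_h >= MIN_ALPHA_H
```

A linear contact with dCO = 3.35 Å and a 1.09 Å C—H bond gives dHO = 2.26 Å and passes; a contact at the same distance with αH = 90° is rejected.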

The resulting database of close contacts had another layer of redundancy due to the presence of noncrystallographic symmetry, which includes biologically relevant oligomers. To eliminate multiple observations of the same contact, we arbitrarily selected the median interaction from oligomeric structures. We assumed that at 1.5 Å resolution or higher, differences between monomers may be due to genuine differences in crystal packing, and so averaging would not be appropriate. However, as the shortest distances might be encumbered by errors, the median contact might be more representative. This final nonredundant data set was used for further calculations of stereochemistry.
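
A minimal sketch of the median selection described above follows. It assumes that each contact is a dictionary and that equivalent contacts in NCS-related chains share the same PDB entry, donor residue number and acceptor residue number; these keys are hypothetical. For an even number of copies the sketch takes the longer of the two middle contacts, which is an arbitrary choice of this example.

```python
from collections import defaultdict

def median_contact_per_site(contacts):
    """Keep one observed contact per equivalent site: the median by H...O distance."""
    groups = defaultdict(list)
    for contact in contacts:
        key = (contact["entry"], contact["donor_resnum"], contact["acceptor_resnum"])
        groups[key].append(contact)

    selected = []
    for members in groups.values():
        ordered = sorted(members, key=lambda c: c["d_ho"])
        # Pick an actual observation rather than an average of several.
        selected.append(ordered[len(ordered) // 2])
    return selected
```

Selecting an observed contact rather than a mean keeps every retained data point tied to real coordinates, in line with the reasoning that averaging across monomers would not be appropriate.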

The stereochemical analysis was also performed using the PyMOL scripting engine. For each contact identified, the exact distance between the H atom and the O atom was determined, as well as additional geometric parameters as described in Section 3[link]. The database was then split into clusters depending on the number of amino acids between the donor and acceptor groups. The data arising were recorded in tabular form using Excel for each identified conformational cluster separately. All statistical analysis was then carried out in Excel.
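
The split by sequence separation amounts to a histogram of the acceptor residue number minus the donor residue number. The sketch below assumes contacts are given as (donor residue number, acceptor residue number) pairs within one chain with continuous numbering; the function name is hypothetical, and the original tabulation was carried out in Excel.

```python
from collections import Counter

def register_histogram(contacts):
    """Count contacts by acceptor offset: positive is downstream, negative upstream."""
    return Counter(acceptor - donor for donor, acceptor in contacts)
```

For example, a Trp at position 178 contacting the carbonyl O atom of residue 179 falls in the +1 class, while a Trp at position 43 contacting residue 42 falls in the −1 class.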

2.2. Quantum-chemical calculations of interaction energies

Quantum-chemical calculations were performed via the density-functional approach (DFT) within the context of the M06-2X functional (Zhao & Truhlar, 2008[Zhao, Y. & Truhlar, D. G. (2008). Theor. Chem. Acc. 120, 215-241.]), which has been shown to be an accurate means of treating hydrogen bonds and related noncovalent bonds (Kříž & Řezáč, 2022[Kříž, K. & Řezáč, J. (2022). Phys. Chem. Chem. Phys. 24, 14794-14804.]; Boese, 2015[Boese, A. D. (2015). ChemPhysChem, 16, 978-985.]; Kozuch & Martin, 2013[Kozuch, S. & Martin, J. M. L. (2013). J. Chem. Theory Comput. 9, 1918-1931.]; Walker et al., 2013[Walker, M., Harvey, A. J. A., Sen, A. & Dessent, C. E. H. (2013). J. Phys. Chem. A, 117, 12590-12600.]; Thanthiriwatte et al., 2011[Thanthiriwatte, K. S., Hohenstein, E. G., Burns, L. A. & Sherrill, C. D. (2011). J. Chem. Theory Comput. 7, 88-96.]; Liao et al., 2003[Liao, M. S., Lu, Y. & Scheiner, S. (2003). J. Comput. Chem. 24, 623-631.]; Deible et al., 2014[Deible, M. J., Tuguldur, O. & Jordan, K. D. (2014). J. Phys. Chem. B, 118, 8257-8263.]; Li et al., 2014[Li, A., Muddana, H. S. & Gilson, M. K. (2014). J. Chem. Theory Comput. 10, 1563-1575.]; Mardirossian & Head-Gordon, 2013[Mardirossian, N. & Head-Gordon, M. (2013). J. Chem. Theory Comput. 9, 4453-4461.]; Elm et al., 2013[Elm, J., Bilde, M. & Mikkelsen, K. V. (2013). Phys. Chem. Chem. Phys. 15, 16442-16445.]; Bhattacharyya et al., 2013[Bhattacharyya, S., Bhattacherjee, A., Shirhatti, P. R. & Wategaonkar, S. (2013). J. Phys. Chem. A, 117, 8238-8250.]). A polarized triple-ζ def2-TZVP basis set was chosen so as to afford a large and flexible set. The Gaussian 16 program (Frisch et al., 2016[Frisch, M. J., Trucks, G. W., Schlegel, H. B., Scuseria, G. E., Robb, M. A., Cheeseman, J. R., Scalmani, G., Barone, V., Petersson, G. A., Nakatsuji, H., Li, X., Caricato, M., Marenich, A. V., Bloino, J., Janesko, B. G., Gomperts, R., Mennucci, B., Hratchian, H. P., Ortiz, J. V., Izmaylov, A. F., Sonnenberg, J. L., Williams-Young, D., Ding, F., Lipparini, F., Egidi, F., Goings, J., Peng, B., Petrone, A., Henderson, T., Ranasinghe, D., Zakrzewski, V. G., Gao, J., Rega, N., Zheng, G., Liang, W., Hada, M., Ehara, M., Toyota, K., Fukuda, R., Hasegawa, J., Ishida, M., Nakajima, T., Honda, Y., Kitao, O., Nakai, H., Vreven, T., Throssell, K., Montgomery, J. A. Jr, Peralta, J. E., Ogliaro, F., Bearpark, M. J., Heyd, J. J., Brothers, E. N., Kudin, K. N., Staroverov, V. N., Keith, T. A., Kobayashi, R., Normand, J., Raghavachari, K., Rendell, A. P., Burant, J. C., Iyengar, S. S., Tomasi, J., Cossi, M., Millam, J. M., Klene, M., Adamo, C., Cammi, R., Ochterski, J. W., Martin, R. L., Morokuma, K., Farkas, O., Foresman, J. B. & Fox, D. J. (2016). Gaussian 16 Revision C.01. Gaussian Inc., Wallingford, Connecticut, USA.]) was used for these computations. The interaction energy Eint of each dyad was evaluated as the difference between the energy of the complex and the sum of the energies of the two constituent subunits. The counterpoise procedure (Boys & Bernardi, 1970[Boys, S. F. & Bernardi, F. (1970). Mol. Phys. 19, 553-566.]) was applied to correct basis-set superposition error.
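
The definition of Eint can be made concrete with a small helper. This is a sketch under stated assumptions: the three energies are total electronic energies in hartree, and in the usual counterpoise scheme the two subunit energies are evaluated in the full basis set of the complex at the geometry of the complex. The function name is hypothetical and the numbers in the usage note are invented placeholders, not results from this study.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509474

def interaction_energy(e_complex, e_subunit_a, e_subunit_b):
    """E_int = E(complex) - [E(A) + E(B)], converted from hartree to kcal/mol.

    For a counterpoise-corrected value, e_subunit_a and e_subunit_b should be
    the subunit energies computed with ghost functions of the partner present.
    """
    return (e_complex - (e_subunit_a + e_subunit_b)) * HARTREE_TO_KCAL_PER_MOL
```

With placeholder energies of −100.005, −60.0 and −40.0 hartree, the function returns about −3.14 kcal mol−1; a negative value indicates an attractive interaction under this sign convention.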

3. Results and discussion

3.1. Identification of interactions involving TrpCδ1—H as the donor group

We generated a database of nonredundant protein crystal structures refined at a resolution of 1.5 Å or higher from the Protein Data Bank (Burley et al., 2022[Burley, S. K., Bhikadiya, C., Bi, C., Bittrich, S., Chen, L., Crichlow, G. V., Duarte, J. M., Dutta, S., Fayazi, M., Feng, Z., Flatt, J. W., Ganesan, S. J., Goodsell, D. S., Ghosh, S., Kramer Green, R., Guranovic, V., Henry, J., Hudson, B. P., Lawson, C. L., Liang, Y., Lowe, R., Peisach, E., Persikova, I., Piehl, D. W., Rose, Y., Sali, A., Segura, J., Sekharan, M., Shao, C., Vallat, B., Voigt, M., Westbrook, J. D., Whetstone, S., Young, J. Y. & Zardecki, C. (2022). Protein Sci. 31, 187-208.]; see Section 2[link] for the definition of redundancy etc.). Next, we calculated the positions of riding H atoms in all structures with the TrpCδ1—H distance set to 1.09 Å. We then identified interactions involving TrpCδ1—H groups as donors and potential oxygen acceptors, i.e. waters, hydroxyl groups (Ser, Thr and Tyr), side-chain groups (Asx and Glx) and main-chain carbonyl O atoms, using a maximum distance cutoff for H⋯O (dHO) of 2.86 Å and a minimum Cδ1—H⋯O angle (αH) of 110° (see Section 2[link] for an explanation of the cutoff criteria).

We obtained 17 012 close contacts, 5983 of which were with water O atoms. Another 1046 contacts involved Glu and Asp carboxylate groups and 1010 contacts were with side-chain hydroxyl groups of Ser, Thr and Tyr. A further 542 contacts involved side-chain carbonyl groups of Asn and Gln. Interestingly, nearly half of all contacts, i.e. 8431 (49.6%), were with backbone carbonyl O atoms, which are particularly strong acceptors owing to their partial negative charge. Given the preponderance of these interactions, we focused on this group of contacts and analysed the respective stereochemistry in order to assess their character and potential function.
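
The breakdown by acceptor type can be checked with a few lines of arithmetic. The counts below are those quoted in the text; the dictionary labels and variable names are illustrative.

```python
counts = {
    "water": 5983,
    "Asp/Glu carboxylate": 1046,
    "Ser/Thr/Tyr hydroxyl": 1010,
    "Asn/Gln side-chain carbonyl": 542,
    "backbone carbonyl": 8431,
}

total = sum(counts.values())
backbone_percent = round(100.0 * counts["backbone carbonyl"] / total, 1)

for acceptor, n in counts.items():
    print(f"{acceptor:30s} {n:6d} {100.0 * n / total:5.1f}%")
print(f"{'total':30s} {total:6d}")
```

The five categories sum to the 17 012 contacts reported, and the backbone carbonyl share evaluates to 49.6%.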

3.2. The stereochemistry of the TrpCδ1—H⋯O=Cbackbone contacts

In order to characterize the stereochemistry of interactions involving TrpCδ1—H groups, we first calculated the distribution of the donor–acceptor, or C⋯O, distances (dCO), as well as the C—H⋯O angles (αH), separately for all carbonyl O atoms as acceptors and for water O atoms (Figs. 2[link] and 3[link]). The distribution of distances to carbonyl O atoms has a distinct maximum at 3.35 Å. In contrast, water O atoms were found further away on average, at 3.55 Å. The shortest distances in both cases were just below 3 Å. For both types of interactions, αH increases gradually with the C⋯O distance.

[Figure 2]
Figure 2
The stereochemical parameters used in this study. dHO, dCO and τ are given in ångströms and all angles are given in degrees.
[Figure 3]
Figure 3
Left: a histogram of the number of interactions of TrpCδ1—H with backbone carbonyl O atoms as a function of the distance dCO (green bars) and a mean value of the αH angle in each group, corrected with cubic interpolation. Right: the same statistics for interactions with water molecules.

It should be stressed that intramolecular steric constraints significantly impact the observed distance distributions. Nevertheless, we note interesting trends. The peak of the dCO distribution is shorter by 0.2 Å compared with the sum of the van der Waals radii of O and C atoms used in this study (i.e. 3.56 Å; see Section 2[link]), suggesting a cohesive interaction. The higher deviation from linearity than observed in canonical hydrogen bonds can be rationalized in terms of the van der Waals interactions between the donor C and acceptor O atom. Specifically, at shorter C⋯O distances the αH angle assumes more acute values, as the H atom is pushed out to avoid steric collision between H and O, which are further apart by at least 0.3 Å than the corresponding distance in canonical hydrogen bonds, owing to the partly covalent character of the latter. Overall, the stereochemistry is consistent with that expected for C—H⋯O hydrogen bonds in small organic molecules (Taylor & Kennard, 1982[Taylor, R. & Kennard, O. (1982). J. Am. Chem. Soc. 104, 5063-5070.]).

Next, we calculated a scatter plot of the two Trp side-chain dihedral angles, i.e. χ1 and χ2, for all TrpCδ1—H⋯O=Cbackbone contacts (Fig. 4[link]). The purpose was to investigate whether the various structural motifs involve Trp side chains in canonical or strained conformations. Nine conformer clusters are observed. The results are intriguing: although low-energy m105 and t-105 are the dominant clusters, as expected, not only is m0 strongly represented, but the unfavourable t0 has a nearly equal frequency, and some cases of p0 are also identifiable.

[Figure 4]
Figure 4
Distribution of side-chain dihedral angles for Trp residues involved in contacts with all main-chain carbonyl O atoms. Blue outlines indicate the most populous, low-energy clusters found in proteins, green shows energetically favourable but less common clusters and red represents theoretically unfavourable conformations.
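
One way to picture the nine possible clusters is a coarse 3 × 3 binning of (χ1, χ2). The sketch below is not the clustering used in this study: it assumes ideal χ1 centres of −60° (m), +60° (p) and 180° (t) and coarse χ2 centres of −90°, 0° and +90°, and it assigns each residue to the nearest centre on the circle. The function names and bin centres are hypothetical, and the returned χ2 value is a bin centre, not a Lovell-style rotamer name.

```python
def _circular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

CHI1_CENTRES = {"m": -60.0, "p": 60.0, "t": 180.0}
CHI2_CENTRES = (-90.0, 0.0, 90.0)

def coarse_cluster(chi1, chi2):
    """Return (chi1 letter, nearest chi2 bin centre) for a Trp side chain."""
    letter = min(CHI1_CENTRES, key=lambda k: _circular_difference(chi1, CHI1_CENTRES[k]))
    chi2_bin = min(CHI2_CENTRES, key=lambda c: _circular_difference(chi2, c))
    return letter, int(chi2_bin)
```

Under this binning, a residue with χ1 = 64° and χ2 = 90° is assigned to the p/+90 bin and one with χ1 near −60° and χ2 near 0° to the m/0 bin.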

We then asked what the separation was for the observed pairs of interacting moieties along the polypeptide chain. Fig. 5[link] illustrates the relative register in the sequence between the donor Trp and the acceptor carbonyl group. Positive values indicate that acceptor O atoms are located downstream in the sequence, and negative values refer to oxygen acceptors that are located upstream, i.e. towards the amino-terminus. The most common interactions are those with peptide O atoms in nearby positions: +1, −1, −2, −3 and −4 (Fig. 5[link]). Intrigued by this observation, we carried out additional stereochemical characterization for all contacts within each class (Fig. 2[link]), including the C=O⋯H angle (αO) and the Cα—C=O⋯H dihedral angle (ξ), which allowed calculation of the elevation of the hydrogen from the sp2 plane (τ). Canonical hydrogen bonds demonstrate a strong preference for hydrogens to cluster with αO angles in the range 120–240° and close to the sp2 plane (i.e. low elevation; Murray-Rust & Glusker, 1984[Murray-Rust, P. & Glusker, J. P. (1984). J. Am. Chem. Soc. 106, 1018-1025.]), and similar trends, albeit not as pronounced, have been reported for C—H⋯O bonds (Taylor & Kennard, 1982[Taylor, R. & Kennard, O. (1982). J. Am. Chem. Soc. 104, 5063-5070.]). We were interested in whether we could reproduce these trends in the present study. Finally, we calculated the Ramachandran angles for all of the Trp residues involved to identify possible correlations between local secondary structure and side-chain conformation.
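
The relation between dHO, αO, ξ and the elevation τ can be sketched as follows. This assumes that the sp2 plane is the plane through Cα, C and O, so that the out-of-plane displacement of the H atom is the component of the O⋯H vector perpendicular to that plane; the exact expression used in the study is not stated in the text, and the function name is hypothetical.

```python
import math

def elevation(d_ho, alpha_o_deg, xi_deg):
    """Distance (Å) of the H atom from the Cα-C=O plane.

    d_ho        : H...O distance in Å
    alpha_o_deg : C=O...H angle in degrees
    xi_deg      : Cα-C=O...H dihedral angle in degrees
    """
    alpha_o = math.radians(alpha_o_deg)
    xi = math.radians(xi_deg)
    # d_ho * sin(alpha_o) is the displacement perpendicular to the C=O axis;
    # sin(xi) takes the part of it that lies out of the plane.
    return abs(d_ho * math.sin(alpha_o) * math.sin(xi))
```

An H atom lying in the plane (ξ of 0° or 180°) has zero elevation whatever the value of αO, while αO and ξ both near 90° place it almost directly above the O atom at a height close to dHO.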

[Figure 5]
Figure 5
A histogram showing the number of contacts between TrpCδ1—H as a donor and the ith main-chain carbonyl O atom as the acceptor. For example, −2 denotes an acceptor located two peptide units upstream in the sequence.

All calculations up to this point were carried out using raw coordinates from the Protein Data Bank (except for the riding hydrogen positions, which were added independently). As we embarked on the detailed analysis of specific structures, we were concerned about inconsistencies inherent in the data sets in the PDB introduced by different refinement protocols and different software. Specifically, we were concerned about the lack of inclusion of H atoms during refinement, the lack of coordinates in the file etc. To avoid bias, all structures described below were subjected to additional standardized refinement and addition of riding H atoms at correct, uniform positions using a PyMOL script. Details are described in the supporting information and Supplementary Table S1.

3.2.1. The Cδ1—H → O=C (+1) class

In this class of interactions, the Cδ1—H group of Trp points towards the carbonyl O atom of the next residue downstream in the sequence, reaching across a single peptide bond. This requires a favourable combination of four dihedral angles: two Ramachandran angles, ψ in Trp and φ in the residue downstream, and both the χ1 and χ2 angles in the Trp side chain. There are three possible combinations, leading to only three specific conformational clusters out of the nine possible (Fig. 6[link]). The most populous (361 structures) is a distinct, tight cluster corresponding to the rather rare (4.7% frequency) p90 conformer (average χ1 and χ2 of 64° and 90°, respectively). The Trp residue is invariably in the β-secondary structure and the Cδ1—H approaches the acceptor O atom from the re face. The bond is close to linear (the average αH is 153°), but the angle on the acceptor is unfavourable (average αO of 107°; Fig. 7[link]a), resulting in the hydrogen being located significantly outside the sp2 plane of oxygen (average 2.2 Å). A number of such interactions result in very close dHO distances.

[Figure 6]
Figure 6
A double scatter plot (Ramachandran φ/ψ, blue; conformational, χ1/χ2, red) for Trp residues in all structural motifs in the +1 class. The clusters are identified by type as shown in Fig. 4[link].
[Figure 7]
Figure 7
Examples of the three conformational Trp clusters in the +1 class. (a) p90 (PDB entry 1v5v; only the Cβ atom of Trp178 is shown for clarity), (b) t0 (PDB entry 3ts3), (c) t-105 (PDB entry 4ge6).

Both remaining clusters are in the trans conformation with χ1 close to 180°. The first is identifiable as t0 (154 structures). In this cluster, Cδ1—H also approaches the O atom from the re face (as defined by IUPAC), with H significantly out of the sp2 plane, and the dHO distances are often short. The bond tends to be less linear than in p90, with an average αH of 137°, and the angle on the acceptor (αO) is unfavourable (average of 106°) (Fig. 7[link]b), although the H atom is closer to the sp2 plane (average τ of 1.9 Å).

The second trans cluster is the canonical t-105 (113 structures), showing optimal Cδ1—H⋯O bond stereochemistry. This is accomplished specifically when the downstream residue is proline (15 of the 30 shortest distances, including the five shortest distances) or alanine (nine of the 30 shortest distances). The reason is that the secondary structure of this residue needs to be of the collagen type, and both proline and alanine have a strong preference for this conformation (Berisio et al., 2002[Berisio, R., Vitagliano, L., Mazzarella, L. & Zagari, A. (2002). Protein Sci. 11, 262-270.]; Parchaňský et al., 2013[Parchaňský, V., Kapitán, J., Kaminský, J., Šebestík, J. & Bouř, P. (2013). J. Phys. Chem. Lett. 4, 2763-2768.]). The stereochemistry leads to a mean αO of 123°, with the hydrogen on average only 0.6 Å out of the sp2 plane, in an excellent position to interact with one of the free sp2 electron pairs of oxygen (Fig. 7[link]c).

3.2.2. The Cδ1—H → O=C (−1) class

In this unique type of contact, the Cδ1—H group of the indole ring points towards the preceding peptide, engaging in an interaction with the carbonyl O atom immediately upstream in the sequence. The vast majority in this group (694 structures) are in the unfavourable m0 conformation, initially identified by Lovell et al. (2000[Lovell, S. C., Word, J. M., Richardson, J. S. & Richardson, D. C. (2000). Proteins, 40, 389-408.]). Our observation rationalizes the high frequency of this conformer. The motif restricts the Ramachandran φ angle to a narrow range of −90° to −135°, while ψ is allowed a broader range (Fig. 8[link]). The average χ2 is −3.2°. Although the C—H⋯O interaction is close to linear (the average αH is 146.3°), the average αO is very unfavourable (86°) and the hydrogen is out of the amide plane by more than 2 Å on average. We note that such motifs often occur within a β-strand or at the end of one, resulting in a sharp turn.

[Figure 8]
Figure 8
Class −1 of interactions. (a) A double scatter plot (Ramachandran φ/ψ, blue; conformational, χ1/χ2, red) for Trp residues in all motifs. (b) An example from the m0 cluster (PDB entry 2df6; only the Cβ atom of Trp43 is shown for clarity).

A small minority of contacts in this class, i.e. 30 examples, are of the m105 type and almost all involve Trp residues in the αL region of the Ramachandran plot, with long dHO distances. Such stereochemistry suggests weak interactions. There are only three structures in the p-90 cluster.

3.2.3. The Cδ1—H → O=C (−2) class

More conformational freedom is allowed in this class of contacts owing to the insertion of a residue between the acceptor and Trp. Although this motif is more diverse, the same three conformational clusters are observed as were seen in the previous class, albeit with very different frequencies (Fig. 9). By far the most common here is the canonical m105 conformation, with 698 structures. With very few exceptions, Trp is in the β-secondary conformation, with the hydrogen this time approaching from the si face and significantly outside the sp2 plane. The average αH and αO angles are 137° and 114°, respectively.

[Figure 9]
Figure 9
The −2 class of interactions. A double scatter plot (Ramachandran φ/ψ, blue; conformational, χ1/χ2, red) for Trp residues in all structural motifs identified in this class.

The m0 cluster is represented by 190 structures. It is very close in conformational space to m105 because the m105 structures are shifted to lower χ2, with an average value of 82°, while the m0 structures are shifted to higher χ2, with an average of 23°. In both groups Trp is primarily found in extended, β-secondary conformations, although right-handed and left-handed helical structures are also observed.

There are 277 motifs that constitute the p-90 cluster. The secondary conformation of Trp is restricted to right-handed α-helices and β-structure only. Examples of each of the clusters are shown in Fig. 10.

[Figure 10]
Figure 10
Examples of the three conformational Trp clusters in the −2 class. (a) m0 (PDB entry 5k4b; only the Cβ atoms of non-Trp residues are shown for clarity), (b) m105 (PDB entry 2vbk), (c) p-90 (PDB entry 6qo9).

Of note is the fact that many of the motifs in all three clusters resemble the classic type II β-turn. The conformation of Trp is such that the Cδ1—H group mimics the peptide amide which would serve as a donor in a classical β-turn, adding just one atom to the turn (11 atoms instead of 10). Therefore, the direction of the hydrogen bond is preserved, with residue i donating the hydrogen bond to residue i − 2. Unlike the canonical β-turn, this structural feature does not reverse the direction of the polypeptide chain but creates kinks and turns of ∼110°.

3.2.4. The Cδ1—H → O=C (−3) class

In this class, two amino acids are inserted between the acceptor carbonyl group and Trp, adding additional degrees of freedom. Nevertheless, we observe the presence of the same three conformational clusters as was the case for the −1 and −2 classes, i.e. m105, m0 and p-90. The difference is that owing to weaker steric constraints, the m105 and m0 clusters are now distinctly separate and closer to the theoretical values for χ2 angles (averages of 98.5° and −3.6°, respectively), and the frequencies are decidedly shifted towards the canonical, low-energy conformations. There are 361 structures in the m105 cluster and 290 in the p-90 cluster, with only 34 in the unfavourable m0 group (Fig. 11).

[Figure 11]
Figure 11
The −3 class of interactions. A double scatter plot (Ramachandran φ/ψ, blue; conformational, χ1/χ2, red) for Trp residues in all structural motifs in this class.

The m105 cluster contains motifs with Trp found in both α and β secondary structures. The average αH is 137.9°, but αO is again unfavourable (average 113.8°). Except for a few outliers, the p-90 cluster is stereochemically tight, with a mean χ1 of 66° and χ2 of −89°. The vast majority of the motifs contain Trp in an α-helical form, and the putative hydrogen bond has a more favourable geometry, with an αH of 138.5° and an αO of 135.4°, with an average elevation of 0.6 Å on the si face. The small m0 cluster contains several motifs with Trp in α, β and left-handed helical secondary conformations. The dHO distances are longer in this cluster, with an average αH of 140.7° and αO of 136.4°.
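
The cluster labels used throughout this section (m105, m0, p-90 and so on) combine a letter for χ1 with a nominal χ2 value. As an illustration only, the sketch below assigns a (χ1, χ2) pair to the nearest named cluster centre. The χ2 centres are taken from the cluster names; the χ1 centres (+60°, 180° and −60° for p, t and m) are idealized staggered values assumed here, and the nearest-centre rule is a simplification rather than the clustering procedure used in this study.

```python
# Approximate cluster centres (chi1, chi2) in degrees. chi2 follows the cluster
# names in the text; the chi1 values are ASSUMED idealized staggered positions.
CENTRES = {
    "p-90": (60.0, -90.0),
    "p90": (60.0, 90.0),
    "t-105": (180.0, -105.0),
    "t0": (180.0, 0.0),
    "m0": (-60.0, 0.0),
    "m105": (-60.0, 105.0),
}


def angle_diff(a, b):
    """Smallest signed difference a - b between two angles, in degrees."""
    return (a - b + 180.0) % 360.0 - 180.0


def assign_cluster(chi1, chi2):
    """Return the name of the nearest cluster centre in periodic (chi1, chi2) space."""
    def dist2(name):
        c1, c2 = CENTRES[name]
        return angle_diff(chi1, c1) ** 2 + angle_diff(chi2, c2) ** 2
    return min(CENTRES, key=dist2)


# Mean p-90 values quoted for the -3 class (chi1 = 66, chi2 = -89).
print(assign_cluster(66.0, -89.0))
# Mean chi2 of the m105 cluster in the -3 class, with a hypothetical chi1 of -65.
print(assign_cluster(-65.0, 98.5))
```

Because the angular differences are wrapped into the range −180° to 180°, a χ1 of −175° is treated as close to 180°, which matters for the trans clusters.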

Examples of a structural motif from each of the clusters are shown in Fig. 12.

[Figure 12]
Figure 12
Examples of the three conformational Trp clusters in the −3 class. (a) m0 (PDB entry 5js4; only the Cβ atoms of non-Trp residues are shown for clarity), (b) m105 (PDB entry 6x8o), (c) p-90 (PDB entry 6qo9). Note that (b) and (c) contain three-centred hydrogen bonds from the Trp Cδ1—H and amide groups to the i − 3 carbonyl reminiscent of a 3₁₀-helical hydrogen-bonding pattern (canonical amide-to-carbonyl hydrogen bonds are shown as fine dashed lines).

3.2.5. The Cδ1—H → O=C (−4) class

This is the most ubiquitous and the most diverse motif, owing to the flexibility generated by the insertion of three residues between the acceptor and donor amino acids. Nevertheless, perhaps surprisingly, only the same three conformational clusters are again present: m105, m0 and p-90. The canonical m105 conformer (average χ2 of 99°) is by far the most common, with nearly 1500 examples, compared with only 80 examples of p-90 and just 44 of m0 (Fig. 13). The majority, i.e. ∼75%, of motifs in the m105 cluster contain Trp in the α-helical conformation, often at the C-terminus of an α-helix (Fig. 14), capping the i − 4 carbonyl with three-centre hydrogen bonds donated by the main-chain amide and the Cδ1—H group.

[Figure 13]
Figure 13
The −4 class of interactions. A double scatter plot (Ramachandran φ/ψ, blue; conformational, χ1/χ2, red) for Trp residues in all structural motifs in this class.

[Figure 14]
Figure 14
Examples of the three conformational Trp clusters in the −4 class. (a) m0 (PDB entry 7mzy; only the Cβ atoms of non-Trp residues are shown for clarity), (b) m105 (PDB entry 4gxw), (c) p-90 (PDB entry 3s92). Note that all motifs contain hydrogen bonds from the Trp Cδ1—H and amide groups to the i − 4 carbonyl, capping it with a three-centred bond.

The 80 motifs in the p-90 cluster (average χ2 of −88°) contain primarily (85%) α-helical Trp, with a slightly more favourable average αH of 138°. Most of these motifs also contain a three-centred hydrogen bond such that the amide group and Cδ1—H cap the carbonyl O atom of residue i − 4. This is analogous to the recently documented capping of carbonyl O atoms within membrane helices by Thr and Ser hydroxyls, with a net gain of 127% in enthalpy compared with a single hydrogen bond (Brielle & Arkin, 2020).

The rare m0 motifs also contain Trp in both α and β secondary conformations. They tend to have an unfavourable angular stereochemistry, with an average αH of 128° and αO of 141°, and longer dHO distances.

3.3. The interaction energies of C—H⋯O=C bonds

Whereas the stereochemical descriptors of close interatomic contacts provide useful information for the identification of hydrogen bonds, proximity per se does not imply a cohesive interaction or a structural function in the stabilization of a specific conformation. Historically, this was the argument used by Jerry Donohue in his criticism of June Sutor's proposal for the existence of C—H⋯O bonds based on crystallographic data (Schwalbe, 2012). To support his view, he quoted Ramachandran's opinion that H⋯O distances of 2.2 Å in proteins need not necessarily indicate the presence of a hydrogen bond (Ramachandran et al., 1963). It is in principle true that the presence of a hydrogen bond is only hypothesized based on stereochemistry, and its strength is somewhat speculatively inferred from parameters such as linearity (αH) and hydrogen–acceptor distance (dHO). However, current knowledge of the physical chemistry of the hydrogen bond makes it possible to predict its existence based on the nature of the participating groups and stereochemistry with a very high degree of confidence. The nature of the various structural motifs described above, harbouring close Cδ1—H⋯O=C interactions, is strongly suggestive of cohesive hydrogen bonds, but to assess the energies we turned to quantum-mechanical calculations.

It has been shown by one of us (Scheiner et al., 2002) that a water molecule binds as a hydrogen-bond acceptor to the Cδ1—H of indole with an energy of −2.1 kcal mol−1 at a dCO distance of 3.35 Å. Because a peptide carbonyl is a stronger acceptor, we repeated this calculation for acetamide, representing an amide group, and indole as a model for Trp. The planes of the two molecules were perpendicular to avoid any steric repulsions, with a fully linear C—H⋯O=C arrangement. Following the optimization of dHO (2.27 Å), we obtained a value for the energy of the interaction (Eint) of −2.6 kcal mol−1, which is consistent with a stronger bond. (For comparison, we also calculated the Eint value for the interaction of a carbonyl O atom of acetamide with the aromatic Cɛ2—H group of indole; the result was −1.05 kcal mol−1.)

The above calculations use a perfectly linear C=O⋯H—C bond as a model system. The motifs found in actual protein structures are quite different from such ideal stereochemistry, and specifically many show αH and αO values that deviate significantly from linearity. We were interested in whether the energies of these interactions are still significant when compared with the reference system. To this end, we used eight representative cases from among those described above, with αH ranging from 135° to 172°, αO ranging from 94° to 165° and τ ranging from 0.3 to 2.15 Å. In each case, we truncated the Trp moiety to 3-methylindole and the acceptor peptide to N-methylacetamide, added H atoms using the PyMOL script and calculated interaction energies (Eint; see Section 2). The results are shown in Table 1 and Fig. 15.
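
The stereochemical descriptors used here (dHO, αH, αO and τ) can be computed directly from atomic coordinates. The following is a minimal sketch, not the procedure used in this study: it assumes that the sp2 plane of the acceptor is defined by the carbonyl C, the carbonyl O and the amide N bonded to that C, and the sign of τ depends on the atom order chosen for the plane normal. The function name `contact_geometry` and the toy coordinates are hypothetical.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def angle(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    u, v = sub(a, b), sub(c, b)
    cosang = dot(u, v) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosang))))


def contact_geometry(c_donor, h, o, c_carbonyl, n_amide):
    """Descriptors of a C-H...O=C contact from Cartesian coordinates (in Å)."""
    # ASSUMPTION: the acceptor sp2 plane contains the carbonyl C, O and amide N.
    normal = cross(sub(o, c_carbonyl), sub(n_amide, c_carbonyl))
    unit = tuple(x / norm(normal) for x in normal)
    return {
        "d_HO": norm(sub(h, o)),             # hydrogen-acceptor distance
        "d_CO": norm(sub(c_donor, o)),       # donor carbon-acceptor distance
        "alpha_H": angle(c_donor, h, o),     # C-H...O angle
        "alpha_O": angle(c_carbonyl, o, h),  # C=O...H angle
        "tau": dot(sub(h, o), unit),         # signed elevation of H from the plane
    }


# Toy coordinates for a fully linear, in-plane contact (not from any PDB entry).
geom = contact_geometry(
    c_donor=(4.58, 0.0, 0.0), h=(3.50, 0.0, 0.0), o=(1.23, 0.0, 0.0),
    c_carbonyl=(0.0, 0.0, 0.0), n_amide=(-0.70, 1.10, 0.0),
)
print(geom)
```

In this toy arrangement both angles are 180° and τ is zero, with dHO = 2.27 Å, which mirrors the idealized linear reference geometry; lifting the H atom out of the plane changes τ and αO while leaving the plane definition untouched.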

Table 1
Interaction energies (Eint) calculated for 3-methylindole and N-methylacetamide pairs based on the coordinates of specific interactions in protein structures

The ΔdCO values are the changes in the C⋯O distance (dCO) resulting from additional refinement; the final dHA values (i.e. hydrogen⋯acceptor distances) were obtained after the riding H atoms were replaced with those calculated by PyMOL. Other parameters are αH (the C—H⋯O angle), αO (the C=O⋯H angle) and τ (the elevation of H from the sp2 plane).

| PDB entry | Class | Conformer | dCO (Å) | ΔdCO (Å) | αH (°) | αO (°) | τ (Å) | dHA (Å) | Trp | Acceptor | Eint (kcal mol−1) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3ts3 | +1 | t0 | 3.026 | 0.027 | 153 | 120 | 1.36 | 2.029 | 612 | Ser613 | −2.12 |
| 4ge6 | +1 | t-105 | 3.143 | 0.052 | 172 | 128 | 0.61 | 2.109 | 468B | Pro469 | −2.84 |
| 6qo9 | −2 | p-90 | 3.098 | 0.005 | 171 | 162 | 0.28 | 2.023 | 26B | Gln24B | −2.81 |
| 5k4b | −2 | m0 | 2.862 | 0.069 | 140 | 153 | −0.76 | 2.107 | 419A | Val417 | −1.57 |
| 2vbk | −2 | m105 | 3.245 | 0.025 | 160 | 94 | −2.15 | 2.230 | 304A | Gly302 | −1.53 |
| 6zjs | −3 | m0 | 3.167 | 0.018 | 135 | 165 | −0.42 | 2.343 | 470A | Glu467 | −2.94 |
| 5js4 | −3 | m105 | 3.245 | −0.023 | 166 | 152 | 0.80 | 2.183 | 347A | Leu344 | −2.74 |
| 7mzy | −4 | m0 | 3.405 | 0.017 | 163 | 137 | −0.52 | 2.375 | 913A | Ala909 | −2.41 |

[Figure 15]
Figure 15
The stereochemistry of the Cδ1—H⋯O=C interactions for which energies of interaction have been calculated (Table 1) superposed on the Trp side chain. The PDB codes are shown for each carbonyl O atom.

All interactions show cohesive Eint values irrespective of stereochemistry. As expected, the weakest Eint values were obtained for those interactions in which the H atom is located significantly out of the sp2 plane of the acceptor O atom. It appears that the αH and αO angles are less of a factor: both can be as low as ∼130° without a significant reduction in Eint, as long as the hydrogen is within ∼0.8 Å of the sp2 plane.
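
The trend with elevation can be checked against the eight entries of Table 1. The sketch below is only a tabulation of the published τ and Eint values, split at the approximate 0.8 Å threshold mentioned above; treating that threshold as a hard cutoff is a simplifying assumption made here for illustration.

```python
# (PDB entry, tau in Å, Eint in kcal/mol), copied from Table 1.
TABLE_1 = [
    ("3ts3", 1.36, -2.12),
    ("4ge6", 0.61, -2.84),
    ("6qo9", 0.28, -2.81),
    ("5k4b", -0.76, -1.57),
    ("2vbk", -2.15, -1.53),
    ("6zjs", -0.42, -2.94),
    ("5js4", 0.80, -2.74),
    ("7mzy", -0.52, -2.41),
]

TAU_CUTOFF = 0.8  # Å; ASSUMED hard cutoff based on the approximate value in the text


def mean(values):
    return sum(values) / len(values)


def split_by_elevation(rows, cutoff=TAU_CUTOFF):
    """Return (near, far): Eint values for |tau| <= cutoff and |tau| > cutoff."""
    near = [e for _, tau, e in rows if abs(tau) <= cutoff]
    far = [e for _, tau, e in rows if abs(tau) > cutoff]
    return near, far


near, far = split_by_elevation(TABLE_1)
print(f"|tau| <= {TAU_CUTOFF} Å: n = {len(near)}, mean Eint = {mean(near):.2f} kcal/mol")
print(f"|tau| >  {TAU_CUTOFF} Å: n = {len(far)}, mean Eint = {mean(far):.2f} kcal/mol")
```

With this split, six entries fall within the cutoff and two outside it, and the mean Eint is roughly −2.6 versus −1.8 kcal mol−1. The separation is not perfectly clean: PDB entry 5k4b lies within the cutoff yet has one of the weakest energies, and it also has the shortest dCO in the table.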

We also noted that many of the structural motifs that we investigated show dHO distances as short as ∼2.0 Å, significantly shorter than the predicted optimal distance of ∼2.3 Å. We wondered whether such short interactions, resulting from intramolecular constraints, might be less favourable.

We used the structure of PDB entry 3ts3 as a model case. We translated the 3-methylindole moiety along the H⋯O line and evaluated Eint between 2.5 and 1.8 Å (Fig. 16). We find that while Eint is most favourable at 2.3 Å, the interaction remains cohesive down to ∼1.85 Å, which corresponds well to the shortest observed contacts in the crystal structures. There is little loss of energy when the bond is stretched to 2.5 Å, consistent with the primarily electrostatic nature of the interaction.
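
The geometric part of such a scan amounts to a rigid translation of the donor fragment along the H⋯O vector. A minimal sketch follows; it generates the displaced coordinates only, and the quantum-chemical evaluation of Eint at each point is not reproduced here. The 0.1 Å step, the function names and the toy coordinates are assumptions made for illustration.

```python
import math


def translate_along_HO(fragment, h, o, target_d):
    """Rigidly shift all fragment atoms so that the H...O distance equals target_d (Å)."""
    v = tuple(hc - oc for hc, oc in zip(h, o))
    d = math.sqrt(sum(x * x for x in v))
    unit = tuple(x / d for x in v)
    shift = tuple((target_d - d) * u for u in unit)
    return [tuple(c + s for c, s in zip(atom, shift)) for atom in fragment]


def scan_distances(start=2.5, stop=1.8, step=0.1):
    """H...O distances from start down to stop (step size is an ASSUMED value)."""
    n = int(round((start - stop) / step))
    return [round(start - i * step, 2) for i in range(n + 1)]


# Toy donor fragment: the H atom and its carbon, placed 2.02 Å from the acceptor
# O at the origin (2.02 Å is the crystallographic dHO quoted for 3ts3).
o = (0.0, 0.0, 0.0)
h = (2.02, 0.0, 0.0)
c = (3.10, 0.0, 0.0)

for d in scan_distances():
    new_h, new_c = translate_along_HO([h, c], h, o, d)
    print(d, new_h, new_c)
```

Because every atom of the fragment receives the same shift, its internal geometry (here the 1.08 Å C-H distance of the toy fragment) is unchanged across the scan, so any variation in a subsequently computed energy would reflect the intermolecular separation alone.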

[Figure 16]
Figure 16
The dependence of the energy of interaction (Eint) on the dHO distance for an N-methylacetamide and 3-methylindole pair derived from PDB entry 3ts3. The arrow shows the position on the energy curve corresponding to the actual dHO distance in the crystal structure, i.e. 2.02 Å. The black arrow indicates the line along which the 3-methylindole moiety was translated to obtain the curve of Eint versus distance.

4. Conclusions

It is well established that main-chain/side-chain interactions mediated by hydrogen bonds are involved in specific conformational motifs, often capping secondary-structure elements such as helices and β-sheets (Eswar & Ramakrishnan, 2000; Krishna Deepak & Sankararamakrishnan, 2016). However, such motifs reported to date invariably involved canonical hydrogen bonds, i.e. those involving N and O atoms. Typical examples are Asx-turns, in which the side-chain carbonyl O atom of Asp or Asn engages the main-chain amide of the i + 2 residue, mimicking a β-turn (D'mello et al., 2022). Similarly, Nδ1 of the histidine imidazole has been shown to engage with the backbone amide groups (Krishna Deepak & Sankararamakrishnan, 2016). Interestingly, Cδ1 of Trp occupies a position isosteric to Oδ of Asx and Nδ1 of His, and because it is protonated it engages the carbonyl and not the amide groups of the main chain. Our study demonstrates that the Cδ1—H group of a tryptophan residue plays an important role in stabilizing unique structural motifs by engaging as a hydrogen-bond donor with main-chain carbonyl O atoms nearby in the sequence. The most common such interactions involve residues one peptide unit downstream, i.e. i + 1, or 1–4 peptide units upstream, i.e. i − 1 to i − 4. Interestingly, Trp is found in these motifs in only six of the possible nine conformers, with the i + 1 class containing only p90, t0 and t-105 conformers, while the remaining four classes show Trp only in m105, m0 and p-90 conformations.
The frequencies of the high-energy m0 and t0 conformers are increased significantly in those classes where the contacts are strongly restricted by short-range steric constraints, while m105, the most populous conformer found in proteins, is strongly enriched in the −3 and −4 classes. Our work helps to explain the relatively common occurrence of the m0 and t0 conformers. It is important to note that the function of Trp residues is intimately contingent on their conformation. For example, Trp in transmembrane helices occurs most often in m0, t0 and p-90 conformations, all of which have been characterized in our study (de Jesus & Allen, 2013). Of importance is our observation that in the −3 and −4 classes Trp is often engaged in capping the acceptor O atom with hydrogen bonds donated by both the amide and Cδ1—H groups.

We also present evidence based on quantum-chemical calculations that the short Cδ1—H⋯O=C contacts revealed by structural data mining are in fact invariably cohesive, with interaction energies of the order of half that of a canonical hydrogen bond, and are less sensitive to specific stereochemistry, such as the C—H⋯O and H⋯O=C angles, than previously thought. The critical factor is the position of the H atom close to the sp2 plane of the acceptor O atom.

5. Related literature

The following references are cited in the supporting information for this article: Adams et al. (2010), Emsley et al. (2010) and Kovalevskiy et al. (2018).

Footnotes

Current address: Department of Biochemistry, Biophysics and Biotechnology, Doctoral School of Exact and Natural Sciences, Jagiellonian University, Krakow, Poland.

Acknowledgements

The authors declare no competing financial interests.

Funding information

ZSD and WM are supported by Harrison Family Funds; WM and ZSD acknowledge National Institutes of Health grants GM132595 and GM086457, respectively.

References

Adams, P. D., Afonine, P. V., Bunkóczi, G., Chen, V. B., Davis, I. W., Echols, N., Headd, J. J., Hung, L.-W., Kapral, G. J., Grosse-Kunstleve, R. W., McCoy, A. J., Moriarty, N. W., Oeffner, R., Read, R. J., Richardson, D. C., Richardson, J. S., Terwilliger, T. C. & Zwart, P. H. (2010). Acta Cryst. D66, 213–221.
Arunan, E., Desiraju, G. R., Klein, R. A., Sadlej, J., Scheiner, S., Alkorta, I., Clary, D. C., Crabtree, R. H., Dannenberg, J. J., Hobza, P., Kjaergaard, H. G., Legon, A. C., Mennucci, B. & Nesbitt, D. J. (2011). Pure Appl. Chem. 83, 1637–1641.
Balaceanu, A., Pasi, M., Dans, P. D., Hospital, A., Lavery, R. & Orozco, M. (2017). J. Phys. Chem. Lett. 8, 21–28.
Beiranvand, N., Freindorf, M. & Kraka, E. (2021). Molecules, 26, 2268.
Berisio, R., Vitagliano, L., Mazzarella, L. & Zagari, A. (2002). Protein Sci. 11, 262–270.
Bhattacharyya, S., Bhattacherjee, A., Shirhatti, P. R. & Wategaonkar, S. (2013). J. Phys. Chem. A, 117, 8238–8250.
Boese, A. D. (2015). ChemPhysChem, 16, 978–985.
Bondi, A. (1964). J. Phys. Chem. 68, 441–451.
Boys, S. F. & Bernardi, F. (1970). Mol. Phys. 19, 553–566.
Brielle, E. S. & Arkin, I. T. (2020). J. Am. Chem. Soc. 142, 14150–14157.
Burley, S. K., Bhikadiya, C., Bi, C., Bittrich, S., Chen, L., Crichlow, G. V., Duarte, J. M., Dutta, S., Fayazi, M., Feng, Z., Flatt, J. W., Ganesan, S. J., Goodsell, D. S., Ghosh, S., Kramer Green, R., Guranovic, V., Henry, J., Hudson, B. P., Lawson, C. L., Liang, Y., Lowe, R., Peisach, E., Persikova, I., Piehl, D. W., Rose, Y., Sali, A., Segura, J., Sekharan, M., Shao, C., Vallat, B., Voigt, M., Westbrook, J. D., Whetstone, S., Young, J. Y. & Zardecki, C. (2022). Protein Sci. 31, 187–208.
Chernyshov, I. Y., Ananyev, I. V. & Pidko, E. A. (2020). ChemPhysChem, 21, 359.
Deible, M. J., Tuguldur, O. & Jordan, K. D. (2014). J. Phys. Chem. B, 118, 8257–8263.
Derewenda, Z. S. (2023). Int. J. Mol. Sci. 24, 13165.
Derewenda, Z. S., Derewenda, U. & Kobos, P. M. (1994). J. Mol. Biol. 241, 83–93.
Derewenda, Z. S., Lee, L. & Derewenda, U. (1995). J. Mol. Biol. 252, 248–262.
D'mello, V. C., Goldsztejn, G., Mundlapati, V. R., Brenner, V., Gloaguen, E., Charnay-Pouget, F., Aitken, D. J. & Mons, M. (2022). Chem. A Eur. J. 28, e202200969.
Driver, R. W., Claridge, T. D. W., Scheiner, S. & Smith, M. D. (2016). Chem. A Eur. J. 22, 16513–16521.
Elm, J., Bilde, M. & Mikkelsen, K. V. (2013). Phys. Chem. Chem. Phys. 15, 16442–16445.
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486–501.
Eswar, N. & Ramakrishnan, C. (2000). Protein Eng. Des. Sel. 13, 227–238.
Frisch, M. J., Trucks, G. W., Schlegel, H. B., Scuseria, G. E., Robb, M. A., Cheeseman, J. R., Scalmani, G., Barone, V., Petersson, G. A., Nakatsuji, H., Li, X., Caricato, M., Marenich, A. V., Bloino, J., Janesko, B. G., Gomperts, R., Mennucci, B., Hratchian, H. P., Ortiz, J. V., Izmaylov, A. F., Sonnenberg, J. L., Williams-Young, D., Ding, F., Lipparini, F., Egidi, F., Goings, J., Peng, B., Petrone, A., Henderson, T., Ranasinghe, D., Zakrzewski, V. G., Gao, J., Rega, N., Zheng, G., Liang, W., Hada, M., Ehara, M., Toyota, K., Fukuda, R., Hasegawa, J., Ishida, M., Nakajima, T., Honda, Y., Kitao, O., Nakai, H., Vreven, T., Throssell, K., Montgomery, J. A. Jr, Peralta, J. E., Ogliaro, F., Bearpark, M. J., Heyd, J. J., Brothers, E. N., Kudin, K. N., Staroverov, V. N., Keith, T. A., Kobayashi, R., Normand, J., Raghavachari, K., Rendell, A. P., Burant, J. C., Iyengar, S. S., Tomasi, J., Cossi, M., Millam, J. M., Klene, M., Adamo, C., Cammi, R., Ochterski, J. W., Martin, R. L., Morokuma, K., Farkas, O., Foresman, J. B. & Fox, D. J. (2016). Gaussian 16 Revision C.01. Gaussian Inc., Wallingford, Connecticut, USA.
Gilli, G. & Gilli, P. (2000). J. Mol. Struct. 552, 1–15.
Gilli, P., Bertolasi, V., Ferretti, V. & Gilli, G. (1994). J. Am. Chem. Soc. 116, 909–915.
Glasstone, S. (1937). Trans. Faraday Soc. 33, 200–207.
Gu, Y. L., Kar, T. & Scheiner, S. (1999). J. Am. Chem. Soc. 121, 9411–9422.
Hameduh, T., Mokry, M., Miller, A. D., Heger, Z. & Haddad, Y. (2023). J. Chem. Inf. Model. 63, 4405–4422.
Hobza, P. & Havlas, Z. (2000). Chem. Rev. 100, 4253–4264.
Horowitz, S. & Trievel, R. C. (2012). J. Biol. Chem. 287, 41576–41582.
Isaacs, E. D., Shukla, A., Platzman, P. M., Hamann, D. R., Barbiellini, B. & Tulk, C. A. (1999). Phys. Rev. Lett. 82, 600–603.
Isaacs, E. D., Shukla, A., Platzman, P. M., Hamann, D. R., Barbiellini, B. & Tulk, C. A. (2000). J. Phys. Chem. Solids, 61, 403–406.
Jesus, A. J. de & Allen, T. W. (2013). Biochim. Biophys. Acta, 1828, 864–876.
Joseph, J. & Jemmis, E. D. (2007). J. Am. Chem. Soc. 129, 4620–4632.
Khemaissa, S., Sagan, S. & Walrant, A. (2021). Crystals, 11, 1032.
Kovalevskiy, O., Nicholls, R. A., Long, F., Carlon, A. & Murshudov, G. N. (2018). Acta Cryst. D74, 215–227.
Kozuch, S. & Martin, J. M. L. (2013). J. Chem. Theory Comput. 9, 1918–1931.
Krishna Deepak, R. N. & Sankararamakrishnan, R. (2016). Biochemistry, 55, 3774–3783.
Kříž, K. & Řezáč, J. (2022). Phys. Chem. Chem. Phys. 24, 14794–14804.
Li, A., Muddana, H. S. & Gilson, M. K. (2014). J. Chem. Theory Comput. 10, 1563–1575.
Liao, M. S., Lu, Y. & Scheiner, S. (2003). J. Comput. Chem. 24, 623–631.
Lovell, S. C., Word, J. M., Richardson, J. S. & Richardson, D. C. (2000). Proteins, 40, 389–408.
Lu, N., Elakkat, V., Thrasher, J. S., Wang, X. P., Tessema, E., Chan, K. L., Wei, R. J., Trabelsi, T. & Francisco, J. S. (2021). J. Am. Chem. Soc. 143, 5550–5557.
Majerz, I. & Olovsson, I. (2012). RSC Adv. 2, 2545–2552.
Mardirossian, N. & Head-Gordon, M. (2013). J. Chem. Theory Comput. 9, 4453–4461.
Murray-Rust, P. & Glusker, J. P. (1984). J. Am. Chem. Soc. 106, 1018–1025.
Nanda, V. & Schmiedekamp, A. (2008). Proteins, 70, 489–497.
Parchaňský, V., Kapitán, J., Kaminský, J., Šebestík, J. & Bouř, P. (2013). J. Phys. Chem. Lett. 4, 2763–2768.
Petrella, R. J. & Karplus, M. (2004). Proteins, 54, 716–726.
Ramachandran, G. N., Ramakrishnan, C. & Sasisekharan, V. (1963). J. Mol. Biol. 7, 95–99.
Samanta, U. & Chakrabarti, P. (2001). Protein Eng. Des. Sel. 14, 7–15.
Scheiner, S. (2005). J. Phys. Chem. B, 109, 16132–16141.
Scheiner, S. (2006a). Hydrogen Bonding: New Insights, edited by S. J. Grabowski, pp. 263–292. Dordrecht: Springer.
Scheiner, S. (2006b). J. Phys. Chem. B, 110, 18670–18679.
Scheiner, S. (2010). Curr. Org. Chem. 14, 106–128.
Scheiner, S., Kar, T. & Pattanayak, J. (2002). J. Am. Chem. Soc. 124, 13257–13264.
Schwalbe, C. H. (2012). Crystallogr. Rev. 18, 191–206.
Shi, L. X. & Min, W. (2023). J. Phys. Chem. B, 127, 3798–3805.
Southern, S. A. & Bryce, D. L. (2022). Solid State Nucl. Magn. Reson. 119, 101795.
Spier, A. D. & Lummis, S. C. R. (2000). J. Biol. Chem. 275, 5620–5625.
Standley, D. M., Nakanishi, T., Xu, Z., Haruna, S., Li, S., Nazlica, S. A. & Katoh, K. (2022). Biophys. Rev. 14, 1247–1253.
Steinert, R. M., Kasireddy, C., Heikes, M. E. & Mitchell-Koch, K. R. (2022). Phys. Chem. Chem. Phys. 24, 19233–19251.
Sutor, D. J. (1962). Nature, 195, 68–69.
Sutor, D. J. (1963). J. Chem. Soc. 1963, 1105–1110.
Taylor, R. & Kennard, O. (1982). J. Am. Chem. Soc. 104, 5063–5070.
Thanthiriwatte, K. S., Hohenstein, E. G., Burns, L. A. & Sherrill, C. D. (2011). J. Chem. Theory Comput. 7, 88–96.
Vermeeren, P., Wolters, L. P., Paragi, G. & Fonseca Guerra, C. (2021). ChemPlusChem, 86, 812–819.
Walker, M., Harvey, A. J. A., Sen, A. & Dessent, C. E. H. (2013). J. Phys. Chem. A, 117, 12590–12600.
Yurenko, Y. P., Zhurakivsky, R. O., Samijlenko, S. P. & Hovorun, D. M. (2011). J. Biomol. Struct. Dyn. 29, 51–65.
Zhang, Y., Deshpande, A., Xie, Z., Natesh, R., Acharya, K. R. & Brew, K. (2004). Glycobiology, 14, 1295–1302.
Zhao, Y. & Truhlar, D. G. (2008). Theor. Chem. Acc. 120, 215–241.

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
