opinions
Crystallographic searches for weak interactions – the limitations of data mining
FR Organische Chemie, Universität des Saarlandes, Stadtwald, Saarbrücken, D-66041, Germany
*Correspondence e-mail: ch12hs@rz.uni-sb.de
1. Introduction
Weak non-covalent interactions can exert a major influence on molecular structures and properties because they occur in great numbers, particularly in organic compounds. They have been aptly reviewed, mainly with respect to the frequently used method of crystallography (Desiraju & Steiner, 1999, 2001; Desiraju, 2002, and references cited therein). Typically one observes large scatter and broad histograms, e.g. for the very weak C—H⋯π bonds (Takahashi et al., 2010). Even stronger hydrogen bonds with ionic partners show up as rather broad profiles in histograms of interaction energies, E, as functions of donor–acceptor distances (Fig. 1) (Gilli et al., 2009).
2. Organic fluorine as hydrogen-bond acceptor
The possible hydrogen bond with organic fluorine is one of the most debated weak non-covalent interactions and has been reviewed several times (Schneider, 2012; Champagne et al., 2015; Dalvit & Vulpetti, 2016), including in a recent paper in this journal (Taylor, 2017). There the often-cited controversy is blamed on a misunderstanding, mainly due to the neglect of competing stronger interactions. However, the most frequently cited paper, by Dunitz & Taylor (1997), not only carries the title 'Organic Fluorine Hardly Ever Accepts Hydrogen Bonds', based on their finding of only 0.6% relevant hits in the CSD, but also claims that the weakness of this hydrogen bond is backed by calculations and is in accord with physicochemical studies and with the physical properties of fluorinated organic compounds.
The problem with many crystallographic studies is that all kinds of fluorine-containing compounds are present in the Cambridge Structural Database (CSD), and that in most of them fluorine was not introduced with the aim of enabling hydrogen bonds to fluorine. Fluorine is used very often because of its well known strong substituent effects (Smart, 2001), including effects on electrostatic potentials; it can, for example, lead to a twofold decrease of reaction rates at distances as large as 10 Å, or to corresponding 13C NMR shift changes of 0.25 p.p.m. (Schneider & Becker, 1989). That statistical analyses of the Protein Data Bank (PDB) exhibit more contacts with fluorine as acceptor (Taylor, 2017) is not really surprising: in contrast to databases such as the CSD, which include all kinds of synthetic derivatives, many compounds in the PDB have been included in the knowledge that fluorine produces special effects in reaction with proteins. A recent publication describes solid solutions of two drugs that differ only by a single fluorine atom, in which F⋯F interactions play a decisive role (de Castro Fonseca et al., 2018).
With necessary precautions, all new crystallographic and computational analyses, as well as equilibrium measurements in solution, now support the existence of weak hydrogen bonds with organic fluorine.
3. The limitations of data mining
Data mining has become a powerful tool to extract chemical information from a multitude of data sets, including crystal structures (Hautier, 2014). In particular drug discovery is widely supported by data mining (Wassermann et al., 2015; Yang et al., 2009) and is closely related to the identification of non-covalent forces for drug binding. Related approaches for the prediction of host–guest complexes are based on the use of training sets from data of known complexes and/or molecular similarity-based screening methods. Usually many descriptors are necessary for stability prediction, e.g. seven descriptors for cyclodextrin complexes (Steffen et al., 2017).
Chemistry and physics are sciences where compounds or systems can be designed which allow properties such as weak interactions to be analysed experimentally with intelligent approaches, in contrast to social sciences, economics or biology which essentially rely on the analysis of already existing systems. Thus, systematic analyses of supramolecular complexes in solution have provided consistent experimental data for all kinds of intermolecular forces, which can be used also to reliably predict stabilities of biological associations and can be compared to interactions in crystals (Schneider, 2009; Biedermann & Schneider, 2016; Hunter, 2004).
With respect to crystallographic methods, one should not forget that they measure structures, not energies, and that crystal packing is determined by a multitude of interactions (Dunitz & Gavezzotti, 2005, 2009). The use of van der Waals radii as the cut-off criterion for weak interactions has been criticized (Aakeroy et al., 1999). Empirical distance and angle relationships between large numbers of interacting aggregates have been proposed for the distinction between van der Waals and hydrogen-bonded interactions (van den Berg & Seddon, 2003). It has also been stated that the lowest-energy structure need not be present in a crystal (Dunitz & Gavezzotti, 2009). For example, a solid-state structure of a complex between a cyclophane and benzene showed the benzene ring outside the cavity (Hilgenfeld & Saenger, 1982), while an NMR study in solution showed the benzene ring inside the cavity (Wald & Schneider, 2009).
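To illustrate why a hard van der Waals cut-off can be problematic, the following minimal Python sketch classifies contacts by comparing the H⋯acceptor distance with the sum of van der Waals radii. The radii are the commonly tabulated Bondi values; the function name, the `tolerance` parameter and the example distances are assumptions made for illustration only and are not taken from any of the cited studies.

```python
# Bondi van der Waals radii in angstroms (commonly tabulated values).
BONDI_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "F": 1.47}


def within_vdw_cutoff(elem_a, elem_b, distance, tolerance=0.0):
    """Return True if the contact distance does not exceed the sum of the
    van der Waals radii of the two elements plus an optional tolerance.

    The tolerance is a hypothetical knob showing how sensitive the
    classification is to the exact threshold.
    """
    limit = BONDI_RADII[elem_a] + BONDI_RADII[elem_b] + tolerance
    return distance <= limit


# Hypothetical contact distances (angstroms), chosen only for illustration.
contacts = [("H", "F", 2.55), ("H", "F", 2.70), ("H", "O", 2.60)]

for a, b, d in contacts:
    strict = within_vdw_cutoff(a, b, d)
    relaxed = within_vdw_cutoff(a, b, d, tolerance=0.1)
    print(f"{a}...{b} at {d:.2f} A: strict={strict}, relaxed={relaxed}")
```

A contact only slightly longer than the radius sum is rejected by the strict test and accepted by the relaxed one, although nothing physical changes at that distance. In this sketch the count of 'hits' therefore depends directly on an arbitrary threshold.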
Insofar as the frequency of intermolecular contacts in crystals is used as the indicator for non-covalent interactions, the indiscriminate use of all available data in the CSD can be misleading, particularly when it comes to weak forces. Instead one can, for example, limit the search to compounds which contain only weak donors and acceptors. Desiraju et al. have studied many structures where fluorine is the only heteroatom and always found strong evidence for its propensity as an acceptor, even with the C—H bond as a donor (Desiraju, 2005; Thakur et al., 2010). Solid solutions of benzoic acid and fluorobenzoic acids show evidence of C—H⋯F hydrogen bonds, with their strength increasing with higher F content in the crystal structures (Chakraborty & Desiraju, 2018). Alternatively, one could exclude not only metal-containing structures but also those with short distances to known stronger acceptors, such as oxygen, nitrogen, etc. Such a screening would also limit the often observed scatter and large variation of distances (Fig. 1) in crystals.
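The screening proposed above might be sketched as follows. This is a minimal illustration in plain Python and not an interface to the CSD: the `Structure` and `Contact` records, the metal list, both distance cut-offs and the toy entries are hypothetical placeholders that an actual search would have to replace with curated choices.

```python
from dataclasses import dataclass, field

# Illustrative subset of metals and of "stronger" acceptors (assumptions).
METALS = {"Li", "Na", "K", "Mg", "Ca", "Fe", "Co", "Ni", "Cu", "Zn", "Pd", "Pt"}
STRONG_ACCEPTORS = {"O", "N"}


@dataclass
class Contact:
    acceptor: str    # element symbol of the acceptor atom
    distance: float  # H...acceptor distance in angstroms


@dataclass
class Structure:
    refcode: str     # hypothetical identifier
    elements: set
    contacts: list = field(default_factory=list)


def passes_screen(s, strong_cutoff=2.5):
    """Keep metal-free structures without short contacts to strong acceptors.

    strong_cutoff is a placeholder value, not a recommended criterion.
    """
    if s.elements & METALS:
        return False
    return not any(
        c.acceptor in STRONG_ACCEPTORS and c.distance <= strong_cutoff
        for c in s.contacts
    )


def fluorine_contact_fraction(structures, f_cutoff=2.67):
    """Fraction of screened structures with at least one short H...F contact."""
    screened = [s for s in structures if passes_screen(s)]
    if not screened:
        return 0.0, 0
    hits = sum(
        any(c.acceptor == "F" and c.distance <= f_cutoff for c in s.contacts)
        for s in screened
    )
    return hits / len(screened), len(screened)


# Toy entries, invented purely to exercise the filter.
toy = [
    Structure("TOY001", {"C", "H", "F"}, [Contact("F", 2.50)]),
    Structure("TOY002", {"C", "H", "F"}, [Contact("F", 2.90)]),
    Structure("TOY003", {"C", "H", "F", "O"}, [Contact("O", 1.90), Contact("F", 2.60)]),
    Structure("TOY004", {"C", "H", "F", "Cu"}, [Contact("F", 2.40)]),
]

fraction, n_screened = fluorine_contact_fraction(toy)
print(f"{n_screened} structures pass the screen; fraction with H...F: {fraction:.2f}")
```

In the toy set, the entries with a metal or with a short contact to oxygen are removed before any H⋯F contact is counted, so the resulting frequency refers only to structures in which fluorine does not compete with a stronger acceptor.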
References
Castro Fonseca, J. de, Clavijo, J. C. T., Alvarez, N., Ellena, J. & Ayala, A. P. (2018). Cryst. Growth Des. 18, 3441–3448.
Aakeroy, C. B., Evans, T. A., Seddon, K. R. & Pálinkó, I. (1999). New J. Chem. 23, 145–152.
Berg, J. van den & Seddon, K. R. (2003). Cryst. Growth Des. 3, 643–661.
Biedermann, F. & Schneider, H.-J. (2016). Chem. Rev. 116, 5216–5300.
Chakraborty, S. & Desiraju, G. R. (2018). Cryst. Growth Des. 18, 3607–3615.
Champagne, P. A., Desroches, J. & Paquin, J.-F. (2015). Synthesis, 47, 306–322.
Dalvit, C. & Vulpetti, A. (2016). Chem. Eur. J. 22, 7592–7601.
Desiraju, G. R. (2002). Acc. Chem. Res. 35, 565–573.
Desiraju, G. R. (2005). Chem. Commun. 24, 2995–3001.
Desiraju, G. R. & Steiner, T. (1999). In The Weak Hydrogen Bond in Structural Chemistry and Biology. Oxford University Press.
Desiraju, G. R. & Steiner, T. (2001). In The Weak Hydrogen Bond in Structural Chemistry and Biology. 2nd edition. Oxford University Press.
Dunitz, J. D. & Gavezzotti, A. (2005). Angew. Chem. Int. Ed. 44, 1766–1787.
Dunitz, J. D. & Gavezzotti, A. (2009). Chem. Soc. Rev. 38, 2622–2633.
Dunitz, J. D. & Taylor, R. (1997). Chem. Eur. J. 3, 89–98.
Gilli, P., Pretto, L., Bertolasi, V. & Gilli, G. (2009). Acc. Chem. Res. 42, 33–44.
Hautier, G. (2014). Top. Curr. Chem. 345, 139–179.
Hilgenfeld, R. & Saenger, W. (1982). Angew. Chem. Int. Ed. Engl. 21, 1690–1701.
Hunter, C. A. (2004). Angew. Chem. Int. Ed. 43, 5310–5324.
Schneider, H.-J. (2009). Angew. Chem. Int. Ed. 48, 3924–3977.
Schneider, H.-J. (2012). Chem. Sci. 3, 1381–1394.
Schneider, H.-J. & Becker, N. (1989). J. Phys. Org. Chem. 2, 214–224.
Smart, B. E. (2001). J. Fluor. Chem. 109, 3–11.
Steffen, A., Karasz, M., Thiele, C., Lengauer, T., Kämper, A., Wenz, G. & Apostolakis, J. (2017). New J. Chem. 31, 1941–1949.
Takahashi, O., Kohno, Y. & Nishio, N. (2010). Chem. Rev. 110, 6049–6076.
Taylor, R. (2017). Acta Cryst. B73, 474–488.
Thakur, T. S., Kirchner, M. T., Bläser, D., Boese, R. & Desiraju, G. R. (2010). CrystEngComm, 12, 2079–2085.
Wald, P. & Schneider, H.-J. (2009). Eur. J. Org. Chem. 2009, 3450–3453.
Wassermann, A. M., Lounkine, E., Davies, J. W., Glick, M. & Camargo, L. M. (2015). Drug Discovery Today, 20, 422–434.
Yang, Y., Adelstein, S. J. & Kassis, A. I. (2009). Drug Discovery Today, 14, 147–154.
© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.