opinions

STRUCTURAL SCIENCE
CRYSTAL ENGINEERING
MATERIALS
ISSN: 2052-5206

Crystallographic searches for weak interactions – the limitations of data mining

FR Organische Chemie, Universität des Saarlandes, Stadtwald, Saarbrücken, D-66041, Germany
*Correspondence e-mail: ch12hs@rz.uni-sb.de

Edited by A. Nangia, CSIR–National Chemical Laboratory, India (Received 20 April 2018; accepted 24 May 2018; online 22 June 2018)

1. Introduction

Weak non-covalent interactions can exert a major influence on molecular structures and properties because they occur in great number, particularly in organic compounds. They have been aptly reviewed, mainly with respect to the frequently used method of crystallography (Desiraju & Steiner, 1999, 2001; Desiraju, 2002, and references cited therein). Typically one observes large scatter and broad histograms, e.g. for the very weak C—H⋯π bonds (Takahashi et al., 2010). Even stronger hydrogen bonds with ionic partners show up as rather broad profiles in histograms of interaction energies, E, as functions of donor–acceptor distances (Fig. 1) (Gilli et al., 2009).

Figure 1
Large variations of donor⋯acceptor distances in hydrogen bonds. Hydrogen-bond energies E as a function of D⋯A distances; negative or positive charge-assisted hydrogen bonds are marked as (−) or (+) and ordinary hydrogen bonds as (•); colored horizontal lines at the bottom show the ranges of variation of the D⋯A distances for each type of bond from dD·A (vdW) to the shortest value dD. The dashed curve connecting the (•) points represents the particularly weak interactions with organic halogen acceptors. Reprinted with permission from Gilli et al. (2009). Copyright 2009 American Chemical Society.

2. Organic fluorine as hydrogen-bond acceptor

The possible hydrogen bond with organic fluorine is one of the most debated weak non-covalent interactions and has been reviewed several times (Schneider, 2012; Champagne et al., 2015; Dalvit & Vulpetti, 2016), including in a recent paper in this journal (Taylor, 2017). There the often-cited controversy is attributed to a misunderstanding, mainly due to the neglect of competing stronger interactions. However, the most frequently cited paper, by Dunitz & Taylor (1997), not only carries the title 'Organic Fluorine Hardly Ever Accepts Hydrogen Bonds', based on their finding of only 0.6% relevant hits in the CSD, but also claimed that the weakness of this hydrogen bond was backed by molecular orbital calculations and was in accord with physicochemical studies and with the physical properties of fluorinated organic compounds.

The problem with many crystallographic studies is that all kinds of fluorine-containing compounds are present in the Cambridge Structural Database (CSD), and that in most of them fluorine was not introduced with the aim of enabling hydrogen bonds to fluorine. Fluorine is used very often because of its well known strong substituent effects (Smart, 2001), including those on electrostatic potentials; in steroids, for example, it can lead to a twofold decrease of reaction rates at distances as large as 10 Å, or to corresponding ¹³C NMR shift changes of 0.25 p.p.m. (Schneider & Becker, 1989). That statistical analyses of the Protein Data Bank (PDB) exhibit more contacts with fluorine as acceptor (Taylor, 2017) is not really surprising: in contrast to databases such as the CSD, which include all kinds of synthetic derivatives, many compounds in the PDB have been included knowing that fluorine produces special effects in reaction with proteins. A recent publication describes solid solutions of two drugs that differ only by a single fluorine atom, in which F⋯F interactions play a decisive role (de Castro Fonseca et al., 2018).

With necessary precautions, all new crystallographic and computational analyses, as well as equilibrium measurements in solution, now support the existence of weak hydrogen bonds with organic fluorine.

3. The limitations of data mining

Data mining has become a powerful tool to extract chemical information from a multitude of data sets, including crystal structures (Hautier, 2014). Drug discovery in particular is widely supported by data mining (Wassermann et al., 2015; Yang et al., 2009) and is closely related to the identification of non-covalent forces for drug binding. Related approaches for the prediction of host–guest complexes are based on training sets built from data of known complexes and/or on molecular similarity-based screening methods. Usually many descriptors are necessary for stability prediction, e.g. seven descriptors for cyclodextrin complexes (Steffen et al., 2017).

Chemistry and physics are sciences in which compounds or systems can be designed so that properties such as weak interactions can be analysed experimentally with intelligent approaches, in contrast to the social sciences, economics or biology, which essentially rely on the analysis of already existing systems. Thus, systematic analyses of supramolecular complexes in solution have provided consistent experimental data for all kinds of intermolecular forces, which can also be used to reliably predict stabilities of biological associations and can be compared to interactions in crystals (Schneider, 2009; Biedermann & Schneider, 2016; Hunter, 2004).

With respect to crystallographic methods, one should not forget that they measure structures, not energies, and that crystal packing is determined by a multitude of interactions (Dunitz & Gavezzotti, 2005, 2009). The use of van der Waals radii as the cut-off criterion for weak interactions has been criticized (Aakeroy et al., 1999). Empirical distance and angle relationships between large numbers of interacting aggregates have been proposed for the distinction between van der Waals and hydrogen-bonded interactions (van den Berg & Seddon, 2003). It has also been stated that the lowest-energy structure need not be present in a crystal (Dunitz & Gavezzotti, 2009). For example, a solid-state structure of a complex between a cyclophane and benzene showed the benzene ring outside the cavity (Hilgenfeld & Saenger, 1982), while an NMR study in solution showed the benzene ring inside the cavity (Wald & Schneider, 2009).
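To make the criticized cut-off criterion concrete, the following minimal Python sketch applies a hard van der Waals test to a contact. The radii are the commonly quoted Bondi values; the function names, the optional tolerance and the two H⋯F distances are assumptions chosen only for illustration and are not taken from any of the cited studies.

```python
# Bondi van der Waals radii in angstrom (a commonly used set; other
# compilations give slightly different values).
BONDI_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "F": 1.47}


def vdw_sum(element_a, element_b):
    """Sum of the van der Waals radii of two elements, in angstrom."""
    return BONDI_RADII[element_a] + BONDI_RADII[element_b]


def is_contact(element_a, element_b, distance, tolerance=0.0):
    """Hard cut-off test: distance <= sum of vdW radii (+ optional tolerance)."""
    return distance <= vdw_sum(element_a, element_b) + tolerance


# Two hypothetical H...F distances that differ by only 0.10 angstrom
for d in (2.60, 2.70):
    print(f"H...F {d:.2f} A: strict={is_contact('H', 'F', d)}, "
          f"relaxed={is_contact('H', 'F', d, tolerance=0.1)}")
```

With these radii the H⋯F sum is 2.67 Å, so the 2.60 Å contact is counted and the 2.70 Å contact is rejected unless a tolerance is added. A difference of 0.10 Å, small compared with the scatter of distances observed for weak interactions, thus decides whether a contact enters the statistics at all.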

Insofar as the frequency of intermolecular contacts in crystals is used as the indicator for non-covalent interactions, the indiscriminate use of all available data in the CSD can be misleading, particularly when it comes to weak forces. Instead one can, for example, limit the search to compounds that contain only weak donors and acceptors. Desiraju et al. have studied many structures in which fluorine is the only heteroatom and always found strong evidence for its propensity as an acceptor, even with the C—H bond as a donor (Desiraju, 2005; Thakur et al., 2010). Solid solutions of benzoic acid and fluorobenzoic acids show evidence of C—H⋯F hydrogen bonds, with their strength increasing with higher F content in the crystal structures (Chakraborty & Desiraju, 2018). Alternatively, one could exclude not only metal-containing structures but also those with short distances to known stronger acceptors, such as oxygen or nitrogen. Such a screening would also limit the often observed scatter and large variation of distances (Fig. 1) in crystals.
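The two screening strategies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entry` record, the refcodes, the contact lists, the abbreviated metal list and the 2.7 Å threshold for a 'short' contact are all hypothetical, and no real database interface is used.

```python
from dataclasses import dataclass

STRONG_ACCEPTORS = frozenset({"O", "N"})
# Illustrative subset only; a real screen would need a complete list of metals.
METALS = frozenset({"Li", "Na", "K", "Mg", "Ca", "Fe", "Co", "Ni", "Cu", "Zn", "Pd", "Pt"})


@dataclass(frozen=True)
class Entry:
    refcode: str         # hypothetical identifier
    elements: frozenset  # element symbols present in the structure
    contacts: tuple      # pairs of (acceptor element, H...A distance in angstrom)


def only_weak_partners(entry):
    """Strategy 1: keep structures in which fluorine is the only heteroatom."""
    return "F" in entry.elements and entry.elements <= {"C", "H", "F"}


def passes_exclusion_screen(entry, short_cutoff=2.7):
    """Strategy 2: drop metal-containing structures and structures with a
    short contact to a stronger acceptor (short_cutoff is an assumed value)."""
    if entry.elements & METALS:
        return False
    return not any(acceptor in STRONG_ACCEPTORS and distance <= short_cutoff
                   for acceptor, distance in entry.contacts)


# Hypothetical entries, invented for this example
entries = [
    Entry("HYPO01", frozenset({"C", "H", "F"}), (("F", 2.55),)),
    Entry("HYPO02", frozenset({"C", "H", "F", "O"}), (("O", 1.95), ("F", 2.80))),
    Entry("HYPO03", frozenset({"C", "H", "F", "N", "Cu"}), (("F", 2.50),)),
    Entry("HYPO04", frozenset({"C", "H", "F", "O"}), (("O", 3.10), ("F", 2.60))),
]

strict = [e.refcode for e in entries if only_weak_partners(e)]
screened = [e.refcode for e in entries if passes_exclusion_screen(e)]
print("fluorine as only heteroatom:", strict)
print("after exclusion screen:", screened)
```

The first filter is the more restrictive one and corresponds to the structures in which fluorine is the only heteroatom. The second retains entries that contain oxygen or nitrogen as long as no short contact to them competes with the contact to fluorine.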

References

Castro Fonseca, J. de, Clavijo, J. C. T., Alvarez, N., Ellena, J. & Ayala, A. P. (2018). Cryst. Growth Des. 18, 3441–3448.
Aakeroy, C. B., Evans, T. A., Seddon, K. R. & Pálinkó, I. (1999). New J. Chem. 23, 145–152.
Berg, J. van den & Seddon, K. R. (2003). Cryst. Growth Des. 3, 643–661.
Biedermann, F. & Schneider, H.-J. (2016). Chem. Rev. 116, 5216–5300.
Chakraborty, S. & Desiraju, G. R. (2018). Cryst. Growth Des. 18, 3607–3615.
Champagne, P. A., Desroches, J. & Paquin, J.-F. (2015). Synthesis, 47, 306–322.
Dalvit, C. & Vulpetti, A. (2016). Chem. Eur. J. 22, 7592–7601.
Desiraju, G. R. (2002). Acc. Chem. Res. 35, 565–573.
Desiraju, G. R. (2005). Chem. Commun. 24, 2995–3001.
Desiraju, G. R. & Steiner, T. (1999). In The Weak Hydrogen Bond in Structural Chemistry and Biology. Oxford University Press.
Desiraju, G. R. & Steiner, T. (2001). In The Weak Hydrogen Bond in Structural Chemistry and Biology. 2nd edition. Oxford University Press.
Dunitz, J. D. & Gavezzotti, A. (2005). Angew. Chem. Int. Ed. 44, 1766–1787.
Dunitz, J. D. & Gavezzotti, A. (2009). Chem. Soc. Rev. 38, 2622–2633.
Dunitz, J. D. & Taylor, R. (1997). Chem. Eur. J. 3, 89–98.
Gilli, P., Pretto, L., Bertolasi, V. & Gilli, G. (2009). Acc. Chem. Res. 42, 33–44.
Hautier, G. (2014). Top. Curr. Chem. 345, 139–179.
Hilgenfeld, R. & Saenger, W. (1982). Angew. Chem. Int. Ed. Engl. 21, 1690–1701.
Hunter, C. A. (2004). Angew. Chem. Int. Ed. 43, 5310–5324.
Schneider, H.-J. (2009). Angew. Chem. Int. Ed. 48, 3924–3977.
Schneider, H.-J. (2012). Chem. Sci. 3, 1381–1394.
Schneider, H.-J. & Becker, N. (1989). J. Phys. Org. Chem. 2, 214–224.
Smart, B. E. (2001). J. Fluor. Chem. 109, 3–11.
Steffen, A., Karasz, M., Thiele, C., Lengauer, T., Kämper, A., Wenz, G. & Apostolakis, J. (2017). New J. Chem. 31, 1941–1949.
Takahashi, O., Kohno, Y. & Nishio, N. (2010). Chem. Rev. 110, 6049–6076.
Taylor, R. (2017). Acta Cryst. B73, 474–488.
Thakur, T. S., Kirchner, M. T., Bläser, D., Boese, R. & Desiraju, G. R. (2010). CrystEngComm, 12, 2079–2085.
Wald, P. & Schneider, H.-J. (2009). Eur. J. Org. Chem. 2009, 3450–3453.
Wassermann, A. M., Lounkine, E., Davies, J. W., Glick, M. & Camargo, L. M. (2015). Drug Discovery Today, 20, 422–434.
Yang, Y., Adelstein, S. J. & Kassis, A. I. (2009). Drug Discovery Today, 14, 147–154.

© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.
