Significant progress in predicting the crystal structures of small organic molecules – a report on the fourth blind test

Day, G.M.; Cooper, T.G.; Cruz-Cabeza, A.J.; Hejczyk, K.E.; Ammon, H.L.; Boerrigter, S.X.M.; Tan, J.S.; Della Valle, R.G.; Venuti, E.; Jose, J.; Gadre, S.R.; Desiraju, G.R.; Thakur, T.S.; van Eijck, B.P.; Facelli, J.C.; Bazterra, V.E.; Ferraro, M.B.; Hofmann, D.W.M.; Neumann, M.A.; Leusen, F.J.J.; Kendrick, J.; Price, S.L.; Misquitta, A.J.; Karamertzanis, P.G.; Welch, G.W.A.; Scheraga, H.A.; Arnautova, Y.A.; Schmidt, M.U.; van de Streek, J.; Wolf, A.K.; Schweizer, B.

doi:10.1107/S0108768109004066

feature articles

STRUCTURAL SCIENCE
CRYSTAL ENGINEERING
MATERIALS

ISSN: 2052-5206

Volume 65 | Part 2 | April 2009 | Pages 107-125

^aThe Pfizer Institute for Pharmaceutical Materials Science, University Chemical Laboratory, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, England, ^bDepartment of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742-2021, USA, ^cSchool of Pharmacy and Pharmaceutical Sciences, Purdue University, West Lafayette, Indiana, USA, ^dDipartimento di Chimica Fisica e Inorganica and INSTM-UdR, Università di Bologna, Viale Risorgimento 4, I-40136 Bologna, Italy, ^eUniversity of Pune, Ganeshkhind, Pune 411007, India, ^fSchool of Chemistry, University of Hyderabad, Hyderabad 500 046, India, ^gDepartment of Crystal and Structural Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands, ^hDepartment of Biomedical Informatics and Center for High Performance Computing, University of Utah, 155 South 1452 East Rm 405, Salt Lake City, UT 84112-0190, USA, ^iCenter for High Performance Computing, University of Utah, 155 South 1452 East Rm 405, Salt Lake City, UT 84112-0190, USA, ^jDepartamento de Física, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, Pab. I (1428), Buenos Aires, Argentina, ^kCRS4, Edificio 1, Loc. Piscinamanna, 09010 Pula (CA), Italy, ^lAvant-garde Materials Simulation Deutschland GmbH, Merzhauser Strasse 177, D-79100, Germany, ^mInstitute of Pharmaceutical Innovation, University of Bradford, Bradford BD7 1DP, England, ^nDepartment of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, England, ^oUniversity Chemical Laboratory, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, England, ^pBaker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA, ^qInstitute of Inorganic and Analytical Chemistry, University of Frankfurt, Max-von-Laue-Strasse 7, D-60438 Frankfurt am Main, Germany, and ^rOrganic Chemical Laboratory, ETH-Zurich, CH-8093 Zurich, Switzerland
^*Correspondence e-mail: gmd27@cam.ac.uk

(Received 12 December 2008; accepted 3 February 2009)

We report on the organization and outcome of the fourth blind test of crystal structure prediction, an international collaborative project organized to evaluate the present state of computational methods of predicting the crystal structures of small organic molecules. Fourteen research groups took part, using a variety of methods to generate and rank the most likely crystal structures for four target systems: three single-component crystal structures and a 1:1 cocrystal. Participants were challenged to predict the crystal structures of the four systems, given only their molecular diagrams, while the recently determined but as-yet unpublished crystal structures were withheld by an independent referee. Three predictions were allowed for each system. The results demonstrate a dramatic improvement in rates of success over previous blind tests; in total, there were 13 successful predictions and, for each of the four targets, at least two groups correctly predicted the observed crystal structure. The successes include one participating group that correctly predicted all four crystal structures as its first ranked choice, albeit at a considerable computational expense. The results reflect important improvements in modelling methods and suggest that, at least for the small and fairly rigid types of molecules included in this blind test, such calculations can be constructively applied to help understand crystallization and polymorphism of organic molecules.

Keywords: prediction; blind test; polymorph.

1. Introduction

This paper reports the results of the fourth blind test of crystal structure prediction, an international test of current methods hosted by the Cambridge Crystallographic Data Centre (CCDC), which we will refer to as CSP2007.

Crystal structure prediction (CSP) has been a long-standing goal of computational materials chemistry. The grand aim is the ability to predict, by computational methods, how a molecule will crystallize (i.e. unit cell, space group and all atomic positions), given only its chemical diagram and, perhaps, crystallization conditions. With the chemical diagram as the main input, such methods could be used even prior to the synthesis of the given molecule, leading to the possibility of the computationally led design of molecules that will crystallize with desired structural and physical properties. Alternatively, for a molecule with a known crystal structure, CSP could help assess the likelihood of as-yet undiscovered polymorphism. The latter application is the main motivation from the industrial sector (e.g. pharmaceuticals or pigments), where the unanticipated appearance of a new polymorph, with its different physical properties, can be very undesirable.

Over the past couple of decades, many methods have been developed for the purpose of CSP (Beyer et al., 2001; Verwer & Leusen, 1998) and, starting in 1999, the CCDC has organized periodic blind tests to assess the reliability of individual methods and to provide an objective picture of the status of the field. In these blind tests, a set of molecules is chosen as targets and participating research groups are challenged to predict their crystal structures, which are unknown to the predictors prior to the test. This approach allows a side-by-side comparison of the range of methods on the same set of molecules. This type of blind test is increasingly being used to monitor advances in several areas of predictive modelling, such as protein folding (Moult et al., 2007), ligand–protein binding, the prediction of solvation energies (Nicholls et al., 2008), solubilities (Llinàs et al., 2008), and physical properties of fluids (Case et al., 2007). Developments in these areas are necessarily usually tested by retrospective prediction (or `postdiction') of known properties or structures, whereas blind tests require prospective prediction of unknown data; successful prediction in such a setting is therefore more convincing.

The aims and methods used to approach CSP have the most in common with protein-structure prediction (PSP), which has also been the subject of blind assessments in the world-wide community, in the `Critical Assessment of Techniques for Protein Structure Prediction', or CASP, exercises (Moult et al., 2007). Both CSP and PSP are usually approached as problems in global energy minimization, assuming that the resulting structure is determined solely by energy. Computation of kinetics of crystallization is largely absent from current CSP methods, with only simple models of crystal growth occasionally being used to assess putative structures (Anghel et al., 2002; Day & Price, 2003; Coombes et al., 2005). Protein-folding kinetics have also been addressed in PSP (Khalili et al., 2006), but the main focus in both communities has been on locating the lowest-energy structures on the complex energy surface. This energy-based approach requires a high quality potential function and a good global optimization procedure. In recent years, PSP has started to emphasize free energy (Brooks III et al., 2001; Liwo et al., 2007) and the calculation of both structure and thermodynamic properties, whereas CSP has largely focused on structure determination based on potential energy. Lattice dynamics (Anghel et al., 2002; van Eijck, 2001; Day et al., 2005) and molecular dynamics (Karamertzanis et al., 2008; Raiteri et al., 2005) simulations are only occasionally used to evaluate free energies in CSP. Computational efforts in both communities also make use of some experimental information to guide the modelling: PSP often makes use of the structures of homologous proteins, whereas CSP calculations are frequently guided by space-group statistics from the Cambridge Structural Database (Allen, 2002).
Occasionally, the results of an energy-based search are biased using a synthon approach, where re-ranking of the low-energy structures is based on the absence or presence of commonly occurring structural motifs in the crystal structures of similar molecules (Dey et al., 2005, 2006). It can also be tempting to introduce a subjective assessment of structural features in the ranking of putative crystal structures (Day & Motherwell, 2006). There is clearly much room for variation in methods and, therefore, the need to compare them side-by-side as developments are made and new approaches are tested.

Both CSP (in the CCDC blind test exercises) and PSP (in the CASP exercises) have been assessed by blind prediction tests in the world-wide community at regular intervals – every 2–3 years, in the case of crystal structure prediction. Based on the results of these tests, structure prediction in PSP (Oldziej et al., 2005; Borreguero & Skolnick, 2007) has improved considerably in recent years, more so than CSP did over its first three tests (Lommerse et al., 2000; Motherwell et al., 2002; Day et al., 2005). This paper reports on the results of the fourth crystal structure prediction blind test.

2. Organization and approach

The organization of this latest blind test, CSP2007, was in most aspects the same as the first three such evaluations of the field, which have been published: CSP1999 (Lommerse et al., 2000); CSP2001 (Motherwell et al., 2002); CSP2004 (Day et al., 2005). Invitations to participate were sent to 23 research groups known to be active in the field. This year, it was felt that, with a growing community working towards crystal structure prediction, the blind test should be open to participation from anyone making developments in the field. Therefore, the test was advertised through the newsletters and websites of crystallographic associations so that interested groups could contact the organizers and take part. In the end, 14 research groups participated.

The previous blind tests put forward targets for prediction in the following three categories:

(1) small, rigid molecules; only the elements C, H, N and O; fewer than ca 25 atoms;
(2) rigid molecules of up to ca 30–40 atoms, containing elements or functional groups that present a challenge for modelling methods;
(3) molecules with several degrees of conformational flexibility, usually the rotation about exocyclic single bonds.

Molecules fitting these three categories have been included in CSP2007. Furthermore, with increasing interest within the crystal engineering community in the structures of multicomponent crystals – salts, solvates and cocrystals – an additional category was added to the current test:

(4) a two-component crystal of rigid molecules.

This new fourth category specifically tests methods for sampling packing space with more than one independent molecule, a challenge that was introduced in CSP2004 by allowing the possibility of Z′ > 1 crystal structures in categories 1–3. In fact, the inability of many search methods to consider more than one independent molecule contributed to the lack of prediction success for the small rigid molecule (XI) in CSP2004 (Day et al., 2005). With the new category specifically testing methods for multiple independent molecules in this blind test, restrictions were reintroduced for categories 1–3: the crystal structures could be in any space group, but must have only one independent molecule (Z′ ≤ 1).

Crystallographers were contacted with a request for unpublished structures and suitable candidates were sent to an independent referee (Professor A. L. Spek, Laboratory of Crystal and Structural Chemistry, Utrecht University) who checked that they conformed to our criteria. To be suitable, a crystal structure had to be of high quality with all atoms located. After considerable effort, we collected one candidate for category 1, three for category 2, four for category 3 and three for category 4. Chemical diagrams of all candidates were then presented to an independent colleague (Dr Sijbren Otto, University Chemical Laboratory, Cambridge), who agreed to choose one target from each of the categories. The molecular diagrams and crystallization conditions, as shown in Table 1, were sent by email to all participants on 16 January 2007. Following the numbering used in the previous blind tests, we refer to these molecules by the Roman numerals (XII)–(XV).

Table 1
Diagrams and crystallization conditions for the molecules of CSP2007

Molecule		Crystallization conditions
(XII)		Grown from the melt by laser heating methods, T = 178 K
(XIII)		Crystallized from acetonitrile
(XIV)		Crystallized overnight by diethyl ether/hexane diffusion
(XV)		1:1 cocrystal, crystallized by slow evaporation from ethanol

We kept the format the same as in previous blind tests, allowing each participating group to submit three predictions for each system. Participants were asked to send their predictions for each molecule to Professor Spek, who held the experimentally determined crystal structures throughout the test. As well as the three `official' predictions, analysis of extended lists of the crystal structures generated by each group can provide useful insight into the performance of the methods (van Eijck, 2005). Therefore, participants were encouraged to submit longer lists of their predicted structures, separately from their `official' three, but preferably in ranked order. The deadline for submissions was 20 July 2007 and the experimentally determined crystal structures of all four systems were circulated to each participant on 23 July, for post-analysis of their predictions. A workshop was held at the Cambridge Crystallographic Data Centre in September 2007 to discuss the results.

3. Methodologies

Details of the methods used by the 14 participating research groups vary significantly, although most do involve three general steps:

(i) calculating three-dimensional molecular structures from the chemical diagrams;
(ii) searching the crystal packing phase space for the possible crystal packings;
(iii) assessing the generated structures to rank them in order of likelihood of formation.

Dividing the methods into these steps is mainly to aid discussion, as the steps do overlap in some methods. For example, the structure generation step often involves calculating and locally minimizing lattice energies, with the final energies used to rank the structures; in this case, steps (ii) and (iii) are not independent.
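The overlap between steps (ii) and (iii) can be illustrated with a deliberately small sketch. It is not any participant's method: the one-dimensional function `toy_lattice_energy` is a hypothetical stand-in for a lattice-energy surface, and the names, step size and tolerances are assumptions chosen only for illustration. Random trial points are generated, each is locally minimized, duplicates are merged, and the surviving minima are ranked by their final energies.

```python
import random


def toy_lattice_energy(x):
    """Hypothetical 1D 'energy surface' with two minima (not a real lattice energy)."""
    return (x ** 2 - 1.0) ** 2 + 0.3 * x


def local_minimize(x, step=1e-3, iters=5000):
    """Crude gradient descent using a central-difference derivative."""
    h = 1e-6
    for _ in range(iters):
        grad = (toy_lattice_energy(x + h) - toy_lattice_energy(x - h)) / (2 * h)
        x -= step * grad
    return x


def search_and_rank(n_trials=50, seed=0, tol=1e-2):
    """Generate random trial points, minimize each, merge duplicates, rank by energy."""
    rng = random.Random(seed)
    minima = []
    for _ in range(n_trials):
        x0 = rng.uniform(-2.0, 2.0)      # step (ii): a random trial 'structure'
        x = local_minimize(x0)           # local energy minimization
        if not any(abs(x - m) < tol for m in minima):
            minima.append(x)             # keep only distinct minima
    # step (iii): rank the distinct minima by their minimized energy
    return sorted(minima, key=toy_lattice_energy)


ranked = search_and_rank()
for rank, x in enumerate(ranked, start=1):
    print(rank, round(x, 3), round(toy_lattice_energy(x), 3))
```

The same minimized energies serve both to identify distinct structures and to order them, which is why the two steps cannot be fully separated in methods of this kind.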

A brief discussion of the methods used in the latest blind test is provided here in the main body of the paper, and a summary of some key details for each participant is provided in Table 2, along with key references for most of the methods. For more detailed methodological descriptions, which were provided by many of the participants, the reader should refer to the supplementary material and references provided in the footnotes to Table 2.¹

Table 2
Summary of methodologies

(a)
Contributor	Molecules attempted	Programs	Refs	Search generation	Space groups considered
Ammon	(XII)–(XV)	MOLPAK, DMAREL	(a)	Grid-based systematic	P1, P2₁/c, $P\bar{1}$, P2₁, P2₁2₁2₁, P2₁2₁2, C2/c, Pbca, Pbcn, Pna2₁, Pca2₁, C2, Cc
Boerrigter, Tan	(XII)–(XV)	CERIUS2 Polymorph Predictor	(b)	Monte Carlo simulated annealing	P2₁/c, $P\bar{1}$, P2₁2₁2₁, P2₁, C2/c, Pbca, Pnma, Pbcn, Pna2₁, P1, Cc
Day, Cooper, Cruz Cabeza, Hejczyk	(XII)–(XV)	Crystal Predictor, CERIUS2 OFF, DMAREL	(c)	Structures generated using a low-discrepancy Sobol' sequence	(XII)–(XIV) P1, P2₁/c, $P\bar{1}$, P2₁, P2₁2₁2₁, C2/c, Pbca, Pbcn, Pna2₁, Pca2₁, Pnma, P2₁2₁2, Cc, C2, C2/m, Pc, P2₁/m, P2/c, Pccn, $R\bar{3}$
					(XV) P1, P2₁/c, $P\bar{1}$, P2₁, P2₁2₁2₁, C2/c, Pbca, Pbcn, Pna2₁, Pnma, Cc, C2/m, Pc
Della Valle, Venuti	(XII), (XIII), (XV)	Xfind, WMIN, IONIC, PLATON	(d)	Structures generated using a low-discrepancy Sobol' sequence	Z′ = 1: P1, $P\bar{1}$, P2₁, P2₁/c, C2/c, P2₁2₁2₁, Pna2₁, Pbca, Pnma; Z′ = 2: P1
Desiraju, Thakur	(XIII)–(XV)	CERIUS2 Polymorph Predictor	(e)	Monte Carlo simulated annealing	P1, P2₁/c, $P\bar{1}$, P2₁, P2₁2₁2₁, P2₁2₁2, C2/c, Pbca, Pbcn, Pna2₁, Pca2₁, C2, Cc, C2/m, P2₁/m, P2/c, Pnma
van Eijck	(XII)–(XV)	UPACK, XTINKER	(f)	Randomly generated starting structures	P1, P2₁/c, $P\bar{1}$, P2₁, P2₁2₁2₁, C2/c, Pbca, Pbcn, Pna2₁, Pca2₁, Cc, C2, Pc
Facelli, Bazterra, Ferraro	(XII)–(XV)	MGAC	(g)	Modified genetic algorithm	P1, $P\bar{1}$, P2₁, C2, Pc, Cc, P2₁/c, C2/c, P2₁2₁2₁, Pca2₁, Pna2₁, Pbcn, Pbca, Pnma
Hofmann	(XII)–(XIV)	FlexCryst	(h)	Random search with calibrated cell	P2₁/c, $P\bar{1}$, P2₁, P2₁2₁2₁, C2/c, Pbca, Pbcn, Pna2₁, Pca2₁, Cc, C2, P2₁/m
Jose, Gadre	(XII)–(XV)	GA-CG-MTA algorithm for crystal structure prediction	(i)	Genetic algorithm	P1, P2₁/c, $P\bar{1}$, P2₁, P2₁2₁2₁, Pbca, Pbcn, Pnma, Pna2₁, Pca2₁, Cc, C2, Pc
Neumann, Leusen, Kendrick	(XII)–(XV)	GRACE1.0 and VASP	(j)	Random search with molecular flexibility	All 230 space groups searched
Price, Misquitta, Karamertzanis, Welch	(XII)–(XV)	MOLPAK or Crystal Predictor, DMAREL, DMAflex, CamCASP	(k)	Grid-based systematic or using Sobol' sequence	P1, P2₁/c, $P\bar{1}$, P2₁, P2₁2₁2₁, P2₁2₁2, C2/c, Pbca, Pbcn, Pna2₁, Pca2₁, C2, Cc
Scheraga, Arnautova	(XII)–(XIV)	CRYSTALG	(l)	Conformation-family Monte Carlo (CFMC)	No symmetry information used – P1 with varying Z (= 2, 4, 8)
Schmidt, van de Streek, Wolf	(XII)–(XV)	CRYSCA	(m)	Randomly generated starting structures	(XII): P2₁/c, C2/c, P2₁, $P\bar{1}$, P2₁2₁2₁, Pbca
					(XIII): P2₁/c, C2/c, P2₁, $P\bar{1}$, P2₁2₁2₁, Pbca, P1, Pna2₁, Pca2₁
					(XIV): P2₁/c, C2/c, P2₁, $P\bar{1}$, Pbca
					(XV): P2₁/c, C2/c, P2₁, $P\bar{1}$, P2₁2₁2₁, Pbca, P1, Pna2₁, Pca2₁, Cc
Schweizer	(XII), (XIII), (XV)	ZIP-PROMET, PIXEL	(n)	Stepwise construction of dimers and layers	$P\bar{1}$, P2₁, P2₁/c, C2/c, P2₁2₁2₁, Pbca

(b)
Contributor	Molecular model	Lattice energy/fitness function: electrostatic	Lattice energy/fitness function: other	Other criteria used to select submissions
Ammon	Rigid throughout	Atomic multipoles	Empirical exp-6
Boerrigter, Tan	Rigid throughout	Electrostatic potential derived core-shell model using the CS-RQ method	Dreiding exp-6 force field
Day, Cooper, Cruz Cabeza, Hejczyk	(XII), (XIII), (XV) rigid throughout; (XIV) partly flexible during energy minimization	Atomic multipoles	Empirical exp-6 [(XII), (XIV), (XV)]; specifically fitted anisotropic exp-6 (XIII)	Free energy [(XII), (XIII)]
Della Valle, Venuti	Rigid throughout	Atomic charges	Empirical exp-6	Free energy
Desiraju, Thakur	Rigid for search, flexible for energy minimization	Atomic charges	COMPASS force field	Assessment of packing (XIV) and synthon-based re-ranking (XV)
van Eijck	Flexible throughout	Atomic multipoles [for (XIII) charges only]	Empirical exp-6
Facelli, Bazterra, Ferraro	Flexible throughout	Atomic charges	GAFF 6-12
Hofmann	Rigid throughout	Trained potentials
Jose, Gadre	Flexible throughout	CG–MTA ab initio energy HF/STO-3G
Neumann, Leusen, Kendrick	Flexible throughout	Plane-wave density functional theory supplemented by an empirical C₆R⁻⁶
Price, Misquitta, Karamertzanis, Welch	Rigid for search, some flexibility during energy minimization for (XIV)	Atomic multipoles	Empirical exp-6 [(XII), (XIV), (XV)]; non-empirically derived anisotropic exp-6 (XIII)	Choice 2 and 3 considered properties and motif
Scheraga, Arnautova	Rigid throughout	Atomic charges	Empirical exp-6: W99 (XII); ECEPP-05 (XIII) with specifically fitted halogen parameters; ECEPP-05 (XIV)
Schmidt, van de Streek, Wolf	(XII), (XIII) rigid; (XIV), (XV) some flexibility throughout	Atomic charges	Empirical exp-6
Schweizer	Rigid throughout	Energy minimization with exp-6 UNI potential and energy calculations with pixel-based method
References: (a) Holden et al. (1993), Busing (1981); (b) Verwer & Leusen (1998), Tan et al. (2009); (c) Karamertzanis & Pantelides (2005), Day, Motherwell & Jones (2005), Cooper et al. (2007), Cruz Cabeza et al. (2006), Day et al. (2007); (d) Della Valle et al. (2008), Brillante et al. (2008), Busing (1981), Signorini et al. (1991), Spek (2003); (e) Sarma & Desiraju (2002), Dey et al. (2005, 2006); (f) Mooij et al. (1999), van Eijck & Kroon (2000), van Eijck (2001, 2002); (g) Bazterra et al. (2007); (h) Hofmann & Lengauer (1997), Hofmann & Apostolakis (2003), Hofmann & Kuleshova (2005); (i) unpublished method – see supplementary material; (j) Neumann & Perrin (2005), Neumann et al. (2008), Neumann (2008); (k) Holden et al. (1993), Willock et al. (1995), Karamertzanis & Pantelides (2005), Karamertzanis & Price (2006), Karamertzanis & Pantelides (2007), Misquitta & Stone (2007), Misquitta et al. (2008), Price (2008); (l) Pillardy et al. (2001); (m) Schmidt & Englert (1996), Schmidt & Kalkhof (1997); (n) Gavezzotti (1999–2000), Gavezzotti (2004).

3.1. Methods of generating the molecular structure

The molecular structure that is used as the building block in the crystal structure search is usually derived from a force field or quantum-mechanical electronic structure calculation, and little effort has gone into refining this step since the previous blind tests. For rigid molecules there have rarely been failures in crystal structure prediction that are due to a poor choice of starting molecular structure. Many of the methods treat the resulting molecular structure as rigid throughout the remainder of the calculations, assuming that crystal packing forces are too small to significantly distort the molecular geometry. Other methods allow intramolecular degrees of freedom to vary during the search and/or final energy minimizations.

3.2. Generating trial crystal structures

Many approaches have been proposed to search the energy landscape for the lowest-energy crystal structures. Amongst the participants in this blind test, the most popular method was to generate large numbers of structures with random or quasi-random values for crystal structure variables (unit-cell parameters, positions and orientations of the molecules). Variations on the random search were used by six of the 14 groups. The others applied a variety of methods: Monte Carlo types of search (three groups); genetic algorithms (two groups); systematic grid-based searches (two groups) and Gavezzotti's PROM approach (one group), which involves the stepwise construction of crystal structures from the most promising dimers, chains and layers.
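As an illustration of quasi-random sampling of crystal structure variables, the sketch below maps a low-discrepancy sequence onto the six unit-cell parameters. It uses a Halton sequence, which is simpler to write out than the Sobol' sequences listed in Table 2, so it should be read as a stand-in for the general idea rather than a reproduction of any participant's generator. The parameter ranges and the function names are assumptions made for this example; a real search would also sample molecular positions and orientations and would reject cells whose angles do not define a valid lattice.

```python
def van_der_corput(index, base):
    """Radical inverse of a positive integer in the given base; value lies in [0, 1)."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def halton_point(index, bases=(2, 3, 5, 7, 11, 13)):
    """One six-dimensional Halton point (one prime base per dimension)."""
    return [van_der_corput(index, b) for b in bases]


def trial_cell(index, length_range=(3.0, 30.0), angle_range=(60.0, 120.0)):
    """Map a Halton point to trial cell lengths (Å) and angles (degrees).

    The ranges are hypothetical and chosen only for illustration.
    """
    u = halton_point(index)
    lo, hi = length_range
    alo, ahi = angle_range
    a, b, c = (lo + (hi - lo) * x for x in u[:3])
    alpha, beta, gamma = (alo + (ahi - alo) * x for x in u[3:])
    return {"a": a, "b": b, "c": c, "alpha": alpha, "beta": beta, "gamma": gamma}


cells = [trial_cell(i) for i in range(1, 6)]
for cell in cells:
    print({k: round(v, 2) for k, v in cell.items()})
```

Compared with independent pseudo-random numbers, successive points of a low-discrepancy sequence fill the sampled ranges more evenly, which is the usual motivation for using them in a search of this kind.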

Many groups made use of space-group symmetry to guide their search, with most focusing on a set (ranging from 4 to about 20) of the most commonly adopted space groups for organic molecules with Z′ = 1. Only one group (Neumann, Leusen and Kendrick) considered all 230 space groups. The alternative approach is to generate P1 unit cells with varying numbers of total molecules in the unit cell. This strategy was employed by two groups: Scheraga and Arnautova generated P1 structures with two, four and eight molecules in the unit cell, locating space-group symmetry in the resulting structures after energy minimization using the CRYCOM program (Dzyabchenko, 1994). Della Valle and Venuti performed P1 searches with one and two independent molecules as well as Z′ = 1 searches in the common space groups.

3.3. Ranking of structures

The final ranking of structures was usually based on calculated lattice energies of the structures generated by the crystal structure search. Therefore, most of the variability in the ranking of structures results from different choices of model for the crystal energies (Table 2). Two groups went beyond the static lattice-energy approach and included lattice-dynamics contributions to the free-energy differences between structures. New methods of evaluating energies have been introduced in this blind test, including atom–atom potentials derived purely from molecular quantum-mechanical calculations and the direct application of quantum-mechanical electronic structure calculations to the crystal structures.
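Several of the models in Table 2 combine an empirical exp-6 repulsion–dispersion term with atomic point charges. A minimal sketch of such an atom–atom energy, for a single pair of rigid molecules, is given below. The functional form A exp(−Br) − C/r⁶ plus a Coulomb term is the generic textbook form; the function names, the data layout and all numerical parameters in the example are hypothetical and are not taken from any participant's force field. A true lattice energy would additionally sum over periodic images with appropriate convergence techniques, which this sketch omits.

```python
import math

# e^2 / (4*pi*eps0) expressed in kJ mol^-1 Å, for charges in units of e
COULOMB_CONSTANT = 1389.35


def exp6_energy(r, a, b, c):
    """Buckingham (exp-6) repulsion-dispersion energy at separation r (Å)."""
    return a * math.exp(-b * r) - c / r ** 6


def coulomb_energy(r, q1, q2):
    """Point-charge electrostatic energy in kJ/mol for charges in units of e."""
    return COULOMB_CONSTANT * q1 * q2 / r


def intermolecular_energy(mol_a, mol_b, params):
    """Sum atom-atom terms between two molecules.

    Each molecule is a list of (atom_type, charge, (x, y, z)) tuples;
    params maps a frozenset of atom types to hypothetical (A, B, C) values.
    """
    total = 0.0
    for type_i, q_i, pos_i in mol_a:
        for type_j, q_j, pos_j in mol_b:
            r = math.dist(pos_i, pos_j)
            a, b, c = params[frozenset((type_i, type_j))]
            total += exp6_energy(r, a, b, c) + coulomb_energy(r, q_i, q_j)
    return total


# Illustrative only: two neutral single-site 'molecules' with made-up parameters.
example_params = {frozenset(("X",)): (0.0, 1.0, 64.0)}
mol_1 = [("X", 0.0, (0.0, 0.0, 0.0))]
mol_2 = [("X", 0.0, (2.0, 0.0, 0.0))]
print(intermolecular_energy(mol_1, mol_2, example_params))
```

Because the ranking depends on small energy differences between candidate structures, the choice of parameters and of electrostatic model (charges versus multipoles) can change the order of the low-energy structures.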

Aside from methods based purely on potential or free energies, there were attempts to include other criteria in the ranking of crystal structures. The cocrystal (XV) was the most attractive target for non-energetic assessment, as it is the only molecule in this blind test with the possibility of strong hydrogen bonding. Hydrogen-bond analysis of the structures of cocrystals of similar molecules could therefore be used to assess the hydrogen bonding in predicted crystal structures.

3.4. Treatments of the molecular flexibility in (XIV) and the independent molecules in (XV)

We summarize the various methods of treating the flexible molecule (XIV) in Table 3, adopting the nomenclature shown in Table 1 for the torsion angles. The search strategies for the cocrystal (XV) are also summarized.

Table 3
Summary of conformational treatment of molecule (XIV) and approach taken for generating cocrystal structures for (XV)

Contributor	Conformational treatment of (XIV)	Treatment of intramolecular energy in (XIV)	Search strategy used for target (XV)
Ammon	The gas phase minimum was used and kept rigid throughout	None	Searched separately with all four dimers as building blocks
Boerrigter, Tan	The gas phase (B3LYP/cc-PVDZ) minimum was used	Dreiding/X6 force field	Both molecules treated independently. Both conformations of the acid were considered
Day, Cooper, Cruz Cabeza, Hejczyk	Searches were carried out with 12 starting conformations, varying ω (∠CNC) and τ₁ (∠CNCN). These two angles were allowed to optimize during lattice energy minimization	Taken from a separate B3LYP/6-31G** calculation on the final optimized conformer in each crystal structure	Both molecules treated independently. Both conformations of the acid were considered
Della Valle, Venuti	(XIV) was not attempted	(XIV) was not attempted	Both molecules treated independently and these searches were supplemented by searches using dimers as building blocks
Desiraju, Thakur	Searches were carried out with seven rigid conformations, varying ω (∠CNC)	Dreiding force field	Searched separately with the two lowest-energy dimers as building blocks
van Eijck	Standard starting geometry for optimized values of ω, τ₁ and τ₂; random values for the two methyl dihedrals	Specifically 6-31G* derived torsional potentials	Both molecules treated independently. Only one starting conformation for the acid group was considered
Facelli, Bazterra, Ferraro	ω, τ₁ and τ₂ were searched within the genetic algorithm, along with cell parameters and molecular positions	GAFF (Generic Amber Force Field)	Both molecules treated independently
Hofmann	The gas phase (DMol3 pwc/dnp) minimum was used and kept rigid throughout	Only one conformation was considered	(XV) was not attempted
Jose, Gadre	Torsion angles were varied within the search	The Hartree–Fock energies include both inter- and intramolecular energies	Searched separately with the three lowest-energy dimers as building blocks
Neumann, Leusen, Kendrick	Conformational freedom is searched automatically within the crystal structure generation	The intramolecular energy is part of the total DFT energy	Both molecules treated independently
Price, Misquitta, Karamertzanis, Welch	Searches were carried out with ten starting conformations, varying ω (∠SCNC) and τ₁ (∠CNCN). These angles were later allowed to optimize during DMAflex lattice energy minimization	Calculated from MP2/6-31G** calculations on the conformation in the DMAflex minimized crystal structure	Searched separately with all four dimers as building blocks (both molecules were treated independently in a Crystal Predictor search which was not completed by the blind test deadline, see supplementary material)
Scheraga, Arnautova	Searches were carried out with nine rigid conformations, varying ω (∠SCNC) in the range 60–85° with the remaining torsional angles optimized for the isolated molecule	Taken from DFT/6-31G** calculated energy of the relevant conformation	(XV) was not attempted
Schmidt, van de Streek, Wolf	Starting geometry taken from a HF/6-31G** optimization. Torsion angles were allowed to vary during the search and minimization	Six-term cosine series fitted to the HF/6-31G** calculated energy surface	Searched separately with two dimer structures as building blocks, with flexibility in the relative orientations and conformation of acid
Schweizer	(XIV) was not attempted	(XIV) was not attempted	Searched separately with two of the dimer structures as building blocks

4. Results

This paper is accompanied by a large amount of supplementary material: the coordinates of the experimental structures, lists of predicted structures by each participant, as well as detailed descriptions of methodology, results and post-analysis by most of the participating research groups. Before discussing the results of the predictions, the crystal packing in the X-ray determined crystal structures of the four systems is described.

4.1. Experimental structures

4.1.1. Molecule (XII)

Acrolein (C₃H₄O), or 2-propenal, was chosen as the blind test target for category (1). Acrolein melts at ∼186 K (Timmermans, 1922), so crystal growth was performed in situ at 178 K by laser-assisted zone refinement (Boese & Nussbaumer, 1994). The crystal structure was solved from X-ray diffraction data at 150 K and the molecule was found to crystallize in the orthorhombic space group Pbca with Z′ = 1 (Forster et al., 2007), with the molecule adopting the energetically favourable s-trans conformation. The lack of conventional hydrogen-bond donor groups means that the crystal structure is determined by weak interactions: each molecule makes six short C—H⋯O contacts with neighbouring molecules (Fig. 1), three as a C—H donor and three around the acceptor O atom. These contacts form a three-dimensional network through the crystal.

Figure 1
Packing diagrams of the crystal structure of molecule (XII). Grey = carbon, white = hydrogen, red = oxygen. C—H⋯O contacts (with R_O⋯H shorter than the sum of van der Waals radii) are indicated as blue lines.

4.1.2. Molecule (XIII)

A halogenated molecule, 1,3-dibromo-2-chloro-5-fluorobenzene (C₆H₂Br₂ClF), was selected for category (2), as a test of challenging atom types for simulations. Molecule (XIII) was crystallized from acetonitrile and the structure was solved from X-ray diffraction data at T = 173 K (Britton, 2008). The molecule crystallizes with Z′ = 1 in the space group P2₁/c. With three different halogens in the molecule, there are many possible types of halogen–halogen contacts, which are expected to be crucial in determining the crystal structure. There are both Br⋯Br and F⋯F close contacts in the observed structure (Fig. 2), while the closest intermolecular contact with the chlorine involves an H atom with an unremarkable Cl⋯H distance of 3.04 Å. Br atoms interact in quartets, with each C—Br bond pointing to the side of one other bromine atom (Fig. 2, left). There are two independent Br⋯Br close contact distances of 3.55 and 3.63 Å; the C—Br⋯Br angles are 101.7 and 175.5° around the shorter contact, and 87.4 and 169.8° around the longer contact. F atoms form nearly head-to-head close contacts between coplanar molecules with an F⋯F distance of 2.87 Å. There is offset face-to-face stacking of the aromatic molecules along the crystallographic direction a, while the molecules make tilted edge-to-face and edge-to-edge contacts in the b and c directions.

Figure 2
Packing diagrams of the crystal structure of molecule (XIII). Grey = carbon, white = hydrogen, yellow = fluorine, green = chlorine, brown = bromine. Short atom–atom contacts (with interatomic separation shorter than the sum of van der Waals radii) are indicated as blue lines.

4.1.3. Molecule (XIV)

N-(Dimethylthiocarbamoyl)benzothiazole-2-thione (C₁₀H₁₀N₂S₃) was crystallized by diethyl ether/hexane diffusion and the crystal structure was determined from X-ray diffraction data at T = 150 K (Blake et al., 2007). The molecules pack with P2₁/c space-group symmetry.

The conformational flexibility can be defined by three exocyclic torsion angles (Table 1), as well as rotation of the methyl groups, whose orientations are unlikely to be important in the crystal packing. One of these torsions (ω in Table 1) defines the angle of the thioformamide group out of the plane of the rings and the other two (τ₁ and τ₂ in Table 1) describe the orientation and planarity around the N atom; if the dimethylamine group is assumed to be planar, then the conformational flexibility can be reduced to two torsion angles. In the observed structure, the five heavy atoms of the thioformamide group (SCNC₂) are almost perfectly planar and nearly perpendicular to the benzothiazole plane (Fig. 3). The angle between mean planes of the thioformamide (SCNC₂) and benzothiazole (C₇NS) is 79.2°. There is a lack of hydrogen-bond donors in the molecule and almost all close intermolecular atom–atom contacts (i.e. shorter than the sum of van der Waals radii) are between S and H atoms.

Figure 3
Crystal structure of molecule (XIV). Grey = carbon, white = hydrogen, blue = nitrogen, yellow = sulfur. Short contacts (with interatomic separation shorter than the sum of van der Waals radii) are indicated as blue lines.

4.1.4. Target (XV)

The new category for this blind test was defined as a two-component crystal and the chosen target was the cocrystal formed between 2-amino-4-methylpyrimidine and 2-methylbenzoic acid. A 1:1 cocrystal was formed by slow evaporation of an ethanol solution and the crystal structure was solved from X-ray diffraction data at 203 K (Aakeröy, 2007). The prediction of what stoichiometry would form between a given pair of molecules was left as a future challenge (Cruz-Cabeza et al., 2008) and, for this blind test, participants were given the observed stoichiometry as the starting information. Participants were also told that the molecules crystallize as a cocrystal, not a salt (although it is worthy of note that many pairs of similar molecules do crystallize as salts, with proton transfer from the carboxylic acid to the pyrimidine; Aakeröy et al., 2003).

The molecules form nearly linear S-shaped hydrogen-bonded tetramers in the crystal structure (Fig. 4, left), with double hydrogen bonds between acid groups and the aminopyrimidine moiety. These acid–pyrimidine pairs are linked by two N—H⋯N hydrogen bonds between aminopyrimidines, which form over crystallographic centers of inversion in the P2₁/n structure. These tetramers are nearly planar, with a 0.74 Å offset between root-mean-square (r.m.s.) planes of pyrimidine rings, and a 4.6° angle between mean planes of the pyrimidine and benzoic acids. The tetramers pack in a herringbone motif, with both face-to-face and tilted edge-to-face arene-arene interactions (Fig. 4, right).

Figure 4
Crystal structure of target (XV). Grey = carbon, white = hydrogen, red = oxygen, blue = nitrogen. Hydrogen bonds indicated by thin blue lines. The unbonded H atoms in the left figure indicate the two-site disorder in the 2-amino-4-methylpyrimidine methyl-group orientation.

There is two-site disorder of the pyrimidine methyl H atoms in the observed crystal structure, which was judged as unimportant in the selection of this target for the blind test. The observed disorder indicates that the methyl-group orientation has little effect on the energy of the crystal and H-atom positions are ignored in our comparison of predicted and observed structures, described below.

4.2. Comparison of the predictions with the experimental structures

We compared the submitted predictions with the experimentally determined crystal structures using the COMPACK algorithm (Chisholm & Motherwell, 2005; although the default in COMPACK is a 15-molecule cluster, we use a 16-molecule cluster here, to be consistent with the comparisons made in CSP2004), which compares the molecular packing environment in crystal structures. The experimentally determined crystal structure is represented by the interatomic distances between a molecule and its coordination shell of closest neighbouring molecules – here we choose 15 – and this set of distances is searched for in the predicted structures. If the distances match to within specified tolerances, then the coordination spheres are overlaid and a root-mean-squared deviation (RMSD₁₆) in atomic positions is calculated for all 16 molecules. We ignore H-atom positions in this comparison, because of the uncertainty in their positions in X-ray determined crystal structures.
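The final step of this comparison, the r.m.s. deviation over non-H atoms, can be sketched as below. This is not the COMPACK implementation: the sketch assumes that the two clusters have already been matched atom-for-atom and overlaid, and the function name and input layout are hypothetical.

```python
import math


def rmsd_heavy_atoms(reference, predicted):
    """R.m.s. deviation in atomic positions (Å), ignoring H atoms.

    Both arguments are lists of (element, x, y, z) tuples in the same atom
    order. ASSUMPTION: the clusters are already matched and overlaid; the
    distance search and overlay performed by COMPACK are not reproduced here.
    """
    if len(reference) != len(predicted):
        raise ValueError("clusters must contain the same number of atoms")
    squared_sum = 0.0
    count = 0
    for ref_atom, pred_atom in zip(reference, predicted):
        if ref_atom[0] != pred_atom[0]:
            raise ValueError("atom ordering differs between the clusters")
        if ref_atom[0] == "H":
            continue  # H positions are uncertain in X-ray structures
        squared_sum += sum((r - p) ** 2 for r, p in zip(ref_atom[1:], pred_atom[1:]))
        count += 1
    if count == 0:
        raise ValueError("no non-H atoms to compare")
    return math.sqrt(squared_sum / count)


# Toy coordinates (not taken from any of the blind test structures).
observed = [("C", 0.0, 0.0, 0.0), ("O", 1.2, 0.0, 0.0), ("H", -0.5, 0.9, 0.0)]
predicted = [("C", 0.1, 0.0, 0.0), ("O", 1.3, 0.0, 0.0), ("H", -0.9, 1.5, 0.0)]
print(f"RMSD over non-H atoms: {rmsd_heavy_atoms(observed, predicted):.3f} Å")
```

For RMSD₁₆ the two input lists would hold the non-H atoms of all 16 molecules in the overlaid clusters.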

To confirm the matches, a second screening was performed of the three predictions from each group against the experimental crystal structures. This second comparison used a measure of dissimilarity, s_powder, based on the calculated powder diffraction patterns of the two structures being compared. The measure amounts to an area between integrated patterns (Hofmann & Kuleshova, 2005):

$[s^{ij}_{\rm powder} = {1\over \vartheta_{\rm max}-\vartheta_{\rm min}} \int^{\vartheta_{\rm max}}_{\vartheta_{\rm o}=\vartheta_{\rm min}} \left| {1\over N_i}\int^{\vartheta_{\rm o}}_{\vartheta = \vartheta_{\rm min}} I_i(\vartheta )\,{\rm d}\vartheta - {1\over N_j}\int^{\vartheta_{\rm o}}_{\vartheta = \vartheta_{\rm min}} I_j(\vartheta )\,{\rm d}\vartheta \right| {\rm d}\vartheta_{\rm o} \eqno(1)]$

The index becomes zero for identical structures and the normalization factor

$[N_i = \int^{\vartheta_{\rm max}}_{\vartheta=\vartheta_{\rm min}}I_i(\vartheta ){\rm d}\vartheta \eqno(2)]$

ensures that s_powder has a maximum value of 1. Structures are deemed to be the same when s_powder is below a certain threshold.
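Equations (1) and (2) can be evaluated numerically for two patterns sampled on a common angular grid. The following is a minimal sketch, not the implementation of Hofmann & Kuleshova: the trapezoidal integration, the function names and the synthetic single-peak patterns are all assumptions made for illustration.

```python
import math


def cumulative_trapezoid(x, y):
    """Running trapezoidal integral of y(x); the first entry is zero."""
    running = [0.0]
    for k in range(1, len(x)):
        running.append(running[-1] + 0.5 * (y[k] + y[k - 1]) * (x[k] - x[k - 1]))
    return running


def s_powder(theta, intensity_i, intensity_j):
    """Dissimilarity index of equation (1) for patterns on a shared grid."""
    integral_i = cumulative_trapezoid(theta, intensity_i)
    integral_j = cumulative_trapezoid(theta, intensity_j)
    n_i, n_j = integral_i[-1], integral_j[-1]  # normalization, equation (2)
    if n_i <= 0.0 or n_j <= 0.0:
        raise ValueError("patterns must have positive integrated intensity")
    difference = [abs(a / n_i - b / n_j) for a, b in zip(integral_i, integral_j)]
    area = cumulative_trapezoid(theta, difference)[-1]
    return area / (theta[-1] - theta[0])


def gaussian_peak(x, centre, width=0.2):
    """Synthetic peak shape used only for this illustration."""
    return math.exp(-0.5 * ((x - centre) / width) ** 2)


grid = [0.1 * k for k in range(501)]  # hypothetical angular range, 0 to 50 degrees
pattern_a = [gaussian_peak(x, 10.0) for x in grid]
pattern_b = [gaussian_peak(x, 10.5) for x in grid]

print(s_powder(grid, pattern_a, pattern_a))  # identical patterns
print(s_powder(grid, pattern_a, pattern_b))  # peak shifted by 0.5 degrees
```

Because the integrated patterns are compared rather than the raw intensities, a small peak shift produces a small, smoothly varying value of s_powder instead of an abrupt loss of overlap.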

The two comparisons, one working in direct space and one in reciprocal space, gave the same list of matching structures. Matched structures, amongst the three `official' predictions, and the extended lists of computer-generated crystal structures, are listed in Tables 4–7 (under the headings `predicted amongst first three' and `present in the submitted extended list, outside of the first three predictions'). Overlays of the unit-cell contents in matches for each target are shown in Fig. 5, along with the measured value for RMSD₁₆ and s_powder. There are two other sections in some of these tables: where groups located the experimentally observed crystal structure amongst their predictions, but outside of the lists they had submitted before the prediction deadline (`not submitted, but located in post-analysis'), and where the group had not located the correct crystal structure in their search, but energy minimized the X-ray structure in post-analysis to test the performance of their energy model (`not located in search, but energy minimized in post-analysis of predictions'). It must be emphasized that structures listed in both of these final categories fall outside of the `blind' part of the exercise and are included here as extra information that is useful in assessing the methods in detail.

Table 4
Lattice parameters, ΔE, RMSD₁₆ and s_powder for the experimental and predicted structures of molecule (XII)

α = β = γ = 90° in all structures.

	Rank	ΔE† (kJ mol⁻¹)	Density (g cm⁻³)	a (Å)	b (Å)	c (Å)	RMSD₁₆‡ (Å)	s_powder × 10²§
Expt. (T = 150 K)	–	–	1.152	6.970 (3)	9.514 (5)	9.752 (5)	–	–

Predicted amongst first three
Boerrigter, Tan	1	−0.15¶	1.117 (−3.0%)	6.879 (−1.3%)	9.697 (+1.9%)	9.994 (+2.5%)	0.156	1.46
Neumann, Leusen, Kendrick	1	−1.19¶	1.129 (−2.0%)	6.969 (−0.01%)	9.487 (−0.3%)	9.976 (+2.3%)	0.127	0.68
Ammon	2	+0.01	1.069 (−7.2%)	7.040 (+1.0%)	9.746 (+2.4%)	10.150 (+4.1%)	0.174	1.88
Schweizer	2	+0.30	1.187 (+3.0%)	6.808 (−2.3%)	9.618 (+1.1%)	9.581 (−1.8%)	0.183	2.31

Present in the submitted extended list, outside of the first three predictions
Price, Karamertzanis, Misquitta, Welch	2††	+0.29	1.064 (−7.6%)	7.000 (+0.4%)	9.864 (+3.7%)	10.139 (+4.0%)	0.200	–
van Eijck	6	+0.63	1.079 (−6.3%)	6.976 (+0.1%)	9.791 (+2.9%)	10.107 (+3.6%)	0.180	–
Della Valle, Venuti	13	+1.11	1.161 (+0.8%)	6.720 (−3.6%)	9.898 (+4.0%)	9.644 (−1.1%)	0.268	–
Facelli, Bazterra, Ferraro	51	+2.38	1.170 (+1.6%)	6.765 (−2.9%)	9.867 (+3.7%)	9.536 (−2.2%)	0.246	–

Not submitted, but located in post-analysis of predictions‡‡
Schmidt, van de Streek, Wolf	6‡‡	+0.90	1.133 (−1.6%)	6.817 (−2.2%)	9.645 (+1.4%)	10.000 (+2.5%)	0.179	–
Day, Hejczyk	117‡‡	+5.65	1.112 (−3.5%)	6.735 (−3.4%)	9.936 (+4.4%)	10.007 (+2.6%)	0.308	–

Not located in search, but energy minimized in post-analysis
Hofmann	–	+1.68	1.062 (−7.8%)	7.311 (+4.9%)	9.708 (+2.0%)	9.883 (+1.3%)	0.190	–
Jose, Gadre	–	+2.14	1.079 (−6.3%)	7.110 (+2.0%)	9.720 (+2.2%)	9.990 (+2.4%)	0.319	–
Scheraga, Arnautova	–	+1.87	1.108 (−3.9%)	6.702 (−3.9%)	9.923 (+4.3%)	10.123 (+3.8%)	0.248	–

†ΔE is calculated with respect to the lowest-energy structure predicted by the same research group.
‡RMSD₁₆ is calculated using a 16 molecule comparison in COMPACK, ignoring H atoms.
§s_powder is the normalized dissimilarity index calculated from simulated powder diffraction patterns.
¶ΔE for the global minimum is calculated with respect to the second lowest-energy structure.
††The experimentally observed crystal structure was found as the second lowest in lattice energy, but not submitted as one of the three predictions, which were chosen from amongst the five lowest lattice-energy structures based on visual assessment and additional calculated properties.
‡‡Structures reported in this category were submitted after the experimentally determined crystal structures were revealed, so cannot be considered blind predictions. They are included here to allow further analysis of the search and ranking methodologies, not as successful blind test predictions.
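The density column of these tables follows directly from the cell parameters and the cell contents. A minimal sketch is given below, assuming Z = 8 for (XII) in Pbca and Z = 4 for (XIII) in P2₁/c (both with Z′ = 1 on general positions) and standard atomic masses; the function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1

# Standard atomic masses in g/mol (rounded values; an assumption of this sketch).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "O": 15.999,
               "F": 18.998, "Cl": 35.45, "Br": 79.904}


def cell_volume(a, b, c, beta_deg=90.0):
    """Volume (Å^3) of an orthorhombic or monoclinic cell (alpha = gamma = 90°)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def density(formula, z, a, b, c, beta_deg=90.0):
    """Crystal density in g/cm^3 from formula, Z and cell parameters (Å, degrees)."""
    molar_mass = sum(ATOMIC_MASS[element] * n for element, n in formula.items())
    volume_cm3 = cell_volume(a, b, c, beta_deg) * 1e-24  # 1 Å^3 = 1e-24 cm^3
    return z * molar_mass / (AVOGADRO * volume_cm3)


# Experimental cells from Tables 4 and 5; Z values are ASSUMED as stated above.
rho_xii = density({"C": 3, "H": 4, "O": 1}, 8, 6.970, 9.514, 9.752)
rho_xiii = density({"C": 6, "H": 2, "Br": 2, "Cl": 1, "F": 1}, 4,
                   3.8943, 13.5109, 14.4296, beta_deg=93.636)

print(f"(XII):  {rho_xii:.3f} g/cm^3")
print(f"(XIII): {rho_xiii:.3f} g/cm^3")
```

Under these assumptions the calculated values agree with the tabulated experimental densities to within rounding.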

Table 5
Lattice parameters, ΔE, RMSD₁₆ and s_powder for the experimental and predicted structures of molecule (XIII)

α = γ = 90° in all structures.

	Rank	ΔE† (kJ mol⁻¹)	Density (g cm⁻³)	a (Å)	b (Å)	c (Å)	β (°)	RMSD₁₆‡ (Å)	s_powder × 10²§
Expt. (T = 173 K)	–	–	2.528	3.8943 (5)	13.5109 (17)	14.4296 (17)	93.636 (2)	–	–

Predicted amongst first three
Ammon	1	−2.01¶	2.413 (−4.5%)	3.968 (+1.9%)	13.986 (+3.5%)	14.309 (−0.8%)	91.78	0.385	1.59
Day	1	−0.68¶ (−0.29)††	2.506 (−0.9%)	3.880 (−0.4%)	13.683 (+1.3%)	14.403 (−0.2%)	92.01	0.159	0.72
Neumann, Leusen, Kendrick	1	−1.34¶	2.548 (+0.8%)	3.875 (−0.5%)	13.456 (−0.4%)	14.473 (+0.3%)	94.97	0.082	0.89
Price, Karamertzanis, Misquitta, Welch	1	−0.70¶	2.517 (−0.4%)	3.805 (−2.3%)	13.791 (+2.1%)	14.531 (+0.7%)	93.78	0.152	1.16

Present in list, outside of first three predictions
Desiraju, Thakur‡‡	14	+3.56	2.577 (+1.9%)	3.868 (−0.7%)	15.093 (+11.7%)	12.731 (−11.8%)	90.15	1.768‡‡	–
van Eijck	16	+2.47	2.344 (−7.3%)	3.959 (+1.7%)	14.189 (+5.0%)	14.547 (+0.8%)	91.38	0.410	–
Della Valle, Venuti	84	+5.18	2.297 (−9.1%)	4.096 (+5.2%)	14.138 (+4.6%)	14.419 (−0.1%)	93.08	0.500	–

Not submitted, but located in post-analysis of predictions§§
Boerrigter, Tan	4§§	+0.78	2.632 (+4.1%)	3.707 (−4.8%)	13.604 (+0.7%)	14.475 (+0.3%)	94.45	0.285	–
Schmidt, van de Streek, Wolf	10§§	+2.43	2.601 (+2.9%)	3.649 (−6.3%)	13.649 (+1.0%)	14.834 (+2.8%)	94.69	1.059	–

Not located in search, but energy minimized in post-analysis
Hofmann	–	+12.11	2.247 (−11.1%)	4.018 (+3.2%)	14.461 (+7.0%)	14.687 (+1.8%)	92.76	0.373	–
Facelli, Bazterra, Ferraro	–	+6.20	2.289 (−9.5%)	4.138 (+6.3%)	14.107 (+4.4%)	14.366 (−0.4%)	86.22	0.546	–
Scheraga, Arnautova	–	+2.77	2.608 (+3.2%)	3.761 (−3.4%)	13.585 (+0.6%)	14.412 (−0.1%)	94.2	0.315	–

†ΔE is calculated with respect to the lowest-energy structure predicted by the same research group.
‡RMSD₁₆ is calculated using a 16 molecule comparison in COMPACK, ignoring H atoms.
§s_powder is the normalized dissimilarity index calculated from simulated powder diffraction patterns.
¶ΔE for the global minimum is calculated with respect to the second lowest-energy structure.
††Quasi-harmonic free energy with respect to the second lowest-energy structure.
‡‡Reported as a match to the experimental structure, but with extreme deviations.
§§Structures reported in this category were submitted after the experimentally determined crystal structures were revealed, so cannot be considered blind predictions. They are included here to allow further analysis of the search and ranking methodologies, not as successful blind test predictions.

Table 6
Lattice parameters, ΔE, RMSD₁₆ and s_powder for the experimental and predicted structures of molecule (XIV)

α = γ = 90° in all structures.

	Rank	ΔE† (kJ mol⁻¹)	Density (g cm⁻³)	a (Å)	b (Å)	c (Å)	β (°)	RMSD₁₆‡ (Å)	s_powder × 10²§
Expt. (T = 150 K)	–	–	1.479	13.060 (3)	9.738 (2)	9.335 (2)	105.800 (3)	–	–

Predicted amongst first three
van Eijck	1	−0.24¶	1.497 (+1.2%)	12.853 (−1.6%)	9.803 (+0.7%)	9.341 (+0.1%)	106.52	0.147	1.14
Neumann, Leusen, Kendrick	1	−1.98¶	1.450 (−2.0%)	13.242 (+1.4%)	9.821 (+0.9%)	9.314 (−0.2%)	105.80	0.130	0.80
Price, Karamertzanis, Misquitta, Welch	1	−4.19¶	1.466 (−0.9%)	12.882 (−1.4%)	9.765 (+0.3%)	9.612 (+3.0%)	107.62	0.222	1.18

Present in list, outside of first three predictions
Facelli, Bazterra, Ferraro	6	+4.65	1.543 (+4.3%)	14.046 (+7.5%)	9.612 (−1.3%)	8.263 (−11.5%)	100.95	0.830	–
Day, Cooper	8	+6.52	1.432 (−3.2%)	12.472 (−4.5%)	9.894 (+1.6%)	10.078 (+8.0%)	108.39	0.536	–

Not submitted, but located in post-analysis of predictions††
Scheraga, Arnautova	2††‡‡	+4.73	1.451 (−1.9%)	13.032 (−0.2%)	9.692 (−0.5%)	9.638 (+3.3%)	106.9	0.221	–
Ammon	4††	+1.99	1.401 (−5.3%)	13.438 (+2.9%)	9.684 (−0.6%)	9.690 (+3.8%)	106.96	0.264	–
Schmidt, van de Streek, Wolf	4††	+0.67	1.513 (+2.3%)	12.957 (−0.8%)	9.569 (−1.7%)	9.366 (+0.3%)	105.97	0.127	–
Boerrigter, Tan	9††	+2.12	1.462 (−1.2%)	12.957 (−0.8%)	9.626 (−1.2%)	9.596 (+2.8%)	105.05	0.195	–

Not located in search, but energy minimized in post-analysis
Hofmann	–	+3.09	1.454 (−1.7%)	13.041 (−0.1%)	9.821 (+0.9%)	9.443 (+1.2%)	106.12	0.081	–
Jose, Gadre	–	+103.48	1.372 (−7.2%)	13.047 (−0.1%)	10.100 (+3.7%)	9.480 (+1.6%)	99.71	0.522	–

†ΔE is calculated with respect to the lowest-energy structure predicted by the same research group.
‡RMSD₁₆ is calculated using a 16 molecule comparison in COMPACK, ignoring H atoms.
§s_powder is the normalized dissimilarity index calculated from simulated powder diffraction patterns.
¶ΔE for the global minimum is calculated with respect to the second lowest-energy structure.
††Structures reported in this category were submitted after the experimentally determined crystal structures were revealed, so cannot be considered blind predictions. They are included here to allow further analysis of the search and ranking methodologies, not as successful blind test predictions.
‡‡The correct structure was ranked second on energy, but not submitted because the predicted structure was slightly out of symmetry, with Z′ = 2.

Table 7
Lattice parameters, ΔE, RMSD₁₆ and s_powder for the experimental and predicted structures of molecule (XV)

α = γ = 90° in all structures.

	Rank	ΔE† (kJ mol⁻¹)	Density (g cm⁻³)	a (Å)	b (Å)	c (Å)	β (°)	RMSD₁₆‡ (Å)	s_powder × 10²§
Expt. (T = 203 K)	–	–	1.301	7.2795 (10)	13.6699 (18)	12.6695 (16)	96.646 (3)	–	–

Predicted amongst first three
Neumann, Leusen, Kendrick¶	1††	−2.08	1.307 (+0.5%)	7.264 (−0.2%)	13.818 (+1.1%)	12.520 (−1.2%)	97.44	0.075	0.88
van Eijck¶	3	+1.36	1.303 (+0.2%)	7.336 (+0.8%)	13.556 (−0.8%)	12.674 (+0.0%)	97.34	0.294	1.32

Present in list, outside of first three predictions
Day, Cruz Cabeza¶	4	+3.47	1.272 (−2.2%)	7.201 (−1.1%)	13.943 (+2.0%)	12.884 (+1.7%)	98.07	0.242	–

Not submitted, but located in post-analysis of predictions‡‡
Boerrigter, Tan	26‡‡	+5.80	1.240 (−4.7%)	7.698 (+5.7%)	13.834 (+1.2%)	12.451 (−1.7%)	97.88	0.536	–
Schmidt, van de Streek, Wolf	> 100‡‡	+11.32	1.264 (−2.8%)	7.312 (+0.4%)	13.701 (+0.2%)	13.989 (+10.4%)	113.11	0.385	–

Not located in search, but energy minimized in post-analysis‡‡
Della Valle, Venuti	–	+1.81	1.301 (−0.0%)	7.633 (+4.8%)	12.693 (−7.1%)	12.968 (+2.4%)	94.80	0.473	–
Facelli, Bazterra, Ferraro	–	+0.91	1.350 (+3.8%)	7.156 (−1.7%)	12.798 (−6.4%)	13.215 (+4.3%)	94.53	0.491	–
Hofmann	–	+47	1.271 (−2.3%)	7.486 (+2.8%)	13.512 (−1.2%)	12.737 (+0.5%)	95.71	0.157	–
Price, Karamertzanis, Misquitta, Welch	1§§	0§§	1.301 (+0.0%)	7.250 (−0.4%)	13.774 (+0.8%)	13.625 (+7.5%)	113.07	0.203	–

†ΔE is calculated with respect to the lowest-energy structure predicted by the same research group.
‡RMSD₁₆ is calculated using a 16 molecule comparison in COMPACK, ignoring H atoms, with the 2-methylbenzoic acid as the central molecule in the cluster.
§s_powder is the normalized dissimilarity index calculated from simulated powder diffraction patterns.
¶All three predictions were submitted in P2₁/c, which have been converted to the P2₁/n setting for comparison with the experimentally determined structure.
††ΔE for the global minimum is calculated with respect to the second lowest-energy structure.
‡‡Structures reported in this category were submitted after the experimentally determined crystal structures were revealed, so cannot be considered blind predictions. They are included here to allow further analysis of the search and ranking methodologies, not as successful blind test predictions.
§§Result from post-analysis completion of Crystal Predictor search, see supplementary material.

Figure 5
Overlays of the unit-cell contents of the four observed crystal structures, (XII)–(XV), and an example of one of the successful predictions for each. Observed structures are given in green, and the predicted structures in red. (a) Crystal structure of (XII) (green) and Ammon.XII.2 (red); (b) crystal structure of (XIII) (green) and Price.XIII.1 (red); (c) crystal structure of (XIV) (green) and vanEijck.XIV.1 (red); (d) crystal structure of (XV) (green) and NeumannLeusenKendrick.XV.1 (red).

4.3. Prediction results

4.3.1. Molecule (XII)

13 of the 14 participating research groups attempted predictions for molecule (XII), four of whom predicted the observed structure within their three predictions (Table 4). Two of these successes (Neumann, Leusen & Kendrick; Boerrigter & Tan) were submitted as the group's first prediction, while the other two (Ammon; Schweizer) were submitted as the participant's second prediction. All four of these correct predictions gave RMSD₁₆ deviations from the experimentally determined structure of less than 0.2 Å and root-mean-squared errors in the unit-cell lengths (a, b, c) of less than 3%. An overlay of one of these predictions with the X-ray determined structure is shown in Fig. 5.
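The percentage deviations in Table 4 and the r.m.s. error in the unit-cell lengths quoted above can be reproduced from the tabulated cell parameters. The sketch below uses the Table 4 values for the four successful predictions; the function names are hypothetical.

```python
import math


def percent_deviation(predicted, observed):
    """Signed deviation of a predicted value from the observed one, in percent."""
    return 100.0 * (predicted - observed) / observed


def rms_cell_error(predicted_cell, observed_cell):
    """R.m.s. of the percentage errors in the cell lengths (a, b, c)."""
    deviations = [percent_deviation(p, o) for p, o in zip(predicted_cell, observed_cell)]
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


# Cell lengths (Å) from Table 4.
EXPERIMENT_XII = (6.970, 9.514, 9.752)
PREDICTIONS_XII = {
    "Boerrigter, Tan": (6.879, 9.697, 9.994),
    "Neumann, Leusen, Kendrick": (6.969, 9.487, 9.976),
    "Ammon": (7.040, 9.746, 10.150),
    "Schweizer": (6.808, 9.618, 9.581),
}

for group, cell in PREDICTIONS_XII.items():
    error = rms_cell_error(cell, EXPERIMENT_XII)
    print(f"{group}: r.m.s. error in (a, b, c) = {error:.2f}%")
```

All four values fall below the 3% bound stated in the text.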

Outside of the official three predictions, the observed crystal structure was present in the extended lists of six other research groups and only three of the 13 groups reported not finding the structure in their list of computer-generated crystal structures. The success rates here are a moderate improvement over the previous blind tests, whose category 1 molecules and success rates are shown in Fig. 6. Only molecule (I) from CSP1999 had as high a success rate (four of 11 groups with successful predictions), but only for one of its known polymorphs – there were no successful predictions of the other polymorph.

Figure 6
Previous blind test molecules in category (1) (simple rigid molecules). Success rates for these are given as number of correct predictions/number of participants.

4.3.2. Molecule (XIII)

All 14 participants attempted predictions for molecule (XIII), four of whom (Ammon; Day; Neumann, Leusen & Kendrick; Price, Karamertzanis, Misquitta & Welch) predicted the observed crystal structure (Table 5). All of these successes were found as the first predicted structure from that participant and all gave an RMSD₁₆ deviation from the observed structure of less than 0.4 Å, with root-mean-squared errors in the unit-cell lengths (a, b, c) of better than 2.4%. An overlay of one of the four successful predictions is shown in Fig. 5. The success rates here are about the same as in this category in CSP2001, and higher than in the other two previous blind tests (Fig. 7).

Figure 7
Previous blind test molecules in category (2) (rigid molecules with challenging functional groups). Success rates for these are given as number of correct predictions/number of participants.

The observed crystal structure was generated by the search method used by five other research groups, outside of their top three predictions and between 0.8 and 5.2 kJ mol⁻¹ above their global minimum. These generally had greater geometric deviations from the experimental structure than seen in the predictions where the structure was ranked first in energy. The other five groups reported not finding the structure in their list of computer-generated crystal structures, indicating a failure of the search method. The slightly higher rate of search method failure here than for molecule (XII) might reflect difficulties in modelling the halogen atoms, as many of the methods do involve lattice-energy calculations and crystal structure optimizations during the search procedure. Therefore, poor modelling of the interactions can lead to a failed search.

4.3.3. Molecule (XIV)

12 research groups attempted predictions for the category 3 target, molecule (XIV), three of whom (van Eijck; Neumann, Leusen & Kendrick; Price, Karamertzanis, Misquitta & Welch) were found to have predicted the observed crystal structure within their three predictions (Table 6). An overlay of one of the successful predictions is shown in Fig. 5. Each of these groups found the correct structure as their first ranked prediction, with RMSD₁₆ deviations from the observed structure of 0.22 Å or lower and root-mean-squared errors in the unit-cell lengths (a, b, c) smaller than 2%. The observed crystal structure was present in the extended lists of six other research groups, while three groups did not find the observed structure in their search.

These rates of success are similar to those seen for rigid molecules from previous blind tests and a noticeable improvement on what has previously been seen for flexible molecules (Fig. 8). There has only been one successful prediction for a flexible molecule in all three of the previous tests. This striking improvement might partly reflect advances in methods of dealing with conformational flexibility during the crystal structure search and during the ranking of structures. We must also consider that the molecule chosen for this category was less challenging than those in previous blind tests (Fig. 8). This is one unavoidable weakness of using the blind tests to measure progress in the field – variations in the difficulty of molecules can be as important as changes in the methods used to predict crystal structures and it is difficult to judge the difficulty associated with a molecule before performing the calculations involved in its prediction. In the case of molecule (XIV), it was felt that the conformational flexibility of the molecule was less challenging than in previous blind tests. Several groups performed quantum mechanical calculations to map out the energy of molecule (XIV) as a function of rotation about one or more of the exocyclic single bonds. While details of the methods varied, the minimum-energy conformation was generally found to have a planar geometry about the exocyclic N atom and an angle of 70–80° between the thioformamide and benzothiazole groups, i.e. very close to the conformation found in the crystal structure. Therefore, crystal structure searches using the gas phase minimum molecular geometry had a good chance of finding the observed structure and ranking it favourably on energy. 
Predictions were simplified by there only being one minimum on the conformational energy surface: for a cost of ca 10 kJ mol⁻¹, the out-of-plane angle of the thioformamide can distort about 30° either side of the minimum and the geometry around the exocyclic N atom can rotate by a similar amount. This much intramolecular energy could be compensated for by improved packing and intermolecular interactions, so these distortions from the gas-phase minimum geometry had to be considered during the crystal structure predictions. However, the resulting relevant conformational space was fairly restricted compared with the flexible molecules in the previous blind tests. As an example, the packing of molecule (X) in CSP2004 (Fig. 8) was found to be quite sensitive to six torsion angles, all of whose orientations had to be considered during the predictions (Day et al., 2005).

Figure 8
Previous blind test molecules in category (3) (flexible molecules). Success rates for these are given as number of correct predictions/number of participants.

4.3.4. Target (XV)

12 participants attempted predictions for the cocrystal (XV) and two of these predicted the observed cocrystal structure within the three official predictions (Neumann, Leusen & Kendrick and van Eijck), as the first and third predictions, respectively (Table 7). Both had RMS errors in the lattice constants (a, b, c) of less than 1% and RMSD₁₆ deviations in atomic positions better than 0.3 Å. An overlay of one of the two successful predictions is shown in Fig. 5 (where the disordered pyrimidine methyl group H atoms in the observed structure are shown in the site with highest occupancy). Three other groups had found the observed crystal structure outside of their three best predictions (Table 7), while the other seven failed to locate the observed crystal structure in their search.

The cocrystal was introduced in this blind test as a new category of prediction challenge, so there are no results from previous blind tests with which to compare. The most similar example from previous blind tests is that of molecule (XI) of CSP2004, which crystallized with two independent molecules. No groups predicted the correct crystal structure for that molecule, partly because many could not or opted not to search for crystal structures with Z′ = 2. In CSP2004 the value of Z′ was not given, but Z′ > 1 was allowed as a possibility, unlike here, where the contents of the asymmetric unit were specified.

In this blind test two different approaches were applied to searching phase space with more than one type of molecule in the crystal structure. One option was to search all of the packing space, with the positions and orientations of the two molecules treated independently. As discussed by van Eijck (van Eijck & Kroon, 2000; van Eijck, 2002), it is a considerable computational challenge to exhaustively search all of the crystal-packing space with two independent molecules, because of the six extra degrees of freedom compared with the search space when there is only one molecule in the asymmetric unit. Furthermore, there is a choice of conformation for the acid molecule which had to be considered.

The other strategy used to generate crystal structures takes advantage of the strong interactions between the two molecules, which helps predict their relative orientation before starting to generate crystal structures. Several groups deemed that hydrogen-bond dimers were likely and used dimers as the basic unit with which crystal structures were generated, essentially reducing the problem back to that of a single-component crystal. Indeed, a survey of known structures of carboxylic acid: pyrimidine cocrystals in the CSD finds that such dimers are always formed, so this strategy was well founded in this case. For this pair of molecules, four planar dimer geometries are possible (Fig. 9), so there is a choice of which dimer geometries to consider in generating crystal structures. Some groups performed searches with all four possibilities, while others chose the most likely dimer structures from calculated energies. In this case the dimer geometry in the observed crystal structure corresponds to the lowest-energy dimer from various flavours of quantum mechanical calculation (Fig. 9a).

Figure 9
The four likely hydrogen-bond dimer structures formed between 2-methylbenzoic acid and 2-amino-4-methylpyrimidine. Calculated energies at MP2 (MP2/6-31G** from Ammon) and DFT (B3LYP/6-31G**, from Thakur & Desiraju, 2008) levels of theory are given, relative to the most stable dimer.

Table 3 summarizes the cocrystal search strategy used by each participant; five groups used the approach with two independent molecules, six groups used the dimer-based approach and one group used a combination of the two approaches (performing searches both with independent molecules and with dimers as starting points). Of the five groups who found the observed crystal structure either in their three official predictions or in their extended lists, two had used the dimer-based approach in the crystal structure search and three (including the two successful predictions) had used independent molecules in the search. Both methods can clearly be successful, but several groups using either search strategy also failed to produce the observed structure.

4.4. Computational expense

The range of methods being applied to crystal structure prediction come at varying costs in terms of computational time and resources, and some of the methods now being used in the blind tests have only been made possible by access to high-performance computing resources. Therefore, participants in CSP2007 were asked to keep track of the computational resources used to come up with their predictions, to give an idea of the resources required for each approach. Table 8 summarizes the resources used by some of the participants, where available and easily quantifiable.

Table 8
Summary of computational resources used by some of the participants in CSP2007

Group	Comments on computing time used	Total computational cost, approximately normalized to 2.8 GHz CPU hours
Boerrigter, Tan	(XII): 55 h, 200 MHz octane	350 CPU hours
	(XIII): 177 h, 200 MHz octane
	(XIV): 299 h, 200 MHz octane
	(XV): 194 h 200 MHz octane, + 280 h 3.0 GHz Pentium 4
Day, Cooper, Cruz Cabeza, Hejczyk	Crystal structure search (on 1.3 GHz Itanium processors): (XII) ≃ 200 CPU hours; (XIII)–(XV) ≃ 300 CPU hours each. Lattice-energy minimization and free-energy calculations (on 2.4 GHz Opteron processors) ranged from 70 (XV) to 320 CPU hours (XIV)	∼ 1000 CPU hours
Della Valle, Venuti	Processor times on 2.2 GHz 64-bit processors were 12, 15 and 43 d for molecules (XII), (XIII) and (XV). About 96% of this time was spent on energy minimization	1320 CPU hours
van Eijck	147 CPU hours molecular calculations, 1611 CPU hours spent on searches, 732 CPU hours on energy minimization, time standardized to 2.8 GHz processors	2490 CPU hours
Facelli, Bazterra, Ferraro	Approximated total computer time was 200 000 CPU hours on 2.5 GHz class processors	180 000 CPU hours
Hofmann	(XII): 30 h, (XIII): 60 h, (XIV): 60 h, 3.0 GHz processor	160 CPU hours
Neumann, Leusen, Kendrick	Approximately 280 000 CPU hours on 2.8 GHz processors, mostly spent on the generation of reference data for force-field parameterization and the final energy ranking with the hybrid method	∼ 280 000 CPU hours
Price, Karamertzanis, Misquitta, Welch	Each MOLPAK search could run overnight on the UCL Condor cluster of PCs, and a similar period was required for a simple reminimization of order of 1000 structures with DMAREL on one processor. The Crystal Predictor searches took a few days for (XIII)	∼ 5000 CPU hours
	DMAflex refinements (XIV) took several days of CPU time for each of the ten structures
	Total excludes the potential development for (XIII) of ∼ 4000 CPU hours, and the work on (XV) using Crystal Predictor search which was only completed after the deadline (2 weeks CPU time for the intramolecular potential surfaces, and an equivalent amount of time for the Crystal Predictor search, and ∼ 30 DMAflex minimizations of about a week each)
Schmidt, van de Streek, Wolf	∼ 8 months CPU time, 1.7 GHz AMD processors	∼ 3500 CPU hours

The computing requirements can clearly be very high and would be a consideration in the choice of method for a particular problem. Most methods have required a few hundred hours (weeks) to a few thousand hours (months) of CPU time on a modern processor. The computing requirement for the very successful method of Neumann, Leusen and Kendrick is two to three orders of magnitude higher, at 280 000 CPU hours (∼ 32 CPU years) for predictions on the four targets. In the majority of methods, most of the computing time is spent on the energy-minimization part of the problem. The real time used for the calculations is often much shorter, because of the use of parallel or distributed computing setups. The price of powerful computing clusters is decreasing year-on-year, such that even the most expensive of these methods could be brought down to a matter of weeks to a few months in real computing time at relatively low cost.
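
The conversion between CPU hours, CPU years and elapsed time quoted above is simple arithmetic. The following minimal sketch reproduces the ∼ 32 CPU years figure and estimates the elapsed time on a parallel machine. The core counts and the assumption of ideal (linear) parallel scaling are illustrative only and are not taken from any participant's setup.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 h, ignoring leap years


def cpu_years(cpu_hours: float) -> float:
    """Convert CPU hours on a single processor into CPU years."""
    return cpu_hours / HOURS_PER_YEAR


def wall_clock_days(cpu_hours: float, n_cores: int) -> float:
    """Elapsed days on n_cores, assuming ideal (linear) parallel scaling."""
    return cpu_hours / n_cores / 24


total_hours = 280_000  # total reported for the most expensive method
print(f"{cpu_years(total_hours):.1f} CPU years")
for cores in (64, 256, 1024):  # hypothetical cluster sizes
    print(f"{cores:5d} cores: {wall_clock_days(total_hours, cores):7.1f} days")
```

Real elapsed times would be longer than this estimate wherever the workload does not parallelize perfectly.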

5. Discussion

5.1. Overall success rates

The success rates in the blind test depend principally on the performance of two main elements in the prediction methodology: the generation of all possible crystal structures, followed by the evaluation and ranking of these structures. Both must be performed effectively for a successful prediction, while failure of either the search or the ranking precludes success. In this fourth blind test, we have observed improved overall rates of successful prediction over the first three blind tests, reflecting developments in the methods applied to each step in crystal structure prediction.

Of the 14 groups participating in CSP2007, most attempted predictions for all four targets and half (seven groups) had at least one successful prediction within the rules of the blind test, where three predictions are allowed for each molecule. Four of the participating groups had multiple successes and, overall, there were 13 successful predictions, ten of which were submitted as a participant's first choice prediction. The quality of these predictions is illustrated in the overlays of predicted structures with those determined from X-ray diffraction data (Fig. 5). The success rate here is an important improvement over the results from the previous blind tests: 11 of the successful predictions were for molecules in the `original' three categories of molecules that formed the first three blind tests, while in CSP2004 there was only one successful prediction from 18 participating groups² and there were six successes in each of CSP1999 (Lommerse et al., 2000) and CSP2001 (Motherwell et al., 2002). The CSP2004 results looked discouraging at the time and they highlighted areas requiring development in methods; some of these have clearly been addressed to some extent in the three years between CSP2004 and this latest blind test. The search methods used and approaches taken for ranking of the computer-generated crystal structures are discussed in the following sections.

As well as the overall increase in successful predictions compared with previous years, it is important to note that these predictions were distributed amongst the four categories and there were successful predictions from at least two groups for each of the four crystal structures [four successes for each of molecules (XII) and (XIII), three for (XIV) and two for (XV)]. It is significant that there were three correct predictions for the flexible molecule. While molecule (XIV) might not have been as flexible as previous molecules in this category, the successes here do demonstrate that progress is being made in extending the generality of CSP methods to larger molecules. Furthermore, the new challenge of a cocrystal was not insurmountable, despite the added complexity of searching phase space with two independent molecules coupled with the issue of two possible conformations of the acid molecule.

Of course, the most impressive results from this blind test are those of Neumann, Leusen and Kendrick, who successfully predicted all four crystal structures, each as their first choice amongst their submitted predictions. Their calculations also produced the lowest RMSDs of all successful predictions for all four crystal structures, demonstrating that their method produces excellent matches to the true structures. One observation from previous blind tests was that there has not been one method which has been successful in general over the three categories of molecule. This group's results are certainly a striking improvement over what has been achieved before and indicate that generally applicable methods can be successful across the various types of molecules and crystals represented in the blind test.

5.2. The search problem

For each of the single-component systems, a few groups did not locate the observed crystal structures using their search method. The failure rate seems to have been lower for the flexible molecule (XIV) than in previous tests, which would contribute to the increased success overall for the flexible molecule. As expected, the cocrystal was the main problem – the increased search space was the main reason for including the new category of two-component crystals in this blind test and seven of the 12 groups who attempted predictions for this system did not locate the observed crystal structure in their search. The successes and failures in generating the observed structure were roughly evenly split between those who took either of the approaches described in §4.3.4 (i.e. treating the two molecules completely independently or starting with hydrogen-bonded dimers). There is certainly a gain in computational efficiency for the dimer approach in the crystal structure search, but the method did not prove more reliable than the full-blown independent molecule search in locating the observed structure.

It is not possible to completely analyse the rates of success of the various search methods without very long lists of the computer-generated crystal structures from each participating group. Nine of the 14 groups did submit extended lists of up to 100 predicted structures beyond their three predictions per molecule. However, in some cases these lists were not long enough to fully assess the search method. Sometimes the energy model used to rank the structures performed poorly enough that the observed crystal structure was generated in the search, but ranked outside the best 100 structures and, so, was absent from the list. What we can say is that at least five groups (Neumann, Leusen & Kendrick; van Eijck; Day, Cooper, Cruz Cabeza & Hejczyk; Schmidt, van de Streek & Wolf; Boerrigter & Tan) located all four observed structures; four of these groups used a variation on random sampling of structural variables to generate crystal structures, while the fifth employed Monte Carlo simulated annealing. Several other groups only missed one of the observed structures in their searches.

An analysis akin to that performed by van Eijck following CSP2004 (van Eijck, 2005) was performed on the extended lists of submitted structures; all participants' extended lists of structures were re-minimized in a high-quality common force field (that used by van Eijck) to remove structural differences owing to the use of different force fields. The lists from each participant were then compared. For the three single-component crystals [(XII), (XIII) and (XIV)], most of the low-energy structures were located by more than one participant. This indicates that, by combining the lists, the search space is probably sampled completely. However, individual participants' lists often missed important low-energy structures; this could indicate either incomplete sampling or that minima exist on the energy surface described by van Eijck's force field that are absent from the energy surfaces of other groups' energy models. For the cocrystal (XV), there was less overlap between participants' lists, suggesting that the sampling of crystal packing possibilities might not be complete, even after combining the extended lists. There are limitations to the conclusions that can be drawn from comparisons of structures generated using different energy models (van Eijck, 2005). However, these observations point to the conclusion that there is still room for improvement in methods for structure generation that will provide reliably complete sets of the low-energy crystal structures.
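
The list comparison described above can be illustrated with a small sketch. It assumes that, after re-minimization in the common force field, equivalent structures have already been identified and given a shared label; that matching step is the difficult part and is not shown. The participant names and structure labels below are hypothetical.

```python
from collections import Counter


def times_found(lists: dict[str, set[str]]) -> Counter:
    """Count, for each structure label, how many participants located it."""
    counts = Counter()
    for found in lists.values():
        counts.update(found)
    return counts


def fraction_found_by_several(lists: dict[str, set[str]], min_groups: int = 2) -> float:
    """Fraction of the combined list located by at least min_groups participants."""
    counts = times_found(lists)
    if not counts:
        return 0.0
    shared = sum(1 for n in counts.values() if n >= min_groups)
    return shared / len(counts)


# Hypothetical extended lists, already mapped onto common structure labels.
example = {
    "group_A": {"s1", "s2", "s3"},
    "group_B": {"s1", "s3"},
    "group_C": {"s3", "s4"},
}
print(fraction_found_by_several(example))
```

A high fraction suggests that the combined lists sample the low-energy region well, whereas a low fraction, as reported for the cocrystal, hints at incomplete sampling.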

5.3. Ranking of the generated structures

As well as the computational and algorithmic challenge of providing an adequate sampling of crystal-packing phase space, it has also been clear from previous studies that the reliable ranking of the set of computer-generated crystal structures has been an obstacle for crystal structure prediction. This is because most molecules are found to have many distinct crystal packing possibilities within a small energy range (Day et al., 2004), so that the energy differences between structures are often very small. As most methods rely on ranking by lattice energy as a major, if not the only, part of the ranking process, the challenge of crystal structure prediction has been a driving force for the development of highly accurate methods for calculating the relative energies of crystal structures.

Until recently, developmental work has focused on improving atom–atom model potential methods, both in the functional form used and their parameterization (Price, 2008). Most of the participants in CSP2007 have based their predictions on atom–atom model potential calculations, and the nature of these models ranges from the fairly simple to quite elaborate anisotropic descriptions of atom–atom interactions. Most often, the atom–atom anisotropy is only included in the electrostatic term of the model potential and methods that have been used here are atomic multipole expansions of the charge density (four groups) and optimized core-shell models (one group). As in CSP2004, two groups treated the category (2) molecule with non-spherical atom–atom repulsion models, to account for the well known anisotropy in close contact distances for interactions involving halogen atoms (Nyburg & Faerman, 1985). Both of these resulted in the successful predictions of the observed crystal structure as the first choice prediction; Day used information from CSD distributions of atom–atom contacts to parameterize the atom–atom anisotropy, whereas Price, Misquitta, Karamertzanis and Welch's prediction employed a completely non-empirically derived model potential, where all atom–atom terms were derived from quantum-mechanics calculations on monomers and dimers (Misquitta et al., 2008).

Beyond the atom–atom model potential approach, one group used Gavezzotti's semi-classical density sums (or PIXEL) method (Gavezzotti, 2003a,b), which avoids the partitioning into atom–atom terms, but discretizes the molecular electron density into a set of roughly 10⁴ interaction sites. Additionally, a new class of approach that has been applied for the first time in the blind tests is the use of quantum-mechanical electronic structure based methods to treat the intermolecular as well as intramolecular energy. While the application of electronic structure calculations has been widely accessible for understanding molecular structure and properties for many years, applying these methods directly to the organic solid state presented further challenges – the computational expense and the now well known limitations of affordable methods (i.e. Hartree–Fock or density-functional theory) in modelling the dispersion interactions between molecules (Kristyán & Pulay, 1994; Hobza et al., 1995). The computational problem is being resolved by a combination of algorithmic improvements and increased access to high-performance computational clusters that allow the application of such methods to the large numbers of crystal structures that must be considered for each molecule. Two groups used electronic structure methods in CSP2007 for energy minimization and ranking of their predicted crystal structures. The details of the methods are very different. Jose and Gadre applied the cardinality-guided molecular tailoring (CG-MTA) approach (Ganesh et al., 2006), which has previously been applied to the energy minimization of large molecules. The crystal is described by a set of overlapping clusters of molecules. The total energy of the crystal is evaluated as the sum of the cluster energies minus the energies of the overlap regions, which are each evaluated at the HF/STO-3G level of theory. This method did not locate the experimental structure within the best three predictions for any of the molecules. A critical post analysis of (XII) and (XIV) as test cases showed that the method does locate crystal structures close to the experimental ones (see Tables 4 and 6). However, in the case of molecule (XIV), the energy difference from the lowest-energy generated structures clearly shows the inadequacy of the employed level of theory and basis set in accounting for the weak interactions in the crystals.
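
The fragment-based energy evaluation described above can be written compactly. As a hedged sketch only, with the notation $F_i$ for the overlapping clusters introduced here for illustration rather than taken from the source, the two contributions mentioned in the text are:

```latex
E_{\text{crystal}} \;\approx\; \sum_{i} E(F_i) \;-\; \sum_{i<j} E(F_i \cap F_j)
```

If three or more clusters shared a common region, an inclusion-exclusion treatment would add higher-order overlap terms with alternating signs; the text above describes only the two terms shown.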

The other electronic structure based approach was that of Neumann, Leusen and Kendrick, who used periodic density-functional theory (DFT) calculations for their final energy minimizations. An important aspect of their calculations was that the DFT approach was empirically corrected to account for the poor treatment of the dispersion attraction between molecules by DFT. By supplementing the DFT energy from the VASP program (Kresse & Hafner, 1993; Kresse & Furtmüller, 1996; Kresse & Joubert, 1999) with an empirically parametrized atom–atom C₆R⁻⁶ correction (Neumann & Perrin, 2005), the most successful predictions were obtained: all four crystal structures were predicted as the global minimum on the static potential-energy surface. The approach is by far the most computationally demanding of any of the methods used for the final optimization and evaluation of lattice energies, so could only be applied to a limited number of crystal structures for each molecule. Whereas many methods involve the optimization of thousands of putative structures, only between 32 and 100 structures were optimized for each of the four targets. Therefore, another key component of the method was the use of highly accurate tailor-made force fields (Neumann, 2008) to provide an initial ranking of the structures and ensure that the observed structures were within this relatively small set.
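
The idea of adding an atom–atom C₆R⁻⁶ term to a DFT energy can be sketched as follows. This is a deliberately simplified illustration and not the published parameterization: the sum is undamped and runs over a finite set of atoms rather than a periodic lattice, the geometric-mean combination rule for pair coefficients is an assumption, and the coefficients in the usage example are hypothetical numbers in arbitrary units.

```python
import itertools
import math


def c6_dispersion(positions, c6):
    """Undamped pairwise sum of -C6_ij / R^6 over all distinct atom pairs.

    positions: list of (x, y, z) tuples; c6: per-atom coefficients.
    Pair coefficients use a geometric mean (an assumption of this sketch).
    """
    energy = 0.0
    for (i, r_i), (j, r_j) in itertools.combinations(enumerate(positions), 2):
        r = math.dist(r_i, r_j)
        c6_ij = math.sqrt(c6[i] * c6[j])
        energy -= c6_ij / r**6
    return energy


def corrected_energy(e_dft, positions, c6):
    """DFT energy supplemented by the empirical dispersion term."""
    return e_dft + c6_dispersion(positions, c6)


# Hypothetical two-atom example.
atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
coeffs = [4.0, 9.0]
print(corrected_energy(-1.0, atoms, coeffs))
```

A practical scheme would also damp the correction at short range, where the density functional already describes the interaction, and would sum over periodic images.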

Overall, we find that there is a rough correlation between the sophistication of the method used for the final energy ranking and the rate of success in this blind test. In fact, all successful predictions in this blind test were achieved by using more sophisticated methods for the final lattice-energy minimization than the traditional force fields with isotropic atom–atom interactions and atomic point charges.

Of course, the ranking of the stability of crystal structures should be based on calculated free energies, including contributions from lattice dynamics, rather than the potential energies of static configurations of atoms. However, the results here suggest that such dynamic contributions to the free energy are unimportant for the molecules included in CSP2007; those calculations that did include a lattice dynamical calculation of the free energy did not result in important re-rankings of structures. For the molecules studied here, the entropy differences between crystal structures were typically smaller than the lattice-energy differences.
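
For reference, a common way of including lattice dynamics is the harmonic approximation, in which the vibrational contribution is summed over phonon modes. The expression below is a standard textbook form given as a sketch; the source does not state which expression each group used, and the symbols $E_{\text{latt}}$ and $\omega_{k}$ (mode frequencies) are introduced here for illustration.

```latex
F(T) \;=\; E_{\text{latt}} \;+\; \sum_{k} \left[ \frac{\hbar\omega_{k}}{2}
      \;+\; k_{\mathrm{B}} T \,\ln\!\left( 1 - e^{-\hbar\omega_{k}/k_{\mathrm{B}} T} \right) \right]
```

The first term in the sum is the zero-point energy, and the second carries the temperature dependence, including the vibrational entropy.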

5.4. Kinetics

As in the first three blind tests, the ranking of crystal structures has been based almost exclusively on calculated energies. Kinetic considerations are still fairly poorly understood: we know that growth conditions can be used to select polymorphs, but do not have a good understanding of how often observed crystal structures do not correspond to the lowest-energy possibility. Attempts have been made in the past to consider the kinetics of crystal growth in crystal structure prediction studies. The only major use of non-energetic criteria in this blind test was the synthon-based approach of Desiraju and Thakur, who considered hydrogen-bond motifs in the final ranking of crystal structures of the cocrystal (XV). The analysis of the arrangement of molecules in cocrystals of similar molecules in the CSD and in an in-house library of crystal structures (Thakur & Desiraju, 2008) highlighted the linear tetramer seen in the observed crystal structure (Fig. 4) as the most likely hydrogen-bond configuration for this pair of molecules. Therefore, predicted structures containing this synthon were given preference in the ranking. Statistics on the occurrence of hydrogen-bond motifs in known crystal structures will contain both energetic and kinetic information; observed crystal structures must be stable as well as kinetically accessible. Therefore, using such information in the ranking of predicted crystal structures could be a way of including some kinetic information. In this case, the approach successfully predicted the hydrogen-bonding motif. However, the observed structure was not present in the list, perhaps because of a poor choice of electrostatic model.³ One might conclude that the analysis of crystal structures of similar molecules (or pairs of molecules, in the case of a cocrystal) can give valuable predictive information, but the requirement of a robust search method and accurate energy model remains.

The results presented in this blind test demonstrate important steps forward in the calculation of relative energies and, if we can reliably get these right, then we have a necessary starting point for considering the, perhaps less important, effects of entropy, nucleation and crystal growth on the crystal structure that results from a particular experiment. We were perhaps somewhat lucky in this blind test that the four target crystal structures all appear to be the most stable form on the potential energy surface, i.e. static T = 0 K energies.

6. Summary

Blind tests of crystal structure prediction of organic molecules have been carried out periodically over the past decade, serving as an objective evaluation of the state-of-the-art in crystal structure prediction methods, and we believe that they are useful for those monitoring progress in the field. The latest such international investigation, CSP2007, has revealed major steps forward. Given the molecular diagrams of four targets (three molecules and the two components of a cocrystal), 13 successful predictions were achieved by the 14 participating research groups; here, a success means that a very good representation of the observed crystal was included in three allowed predictions per target. Amongst these successes were three predictions of the crystal structure of the flexible molecule – a category for which there has only been one correct prediction over the first three blind tests – and two predictions of the structure of a cocrystal of small molecules.

Much of the improved success over previous blind tests can be associated with improvements in how we calculate the relative energies of putative crystal structures. All successful predictions in this blind test were achieved by going beyond standard force fields, although several very different approaches are represented in the successful methods: dispersion-corrected periodic density-functional theory (Neumann, Leusen, Kendrick), empirically and non-empirically derived anisotropic atom–atom model potentials (Ammon; van Eijck; Day, Cooper, Cruz Cabeza, Hejczyk; Price, Misquitta, Karamertzanis & Welch), core-shell electrostatic models (Boerrigter, Tan) and the semi-classical density sums (SCDS-PIXEL) approach (Schweizer). The evidence here and in the published literature strongly suggests that off-the-shelf force fields will not be generally successful for the final energy ranking in crystal structure prediction.

The most successful results were achieved by applying plane-wave density-functional theory calculations, supplemented by an empirically derived C₆-dispersion energy term, to the final energy minimization and energy ranking of the putative crystal structures. While the method comes at an appreciable computational cost, all four observed crystal structures were predicted, each as the first ranked structure; such consistently correct predictions have not previously been achieved under blind test conditions. These impressive results support the approach of pursuing increasingly accurate relative lattice energies as worthwhile for crystal structure prediction, and this method has proven to be sufficiently accurate for the challenges presented in CSP2007.

These results demonstrate that the crystal structures of small rigid organic molecules are predictable, within the restrictions of the blind test. The molecules represented in the blind test are all quite small and have very little conformational flexibility; even the molecule chosen to represent the category designed to test methods for dealing with flexibility turned out to be fairly rigid. Furthermore, the contents of the asymmetric unit are known: Z′ ≤ 1 was given, as well as the stoichiometric ratio and non-ionicity of components in the cocrystal. Therefore, the whole challenge of crystal structure prediction of organic molecules is certainly not solved and the area is open to further development. However, even within these limits of applicability tested here, the results that have been achieved by some of the methods are of a sufficient level to be constructively applied to a number of problems in materials design and to improving our understanding of crystallization and polymorphism. Furthermore, we believe that it will be possible to extend some of the current methods to the successful prediction of the crystal structures of even more complex systems, broadening the possible areas of application of such methods.

Supporting information

Experimentally determined crystal structures. DOI: 10.1107/S0108768109004066/bk5081sup1.cif

Predicted crystal structures from each participant, for each target. DOI: 10.1107/S0108768109004066/bk5081sup2.cif

Discussion of results and methodology. DOI: 10.1107/S0108768109004066/bk5081sup3.pdf

Pdf of cif of experimentally determined crystal structures. DOI: 10.1107/S0108768109004066/bk5081sup4.pdf

Pdf of cif of predicted crystal structures. DOI: 10.1107/S0108768109004066/bk5081sup5.pdf

Computing details

Data collection: SMART (Siemens, 1993) for expt_XII; SMART (Bruker, 2002) for expt_XIII; Bruker SMART for expt_XV. Cell refinement: SAINT (Siemens, 1995) for expt_XII; SAINT (Bruker, 2002) for expt_XIII; Bruker SMART for expt_XV. Data reduction: SAINT (Siemens, 1995) for expt_XII; SAINT for expt_XIII; Bruker SHELXTL for expt_XV. Program(s) used to solve structure: SHELXS 86 (Sheldrick, 1986) for expt_XII; SHELXTL (Sheldrick, 1997) for expt_XIII; SHELXS97 (Sheldrick, 1990) for expt_XIV, expt_XV. Program(s) used to refine structure: CRYSTALS (Betteridge et al., 2003) for expt_XII; SHELXTL for expt_XIII; SHELXL97 (Sheldrick, 1997) for expt_XIV, expt_XV. Molecular graphics: CAMERON (Watkin et al., 1996) for expt_XII; SHELXTL for expt_XIII; Bruker SHELXTL for expt_XV. Software used to prepare material for publication: CRYSTALS (Betteridge et al., 2003) for expt_XII; SHELXTL for expt_XIII; Bruker SHELXTL for expt_XV.

(expt_XII)

Crystal data

C₃H₄O	F(000) = 240
M_r = 56.06	D_x = 1.152 Mg m⁻³
Orthorhombic, Pbca	Mo Kα radiation, λ = 0.71073 Å
Hall symbol: -P 2ac 2ab	Cell parameters from 950 reflections
a = 6.970 (3) Å	θ = 4–23.5°
b = 9.514 (5) Å	µ = 0.09 mm⁻¹
c = 9.752 (5) Å	T = 150 K
V = 646.7 (5) Å³	Cylinder, colourless
Z = 8	2.00 × 0.43 × 0.43 mm
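
The tabulated cell volume and calculated density follow directly from the cell parameters, Z and M_r. The sketch below, given as an illustrative consistency check rather than the procedure used by the refinement software, applies the general triclinic volume formula and the usual density relation; the function names are introduced here for illustration.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit-cell volume in Å^3 from lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def density(z, m_r, volume):
    """Calculated density in g cm^-3 (equal to Mg m^-3); volume in Å^3."""
    return z * m_r / (AVOGADRO * volume * 1e-24)


# expt_XII: orthorhombic, so all angles are 90 degrees.
v_xii = cell_volume(6.970, 9.514, 9.752)
print(f"V = {v_xii:.1f} Å^3, D_x = {density(8, 56.06, v_xii):.3f} Mg m^-3")
```

The same functions apply to the monoclinic cells reported for the other targets by passing the tabulated β angle.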

Data collection

Bruker SMART diffractometer	490 reflections with I > 2.0σ(I)
Graphite monochromator	R_int = 0.029
ω scans	θ_max = 25.0°, θ_min = 4.2°
Absorption correction: multi-scan SADABS	h = −8→8
T_min = 0.79, T_max = 0.96	k = −11→11
2237 measured reflections	l = 0→11
570 independent reflections

Refinement

Refinement on F²	Primary atom site location: structure-invariant direct methods
Least-squares matrix: full	Hydrogen site location: difference Fourier map
R[F² > 2σ(F²)] = 0.045	Restrained refall
wR(F²) = 0.120	Method = Modified Sheldrick w = 1/[σ²(F²) + (0.02P)² + 0.0P], where P = (max(F_o²,0) + 2F_c²)/3
S = 1.40	(Δ/σ)_max = 0.014
570 reflections	Δρ_max = 0.13 e Å⁻³
50 parameters	Δρ_min = −0.13 e Å⁻³
4 restraints

Fractional atomic coordinates and isotropic or equivalent isotropic displacement parameters (Å²)

	x	y	z	U_iso*/U_eq
C1	0.8803 (3)	0.2131 (2)	0.4800 (2)	0.0535
C2	0.9203 (3)	0.1717 (2)	0.3568 (2)	0.0464
C3	0.8581 (3)	0.0378 (2)	0.3071 (2)	0.0505
O4	0.8889 (2)	−0.00711 (14)	0.19511 (17)	0.0666
H5	0.785 (3)	−0.0142 (19)	0.3756 (17)	0.068 (4)*
H11	0.807 (3)	0.1547 (18)	0.5407 (17)	0.068 (4)*
H12	0.924 (3)	0.3025 (14)	0.516 (2)	0.068 (4)*
H2	0.989 (3)	0.2300 (18)	0.2936 (16)	0.068 (4)*

Atomic displacement parameters (Å²)

	U¹¹	U²²	U³³	U¹²	U¹³	U²³
C1	0.0516 (13)	0.0568 (14)	0.0521 (13)	0.0067 (10)	−0.0020 (9)	0.0022 (11)
C2	0.0470 (11)	0.0439 (11)	0.0484 (12)	−0.0021 (9)	−0.0005 (9)	0.0043 (9)
C3	0.0436 (12)	0.0455 (11)	0.0623 (16)	0.0023 (10)	−0.0080 (11)	0.0032 (10)
O4	0.0734 (12)	0.0573 (11)	0.0690 (13)	0.0042 (8)	−0.0069 (9)	−0.0164 (8)
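
The U_eq values listed with the fractional coordinates can be cross-checked against these anisotropic parameters. For a cell with orthogonal axes, such as the orthorhombic cell of expt_XII, U_eq reduces to one third of the trace of the U tensor. The sketch below is an illustrative check under that assumption only; it does not apply to non-orthogonal cells, where the off-diagonal terms and cell angles also enter.

```python
def u_eq_orthogonal(u11: float, u22: float, u33: float) -> float:
    """Equivalent isotropic displacement parameter for orthogonal axes."""
    return (u11 + u22 + u33) / 3.0


# Diagonal U^ii values (Å^2) for expt_XII, taken from the table above.
u_diag = {
    "C1": (0.0516, 0.0568, 0.0521),
    "C2": (0.0470, 0.0439, 0.0484),
    "C3": (0.0436, 0.0455, 0.0623),
    "O4": (0.0734, 0.0573, 0.0690),
}
for atom, diag in u_diag.items():
    print(f"{atom}: U_eq = {u_eq_orthogonal(*diag):.4f} Å^2")
```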

Geometric parameters (Å, °)

C1—C2	1.295 (3)	C2—H2	0.958 (9)
C1—H11	0.960 (9)	C3—O4	1.192 (2)
C1—H12	0.968 (9)	C3—H5	0.975 (9)
C2—C3	1.430 (3)

C2—C1—H11	120.7 (13)	C3—C2—H2	116.7 (12)
C2—C1—H12	122.4 (13)	C2—C3—O4	125.1 (2)
H11—C1—H12	116.9 (18)	C2—C3—H5	112.2 (13)
C1—C2—C3	121.4 (2)	O4—C3—H5	122.7 (13)
C1—C2—H2	121.9 (12)
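
The bond lengths above can be recovered from the fractional coordinates and the cell lengths. The sketch below is a minimal illustration that is valid only for orthogonal cells and for atoms in the same asymmetric unit (no symmetry operations are applied); the function name is introduced here for illustration.

```python
import math

CELL_XII = (6.970, 9.514, 9.752)  # a, b, c in Å; all angles are 90 degrees


def distance_orthogonal(frac1, frac2, cell):
    """Interatomic distance (Å) from fractional coordinates, orthogonal cell."""
    return math.sqrt(
        sum((length * (p - q)) ** 2 for length, p, q in zip(cell, frac1, frac2))
    )


# Fractional coordinates for expt_XII, taken from the coordinate table.
c1 = (0.8803, 0.2131, 0.4800)
c2 = (0.9203, 0.1717, 0.3568)
c3 = (0.8581, 0.0378, 0.3071)
o4 = (0.8889, -0.00711, 0.19511)

for label, p, q in (("C1-C2", c1, c2), ("C2-C3", c2, c3), ("C3-O4", c3, o4)):
    print(f"{label}: {distance_orthogonal(p, q, CELL_XII):.3f} Å")
```

For the monoclinic cells of the other targets, the fractional differences would first have to be transformed with the full metric tensor.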

(expt_XIII) 2,6-dibromo-1-chloro-4-fluorobenzene

Crystal data

C₆H₂Br₂ClF	F(000) = 536
M_r = 288.35	D_x = 2.528 Mg m⁻³
Monoclinic, P2₁/c	Mo Kα radiation, λ = 0.71073 Å
a = 3.8943 (5) Å	Cell parameters from 2210 reflections
b = 13.5109 (17) Å	θ = 2.8–23.3°
c = 14.4296 (17) Å	µ = 10.98 mm⁻¹
β = 93.636 (2)°	T = 173 K
V = 757.69 (16) Å³	Needle, colorless
Z = 4	0.50 × 0.15 × 0.10 mm

Data collection

Bruker SMART area detector diffractometer	1091 independent reflections
Radiation source: fine-focus sealed tube	1011 reflections with I > 2σ(I)
Graphite monochromator	R_int = 0.040
ω scans	θ_max = 23.3°, θ_min = 2.1°
Absorption correction: multi-scan SADABS; Sheldrick, 1996; Blessing, 1995	h = −2→4
T_min = 0.16, T_max = 0.33	k = −14→15
3351 measured reflections	l = −16→14

Refinement

Refinement on F²	Primary atom site location: structure-invariant direct methods
Least-squares matrix: full	Secondary atom site location: difference Fourier map
R[F² > 2σ(F²)] = 0.036	Hydrogen site location: inferred from neighbouring sites
wR(F²) = 0.089	H-atom parameters constrained
S = 1.08	w = 1/[σ²(F_o²) + (0.062P)² + 0.024P] where P = (F_o² + 2F_c²)/3
1091 reflections	(Δ/σ)_max = 0.001
92 parameters	Δρ_max = 0.64 e Å⁻³
0 restraints	Δρ_min = −1.01 e Å⁻³

Fractional atomic coordinates and isotropic or equivalent isotropic displacement parameters (Å²)

	x	y	z	U_iso*/U_eq	Occ. (<1)
Br2	1.01129 (14)	0.14882 (3)	0.90258 (3)	0.0244 (2)
Br6	0.39102 (13)	0.40777 (4)	0.62652 (3)	0.0249 (2)
Cl1	0.6619 (4)	0.35899 (9)	0.83886 (9)	0.0288 (5)	1.000 (5)
F4	0.7773 (10)	0.0514 (2)	0.5594 (2)	0.0455 (10)
C1	0.6950 (12)	0.2676 (3)	0.7563 (3)	0.0189 (11)
C2	0.8441 (12)	0.1773 (3)	0.7801 (3)	0.0193 (10)
C3	0.8718 (13)	0.1031 (3)	0.7138 (4)	0.0240 (12)
H3	0.9706	0.0407	0.7300	0.029*
C4	0.7523 (15)	0.1231 (4)	0.6244 (3)	0.0264 (12)
C5	0.6035 (14)	0.2116 (4)	0.5968 (3)	0.0258 (12)
H5	0.5198	0.2226	0.5344	0.031*
C6	0.5814 (13)	0.2838 (3)	0.6641 (3)	0.0206 (11)

Atomic displacement parameters (Å²)

	U¹¹	U²²	U³³	U¹²	U¹³	U²³
Br2	0.0240 (4)	0.0266 (4)	0.0223 (4)	−0.0001 (2)	−0.0006 (2)	0.00311 (19)
Br6	0.0253 (4)	0.0228 (4)	0.0265 (4)	0.0040 (2)	0.0022 (2)	0.0032 (2)
Cl1	0.0403 (9)	0.0230 (8)	0.0232 (8)	0.0015 (5)	0.0036 (6)	−0.0046 (5)
F4	0.080 (3)	0.0305 (18)	0.0273 (18)	0.0153 (18)	0.0117 (17)	−0.0075 (14)
C1	0.020 (3)	0.019 (2)	0.018 (3)	−0.001 (2)	0.009 (2)	0.001 (2)
C2	0.011 (2)	0.027 (3)	0.020 (3)	−0.002 (2)	0.003 (2)	0.004 (2)
C3	0.026 (3)	0.019 (3)	0.028 (3)	0.006 (2)	0.014 (2)	−0.001 (2)
C4	0.037 (3)	0.020 (3)	0.024 (3)	0.001 (2)	0.008 (2)	−0.006 (2)
C5	0.029 (3)	0.031 (3)	0.017 (3)	0.000 (2)	0.003 (2)	−0.001 (2)
C6	0.016 (3)	0.022 (3)	0.025 (3)	−0.002 (2)	0.006 (2)	0.005 (2)

Geometric parameters (Å, °)

Br2—C2	1.884 (5)	C2—C3	1.395 (7)
Br6—C6	1.897 (5)	C3—C4	1.370 (8)
Cl1—C1	1.727 (5)	C3—H3	0.9500
F4—C4	1.355 (6)	C4—C5	1.377 (7)
C1—C2	1.385 (6)	C5—C6	1.382 (7)
C1—C6	1.393 (7)	C5—H5	0.9500

C2—C1—C6	118.6 (4)	F4—C4—C3	118.4 (5)
C2—C1—Cl1	120.5 (4)	F4—C4—C5	117.9 (5)
C6—C1—Cl1	120.9 (4)	C3—C4—C5	123.7 (5)
C1—C2—C3	120.8 (5)	C4—C5—C6	117.0 (5)
C1—C2—Br2	121.8 (4)	C4—C5—H5	121.5
C3—C2—Br2	117.4 (4)	C6—C5—H5	121.5
C4—C3—C2	117.9 (5)	C5—C6—C1	122.0 (5)
C4—C3—H3	121.1	C5—C6—Br6	117.6 (4)
C2—C3—H3	121.1	C1—C6—Br6	120.4 (4)

(expt_XIV)

Crystal data

C₁₀H₁₀N₂S₃	V = 1142.4 (4) Å³
M_r = 254.38	Z = 4
?, ?	F(000) = 528
a = 13.060 (3) Å	D_x = 1.479 Mg m⁻³
b = 9.738 (2) Å	Mo Kα radiation, λ = 0.71073 Å
c = 9.335 (2) Å	µ = 0.62 mm⁻¹
α = 90°	T = 150 K
β = 105.800 (3)°	× × mm
γ = 90°

Data collection

Radiation source: fine-focus sealed tube	R_int = 0.041
Graphite monochromator	θ_max = 27.5°, θ_min = 1.6°
9769 measured reflections	h = −16→16
2607 independent reflections	k = −12→12
1919 reflections with I > 2σ(I)	l = −12→12

Refinement

Refinement on F²	Primary atom site location: structure-invariant direct methods
Least-squares matrix: full	Secondary atom site location: difference Fourier map
R[F² > 2σ(F²)] = 0.032	Hydrogen site location: inferred from neighbouring sites
wR(F²) = 0.077	H atoms treated by a mixture of independent and constrained refinement
S = 0.99	w = 1/[σ²(F_o²) + (0.037P)²] where P = (F_o² + 2F_c²)/3
2607 reflections	(Δ/σ)_max = 0.001
138 parameters	Δρ_max = 0.34 e Å⁻³
0 restraints	Δρ_min = −0.25 e Å⁻³

Special details

Geometry. All e.s.d.'s (except the e.s.d. in the dihedral angle between two l.s. planes) are estimated using the full covariance matrix. The cell e.s.d.'s are taken into account individually in the estimation of e.s.d.'s in distances, angles and torsion angles; correlations between e.s.d.'s in cell parameters are only used when they are defined by crystal symmetry. An approximate (isotropic) treatment of cell e.s.d.'s is used for estimating e.s.d.'s involving l.s. planes.

Refinement. Refinement of F² against ALL reflections. The weighted R-factor wR and goodness of fit S are based on F², conventional R-factors R are based on F, with F set to zero for negative F². The threshold expression of F² > σ(F²) is used only for calculating R-factors(gt) etc. and is not relevant to the choice of reflections for refinement. R-factors based on F² are statistically about twice as large as those based on F, and R-factors based on ALL data will be even larger.
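
The agreement factors described in this note can be written out explicitly. The sketch below is a hedged illustration, not the refinement program's code: it assumes the conventional definitions R = Σ||F_o| − |F_c||/Σ|F_o| (over reflections passing the threshold), wR = {Σ[w(F_o² − F_c²)²]/Σ[w(F_o²)²]}^½ and S = {Σ[w(F_o² − F_c²)²]/(n − p)}^½, together with the weighting scheme quoted for (XIV). The function names, the `sigma_cutoff` argument and the sample reflections are invented for illustration only.

```python
import math


def shelx_weight(fo2, fc2, sigma_fo2, a=0.037):
    """w = 1/[sigma^2(Fo^2) + (a*P)^2] with P = (Fo^2 + 2*Fc^2)/3."""
    p = (fo2 + 2.0 * fc2) / 3.0
    return 1.0 / (sigma_fo2 ** 2 + (a * p) ** 2)


def agreement_factors(reflections, n_params, a=0.037, sigma_cutoff=2.0):
    """Return (R, wR, S) for a list of (Fo^2, Fc^2, sigma(Fo^2)) tuples.

    wR and S use every reflection; R uses only Fo^2 > cutoff * sigma(Fo^2).
    """
    num_r = den_r = 0.0
    num_w = den_w = 0.0
    for fo2, fc2, sig in reflections:
        w = shelx_weight(fo2, fc2, sig, a)
        num_w += w * (fo2 - fc2) ** 2
        den_w += w * fo2 ** 2
        if fo2 > sigma_cutoff * sig:
            fo = math.sqrt(fo2)
            fc = math.sqrt(max(fc2, 0.0))
            num_r += abs(fo - fc)
            den_r += fo
    n = len(reflections)
    r = num_r / den_r
    wr = math.sqrt(num_w / den_w)
    s = math.sqrt(num_w / (n - n_params))
    return r, wr, s


# Made-up reflections, purely to exercise the functions.
ILLUSTRATIVE = [(100.0, 96.0, 2.0), (50.0, 52.0, 1.5), (10.0, 9.0, 1.0), (0.5, 1.0, 1.0)]
r, wr, s = agreement_factors(ILLUSTRATIVE, n_params=1)
print(f"R = {r:.3f}, wR = {wr:.3f}, S = {s:.2f}")
```

Because wR is built from F² differences while R is built from F differences, wR comes out larger than R for the same data, which is the point the note makes.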

Fractional atomic coordinates and isotropic or equivalent isotropic displacement parameters (Å²)

	x	y	z	U_iso*/U_eq
S1	0.83433 (4)	0.73937 (5)	0.06977 (6)	0.02633 (14)
S2	0.64445 (4)	1.15798 (5)	0.13523 (6)	0.02746 (14)
S3	0.93182 (4)	0.96808 (5)	0.27809 (6)	0.02901 (14)
N4	0.76679 (12)	0.98549 (15)	0.03162 (16)	0.0200 (3)
C5	0.84298 (15)	0.9099 (2)	0.1279 (2)	0.0227 (4)
C6	0.61949 (15)	0.9667 (2)	−0.2038 (2)	0.0254 (4)
H6A	0.6031	1.0619	−0.2084	0.030*
N7	0.80564 (13)	1.21880 (15)	0.02127 (17)	0.0225 (4)
C8	0.70002 (14)	0.91293 (19)	−0.0888 (2)	0.0209 (4)
C9	0.66864 (16)	0.6838 (2)	−0.1916 (2)	0.0265 (5)
H9A	0.6847	0.5885	−0.1872	0.032*
C10	0.72610 (15)	0.77374 (19)	−0.0831 (2)	0.0224 (4)
C11	0.89594 (16)	1.1863 (2)	−0.0386 (2)	0.0291 (5)
H11A	0.9627	1.1995	0.0394	0.044*
H11B	0.8946	1.2471	−0.1227	0.044*
H11C	0.8907	1.0906	−0.0722	0.044*
C12	0.74426 (15)	1.12697 (19)	0.0614 (2)	0.0210 (4)
C13	0.56371 (16)	0.8763 (2)	−0.3118 (2)	0.0295 (5)
H13A	0.5079	0.9103	−0.3919	0.035*
C14	0.78734 (16)	1.36505 (19)	0.0429 (2)	0.0291 (5)
H14A	0.7121	1.3867	−0.0012	0.044*
H14B	0.8312	1.4204	−0.0052	0.044*
H14C	0.8066	1.3857	0.1497	0.044*
C16	0.58781 (16)	0.7371 (2)	−0.3054 (2)	0.0289 (5)
H16A	0.5479	0.6776	−0.3808	0.035*
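
Interatomic distances follow from the fractional coordinates above once they are converted to Cartesian axes using the cell constants. The sketch below shows one way to do this for a monoclinic cell with b unique; the axis convention (a along X, b along Y) and the function names are assumptions made for the example, and no e.s.d. propagation is attempted.

```python
import math

# Cell constants reported for (XIV); alpha = gamma = 90 degrees.
A, B, C = 13.060, 9.738, 9.335   # angstrom
BETA = math.radians(105.800)


def to_cartesian(frac):
    """Fractional -> Cartesian (angstrom), with a along X and b along Y."""
    x, y, z = frac
    return (A * x + C * z * math.cos(BETA), B * y, C * z * math.sin(BETA))


def distance(p, q):
    """Distance in angstrom between two sets of fractional coordinates."""
    return math.dist(to_cartesian(p), to_cartesian(q))


# Coordinates copied from the table above.
ATOMS = {
    "S1": (0.83433, 0.73937, 0.06977),
    "N4": (0.76679, 0.98549, 0.03162),
    "C5": (0.84298, 0.9099, 0.1279),
    "C12": (0.74426, 1.12697, 0.0614),
}

for first, second in [("S1", "C5"), ("N4", "C5"), ("N4", "C12")]:
    d = distance(ATOMS[first], ATOMS[second])
    print(f"{first}-{second}: {d:.3f} A")
```

The three distances should agree with the tabulated bond lengths for these atom pairs (1.741, 1.361 and 1.451 Å) to within the quoted uncertainties.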

Atomic displacement parameters (Å²)

	U¹¹	U²²	U³³	U¹²	U¹³	U²³
S1	0.0283 (3)	0.0192 (3)	0.0300 (3)	0.0049 (2)	0.0054 (2)	−0.0009 (2)
S2	0.0280 (3)	0.0237 (3)	0.0329 (3)	0.0048 (2)	0.0121 (2)	0.0027 (2)
S3	0.0266 (3)	0.0292 (3)	0.0269 (3)	0.0035 (2)	−0.0001 (2)	−0.0022 (2)
N4	0.0220 (9)	0.0154 (8)	0.0217 (8)	0.0012 (6)	0.0046 (7)	−0.0005 (6)
C5	0.0218 (10)	0.0219 (10)	0.0259 (10)	0.0026 (8)	0.0093 (8)	0.0006 (8)
C6	0.0253 (11)	0.0252 (11)	0.0256 (10)	0.0002 (8)	0.0068 (9)	0.0003 (8)
N7	0.0260 (9)	0.0169 (8)	0.0237 (9)	−0.0001 (7)	0.0053 (7)	0.0005 (7)
C8	0.0205 (10)	0.0217 (10)	0.0223 (10)	−0.0017 (8)	0.0086 (8)	−0.0015 (8)
C9	0.0289 (11)	0.0206 (10)	0.0330 (11)	−0.0045 (9)	0.0138 (9)	−0.0037 (9)
C10	0.0241 (10)	0.0218 (10)	0.0228 (10)	0.0013 (8)	0.0090 (8)	0.0009 (8)
C11	0.0281 (11)	0.0297 (11)	0.0324 (11)	−0.0041 (9)	0.0129 (10)	0.0022 (9)
C12	0.0230 (10)	0.0197 (10)	0.0181 (9)	0.0016 (8)	0.0020 (8)	0.0018 (8)
C13	0.0240 (11)	0.0376 (12)	0.0257 (11)	−0.0014 (9)	0.0047 (9)	0.0013 (9)
C14	0.0354 (12)	0.0186 (10)	0.0316 (11)	−0.0007 (9)	0.0063 (10)	0.0018 (9)
C16	0.0260 (11)	0.0333 (12)	0.0287 (11)	−0.0104 (9)	0.0094 (9)	−0.0078 (9)
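
The U_eq values in the coordinate table can be recovered from the anisotropic parameters listed here. The sketch below assumes the usual definition of U_eq as one third of the trace of the orthogonalized Uⁱʲ tensor; for a monoclinic cell (α = γ = 90°) the U¹² and U²³ terms drop out, leaving U_eq = ⅓[(U¹¹ + U³³ + 2U¹³cos β)/sin²β + U²²]. The function name is illustrative.

```python
import math

BETA = math.radians(105.800)   # monoclinic angle reported for (XIV)


def u_eq_monoclinic(u11, u22, u33, u13, beta=BETA):
    """Equivalent isotropic U (angstrom^2) for a monoclinic, b-unique cell."""
    sin_b, cos_b = math.sin(beta), math.cos(beta)
    return ((u11 + u33 + 2.0 * u13 * cos_b) / sin_b ** 2 + u22) / 3.0


# U11, U22, U33, U13 copied from the table above.
u_eq_s1 = u_eq_monoclinic(0.0283, 0.0192, 0.0300, 0.0054)
u_eq_n4 = u_eq_monoclinic(0.0220, 0.0154, 0.0217, 0.0046)

print(f"U_eq(S1) = {u_eq_s1:.4f}, U_eq(N4) = {u_eq_n4:.4f}")
```

Small last-digit differences from the tabulated U_eq values are expected, because the Uⁱʲ entries are themselves rounded.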

Geometric parameters (Å, °)

S1—C5	1.741 (2)	C8—C10	1.395 (3)
S1—C10	1.746 (2)	C9—C16	1.379 (3)
S2—C12	1.6599 (19)	C9—C10	1.393 (3)
S3—C5	1.657 (2)	C9—H9A	0.9500
N4—C5	1.361 (2)	C11—H11A	0.9800
N4—C8	1.410 (2)	C11—H11B	0.9800
N4—C12	1.451 (2)	C11—H11C	0.9800
C6—C8	1.385 (3)	C13—C16	1.389 (3)
C6—C13	1.387 (3)	C13—H13A	0.9500
C6—H6A	0.9500	C14—H14A	0.9800
N7—C12	1.321 (2)	C14—H14B	0.9800
N7—C14	1.467 (2)	C14—H14C	0.9800
N7—C11	1.470 (2)	C16—H16A	0.9500

C5—S1—C10	92.36 (9)	N7—C11—H11A	109.5
C5—N4—C8	115.95 (16)	N7—C11—H11B	109.5
C5—N4—C12	122.47 (15)	H11A—C11—H11B	109.5
C8—N4—C12	120.96 (15)	N7—C11—H11C	109.5
N4—C5—S3	126.26 (15)	H11A—C11—H11C	109.5
N4—C5—S1	109.65 (14)	H11B—C11—H11C	109.5
S3—C5—S1	124.09 (12)	N7—C12—N4	114.71 (16)
C8—C6—C13	117.51 (19)	N7—C12—S2	126.88 (15)
C8—C6—H6A	121.2	N4—C12—S2	118.38 (13)
C13—C6—H6A	121.2	C6—C13—C16	121.32 (19)
C12—N7—C14	118.93 (16)	C6—C13—H13A	119.3
C12—N7—C11	124.97 (16)	C16—C13—H13A	119.3
C14—N7—C11	116.05 (16)	N7—C14—H14A	109.5
C6—C8—C10	121.50 (18)	N7—C14—H14B	109.5
C6—C8—N4	127.07 (18)	H14A—C14—H14B	109.5
C10—C8—N4	111.41 (16)	N7—C14—H14C	109.5
C16—C9—C10	118.11 (18)	H14A—C14—H14C	109.5
C16—C9—H9A	120.9	H14B—C14—H14C	109.5
C10—C9—H9A	120.9	C9—C16—C13	121.15 (19)
C9—C10—C8	120.36 (18)	C9—C16—H16A	119.4
C9—C10—S1	129.01 (15)	C13—C16—H16A	119.4
C8—C10—S1	110.62 (14)

C8—N4—C5—S3	180.00 (14)	C6—C8—C10—S1	177.26 (14)
C12—N4—C5—S3	−9.0 (3)	N4—C8—C10—S1	−1.2 (2)
C8—N4—C5—S1	0.2 (2)	C5—S1—C10—C9	−179.50 (19)
C12—N4—C5—S1	171.19 (13)	C5—S1—C10—C8	1.07 (15)
C10—S1—C5—N4	−0.70 (14)	C14—N7—C12—N4	178.37 (15)
C10—S1—C5—S3	179.46 (13)	C11—N7—C12—N4	−4.5 (3)
C13—C6—C8—C10	1.5 (3)	C14—N7—C12—S2	0.4 (3)
C13—C6—C8—N4	179.62 (18)	C11—N7—C12—S2	177.53 (15)
C5—N4—C8—C6	−177.65 (18)	C5—N4—C12—N7	85.3 (2)
C12—N4—C8—C6	11.2 (3)	C8—N4—C12—N7	−104.1 (2)
C5—N4—C8—C10	0.7 (2)	C5—N4—C12—S2	−96.55 (19)
C12—N4—C8—C10	−170.52 (16)	C8—N4—C12—S2	74.04 (19)
C16—C9—C10—C8	1.6 (3)	C8—C6—C13—C16	−0.2 (3)
C16—C9—C10—S1	−177.78 (15)	C10—C9—C16—C13	−0.3 (3)
C6—C8—C10—C9	−2.2 (3)	C6—C13—C16—C9	−0.4 (3)
N4—C8—C10—C9	179.36 (17)
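
The torsion angles listed above can be recomputed from the fractional coordinates of (XIV). The following sketch uses the standard atan2 form of the dihedral angle for four atoms, with bond vectors b₁, b₂, b₃ taken in sequence; the Cartesian axis convention and all function names are assumptions of the example rather than anything specified in the deposition.

```python
import math

A, B, C = 13.060, 9.738, 9.335   # cell lengths of (XIV), angstrom
BETA = math.radians(105.800)


def to_cartesian(frac):
    """Fractional -> Cartesian for a monoclinic, b-unique cell."""
    x, y, z = frac
    return (A * x + C * z * math.cos(BETA), B * y, C * z * math.sin(BETA))


def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def torsion(f1, f2, f3, f4):
    """Signed torsion angle (degrees) for four fractional positions."""
    r1, r2, r3, r4 = (to_cartesian(f) for f in (f1, f2, f3, f4))
    b1, b2, b3 = sub(r2, r1), sub(r3, r2), sub(r4, r3)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


C5 = (0.84298, 0.9099, 0.1279)
N4 = (0.76679, 0.98549, 0.03162)
C12 = (0.74426, 1.12697, 0.0614)
N7 = (0.80564, 1.21880, 0.02127)

angle = torsion(C5, N4, C12, N7)
print(f"C5-N4-C12-N7 = {angle:.1f} degrees")
```

The result should be close to the tabulated 85.3 (2)° for this torsion, which reflects the dimethylthiocarbamoyl group lying roughly perpendicular to the ring system.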

(expt_XV) 2-amino-4-methylpyrimidine, 2-methylbenzoic acid

Crystal data

(C₆H₇N₃)·(C₇H₈O₂)	F(000) = 520
M_r = 245.28	D_x = 1.301 Mg m⁻³
Monoclinic, P2₁/n	Mo Kα radiation, λ = 0.71073 Å
a = 7.2795 (10) Å	Cell parameters from 3341 reflections
b = 13.6699 (18) Å	θ = 2.2–27.4°
c = 12.6695 (16) Å	µ = 0.09 mm⁻¹
β = 96.646 (3)°	T = 203 K
V = 1252.3 (3) Å³	Prism, colorless
Z = 4	0.30 × 0.30 × 0.20 mm

Data collection

CCD area detector diffractometer	1791 reflections with I > 2σ(I)
Radiation source: fine-focus sealed tube	R_int = 0.053
Graphite monochromator	θ_max = 28.3°, θ_min = 2.2°
φ and ω scans	h = −9→9
10115 measured reflections	k = −17→17
2905 independent reflections	l = −16→16

Refinement

Refinement on F²	Secondary atom site location: difference Fourier map
Least-squares matrix: full	Hydrogen site location: inferred from neighbouring sites
R[F² > 2σ(F²)] = 0.046	H atoms treated by a mixture of independent and constrained refinement
wR(F²) = 0.122	w = 1/[σ²(F_o²) + (0.060P)²] where P = (F_o² + 2F_c²)/3
S = 1.01	(Δ/σ)_max = 0.001
2905 reflections	Δρ_max = 0.23 e Å⁻³
174 parameters	Δρ_min = −0.20 e Å⁻³
0 restraints	Extinction correction: SHELXL, Fc* = kFc[1 + 0.001xFc²λ³/sin(2θ)]^(−1/4)
Primary atom site location: structure-invariant direct methods	Extinction coefficient: 0.061 (5)
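
The extinction correction quoted in this block scales each calculated structure factor by a factor that depends on its magnitude and on the scattering angle. The sketch below simply evaluates the expression as printed, with x set to the refined extinction coefficient 0.061 and λ to the Mo Kα wavelength; treating k as an overall scale factor of 1, and the function name itself, are assumptions made for illustration.

```python
import math


def extinction_corrected_fc(fc, theta_deg, x=0.061, wavelength=0.71073, k=1.0):
    """Fc* = k*Fc*[1 + 0.001*x*Fc^2*lambda^3/sin(2*theta)]^(-1/4)."""
    two_theta = math.radians(2.0 * theta_deg)
    factor = 1.0 + 0.001 * x * fc ** 2 * wavelength ** 3 / math.sin(two_theta)
    return k * fc * factor ** -0.25


# Illustrative inputs only: a strong low-angle and a weak high-angle reflection.
for fc, theta in [(80.0, 5.0), (5.0, 25.0)]:
    corrected = extinction_corrected_fc(fc, theta)
    print(f"Fc = {fc:5.1f}, theta = {theta:4.1f} deg -> Fc* = {corrected:.3f}")
```

The correction matters most for strong reflections at low angle, where the bracketed term departs furthest from unity.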

Fractional atomic coordinates and isotropic or equivalent isotropic displacement parameters (Å²)

	x	y	z	U_iso*/U_eq	Occ. (<1)
N11	−0.2014 (2)	1.05629 (10)	0.04985 (10)	0.0382 (4)
C12	−0.1356 (2)	1.01550 (11)	0.14345 (12)	0.0330 (4)
N12	0.0361 (2)	0.97961 (12)	0.15313 (11)	0.0412 (4)
H12A	0.078 (2)	0.9469 (13)	0.2145 (14)	0.049*
H12B	0.089 (3)	0.9707 (13)	0.0959 (14)	0.049*
N13	−0.23154 (18)	1.00838 (10)	0.22833 (9)	0.0324 (3)
C14	−0.4026 (2)	1.04538 (11)	0.21942 (12)	0.0344 (4)
C15	−0.4765 (2)	1.09168 (13)	0.12732 (13)	0.0422 (4)
H15A	−0.5945	1.1204	0.1214	0.051*
C16	−0.3701 (2)	1.09386 (13)	0.04499 (13)	0.0419 (4)
H16A	−0.4197	1.1238	−0.0188	0.050*
C17	−0.5097 (2)	1.03200 (14)	0.31211 (13)	0.0449 (5)
H17A	−0.4338	0.9980	0.3685	0.067*	0.414 (19)
H17B	−0.6200	0.9938	0.2904	0.067*	0.414 (19)
H17C	−0.5449	1.0954	0.3375	0.067*	0.414 (19)
H17D	−0.6276	1.0654	0.2984	0.054*	0.586 (19)
H17E	−0.4403	1.0591	0.3754	0.054*	0.586 (19)
H17F	−0.5308	0.9628	0.3227	0.054*	0.586 (19)
C21	0.0623 (2)	0.82593 (11)	0.52929 (11)	0.0326 (4)
C22	0.2288 (2)	0.77798 (12)	0.56317 (12)	0.0353 (4)
C23	0.2467 (3)	0.73560 (12)	0.66401 (13)	0.0415 (4)
H23A	0.3585	0.7046	0.6892	0.050*
C24	0.1065 (3)	0.73768 (13)	0.72759 (13)	0.0448 (5)
H24A	0.1227	0.7077	0.7948	0.054*
C25	−0.0571 (3)	0.78332 (13)	0.69346 (13)	0.0455 (5)
H25A	−0.1541	0.7842	0.7364	0.055*
C26	−0.0777 (2)	0.82809 (13)	0.59511 (12)	0.0401 (4)
H26A	−0.1889	0.8607	0.5722	0.048*
C27	0.0273 (2)	0.87447 (12)	0.42353 (12)	0.0342 (4)
O27	−0.12808 (17)	0.92383 (10)	0.41108 (9)	0.0458 (4)
H27	−0.152 (2)	0.9546 (13)	0.3411 (15)	0.055*
O28	0.12949 (17)	0.86829 (10)	0.35444 (9)	0.0531 (4)
C28	0.3863 (2)	0.77007 (13)	0.49849 (13)	0.0429 (4)
H28A	0.4865	0.7340	0.5381	0.064*
H28B	0.4290	0.8351	0.4826	0.064*
H28C	0.3462	0.7360	0.4327	0.064*

Atomic displacement parameters (Å²)

	U¹¹	U²²	U³³	U¹²	U¹³	U²³
N11	0.0438 (9)	0.0391 (8)	0.0321 (7)	0.0035 (7)	0.0056 (6)	0.0049 (6)
C12	0.0355 (10)	0.0325 (9)	0.0312 (8)	−0.0017 (7)	0.0053 (7)	−0.0002 (6)
N12	0.0352 (9)	0.0571 (10)	0.0321 (8)	0.0048 (7)	0.0071 (7)	0.0081 (7)
N13	0.0335 (8)	0.0335 (8)	0.0306 (7)	−0.0001 (6)	0.0051 (6)	0.0010 (5)
C14	0.0355 (10)	0.0318 (9)	0.0360 (9)	0.0001 (7)	0.0047 (7)	−0.0043 (6)
C15	0.0421 (11)	0.0422 (10)	0.0418 (9)	0.0099 (8)	0.0032 (8)	−0.0011 (7)
C16	0.0504 (12)	0.0382 (10)	0.0359 (9)	0.0081 (9)	0.0000 (8)	0.0048 (7)
C17	0.0412 (11)	0.0513 (12)	0.0436 (10)	0.0061 (9)	0.0114 (8)	0.0007 (8)
C21	0.0376 (10)	0.0307 (9)	0.0294 (8)	−0.0016 (7)	0.0039 (7)	−0.0023 (6)
C22	0.0394 (10)	0.0300 (9)	0.0363 (8)	−0.0020 (8)	0.0038 (7)	−0.0026 (7)
C23	0.0473 (11)	0.0351 (10)	0.0410 (9)	0.0072 (8)	0.0005 (8)	0.0034 (7)
C24	0.0612 (13)	0.0392 (10)	0.0342 (9)	0.0050 (9)	0.0071 (9)	0.0059 (7)
C25	0.0538 (12)	0.0485 (11)	0.0367 (9)	0.0068 (9)	0.0158 (9)	0.0054 (8)
C26	0.0433 (11)	0.0435 (10)	0.0344 (9)	0.0048 (8)	0.0078 (8)	0.0009 (7)
C27	0.0345 (10)	0.0359 (9)	0.0319 (8)	−0.0007 (8)	0.0030 (7)	−0.0012 (6)
O27	0.0466 (8)	0.0591 (9)	0.0323 (6)	0.0137 (6)	0.0069 (6)	0.0102 (5)
O28	0.0456 (8)	0.0772 (10)	0.0387 (7)	0.0133 (7)	0.0139 (6)	0.0159 (6)
C28	0.0387 (10)	0.0501 (11)	0.0399 (9)	0.0037 (8)	0.0048 (8)	0.0020 (8)

Geometric parameters (Å, °)

N11—C16	1.326 (2)	C21—C22	1.401 (2)
N11—C12	1.347 (2)	C21—C27	1.490 (2)
C12—N12	1.335 (2)	C22—C23	1.395 (2)
C12—N13	1.3517 (18)	C22—C28	1.489 (2)
N13—C14	1.337 (2)	C23—C24	1.372 (2)
C14—C15	1.380 (2)	C24—C25	1.369 (2)
C14—C17	1.494 (2)	C25—C26	1.381 (2)
C15—C16	1.370 (2)	C27—O28	1.2155 (17)
C21—C26	1.390 (2)	C27—O27	1.3108 (19)

C16—N11—C12	115.73 (14)	C22—C21—C27	122.40 (14)
N12—C12—N11	117.81 (14)	C23—C22—C21	117.44 (15)
N12—C12—N13	117.70 (14)	C23—C22—C28	118.53 (16)
N11—C12—N13	124.49 (15)	C21—C22—C28	124.03 (14)
C14—N13—C12	117.88 (13)	C24—C23—C22	122.15 (17)
N13—C14—C15	120.84 (15)	C25—C24—C23	120.19 (16)
N13—C14—C17	116.94 (14)	C24—C25—C26	119.13 (16)
C15—C14—C17	122.21 (15)	C25—C26—C21	121.41 (17)
C16—C15—C14	117.04 (16)	O28—C27—O27	122.72 (15)
N11—C16—C15	123.95 (16)	O28—C27—C21	124.14 (15)
C26—C21—C22	119.64 (14)	O27—C27—C21	113.13 (13)
C26—C21—C27	117.96 (15)

C16—N11—C12—N12	−177.83 (15)	C27—C21—C22—C28	−0.1 (2)
C16—N11—C12—N13	2.2 (2)	C21—C22—C23—C24	1.7 (2)
N12—C12—N13—C14	179.13 (14)	C28—C22—C23—C24	−178.48 (16)
N11—C12—N13—C14	−0.9 (2)	C22—C23—C24—C25	−0.8 (3)
C12—N13—C14—C15	−1.7 (2)	C23—C24—C25—C26	−0.8 (3)
C12—N13—C14—C17	176.84 (14)	C24—C25—C26—C21	1.4 (3)
N13—C14—C15—C16	2.8 (2)	C22—C21—C26—C25	−0.4 (3)
C17—C14—C15—C16	−175.69 (15)	C27—C21—C26—C25	178.85 (16)
C12—N11—C16—C15	−0.9 (3)	C26—C21—C27—O28	−171.38 (16)
C14—C15—C16—N11	−1.4 (3)	C22—C21—C27—O28	7.9 (3)
C26—C21—C22—C23	−1.1 (2)	C26—C21—C27—O27	7.5 (2)
C27—C21—C22—C23	179.64 (14)	C22—C21—C27—O27	−173.29 (14)
C26—C21—C22—C28	179.10 (15)

Hydrogen-bond geometry (Å, °)

D—H···A	D—H	H···A	D···A	D—H···A
O27—H27···N13	0.979 (18)	1.650 (19)	2.6193 (17)	169.8 (16)
N12—H12A···O28	0.919 (18)	2.069 (18)	2.9802 (18)	170.8 (15)
N12—H12B···N11ⁱ	0.868 (17)	2.137 (18)	3.0032 (19)	176.0 (17)

Symmetry code: (i) −x, −y+2, −z.
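
The symmetry code identifies which copy of the acceptor atom takes part in the third hydrogen bond. As a hedged illustration, the sketch below applies operation (i) to N11 and recomputes two of the donor to acceptor distances from the fractional coordinates of (XV); the Cartesian convention (a along X, b along Y) and the function names are choices made for the example.

```python
import math

# Cell constants reported for (XV); monoclinic, b unique.
A, B, C = 7.2795, 13.6699, 12.6695   # angstrom
BETA = math.radians(96.646)


def to_cartesian(frac):
    x, y, z = frac
    return (A * x + C * z * math.cos(BETA), B * y, C * z * math.sin(BETA))


def symmetry_i(frac):
    """Symmetry code (i): -x, -y+2, -z."""
    x, y, z = frac
    return (-x, -y + 2.0, -z)


# Fractional coordinates copied from the table for (XV).
N11 = (-0.2014, 1.05629, 0.04985)
N12 = (0.0361, 0.97961, 0.15313)
N13 = (-0.23154, 1.00838, 0.22833)
O27 = (-0.12808, 0.92383, 0.41108)

d_n12_n11i = math.dist(to_cartesian(N12), to_cartesian(symmetry_i(N11)))
d_o27_n13 = math.dist(to_cartesian(O27), to_cartesian(N13))

print(f"N12...N11(i) = {d_n12_n11i:.3f} A")
print(f"O27...N13    = {d_o27_n13:.3f} A")
```

Both values should match the tabulated D···A distances (3.0032 and 2.6193 Å) to within rounding of the coordinates. Operation (i) is an inversion, so the N12—H12B···N11ⁱ contact links pairs of aminopyrimidine molecules across an inversion centre.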

Experimental details

	(expt_XII)	(expt_XIII)	(expt_XIV)	(expt_XV)
Crystal data
Chemical formula	C₃H₄O	C₆H₂Br₂ClF	C₁₀H₁₀N₂S₃	(C₆H₇N₃)·(C₇H₈O₂)
M_r	56.06	288.35	254.38	245.28
Crystal system, space group	Orthorhombic, Pbca	Monoclinic, P2₁/c	?, ?	Monoclinic, P2₁/n
Temperature (K)	150	173	150	203
a, b, c (Å)	6.970 (3), 9.514 (5), 9.752 (5)	3.8943 (5), 13.5109 (17), 14.4296 (17)	13.060 (3), 9.738 (2), 9.335 (2)	7.2795 (10), 13.6699 (18), 12.6695 (16)
α, β, γ (°)	90, 90, 90	90, 93.636 (2), 90	90, 105.800 (3), 90	90, 96.646 (3), 90
V (Å³)	646.7 (5)	757.69 (16)	1142.4 (4)	1252.3 (3)
Z	8	4	4	4
Radiation type	Mo Kα	Mo Kα	Mo Kα	Mo Kα
µ (mm⁻¹)	0.09	10.98	0.62	0.09
Crystal size (mm)	2.00 × 0.43 × 0.43	0.50 × 0.15 × 0.10	× ×	0.30 × 0.30 × 0.20

Data collection
Diffractometer	Bruker SMART diffractometer	Bruker SMART area detector diffractometer	?	CCD area detector diffractometer
Absorption correction	Multi-scan SADABS	Multi-scan SADABS; Sheldrick, 1996; Blessing, 1995	–	–
T_min, T_max	0.79, 0.96	0.16, 0.33	–	–
No. of measured, independent and observed reflections	2237, 570, 490 [I > 2.0σ(I)]	3351, 1091, 1011 [I > 2σ(I)]	9769, 2607, 1919 [I > 2σ(I)]	10115, 2905, 1791 [I > 2σ(I)]
R_int	0.029	0.040	0.041	0.053
θ_max (°)	25.0	23.3	27.5	28.3
(sin θ/λ)_max (Å⁻¹)	0.595	0.556	0.650	0.667

Refinement
R[F² > 2σ(F²)], wR(F²), S	0.045, 0.120, 1.40	0.036, 0.089, 1.08	0.032, 0.077, 0.99	0.046, 0.122, 1.01
No. of reflections	570	1091	2607	2905
No. of parameters	50	92	138	174
No. of restraints	4	0	0	0
H-atom treatment	Restrained refall	H-atom parameters constrained	H atoms treated by a mixture of independent and constrained refinement	H atoms treated by a mixture of independent and constrained refinement
Δρ_max, Δρ_min (e Å⁻³)	0.13, −0.13	0.64, −1.01	0.34, −0.25	0.23, −0.20

Computer programs: SMART (Siemens, 1993), SMART (Bruker, 2002), Bruker SMART, SAINT (Siemens, 1995), SAINT (Bruker, 2002), SAINT, Bruker SHELXTL, SHELXS 86 (Sheldrick, 1986), SHELXTL (Sheldrick, 1997), SHELXS97 (Sheldrick, 1990), CRYSTALS (Betteridge et al., 2003), SHELXTL, SHELXL97 (Sheldrick, 1997), CAMERON (Watkin et al., 1996).
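
The (sin θ/λ)_max row of the table follows directly from θ_max and the wavelength. A minimal sketch of the conversion is given below; it assumes λ = 0.71073 Å (the Mo Kα value quoted for two of the data sets) applies to all four, and the function and dictionary names are illustrative.

```python
import math

MO_K_ALPHA = 0.71073   # angstrom; assumed for all four data sets


def sin_theta_over_lambda(theta_deg, wavelength=MO_K_ALPHA):
    """(sin theta)/lambda in reciprocal angstrom."""
    return math.sin(math.radians(theta_deg)) / wavelength


def d_min(theta_deg, wavelength=MO_K_ALPHA):
    """Smallest resolved d-spacing (angstrom) from Bragg's law."""
    return wavelength / (2.0 * math.sin(math.radians(theta_deg)))


# theta_max values copied from the table above.
THETA_MAX = {"expt_XII": 25.0, "expt_XIII": 23.3, "expt_XIV": 27.5, "expt_XV": 28.3}

for label, theta in THETA_MAX.items():
    print(f"{label}: sin(theta)/lambda = {sin_theta_over_lambda(theta):.3f} A^-1, "
          f"d_min = {d_min(theta):.2f} A")
```

Because θ_max is tabulated to one decimal place, the recomputed values can differ from the table in the last digit.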

Selected geometric parameters (Å, °) for (expt_XII)

C1—C2	1.295 (3)	C3—O4	1.192 (2)
C2—C3	1.430 (3)

C1—C2—C3	121.4 (2)	C2—C3—O4	125.1 (2)

Footnotes

‡Current address: Preformulation, Product Research and Development, Eli Lilly and Company, Indianapolis, IN 46285, USA.

§Retired.

¶Current address: Avant-garde Materials Simulation Deutschland GmbH, Merzhauser Strasse 177, D-79100, Germany.

¹Supplementary data for this paper are available from the IUCr electronic archives (Reference: BK5081). Services for accessing these data are described at the back of the journal.

²Counting only the predictions that were completely blind. Limited structural information on one of the CSP2004 molecules was discovered part way through the blind test.

³When the CSP exercise was repeated in the post-deadline period with electrostatic potential-derived charges instead of the default COMPASS force-field charges that were used in the official submission, the experimental structure was located and favourably ranked (Thakur & Desiraju, 2008).

Acknowledgements

We thank the CCDC for its support in organizing the blind test and for hosting the workshop, in particular Dr Frank Allen and Jean Mabbs. Many thanks are due to Professor A. L. Spek for accepting the role of independent referee, holding the experimental structures during the course of the blind test, and collecting submitted predictions from each of the participants. We are grateful to the crystallographers who supplied candidate structures: Professor Simon Parsons [molecule (XII)], Professor Doyle Britton [molecule (XIII)], Professor Alexander Blake [molecule (XIV)] and Professor Christer Aakeröy [molecule (XV)]. We also thank Dr Sijbren Otto for selecting the target molecules from the list of candidates. G. M. Day thanks the Royal Society for a University Research Fellowship and the Pfizer Institute for Pharmaceutical Materials Science (University of Cambridge) for funding.

References

Aakeröy, C. B. (2007). Unpublished results.
Aakeröy, C. B., Beffert, K., Desper, J. & Elisabeth, E. (2003). Cryst. Growth Des. 3, 837–846.
Allen, F. H. (2002). Acta Cryst. B58, 380–388.
Anghel, A. T., Day, G. M. & Price, S. L. (2002). CrystEngComm, 4, 348–355.
Bazterra, V. E., Thorley, M., Ferraro, M. B. & Facelli, J. C. (2007). J. Chem. Theory Comput. 3, 201–209.
Beyer, T., Lewis, T. & Price, S. L. (2001). CrystEngComm, 3, 178–212.
Blake, A. J., Clark, J. S. & Conroy, J. (2007). Unpublished results.
Boese, R. & Nussbaumer, M. (1994). Crystallographic Symposia 7, Transformations and Interactions in Organic Crystal Structure. International Union of Crystallography.
Borreguero, J. & Skolnick, J. (2007). Proteins Struct. Funct. Bioinf. 68, 48–56.
Brillante, A., Bilotti, I., Della Valle, R. G., Venuti, E. & Girlando, A. (2008). CrystEngComm, 10, 937–946.
Britton, D. (2008). Private communication (CCDC711859). CCDC, Cambridge, England.
Brooks III, C. L., Onuchic, J. N. & Wales, D. J. (2001). Science, 293, 612–613.
Busing, W. R. (1981). WMIN. Report ORNL-5747. Oak Ridge National Laboratory, Tennessee, USA.
Case, F. H., Brennan, J., Chaka, A., Dobbs, K. D., Friend, D. G., Frurip, D., Gordon, P. A., Moore, J., Mountain, R. D., Olson, J., Ross, R. B., Schiller, M. & Shen, V. K. (2007). Fluid Phase Equilib. 260, 153–163.
Chisholm, J. A. & Motherwell, S. (2005). J. Appl. Cryst. 38, 228–231.
Coombes, D. S., Catlow, C. R. A., Gale, J. D., Rohl, A. L. & Price, S. L. (2005). Cryst. Growth Des. 5, 879–885.
Cooper, T. G., Jones, W., Motherwell, W. D. S. & Day, G. M. (2007). CrystEngComm, 9, 595–602.
Cruz-Cabeza, A. J., Day, G. M. & Jones, W. (2008). Chem. Eur. J. 14, 8830–8836.
Cruz Cabeza, A. J., Day, G. M., Motherwell, W. D. S. & Jones, W. (2006). J. Am. Chem. Soc. 128, 14466–14467.
Day, G. M., Chisholm, J., Shan, N., Motherwell, W. D. S. & Jones, W. (2004). Cryst. Growth Des. 4, 1327–1340.
Day, G. M. et al. (2005). Acta Cryst. B61, 511–527.
Day, G. M. & Motherwell, W. D. S. (2006). Cryst. Growth Des. 6, 1985–1990.
Day, G. M., Motherwell, W. D. S. & Jones, W. (2005). Cryst. Growth Des. 5, 1023–1033.
Day, G. M., Motherwell, W. D. S. & Jones, W. (2007). Phys. Chem. Chem. Phys. 9, 1693–1704.
Day, G. M. & Price, S. L. (2003). J. Am. Chem. Soc. 125, 16434–16443.
Della Valle, R. G., Venuti, E., Brillante, A. & Girlando, A. (2008). J. Phys. Chem. A, 112, 6715–6722.
Dey, A., Kirchner, M. T., Vangala, V. R., Desiraju, G. R., Mondal, R. & Howard, J. A. K. (2005). J. Am. Chem. Soc. 127, 10545–10559.
Dey, A., Pati, N. N. & Desiraju, G. R. (2006). CrystEngComm, 8, 751–755.
Dzyabchenko, A. V. (1994). Acta Cryst. B50, 414–425.
Eijck, B. P. van (2001). J. Comput. Chem. 22, 816–826.
Eijck, B. P. van (2002). J. Comput. Chem. 23, 456–462.
Eijck, B. P. van (2005). Acta Cryst. B61, 528–535.
Eijck, B. P. van & Kroon, J. (2000). Acta Cryst. B56, 535–542.
Forster, T., Oswald, I. D. H. & Parsons, S. (2007). Unpublished results.
Ganesh, V., Dongare, R. K., Balanarayan, P. & Gadre, S. R. (2006). J. Chem. Phys. 125, 104109.
Gavezzotti, A. (1999–2000). ZIP-PROMET. University of Milano, Italy.
Gavezzotti, A. (2003a). CrystEngComm, 5, 429–438.
Gavezzotti, A. (2003b). CrystEngComm, 5, 439–446.
Gavezzotti, A. (2004). OPIX. University of Milano, Italy.
Hobza, P., Sponer, J. & Reschel, T. (1995). J. Comput. Chem. 16, 1315–1325.
Hofmann, D. W. M. & Apostolakis, J. (2003). J. Mol. Struct. THEOCHEM, 647, 17–39.
Hofmann, D. W. M. & Kuleshova, L. (2005). J. Appl. Cryst. 38, 861–866.
Hofmann, D. W. M. & Lengauer, T. (1997). Acta Cryst. A53, 225–235.
Holden, J. R., Du, Z. Y. & Ammon, H. L. (1993). J. Comput. Chem. 14, 422–437.
Karamertzanis, P. G. & Pantelides, C. C. (2005). J. Comput. Chem. 26, 304–324.
Karamertzanis, P. G. & Pantelides, C. C. (2007). Mol. Phys. 105, 273–291.
Karamertzanis, P. G. & Price, S. L. (2006). J. Chem. Theory Comput. 2, 1184–1199.
Karamertzanis, P. G., Raiteri, P., Parrinello, M., Leslie, M. & Price, S. L. (2008). J. Phys. Chem. B, 112, 4298–4308.
Khalili, M., Liwo, A. & Scheraga, H. A. (2006). J. Mol. Biol. 355, 536–547.
Kresse, G. & Furtmüller, J. (1996). Phys. Rev. B, 54, 11169.
Kresse, G. & Hafner, J. (1993). Phys. Rev. B, 47, 558.
Kresse, G. & Joubert, D. (1999). Phys. Rev. B, 59, 1758.
Kristyán, S. & Pulay, P. (1994). Chem. Phys. Lett. 229, 175–180.
Liwo, A., Khalili, M., Czaplewski, C., Kalinowski, S., Oldziej, S., Wachucik, K. & Scheraga, H. A. (2007). J. Phys. Chem. B, 111, 260–285.
Llinàs, A., Glen, R. C. & Goodman, J. M. (2008). J. Chem. Inf. Model. 48, 1289–1303.
Lommerse, J. P. M., Motherwell, W. D. S., Ammon, H. L., Dunitz, J. D., Gavezzotti, A., Hofmann, D. W. M., Leusen, F. J. J., Mooij, W. T. M., Price, S. L., Schweizer, B., Schmidt, M. U., van Eijck, B. P., Verwer, P. & Williams, D. E. (2000). Acta Cryst. B56, 697–714.
Misquitta, A. J. & Stone, A. J. (2007). CamCASP. Cambridge, England.
Misquitta, A. J., Welch, G. W. A., Stone, A. J. & Price, S. L. (2008). Chem. Phys. Lett. 456, 105–109.
Mooij, W. T. M., van Eijck, B. P. & Kroon, J. (1999). J. Phys. Chem. A, 103, 9883–9890.
Motherwell, W. D. S. et al. (2002). Acta Cryst. B58, 647–661.
Moult, J., Fidelis, K., Kryshtafovych, A., Rost, B., Hubbard, T. & Tramontano, A. (2007). Proteins Struct. Funct. Bioinf. 69, 3–9.
Neumann, M. A. (2008). J. Phys. Chem. B, 112, 9810–9829.
Neumann, M. A., Leusen, F. J. J. & Kendrick, J. (2008). Angew. Chem. Int. Ed. 47, 2427–2430.
Neumann, M. A. & Perrin, M.-A. (2005). J. Phys. Chem. B, 109, 15531–15541.
Nicholls, A., Mobley, D. L., Guthrie, J. P., Chodera, J. D., Bayly, C. I., Cooper, M. D. & Pande, V. S. (2008). J. Med. Chem. 51, 769–779.
Nyburg, S. C. & Faerman, C. H. (1985). Acta Cryst. B41, 274–279.
Oldziej, S. et al. (2005). Proc. Natl. Acad. Sci. USA, 102, 7547–7552.
Pillardy, J., Arnautova, Y. A., Czaplewski, C., Gibson, K. D. & Scheraga, H. A. (2001). Proc. Natl. Acad. Sci. USA, 98, 12351–12356.
Price, S. L. (2008). Int. Rev. Phys. Chem. 27, 541–568.
Raiteri, P., Martonák, R. & Parrinello, M. (2005). Angew. Chem. Int. Ed. 44, 3769–3773.
Sarma, J. A. R. P. & Desiraju, G. R. (2002). Cryst. Growth Des. 2, 93–100.
Schmidt, M. U. & Englert, U. (1996). J. Chem. Soc. Dalton Trans. pp. 2077–2082.
Schmidt, M. U. & Kalkhof, H. (1997). CRYSCA. Clariant GmbH, Frankfurt.
Signorini, G. F., Righini, R. & Schettino, V. (1991). Chem. Phys. 154, 245.
Spek, A. L. (2003). J. Appl. Cryst. 36, 7–13.
Tan, J. S., Boerrigter, S. X. M., Scaringe, R. P. & Morris, K. R. (2009). J. Comput. Chem. 30, 733–742.
Thakur, T. S. & Desiraju, G. R. (2008). Cryst. Growth Des. 8, 4031–4044.
Timmermans, J. (1922). Bull. Soc. Chim. Belg. 31, 389.
Verwer, P. & Leusen, F. J. J. (1998). Rev. Comput. Chem. 12, 327–365.
Willock, D. J., Price, S. L., Leslie, M. & Catlow, C. R. A. (1995). J. Comput. Chem. 16, 628–647.

© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.
