ζ-Glycine: insight into the mechanism of a polymorphic phase transition

The structure of the ζ polymorph of glycine has been solved by a combination of crystal structure prediction and neutron powder diffraction, where the elusive phase was trapped at low temperature. At room temperature ζ-glycine is a short-lived intermediate in the pressure-driven transition between the ε and γ polymorphs; by contrast, at low temperature it undergoes the first observed transformation into the metastable β polymorph.


S1.1 Ab initio Crystal Structure Prediction:
The CSP procedure starts with a population of random structures in the first generation, which then evolves such that only the thermodynamically most stable members are allowed to "procreate", so that each generation inherits energy-reducing geometric patterns from previous ones. The procreation operations are cross-overs of parent structures, and "mutations" that vary the molecular position and orientation. To guarantee the diversity of the population throughout the evolution, new random structures are added at each generation, while new members that are too similar to previously sampled ancestor structures are prohibited from becoming parents for future generations. In this study the searches of phase space assumed 2, 3 and 4 glycine molecules per unit cell. The ab initio calculation of energy and geometry optimization used during the search procedure, though accurate, comes with increased computational cost. Since this would limit the explored phase space, several strategies were employed to accelerate the search. These included retention of low-energy candidates over the evolution of several generations, and the use of supramolecular arrangements seen in the known phases in the search, i.e. using the glycine dimer of the α polymorph as the search unit. We also employed a new strategy of biasing the space group of the randomly chosen members towards those observed in naturally occurring molecular crystals.
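The evolutionary loop described above can be illustrated with a minimal sketch. It is not the CSP code used in this work: the "structure" is reduced to a short list of numbers, `toy_enthalpy` is a hypothetical stand-in for the ab initio enthalpy, and all function names, population sizes and tolerances are assumptions chosen for illustration only.

```python
import random

def toy_enthalpy(x):
    # Hypothetical stand-in for the ab initio enthalpy (minimum at 0.3 in every coordinate).
    return sum((xi - 0.3) ** 2 for xi in x)

def random_structure(rng, n=6):
    # A "structure" is just n numbers standing in for molecular position and orientation.
    return [rng.random() for _ in range(n)]

def crossover(rng, a, b):
    # Cross-over: the child takes the head of one parent and the tail of the other.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(rng, x, step=0.05):
    # Mutation: small random change of every positional/orientational parameter.
    return [xi + rng.gauss(0.0, step) for xi in x]

def too_similar(x, seen, tol=1e-3):
    # True if x lies within tol (Euclidean distance) of any previously used parent.
    return any(sum((p - q) ** 2 for p, q in zip(x, s)) < tol ** 2 for s in seen)

def evolve(generations=20, pop_size=20, n_parents=6, n_keep=4, n_random=4, seed=0):
    rng = random.Random(seed)
    population = [random_structure(rng) for _ in range(pop_size)]
    ancestors = []   # structures that have already acted as parents
    history = []     # best enthalpy found at the start of each generation
    for _ in range(generations):
        population.sort(key=toy_enthalpy)
        history.append(toy_enthalpy(population[0]))
        # Only the most stable members procreate; near-copies of ancestors are barred.
        parents = [p for p in population if not too_similar(p, ancestors)][:n_parents]
        while len(parents) < 2:
            parents.append(random_structure(rng))
        ancestors.extend(parents)
        elites = population[:n_keep]  # retention of low-energy candidates
        children = []
        while len(children) < pop_size - n_keep - n_random:
            if rng.random() < 0.5:
                a, b = rng.sample(parents, 2)
                children.append(crossover(rng, a, b))
            else:
                children.append(mutate(rng, rng.choice(parents)))
        fresh = [random_structure(rng) for _ in range(n_random)]  # keeps diversity
        population = elites + children + fresh
    population.sort(key=toy_enthalpy)
    history.append(toy_enthalpy(population[0]))
    return population[0], history
```

Because the lowest-enthalpy candidates are carried over unchanged, the best enthalpy in `history` can never increase from one generation to the next; in a real search each evaluation would be a full geometry optimisation rather than a cheap function call.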
The following lists the enthalpy vs. volume diagrams obtained throughout the phase space explorations during crystal structure prediction (CSP) in this work. Each diagram represents a different search strategy and summarizes the outcome. Only the structures within the first 4 kJ mol⁻¹ are shown.

Figure S1. CSP for 2 molecule/cell without using any experimental information on the known crystal structures of the molecule. The simulation is run for 18 generations with settings described in the main text. Phases are labelled according to their enthalpy ranking in the whole search.

Figure S2. CSP for 3 molecule/cell without using any experimental information on the known crystal structures of the molecule. The simulation is run for 15 generations with settings described in the main text. Phases are labelled according to their enthalpy ranking in the whole search.

Figure S3. CSP for 3 molecule/cell in which the randomly generated structures were biased to represent the space group distribution of the known organic crystal structure database: P21/c 36.59%, P1bar 16.92%, P212121 11.00%, C2/c 6.95%, P21 6.35%, Pbca 4.24%. The simulation is run for 20 generations with settings described in the main text. Phases are labelled according to their enthalpy ranking in the whole search. This search fails to find the most stable γ-phase of glycine in 20 generations.

Figure S4. CSP for 3 molecule/cell in which the randomly generated structures were biased to represent the space group distribution of the known organic crystal structure database: P21/c 36.59%, P1bar 16.92%, P212121 11.00%, C2/c 6.95%, P21 6.35%, Pbca 4.24%. The simulation is run for 2 generations with settings described in the main text. Phases are labelled according to their enthalpy ranking in the whole search. This search identifies the most stable γ-phase in the first generation.

Figure S5. CSP for 4 molecule/cell without using any experimental information on the known crystal structures of the molecule. The simulation is run for 20 generations with settings described in the main text. Phases are labelled according to their enthalpy ranking in the whole search. This simulation fails to find α-phase within 20 generations.

Figure S6. CSP for 4 molecule/cell in which the randomly generated structures were biased to represent the space group distribution of the known organic crystal structure database: P21/c 36.59%, P1bar 16.92%, P212121 11.00%, C2/c 6.95%, P21 6.35%, Pbca 4.24%. The simulation is run for 16 generations with settings described in the main text. Phases are labelled according to their enthalpy ranking in the whole search. This search identifies the α-phase in the first generation.

Figure S7. CSP for 4 molecule/cell using the glycine dimer seen in the α-phase as the building block of the crystal structure, instead of the single molecule as done in all other cases. The simulation is run for 7 generations with settings described in the main text. Phases are labelled according to their enthalpy ranking in the whole search. Note that in this search there are new phases that were not identified using 4 single molecules per cell, hinting at the vastness of the phase space and poor connectivity between enthalpy valleys.

Figure S8. CSP for 4 molecule/cell using an increased similarity penalty between already encountered structures and the new ones proposed during evolution such that structures that are more different from one another are produced. The simulation is run for 5 generations with settings described in the main text. Phases are labelled according to their enthalpy ranking in the whole search.

Figure S9. CSP for 4 molecule/cell in which the randomly generated structures are strictly taken from the P21/n space group that α-phase belongs to. The simulation is run for 14 generations with settings described in the main text. Phases are labelled according to their enthalpy ranking in the whole search. Note that this search does not find the α-phase within 14 generations.
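The space-group biasing used for Figures S3, S4 and S6 amounts to weighted random sampling. A minimal sketch follows, using the percentages quoted in the captions. The function name and the handling of the remaining 17.95% of the database (simply ignored here, so that the six listed groups are renormalised) are assumptions, not a description of the actual implementation.

```python
import random

# Percentages quoted in the captions of Figures S3, S4 and S6.
SPACE_GROUP_WEIGHTS = {
    "P21/c": 36.59,
    "P1bar": 16.92,
    "P212121": 11.00,
    "C2/c": 6.95,
    "P21": 6.35,
    "Pbca": 4.24,
}

def sample_space_groups(n, seed=None):
    """Draw n space-group labels with probability proportional to the quoted weights.

    Assumption: space groups outside this list are never drawn, so the six
    listed groups are implicitly renormalised to 100%.
    """
    rng = random.Random(seed)
    names = list(SPACE_GROUP_WEIGHTS)
    weights = list(SPACE_GROUP_WEIGHTS.values())
    # random.choices accepts relative weights; they need not sum to 1 or 100.
    return rng.choices(names, weights=weights, k=n)
```

Each label drawn this way would then seed the generation of one random trial structure in the corresponding space group.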

S1.2 Theoretical assignment of the ζ-glycine phase:
Figure S10. Theoretical assignment of the ζ-phase by (a) comparison of the simulated X-ray diffraction patterns of ε-, γ- and ζ-glycine at 2 GPa with experimental data taken from Ref. 4a of the main text at 0.2 GPa. The XRD pattern of the proposed ζ-glycine can explain most of the unassigned peaks that were marked in the experimental pattern. The theoretical patterns are calculated at higher pressure to offset the overestimation of ground state volumes. (b) The enthalpy per molecule as a function of pressure for ε-glycine and ζ-glycine with respect to the γ phase up to 5 GPa. The black arrows indicate the phase transitions observed experimentally at room temperature: under pressure the γ phase undergoes a transition to ε-glycine, and upon decompression of ε-glycine, ζ-glycine is observed before the transition to γ-glycine is completed.

Table S1 compares the energies and experimental and computed volumes of the α-ζ phases.
(Kvick et al., 1980) [b] at 23 K. Ref. (Destro et al., 2000) [c] (Perlovich et al., 2001) (Dawson et al., 2005)

S1.3 Structure solution and refinement of the neutron powder diffraction pattern obtained at 100 K
All analysis was carried out using Topas-Academic (Coelho, 2015). The conventional space group setting of ζ-glycine is P1, and the position of the molecule in the unit cell is arbitrary; however, the coordinates for N1 were fixed at those determined in the structure prediction. To confirm that the predicted coordinates were correct, the orientation of the glycine molecule was also determined by simulated annealing (the position is arbitrary in P1).
In order to emphasise its relationship with ε-glycine, the structure of ζ-glycine was transformed to a nonstandard I1 setting for refinement. The transformation matrix for the unit cell basis vectors from the P1 to I1 setting is

The refinement model consisted of ζ- and ε-glycine, Pb (which was used as a pressure marker; Fortes et al., 2007, 2012), and Al2O3 and ZrO2 (both components of the anvils of the Paris-Edinburgh press used to apply pressure). Glycine molecules were treated as variable-metric rigid bodies in which the bond distances and angles were fixed at those of γ-glycine, but the N1-C2-C1-O2 and C1-C2-N1-H3 torsion angles were allowed to vary; the other torsion angles involving H were assumed to be related to these by ±120°. In the final stages of refinement the two strongest lines of γ-glycine, corresponding to the 111 and 110 reflections, were introduced to account for two broad features at d = 2.96 and 3.47 Å. This approach was found to yield the same fitting statistics as an explicit model of γ-glycine, but with fewer parameters. The final Rietveld fit is shown in Fig. 2a of the main text. Crystal and refinement data are listed in Table S2.

S1.4 The diffraction pattern at 290 K.
A stack plot showing the changes in the diffraction pattern as the sample was warmed from 100 K to 290 K is shown in Fig. S11. The pattern at 290 K was modelled as a mixture of β-glycine with a small residue of the ζ-phase. The glycine phases were modelled using variable-metric rigid bodies, as described in Section S1.3 for the 100 K pattern. The two peaks assigned to γ-glycine were found to persist, and were also modelled in the same way as described above. Lead, Al2O3 and ZrO2 were also included in the model. The final Rietveld fit is shown in Fig. 2b of the main paper. Crystal and refinement data are listed in Table S2.

S1.5 PIXEL and symmetry adapted perturbation theory calculations.
PIXEL and symmetry-adapted perturbation theory enable the intermolecular interaction energies in a crystal structure to be estimated. For the PIXEL calculations, molecular electron densities were obtained using the program GAUSSIAN09 revision A.02 (M. J. Frisch, 2009) at the MP2 level with the 6-31G** basis set. The electron density was then analysed using the PIXELc module of the CLP program package (Gavezzotti, 2011), which allows the calculation of dimer and lattice energies. Symmetry-adapted perturbation theory (SAPT) calculations (Jeziorski et al., 1994; Stone, 2013) were carried out with the PSI4 code (Turney et al., 2012) using the SAPT2+3 method with the aug-cc-pVDZ basis set. δE_HF^(3) corrections were applied to induction energies in all cases (Hohenstein & Sherrill, 2010b). Structures were visualised using XP (Sheldrick, 2001), MERCURY (Macrae et al., 2008) and DIAMOND (CrystalImpact, 2004). PLATON was used for geometric analysis (Spek, 2003).

Figure S11: (a) Diffraction data collected as the mixture of ε and ζ glycine obtained at 100 K was warmed to RT. Sample cooled to 120 K (black), stable at 120 K (red), stable at 100 K (blue), and then warmed in 50 K steps to 150 K (magenta), 200 K (green), 250 K (navy) and 290 K (purple). (b) Expansion of the region between 3.7 and 3.95 Å at 200 K (green), 250 K (blue) and 290 K (purple).

Weighting scheme: statistical (w = 1/σ²(I)); Max(Δ/σ) < 0.001
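The statistical weighting scheme w = 1/σ²(I) quoted above enters the weighted-profile residual that is commonly reported for Rietveld fits. The sketch below assumes the standard definition R_wp = sqrt(Σ w (y_obs − y_calc)² / Σ w y_obs²); the function name is hypothetical and no values from Table S2 are reproduced.

```python
import math

def weighted_profile_r(y_obs, y_calc, sigma):
    """Weighted-profile R factor with statistical weights w = 1/sigma^2.

    Assumes the textbook definition of R_wp; it is not taken from the
    refinement software used in this work.
    """
    num = 0.0
    den = 0.0
    for yo, yc, s in zip(y_obs, y_calc, sigma):
        w = 1.0 / s ** 2             # statistical weight of this profile point
        num += w * (yo - yc) ** 2    # weighted squared misfit
        den += w * yo ** 2           # weighted squared observed intensity
    return math.sqrt(num / den)
```

Points with small uncertainties dominate the sum, so a misfit on a well-measured strong peak raises R_wp more than the same misfit on a noisy point.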

Additional comments on the structure of ζ-Glycine:
The first coordination spheres of ε and ζ-glycine each contain 14 molecules, and Fig. S12 shows the two structures in approximately equivalent orientations. The formation of layers in the ζ, ε and β phases is shown in Figs. S13-15. The structure of γ-glycine is shown in Fig. S16 for comparison. The interaction energies for ε and ζ-glycine are listed in Table S3; the values of the energies were calculated from the experimentally-determined structures of the ε- and ζ-phases reported here. Energies were calculated using both the PIXEL and SAPT2+3 methods, and the agreement in total energies between the two is excellent. The advantage of these methods, aside from their accuracy, is that the total energies of the interactions are broken down into electrostatic, polarisation, dispersion and repulsion contributions, which provides insight into the physical nature of each interaction. Although there are differences in the break-down of the different terms in the PIXEL and SAPT calculations because of the way the contributions are calculated, both calculations lead to the same conclusion: all interactions are dominated by the electrostatic term, with only a relatively modest contribution from dispersion.
In addition to the stabilising contacts described in the main text, the structure also exhibits some destabilising interactions. Within the layers there are relatively long interactions across the ac face diagonal of the unit cell in which the intermolecular energy is +16.0 kJ mol⁻¹. There are also some quite strongly repulsive (+43.3 kJ mol⁻¹) interlayer contacts featuring C2H2…O2 interactions. In both cases the source of the positive energy is a large positive electrostatic contribution which is not compensated for by either polarisation or dispersion. Overall six out of the fourteen interactions are repulsive or have essentially zero energy; this is similar to the situation described previously for the ε and γ phases, where 6/14 contacts are repulsive. The energies of molecules D and G and their equivalents in both phases remain largely unchanged through the transition, while those of E and F change from being respectively strongly repulsive and attractive, becoming much less energetic through attenuation of the electrostatic components.

Figure S12: The first coordination spheres of ε and ζ glycine showing labelling used in Table S3.

Coordinates taken from CSD entry GLYCIN16 and the structure inverted (Kvick et al., 1980).

Table S3: Contact geometry and energies within the first coordination spheres in ε and ζ glycine. Energies are in kJ mol⁻¹, and the letters A…N refer to the labelling in Fig. S12; dcentroid is the distance between the centroids of the molecules.

S2.2 Positions of the molecules in β, ε and ζ glycine.
The positions of the centroids of the molecules in the unit cells of β, ε and ζ glycine, which all have Z = 2, are listed in Table S4. The space groups P21, Pn and I1 are polar, and the unit cell origin may be chosen arbitrarily along the polar directions. In addition, unit cell translations of ½ are possible along the a and c directions of P21, these being elements of the Euclidean normaliser of this space group. The centroid positions transformed to lie near [¼, ¼, ¼] are also shown in Table S4.
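The origin-shift argument above can be sketched in a few lines. The function below is illustrative only: its name, its arguments and the example coordinates are hypothetical, and it treats a single reference centroid. For P21 it assumes the unique (polar) axis is b, so the coordinate along b can be set freely while shifts of ½ are tried along a and c.

```python
from itertools import product

def shift_to_target(centroid, free_axes, half_axes, target=(0.25, 0.25, 0.25)):
    """Return the origin-shifted fractional centroid closest to `target`.

    free_axes: axis indices (0=a, 1=b, 2=c) along which the origin is arbitrary,
               so the coordinate is simply set to the target value.
    half_axes: axis indices along which an origin shift of 1/2 is permitted.
    """
    best = None
    for shifts in product((0.0, 0.5), repeat=len(half_axes)):
        trial = list(centroid)
        for axis, shift in zip(half_axes, shifts):
            trial[axis] = (trial[axis] + shift) % 1.0
        for axis in free_axes:
            trial[axis] = target[axis]
        # Squared distance to the target, allowing for lattice periodicity.
        dist2 = sum(min(abs(t - g), 1.0 - abs(t - g)) ** 2
                    for t, g in zip(trial, target))
        if best is None or dist2 < best[0]:
            best = (dist2, tuple(trial))
    return best[1]

# Hypothetical centroid in a P21 cell (unique axis b assumed).
shifted = shift_to_target((0.80, 0.10, 0.70), free_axes=(1,), half_axes=(0, 2))
```

For I1, where every direction is polar, all three axes would be passed as `free_axes` and the reference centroid lands exactly on the target; in every case the same shift must then be applied to all other molecules in the cell.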