feature articles
HUG and SQUEEZE: using CRYSTALS to incorporate resonant scattering in the SQUEEZE structure-factor contributions to determine absolute structure
^{a}Chemical Crystallography, University of Oxford, 12 Mansfield Road, Oxford, Oxfordshire OX1 3TA, England, and ^{b}Chimie minérale, analytique et appliquée, University of Geneva, Geneva, Switzerland
^{*}Correspondence email: richard.cooper@chem.ox.ac.uk
The resonant-scattering contributions to single-crystal X-ray diffraction data enable the absolute structure of crystalline materials to be determined. Crystal structures can be determined even if they contain considerably disordered regions because a correction is available via a discrete Fourier transform of the residual electron density to approximate the X-ray scattering from the disordered region. However, the corrected model cannot normally account for resonant scattering from atoms in the disordered region. Straightforward determination of absolute structure from crystals where the strongly resonantly scattering atoms are not resolved has therefore not been possible. Using an approximate resonant-scattering correction to the X-ray scattering from the disordered regions, we have developed and tested a procedure (HUG) to recover the absolute structure using conventional Flack x refinement or other post-refinement determination methods. Results show that in favourable cases the HUG method works well and the absolute structure can be correctly determined. It offers no useful improvement in cases where the original correction for the disordered region scattering density is problematic, for example, when a large fraction of the scattering density in the crystal is disordered, or when voids are not occupied equally by the disordered species. Crucially, however, if the approach does not work for a given structure, the statistics for the absolute-structure measures are not improved, meaning it is unlikely to lead to misassignment of absolute structure.
Keywords: disorder; resonant scattering; absolute structure.
1. Background
The refinement of crystal structures usually requires the structural parameters to be adjusted by the method of least squares to minimize the differences between F_{o} and F_{c} (or I_{o} and I_{c}, where I are squared structure amplitudes, F^{2}). Cases exist where this procedure is complicated by the fact that part of the structure cannot easily be modelled by clearly defined individual atoms. This situation may exist in extended lattice structures with voids which contain independent molecules (host and guest structures), or in discrete molecule structures in which the lattice is stabilized by the inclusion of solvent molecules or ions necessary to preserve charge balance.
If these subsidiary molecules are not spatially constrained by the surrounding lattice, they may have freedom to move even in the solid state, or have the possibility of occupying alternative positions and orientations. Often this ambiguity can be modelled by large anisotropic atomic displacement factors (ADPs) or by the superposition of displaced partially occupied images of the molecule. In unfavourable cases, the average scattering density in the cavity cannot reasonably be modelled by independent atoms. This situation has been addressed by replacing the atomic model of the contents of the cavity by the discrete Fourier transform of the electron density in the cavity computed from the observed structure amplitudes and phases obtained from the atomic model of the resolved part of the structure (van der Sluis & Spek, 1990; Spek, 2015).
1.1. The SQUEEZE procedure
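The structure factor can be computed either as the Fourier transform of the continuous periodic electron density in the crystal:
$$F_{h} = \int_{\mathrm{cell}} \rho(x)\,\exp(2\pi i\,h\cdot x)\,dx \qquad (1)$$
or as the summation of the contributions from individual `atoms':
$$F_{h} = \sum_{j} f_{j}\,\exp(2\pi i\,h\cdot x_{j}) \qquad (2)$$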
SQUEEZE defines a region of the unit cell, V, in which the disordered part of the structure is located. No atomic model of V is available. The content of V is represented by a real electron density, ρ_{V}(x).
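The SQUEEZE procedure then uses the hybrid structure-factor expression:
$$F_{h} = \sum_{j} f_{j}\,\exp(2\pi i\,h\cdot x_{j}) + \int_{V} \rho_{V}(x)\,\exp(2\pi i\,h\cdot x)\,dx \qquad (4)$$
The first term is a summation over the resolved atoms as in equation 2. The integral in the second term is evaluated for x ∈ V, which contains unresolved electron density. The resulting expression accounts for scattering from both resolved atoms and unresolved electron density.
The integral can be replaced by a summation over a suitable resolution grid of electron density:
$$F_{h} = \sum_{j} f_{j}\,\exp(2\pi i\,h\cdot x_{j}) + \Delta v \sum_{k} \rho_{V}(x_{k})\,\exp(2\pi i\,h\cdot x_{k}) \qquad (5)$$
where Δv is the volume element associated with each grid point x_{k}.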
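The grid summation can be illustrated with a minimal Python sketch. It assumes fractional coordinates, a density in e Å^{−3} sampled at each grid point and a constant voxel volume; the function name `grid_structure_factor` and the toy two-point grid are hypothetical and are not part of SQUEEZE or CRYSTALS.

```python
import cmath

def grid_structure_factor(hkl, grid_points, voxel_volume):
    """Approximate the void contribution to F_h by a sum over grid points.

    hkl          -- reflection indices (h, k, l)
    grid_points  -- iterable of ((x, y, z), rho): fractional coordinates
                    and electron density (e/A^3) at each grid point in V
    voxel_volume -- volume of one grid voxel (A^3), assumed constant
    """
    h, k, l = hkl
    total = 0j
    for (x, y, z), rho in grid_points:
        phase = 2.0 * cmath.pi * (h * x + k * y + l * z)
        total += rho * cmath.exp(1j * phase)
    return voxel_volume * total

# Toy grid (hypothetical): two voxels of equal density, half a cell apart.
points = [((0.0, 0.0, 0.0), 1.0), ((0.5, 0.0, 0.0), 1.0)]
f_000 = grid_structure_factor((0, 0, 0), points, 2.0)  # total electron count
f_100 = grid_structure_factor((1, 0, 0), points, 2.0)  # the two voxels cancel
```

For h = 0 the sum reduces to the number of electrons in V, which is the electron count that SQUEEZE reports for the voids.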
1.2. Resonant scattering
If a material is in a noncentrosymmetric space group and it contains one or more atoms with significant resonant scattering for the wavelength in use, it may be possible to determine the absolute structure of the sample using Flack's interpretation of the observed Bijvoet differences (Flack, 1983). Representing |F_{h}|^{2} by I^{+} and |F_{−h}|^{2} by I^{−}:
$$D = I^{+} - I^{-}, \qquad D_{o} \approx D_{c} = (1 - 2x)\,D_{s} \qquad (6)$$
where the subscript s indicates a quantity computed from the atomic model with the Flack parameter x set to zero (i.e. a non-twinned single crystal), c a quantity computed from an inversion-twinned model (i.e. x not necessarily zero) and o an observed quantity (Cooper et al., 2016). The Flack(x) parameter may be determined either during the least-squares refinement or by post-refinement methods (Parsons et al., 2013). The success of this procedure depends upon the quality of the data and upon the resonant-scattering power of the material, conveniently estimated by Friedif (Flack & Shmueli, 2007). The magnitude of Friedif is increased in the presence of atoms with large resonant-scattering factors, even if these atoms are not part of the host material. This means that the possibility of reliably determining the absolute structure of an all-light-atom structure can be increased if the material crystallizes with a suitable molecule of solvation.
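As a rough illustration of a post-refinement estimate, the following Python sketch fits x by unit-weighted least squares, assuming the inversion-twin relation D_{o} ≈ (1 − 2x)D_{s} between observed and single-crystal-model Bijvoet differences. It is a simplified stand-in, not the procedure of Parsons et al. (2013) or Cooper et al. (2016); the function name and the toy data are hypothetical.

```python
def flack_x_from_bijvoet(d_obs, d_single):
    """Unit-weight least-squares x from D_o ~ (1 - 2x) * D_s.

    d_obs    -- observed Bijvoet differences, I+ - I-
    d_single -- differences computed from the atomic model with x = 0
    """
    num = sum(do * ds for do, ds in zip(d_obs, d_single))
    den = sum(ds * ds for ds in d_single)
    if den == 0.0:
        raise ValueError("model has no Bijvoet signal (all D_s are zero)")
    slope = num / den            # least-squares estimate of (1 - 2x)
    return (1.0 - slope) / 2.0

# Toy data (hypothetical): model differences and three idealized 'crystals'.
d_s = [4.0, -2.0, 1.0, -3.0]
x_correct = flack_x_from_bijvoet(d_s, d_s)                      # model is right
x_inverted = flack_x_from_bijvoet([-v for v in d_s], d_s)       # model inverted
x_twinned = flack_x_from_bijvoet([0.6 * v for v in d_s], d_s)   # partial twin
```

If the model carries no resonant signal at all, every D_{s} is zero and x is undetermined, which is why the function raises an error in that case.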
In the case that the solvent molecule is highly disordered, it may not be possible to model it with discrete atoms, so that the only way to complete the analysis is to SQUEEZE the solvent region, which in the standard implementation makes no allowance for a resonant contribution from the solvent to the computed structure amplitudes. This means that a conventionally SQUEEZEd solvent cannot be used to help in the determination of absolute structure, as demonstrated at the start of §3 below.
2. Methods
Fourier transformation of ρ_{V}(x) leads to its structure factor F(ρ_{V}(x))_{h}, which is added to the structure factor of the atomic model for the ordered part of the unit cell.
The structure factor is a complex number having both magnitude and phase, and may be written as F_{h} = A_{h} + iB_{h}, where A_{h} and B_{h} are the real and imaginary parts of F_{h}, respectively, and i = √(−1).
2.1. The HUG procedure – enhancing SQUEEZE to include resonant scattering
If the disordered volume contains atoms with strong resonant scattering, how might one proceed to incorporate this resonant scattering contribution to ρ_{V}(x)?
Method 1 Construct a model in which the resonant-scattering contribution of V is distributed uniformly over V. Then ρ_{V}(x) can be modified to become the complex ρ′_{V}(x):
$$\rho'_{V}(x) = \rho_{V}(x) + c + id \qquad (7)$$
in which c and d are constants to be chosen or determined in some way.
The inconvenience of this simple model is that the resonant-scattering contribution is distributed widely over V and its contribution in reciprocal space will diminish more rapidly as a function of sinθ/λ than with an atomic model. Such consideration leads to:
Method 2 Construct a model in which the resonant-scattering contribution of V is assumed to be proportional to ρ_{V}(x) at each point x. Then ρ_{V}(x) can be modified to become the complex ρ′_{V}(x):
$$\rho'_{V}(x) = (1 + c + id)\,\rho_{V}(x) \qquad (8)$$
In this way, the major part of the resonant-scattering contribution will be located at the positions of high electron density in ρ_{V}(x). The advantage of this model is that it is `more atomic' than that of Method 1, a positive attribute intended to imply exactly the same as the `large f', usually associated with heavier elements, and whose ghosts would leave more miasma^{1} in the difference density.
If F_{h} is the Fourier transform of ρ_{V}(x), then the Fourier transform, F′_{h}, of ρ′_{V}(x) is given by:
$$F'_{h} = (1 + c + id)\,F_{h} \qquad (9)$$
where c and d set the ratios of the real and imaginary parts of the resonant-scattering contribution from the electron density in the disordered region of the crystal.
A reasonable first approximation for c and d is to assume that the regions of high electron density in the unresolved volume are those of highest resonant scattering. One may even take the step of assuming that the resonant-scattering contribution is proportional to electron density, so that:
$$c = \frac{\sum f'}{\sum f_{sol}}, \qquad d = \frac{\sum f''}{\sum f_{sol}} \qquad (10)$$
where the summations are over the expected atoms in the solvent. Note that equation 10 might use f instead of f_{sol} in the denominator, to avoid overcorrection for the resonant signal at higher sinθ/λ; however, trial and error has shown equation 10 to be the more effective formulation.
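A minimal Python sketch of how c and d might be evaluated for a proposed solvent composition follows. It assumes that c and d are the summed f′ and f″ divided by the summed normal scattering of the solvent atoms, and it takes the normal scattering as the electron count (the value at sinθ/λ = 0); both are assumptions of this sketch. The function name and the f′ and f″ numbers for chloroform with Mo Kα radiation are illustrative only and should be replaced by tabulated values.

```python
def hug_constants(solvent_atoms):
    """Return (c, d) for a proposed solvent composition.

    solvent_atoms -- list of (count, f_normal, f_prime, f_double_prime)
                     per element; f_normal is taken here as the electron
                     count, which is an assumption of this sketch.
    """
    sum_f = sum(n * f for n, f, _, _ in solvent_atoms)
    sum_fp = sum(n * fp for n, _, fp, _ in solvent_atoms)
    sum_fpp = sum(n * fpp for n, _, _, fpp in solvent_atoms)
    if sum_f == 0.0:
        raise ValueError("solvent has no normal scattering")
    return sum_fp / sum_f, sum_fpp / sum_f

# Chloroform, CHCl3, with illustrative Mo K-alpha dispersion terms.
chloroform = [
    (1, 6.0, 0.0033, 0.0016),    # C  (illustrative f', f'')
    (1, 1.0, 0.0, 0.0),          # H
    (3, 17.0, 0.1484, 0.1585),   # Cl (illustrative f', f'')
]
c, d = hug_constants(chloroform)
```

Scaling both constants by a common factor, as is done for some of the structures in §3, corresponds to multiplying the two returned values by that factor.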
The resonant-scattering contribution in the solvent region is thus (c + id)F_{h}, which can be used as a modifier to correct the structure-factor components A and B of the region V for resonant scattering. By inspection of equation 9, we obtain
$$A_{hug} = (1 + c)\,A_{sqz} - d\,B_{sqz}, \qquad B_{hug} = (1 + c)\,B_{sqz} + d\,A_{sqz} \qquad (11)$$
where the subscripts sqz indicate the complex contribution to the structure factor returned by SQUEEZE due to unresolved electron density in the volume V, and hug indicates the same contribution corrected for resonant scattering. Refinement is undertaken in the usual way, except that the A and B parts of the structure factor computed from the resolved atoms are supplemented by the addition of A_{hug} and B_{hug}, respectively:
$$A = A_{res} + A_{hug}, \qquad B = B_{res} + B_{hug} \qquad (12)$$
where the subscript res indicates structure-factor components for the resolved part of the structure.
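The correction can be sketched in a few lines of Python, assuming that the HUGged contribution is the SQUEEZE contribution multiplied by the complex factor (1 + c + id) and then added to the parts from the resolved atoms. The function names and all numerical values are hypothetical; this is not the CRYSTALS module itself.

```python
def hug_correct(a_sqz, b_sqz, c, d):
    """Multiply the SQUEEZE parts (A_sqz + i B_sqz) by (1 + c + i d)."""
    f_hug = complex(a_sqz, b_sqz) * complex(1.0 + c, d)
    return f_hug.real, f_hug.imag

def total_intensity(a_res, b_res, a_hug, b_hug):
    """|F|^2 after adding the HUGged parts to the resolved-atom parts."""
    return (a_res + a_hug) ** 2 + (b_res + b_hug) ** 2

# Hypothetical Friedel pair: for a real density and a resolved part with no
# resonant scattering, the parts for -h are the complex conjugates of those
# for h (A unchanged, B negated).
c, d = 0.01, 0.05
a_res, b_res = 20.0, 5.0
a_sqz, b_sqz = 3.0, 4.0

i_plus = total_intensity(a_res, b_res, *hug_correct(a_sqz, b_sqz, c, d))
i_minus = total_intensity(a_res, -b_res, *hug_correct(a_sqz, -b_sqz, c, d))
bijvoet_difference = i_plus - i_minus   # non-zero only because d != 0
```

With d = 0 the two intensities are equal, which is the situation for unmodified SQUEEZE output combined with a light-atom host: the solvent then contributes nothing to the Bijvoet differences.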
2.2. HUGging in CRYSTALS
Since its inception, CRYSTALS has had a facility for storing the precomputed A and B parts for a reflection so that they can be added into the A and B parts computed from an atomic model (Carruthers, 1977). The original use was to facilitate the development of a poorly resolved part of a structure. The A and B parts of the well-resolved atoms were computed once and stored in the database. Structure-factor contributions were then computed from the atoms in experimental models of the disorder and added to the stored parts. This gave significant time savings when the well-resolved part of the structure contained a large number of atoms compared with the disordered part (Watkin et al., 1985). With the publication of the SQUEEZE program, this procedure could be reversed. For more than 20 years, an interface between SQUEEZE and CRYSTALS has enabled the A and B parts of the database to hold contributions to the structure factors computed from the discrete Fourier transform of electron density in parts of the unit cell not modelled by independent atoms. This procedure has the virtue that during the refinement the values of F_{obs} (or I_{obs}) are not modified. The enhanced strategy (HUG^{2}) represented by equation 12 has been implemented in CRYSTALS (Versions after 24/02/2017) by an external module which uses a proposed composition for the solvent to correct the standard output from unmodified SQUEEZE before passing the modified A and B parts into CRYSTALS.
The concept was evaluated by processing several structures with well-resolved solvent molecules. The A and B parts for an atomic model of the solvent were first computed and stored in the CRYSTALS database and then used together with the main structure in a normal refinement. These refinements were compared with a refinement in which the A and B parts were from a solvent which was SQUEEZEd and HUGged. The agreement between the HUGged and atomic structure amplitudes can be estimated by
$$R(Av) = \frac{\sum |Av_{atom} - Av_{hug}|}{\sum |Av_{atom}|}, \qquad R(D) = \frac{\sum |D_{atom} - D_{hug}|}{\sum |D_{atom}|} \qquad (13)$$
where Av_{atom} and D_{atom} are the average and difference of structure-factor magnitudes of a Bijvoet pair computed from a fully atomic model, and Av_{hug} and D_{hug} are equivalent values computed from a HUGged model. The agreement can be visualized in plots of D_{hug} versus D_{atom}.
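One way such agreement factors might be computed is sketched below in Python. The residual form used, the sum of absolute deviations divided by the sum of absolute reference values, is an assumption of this sketch, and the function names and numbers are hypothetical.

```python
def average_and_difference(f_plus, f_minus):
    """Average and difference of the two members of a Bijvoet pair."""
    return 0.5 * (f_plus + f_minus), f_plus - f_minus

def agreement_r(values_atom, values_hug):
    """R = sum|atom - hug| / sum|atom| over matching Bijvoet pairs."""
    num = sum(abs(a - h) for a, h in zip(values_atom, values_hug))
    den = sum(abs(a) for a in values_atom)
    if den == 0.0:
        raise ValueError("reference values are all zero")
    return num / den

# Hypothetical magnitudes for two Bijvoet pairs from each model.
pairs_atom = [(10.5, 9.5), (20.4, 19.6)]
pairs_hug = [(10.2, 9.6), (20.5, 19.9)]

av_atom, d_atom = zip(*(average_and_difference(p, m) for p, m in pairs_atom))
av_hug, d_hug = zip(*(average_and_difference(p, m) for p, m in pairs_hug))

r_av = agreement_r(av_atom, av_hug)   # agreement of the pair averages
r_d = agreement_r(d_atom, d_hug)      # agreement of the pair differences
```

Because the differences are small compared with the averages, R(D) is normally much larger than R(Av) for the same pair of models.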
3. Results
Table 1 lists solvated structures selected from the recent literature where the molecules of interest contained only light atoms and the solvents were reasonably well defined: sgd240 (Chernega et al., 2009), fg3257 (Bojarska et al., 2012), ky3014 (Shi et al., 2012), sgd464 (Davies et al., 2013a), sgd475 (Davies et al., 2013b), sk3422 (Fábry et al., 2012) and awisac02 (Qian et al., 2016). The absolute structures were confirmed by re-refining the atomic model in CRYSTALS. The solvents were then excluded from the structure-factor calculation and modelled using the standard SQUEEZE procedure. The outputs from SQUEEZE were HUGged as explained above, and the structures re-refined. The applicability of the procedure was assessed by comparing the Flack(x) (Flack, 1983) and Bijvoet(d) (Cooper et al., 2016) parameters determined from the atomic model and the HUGged model, and by plotting D_{s}, the computed Bijvoet difference, for one model against the other.
The effect of modelling regions of the unit cell containing strong resonant scatterers with scattering from an electron-density map with no resonant-scattering effects may be demonstrated by a comparison of the statistics for sgd464: the Flack(x) and Bijvoet(d) parameters for a complete atomic model are −0.14 (12) and −0.08 (5), respectively, and the R1 value is 11.1%. After the chloroform molecule is removed and SQUEEZE applied, the remaining structure can be refined to an R1 value of 9.39%, but the Flack(x) and Bijvoet(d) parameters are now 4.7 (10) and 0.7 (2) (Table 2).
3.1. Structure sgd240
This is an organic material of known absolute configuration (from the starting materials) containing a sulfoxide group and a well-behaved acetonitrile of solvation. The data were measured with Mo Kα radiation. The solvent contains 22 electrons, SQUEEZE returns 114 electrons/cell in the voids, and −3 electrons/cell outside the voids. The principal normal and resonant-scattering atom is sulfur in the main molecule. The resonant scattering from the solvent is marginal so that the unmodified SQUEEZE refinement is essentially the same as the modified (Fig. 1).
3.2. Structure fg3257
This is an organic material containing C, H, N and O atoms, with two independent molecules and one dimethyl sulfoxide solvent in the asymmetric unit. The absolute structure was determined from the X-ray diffraction data. Modifying the SQUEEZE output using constants determined by equation 10 gives a Bijvoet(d) parameter of −0.30 (1). Multiplying c and d by factors of 1.5 and 2.0 gave Bijvoet(d) values of −0.02 (1) and 0.12 (0), respectively. The Flack(x) parameter increased from −0.30 (4) with unscaled d values to −0.00 (2) and −0.13 (2) with the increasing scaling factors. The near-unity value of the ratio (Z_{h} + Z_{s})/Z_{h} (1.13) suggests that the scaling of the A and B parts from SQUEEZE is almost correct, confirmed by the gradient (0.98) of the plot of the averaged Bijvoet pairs of the HUGged model versus the atomic model (Fig. 2); the need for scaling of d is demonstrated in the right-hand plot.
3.3. Structure sgd464
This is an organic material containing C, H, N and O atoms, with a diastereomeric ratio (dr) > 99:1 and chloroform of solvation. The data were measured with Mo Kα radiation.
The Wilson plot (Fig. 3) showed an anomaly for ρ > 0.35 so the structure was re-refined excluding the high-angle data. A difference density map showed small (< 0.5 e Å^{−3}) local maxima near chlorine. The solvent was excluded from the structure, and the main molecule was re-refined with HUGged contributions. A Fourier map (Fig. 4) computed using only A_{hug} and B_{hug} clearly recovered the solvent with elongated distributions near chlorine in an otherwise featureless map.
The HUGged structure refined to a lower R factor and gave a Bijvoet(d) estimate of Flack's parameter similar to that from the fully atomic model (Fig. 5).
3.4. Structure sgd475
This is an organic material containing C, H, F, O and N atoms (dr > 99:1), with disordered chloroform of solvation and measured with Cu Kα radiation. The difference map for the atomic modelled structure shows residual density. The normal probability plot for the weighted residuals, w(F_{o}^{2}−F_{c}^{2})^{2}, had a slope of 1.36 and many outliers. The SHELX-type weighting coefficients (Cruickshank, 1961) were 0.049 and 0.873, and the D_{s}/σ(D_{o}) plot (Watkin & Cooper, 2016) was unusually skewed. The conventional R factor for the HUGged model is higher than that for the atomic model, but the Bijvoet(d) estimate of absolute structure is still reliably determined. The unmodified SQUEEZE map contains 61 electrons per void; R(Av_{s}) = 0.12 and R(D_{s}) = 0.40.
3.5. Structure sk3422_III
This is an organic salt consisting of C, H, N and O atoms, containing a cation and a disordered mixture of hydrogen phosphite (H_{2}O_{3}P^{−}) and hydrogen fluorophosphonate (HFO_{3}P^{−}) anions in an 88:12 ratio. Although the Wilson plot looked normal, the N(z) plot contained bumpy deviations from the theoretical acentric curve. Refinement of the atomic model gave a conventional R factor of 1.87% (SHELX-type weighting parameters of 0.032, 0.000). The refinement of the HUGged model was more problematic (conventional R = 13.8%). A SHELX-type weighting scheme could not be determined automatically and the parameters (0.40, 0.00) were set manually to get a roughly flat distribution of residuals. Not surprisingly, the analysis of the HUGged data was also unsatisfactory. One possibility for these difficulties may have been failures in the interface between PLATON and CRYSTALS, but this is unlikely because the R1 value computed from the SQUEEZEd data in CRYSTALS was 17%, comparable with a value of 17% computed by PLATON and 15% computed by SHELXL (Version 2014/7; Sheldrick, 2015).
The average of the Bijvoet pairs determined by HUGged SQUEEZE was approximately one-half of that determined from an atomic model for the anion (Fig. 6). This led us to suspect that the root problem was the quality of the difference density maps. Plots of the A and B parts for the anion alone computed by SQUEEZE or an atomic model showed the same discrepancy (Fig. 7).
In sgd240, the ratio of the electron count of the atoms in the main moiety to that in the solvent was 11.6:1. In sk3422_III, the ratio is only 1.3:1. The column headed (Z_{h} + Z_{s})/Z_{h} in Table 1 shows that almost half of the total scattering is due to the anion, which could have an influence in the scaling, but clearly there is some other unidentified factor influencing the poor performance of sk3422_III. The structure consists of layers of hydrogen-bonded chains of fluorophosphonate–hydrogen phosphite sandwiched between layers of the organic cations. Fig. 8 shows the space-filling contents of the unit cell (MCE Version 2005 2.3.01) (Rohlíček & Hušák, 2007). The yellow regions represent inaccessible volumes in the structure, and it may be these which contribute to the problems.
3.6. Structure awisac02
This steroid derivative has a known absolute configuration.
There are two molecules in the asymmetric unit which are conformationally almost identical except for one hydroxy H atom, but not related by any approximate symmetry. The original authors could not locate an atomic model for the included solvent and SQUEEZEd the residual electron density, interpreting the electron count for the solvent-accessible volume as a disordered molecule of methylene dichloride. Examination of the peaks found in the PLATON difference synthesis showed that the residual density formed a continuous chain in a channel through the structure (Fig. 9).
HUGging the SQUEEZE output for a single molecule of CH_{2}Cl_{2} reduced the conventional R factor to 6.56%. However, the electron count in the cell voids from PLATON (95 e^{−}) is more than that for two single molecules (84 e^{−}) of CH_{2}Cl_{2}. Multiplying c and d (equation 10) by 1.5 (i.e. three molecules of the solvent per unit cell) reduced the R factor to 5.92%. The refined Flack(x) of 0.1 (2) and Bijvoet(d) of 0.33 (3) are on the correct side of 0.5, but are not convincing. The Hooft P(2) probability does not compute and the P(3) probability is strongly in favour of a twinned material.
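The electron-count comparison can be checked with a short Python sketch. The small table of atomic numbers and the function name are illustrative assumptions; the void count of 95 electrons is the PLATON value quoted above.

```python
# Atomic numbers for the elements needed here (illustrative subset).
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "S": 16, "Cl": 17}

def electrons_per_molecule(formula):
    """Electron count of a neutral molecule given as {element: count}."""
    return sum(ATOMIC_NUMBER[element] * n for element, n in formula.items())

ch2cl2 = electrons_per_molecule({"C": 1, "H": 2, "Cl": 2})   # 6 + 2 + 34
void_electrons = 95.0                                        # from PLATON
molecules_in_voids = void_electrons / ch2cl2                 # between 2 and 3
```

The count of 42 electrons per CH_{2}Cl_{2} molecule reproduces the 84 electrons quoted for two molecules, and the void count corresponds to between two and three molecules per unit cell.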
3.7. Simulated diffuse solvent
During the review of this manuscript, one referee was interested to know how HUGging would perform if the strong resonant scatterers in the solvent were highly disordered. This situation could be studied by altering the model of compound sgd464, which contains a well-ordered chloroform of solvation. In order to simulate a very disordered solvent, the three Cl atoms were replaced by an annular distribution equivalent to the three Cl atoms (Schröder et al., 2004) (Fig. 10).
The Flack(x) parameter was set at 0.02 and the U_{iso} value for the ring set at 0.06 Å^{2}. The structure factors computed from this model were treated as (error free) observations, but retaining the estimated standard uncertainties of the original data; the R factor was 0.03%. The Fourier synthesis calculated from this simulated data had the distributed electron density shown in Fig. 11.
The whole chloroform residue, including the C and H atoms, was deleted and the data SQUEEZEd. The Fourier synthesis computed from structure factors including the SQUEEZEd A and B parts was a close approximation to the original synthesized data (Fig. 12). Refinement of the structure including the SQUEEZEd A and B parts gave an R factor of 1.6% and a Flack(x) parameter of 2.9. HUGging and reweighting the SQUEEZEd data gave an R factor of 1.7% and a Flack(x) parameter of 0.02 (3), admirably close to the value of 0.02 used in simulating the data. The Bijvoet(d) parameter was 0.02 (1).
4. Conclusions
These preliminary observations show that an approximation to the resonant scattering can be computed for a disordered solvent molecule or counterion which may be adequate for the determination of absolute structure. It seems that the greatest chance of success occurs when the solvent/counterion has significant resonant scattering but its real scattering must not overwhelm that of the host molecules.
Since the HUG algorithm is only a post-processing of the output from SQUEEZE, the success of the method is critically dependent on the applicability of SQUEEZE. The computation of c and d (equation 10) has no knowledge of the distribution in the voids of the strong resonant scatterers so that the HUG procedure can only be expected to be indicative of the absolute structure. Except for awisac02 and the simulated data above (§3.7), in the cases examined here, the solvent had been modelled by discrete atoms so that the target results were known. This will not be the case in real-life applications, but it seems that if the absolute structure of an all-light-atom material is required, it makes sense to attempt to recrystallize it from a solvent containing strong resonant scatterers, even if there is a likelihood that these may be incorporated as disordered solvent.

Footnotes
‡Deceased 2 February 2017.
§RIC & DJW are grateful to HDF for his inspiration and encouragement. We have tried to preserve his contributions to this manuscript unaltered where possible; any errors or omissions are our own.
^{1}HDF's original wording. `Miasma' might be replaced with `contribution' without altering the meaning here.
^{2}Just a few weeks before he died, Howard Flack drastically reorganized a draft of this manuscript and changed its prosaic title to `HUG and SQUEEZE'. One could not help remembering the CAMEL JOCKEY, with or without the humps (Watkin & Schwarzenbach, 2017).
Funding information
Funding for this research was provided by: EPSRC (grant No. EP/K013009/1 to RIC).
References
Bojarska, J., Maniukiewicz, W., Sieroń, L., Fruziński, A., Kopczacki, P., Walczyński, K. & Remko, M. (2012). Acta Cryst. C68, o341–o343.
Carruthers, J. R. (1977). In Proceedings of the 4th European Crystallographic Meeting (ECM4), Oxford, UK, 30 August–3 September 1977. Abstract Ob. 2.
Chernega, A. N., Davies, S. G., Goodwin, C. J., Hepworth, D., Kurosawa, W., Roberts, P. M. & Thomson, J. E. (2009). Org. Lett. 11, 3254–3257.
Cooper, R. I., Watkin, D. J. & Flack, H. D. (2016). Acta Cryst. C72, 261–267.
Cruickshank, D. W. J. (1961). In Computing Methods and the Phase Problem, edited by R. Pepinsky, J. M. Robertson & J. C. Speakman, Paper No. 6. Oxford: Pergamon Press.
Davies, S. G., Figuccia, A. L. A., Fletcher, A. M., Roberts, P. M. & Thomson, J. E. (2013a). Org. Lett. 15, 2042–2045.
Davies, S. G., Fletcher, A. M., Roberts, P. M., Thomson, J. E. & Zammit, C. M. (2013b). Chem. Commun. 49, 7037–7039.
Fábry, J., Fridrichová, M., Dušek, M., Fejfarová, K. & Krupková, R. (2012). Acta Cryst. C68, o76–o83.
Flack, H. D. (1983). Acta Cryst. A39, 876–881.
Flack, H. D. & Shmueli, U. (2007). Acta Cryst. A63, 257–265.
Parsons, S., Flack, H. D. & Wagner, T. (2013). Acta Cryst. B69, 249–259.
Pearce, L. J. (1995). PhD thesis, University of Oxford, England.
Qian, M., Engler-Chiurazzi, E. B., Lewis, S. E., Rath, N. P., Simpkins, J. W. & Covey, D. F. (2016). Org. Biomol. Chem. 14, 9790–9805.
Rohlíček, J. & Hušák, M. (2007). J. Appl. Cryst. 40, 600–601.
Schröder, L., Watkin, D. J., Cousson, A., Cooper, R. I. & Paulus, W. (2004). J. Appl. Cryst. 37, 545–550.
Sheldrick, G. M. (2015). Acta Cryst. C71, 3–8.
Shi, P., Zhang, L. & Ye, Q. (2012). Acta Cryst. C68, o266–o269.
Sluis, P. van der & Spek, A. L. (1990). Acta Cryst. A46, 194–201.
Spek, A. L. (2015). Acta Cryst. C71, 9–18.
Watkin, D. & Schwarzenbach, D. (2017). J. Appl. Cryst. 50, 666–667.
Watkin, D. J., Carruthers, J. R. & Betteridge, P. W. (1985). CRYSTALS User Guide. Chemical Crystallography Laboratory, University of Oxford, England.
Watkin, D. J. & Cooper, R. I. (2016). Acta Cryst. B72, 661–683.
© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.