Hans-Beat Bürgi tribute
X-ray constrained wavefunctions based on Hirshfeld atoms. I. Method and review
aSchool of Molecular Sciences, The University of Western Australia, 35 Stirling Highway, Crawley 6009, Western Australia, Australia, and bDepartement für Chemie, Biochemie und Pharmazie, Universität Bern, Freiestrasse 3, 3012 Bern, Switzerland
*Correspondence e-mail: dylan.jayatilaka@uwa.edu.au
The X-ray constrained wavefunction (XCW) procedure for obtaining an experimentally reconstructed wavefunction from X-ray diffraction data is reviewed. The two-center probability distribution model used to perform nuclear-position averaging in the original paper [Grimwood & Jayatilaka (2001). Acta Cryst. A57, 87–100] is carefully distinguished from the newer one-center probability distribution model. In the one-center model, Hirshfeld atoms are used, and the Hirshfeld atom based X-ray constrained wavefunction (HA-XCW) procedure is described for the first time, as well as its efficient implementation. In this context, the definition of the related X-ray wavefunction (XWR) method is refined. The key halting problem for the XCW method – the procedure by which one determines when overfitting has occurred – is named and work on it reviewed.
Keywords: X-ray constrained wavefunction; Hirshfeld atom; halting problem; X-ray wavefunction refinement.
1. Introduction
Twenty years ago, Grimwood & Jayatilaka (2001) reconstructed for the first time an electronic wavefunction for the molecular crystal of α-oxalic acid dihydrate with the help of an X-ray diffraction experiment. This wavefunction minimized the quantum mechanical Hartree–Fock energy, and at the same time was adjusted to fit X-ray diffraction data to a desired level according to a chosen agreement statistic. Such wavefunctions have become known as X-ray constrained or restrained wavefunctions (XCWs), and they represent one of the central concepts of quantum crystallography (QCr) (Grabowsky et al., 2017; Genoni et al., 2018; Grabowsky et al., 2020; Macchi, 2020; Genoni & Macchi, 2020).
The XCW method has recently received renewed interest (Landeros-Rivera et al., 2021; Macetti et al., 2021; Kleemiss et al., 2021; Grabowsky et al., 2020; Ernst et al., 2020; Genoni et al., 2017), especially since the development of the Hirshfeld atom (HAR) method (Jayatilaka & Dittrich, 2008; Capelli et al., 2014; Woińska et al., 2016). This is because HAR, also a development in the QCr field, better-known than the XCW method, is capable of obtaining hydrogen atom positions and ADPs in agreement with independent neutron diffraction experiments in a `fairly automatic and standard way' (Woińska et al., 2016). HAR is particularly significant for molecular biology because hydrogen atoms play such a crucial role in that field, evidenced by the fact that billion-dollar spallation neutron sources have been commissioned in part to detect these atoms (Henderson et al., 2015; Hall-Wilton & Theroine, 2014). In this regard, it is worth mentioning that there have already been recent attempts to extend HAR for use in such macromolecular systems (Malaspina et al., 2019; Bergmann et al., 2020; Chodkiewicz et al., 2022).
Given the recent interest, we think that a review of the status of the XCW method is appropriate, especially since the most popular version of it – the Hirshfeld atom X-ray constrained wavefunction (HA-XCW) method – has never been fully described in the literature. The HA-XCW procedure involves some technical steps which, if not properly implemented, would make it very impractical. Furthermore, the inception of the HAR method necessitates clarification of the X-ray wavefunction et al., 2017). Therefore, after a recapitulation of the purpose of the XCW procedure, the review will concentrate on these newer issues in §2. Section 3 will present the equations for the HA-XCW method, clarifying the relationship to the earlier methods. After the Summary and Conclusion in §4, some reminiscences by one of us (DJ) concerning Hans-Beat Bürgi are presented.
(XWR) method, which involves the simultaneous of both structural (HAR) and electronic (XCW) parameters in the model wavefunction (Woińska2. Review and status of the X-ray constrained wavefunction (XCW) method
There are two main reasons why one might want to obtain an XCW (Jayatilaka & Grimwood, 2001):
(1) Reliability and accuracy. To have a model of a diffraction experiment which explicitly incorporates quantum-mechanical knowledge through the use of a model wavefunction whose parameters are variationally determined, and which is, in-principle, systematically improvable both by adding functional flexibility to the wavefunction and by adding more experimental data; and
(2) Data compression. To have a single model capable of describing several different kinds of experiments simultaneously (e.g. diffraction and spectroscopy), with the same set of parameters.
The second point might be viewed as a corollary of the first, since wavefunction-based models are in some sense `universal' from a quantum mechanical point of view: it is hard to see how one could develop a conceptually better approach.
2.1. What is being measured?
This paper concerns fitting various models for the X-ray diffraction experiment, so we should begin by explaining what the measured quantities are, and why these are so important for determining a wavefunction.
In X-ray diffraction, the experimental quantities of interest are the intensities of the Bragg spots, which nowadays are mostly obtained as digital images. These images comprise the raw data. After extracting the cell orientation and shape, these spots are labelled by a triple of integers (Miller indices) representing the scattering vector positions on the e.g. for absorption, extinction, inelastic (phonon) scattering, and for differing X-ray scattering path lengths within the crystalline sample. An angular geometrical factor correction is also applied, as well as any other instrument-specific corrections, e.g. those related to a lack of incident X-ray beam constancy and perhaps angle of incidence on the detector. What is left after these corrections is an estimate of the (elastic) Bragg intensities of the X-ray diffraction spots, which, for an ideally in the weak-scattering limit (sometimes also called the first Born approximation limit) will be proportional to , the square of the magnitude of the Fourier transform of the average electron density in the each diffraction spot corresponding to a particular Fourier component . The square root of these corrected intensities are the measured quantities, the magnitudes, conventionally written
(the scattering vector is the difference between the direction of the wavevectors for the scattered and incident X-ray beams). The boundaries of these spots on the detector image are delineated, and the raw intensities of the spots extracted from the images, in some unit system. After this, several (hopefully small) corrections may be applied, symbol 〈 〉 represents the averaging process, discussed in more detail below. Important for any measurement process, is that each of the measured quantities are associated with an estimated standard uncertainties (ESUs), conventionally written . These ESUs are obtained not only by repeated measurement of the same reflection from the sample oriented in different geometric configurations with respect to the X-ray beam, but also by propagation of errors according to different models related to the mentioned experimental corrections. Many of these steps have been succinctly described by Kabsch (2010)The point of explaining this whole process is to note that the X-ray diffraction experiment is a complex one, and the quantities which are measured are subject to many theoretical assumptions and manipulations. Or, as Einstein said very elegantly in a conversation with Heisenberg (as translated by Joos et al. (2013) on p. 107):
`Only the theory decides about what can be observed… on [the] entire long path from a process up to our conscious perception we need to know how Nature is working in order to claim that we have observed anything at all.'
Thus, due to the long path associated with these complex manipulations, it is clear that the experimental errors associated with the experimental
magnitudes are difficult to quantify. It should be noted that it is rare for an X-ray diffraction experiment to involve averaging data over multiple samples; that is, the term experiment is used in the usual sense, as a process applied to a single sample, applied over a limited time span. Note that it is not our intention to claim (neither do we think it was Einstein's) that somehow, theory is more important than experiment. Nor is there a claim that experiment is somehow theoretical in nature – experiments are fundamentally different from theory. Rather, the intention is to say that experiment and theory are much more intertwined than is often stated.2.2. Why use X-ray diffraction data?
The main reason why the Fourier components of the electron density are of interest is that, according to the Hohenberg & Kohn (1964) theorem, the ground-state electronic energy of a system is a universal functional of the electron density (under certain conditions), usually written E = F[ρ]1. Therefore, the three-dimensional ground-state electron density ρ0 is all that is needed to define the exact ground-state electronic wavefunction Ψ0 of a system (Dreizler & Gross, 1990). Although not often mentioned, the Hohenberg–Kohn theorem ensures that the ground-state wavefunction is directly reconstructable from experiment, since the electron density is experimentally reconstructable2.
2.3. How is fit defined?
Since we are concerned with fitting to experimental data, we now explain how fit is defined, and some related issues.
The quantity used to assess the agreement between the experiment and any model is conventionally the (square of the) goodness of fit statistic, defined by
Here is the modelled or calculated X-ray r. ξ is an overall scale factor, determined by least-squares minimising the GoF2 (sometimes simultaneously with other least-squares refined parameters, as discussed further below, when we describe HAR in §2.5 and XWR in §2.8). The scale factor is necessary because the experimental magnitudes are not measured on an absolute scale3. Nrefl is the number of X-ray measured reflections; while Nparam is the number of parameters in the least-squares structure model, i.e. the number of symmetry-unique atomic (nuclei) coordinates and atomic displacement parameters, which we usually call NpADP, plus one for the ξ scale parameter. That is
magnitude for a reflectionwhere Nmisc is at least equal to one (for the scale parameter) but may also count any additional least-squares refined parameters associated with phenomenological corrections that may have been used, as we have discussed e.g. extinction parameters. Important for later considerations, the GoF2 depends not only on the experimental measurements, but also on the associated ESUs, the values; and of course, also on the number of parameters Nparam in the model4.
In this paper, we use GoF2 defined in terms of magnitudes even though it is common to use the square of the magnitudes instead; generalising all our results in this paper to use the square of the magnitudes is not difficult (Kleemiss et al., 2021). GoF2 is often defined in terms of squared magnitudes because these are more directly related to the measured intensities, and an important consideration for us in this paper is that estimation of the associated values is more straightforward in this case. Nevertheless, there remain several advantages to using the magnitudes in the field of electron density reconstruction.5
2.4. Wavefunctions from averaged electron densities
A wavefunction is generally used to describe a microscopic quantum system, so its use for a macroscopic entity such as a molecular crystal with finite temperature and ).
deserves more comment than was given in the initial publication (Jayatilaka & Grimwood, 2001The quantities which are observed in the X-ray diffraction experiment are the intensities of the Bragg reflections. Under idealized conditions (at low temperature, with a small crystal which does not absorb the X-rays, no dynamical re-diffraction, and with a crystal composed of crystallites which are only slightly orientationally disordered) these intensities are related to the square of the magnitude of the structure factors, the Fourier transform of the average electron density in an ideal ; Trueblood et al., 1996). The importance of the X-ray diffraction experiment is that it leverages the scattering from the translational repetition of many unit cells to get accurate averaged atomic and subatomic information on the electron density – information that may not be as accurate if one only examines a single region using microscopic techniques. The wavefunctions defined from such diffraction data must therefore incorporate this averaging process in some phenomenological way, as now described.
The word average means both time-averaging over the motions of the atoms in the and also a space-averaging of identical atoms in different mean positions in different unit cells (Bürgi, 1989In this work, we employ a fixed-nucleus localized molecular wavefunction convolved with an infinitely repeated lattice function to represent the wavefunction for a molecular crystal. The static or fixed-atom electron density from this wavefunction Ψ is calculated according to quantum mechanics by integrating its square |Ψ|2 over all but one electron spatial coordinate. Since it is impossible to obtain exact wavefunctions for large molecules (Kohn, 1999), it is common to expand the static wavefunction in terms of products of one-electron orbitals, which are themselves expanded in a set of basis set functions which are centred on the positions of the nuclei (Boys, 1950). Thus, as a consequence of the squared nature of the electron density and the use of basis functions with a centre (in contrast to plane wave basis functions which have no fixed centre), the quantum mechanical electron density is required to have the form of a product of basis functions on two centres.
The expansion of the electron density as products of basis functions on up to two distinct centres is important because if one wants to develop a model for the position-averaged electron density in the ideal b; Willis & Pryor, 1975). It is only when this macroscopic unit-cell averaging process is accounted for in some phenomenological way that we can speak of an underlying wavefunction (which we intend to fit to the data) that describes the crystal.
then one must account for the displacement of these atom pairs (and the basis functions which are centred on them) from their averaged positions over all unit cells. This is achieved with the so-called two-centre probability distribution functions (TCPDFs) (Scheringer, 1972It is worth comparing the formally more correct TCPDF position-averaging process to what is preferred in standard crystallography. In standard crystallography, the electron density is not regarded as coming from a complicated multi-atom multi-centre wavefunction; it is rather regarded as being composed of asum position-unchanging, and position-uncorrelated, atomic electron densities. Clearly, in this `independent atom' model (IAM), the average electron density does not depend on TCPDFs: since motion on the two centres is uncorrelated, only one-centre probability distribution functions (OCPDFs) are needed. These OCPDFs are assumed to be three-dimensional Gaussian distributions, characterised by a matrix of atomic displacement parameters (ADPs) (Trueblood et al., 1996). Thus, in standard crystallography, a nuclear position is introduced at which an assumed fixed-shape or static atomic electron density is placed, and this fixed-shape electron distribution is `convolved' with the OCPDF described by the ADPs to produce an averaged atomic density supposedly seen by the scattering X-ray beam (Einstein, 1907, 2005; Bürgi & Capelli, 2000). This atomically-averaged electron distribution is then summed over all atoms to produce an approximation to the averaged total electron density in the This averaged unit-cell electron density, being ultimately a function of atomic positions and ADPs, is least-squares refined with respect to these parameters (and an overall scale factor) to fit the X-ray diffraction magnitudes, which under ideal conditions are essentially the square-root of the Fourier transform of the unit-cell electron density. These kinds of one-center models for the electron density are well known and standardised (Giacovazzo et al., 1992; Dunitz, 1995).
In order to relate the OCPDF formalism of standard X-ray crystallography, in the original XCW procedure several models were considered to express the TCPDFs in terms of one-center probability distributions. These were the approximations of Stewart (1969), Coppens et al. (1971), and, for Gaussian basis functions, the approximation of Tanaka (1988). With such approximations in hand, the two-center probability can be written in terms of the one-center ADPs, and the XCWs in the original Grimwood & Jayatilaka (2001) work are more properly called TCPDF-XCWs.
2.5. Hirshfeld atom (HAR) and its significance
The complexities of the TCPDF models inherent in the original XCW approach motivated Jayatilaka & Dittrich (2008) to develop a new one-centre model called Hirshfeld atom (HAR). In HAR, the electron density is obtained from a quantum mechanical model based on estimates of the nuclear positions and further partitioned into aspherical atomic density pieces according to Hirshfeld's stockholder partitioning (Hirshfeld, 1977a,b). These aspherical atomic pieces are then convolved with the usual OCPDFs and summed over all atoms to give, as in the standard crystallographical approach, the averaged unit-cell electron density which is, again, least-squares refinable in terms of parameters describing the atomic positions and ADPs.
One difference in HAR compared to the standard IAM is that the shape of the electron density, as well as the atomic partitioning, depends on the position of each atom. Therefore, the wavefunction must be recalculated every time the geometry changes during the et al., 2014). This is not needed in the IAM because for a superposition of fixed-shape atomic densities, a position change only results in a phase change on the Fourier transformation.
process (CapelliThe importance of HAR is that it produces atomic coordinates including hydrogen atom positions and anisotropic ADPs in agreement with those from independent neutron diffraction experiments in a fairly automatic and standard way (Woińska et al., 2017). HAR thus provides the best estimates of the nuclear positions presently available from X-ray data, and this provides strong but indirect evidence that the electron densities associated with the theoretical wavefunctions whose nuclear positions are experimentally determined are reliable.
2.6. The effect of HAR positions and ADPs on the electron density
In the XCW procedure, the positions of the nuclei in the fragment wavefunction Φ, and the values of the ADPs, are fixed by some specified predetermined procedure. Therefore, in the XCW fitting procedure it is important to state how these positions and ADPs are chosen.
The importance of using HAR positions and ADPs in producing electron densities was illustrated in section 2 of Capelli et al. (2014), where it was found that, if such positions and ADPs were used for the ammonia crystal, then the model magnitudes agreed with the measured ones within the experimentally ESUs. In other words, if HAR positions and ADPs are used for ammonia, then no new information concerning the electron density can be obtained from the X-ray diffraction experiment. In contrast, if multipole-model refined atom positions and ADPs were used, then the unfitted wavefunctions did not produce agreement between the experimental and calculated magnitudes within the experimental errors; and further XCW fitting yielded features in the electron density that differed from those calculated with ab initio wavefunctions (Bytheway et al., 2002). Importantly, this conclusion holds only if one believes the experimental errors are correctly estimated, a matter which is reviewed in more detail below.
Clearly, the HAR positions and ADPs produce better electron densities than those produced from the multipole model (Fugel et al., 2018). Therefore, the use of fixed HAR positions and ADPs with a further fitting of the electronic parameters in the wavefunction via XCW has been recommended by Grabowsky et al. (2012) and Woińska et al. (2017), and that this procedure be called X-ray wavefunction (XWR). Note that the XCW procedure used by these authors made use of a TCPDF position-averaging model, the same as was used by Grimwood & Jayatilaka (2001) Therefore, the recommended XWR procedure was somewhat inconsistent since it requires an OCPDF model, HAR, to obtain the atomic positions and ADPs, but a TCPDF model to refine the electronic wavefunction.
2.7. The Hirshfeld atom X-ray constrained wavefunction (HA-XCW)
The OCPDF model of atomic displacement (and position averaging, or thermal smearing) based on Hirshfeld atoms, as used in the Tonto (Jayatilaka & Grimwood, 2003), but remarkably, has never been documented in the literature!
formalism inside HAR since 2008, can also be transferred to XCW fitting. This produces an OCPDF XCW; and since it makes use of the HA model of thermal smearing inside the formula we term it HA-XCW to distinguish it from the XWR protocol and from earlier TCPDF XCW versions. To be clear: the HA-XCW model implies a one-center averaging procedure (but not all one-center averaging models are necessarily Hirshfeld atom based models). The HA-XCW procedure itself has been available for nearly a decade via the softwareThere has only been one study by Bučinský et al. (2019) where the impact of the TCPDF approach based on Stewart, Coppens and Tanaka on the magnitudes was compared to the impact of using the Hirshfeld atom OCPDF model. There, it was shown that the calculated magnitudes are indeed systematically different between the TCPDF and OCPDF approaches, but it was not investigated how this, in turn, impacts the electron density from the fitted XCW itself.
2.8. Clarification of the X-ray wavefunction (XWR) procedure
Technically, the use of a HA-based OCPDF smearing model is independent of how one chooses the (fixed) atomic coordinates and ADP values in an XCW fitting procedure. Therefore, for simplicity, clarity and consistency, we recommend from now on to use the same position-averaging or thermal-smearing model in both the least-squares of the positions and ADPs, as well as in the XCW fitting procedure. Concerning XWR then we therefore recommend that HAR be used to obtain the geometric and ADP parameters, and that HA-XCW fitting be used with these HAR parameters. That is, we recommend to define a new consistent Hirshfeld-atom X-ray wavefunction model by the mnemonic
Should different position-averaging methods be used in the XWR procedure, these can always be specified by the particular method, in the following brackets, or by explicit description. For example, in Grabowsky et al. (2012), a HAR + TCPDF(Tanaka)-XCW method was used to produce experimental wavefunctions, where there the HAR pertains to how the geometric and ADP parameters were obtained, and the prefix [TCPDF(Tanaka)] of the XCW part specifies how the thermal smearing was performed in the XCW part.
2.9. Are properties derived from XCWs reproducible and reliable?
Both TCPDF and HA-XCWs have been used to produce properties such as in-crystal dipole moments, polarisabilities, and refractive indices (Whitten et al., 2006; Jayatilaka et al., 2009), or in-crystal hyperpolarisabilities and second-order optical susceptibilities (Hickstein et al., 2013). Furthermore, chemical bonding descriptors were produced, such as the electron localization function (ELF) (Grimwood et al., 2003), the electron localizability indicator (ELI) (Grabowsky et al., 2010), the delocalisation index (Grabowsky et al., 2011), and Roby-Gould bond indices (Grabowsky et al., 2012). The properties from these XCWs compared favourably with purely theoretical calculations, and agreed with chemical expectations.
XCWs have also been obtained from structure factors generated from theoretical ab initio wavefunctions, with the aim of understanding to what extent (magnitude, accuracy) polarization by the crystal field and relativistic effects could be retrieved from the structure factors (Bučinský et al., 2016; Genoni et al., 2017; Ernst et al., 2020; Podhorský et al., 2021; Macetti et al., 2021). For example, Genoni et al. (2017) demonstrated that effects in the valence region of the electron density could be recovered, provided high-resolution data was downweighted, whereas Bučinský et al. (2016) quantified relativistic effects in the electron density, and under what conditions such effects might be detected experimentally.
Despite all this work, it is still not clear whether XCWs produce electronic properties or bonding descriptors which are experimentally reproducible (in the sense of those properties being robust when the same or very similar XCW analyses are applied to different experimental data on the same system). Indeed, there are reasons to suspect that at present the XCW procedure may not yield experimentally reproducible features in the electron density—notwithstanding that the XCW has been fitted to electron density magnitudes. For example, in the case of the ammonia crystal already mentioned, it was confirmed that small differences in the atomic positions and ADPs do indeed strongly affect the electron densities obtained from an XCW fit (Bytheway et al., 2002; Capelli et al., 2014). Yet even though the magnitudes from a HAR agree better with the experimental data even before any XCW fitting procedure in this case, it is not clear the electron densities produced from XCW after HAR are accurate or reasonable in general.
In this regard Kleemiss et al. (2021) found differences in the electron density distributions for the epoxysuccinyl amide molecule that, upon a full XWR(HA) fitting, showed qualitatively different features to those obtained using other methods; in fact, the XWR(HA) features were three times as large as those in reference calculations (though, in this case, it was known that the sample suffered from radiation damage during the measurement). For high-quality data of urea and xylitol, on the other hand, the fitted wavefunctions agreed favourably with theoretical reference calculations (Ernst et al., 2020; Malaspina et al., 2021).
Very recently, Landeros-Rivera et al. (2021) investigated the XCW procedure and concluded that XCW is very sensitive to the manipulation of crystallographic data, by which they mean the effect of using different well-established experimental methods to estimate the errors in the data. These authors mainly examined the effect on agreement statistics between the calculated and observed experimental data. However, in the one case where properties were investigated (for SO2), the integrated atomic charges, the delocalisation bond index, and the Laplacian of the electron density obtained with different error estimates compared well. For example, the delocalisation indices were stable to about 3%, the atomic charges were stable to 4%; and the images of the Laplacian contours from the different error-model and resolution-model treatments were, overall, very difficult to distinguish from each other by eye. It was only when critical points of the Laplacian were investigated, that differences could be seen.
Therefore, in a new study [hereafter referred to as part II (Davidson et al., 2022)] we have reinvestigated the oxalic acid dihydrate data used in the very first XCW fitting paper on a molecular compound (Grimwood & Jayatilaka, 2001) with the HA-XCW fitting procedure – and extended it by 13 further X-ray diffraction data sets measured at the same temperature. Indeed, some systematic features are stable and reproducible throughout all data sets as well as in a joint consensus fitting of all data sets simultaneously. However, only some of these features can be ascribed to physical effects such as the above-discussed and polarization effects. Others represent recurring systematic errors in the measurements or the XCW model.
2.10. Strategy for finding the XCW
Let us recall the strategy for finding an X-ray constrained wavefunction.
The strategy for finding the fitted wavefunction Ψ is to minimise its quantum mechanical energy EQM[Ψ] in the usual way, but also to simultaneously subject the wavefunction to a constraint or condition expressing a desired agreement between the experimental and modelled or calculated data. (It should be realised that, according to the variational theorem, minimising the variational energy expression EQM[Ψ] gives a wavefunction which best approximates the exact wavefunction in a least-squares sense). Thus, the strategy for finding the XCW implies a fundamental tension between two requirements: on the one hand, minimising an energy expression, and, on the other, minimising the agreement statistic.
Why adopt such an abstruse procedure to determine the fitted wavefunction?
(i) The main reason is that, in general, there may not be enough data to determine all the parameters in even a simple electronic wavefunction. Thus, the variational energy expression supplies the missing information needed to determine all the wavefunction parameters
(ii) Equally important, though, is the fact that the experimental data are inherently subject to errors, and quantum mechanical information may be used to mitigate these errors. The word mitigate used here is subjective, since it presumes that the experimental errors are larger than any of those introduced by incorporating information from the quantum mechanical method. It is clearly preferable to address these uncertainties in a purely experimental way – if measurement is the main concern. However, one may be less interested in the measured quantities, or it may not be impossible to do a better measurement. Then, one may be interested in obtaining the best estimate for a property given all the information at hand, both experimental and theoretical.
2.11. Lagrange method for finding the XCW
To obtain the wavefunction according to the above methodology, we use Lagrange's method of undetermined multipliers, see e.g. Arfken et al. (2013). According to this method, we define a Lagrangian functional6
as a sum of the quantum mechanical energy, EQM[Ψ], and a Lagrange multiplier λ multiplying the constraint expression, involving the difference between the GoF2 [see equation (2)] and the desired value for this statistic, Δ. It is not very important what the specific form of the agreement statistic is – different forms may well have different advantages – but as we have said, in crystallographic least-squares it is common to use the goodness of fit defined in equation (2).
Lagrange's method now requires making J stationary with respect to variations in the wavefunction Ψ, and with respect to variations in the Lagrange multiplier λ. At the optimum point where J is stationary we obtain the value of the Lagrange multiplier λ = λopt and the fitted wavefunction . The fitted wavefunction then determines the desired value Δ of the goodness of fit statistic GoF2. Explicitly, the stationary conditions are
and
The second condition (6) simply returns the constraint condition expressing the desired agreement between the observed and calculated data. GoF2[Ψ] is a function of the magnitudes , see equation (2), which are in turn calculated from the electron density obtained from the wavefunction Ψ. Hence GoF2[Ψopt] may also be regarded a function of λopt,
However, at the stationary point of J the agreement statistic GoF2[Ψopt] equals Δ through equation (6), since it is the value of Δ that determines Ψopt through this same equation. On the other hand, the first condition, equation (5), is harder to evaluate because it takes different forms depending on the specific kind of wavefunction which is used. We evaluate it for the case of single determinant wavefunctions later on in §3.3, using the Hirshfeld atom and other TCPDF methods to obtain the calculated structure factors. Since J and EQM are nearly the same except for the constraint condition, equation (5) yields very equations very similar to those from a `normal' variational calculation for the adopted ab initio wavefunction Ψ.
2.12. The XCW as a function of the Lagrange multiplier and desired fit
To further explore the meaning of the above equations, first consider the case λ = 0. In this case, the second term in equation (4) is zero, and the wavefunction returned by making J stationary, Ψλ=0 = ΨQM, which is simply the usual (calculated) ab initio wavefunction ΨQM. On the other hand, for λ > 0, a bias or penalty is added to J. The effect of this penalty term is to change the magnitude of J, so that in the limit λ → ∞ the magnitude of the second term on the right hand side of equation (4) dominates the value of J. Therefore, making J stationary in the limit λ → ∞ returns a wavefunction Ψλ=∞ which is very similar to (but not the same as!) a least-squares fit.
In summary, the Lagrange multiplier λ in the XCW method allows a smooth interpolation between a calculated or purely theoretical wavefunction, and a fitted or experimental wavefunction. At intermediate values of 0 < λ < ∞, the wavefunction returned by minimising J is a chimera between the calculated and experimental wavefunctions. Hence, the wavefunctions Ψλ smoothly connect the quantum mechanical wavefunction ΨQM = Ψλ = 0 with the wavefunction in the sense of differential geometry. This is very important because there are, in principle, an infinity of wavefunctions which might reproduce the GoF2 to a desired level of agreement; and out of all these, the quantum mechanical method provides the missing information to select the one with the lowest variational energy. This is illustrated in Fig. 1.
The above discussion suggests how one may solve the Lagrange equations in practice: one simply chooses increasingly larger values of λ until the desired value Δ of the GoF2 statistic is achieved; at this point we will have found λopt.
When one reads papers on the XCW method, it may seem that the value Δ and the optimum value of the Lagrange multiplier λopt have all disappeared. They have not! It is only convention that in most practical applications, the distinction between λ and λopt, and between GoF2[Ψopt] = Δ is blurred, because the solution method involves exploring different values of λ, and all these quantities are determined all at once.
There is another point to make: in principle, with an infinitely flexible basis set, it would be possible to exactly reproduce the experimental data (perhaps it would be better to say interpolate the data). This clearly shows that the XCW fitting is not a least-squares procedure, a point which we return to later. Therefore, for any XCW, it is important to note the basis set which has been used.
2.13. XCW from a least-squares perspective
From a least-squares perspective the XCW procedure can be viewed as a EQM[Ψ] term in J. This should be familiar to crystallographers who are used to introducing missing information through the use of restraints e.g. restraints on bond lengths, so that they are not too far from a certain empirically established target value; or restraints on the coordinates of certain atoms so that they remain in a plane. Here, it is important to emphasize that the information being introduced by the use of the quantum mechanical energy expression EQM[Ψ] is much more fundamental and general in nature than any empirical constraint; and this information is, in principle, also systematically improvable to the true value (according to theory, at least) by using better energy expressions or wavefunctions. The procedure of using better wavefunctions has been well established by the quantum chemists (Goerigk et al., 2017). Put simply: quantum chemistry works.
with a large number of restraints – the restraints being represented in a complicated way by theThe notion of using crystallographic restraints raises the vexed question of the number of independent parameters in the least-squares model. Indeed, as we have said, at λ = 0 the XCW procedure determines all the electronic wavefunction parameters without the need of any experimental data (we ignore the positional, ADP and scale factor parameters for now). And, as λ approaches infinity, only the least-squares term dominates the expression for J. In the large-λ limit then, the maximum number of parameters which can be determined by least-squares is equal to the number of reflection data Nrefl (assuming this is less than the number of parameters in the wavefunction, , which is almost always the case, see section 3.4). This means that, from a least-squares perspective, the effective number of independent parameters Nparam(λ) determined by the XCW procedure must depend on the value of λ, with Nparam(0) = 0 and . If it were possible to determine values for Nparam(λ) at intermediate values of λ such that 0 < λ < ∞ then this value could be substituted into the equation (2) for GoF2, and we would expect that at a certain point λopt this GoF2 value would display a minimum value, where one could be reassured that the XCW procedure would display a maximum benefit for the number of independent parameters, i.e. we would not have a situation of overfitting or interpolation. Various methods can indeed be devised to estimate the effective independent number of parameters, say by eigenvalue filtering, see Diamond (1966) for the use of this method in a crystallographic context, but these have not been tried yet.
Something should also be said about the notion of overfitting in a least-squares context. The key quantities of interest in a least-squares procedure are that there is no bias in the estimated parameters (i.e. that the difference between the estimated and expected values of the parameters is zero), and that the variance of the prediction parameters is minimised (this is the `least' part of the least-squares process). The notion of overfitting is related to large values of the variances, due to having too many parameters in the model, because the least-squares equations are ill-conditioned. What Hoerl & Kennard (1970) and Marquardt (1970) noticed [see also Marquardt & Snee (1975) for a more accessible discussion] is that if `extra information' is introduced to increase the bias, then the least-squares error can be reduced. Indeed, the least-squares error is exactly the sum of the variance and the square of the bias. This is the famous variance-bias trade-off. In a sense it is obvious: as we said, at λ = 0 all the parameters is the model are determined: there are no independent least-squares parameters in this case. This is nevertheless important because, often an unbiased estimate with large error is less useful than a biased estimate that has a smaller error. It all depends on how small the error estimate is!
2.14. XCW fitting is a regularisation procedure
We have mentioned that the number of parameters needed to describe an XCW will often exceed the number of experimental data which are measured, so that the XCW procedure is not a least-squares procedure. It is, in fact, a regularisation procedure (Weese, 1993; Mueller & Siltanen, 2012) closer to inference of the best possible values for the model parameters (based on appropriate prior information) than a measurement process. In fact, the XCW procedure is closely related to Bayesian reconstruction methods, which are described by Ward & Ahlquist (2018, p. 9) as follows: `Rather than consider the data as random and the parameters as fixed, the principle of treats the observed data as fixed and … the parameters [as] random variables.'
The italics in the quote above are ours, and highlight the fact that from a Bayesian viewpoint, data are just one source of information (along with prior information) to produce inferred or updated results. Bayesian methods allow a straightforward way to obtain credibility intervals of inferred quantities, the analogue of confidence intervals from classical statistics – albeit with much more computation required to obtain them (Carlin & Louis, 2008; Turkman et al., 2019). However, let us reiterate: Bayesian estimates for quantities do not constitute a measurement. The intention is quite different, because we are in a regime of inference – not the scientifically accepted standard of measurement relative to a fixed model where all sources of error are reduced or eliminated to insignificant levels.
We also point out that the XCW procedure is closely related to the procedure of ridge regression and cross validation, especially the jack-knife or leave-one-out methods which, interestingly, permit a determination of the Lagrange multiplier parameter λ via analytical rather than bootstrap data-resampling procedure (Golub et al., 1979; Gruber, 2017; Dixon & Ward, 2018). These ridge regression methods are interesting because they lie in-between least-squares methods and a full Bayesian analysis: the well-known Brünger (1992) Rfree procedure of macromolecular crystallography is just one such example of these bootstrap methods.
To avoid a potential confusion we mention that in the mathematical literature a Lagrange multiplier (here below, μ) is usually applied to the regularising expression (in this case EQM[Ψ]) rather than the GoF2 term, thus leading to the idea of making stationary the expression
We have not written the regularisation procedure in the above standard form because there would be no solution in the case that the Lagrange multiplier μ would be zero, due to the fact that the least-squares equations are ill posed, as already mentioned. However, as soon as μ takes on an infinitesimally small value, making J′ stationary becomes well defined, and it is easy to see that an infinitesimal value of μ corresponds to minimising J′ in the limit λ → ∞. We chose the nonstandard way to represent the XCW regularisation procedure in the expectation that one would be using very good, well-tested wavefunctions so that relatively small values of λ are required (though it is hard to quantify this). That is, we expect to be operating in a regime where the theoretical models are only perturbed by a small amount to reach an optimum agreement (according to a well-defined halting procedure) with the experimental data.
2.15. Constraint or restraint?
Originally, the term constraint was associated with the way these wavefunctions were derived. However, it was soon realised (Jayatilaka, 2012; Grabowsky et al., 2017; Ernst et al., 2020) that in crystallography it is standard to instead use the word restraint. However, further research shows that the term restraint is not commonly used outside crystallography and Even in the well-known text by Dunitz (1995, p. 212) we find the statement `Instead of exact constraints, more elastic ones may be imposed – e.g. that bond distances … are permitted to deviate from standard values by not too much… This way of adding constraints is equivalent to adding a set of further observational equations, and the order of the matrix of normal equations [for the least-squares parameters] is thereby neither increased or reduced. The method of conditional or slack least-squares is applicable when… [and so on]'
The italics in the above quote are ours, and indicate that the word constraint is used in connection with `elastic', `slack' and `conditional'; but in other fields `flexible' or `soft' are also often used together with the word constraint. In classical mechanics one has a closely related notion of nonholonomic constraints (Goldstein et al., 2002). The word restraint does not even appear in the original reference by Waser (1963), though he does refer extensively to Lagrange's method (of constrained minimisation) associated with subsidiary conditions. Likewise, Giacovazzo et al. (1992, p. 106–107) (who also uses the term `soft, flexible constraints' to describe restraints) also makes clear that the difference between constraints and restraints is only in the method of solving the least-squares equations.
Given this, in this paper we use the original term X-ray constrained wavefunction (XCW) not only to draw a direct connection to Jayatilaka (1998) and Grimwood & Jayatilaka (2001), but also because it emphasizes the importance of achieving the correct and desired value of the goodness of fit Δ, which is the object of the constraint. Our preference, however, is that the term regularisation be used in the context of XCW as much as possible.
2.16. The XCW and the Kohn–Sham wavefunction
Without going into too many details, the Kohn & Sham (1965) DFT theory posits that there exists a single determinant wavefunction ΦKS which minimises the and simultaneously yields the exact ground state electron density ρ0. Levy formally implemented this idea by introducing a continuous Lagrange multiplier function to find ΦKS (Levy, 1979; Parr & Yang, 1994). According to Lagrange's method then, we need to construct Levy's functional
where T is the of the determinant, and the term following represents the constraint. Lagrange's method requires us to find the stationary point of L to find ΦKS. Using the calculus of variations the stationary conditions with respect to variation of L[Φ] with respect to Lagrange multiplier then leads to the constraint equation (Parr & Yang, 1994)
On the other hand, variation with respect to the orbitals comprising Φ then lead to the Kohn–Sham equations for the Kohn–Sham orbitals (Parr & Yang, 1994),
The term is the identified as the famous Kohn–Sham effective potential, which is approximated in all DFT methods.
Notice that Levy's derivation of the Kohn–Sham DFT equations is essentially the same as that in XCW method. For example, making the XCW Lagrangian J stationary with respect to Lagrange multiplier λ gave back the constraint condition (6), analogous to equation (10), while making the variation of L with respect to the wavefunction gives the constrained variational equation (5). In addition, in §3.10 we show that, for a molecule-based procrystal determinant, the XCW obeys orbital equations in a basis set just like the Kohn–Sham equations.
In fact, the XCW procedure can be expressed in a similar way to Levy's formulation of the Kohn–Sham equations if we define the XCW as the wavefunction that makes
stationary. Now, the only difference compared to Levy's formulation is the full ΦKS which yield a given exact ρ0, the XCW method finds the Ψopt which gives a desired goodness-of-fit statistic such that GoF2[Ψopt] = Δ, see §2.12 and Fig. 1.
is used for the energy, rather than the operator . And rather than find the wavefunctions2.17. The XCW and the Zhao–Parr method
Important for our work is that Zhao & Parr (1993) have implemented the Levy constrained-search approach for the Kohn–Sham wavefunction. That is, Zhao and Parr have described how to extract the Kohn–Sham wavefunction from an arbitrary input density ρ0.
The Zhao–Parr (ZP) equations for the Kohn–Sham orbitals minimize the functional
where
where ρ0 is supposed to be the exact ground-state electron density. However, unlike the Kohn–Sham–Levy method which yields the effective potential, in the ZP method the constraint term C is associated with a single Lagrange multiplier λ, and the Kohn–Sham equations are obtained in the limit λ → ∞ (the factor of ½ in the equation above is only aesthetic). According to ZP, this method works – by which they mean that making the ZP[Φ] functional stationary yields the orbitals comprising the Kohn–Sham determinant wavefunction with electron density ρ0. The similarity between this variational procedure and that used in the XCW method is now even more clear compared to Levy's formulation of the Kohn–Sham equations (described in the previous section) since rather than an effective potential there is now only a single Lagrange multiplier λ.
There are, however, some minor differences. Zhao and Parr take their exact electron densities ρ0 on a grid of points in real space, from highly energy-accurate (but nevertheless approximate) reference ab initio wavefunction calculations. On the other hand, in the XCW method, a grid of points in is used, see equation (1). Also unlike the XCW method, the errors in the electron density ρ0 were assumed by Zhao and Parr to be negligible, and so they were aiming to constrain C = 0 in their procedure, which would be analogous to obtaining Δ = 0 in the XCW procedure. However, the goal of XCW is not Δ = 0 since this would imply reproducing the experimental data more accurately than the measured errors — see equation (2) and the more detailed discussion in §2.21.
2.18. Convergence problems with the Zhao–Parr method
Zhao et al. (1994) (ZMP) have noticed convergence problems in their method when λ became very large. They described this as `tedious', and did not really understand it, calling it a `critical phenomenon'. Nevertheless, they were able to obtain results by using an extrapolation procedure to large values of λ, employing the fact that C(λ) = C[Φλ] is an even function in λ, and second, employing an asymptotic expansion in (even) powers of λ−2. In Appendix A, we demonstrate that the GoF2(λ) function in the XCW procedure is also an even function of λ, while in Appendix B it is shown that the GoF2 used in the XCW fitting procedure is essentially equal to the square of the difference between the calculated and experimental electron density integrated over the Based on these results, in Appendix C, we show that the constraint function C of Zhao and Parr is the same as our GoF2, except with a different weighting in In summary: like C, the GoF2 may be loosely regarded as an energy term.
Tozer et al. (1996, 1997) (TIH) later argued and established convincingly that the convergence problems observed by ZP were due to the fact that the basis sets used to obtain the Kohn–Sham orbitals were incomplete, and they proposed to introduce a constant error term in the asymptotic expansion used to extrapolate to λ → ∞.
We mention that what TIH describe as basis set errors in their method could equally well be associated with inaccuracy in the reference electron densities ρ0 used by these authors. It is hard to say. Note that, for ab initio wavefunction methods, the errors in these accurate electron density values are likely very large in absolute terms at the nuclear positions due to the use of basis functions with incorrect cusp behaviour; while they are likely quite incorrect in percentage terms in the low-density regions of the wavefunction, for the same reason. We mention these possible inaccuracies because, in the XCW method, the experimental magnitudes are subject to experimental errors which are very similar in nature to the basis-set incompleteness errors in the just-mentioned reference electron densities ρ0 used by Zhao & Parr: in both the ZP and XCW procedures there is a mismatch between the the flexibility of the basis set, i.e. basis-set completeness or equivalently, the closeness of the reference ρ0 (or magnitude) data values to numerical values which are fittable by the (incomplete) set of basis functions used.
2.19. Convergence of the XCW method
Given our analysis above, it should not be very surprising that, like the ZMP and TIH approaches, the XCW method also displays convergence problems as the Lagrange multiplier λ becomes larger. Apart from the comments already made, we can understand these also as a consequence of the fact that the stationary equations which are to be solved become increasingly like an ill-conditioned least-squares fit where there are too many variables and not enough data. As such, there will be many eigenvalues in the matrices associated with these methods whose magnitudes are close to zero. This is a least-squares explanation of the critical phenomenon of Zhao and Parr.
2.20. Incompatible measurements
Although convergence problems are inherent to the XCW procedure itself, they may be exacerbated by specific features in the experimental data. To see this, suppose that there are experimentally measured reflections which are near each other in incompatible measurement. For this case, it may be impossible to obtain a GoF2 below a certain magnitude without a sufficiently flexible basis set to represent the sharp change in the (reciprocal space) electron density that this situation demands; and trying to do so will certainly lead to convergence problems [in part II (Davidson et al., 2022) we demonstrate this fact].
but whose associated estimated standard uncertainties () are unreasonably small. We may call this the problem of conflicting orAs a matter of fact, Henn (2016) and Landeros-Rivera et al. (2021) have argued that, more often than not, the σs are underestimated, leading to values for GoF2 which are unrealistically large. On the other hand, there are also many cases known for which an unfitted wavefunction has a GoF2 value smaller than one, suggesting that the are overestimated [see, for example, Grabowsky et al. (2011), and the supporting information of Grabowsky et al. (2010)]. From the literature then, it seems safe to conclude only that the σs are not always well estimated, and hence neither is the value of GoF2.
2.21. The XCW halting problem
Given an adequate model for Ψ, and given experimental data with uncorrelated and random errors only, we expect to terminate the XCW fitting procedure with a value of λ = λopt at which
The meaning of equation (15) is then that the averaged squared differences between observed and calculated magnitudes are, on average, within one estimated standard deviations . Naturally, if the σs are well estimated, the properties derived from an XCW when Δ = 1 are reliable (Jayatilaka, 1998; Jayatilaka & Grimwood, 2001; Grimwood & Jayatilaka, 2001). (Although the value of λ is important, indeed critical to the XCW process, it is only a means to constraining GoF2 to a desired value of Δ, and it is Δ which is most directly related to the experimental measurements and other kinds of refinements, and which should always be mentioned in XCW fitting calculations). On the other hand, if the are not well estimated, say as a rough approximation they are all in error by a uniform scale factor (A)1/2, then this could be re-cast as an equivalent functional J[Ψ, λ′] with the well estimated value of the , instead applying the scaled factor to λ′ = λ/A. Of course, to make these functionals truly equivalent, Δ must also change, but this is merely to highlight the relationship between the values of λ, the and Δ. It also emphasizes the fact that the value of Δ is critical.
We denote the problem of choosing the desired value of Δ at which the XCW procedure should be terminated as the halting problem. The name is chosen because, in practice, one chooses increasingly larger values of λ until a desired value of GoF2 is obtained; the halting problem here is supposed to evoke the idea of Turing's halting problem (Lucas, 2021). However, unlike Turing's halting problem, our halting problem can never have an unambiguous solution because it falls in the realm of statistical inference, which is always subjective.
2.22. Attempts to solve the halting problem
The XCW halting problem remains unsolved, despite several efforts, which we now summarise.
(i) A pragmatic solution to the halting problem was proposed by Whitten et al. (2006). They suggested `…to pursue fitting until the weighted residual [wR1] (or [wR2]) approaches that obtained in a parallel multipole of the same X-ray structure factors; see also Jayatilaka et al. (2009). Even so, in some cases these authors still found it difficult to obtain convergence, possibly because the multipole-refined geometries and anisotropic displacement parameters (ADPs) were not optimal, as discussed already.
(ii) The statistical procedure known mainly in protein crystallography as Rfree was also suggested as a solution [Lecomte (2003), private communication], but it has proved inconclusive [see for example the supporting information to Grabowsky et al. (2012) and Woińska et al. (2017)].
(iii) Bytheway et al. (2002) proposed a method to estimate errors in the experimental σs by adding Gaussian noise to the experimental magnitudes. Based on the behaviour of various topological properties as a function of GoF2 for these artificially noisy data sets, specifically where plots of these property values versus λ intersect with the same plots using un-noised experimental data, a scaling factor was found for the σs; and using these scaled σs, the fit was terminated when was reached. Unfortunately, when this method was examined for urea and alloxan, it failed (Grimwood et al., 2003).
(iv) Hudák et al. (2010) terminated the XCW procedure for the smallest λ value for which the change in the wR1 or wR2 was less than a certain limit, which they set to 0.04%, without any justification.
(v) Genoni (2013) proposed, using his X-ray constrained extremely localized (XC-ELMO) wavefunction method, that the constrained fitting should be terminated based on several criteria involving ensuring either GoF2 < 1, and both the relative change in GoF2 versus λ and in the relative change in the total electronic energy being less than pre-decided thresholds.
(vi) Recently, Genoni et al. (2019) have empirically observed that the GoF2 versus λ plot tends to have a change in curvature. They proposed that the fitting should be terminated at this inflection point. This method is somewhat similar in spirit to the L-curve method of inverse problem theory (Mueller & Siltanen, 2012).
(vii) In part II (Davidson et al., 2022), we have tested three new ways of terminating the fit. Although interesting relationships for a scale of the error in the σs arise, the problem is still not conclusively solved.
Currently, most XCW fittings are simply terminated at the ultimate or penultimate value of λ before convergence ceases, where lack of convergence is defined arbitrarily by more than a certain number of iterations for the self-consistent-field (SCF) process including the direct inversion of iterative subspace (DIIS) extrapolation method (Pulay, 1980, 1982). The assumption behind this fit until it fails method is that the quantum mechanical energy expression, EQM, will prevent overfitting, because if the desired Δ is too small (i.e. λ becomes too large and the fitted wavefunction Ψλ too different from ΨQM) it will be impossible to solve the XCW equations.
3. Practical formulation of the Hirshfeld-atom based X-ray constrained wavefunction method (HA-XCWs)
3.1. The X-ray constrained wavefunction
The X-ray constrained wavefunction (XCW) Ψ used by Grimwood & Jayatilaka (2001) comprises a Hartree product of isolated-molecule determinant wavefunctions Φk,
This is a molecule-based procrystal approximation or model for the periodic crystal wavefunction Ψ (Grimwood et al., 2003). It is a generalisation of the promolecule approximation defined by Spackman & Maslen (1986) which forms the basis of the IAM. In both these models, the electron density turns out to be a sum of electron densities ρk from the individual fragment wavefunctions Φk.7
In principle, each of the Φk may have a complicated form, but the simplest is to use a single determinant. In this case, space-group symmetry dictates that each Φk is related by spatial and translation symmetry operator to a single unique reference determinant
ϕi are spin orbitals which are most often expanded as a linear combination of Gaussian basis functions.8
3.2. Technical points concerning the XCW method
Three important but technical points must now be mentioned here.
(i) First, the number of atoms comprising the reference molecular wavefunction Φ may sometimes be larger than necessary, in order to better model the effect of the crystalline environment on the atoms whose electron density is needed to calculate the X-ray structure factors. That is ρ from Φ comprises more than the electron density comprising the of periodically repeating cell. In this case, even though the (normalised) wavefunction for this enlarged fragment is quantum mechanically valid, there is no molecule-based procrystal wavefunction Ψ of the form of equation (16), since this would entail that the enlarged environments mutually overlap. If the purpose of the wavefunction is to model the electron density or other properties in the crystal, then some means of extracting the electron density, or density matrix, of the symmetry-unique molecular fragment from its enlarged environment is necessary. Such means using a basis-function partitioning were developed in Jayatilaka & Grimwood (2001). Likewise, the atom-based HAR method provides an obvious way to extract the symmetry-unique electron density from this extended molecular fragment. In any case, the actual symmetry-unique atoms comprising the used to calculate the model structure factors must be identified. A convenient and recommended way to do this is to assume that in the reported the atoms are those symmetry-unique atoms which occur earliest in the list.
(ii) Second, rather than use a molecule-based procrystal wavefunction, one might want to use a fully periodic wavefunction, since such a wavefunction incorporates long-range electrostatic crystal field effects. This was achieved by Wall (2016) and Ruth et al. (2022) for HAR-like In general, such infinite-periodic quantum mechanical methods are more time-consuming to compute owing to the necessity to perform Ewald sums. In addition, it is often the case that such methods usually employ pseudopotentials to remove the core electrons from the model; but in this case, those core electrons comprise bulk of the X-ray diffraction signal. Finally, such periodic wavefunctions by design do not deal with surface effects, which are also long-range.
(iii) Alternatively, other methods may be used to account for the crystal field such as a self-consistent cluster of point charges and point dipoles based on Hirshfeld atoms (Capelli et al., 2014); localized-orbital periodic wavefunctions have been suggested (Shukla et al., 1996) and used (Jayatilaka, 1998); and a cluster of extremely-localized molecular orbitals (ELMOs) embedding a central moiety (Wieduwilt et al., 2021), this last being similar in spirit to that described in the first point in this list.
3.3. Procedure for obtaining a single-determinant XCW
The procedure for obtaining the molecular orbitals in the XCW wavefunction Φ is a practical implementation of an idea due to Henderson & Zimmerman (1976). It involves minimising the quantity
with respect to the wavefunction Φ. In order of appearance of these terms, they are defined as follows:
(i) EQM[Φ] is a variational energy computed using Hamiltonian ,
Here is the fixed-nucleus electronic Hamiltonian associated with the symmetry-unique isolated molecule (or cluster of molecules, as discussed in the previous section) whose wavefunction is Φ, written in terms of molecular spin orbitals , see equation (17).
(ii) The molecular spin orbitals ϕi are usually expanded in a basis set of fitting functions ,
The spin basis functions gμ are typically Gaussian functions centred on the atomic nuclei, as already discussed. The expansion coefficients c in the equation above are the molecular spin orbital coefficients, and making J stationary with respect to the spin orbitals implies stationary conditions with respect to these parameters.
(iii) In order to ensure that Φ is normalised, see equation (19), the spin orbitals are further subject to a normalisation constraint, expressed in the second term in equation (18). This constraint employs the Lagrange multipliers . The εis turn out to be the orbital eigenvalues (Szabo & Ostlund, 2012).9 For later reference, the normalisation constraint can be written in terms of the molecular spin orbital coefficients as
where
is the overlap matrix for the basis functions, and the † is a complex-conjugate transpose (though in most cases the coefficients c will be real and a transpose would suffice).
(iv) The most important parameter in equation (18) is λ, which is the Lagrangian multiplier associated with the constraint (or penalty) when GoF2(λ) achieves the desired value of Δ. The value of λ that gives the desired Δ is in practice found by trial and error, using a protocol.
(v) It is very important to note that the number of parameters used to define the GoF2(λ) is taken as
That is, Nparam is equal to the number of parameters usually refined in a crystallographic least-squares model, see §2.3, plus in addition one more parameter for λ. The rationale behind this choice is that λ counts as a kind of pseudo parameter. However, as we explained in §2.13, this is not at all the correct number of effective independent parameters, whose determination would count as one way to resolve the halting problem. Our choice here is pragmatic, so that the value obtained for GoF2 when doing an XCW at λ = 0 corresponds to that obtained from a HAR calculation. In previous work before HAR, NpADP was effectively assumed to be equal to zero.
The procedure for obtaining the spin orbitals is then to obtain the derivatives of the quantity J in equation (18) with respect to the orbital coefficients c which define Φ. This is the implementation of the more general equation (5) mentioned earlier. The resulting set of equations, derived in the next section, will still involve the multiplier λ. As we explained before, in practice, one solves these equations by first setting λ equal to zero, and then slowly increasing it until one achieves the desired value of Δ. In this way, one ensures that the wavefunctions Φλ which minimise J in equation (18) form a continuous sequence with Φλ=0 being the unmodified and unfitted theoretical wavefunction.
3.4. The number of parameters in a determinant wavefunction
Levy & Goldstein (1987) have shown that the number of independent parameters required to describe a single determinant wavefunction is given by
where Norb is the number of orbitals in the determinant; for a restricted determinant with the same spatial orbitals with different spins this is Ne/2, and for the unrestricted case with different spatial orbitals for each spin it is Ne. For an example, in part II (Davidson et al., 2022) where we consider restricted Hartree–Fock determinant for α-oxalic acid dihydrate comprising one oxalic unit and six surrounding water molecules, with a def2-SVP basis set (Weigend & Ahlrichs, 2005), we have Ne = 106 and Nbf = 250, so that the number of independent orbital coefficient parameters is = 13250. This is to be compared with the number of reflections Nrefl which, in the thirteen oxalic acid dihydrate data sets described by Kaminski et al. (2014), range from 1431 to 2881. It is worth keeping in mind that this is only the simplest kind of wavefunction, and more complicated multi-determinant wavefunctions may have exponentially more independent parameters.
3.5. The Hirshfeld atom structure factors
In order to minimize J of equation (18), we first need an expression for the calculated X-ray fA of Hirshfeld atom A in terms of the wavefunction parameters, the coefficients c. The required expression in Cartesian coordinates is (Jayatilaka & Dittrich, 2008)
where:
(i) The first summation is over the nominated Nasym asymmetric unit-cell atoms in the determinant wavefunction Φ. The first sum should be over all atoms in the but this is avoided because the second summation over Nsymop symmetry operators generates the symmetry-dependent unit-cell atoms. Thus the second summation is over Cartesian rotation and translation symmetry operators, respectively . sA is a site-symmetry factor needed to avoid double-counting of electron density contributions; it counts the number of times atom A is mapped onto itself by the symmetry operations.
(ii) is the Cartesian scattering vector for reflection r,
A* is the reciprocal cell matrix, which is given in terms of the real-space unit-cell matrix A by , and the columns of A are the a, b and c lattice vectors with respect to Cartesian axes. are the for the r-th reflection.
(iii) The averaged aspherical atomic form factor is the Fourier transform of the averaged aspherical electron density for the A, rotated to the correct orientation by the 3 × 3 symmetry matrix , and translated to its position in the It may be written in more detail as
atomwhere
The term comprises three others, which in order are: a complex phase factor associated with the position of atom A in the a temperature factor , derived from the averaged nuclear atomic density (or OCPDF), usually a 3D Gaussian function characterised by , the symmetric 3 × 3 matrix of ADPs; and the static aspherical atomic form factor , which is given explicitly by the Fourier transform
where is carved out of the molecular electron density from Φ by a weight function. The methods by which the atomic electron densities are obtained are detailed in the sections below for the one-centre and two-centre PDF models. Note that, in the above equation, ρA is shifted to the origin using its position . The averaged aspherical form factor can be thought of as the Fourier transform of the space-averaged static ρA (at the origin) with a real-space OCPDF for the atom A,
where is the convolution operation. Equation (27) follows from the above using the convolution theorem, if the temperature factor T is the Fourier transform of the PDF.
Note that, unlike standard crystallography which uses dimensionless fractional coordinates, all our formulae are presented and implemented in Cartesian coordinates because the ab initio quantum mechanical wavefunctions are calculated in the physical 3D space. Besides, asphericity is also characterized in Cartesian coordinates. For peculiarities of refinements using aspherical atomic form factors see also Midgley et al. (2021).
The electron density for the reference wavefunction Φ is given in terms of a basis set of functions by
where Aν and Aμ are the indices of the atoms on which the basis functions gν and gμ are centred, and where D is the symmetric density matrix for Φ in terms of this basis set,
3.6. Space-based atomic partitions for OCPDF models
In a space-based atom partitioning scheme the electron density of an A at its position (not the origin!) is given by
atomHere, is a weight function that carves out the aspherical atomic density from the total molecular quantum mechanical electron density associated with the reference determinant Φ. There are several ways to choose this weight function, and different methods chosen correspond to different OCPDF approximations (Chodkiewicz et al., 2020, 2018) but every such method must satisfy the partitioning condition
In the Hirshfeld atom or stockholder principle space-based partitioning scheme the weight function is defined by Hirshfeld (1977b)
Here is a spherical atomic electron density centred at the origin. The denominator is the sum of the nominated atoms in the Φ. The spherical electron densities can be taken from tables, but we usually use spherically-averaged high-spin unrestricted atomic ground-state wavefunctions calculated on-the-fly. We note that Kleemiss et al. (2021) have used tabulated Clementi-Roetti atomic wavefunctions. Chodkiewicz et al. (2020) have tested several different choices for the weight function and have concluded that the refined positions and ADPs using these different partitions are not greatly affected by these different choices.
of the cell, a subset of the atoms in the model wavefunctionSubstituting equations (32) and (33) into equation (29) and then equation (27), and then finally into the equation (25), we obtain, after some rearrangement
where
The latter integrals may need to be performed numerically (Jayatilaka & Dittrich, 2008). Midgley et al. (2021) have presented a very similar set of equations, in fractional coordinates, and without the use of symmetry operators to generate all the atoms in the from an Note that the above equations correct some minor errors that are present in Jayatilaka (2012). In HAR or related procedures, the positions and the ADPs which appear in are least-squares refined, by minimising the GoF2 (as was discussed in §3.3), whereas in XCW, the wavefunction parameters c which define D are obtained by minimising J. Some of this material as well as practical guidance for using the HA-XCW procedure has already been given in the book edited by Grabowsky (2021).
3.7. Basis-function-based atomic partitions for OCPDF models
For wavefunctions which employ one-electron basis functions with a centre e.g. the popular Gaussian basis sets (Boys, 1950), the static aspherical atomic electron densities ρA can also be defined using a basis-function partitioning scheme,
where
We reiterate that Aμ and Aν are the indices of the atom nuclei on which basis functions gμ and gν are centered, respectively. The above equation is analogous to space-based partitioning equation (34), but in basis function space. The analogue of the space-based partitioning condition, equation (35), is the basis-function partitioning rule
In Jayatilaka & Grimwood (2001) explicit formulae for the partition factors were given for the McWeeny–Mulliken partitioning (which assigns an equal portion of the electron density to each atom in a two-center basis function product), and for Tanaka's Gaussian basis function partitioning scheme (which assigns a proportion to each basis function in a two-center basis function product according to the Gaussian product theorem). In the latter case, the Gaussian product theorem occurs at the uncontracted basis function level, so the summations in the above equations should be regarded being over uncontracted Gaussian basis functions (Szabo & Ostlund, 2012). However, there is no requirement imposed by the above sum rule that the electron density associated to an atom A must involve the atoms on which the basis functions gμ and gν are centered: Vanfleteren et al. (2010) describe one interesting possibility.
Although the summation over A in the equation above is over a nominated set of atoms, it could be extended to comprise a sum over all atoms in the wavefunction Φ provided that partition factors are appropriately modified to account for double-counting. In Jayatilaka & Grimwood (2001) such atoms were called oversampled with respect to the and the factors which correct for this double counting are colloquially called repetition factors. In general however, our preference is not to work with an oversampled set of atoms, since otherwise specifying how the structure factors are calculated becomes a difficult task.
3.8. Basis-function-based partitioning for a TCPDF model
In a TCPDF model there is no longer any need to partition the electron density into atomic contributions, because ADPs are introduced for each atom on which the basis functions are centered, at most two. So in contrast to equations (27)–(29) and equations (39) and (40), the averaged form factor associated with a pair of basis functions in a TCPDF approach is given by
where
In the above, the point
is some function of the positions and on which the basis functions gμ and gν are centered, respectively. For example, in the McWeeny–Mulliken scheme it is simply the midpoint of the two atoms,
Whatever scheme is used, we must have the one-center condition
which implies that the basis function product on one center reduces to the OCPDF form.
The TCPDF analogues of the OCPDF temperature factor in equation (28) are the different kinds of two-center temperature factor models referred to in Jayatilaka & Grimwood (2001). For example, Coppens et al. (1971) have proposed the model
Again, for every q we must have a one-centre consistency condition whereby the TCPDF approximation for the temperature factor reduces to the corresponding OCPDF form, i.e.
pointThe formula for in the TCPDF case is then the same as in equation (37) with only a minor and obvious modification to equation (38) to incorporate the new averaged atomic form factor of equation (42), leading to
It includes the complex phase factors associated with the center of the atom or two-centre bond density, and the new one- or two-center temperature factors. The sνμ are pair site-symmetry factors which avoid double counting electron density contributions, and they count the number of times the basis function product pair is mapped onto itself by the symmetry operations [Dupuis & King (1977) have shown an alternative method to calculate these factors]. Note that only pairs of basis function products where at least one basis function is centered on an atom appear in equation (49) and in addition, the point associated with the basis function pair should also lie within the The latter condition ensures that the only electron density contributions in the are used to calculate the via equation (37).
One advantage of this formalism is that analytical Fourier transform integrals may be used, and these will be much quicker than the numerical integrations which are required in Hirshfeld-atom procedures.
It is worth mentioning here the interesting possibility to investigate the b) for certain pairs of atoms with large two-center electron density contributions [possibly with the incorporation of additional information concerning the behaviour of the electron density as a function of the atom positions from derivative electron density calculations to avoid linear dependencies (Scheringer, 1972a; Bürgi, 1989)] with only a minor modification of the above formalism.
of the full 6 × 6 two-center ADPs (Scheringer, 19723.9. Discussion of the two models
In summary, the difference between the OCPDF and TCPDF formulations is only that:
(i) In TCPDF models, the static atomic densities which define atomic form factors are obtained from the molecular density by partitioning it in basis-function space rather than in real space; and that
(ii) In TCPDF models, the averaging procedure involves the ADPs of both atom centers; and that this averaging procedure involves a variable point which depends in some way on the positions of the atoms on which a pair of basis functions are centred.
The TCPDF model is therefore more closely aligned with the underlying quantum-mechanical model in that it does not force the assumption of an Einsteinian atomic model (Bürgi, 1995; Bürgi & Capelli, 2000), though admittedly, the whole concept of two centre contributions (sometimes referred to as bonding electron density contributions) is rooted in the unnecessary but nevertheless widely-used concept of an atom-centred basis function. It should be noted that the bonding electron density does not account for more than roughly 10% of the total scattering signal, see Bytheway et al. (2007) for more details.
3.10. The Hirshfeld atom XCW (HA-XCW) equations
The equations we require are obtained when J is a minimum, i.e. when
The term in parentheses on the right hand side, E′QM, is the quantum mechanical energy of a wavefunction constrained to have orthonormal spin orbitals.
We take only the derivative with respect to because the derivative with respect to cνi would only yield the complex conjugate equation. From quantum chemistry we know that the first two terms will yield the Hartree–Fock equations,
where F is the Fock or Kohn–Sham matrix in the basis set chosen; all other terms were already defined (Szabo & Ostlund, 2012). We only need to evaluate the derivative of the term in the square brackets, which is just the derivative of the GoF2 term, since is an independent constant. Recalling that , from equation (2) we obtain
If we define to be the matrix which multiplies c in the equation above,
then the final equation resulting from minimising J is
These are a set of modified self-consistent-field equations, which for a given λ can be solved in a manner very similar to the usual Hartree–Fock or Kohn–Sham equations.
3.11. Efficient implementation of the HA-XCW equations
Equation (53), if implemented as written, will not be efficient because it would have an impractical scaling with the number of reflections and integration grid points. Fortunately, the summations in these expressions can be reordered and batched over atoms to give an expression with better computational cost, albeit with greater computer memory usage.
To get an equation which can be evaluated more quickly, we substitute equation (38) into equation (54), and extract all terms and summations except for the basis-function pair in the integrand of equation (38). The result is a potential
The matrix elements of with respect to the basis functions,
then constitute the matrix in equation (54). To see how an efficient implementation may be achieved from this form of the equations, note that the integral above must in general be evaluated numerically. Therefore, must be evaluated on a grid of points. Usually, such grids can be partitioned to be associated with individual atoms, so that, if required, the integral evaluation may be judiciously combined with tests eliminating terms from basis functions whose centres are too far away from the position of atom A, and which therefore may be neglected. A typical example of these grids used for integration are those defined by Becke (1988) with the atom-based partitioning scheme of Stratmann et al. (1996). If the sum is batched over atoms in this way, the memory requirements of the grid points and sums over the reflections may also be mitigated.
It is worth noting that, in a TCPDF approach, the Gaussian integrals in equation (54) be done analytically and efficiently, so this problem of efficient evaluation does not arise in that case.
4. Summary and conclusion
In this paper, we have reviewed the X-ray constrained wavefunction (XCW) method and have described and named the halting problem as the main issue to be addressed in obtaining reliable fitted wavefunctions. In addition, we have derived and presented working equations for the Hirshfeld atom XCW (HA-XCW) method, and how to implement these in an efficient manner. For comparison, we have distinguished between the older two-centre probability distribution (TCPDF) method, and the present one-centre (OCPDF) method for calculating the structure factors. The Hirshfeld atom (HA) method is one example of the latter. Finally, we reviewed the importance of using atomic positions and displacement parameters obtained from Hirshfeld Atom
(HAR) for the use in XCW fitting, a protocol termed X-ray Wavefunction (XWR). For consistency and clarity, we recommend always using the same method for dealing with thermal smearing in the least-squares of positions and ADPs as well as in the wavefunction fitting procedure.5. Reminiscences by DJ on my work with Hans-Beat Bürgi
I first met HBB (as we call him) at the combined Asca03/Crystal-23 international crystallographic meeting in Broome, Western Australia. I don't recall that we said much to each other. My immediate impression was: here is a funny character who looks like Fozzie Bear (doesn't he?) – happy, with a funny accent, sunburnt face and sandy feet, and more than a little tipsy in the warm tropical weather of the remote Pilbara region.
The following year I had arranged a nine month sabbatical to Claude Lecomte's group in Nancy, then called LCM3B I think, where I was supposed to work on what was called the X-ray wavefunction fitting project, and on polarised neutron diffraction PND.
Some time late in 2004, I don't recall the exact timing, I was invited by HBB to Bern to give a talk to his group; and second, I was invited to do my first X-ray experiment at the ESRF with Silvia Capelli, HBB's ex-PhD student. The Bern talk was memorable. It was delivered in HBB's small laboratory, with many interruptions, so it went on for a full hour. I felt I had been thoroughly grilled, but little did I know then (in what was to become a common signature of HBB's modus operandi) – out came the cups of tea (in HBB's case, horribly sweetened) and with biscuits – and then a further grilling of more than an hour! This would become a common fixture of our working relationship.
Later that year I was invited to submit an application for beam time at the ILL, to do neutron diffraction on the gly-L-ala crystal with Silvia. It was tremendously exciting, especially since I'd already heard of these facilities (a nuclear reactor!) and the PND experiment from Professor Brian Figgis, and had been working on it with his PhD student Steve Wolff and my own student Ann Whitten. I did not care much about the beam-time application – I just wanted to go – and so I did not spend much thought in it. I believe this application was the first time that the acronym X-ray constrained wavefunction (XCW) was used, introduced by Silvia. I recall not liking this strange acronym, but could not fault it, so it stuck. But during the writing phase, HBB got involved and was characteristically unsalutary about some loose language Silvia and I had used.
After this, Mark Spackman was very quick off the mark getting HBB to Australia on a visiting Professorship, which extended to two years, officially. Unofficially, HBB has been a fixture at UWA ever since (along with Bo Iversen). There are many memories of the azure waters of Cottesloe beach where, armed with his backpack from the 2003 Broome conference and his trusty bus timetable, we often spent the mornings running and swimming. The COVID times have put an end to this. We can only hope that he returns; despite losing my job, I have been assured I can still host him. After all, we have a huge backlog of work, e.g. that on extracting Cartesian normal coordinates, dating back to the very first beam-time application I just mentioned. But this is HBB: even this contribution, which we planned to be short, quick and easy has ballooned out into two papers, and has led to scientific disagreements on which, true to form, HBB wishes to have his say. We would not have it any other way.
So this is how it all started, and I feel very happy to be submitting this work on the XCW method, given the heritage I have described. HBB has surely been one of my closest collaborators and mentors over the years, on both professional and personal levels. Finally, I am most pleased because I have worked on this contribution with my own PhD student, Max Davidson, and as the work with Silvia Capelli shows, this somehow sums-up the kind of way that science should be done.
Thank you and congratulations, HBB!
APPENDIX A
GoF2 is an even function in λ
We essentially specialise the argument of Zhao & Parr (1993) for our case here. To show that GoF2(λ) is an even function of λ for the special case of a single determinant, note that the transformation
in equation (55), is equivalent to leaving λ unchanged but transforming
in that same equation, which in turn is equivalent to leaving Δv unchanged but transforming
for every reflection r in equation (56). (This is only a formal manipulation, we are not suggesting that the ESUs are really negative. This transformation could have equally been applied to the numerator.) This last transformation clearly does not change GoF2 in equation (2) because is squared in this equation. To summarise, the transformation implies that GoF2(−λ) = GoF2(λ). This means that GoF2 is an even function in λ, as claimed.
APPENDIX B
Relationship between the GoF2 and the square of the difference between the calculated and observed electron densities
We define a squared distance between calculated and experimental static electron densities in the unit-cell as
where
is a probability density function used to average the difference in the densities via a convolution operation, represented by the symbol. is the δ-function as
function which copies the unit-cell electron densities throughout all space, which is defined in terms of the Diracwhere A is the previously defined unit-cell matrix, and the sum is over all integer triples n representing each If the Fourier transform is further defined by equation (35), then the convolution and Parseval theorems (Arfken et al., 2013) lead to
Substituting in the above equation the standard result that the Fourier transform of the lattice function is
where V is the volume of the is the reciprocal cell matrix, and the sum is over all , we get
Recall that
is the square of the difference between the calculated and observed r (if it is assumed that both have the same complex phase), and may be taken as an effective weight. In fact, this weight is nothing other than the Wilson's factor from his famous plot, see e.g. Giacovazzo et al. (1992). Comparing now the above equation with the definition of GoF2 in equation (2), we see that the two are identical apart from a prefactor if we identify the weights wr with the values. Note, however, that the weights are smooth functions of q, but the measurement errors are not. Note also that the unit-cell volume prefactor is not unimportant: it affects the relative importance of the least-squares GoF2 term relative to the energy-regularization term EQM. The above squared metric scales with the number of electrons per unit-cell volume, whereas the GoF2 term does not.
magnitudes for a reflectionThe result derived here is very similar to one which appears in Dunitz & Seiler (1973), but was taken with slight modification from Appendix C of Wolff (1995). The only difference compared to this earlier work is that an averaging function is introduced, for use in the following appendix.
APPENDIX C
Relationship between the GoF2 and the classical Coulomb energy
There is another similarity involving the GoF2, namely one with the classical Coulomb energy constraint term C used by Zhao & Parr (1993). Substituting our unaveraged electron density differences defined in the previous appendix [i.e. take ] into their equation (4), which is in we get
Again employing the convolution theorem with standard results (Arfken et al., 2013), we may write
Comparison with equations (63) and (64) shows that our GoF2 is the same as the C constraint term used by Zhao & Parr (1993) provided that the weight factors are chosen to be , the magnitude of the reciprocal vectors . Thus, the GoF2 can be seen as a kind of pseudo-energy per unit-cell volume, differing from the Coulomb energy only in employing a different reciprocal-space weight.
The result quoted here is similar to one quoted in Coppens (1997, section 9.3.2), and also used in chapter 8 of the thesis by Ernst (2020). The notation differs because the fractional crystal-axis system is used. Results in the Cartesian axis system are derived in Wieduwilt (2018) Appendix B with comparison in the two axis systems.
Footnotes
1A functional is simply a function of a continuous function rather than a scalar variable, and by convention the square braces distinguish a functional from a function.
2There is an amusing explanation of the Hohenberg–Kohn theorem due to E. Bright Wilson, as related by Löwdin (1986): `…if one knows ρ, one knows also the total number of electrons … the cusps of ρ will further indicate the locations of the nuclei, and the shape of the cusps will give the atomic numbers of the nuclei involved. Hence, one can construct the N-electron Schrödinger equation for fixed nuclei, and the solution determines all the electronic properties including the ground-state energy. This gives undoubtedly an explicit recipe for the Hohenberg–Kohn functional…. For some reason, the Bright-Wilson interpretation does not seem to be popular among the practitioners ….'
3It might be argued that the experimental magnitudes should be scaled to be on an absolute scale to facilitate comparison with the calculated values. However, by crystallographic convention the attitude is to avoid modifying the experimental data as far as possible, so the scale factor ξ is usually applied to the magnitude of the calculated structure factor.
4Note that in all previous works by one of us (DJ), GoF2 was called the χ2, which is incorrect. The difference between what we used to call χ2 and the correct definition, namely χ2 = (Nrefl − Nparam)GoF2 , involves only a scale factor, see equation (2), but this distinction is important if one wishes to avoid confusion when referring to the statistical literature, or if one wants to use statistical tests. From this point, we use the correct terminology, even when referring to our own and others' previous work, and we encourage the use of the correct terminology henceforth.
5Thus, as Watkin (2008) says, use of the squared magnitudes `involves the minimal tinkering with the [experimental] data'. Yet, in the same paper Watkin still defends the use of magnitudes to define GoF2 since optimisation methods, such as the least-squares method, are unaffected by non-linear transformations in those same magnitudes; and also because the use of magnitudes (as opposed to the squared magnitudes) lessens the effect of outliers, which may be of more importance for high-resolution X-ray diffraction experiments. Besides, as we show in Appendix B, GoF2 defined in terms of the magnitudes is closely related to the integrated difference in average electron density in the and in Appendix C, also somewhat closely related to the constraint term used by Zhao & Parr (1993) in their method to obtain the Kohn–Sham wavefunction of density functional theory. For these reasons we persist in using the GoF2 in terms of the magnitudes themselves, making use of an appropriate procedure to estimate the associated values. In any case, the use of squared magnitudes to define the GoF2 in the HAR and XCW procedures has never been reported, and its use should be investigated in future work.
6We follow an embarrassing fashion for one of us (DJ) to call this functional with the name `J'. It is just a Lagrangian.
7It is worth mentioning that an analogous promolecule or procrystal model (as used to obtain the electron density in the crystals) is also used to calculate the nuclear probability density functions from the atomic displacement parameters: in the Einstein–Debye model, the nuclear wavefunctions are assumed to be a Hartree product of atom centred harmonic nuclear wavefunctions (Debye, 1913; Born, 1942). So there is a precedent for this idea.
8Thus, the molecular wavefunctions Φk, even though normalised, are not orthogonal to each other. Although Ψ is normalised, it would not remain so if it were antisymmetrized with respect to electron coordinates, as required by quantum mechanics. Therefore, using the wavefunction (16) to obtain intermolecular properties relating to different determinants Φk requires care.
9Note that in the standard textbooks, a matrix of Lagrange multipliers εij is introduced to enforce orthonormality conditions 〈ϕi|ϕj〉 − δij = 0. For the cases here, the energy expression EQM[Φ] involves a self-adjoint Hamiltonian, which eventually leads to a Hermitian one-electron Fock matrix, see §3.10, which automatically ensures orthogonal spin orbital solutions. Whether using the standard approach or our Lagrangian, the Lagrange normalisation constraint on the solution must be explicitly enforced; and this is done as a matter of course in most eigensolver routines.
Acknowledgements
We thank Professor Hans-Beat Bürgi for his critical assessment of this manuscript. Above and beyond that, we thank him for very stimulating and enlightening discussions over the past two decades. We salute him for his career. MLD was supported by an Australian Government Research Training Program (RTP) Scholarship. We would like to acknowledge the critical and insightful comments of all referees. Referee 3 is acknowledged for pointing out that Ernst et al. (2020) had already used the relationship in Appendix B
Funding information
Funding for this research was provided by: Australian Government Research Training Program (RTP) Scholarship.
References
Arfken, G., Weber, H. & Harris, F. (2013). Mathematical Methods for Physicists. Elsevier. Google Scholar
Becke, A. D. (1988). J. Chem. Phys. 88, 2547–2553. CrossRef CAS Web of Science Google Scholar
Bergmann, J., Davidson, M., Oksanen, E., Ryde, U. & Jayatilaka, D. (2020). IUCrJ, 7, 158–165. Web of Science CrossRef CAS PubMed IUCr Journals Google Scholar
Born, M. (1942). Rep. Prog. Phys. 9, 294–333. CrossRef Google Scholar
Boys, S. F. (1950). Proc. R. Soc. London Ser. A, 200, 542–554. CrossRef CAS Web of Science Google Scholar
Brünger, A. T. (1992). Nature, 355, 472–475. PubMed Web of Science Google Scholar
Bučinský, L., Jayatilaka, D. & Grabowsky, S. (2016). J. Phys. Chem. A, 120, 6650–6669. Web of Science PubMed Google Scholar
Bučinský, L., Jayatilaka, D. & Grabowsky, S. (2019). Acta Cryst. A75, 705–717. Web of Science CrossRef IUCr Journals Google Scholar
Bürgi, H.-B. (1989). Acta Cryst. B45, 383–390. CrossRef Web of Science IUCr Journals Google Scholar
Bürgi, H.-B. (1995). Acta Cryst. B51, 571–579. CrossRef Web of Science IUCr Journals Google Scholar
Bürgi, H. B. & Capelli, S. C. (2000). Acta Cryst. A56, 403–412. Web of Science CrossRef IUCr Journals Google Scholar
Bytheway, I., Chandler, G., Figgis, B. & Jayatilaka, D. (2007). Acta Cryst. A63, 135–145. Web of Science CrossRef IUCr Journals Google Scholar
Bytheway, I., Grimwood, D. J., Figgis, B. N., Chandler, G. S. & Jayatilaka, D. (2002). Acta Cryst. A58, 244–251. Web of Science CrossRef CAS IUCr Journals Google Scholar
Capelli, S. C., Bürgi, H.-B., Dittrich, B., Grabowsky, S. & Jayatilaka, D. (2014). IUCrJ, 1, 361–379. Web of Science CSD CrossRef CAS PubMed IUCr Journals Google Scholar
Carlin, B. P. & Louis, T. A. (2008). Bayesian Methods for Data Analysis. CRC Press. Google Scholar
Chodkiewicz, M., Pawlędzio, S., Woińska, M. & Woźniak, K. (2022). IUCrJ, 9, 298–315. CSD CrossRef CAS PubMed IUCr Journals Google Scholar
Chodkiewicz, M. L., Migacz, S., Rudnicki, W., Makal, A., Kalinowski, J. A., Moriarty, N. W., Grosse-Kunstleve, R. W., Afonine, P. V., Adams, P. D. & Dominiak, P. M. (2018). J. Appl. Cryst. 51, 193–199. Web of Science CrossRef CAS IUCr Journals Google Scholar
Chodkiewicz, M. L., Woińska, M. & Woźniak, K. (2020). IUCrJ, 7, 1199–1215. Web of Science CSD CrossRef CAS PubMed IUCr Journals Google Scholar
Coppens, P. (1997). X-ray Charge Densities and Chemical Bonding, Vol. 4. International Union of Crystallography. Google Scholar
Coppens, P., Willoughby, T. V. & Csonka, L. N. (1971). Acta Cryst. A27, 248–256. CrossRef CAS IUCr Journals Web of Science Google Scholar
Davidson, M. L., Grabowsky, S. & Jayatilaka, D. (2022). Acta Cryst. B78, 397–415. CrossRef IUCr Journals Google Scholar
Debye, P. (1913). Ann. Phys. 348, 49–92. CrossRef Google Scholar
Diamond, R. (1966). Acta Cryst. 21, 253–266. CrossRef CAS IUCr Journals Web of Science Google Scholar
Dixon, M. & Ward, T. (2018). arXiv: 1803.04947. Google Scholar
Dreizler, R. M. & Gross, E. K. (1990). Density Functional Theory. Springer. Google Scholar
Dunitz, J. D. & Seiler, P. (1973). Acta Cryst. B29, 589–595. CrossRef CAS IUCr Journals Web of Science Google Scholar
Dunitz, J. D. (1995). X-ray Analysis and the Structure of Organic Molecules. Verlag Helvetica Chimica Acta. Google Scholar
Dupuis, M. & King, H. F. (1977). Int. J. Quantum Chem. 11, 613–625. CrossRef CAS Google Scholar
Einstein, A. (1907). Ann. Phys. 327, 180–190. CrossRef Google Scholar
Einstein, A. (2005). Ann. Phys. 14, 280–291. CrossRef Google Scholar
Ernst, M. (2020). PhD thesis, Universität Bern, Switzerland. Google Scholar
Ernst, M., Genoni, A. & Macchi, P. (2020). J. Mol. Struct. 1209, 127975. Web of Science CrossRef Google Scholar
Fugel, M., Jayatilaka, D., Hupf, E., Overgaard, J., Hathwar, V. R., Macchi, P., Turner, M. J., Howard, J. A. K., Dolomanov, O. V., Puschmann, H., Iversen, B. B., Bürgi, H.-B. & Grabowsky, S. (2018). IUCrJ, 5, 32–44. Web of Science CSD CrossRef CAS PubMed IUCr Journals Google Scholar
Genoni, A. (2013). J. Chem. Theory Comput. 9, 3004–3019. CrossRef CAS PubMed Google Scholar
Genoni, A., Bučinský, L., Claiser, N., Contreras–García, J., Dittrich, B., Dominiak, P. M., Espinosa, E., Gatti, C., Giannozzi, P., Gillet, J.-M., Jayatilaka, D., Macchi, P., Madsen, A. O., Massa, L., Matta, C. F., Merz, K. M., Nakashima, P. N. H., Ott, H., Ryde, U., Schwarz, K., Sierka, M. & Grabowsky, S. (2018). Chem. Eur. J. 24, 10881–10905. Web of Science CrossRef CAS PubMed Google Scholar
Genoni, A., Dos Santos, L. H. R., Meyer, B. & Macchi, P. (2017). IUCrJ, 4, 136–146. Web of Science CrossRef CAS PubMed IUCr Journals Google Scholar
Genoni, A. & Macchi, P. (2020). Crystals, 10, 473. Web of Science CrossRef Google Scholar
Genoni, A., Macetti, G., Franchini, D., Pieraccini, S. & Sironi, M. (2019). Acta Cryst. A75, 778–797. Web of Science CrossRef IUCr Journals Google Scholar
Giacovazzo, C., Monaco, H. L., Viterbo, D., Scordari, F., Gilli, G., Zanotti, G. & Catti, M. (1992). Fundamentals of Crystallography, IUCr Texts on Crystallography, Vol. 2, 1st ed. Oxford University Press. Google Scholar
Goerigk, L., Hansen, A., Bauer, C., Ehrlich, S., Najibi, A. & Grimme, S. (2017). Phys. Chem. Chem. Phys. 19, 32184–32215. CrossRef CAS PubMed Google Scholar
Goldstein, H., Poole, C. P. & Safko, J. L. (2002). Classical Mechanics, ch. 7, 3rd ed. Addison-Wesley. Google Scholar
Golub, G. H., Heath, M. & Wahba, G. (1979). Technometrics, 21, 215–223. CrossRef Google Scholar
Grabowsky, S. (2021). Complementary Bonding Analysis. Walter de Gruyter GmbH & Co KG. Google Scholar
Grabowsky, S., Genoni, A. & Bürgi, H.-B. (2017). Chem. Sci. 8, 4159–4176. Web of Science CrossRef CAS PubMed Google Scholar
Grabowsky, S., Genoni, A., Thomas, S. P. & Jayatilaka, D. (2020). 21st Century Challenges in Chemical Crystallography II: Structural Correlations and Data Interpretation, pp. 65–144. Google Scholar
Grabowsky, S., Jayatilaka, D., Mebs, S. & Luger, P. (2010). Chem. Eur. J. 16, 12818–12821. Web of Science CrossRef CAS PubMed Google Scholar
Grabowsky, S., Luger, P., Buschmann, J., Schneider, T., Schirmeister, T., Sobolev, A. N. & Jayatilaka, D. (2012). Angew. Chem. Int. Ed. 51, 6776–6779. Web of Science CSD CrossRef ICSD CAS Google Scholar
Grabowsky, S., Weber, M., Jayatilaka, D., Chen, Y.-S., Grabowski, M. T., Brehme, R., Hesse, M., Schirmeister, T. & Luger, P. (2011). J. Phys. Chem. A, 115, 12715–12732. Web of Science CSD CrossRef CAS PubMed Google Scholar
Grimwood, D. J., Bytheway, I. & Jayatilaka, D. (2003). J. Comput. Chem. 24, 470–483. Web of Science CrossRef PubMed CAS Google Scholar
Grimwood, D. J. & Jayatilaka, D. (2001). Acta Cryst. A57, 87–100. Web of Science CrossRef CAS IUCr Journals Google Scholar
Gruber, M. H. (2017). Improving Efficiency by Shrinkage: the James-Stein and Ridge Regression Estimators. Routledge. Google Scholar
Hall-Wilton, R. & Theroine, C. (2014). Phys. Procedia, 51, 8–12. CAS Google Scholar
Henderson, G. A. & Zimmerman, R. K. (1976). J. Chem. Phys. 65, 619–622. CrossRef CAS Google Scholar
Henderson, S., Aleksandrov, A. V., Allen, C. K., Assadi, S., Bartoski, D., Blokland, W., Casagrande, F., Campisi, I., Chu, C., Cousineau, S. M. et al. (2015). The Spallation Neutron Source Beam Commissioning and Initial Operations. Technical Report, Oak Ridge National Laboratory, Oak Ridge, TN, USA. Google Scholar
Henn, J. (2016). Acta Cryst. A72, 696–703. Web of Science CrossRef IUCr Journals Google Scholar
Hickstein, D. D., Cole, J. M., Turner, M. J. & Jayatilaka, D. (2013). J. Chem. Phys. 139, 064108. Web of Science CrossRef PubMed Google Scholar
Hirshfeld, F. (1977a). Isr. J. Chem. 16, 198–201. CrossRef CAS Google Scholar
Hirshfeld, F. L. (1977b). Theor. Chim. Acta, 44, 129–138. CrossRef CAS Web of Science Google Scholar
Hoerl, A. E. & Kennard, R. W. (1970). Technometrics, 12, 55–67. CrossRef Web of Science Google Scholar
Hohenberg, P. & Kohn, W. (1964). Phys. Rev. 136, B864–B871. CrossRef Web of Science Google Scholar
Hudák, M., Jayatilaka, D., Perašínová, L., Biskupič, S., Kožíšek, J. & Bučinský, L. (2010). Acta Cryst. A66, 78–92. Web of Science CrossRef IUCr Journals Google Scholar
Jayatilaka, D. (1998). Phys. Rev. Lett. 80, 798–801. Web of Science CrossRef CAS Google Scholar
Jayatilaka, D. (2012). In Modern Charge Density Analysis, edited by P. Macchi & C. Gatti, ch. 6, pp. 213–257. Springer Science & Business Media. Google Scholar
Jayatilaka, D. & Dittrich, B. (2008). Acta Cryst. A64, 383–393. Web of Science CrossRef CAS IUCr Journals Google Scholar
Jayatilaka, D. & Grimwood, D. J. (2001). Acta Cryst. A57, 76–86. Web of Science CrossRef CAS IUCr Journals Google Scholar
Jayatilaka, D. & Grimwood, D. J. (2003). In International Conference on Computational Science, pp. 142–151. Springer. Google Scholar
Jayatilaka, D., Munshi, P., Turner, M. J. M., Howard, J. A. K. & Spackman, M. A. (2009). Phys. Chem. Chem. Phys. 11, 7209–7218. CrossRef PubMed CAS Google Scholar
Joos, E., Zeh, H. D., Kiefer, C., Giulini, D. J., Kupsch, J. & Stamatescu, I.-O. (2013). Decoherence and the Appearance of a Classical World in Quantum Theory. Springer Science & Business Media. Google Scholar
Kabsch, W. (2010). Acta Cryst. D66, 133–144. Web of Science CrossRef CAS IUCr Journals Google Scholar
Kamiński, R., Domagała, S., Jarzembska, K. N., Hoser, A. A., Sanjuan-Szklarz, W. F., Gutmann, M. J., Makal, A., Malińska, M., Bąk, J. M. & Woźniak, K. (2014). Acta Cryst. A70, 72–91. Web of Science CSD CrossRef IUCr Journals Google Scholar
Kleemiss, F., Dolomanov, O. V., Bodensteiner, M., Peyerimhoff, N., Midgley, L., Bourhis, L. J., Genoni, A., Malaspina, L. A., Jayatilaka, D., Spencer, J. L., White, F., Grundkötter-Stock, B., Steinhauer, S., Lentz, D., Puschmann, H. & Grabowsky, S. (2021). Chem. Sci. 12, 1675–1692. Web of Science CSD CrossRef CAS Google Scholar
Kohn, W. (1999). Rev. Mod. Phys. 71, 1253–1266. Web of Science CrossRef CAS Google Scholar
Kohn, W. & Sham, L. J. (1965). Phys. Rev. 140, A1133–A1138. CrossRef Web of Science Google Scholar
Landeros-Rivera, B., Contreras-García, J. & Dominiak, P. M. (2021). Acta Cryst. B77, 715–727. CrossRef IUCr Journals Google Scholar
Levy, M. (1979). Proc. Natl Acad. Sci. USA, 76, 6062–6065. CrossRef PubMed CAS Google Scholar
Levy, M. & Goldstein, J. A. (1987). Phys. Rev. B, 35, 7887–7890. CrossRef CAS Web of Science Google Scholar
Löwdin, P.-O. (1986). Int. J. Quantum Chem. 28, 19–37. Google Scholar
Lucas, S. (2021). J. Log. Algebraic Methods Program. 121, 100687. CrossRef Google Scholar
Macchi, P. (2020). Crystallogr. Rev. 26, 209–268. Web of Science CrossRef Google Scholar
Macetti, G., Macchi, P. & Genoni, A. (2021). Acta Cryst. B77, 695–705. Web of Science CrossRef IUCr Journals Google Scholar
Malaspina, L. A., Genoni, A., Jayatilaka, D., Turner, M. J., Sugimoto, K., Nishibori, E. & Grabowsky, S. (2021). J. Appl. Cryst. 54, 718–729. CSD CrossRef CAS IUCr Journals Google Scholar
Malaspina, L. A., Wieduwilt, E. K., Bergmann, J., Kleemiss, F., Meyer, B., Ruiz-López, M. F., Pal, R., Hupf, E., Beckmann, J., Piltz, R. O., Edwards, A. J., Grabowsky, S. & Genoni, A. (2019). J. Phys. Chem. Lett. 10, 6973–6982. Web of Science CSD CrossRef CAS PubMed Google Scholar
Marquardt, D. W. (1970). Technometrics, 12, 591–612. CrossRef Web of Science Google Scholar
Marquardt, D. W. & Snee, R. D. (1975). Am. Stat. 29, 3–20. Google Scholar
Midgley, L., Bourhis, L. J., Dolomanov, O. V., Grabowsky, S., Kleemiss, F., Puschmann, H. & Peyerimhoff, N. (2021). Acta Cryst. A77, 519–533. Web of Science CrossRef IUCr Journals Google Scholar
Mueller, J. L. & Siltanen, S. (2012). Linear and Nonlinear Inverse Problems with Practical Applications. SIAM. Google Scholar
Parr, R. G. & Yang, W. (1994). Density-Functional Theory of Atoms and Molecules. Oxford University Press. Google Scholar
Podhorský, M., Bučinský, L., Jayatilaka, D. & Grabowsky, S. (2021). Acta Cryst. A77, 54–66. CrossRef IUCr Journals Google Scholar
Pulay, P. (1980). Chem. Phys. Lett. 73, 393–398. CrossRef CAS Web of Science Google Scholar
Pulay, P. (1982). J. Comput. Chem. 3, 556–560. CrossRef CAS Web of Science Google Scholar
Ruth, P. N., Herbst-Irmer, R. & Stalke, D. (2022). IUCrJ, 9, 286–297. Web of Science CSD CrossRef CAS PubMed IUCr Journals Google Scholar
Scheringer, C. (1972a). Acta Cryst. A28, 516–522. CrossRef IUCr Journals Google Scholar
Scheringer, C. (1972b). Acta Cryst. A28, 512–515. CrossRef IUCr Journals Google Scholar
Shukla, A., Dolg, M., Stoll, H. & Fulde, P. (1996). Chem. Phys. Lett. 262, 213–218. CrossRef CAS Web of Science Google Scholar
Spackman, M. A. & Maslen, E. N. (1986). J. Phys. Chem. 90, 2020–2027. CrossRef CAS Web of Science Google Scholar
Stewart, R. F. (1969). J. Chem. Phys. 51, 4569–4577. CrossRef CAS Web of Science Google Scholar
Stratmann, R. E., Scuseria, G. E. & Frisch, M. J. (1996). Chem. Phys. Lett. 257, 213–223. CrossRef CAS Web of Science Google Scholar
Szabo, A. & Ostlund, N. S. (2012). Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory. Courier Corporation. Google Scholar
Tanaka, K. (1988). Acta Cryst. A44, 1002–1008. CrossRef CAS Web of Science IUCr Journals Google Scholar
Tozer, D. J., Ingamells, V. E. & Handy, N. C. (1996). J. Chem. Phys. 105, 9200–9213. CrossRef CAS Web of Science Google Scholar
Tozer, D. J., Somasundram, K. & Handy, N. C. (1997). Chem. Phys. Lett. 265, 614–620. CrossRef CAS Google Scholar
Trueblood, K. N., Bürgi, H.-B., Burzlaff, H., Dunitz, J. D., Gramaccioli, C. M., Schulz, H. H., Shmueli, U. & Abrahams, S. C. (1996). Acta Cryst. A52, 770–781. CrossRef CAS Web of Science IUCr Journals Google Scholar
Turkman, M. A. A., Paulino, C. D. & Müller, P. (2019). Computational Bayesian Statistics: An Introduction, Vol. 11. Cambridge University Press. Google Scholar
Vanfleteren, D., Van Neck, D., Bultinck, P., Ayers, P. W. & Waroquier, M. (2010). J. Chem. Phys. 133, 231103. CrossRef PubMed Google Scholar
Wall, M. E. (2016). IUCrJ, 3, 237–246. Web of Science CrossRef CAS PubMed IUCr Journals Google Scholar
Ward, M. D. & Ahlquist, J. S. (2018). Maximum Likelihood for Social Science: Strategies for Analysis. Cambridge University Press. Google Scholar
Waser, J. (1963). Acta Cryst. 16, 1091–1094. CrossRef CAS IUCr Journals Web of Science Google Scholar
Watkin, D. (2008). J. Appl. Cryst. 41, 491–522. Web of Science CrossRef CAS IUCr Journals Google Scholar
Weese, J. (1993). Comput. Phys. Commun. 77, 429–440. CrossRef CAS Web of Science Google Scholar
Weigend, F. & Ahlrichs, R. (2005). Phys. Chem. Chem. Phys. 7, 3297–3305. Web of Science CrossRef PubMed CAS Google Scholar
Whitten, A. E., Jayatilaka, D. & Spackman, M. A. (2006). J. Chem. Phys. 125, 174505. Web of Science CrossRef PubMed Google Scholar
Wieduwilt, E. (2018). Master's thesis, Universität Bremen, Germany. Google Scholar
Wieduwilt, E. K., Macetti, G. & Genoni, A. (2021). J. Phys. Chem. Lett. 12, 463–471. Web of Science CrossRef CAS PubMed Google Scholar
Willis, B. T. M. & Pryor, A. W. (1975). Thermal Vibrations in Crystallography, Vol. 50. Cambridge University Press. Google Scholar
Woińska, M., Grabowsky, S., Dominiak, P. M., Woźniak, K. & Jayatilaka, D. (2016). Sci. Adv. 2, e1600192. Web of Science PubMed Google Scholar
Woińska, M., Jayatilaka, D., Dittrich, B., Flaig, R., Luger, P., Woźniak, K., Dominiak, P. M. & Grabowsky, S. (2017). ChemPhysChem, 18, 3334–3351. Web of Science PubMed Google Scholar
Wolff, S. K. (1995). PhD thesis, The University of Western Australia. Google Scholar
Zhao, Q., Morrison, R. C. & Parr, R. G. (1994). Phys. Rev. A, 50, 2138–2142. CrossRef CAS PubMed Web of Science Google Scholar
Zhao, Q. & Parr, R. G. (1993). J. Chem. Phys. 98, 543–548. CrossRef CAS Web of Science Google Scholar
This article is published by the International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited. For more information, click here.