N-representable one-electron reduced density matrix reconstruction with frozen core electrons

An improved method of one-electron reduced density matrix reconstruction from structure factors and directional Compton profiles is tested on a urea crystal. Novel restrictions accounting for molecular symmetry and freezing of core electrons are introduced.


Introduction
While N-electron wavefunctions provide the most complete and exact description of electronic structure in crystals, their experimental determination is still out of reach due to their exponentially large complexity for real systems.Moreover, in Coulson's words: 'a conventional many-electron wavefunction tells us more than we need to know' (Coulson, 1960).It is then worth considering the one(two)-electron reduced density matrices (1,2-RDM) as compact substitutes for wavefunctions since they involve significantly fewer parameters.As of today, the incompleteness of N-representability conditions (Liu et al., 2007), which ensure that a reduced density matrix can be associated with a complete N-body density matrix, and the lack of experimental observables with sufficient information content still pose daunting obstacles to the reconstruction of 2-RDMs.Therefore, 1-RDMs, which do not suffer from the same impediment and still contain valuable quantum mechanical information, are considered suitable candidates for modelling electron behaviour from experimental data.The reconstruction process, however, remains a challenging task.Firstly, N-representability conditions still need to be fulfilled for an experimentally reconstructed 1-RDM to be physically meaningful.Secondly, from a pure measurement perspective, as the 1-RDM contains both position and momentum space information, it cannot currently, to the best of our knowledge, be obtained using a single experimental technique.
The challenge of 1-RDM reconstruction from experimental data was initiated by Clinton and coworkers in the 1960s using a drastic idempotency condition as a means to ensure Nrepresentability (Clinton et al., 1969a,bc,d;Clinton & Lamers, 1969).Based on a series of works combining position and momentum space data on isolated atoms, Schmider et al. (1992) argued that the idempotency condition would hinder the recovery of the electron (static and dynamical) correlation effect in the reconstructed density matrix.The potential presence of such information in position space was recently confirmed by an X-ray constrained wavefunction refinement on urea and alanine (Hupf et al., 2023).The authors argue that evidence of significant deviation from the Hartree-Fock description can be found using high-resolution X-ray diffraction structure factors.Any single-determinant-based model would forbid access to such subtle features in the data.Adopting a formal perspective, Mazziotti (2007) discussed different strategies to include N-representability conditions in a series of articles and proposed a semidefinite programming (SDP) formulation of the 1,2-RDM reconstruction problem (Foley & Mazziotti, 2012).On more practical grounds, following Schmider and coworkers' seminal work, several papers reported the joint use of X-ray diffraction structure factors (SF) and directional Compton profiles (DCP) to explore different non-single-determinant models and strategies for 1-RDM modelling in both magnetic and non-magnetic molecular compounds (Schmider & Smith, 1993;Schmider et al., 1993;Schwarz et al., 1994;Schmider, 1996;Gueddida et al., 2018a,b;De Bruyne & Gillet, 2020;Launay & Gillet, 2021).However, all SDP-based reconstruction attempts at 1-RDM, which have been put forward, were applied to isolated atoms or molecules with at most two or three atoms.
The present work further investigates the 1-RDM reconstruction problem in molecular crystals by building upon the convex optimization approach put forward by De Bruyne & Gillet (2020), Launay & Gillet (2021), scaling up the system size from modest dry ice (CO 2 ) to the more realistic urea [CO(NH 2 ) 2 ] crystal.The purpose is thus to demonstrate the potential of an improved method more relevant to practical applications and its suitability to compensate for sparse momentum space data.To address the challenges posed by a significant increase in system size, we propose the implementation of symmetry constraints and the possibility of freezing core-electron contributions.For the first time, approximate energy and virial ratio are used to determine the optimal data set for the 1-RDM model refinement.
This article is structured as follows.In Section 2, we explain how the 1-RDM reconstruction can be formulated as a convex optimization problem, with the N-representability condition, symmetry and frozen core electrons as convex constraints.The method used for reconstruction, deconvoluted from thermal motion, is briefly reviewed.In Section 3, we showcase the importance of the joint use of position and momentum space data even when Compton scattering data are suspected of being poorly informative.Additional degradation due to noise and temperature effects and the improved robustness using symmetry and frozen core constraints are illustrated.The conclusion and future directions are given in the last section.

1-RDM reconstruction using least-squares fitting
For a spin-traced (spin-free) pure-state N-electron system, the 1-RDM can be derived by integrating out the N À 1 coordinates of the N-electron-density matrix, i.e.
where (r, r 2 , . . .r N ) is the pure-state N-electron wavefunction.A mixed-state system 1-RDM is a mere convex combination of pure-state 1-RDMs.
It is well known (Lo ¨wdin, 1955) that the 1-RDM can be conveniently approximated using a discrete one-electron basis set {� i } as If the basis set is kept fixed, the 1-RDM is determined once the population matrix P in (2) is found.The number of parameters in the model is thus solely conditioned by the size of the population matrix and, therefore, by the number of basis functions.In this work, the basis functions are atomic orbitals, but plane waves could also be considered, if needed, for strongly delocalized electron systems.
The 1-RDM is directly connected to the mean electrondensity distribution in position space through its diagonal elements, �ðrÞ ¼ À ð1Þ ðr; rÞ: Furthermore, the 1-RDM encapsulates momentum space information through a 6D Fourier-Dirac transform (Weyrich, 1996), with n(p) being the momentum density.This double connection to both axes of phase space strongly suggests there is little hope of reconstructing a good-quality 1-RDM from data provided by a single experimental technique.Thanks to very efficient refinement methods and models (Gatti & Macchi, 2012), high-resolution X-ray SF, which are obtained by elastic coherent X-ray diffraction, are almost routinely used in the reconstruction of position space electron density.Using (3), the relationship between SF and 1-RDM is simply ðr; rÞ expðÀ ir � qÞ dr: On the other hand, DCP are measured by deep inelastic incoherent X-ray scattering.Within the impulse approximation (Phillips & Weiss, 1968), they give access to projections of momentum space electron density, i.e.
Jðq; uÞ ¼ where u is the unit vector giving the direction in momentum space onto which the electron density is projected.It is collinear with the scattering vector of the Compton measurement.The model used in this work is based on expression (2).The determination of the best population matrix given a set of SF and DCP thus requires expressing experimental observable values as functions of matrix P using the operator form FðqÞ ¼ TrðF q PÞ and JðqÞ ¼ TrðJ q PÞ; ð8Þ with F q and J q being the SF and DCP operators, respectively.For conciseness, q stands for (q, u) in the Compton profile matrix element of (8).To proceed any further, the operators F q and J q as matrix elements need to be written in the basis-set representation as In this work, particular attention has been paid to the reliability of the final 1-RDM reconstruction.If one assumes that error bars on data points are uncorrelated and follow a normal distribution law, for an unbiased model, the most probable population matrix P is found by solving the minimization problem where the model is expected to yield the mean value for each observable datum represented by O i while its actual experimental measurement gives O exp i with the associated estimated variance � 2 i .In our case, each data point originates either from X-ray diffraction or Compton scattering measurements so that O i ¼ F q i or O j ¼ J q j for different scattering vectors.In the present work, for a given basis set of Gaussian contracted Slater-type orbitals, the closed form of each matrix element ( 9) is calculated prior to refinement using a Mathematica code (Wolfram Research, 2023).
The minimization of ( 10) is a convex least-squares fitting problem.The following section will explain how, together with the necessary N-representability conditions, the reconstruction problem falls into a convex optimization problem called semidefinite programming (Boyd & Vandenberghe, 2004).

Constraints: N-representability, symmetry and frozencore
The N-representability conditions must be satisfied to ensure that the population matrix yields a physically meaningful density matrix.It is worth noting that the N-representability conditions are significantly more difficult if one requires the system to be in a pure state instead of a statistical mixture of quantum states.Pure N-representability and ensemble N-representability are generally employed to distinguish the respective situations (Chakraborty & Mazziotti, 2015).We have chosen to consider the latter case for practical reasons and because the system cannot always be exactly in its ground state, without interacting with the environment.Consequently, for an ensemble N-representable 1-RDM, the population matrix P ?for a closed-shell system, associated with an orthonormal basis set, must satisfy the following constraints, together with the obvious condition that P ? is Hermitian.
Here I is the identity matrix, and the symbol � means the matrix is semidefinite positive, which is equivalent to stating that all eigenvalues are non-negative.Hence, constraint (11b) requires the eigenvalues of P ? to be smaller than 2. As previously mentioned, the present basis set is made of Slatertype atomic orbitals (expressed as Gaussian contractions), which are not mutually orthogonal.A Lowdin orthogonalization is thus performed on the atomic-orbital basis set prior to the reconstruction.All constraints listed in (11) are convex; thus, the convexity of the minimization of equation ( 10) is preserved.Moreover, the semidefinite positivity of P ?imposed in (11a) makes it possible to use the tools of SDP (Foley & Mazziotti, 2012;De Bruyne & Gillet, 2020).Access to the solution is thereby significantly facilitated.
The model developed in this work is specifically adapted to molecular crystals for which a single group of atoms can be considered to form a specific entity.It is assumed that this group, referred to as 'the molecule', does not share any charge with other entities in the same or neighbouring unit cells.The 1-RDM model is thus a mere molecular 1-RDM onto which translation and rotational symmetry operations can be applied to generate the density matrix of the entire crystal.These operations are fully taken into account in the present work.
The symmetry invariance at the molecular level can also be considered.The population matrix is thus required to be a direct sum of matrices in the invariant subspaces of each symmetry operator.In other words, P should be block-diagonal when using the symmetry-adapted orbitals as the new basis, i.e. S T PS ¼ � n j¼1 P j , where S transforms the basis of atomic orbitals into symmetry-adapted orbitals, n is the number of irreducible representations and P j the block matrices associated with each irreducible representation.
The new model also allows for freezing core-electron contributions.It effectively reduces the model's active space, hence the number of parameters to be determined in the population matrix.As a consequence, illustrated in the next section, the computational cost is lowered, and the robustness of the result is improved against noise contamination and thermal-induced effects.It can be best observed on coreelectrons' spatial density distribution, contributing to sharp peaks near each nucleus.Therefore, accurately reproducing such features for a population matrix model would require knowledge of high-q SF, which may present an experimental challenge at usual temperatures.Here, an alternative but common approach was chosen.A single-determinant calculation of the wavefunction is performed from which coreelectron molecular orbitals are extracted to construct an approximate core-electron density matrix.As a result, the model population matrix is given by P ?¼ P 0? þ P ?core , with P ?core being the frozen core-electron population matrix.The latter represents a fixed number of electrons and is -by construction -idempotent.The optimization is thus forced to search for the optimal solution in the subspace orthogonal to that spanned by the core-electron's orbitals if the N-representability on the total 1-RDM is to be preserved.
Combining symmetry and frozen-core conditions and assuming a non-magnetic system, the N-representability constraints become with P 0? being the population matrix for valence electrons and N the total electron number for a single molecule.S 0 transforms the orthogonalized atomic basis into the symmetryadapted basis.We remark that with (12a)-( 12d), the optimization problem can still be modelled with SDP.In this work, the constrained optimization problem is solved from the closed form of our model using the CVXPY package (Diamond & Boyd, 2016).

Reconstruction from non-zero temperature data
It must be noted that the reconstruction method is inherently temperature independent, since the 1-RDM describes both mixed states and pure states.However, comparison with first-principle calculations is generally best done at the zero-Kelvin limit.It is thus helpful to deconvolute thermal motion effects to recover the ideal static 1-RDM.In this work, it is assumed that, given the large photon-electron energy transfer involved in the Compton scattering process, DCP are hardly affected by nuclear agitation at reasonably low temperature (Sternemann et al., 2000;Dugdale & Jarlborg, 1998;Matsuda et al., 2020).Therefore, temperature-induced alteration of experimental data is only taken into consideration for X-ray SF.In this case, the model is modified so that the SF matrix elements include an anisotropic Debye-Waller factor, where b B a is the thermal displacement tensor for nucleus a on which both basis functions � i and � j are centred.No change is applied when the basis functions are associated with different atoms.More sophisticated temperature schemes are worth considering (Stevens et al., 1977).For example, the Mulliken partitioning approach to two-centre contribution was implemented in our previous work (Launay & Gillet, 2021) and should be used with real data.However, the usual independent-atom model was chosen to prevent unfair similarity with the computational method used to generate the reference data (Erba et al., 2013).It has been checked that this simple approach allows for a fair deconvolution of thermal agitation effects when data are not contaminated with noise.

Results
The model explained above is well suited to molecular crystals and should be assessed for realistic systems.In particular, for such an approach which combines different experiments it is necessary to evaluate the impact of data quality on the 1-RDM reconstruction.
The urea crystal [CO(NH 2 ) 2 ] (Fig. 1) has been chosen for two specific reasons: firstly, it has long been considered a 'standard' test system in the field of charge density reconstruction.Several bond types are represented, among which highly mobile and delocalized electron density contributes to non-linear optical properties (Cassidy et al., 1979;West et al., 2015).Secondly, because of the interest it has attracted over the years, high-quality SF (Zavodnik et al., 1999;Birkedal et al., 2004) and Compton profile data (Shukla et al., 2001) are available from the literature.This thus positions urea as a legitimate candidate for a first phase-space-derived reconstruction of experimental 1-RDM on a molecular compound.Additionally, the urea molecule is significantly larger than our previous test systems and possibly one of the largest molecules for which Compton measurement has ever been reported (Shukla et al., 2001).It can thus be considered a significant step in the quest for 1-RDM reconstruction.This paper is the last stage of model calibration before a final reconstruction from true experimental data is undertaken.
We use here the same strategy for model assessment as for smaller systems, and described in previous papers (De Bruyne & Gillet, 2020;Launay & Gillet, 2021): a reference 1-RDM is obtained from a periodic density functional theory (DFT) calculation using the B3LYP functional (Becke, 1993) and a pob-DZVP basis set (Peintinger et al., 2013) using the CRYSTAL14 program (Dovesi et al., 2014).The nuclei positions are those given by Worsham (1957) and derived from neutron diffraction data.Artificial experimental data points are then generated based on this DFT-derived 1-RDM.50 K SF are computed up to sin �=� ¼ 1.1 A ˚À 1 after atomic displacement parameters have been obtained using the dedicated option of CRYSTAL14 (Erba et al., 2013).Compton profiles are little affected by thermal motion at such low temperatures, and no particular treatment is applied in their case.The CRYSTAL14 SF and DCP values are considered ideal mean values on which a Gaussian noise distribution is centred for each data point.Consequently, noise-contaminated data are also considered in our test reconstructions.The reconstructed density matrix is obtained by determining a population matrix for a basis of poorer quality than that employed for artificial data generation.An inevitable bias in the model is therefore introduced.The basis set for the 1-RDM model is thus taken as a simple 6-31G basis set, with additional p orbitals on hydrogen atoms.

Reconstructions from ideal data
The best reconstruction result is expected when data are obtained without thermal motion and noise.The use of artificial data cannot be circumvented to test this optimal case.Observing what type of reconstruction results from the sole use of X-ray diffraction data is then quite illustrative.The artificial experimental set includes 3627 SF with sin �=� < 1.1 A ˚À 1 .Inspection of the 1-RDM À (r, r 0 ) on the O-C-N-H path as a 2D function clearly shows that the SF-only derived 1-RDM lacks most of the off-diagonal regions' important features (Fig. 2).It is consistent with conclusions drawn from previous works on a much smaller system.In such a case, inferring the off-diagonal region is inherently difficult because SF are solely related to the position space density, and therefore to the diagonal component �(r) = À (r, r).Only constraints on the model are likely to improve the off-diagonal description.This is an important criterion to assess the quality of the 1-RDM reconstruction since, in essence, off-diagonal parts are conditioned by the bonding mechanisms and how different locations interfere to shape the wavefunction.).For each direction, data points are taken every 0.1 atomic unit.This value corresponds to usual Compton spectrometer resolutions and prevents significant correlation between consecutive points.The maximum momentum value is set to 10 atomic unit.The data set thus contains 800 DCP values in total.Obviously, a noise-free refinement case does not justify any weighting scheme, and the � i in the objective function ( 10) are uniformly taken to be 1.
As displayed in Fig. 2(b), the reconstructed 1-RDM now exhibits very marginal deviation from the reference.Slight differences persist in the off-diagonal regions À (r, r 0 6 ¼ r).A discrepancy is observed when the reconstructed 1-RDMs are visualized along the two different O-C-N-H paths.Such a discrepancy is corrected once the symmetry restriction is imposed.
The virial ratio À V/2T is calculated for the reconstructed 1-RDMs, where the two-electron potential energy is estimated using the 2-RDM expression ansatz À (2) (r 0 1 , r 0 2 ; r 1 , r 2 ) = À (1) (r 1 , r 0 1 )À (1) (r 0 2 , r 2 ) À À (1) (r 0 1 , r 2 )À (1) (r 0 2 , r 1 ).The virial ratios for reconstruction with and without DCP are 0.996 and 0.934, respectively, confirming the role of Compton data in reaching a more pertinent solution.The distinction between the two reconstructions showcases the importance of momentum space measurement even for a system like urea crystal, where the DCP anisotropy does not exceed 1% of the total electron number (see Fig. 5).Note that the good postrefinement virial ratio is a mere consequence of the reconstruction quality and did not require any ad hoc constraint in our model or the objective function.
In the following paragraphs, possible sources of reconstruction errors will be discussed in more detail, and techniques for improving the model's robustness will be emphasized.

Closer to real life: noise and temperature effects
When real experimental data are used, noise contamination cannot be avoided.This section first considers the effect of statistical noise and, as a common practice, assumes no bias in the model.Then, the thermal motion of nuclei is introduced, and we study how it combines with statistical noise to deteriorate the reconstructed 1-RDM further.
Artificial data are now contaminated by a random noise generated according to a Gaussian law.For example, the SF data values become F 0 (q) = F(q) + n � �(q) with � � N ½0; jFðqÞj�.The noise level is chosen to be of the order of 1% by setting n = 0.01.A similar procedure is applied to the DCP values.Notice that, given the weak Compton anisotropy in this system, the chosen noise level wipes out most of the directional information from the Compton scattering spectrum.We have found that such a noise model results in highly unbalanced weight in the objective function.Hence, an unweighted version of ( 10) is used in practice.
As expected, the 1-RDM reconstruction from noisy data now exhibits stronger deviations from the reference.It can be seen in Fig. 3(a).The modest discrepancies in the diagonal part of the RDM can be emphasized by looking at the electron deformation density displayed in Fig. 4(a).As a reminder, the deformation density is the difference between the total electron density and the sum of independent-atom densities, i.e. promolecular density.The latter is obtained from CRYSTAL14 software using the same basis set as the reference calculation (pob-DZVP).As anticipated, discrepancies are more significant in the off-diagonal region of the 1-RDM [Fig.3(a)].The model, constrained by the N-representability conditions, obviously struggles to get sensible information from the weak Compton anisotropies buried under the noise.It is demonstrated by the noticeable mismatch of anisotropy oscillations shown in Fig. 5 (filled red triangles).Nevertheless, it can be seen that the deviation of reconstructed DCP anisotropy is still moderate, possibly due to the information carried by SF data.This assumption is validated after obser-ving further deterioration when SF are removed from the data (filled blue triangles).
For the proposed model, deconvolution of temperature effects constitutes a difficult challenge.In contrast to most common electron-density reconstructions, using, for example, the widely used kappa-refinement pseudo-atom multipolar model (Hansen & Coppens, 1978;Gatti & Macchi, 2012), our approach to 1-RDM determination relies on a linear expression (2) which, combined with linear constraints (Section 2.2), makes it possible to use positive SDP methodology.Insertion of the Debye-Waller formulation to account for the thermal effect destroys such a linearity.While an alternative formulation is currently under development, it was decided for the present work to explore the possibility of treating sequentially both problems.First, an ab initio 1-RDM is computed with the basis set of the model.Then, the atomic displacement parameters (ADPs) are determined from high-order SF (sin �=� > 0.7 A ˚À 1 ), as explained in Section 2.3.Then, these ADP values B are fixed and incorporated into the model.Therefore, the model remains linear for the 1-RDM refinement step, since a mere factor exp(À q • B • q) is added to the SF operators.The quality of such ADPs depends heavily on the model basis set, and one cannot expect the refined P matrix to be exempt from thermal motion contamination.As shown in the lower panels of Fig. 3(b) and Fig. 4(b), the reconstructed 1-RDM and deformation density continue to worsen, which is clear evidence that the thermal motion effect has not been thoroughly deconvoluted.Although Compton data are assumed to be unperturbed at such low temperatures, the off-diagonal region continues to deteriorate.It must be attributed to the sparsity of reliable information from momentum space, which cannot be compensated for by the SFs.The poor performance of our independent-atom Debye-Waller description is clearly shown by unphysical electron depletion in the vicinity of nitrogen centres shown in Fig. 4(b).Further, it is confirmed by the significant differences between the refined and reference (from CRYSTAL14) ADPs for the nitrogen nuclei (about 25% discrepancy).When real data are involved, this very crude scheme will necessitate the addition of the previously mentioned two-centre terms in (13) and a more thorough inclusion of the Debye-Waller contribution in the general refinement.The feature is currently being implemented.

Further restrictions: frozen-core and symmetry
In the previous section, we discussed how the combination of noise and thermal motion affects the 1-RDM reconstruction using (2).To mitigate such problems, a possible method is to reduce the degrees of freedom of the model, thus making it more robust against noise contamination.As introduced in Section 2.2, one would naturally first invoke the necessity of applying symmetry restrictions to the model.An overall improvement in reconstruction quality is observed as unnecessary free parameters are eliminated.
A further limitation of the active space is obtained by freezing the core-electron contribution to the density matrix.This common procedure does not affect our ability to absorb momentum space data, which primarily describe delocalized valence electrons.On the SF side, freezing the core component of the 1-RDM helps stabilize the refinement against highorder reflections, which are the most affected by the noise and nuclear motion, while preserving nearly all the model's flexibility.In this subsection, we report the impact of such a scheme under the non-ideal reconstruction scenarios.
When the frozen-core and symmetry restrictions are added to those concerning N-representability, Fig. 3(b) (upper panel) shows that the distortion in the reconstructed 1-RDM is greatly reduced.In this case, even in the presence of noise and thermal agitation, the model catches most of the features observed in the reference 1-RDM [Fig.3(a)].Note that the most significant discrepancy is in the off-diagonal region corresponding to the long-range interaction between hydrogen and carbon, which are second neighbours.Such a striking improvement confirms that limiting the active space can effectively improve the reconstruction's robustness against noise.The standard deviation on the reconstructed 1-RDMs with and without additional constraints [Fig.3(c)] was estimated upon resampling from the Gaussian noise distri-bution.It is observed that the restriction of active space distinctly diminishes the uncertainty of the reconstruction.
Interestingly, such an improvement in the 1-RDM modelling brings only minor changes to the resulting deformation density near the nuclei.Similarly, no major improvement is observed for the DCP anisotropy reconstruction (see the supporting information).This seemingly paradoxical observation can be resolved when adopting an optimization problem perspective.
As mentioned earlier, the 1-RDM reconstruction was defined through (10) as a least-squares minimization problem given the SF and DCP data.Therefore, introducing constraints such as (12a)-( 12d) can only result in a new optimal solution with a higher � 2 value, i.e. a worse fit to the SF and DCP.Consequently, the DCP anisotropies and deformation density are not likely to be improved because they only depend on our ability to fit the Compton data and a set of Fourier coefficients of the electron density.However, a 1-RDM is a function in 6D space which contains more information than its limited number of projections given by the data values.In such a case, it is well founded to believe that restricting the size of solution space effectively regularizes the model, giving it stronger predictive ability.
Estimating the total electron energy from experimental SF is a well known, difficult challenge.Adding Compton scattering information does not significantly facilitate the task.However, on a mere relative scale, the energy criterion can be employed to compare the performances of different refinement strategies.Throughout this work, a recurrent question has been to evaluate the optimal cut-off value in sin �=� for the SF.In a perfect world, free from thermal motion, high-Miller-indices reflections should be retained as long as they rise above statistical noise.The solid curve in Fig. 6 shows that, 7 of 9  for an ideal 0 K set of SF, the total electron energy stabilizes for any cut-off value above 0.7 A ˚À 1 .It is no longer true when data values are affected by temperature agitation.In the 50 K case (dashed curve), SF corresponding to sin �=� > 0.7 A ˚À 1 contribute to a significant deterioration of the reconstruction from an electron-energy perspective.As shown in Fig. 6 (all dashed curves), minimum energy is reached when only reflections lower than 0.7 A ˚À 1 are included in the set.Then, as one increases the Ewald sphere radius, the total energy starts to rise continuously.This confirms that the additional highorder reflections, which are the most affected by thermal motion, are not sufficiently well deconvoluted by the onecentre Debye-Waller model and merely contribute to perturbing the refinement process.
When symmetry enforcement alone is applied, an overall improvement can already be observed under 0 K and 50 K scenarios.More remarkably, introducing an additional frozencore component not only further reduces the perturbation instilled by high-order reflections but also dramatically improves the reconstruction ability when only a small amount of reflection data is available.Thus, both restrictions effectively increase the stability of the model by filtering out most of the perturbation brought by thermal motion and noise contamination and by reducing the number of unnecessary free parameters.In addition, the behaviour of the fully restricted model in the reduced q max domain suggests the possibility of reconstructing the 1-RDM with a limited amount of low-angle SF data, those that describe the most diffused electrons.
We insist that the Hartree-Fock-like energy computed here is only meaningful as an indicator of the reconstruction quality since it uses both position and momentum space electron densities.However, due to the intrinsic difficulty of predicting energy from 1-RDMs, the question of whether one could accurately determine the total (or interaction) energy from scattering experiments should be left for more careful examination and discussion.

Conclusion and discussion
In this work, an improved 1-RDM reconstruction method has been tested on a system which is significantly larger than those previously investigated (De Bruyne & Gillet, 2020;Launay & Gillet, 2021).The crucial role played by momentum space information, originating from Compton scattering data, is confirmed.It is instrumental in the quality of the reconstruction, even when the weak anisotropy is buried under statistical noise.The two main additions to the model, symmetry restrictions and frozen-core contributions, are shown to drastically stabilize the 1-RDM reconstruction process against statistical noise and temperature effects.Meanwhile, with no additional constraints, it is shown that the resulting energies, evaluated from the modelled 1-RDM, closely satisfy the virial theorem.As a consequence, the approximated total energy and virial ratio were found to be valuable indicators to identify an optimal portion of the Ewald sphere, which balances pertinent information and noise contamination.
However, proper deconvolution of temperature-induced nuclear motion remains a challenging problem.In the current approach, two main obstacles have been identified: firstly, our choice of limiting the flexibility of the temperature model and the basis set to avoid bias in the assessment; secondly, the necessity of keeping the 1-RDM model linear.Both inevitably led to strong discrepancies in atomic displacement parameters but allowed for a reliable assessment of the model's stability.Moreover, we have good reasons to believe that using a nonlinear version of the optimization, including two-centre temperature factors and a better basis set, will drastically improve the performance when real experimental data are considered.
The current method for 1-RDM reconstruction is essentially a statistical inference procedure.Therefore, the quality of its outcome depends not only on the data distribution but also on the prior distribution of the model.In the present stage, a uniform prior was used, which means no prior knowledge is assumed.In the future, one could consider a more informed prior, for example, a Gaussian distribution centred on a lowerlevel theory calculation.The use of additional priors will help the model's performance, especially in the case of poor data quality.
Finally, our results illustrate that 1-RDM reconstruction is achievable for a system of moderate size from X-ray SF and DCP measurements, even when the momentum space information is drastically limited.In the next step, such a method can be readily applied to actual experimental data.

Figure 1 (
Figure 1 (a) The unit cell of urea crystal (P421m with a = 5.66, c = 4.71 A ˚).(b) The dashed green line represents the path along which the 1-RDM values displayed in this article have been computed.The path is a succession of segments passing through O-C-N-H atoms.

Figure 2 (
Figure 2 (a) The reference 1-RDM along the O-C-N-H path was calculated from CRYSTAL14.The reconstructed 1-RDMs with and without the DCP artificial data are shown, respectively, in the upper left and lower right corner of (b).The contours are drawn at �10 À 2 � 2 n e A ˚À 3 , n 2 ½0; 20�, where the positive (negative) contours are shown in solid (dashed) lines with blue (red) shades.

Figure 5
Figure 5Reference (dots) and reconstructed (lines) DCP anisotropy for the[110]   direction without (circles) and with (triangles) 1% noise.The dotted line shows the reconstruction without the use of SF artificial data.The red shaded area indicates the standard deviation of reconstruction upon resampling.

Figure 6
Figure 6Mean field energy (see text) of one urea molecule reconstructed from 1% noisy data.Dashed and solid lines with circle and triangle data points represent the reconstruction with 0 K and 50 K SF data, respectively, and with identical Compton data.Blue lines show the reconstruction with no additional constraints.Light blue and red lines show the results when symmetry and core-electron constraints are used.The purple dotted line shows the virial ratio of a 0 K 1% noisy data reconstruction with additional restrictions.