Recovery of undamaged electron-density maps in the presence of damage-induced partial coherence in single-particle imaging

Single-particle imaging using X-ray free-electron lasers requires an intensive X-ray fluence which damages the sample as it is imaged. This paper explores the potential for including theoretical simulations of single-particle electrodynamics into a reconstruction of the electron density imaged using X-ray free-electron lasers. While partial coherence, induced by radiation damage, poses a challenge for phase retrieval, it is demonstrated that additional coherent modes enable suppression of the effect of radiation damage in the reconstructed sample electron density.


Introduction
Single-particle imaging (SPI) using X-ray free-electron lasers (XFELs) seeks to achieve atomic resolution ($1 Å ) in a reconstructed three-dimensional (3D) electron density of a non-crystalline sample, such as single biological molecules, obtained from a large number of two-dimensional (2D) X-ray diffraction patterns (Sayre et al., 1998;Chapman et al., 2006;Aquila et al., 2015). These 2D diffraction patterns are collected in the far field (Fresnel number much smaller than unity) from randomly oriented copies of the sample, each of which is destroyed by the ultra-high X-ray flux (Neutze et al., 2000). Elastic scattering in light elements of biological importance at X-ray photon energies h -! ' 10 keV, needed to achieve atomic resolution, is relatively weak. In fact inelastic interactions such as photoionization dominate over the Thompson scattering at these X-ray energies. Photoionization starts a cascade of secondary ionization processes which together turn the sample into a highly charged molecular ion that expands rapidly due to Coulomb repulsion (Hau-Riege et al., 2004;Fortmann-Grote et al., 2017).
While radiation damage is a recognized limitation for SPI (Aquila et al., 2015), a range of technical challenges currently constrain the achievable resolution (Chapman, 2019). These challenges include sample delivery, background scattering by a carrier medium, sample reproducibility and purity, low number of photons scattered by individual samples, and various sources of detection noise. For a more detailed discussion we refer the reader to Chapman (2019) and references therein. In this work we focus on the following question, raised by Aquila et al. (2015): 'Will it be possible to recover undamaged electron-density maps based on damaged sample data?'.
Alongside atomic motion, radiation damage induces electronic structure change in the sample during the pulse (Hau-Riege et al., 2007a;Son et al., 2011). The changes in electron structure, caused by photoionization (Neutze et al., 2000) and a cascade of secondary ionization events during the pulse, occur on a sub-femtosecond timescale (Seibert et al., 2010). In nanocrystalline samples the scattering from the damaged structure is suppressed by a breakdown in crystalline periodicity and a loss of bound electrons (Barty et al., 2012;Caleman et al., 2015b;Nass et al., 2020). In single-molecule samples a reduction in the contrast of speckles in continuous diffraction fluence is caused by the damage-induced thermal motion of atoms (Martin et al., 2015). If a molecule is confined to a buffer, for example a droplet, the atomic motion will be suppressed compared with a molecule in a vacuum (Hau-Riege et al., 2007b). Temporal variation of the electronic structure leads to a loss of coherence in the scattered X-ray photons (Quiney & Nugent, 2011;Gorobtsov et al., 2015;Martin & Quiney, 2016;Caleman et al., 2020) and significantly modifies the atomic structure factors (Hau-Riege et al., 2007a;Son et al., 2011). This partial coherence can be described in terms of coherent modes (Mandel & Wolf, 1995, Quiney & Nugent, 2011, so that the partially coherent scattered intensity is represented as the sum of two or more coherent modes with a certain number of photons occupying each of them. Hence, electron-density maps recovered from a single coherent mode depict a damaged electron structure. Even if atomic motion is insignificant, the depicted electron structure is modified from its initial state. Below we use the term 'damaged electron density' in this sense. This damaged electron density tends towards a pulse-averaged distribution in the limit of low incident fluence (Martin & Quiney, 2016). An application of phase retrieval algorithms which assume full coherence and retrieve the phase of a single mode will therefore yield such a damaged electron density. Generalized phase retrieval algorithms capable of retrieving multiple coherent modes have been reported in the literature (Whitehead et al., 2009;Quiney, 2010;Sala et al., 2019).
In this paper we investigate the possibility of utilizing a sample electrodynamics theory combined with experimental measurements of partially coherent diffraction data to overcome the damage-induced electronic structure changes in the retrieved electron density of the sample. We demonstrate that a linear combination of retrieved coherent modes with modal weights obtained from the theory yields a reconstruction of local atomic electron densities with significantly suppressed radiation damage effects. This reconstruction requires input from a theoretical ionization-dynamics model which relies only on knowledge of the chemical composition of the sample without any structural information. Addition of a second mode significantly improves the reconstruction of undamaged electron density compared with that obtained with a single coherent mode, conventionally used under an assumption of full coherence in the scattered X-ray intensity. The coherence of the scattered signal as a function of the pulse length and fluence is investigated for a model system of DNA origami. These artificial structures have recently been proposed for testing SPI using XFELs (Xavier & Chandrasekaran, 2018) and cryo-electron microscopy (Kopatz et al., 2019). For typical contemporary XFEL pulse parameters, as many as 8% of photons can occupy modes other than the primary one. For the theory described below these photons serve as a source of useful structural information rather than decoherence, as would be the case for a conventional approach with an assumption of full coherence. We further discuss the inclusion of additional modes, beyond the first two, into the reconstruction of undamaged electron density.

Modal decomposition
In the context of SPI we consider the problem of reconstruction of the 3D electron-density distribution in a sample from the 3D intensity distribution of scattered X-rays. The latter is obtained by merging many 2D diffraction patterns after solving for sample orientations in each pattern. We assume that the incident X-ray pulse is a quasimonochromatic plane wave propagating along the optic axis z with intensity distribution I(x, y, 0, t) = I in (t), which is spatially uniform in the object plane z = 0 but can vary in time and has a finite duration. Here, (x, y) denote coordinates in planes perpendicular to the optic axis, and t denotes time. The phase distribution of the incident pulse, (x, y, 0, t) ' exp(Ài! t), is assumed to be approximately flat over the pulse duration T within the area occupied by the sample in the object plane, where ! = 2c/, c is the speed of light in a vacuum and is the mean wavelength. We also assume that the scattering is weak, so that the standard first Born approximation can be applied to describe the diffracted intensity in a remote detector plane z = R. Under these assumptions, the integrated intensity of diffracted X-rays W(q) in the far (Fraunhofer) field, R ) d 2 /, where d is the diameter of the sample, obtained by integrating the diffracted intensity I R (q, t) over a single exposure (pulse length) time, can be expressed as where the molecular form factor F(q, t) is the Fourier transform of the time-dependent molecular electron density (r, t), r e is the classical electron radius, and W B (q) is the background integrated intensity that contains both the Compton scattering term and other incoherent contributions (Lorenz et al., 2012;Slowik et al., 2014;Gorobtsov et al., 2015). Below we shall assume that background subtraction has been carried out and set W B (q) = 0. In contrast with previous publications (Lunin et al., 2015;Gureyev et al., 2018), here we include plasma effects and consider entire atomic configurations (Son et al., 2011) rather than individual atomic orbitals (Quiney, 2010;Lunin et al., 2015) in the damage model during illumination by the incident X-ray pulse. Note that because of the stochastic character of radiation damage processes and the variability of the repeated illumination pulses, the right-hand side of equation (1) is a stochastic distribution; this is taken into account below. We adopt an independent-atom model for the total molecular density (Quiney & Nugent, 2011;Son et al., 2011), so that the total molecular electron density (r, t) is given by the sum of the atomic electron densities ðatÞ ðrÞ of all atoms in the molecule, Here = {Z, }, and a m Z ðtÞ represents the occupancy of the electronic configuration in the mth atom of chemical element Z, located at a position R m Z in the molecule. For example, = 1 in carbon enumerates the configuration 1s 2 2s 2 2p 2 , = 2 stands for 1s 2 2s 2 2p 1 , and so on for all 27 configurations. This treatment of a m Z ðtÞ as entire configuration occupancies is in contrast to the original model (Quiney, 2010), which treated these quantities as the number of electrons in an orbital of atom Z. This allows us to account for the change in atomic orbitals between different configurations, which can be significant (Son et al., 2011), especially for heavier elements (Kozlov & Quiney, 2019). Substituting the electron density (3) into (1), we can write the intensity integrated over the X-ray pulse duration as where f ðqÞ exp½Àiq Á R m Z is the Fourier transform of the atomic electron density m Z ðr À R m Z Þ, so that f (q) is an atomic form factor. Here the temporal pulse variability and radiation damage effects are contained in the stochastic 'damage matrix' Å m Z n Z 0 . Note that the quantities a m Z ðtÞ themselves depend on the incident pulse intensity I in (t), as the time-dependent intensity drives the electron-density dynamics of atoms in the molecule. The XFEL pulse is assumed to be sufficiently short that the atomic positions R m Z of atoms do not change during the pulse. The diffracted intensity in equation (4) is a statistical function of repeated measurements. The mean value over the ensemble is equal to where ðqÞ = f (q)T Z (q) and the structure function T Z (q) = P m Z exp½Àiq Á R m Z has been introduced. Note that ðqÞ is the Fourier transform of the electron density of all atoms Z in the sample, each of which has an electronic configuration denoted by . Within the framework of the independent-atom model, the mean damage is independent of the position of each atom in the sample (Hau-Riege et al., 2004;Lorenz et al., 2012), so that where the atomic position in a (t) has been dropped, and the overline denotes the average value over an ensemble of damage scenarios. The independent-atom model assumes that, on average, each atom of chemical element Z, illuminated by incident X-rays, has the same charge density P a ðtÞ ðrÞ at any time t.
The average integrated intensity (6) can be rewritten in diagonal form, where the vectorq qðqÞ has components ðqÞ, and v k is an eigenvector of the matrix PS, so that PSv k = k v k , and (S) = R dq ðqÞ ðqÞ is the overlap matrix (see Appendix A for details). In the above equation the eigenvalues k are associated with the occupancies of coherent modes, which are orthonormal, so that equation (8) takes the form of a modal decomposition (Mandel & Wolf, 1995;Quiney, 2010;Quiney & Nugent, 2011;Lorenz et al., 2012;Lunin et al., 2015) that reflects the coherence properties of the mean integrated intensity WðqÞ. The mode occupancy k is proportional to the total number of photons N k in the mode k (q). Indeed, the mode occupancy can be written as k = !N k = (in atomic units, used henceforth in this article, the reduced Planck constant his set to unity and ! = 2), where is the quantum efficiency of the detector (Goodman, 1985;Mandel & Wolf, 1995;Gureyev et al., 2017). Now consider the problem of extracting the sample electron density from the coherent modes. This can be achieved using a set of vectors u k = Sv k which are bi-orthogonal to vectors v k : hu k ; v k 0 i = kk 0 . The vectors u k are eigenvectors of the matrix SP so that Expanding the vectorq qðqÞ we obtain The undamaged electron density can be expressed as 1 (r) = P Z;m Z Z¼1 ðr À R m Z Þ, where the index = 1 indicates the initial undamaged configuration of each atom. The Fourier transform of the undamaged electron density 1 ðqÞ = P Z T Z ðqÞ f Z¼1 ðqÞ = P 2J 1 ðqÞ, where J 1 = {Z, = 1} is a subset of all electronic configurations that includes only those that correspond to undamaged atoms. Therefore, research papers Equation (12) expresses the undamaged electron density in the sample via the coherent modes that can be retrieved from experimental data, collected in the presence of radiation damage, using phase retrieval algorithms (Whitehead et al., 2009;Quiney, 2010;Sala et al., 2019) and the eigenvectors of the matrix SP obtained from the theoretical model. Importantly, the first summation in equation (12) can always be truncated at a finite k = M, which corresponds to the singularvalue decomposition (SVD) of the matrix SP, SPx = P k k hx; v k i u k (Golub & Van Loan, 1989). However, if the sum in equation (12) is truncated so that only M < N modes are included, where N is the total number of modes, then This leads to imperfect reconstruction of the undamaged electron densities using equation (12). We introduce the normalization factor (M) = P K ðMÞ 6 ¼ 1 for M < N, which rapidly converges to unity when several modes with the highest eigenvalues k are included in the summation. For M = 2 (while N ' 10 3 ), (M) differs from unity by no more than 40% for all the simulation parameters in Table 1 that we considered. To correct the reconstructed electron density, we introduce (M) into equation (12) and obtain for the norm-adjusted undamaged electron-density reconstruction from the truncated sum over coherent modes. If the radiation damage is negligible, the integrated intensity WðqÞ is fully coherent and equation (14) becomes exact with just a single mode occupied by all scattered photons. In summary, equation (14) demonstrates that the electronic structure corresponding to a sample that is unaffected by radiation damage can potentially be retrieved from only the first two coherent modes k (q) and their occupancies k . All coefficients in equation (14) depend only on the chemical composition of the sample and can be obtained via a radiation damage simulation, while the coherent modes k (q) contain information about the sample structure. These modes can in principle be retrieved from the 3D diffraction fluence, collected in the presence of radiation damage, and a priori information about the sample (Quiney, 2010;Quiney & Nugent, 2011;Curwood et al., 2013). A different algorithm for solving a generalized phase retrieval problem (Thibault & Menzel, 2013) has been successfully used to retrieve ten coherent modes from the diffraction data collected in an XFEL imaging experiment (Sala et al., 2019).

Radiation-damage dynamics model
Finding the undamaged electron density (14) from the coherent modes k (q) given by equation (14) requires the expansion coefficients u k given by equation (10). The latter form eigenvectors u k of the matrix SP, so finding them requires knowledge of both the overlap matrix (S) = R dq Ã ðqÞ ðqÞ and the damage matrix P. According to equation (7), the damage matrix depends on the electron-density occupancies a (t), which are determined by the ionization dynamics of the sample. This section describes finding these occupancies and modelling the ionization dynamics of the molecule imaged by an XFEL.
In the independent-atom model that was adopted in the derivation of the coherent mode expansion (6), the molecular electron density is approximated by those of individual atoms. Adopting the average-over-configuration description of electron configurations in atoms (Lindgren & Morrison, 1986;Son et al., 2011;Kozlov & Quiney, 2019) and using Parseval's theorem, the overlap matrix can be written as where N Z is the number of atoms of element Z. The atomic scattering form factor f Z (q), introduced in equation (4), depends only on the absolute value of q for a spherically symmetric atomic electron density. The overlap matrix (15) depends only on the atomic scattering form factors and is independent of the underlying molecular structure (Appendix A). The evolution of the atomic electron configuration occupancies can be described Percentages of the total number of photons N scattered by the sample that occupy coherent modes k (q).
Only values that correspond to the first (k = 0, top row for each pulse length) and second (k = 1, bottom row for each pulse length) coherent modes are depicted for a range of XFEL pulse lengths and fluences with an incident XFEL photon energy of 8 keV.
Pulse fluence (Â 10 4 J cm À2 ) impact ionization and the Auger effect, make a substantial contribution to the dynamics of configuration occupancies a Z (t). We expanded the AC4DC atomic toolkit (Kozlov & Quiney, 2019;Kozlov et al., 2020), which solves the atomic rate equations (16) with photoionization, Auger effect and fluorescence processes, to include electron impact ionization and three-body electron recombination. Here we modified the continuum model (Hau-Riege et al., 2004;Gorobtsov et al., 2015) to treat secondary electrons adiabatically as a Maxwellian plasma separately from the photoelectrons, which can collide with the former. The latter are described by two additional equations for their number and energy densities: Here N s , E s and N p , E p are the number and energy density of the secondary and photoelectrons, respectively, n Z is the number density of element Z, R ðsÞ Z;! 0 include those of the above-mentioned atomic processes that release or capture secondary electrons, eR ðsÞ Z;! 0 are the rates of the same processes weighted by their energy intake or release (see Appendix B for details), and R ðpÞ Z;! 0 and eR ðpÞ Z;! 0 indicate the respective rates for photoelectrons. The temperature of secondary electrons is T(t) = 2E s (t)/3N s (t) (Boltzmann's constant is equal to unity in atomic units). The photoelectron energy distribution is approximated by a delta function ½ p À p ðtÞ, where the average photoelectron energy is ðtÞ = E p (t)/N p (t). This leads to the loss of a major part of the highenergy tail of the total electron partition function (Hau-Riege, 2013), where electrons can acquire energies in excess of the incident photon energy !. To capture this effect, a more sophisticated model (Leonov et al., 2014) is required. In our model, collisions between photoelectrons are ignored, but their collisions with secondary electrons are included via the rate R coll in the weak-beam approximation (Kunc, 1989) where the number density of secondary electrons exceeds that of photoelectrons by at least an order of magnitude. This model assumes that the finite thermalization time of photoelectrons is longer than the pulse length. This is a valid assumption for X-ray photon energies in excess of 2 keV and typical XFEL pulse parameters (Hau-Riege, 2013). We do not consider Coulomb trapping of photoelectrons at the later stages of the pulse, which is justified if the incident X-rays are sufficiently energetic and photoelectron scattering leaves them with a substantial portion of their initial kinetic energy (Gorobtsov et al., 2015). Secondary electrons, which include electrons produced via Auger decay and impact ionization, are assumed to be trapped at all times during the pulse. This results in a spike in the secondary electron temperature at the early stages of the pulse, when only a few energetic secondary electrons are present. These electrons are assumed to undergo rapid thermalization and are trapped by the net positive charge of the sample that is the result of escaping photoelectrons, described by the rate R esc , which depends on the sample size (see Appendix B for more details).
To estimate the accuracy of our atomic and plasma simulation model, AC4DC, we compare its predictions with the non-local thermodynamic equilibrium (NLTE) approach (Scott, 2001). The latter was initially developed for the description of laboratory plasmas and later adapted to model sample dynamics illuminated by an XFEL pulse (Caleman et al., 2015a;Jö nsson et al., 2015;Beyerlein et al., 2018;. Two major differences between our approach and NLTE are the atomic structure description (AC4DC calculates electron transition rates for all possible ionic states) and the different dynamics of photoelectron thermalization (weakbeam approximation in AC4DC, instantaneous thermalization in NLTE). However, our model does not include continuum lowering that modifies the ionization potentials of atoms due to high-density electron plasma in the sample; this process is included in NLTE.
We compared average ionizations per atom of water and lysozyme under an XFEL pulse fluence ranging from 10 5 to 10 6 J cm À2 and a pulse length of 20 fs with an X-ray photon energy of 6 to 9 keV . For the shortest pulse length available in the NLTE implementation (Jö nsson et al., 2018) the difference in prediction between the two models for the average charge per atom in lysozyme and water is under 10%. The secondary electron temperature predicted by AC4DC was higher for all but the weakest incident fluences compared with the temperature predicted by NLTE. The reason for this discrepancy is an amplification of Auger decay in ions, included in AC4DC, and continuum lowering, accounted for in NLTE. Finally, the accuracy of the rate equation model presented above depends on the relative scales of the photoelectron thermalization time and the XFEL pulse length. As the photoelectron thermalization time increases for higher X-ray photon energies, the approximation of the free-electron energy distribution that AC4DC relies on becomes more accurate.

Results
At present, atomic resolution using XFEL sources remains a challenge. Imaging and reconstructing pre-designed DNA origami structures was proposed to guide the experimental and data analysis techniques required for the success of SPI (Xavier & Chandrasekaran, 2018). This approach was recently demonstrated in cryo-electron microscopy, where DNA origami was encapsulated into a capsid of simian vacuolating virus 40 (Kopatz et al., 2019). These structures have a number of features that make them an attractive target for an SPI research papers experiment, such as high reproducibility and homogeneity, structural robustness under various conditions, a chemical composition similar to that of most proteins, and a high degree of control over their structure (Xavier & Chandrasekaran, 2018;Kopatz et al., 2019). While DNA origami structures may appear semi-periodic in small-angle scattering, they can be designed to resemble single molecules at wide angles. Here we apply the theoretical framework described above and implemented in AC4DC to investigate the coherence properties of scattering by DNA origami. First, we use information about the structure of the sample to solve the forward problem and simulate the first two coherent modes. Next, we use only the chemical composition (without the structural information) of the sample and the XFEL pulse parameters to construct the overlap matrix S and damage matrix P. This enables us to find the coefficients in equation (14) and obtain an approximation to the undamaged electron-density map. We consider incident X-ray photons at 8 keV in XFEL pulses of Gaussian shape with FWHMs ranging from 10 to 30 fs and fluences between 10 4 and 10 7 J cm À2 . In our simulations we consider DNA origami that is spherical in shape with a radius of 20 nm, made up of equal proportions of adenine, guanine, cytosine and thymine nucleotides with a total number ratio of chemical elements of 39 (C):15 (N):24 (O):4 (P). The DNA strands occupy a volume equal to that of a cylinder with a radius of 1 nm and a length of 10 nm per 32 base pairs, with an additional 0.5 nm added to the radius to account for spacing between the strands (Rothemund, 2005). Table 1 shows the fraction of photons in the first and second coherent modes given by equation (9) for the XFEL pulse parameters listed above.
We observed the largest decoherence for a 30 fs pulse with a fluence of 3.2 Â 10 7 J cm À2 (corresponding to a peak intensity of 1 Â 10 21 W cm À2 ). For these pulse parameters 91.7% of photons occupy the first mode and 8.0% occupy the second, with the remaining 0.3% occupying all other modes combined. Fig. 1 shows the ionization dynamics obtained by solving the system of equations (16) for atoms of the DNA origami exposed to a 30 fs XFEL pulse with a fluence of 3.2 Â 10 7 J cm À2 .
As is evident from Fig. 1, by the time the pulse reaches its peak almost half the electrons are stripped from their atoms due to ionization. The reduction in the number of bound electrons can be observed in Fig. 2 for an electron-density map (dark-red mesh) obtained from a single mode. In the diffraction data this reduction in scattering power is somewhat compensated by an increase in the number of incident photons for a higher incident fluence. The effective atomic radial densities, the Fourier transform of which gives pulse-averaged atomic form factors, are plotted in Fig. 3 in red for the dominant mode where they were scaled by a factor (M) [see equation (13) and the subsequent description]. The radial atomic electron density as depicted by the dominant coherent mode (which in the low-damage limit represents a pulse-averaged structure) has a relatively higher depletion of valence-shell electrons compared with the K shell, although the photoionization cross section is much higher for the latter. This occurs due to an enhancement of K-shell hole recombination in Auger decay in ions (Son et al., 2011) and a stronger suppression of the contribution to the total wide-angle scattering from atoms that have lost a K-shell electron compared with atoms that have lost an electron from a valence shell. The average loss of bound electrons (ionization degree) for carbon, nitrogen, oxygen and phosphorus atoms in DNA origami, illuminated by a 30 fs XFEL pulse with a fluence of 3.2 Â 10 7 J cm À2 .

Figure 2
A simulation of the reconstructed electron density of a DNA fragment obtained from equation (14) using one (dark-red mesh) and two (paleblue mesh) coherent modes. Values of constant electron density in both depicted surfaces are equal.

Discussion and conclusions
Incorporating modes beyond the dominant mode, which can be obtained from partially coherent diffraction images, enables us to obtain a reconstruction of the sample electron density (14) that is improved compared with the pulseaveraged density affected by radiation damage. This approach assumes that the motion of atoms in a sample during an XFEL pulse is small compared with their size, which may be achieved if the imaged molecule resides in a droplet illuminated with a short pulse (Hau-Riege et al., 2007b) but can be quite large otherwise (Fortmann-Grote et al., 2017), especially for pulses well in excess of 30 fs (Martin et al., 2015;Caleman et al., 2015b). The possibility of extracting coherent modes beyond the dominant one depends on the degree of coherence and the signal-to-noise ratio (SNR) in the average integrated intensity WðqÞ. Indeed, since the number of photons in the second mode does not rise beyond 8% in our simulations, we expect at least an inversely proportional reduction in the SNR in the retrieved second mode compared with the first dominant mode. However, a sufficiently high number of diffraction patterns can provide the required SNR and will be achievable in the near future with the progress of XFEL technology (Sobolev et al., 2020). Our results, shown in Table 1, indicate that for higher-fluence pulses rather substantial radiation damage is imprinted into the electron density of the sample. In such XFEL regimes the atomic form factors need to be adjusted using radiation damage models to prevent structure reconstruction algorithms from confusing neutral atoms with ions of different chemical elements. For higher pulse lengths and fluences we observe a relative reduction in the degree of coherence. For short pulses, higher coherence is observed due to the relative reduction in radiation damage associated with the contribution from secondary ionization processes (Hau-Riege et al., 2004).
The impact of radiation damage on the reconstructed electronic structure of a fragment of DNA origami can be observed in Fig. 2. The electron density, depicted in dark red, was reconstructed using equation (14) from only the first coherent mode. The depletion of valence electrons observed in this figure is clearly visible when compared with the electron density reconstructed from both the first and second modes (light blue). Both reconstructions in Fig. 2 represent surfaces of constant electron density, the values of which are the same for both surfaces. The radial atomic electron densities plotted in Fig. 3 offer a deeper insight into the qualitative advantage of multi-mode electron-density reconstruction in the presence of radiation damage. The red and blue lines represent the 1D radial atomic electron densities used in the 3D electrondensity surfaces of the corresponding colour in Fig. 2 within the framework of the independent-atom model. The twomode reconstructions (blue) accurately reproduce the undamaged radial electron densities, while the one-mode reconstructions (red) clearly show the depletion of valence electrons for all atoms except phosphorus, for which a more dramatic depletion is observed for all electrons including those in the K shell (first peak in radial density). Note that, in general, a single coherent mode reconstruction becomes Radial atomic electron densities reconstructed using expansion (14) with two modes (blue lines) and one mode (red lines). Undamaged densities (black dashed lines) are shown for comparison. Radial distances from the atomic nucleus are expressed in Bohr radii, a B = 0.529 Å , and the radial electron densities are in electrons per a B . dependent on the incident pulse parameters in the presence of significant radiation damage. This will introduce an additional complication into subsequent structure refinement. A twomode reconstruction, on the other hand, is more robust to the choice of XFEL pulse parameters and yields a close approximation to the undamaged electron structure.
It is well known that elements as heavy as sulfur and phosphorus are ionized substantially faster by XFEL pulses than lighter elements of biological significance. The results in Fig. 1 indicate that phosphorus starts undergoing a rapid ionization several femtoseconds earlier than carbon, nitrogen and oxygen. Therefore, the radial electron density of phosphorus in Fig. 3, reconstructed using a single mode (red line), misses a larger fraction of initial electrons than lighter elements. Addition of the second mode into the reconstruction (blue line in Fig. 3) corrects phosphorus as well as the lighter elements. Therefore, the second coherent mode has an amplified contribution from heavier elements and a reduced contribution from lighter atoms compared with the first mode. In the context of single-particle imaging, this may have implications for imaging techniques that rely on anomalous diffraction from heavy elements, and this will be explored in future work.

APPENDIX A Coherent mode equations
In order to find the expression for coherent modes in (9) it is useful to consider the properties of the overlap matrix, The product of the structure factor T Z (q) and atomic form factor f Z (q) is the Fourier transform of the sum of electron densities of atoms Z in a configuration , The last part of this expression was obtained by neglecting atomic density overlap. An assumption of radial symmetry of the atomic electron density ðatÞ ðrÞ, followed by application of Parseval's theorem, yields equation (15) for the overlap matrix. Any such matrix is Gramian (Horn & Johnson, 2013), and hence positive semi-definite, since hSx; xi = P S x x = R dq P x ðqÞ 2 for any vector x. Moreover, the overlap matrix considered in the present paper is invertible, and therefore positive definite. Indeed, as we use a tight-binding model in which each atom of the sample is considered independent [equation (15)], the overlap matrix ðSÞ ZZ 0 0 is always equal to zero when Z 6 ¼ Z 0 . This means that S has a blockdiagonal structure, with each block corresponding to a different chemical element Z. Thus, its determinant can be zero only if the determinant of one of these individual blocks is zero. The latter condition would mean that the overlap matrix for different configurations of one atom is degenerate, which would correspond to a linear dependence of the atomic electron densities Z (r). This is never the case for a set of atomic electron densities that correspond to different orbital occupancies (electron configurations), since the difference in the Coulomb interaction between electron configurations lifts degeneracy and therefore prevents linear dependence. Note that, in the atomic model (Kozlov & Quiney, 2019) used in this paper, we average over the total angular momentum and its projections, treating all these states in the ensemble {Z} as one state with an effective weight. Let k and g k be the eigenvalues and the orthonormal eigenvectors of the symmetric matrix Q = S 1/2 PS 1/2 . Then for any vector x we have Qx = P k k hx; g k ig k . As P = S À1/2 QS À1/2 , we obtain Like the matrix S, the matrix P is also positive semi-definite, but unlike S, P can be highly degenerate. In particular, in the absence of radiation damage, the rank of P is equal to one, i.e. only one of the eigenvalues k is non-zero (in that case there is only one coherent mode, i.e. the diffracted wave is spatially fully coherent). Substituting equation (25) into the expression hPqðqÞ; qðqÞi in equation (6), where qðqÞ is the vector with elements (q), we obtain the following representation of the mean diffracted intensity in terms of coherent modes and their mean 'occupation numbers' (Mandel & Wolf, 1995;Quiney, 2010;Quiney & Nugent, 2011;Lorenz et al., 2012): The functions k (q) are orthonormal in the sense that Z dq Ã k ðqÞ k 0 ðqÞ ¼ X Z dq ðS À1=2 g k Þ ðqÞ S À1=2 g k 0 À Á ðqÞ ¼ hSS À1=2 g k ; S À1=2 g k 0 i where we have used the orthonormality of vectors g k . It is also easy to verify that the vectors v k = S À1/2 g k are eigenvectors of the matrix PS, i.e. PSv k = k v k : the proof consists of substituting g k = S 1/2 v k into the eigenvector equation S 1/2 PS 1/2 g k = k g k and multiplying both sides by S À1/2 from the left.