research papers

JOURNAL OF APPLIED CRYSTALLOGRAPHY
ISSN: 1600-5767

Adaptively coupled phase retrieval in multi-peak Bragg coherent diffraction imaging

(a) Department of Physics and Astronomy, Brigham Young University, Provo, UT 84602, USA, (b) Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA, (c) Advanced Photon Source, Argonne National Laboratory, Lemont, IL 60439, USA, (d) National Synchrotron Light Source II, Brookhaven National Laboratory, Upton, NY 11973, USA, and (e) Materials Science Division, Argonne National Laboratory, Lemont, IL 60439, USA
*Correspondence e-mail: [email protected]

Edited by F. Meneau, Brazilian Synchrotron Light Laboratory, Brazil (Received 28 August 2025; accepted 12 November 2025)

Recent advances in Bragg coherent diffraction imaging (BCDI) experimental techniques permit routine measurement of multiple Bragg peaks from a single crystalline grain. The resulting images contain the full lattice distortion vector field, which can be differentiated to provide lattice strain and rotation. With the advent of fourth-generation synchrotron light sources, such multi-peak datasets are produced at high rates, creating a need for rapid phase retrieval of the multiple peaks and subsequent image analysis. Here we describe and demonstrate a new implementation of a coupled phase retrieval technique for multi-peak BCDI which simultaneously treats each Bragg peak of the dataset and produces a three-dimensional image of the crystal's morphology and lattice distortion field. In addition, this method uses the redundant information contained in the various Bragg diffraction patterns to detect and suppress spurious signal appearing on the detector in a subset of the measurements. Compared with manual data editing, adaptive coupling produces a more consistent phase profile in reciprocal space and sharper surfaces in direct space, with no significant difference in computational cost. These improvements reduce the need for manual preprocessing and enable robust high-throughput analysis of multi-peak BCDI data, supporting near-real-time strain microscopy at modern synchrotron facilities.

1. Introduction

Bragg coherent diffraction imaging (BCDI) is an important form of lensless imaging (Williams et al., 2003; Pfeifer et al., 2006; Newton et al., 2010; Miao et al., 2015). Like other forms of coherent diffraction imaging, it is based on the retrieval of phase information from spatially oversampled coherent diffraction intensities (Fienup, 1982; Fienup, 1987; Marchesini et al., 2003), which can then be back-propagated to create a nanometre-scale-resolution image of the diffracting object once phase retrieval is complete. In the case of BCDI, this diffraction pattern is obtained by reflecting coherent X-rays from atomic planes within a crystal according to Bragg's law (Patterson, 1939). In this geometry, 'rocking' the crystal through the Bragg condition in fine angular increments while measuring the two-dimensional diffraction intensity around a Bragg peak is equivalent to sweeping out a three-dimensional volume of that crystal's reciprocal space (Williams et al., 2003). In the elastic single scattering limit, this measurement is mathematically related to the object through a single 3D Fourier transform operation.

Bragg diffraction is highly sensitive to the spacing of atomic planes, and any variation in that spacing due to crystalline deviatoric strain is encoded in the coherent interference surrounding the corresponding Bragg peak. For a small region around the reciprocal-lattice point G, this encoding can be approximated as

Mathematical equation

where Ψ is the complex diffraction pattern, ρ is proportional to the diffracting crystal's electron density and u is the atomic displacement field. Because of the projection u · G, at least three noncoplanar reflections are required to provide a full 3D image of the lattice distortion (and by extension the full strain tensor) within a crystal (Pfeifer et al., 2006; Harder et al., 2007; Robinson & Harder, 2009; Favre-Nicolin et al., 2010; Newton et al., 2010; Yang et al., 2013; Ulvestad et al., 2016; Hofmann et al., 2017; Ulvestad et al., 2017; Yau et al., 2017; Cherukara et al., 2018; Hruszkewycz et al., 2018; Singer et al., 2018; Kawaguchi et al., 2019; Hofmann et al., 2020).
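The forward model described above can be illustrated numerically. The following Python sketch assumes the common convention in which the direct-space reflection is ρ exp(iG · u) and the diffraction pattern is its discrete Fourier transform. The sign of the phase, the FFT normalization and the function name `project_to_peak` are assumptions made for illustration and may differ from the convention used in the reconstructions.

```python
import numpy as np

def project_to_peak(rho, u, G):
    """Project a shared density and displacement field onto one Bragg peak.

    rho : (N, N, N) real array, proportional to electron density
    u   : (3, N, N, N) real array, atomic displacement field
    G   : length-3 scattering vector, in units reciprocal to those of u
    Returns the direct-space reflection psi and its centered Fourier transform Psi.
    """
    G = np.asarray(G, dtype=float)
    # Only the component of u along G contributes to the phase (assumed sign: +).
    phase = np.tensordot(G, u, axes=(0, 0))
    psi = rho * np.exp(1j * phase)
    # Centered FFT so that the Bragg peak sits in the middle of the array.
    Psi = np.fft.fftshift(np.fft.fftn(np.fft.ifftshift(psi)))
    return psi, Psi
```

Because only u · G enters the phase, a displacement perpendicular to G leaves the diffraction pattern of that peak unchanged, which is why three noncoplanar reflections are needed to recover all components of u.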

In recent years, several coupled phase retrieval (CPR) techniques have been demonstrated in which, rather than combining the results of individually phased Bragg peaks, the phase retrieval problems associated with each peak are synthesized into a single optimization problem (Newton, 2020; Gao et al., 2021; Wilkin et al., 2021; Maddali et al., 2023). This larger problem requires a reconstructed object to agree simultaneously with all Bragg peaks. Due to the limitations of measurement, a low-error solution to the phase problem that satisfies all collected data is pathologically rare. However, for small amounts of evenly distributed noise, an equal compromise between all measurements is likely to be closer to the truth than a perfect fit to any one.

Unfortunately, the equal-compromise approach – which we will refer to as static coupling – fails in the case of a large localized discrepancy, such as a second Bragg peak caused by another (unwanted) crystal, as shown in Fig. 1(a). These accidental reflections are colloquially referred to as 'aliens', a term comparing their unexpected appearance and unknown origin to visitors from outer space. The generally preferred approach to the alien problem has been to remove the problematic data manually, either by erasing sections of a measured Bragg peak or by discarding the measurement entirely. However, while increasingly brilliant X-ray facilities worldwide allow for more sensitive measurements, that very sensitivity will also reduce the amount of data that can be collected without aliens. This, combined with an ever-accelerating rate of data production, necessitates a more efficient, precise and objective method for alien removal.

Figure 1
Illustrations of two significant challenges to multi-peak BCDI. (a) An alien is produced when a single experimental geometry satisfies the Bragg condition for multiple crystals along the beam path, preventing either peak from being faithfully measured. (b) Each peak is measured with a unique crystal orientation and detector position, giving each its own unique reciprocal basis.

A second challenge associated with multi-peak BCDI, illustrated in Fig. 1(b), is that the coordinate system used for measurement is different for each peak (Maddali et al., 2020). One solution is to resample the data into a common reference frame before performing phase retrieval. However, transforming the unique geometry of each peak measurement to a common frame skews the initially rectangular 3D images into parallelepipeds which only partially fill the array onto which they are resampled. As a result, diffraction bars which extended to the edges of the original measurement may be truncated in the resampled array, producing Gibbs-phenomenon artifacts in the reconstruction. The alternative, transforming the reconstructing object into each measurement geometry when phasing the corresponding peak, has its own drawbacks. While often preferred for being more faithful to the measurement, this method can require thousands of interpolations throughout a reconstruction, which introduces a risk of compounding errors and increases the reconstruction time by orders of magnitude.

In this article, we present two adaptive coupling methods for multi-peak BCDI that use redundant information to mitigate the effects of spurious or incomplete measurements. These methods vastly increase the robustness of CPR, producing high-quality reconstructions despite aliens and resampled data, all while adding a negligible computational cost. By reducing the need for manual data processing and enabling faster reconstructions, adaptive coupling can dramatically accelerate the analysis of multi-peak BCDI data. Section 2 describes both techniques in detail, including how they fit into a particular CPR framework. In Section 3 we demonstrate these methods by reconstructing the density and atomic displacement field of an Au crystal from five Bragg peaks, one of which contains an alien. Finally, Section 4 concludes the article with a brief discussion of the potential implications of such a technique. Table 1 lists the symbols and notation used herein.

Table 1
Symbols for mathematical notation used herein, with definitions

Symbol       Meaning
p            Index signifying a particular Bragg peak
Gp           Scattering vector associated with peak p
r            Location in direct space
q            Location in reciprocal space (relative to a given Gp)
Ip           Measured diffraction intensity for peak p
Mp           Measured diffraction amplitude for peak p
ρ            Electron density within the scattering crystal
u            Atomic displacement field within the scattering crystal
ψp, Ψp       Direct- and reciprocal-space reflections for peak p as calculated from ρ and u
ψ, Ψ         Direct- and reciprocal-space reflections as they are being phased
β            Global update strength
NMI(A, B)    Normalized mutual information for two arrays A and B
H(A)         Shannon entropy of array A
H(A, B)      Joint Shannon entropy of two arrays A and B
Cp           List of values returned by equation (4) for peak p during the current reconstruction
wp           Weight assigned to peak p
w0           Minimum weight for a peak to apply Shrinkwrap software (user defined)
X0           Threshold for alien detection (user defined)
Qall         Set of all voxels for which at least one peak has data

2. Adaptive coupling

The adaptive coupling methods presented here are implemented within a CPR framework that involves phasing a single peak at a time, while periodically applying a peak-switching operation. In this process, the redundant information gained while phasing one peak generally improves the initial guess for the next peak. Each peak p has an associated scattering vector Gp, which determines the diffraction measurement Mathematical equation used during phasing. When switching from a given peak, the partially reconstructed exit wave ψ(r) is first used to update a shared density ρ(r) and displacement field u(r) as follows:

Mathematical equation

Mathematical equation

where β is a weighting parameter which biases the update towards new (β = 1) or old (β = 0) information and may be adjusted at any point in the reconstruction. The updated density/displacement are then projected onto the new peak using equation (1) and phasing continues with the corresponding diffraction pattern. This static coupling largely follows the CPR method described by Gao et al. (2021), the main differences being the weighting parameter β and the option to allow multiple iterations between peak switches.
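One way to picture such a β-weighted update is sketched below in Python. This is an illustrative assumption rather than a transcription of equations (2) and (3): it blends the shared density linearly with the exit-wave amplitude and moves u along G by a fraction β of the wrapped phase residual. The function name `update_shared_object`, the linear blend and the phase sign convention are all hypothetical.

```python
import numpy as np

def update_shared_object(rho, u, psi, G, beta):
    """Blend one peak's partially phased exit wave into the shared rho and u.

    Assumed (hypothetical) update rules:
      rho <- (1 - beta) * rho + beta * |psi|
      u   <- u + beta * wrap(arg(psi) - G.u) * G / |G|^2
    beta = 1 keeps only the new information; beta = 0 keeps only the old.
    """
    G = np.asarray(G, dtype=float)
    new_rho = (1.0 - beta) * rho + beta * np.abs(psi)
    # Phase currently predicted for this peak by the shared displacement field.
    predicted = np.tensordot(G, u, axes=(0, 0))
    # Residual phase, wrapped into (-pi, pi] by taking the angle of a unit phasor.
    residual = np.angle(psi * np.exp(-1j * predicted))
    # Only the component of u along G is constrained by this peak.
    new_u = u + beta * residual * G[:, None, None, None] / np.dot(G, G)
    return new_rho, new_u
```

Under these assumptions, β = 0 leaves ρ and u untouched, while β = 1 makes the projection of the updated object reproduce the phase of ψ exactly for the current peak.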

Into this framework we implement two novel techniques: adaptive peak weighting and adaptive artifact removal. A single iteration of the complete algorithm is given in Fig. 2. On a conceptual level, both methods work on the principle that, since each Bragg peak is a projection of the same crystal, their fully described wavefields must all be perfectly compatible – any conflicting information represents an error in measurement. As such, adaptive coupling is concerned with determining the compatibility of measurements and weighting them accordingly. In the following descriptions, Ψ refers to the diffraction pattern as projected from the density and displacement fields [see equation (1)] without any additional phasing. For simplicity, we shall assume that the various Bragg peaks have been resampled in a common basis.

Figure 2
A pseudocode representation of a single iteration of adaptively coupled phase retrieval. The Boolean values of switchPeaks and updateWeights depend on whether the current iteration has been marked for those operations. The FFT and IFFT functions, respectively, represent the forward and inverse fast Fourier transforms.

Adaptive peak weighting regulates the overall influence each dataset has on the reconstruction. As a confidence metric, we calculate the normalized mutual information (NMI) defined as

Mathematical equation

where Mp and |Ψ| are, respectively, the measured and calculated diffraction amplitudes, both of which have been convolved with a Gaussian (σ = 1 voxel) to suppress high-frequency fluctuations, and H is the numerically estimated Shannon entropy of an array (or joint entropy of two arrays). Mutual information compares the overall structure by examining the frequency with which two similar-valued elements in one array are also similar-valued in the other (Feixas et al., 2014). For a more detailed discussion of NMI, see Appendix A. After switching to a peak but before phasing begins, the NMI is calculated and appended to a running list Cp associated with that peak.

In general, the NMI should increase as the reconstruction progresses, and a persistently low NMI indicates the presence of spurious information. On this basis, each peak is periodically assigned a weight wp, defined as the median value of Cp (scaled so that the 'heaviest' peak has unit weight). The global weighting parameter β is then replaced with wpβ in equations (2) and (3), reducing the update strength for all but the most confident peak. Additionally, peaks whose weight falls below a user-defined threshold w0 are not allowed to alter the support region while phasing, for example, with Shrinkwrap (Marchesini et al., 2003).
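The weighting rule described above amounts to a few lines of arithmetic. The Python sketch below is a minimal illustration, assuming the running NMI lists are stored in a dictionary keyed by hypothetical peak labels. The function names `peak_weights` and `may_update_support` are not taken from any existing package.

```python
import numpy as np

def peak_weights(nmi_history):
    """Map each peak to its weight w_p.

    nmi_history : dict mapping a peak label to its running list C_p of NMI values.
    The weight is the median of C_p, scaled so the largest median equals one.
    """
    medians = {peak: float(np.median(values)) for peak, values in nmi_history.items()}
    largest = max(medians.values())
    return {peak: median / largest for peak, median in medians.items()}

def may_update_support(weight, w0):
    """Peaks whose weight falls below w0 may not alter the support region."""
    return weight >= w0

# Hypothetical NMI histories for two peaks.
history = {"peak_a": [0.2, 0.4, 0.6], "peak_b": [0.1, 0.2, 0.3]}
weights = peak_weights(history)
# The effective update strength for a peak is then weights[peak] * beta.
```

Using the median rather than the most recent value makes the weight insensitive to a single unusually good or bad NMI estimate.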

Adaptive artifact removal applies the same high-level principle on a voxel-by-voxel basis. Aliens and other artifacts are detected using an error function,

Mathematical equation

which is large where the measurement exceeds the projection but small where the projection exceeds the measurement. This asymmetry makes it well suited to detecting aliens, which are almost exclusively an additive phenomenon. In addition to aliens, any voxel with a value of zero may be considered an artifact. In cases of resampled diffraction patterns, this includes regions where the measurement does not occupy the entire resampling array. Even in the original measurements, however, zero-valued voxels do not indicate that the photon distribution integrated over the area of a pixel is equal to zero. Rather, they are a result of finite dynamic range, photon statistics and/or image processing (e.g. thresholding to remove the noise floor).

When switching to a new peak, we define a mask

Mathematical equation

where X0 is a user-defined threshold and Qall is the volume of reciprocal space that has been measured by at least one peak. This mask selects low-confidence voxels which could benefit from information drawn from other peaks. If the current peak has wp < 1, we redefine the reciprocal-space image used for the modulus constraint as

Mathematical equation

until the next peak switch. This modified modulus constraint enforces consistency both with the reliable portions of the measured diffraction pattern and with the projection synthesized from all other Bragg peaks. If the peak has wp = 1, the modulus constraint is left unchanged, to prevent any voxel from being masked in every peak and therefore unconstrained. A single projection guess, defined as the median measured amplitude at each point in reciprocal space, is used to produce an initial mask for each peak. Until the mask is updated, these voxels should be constrained to zero to prevent particularly bright aliens from becoming entrenched. Note that this assumes that, at any given spatial frequency (relative to its associated Bragg peak), the majority of measurements will not contain an alien.
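The masking and substitution steps can be sketched as follows. This Python example is a simplified illustration under stated assumptions: the asymmetric error array is taken as a precomputed input rather than derived from equation (5), the mask is assumed to combine the threshold test with zero-valued voxels inside Qall, and all function names are hypothetical.

```python
import numpy as np

def measured_volume(measured_stack):
    """Q_all: voxels for which at least one peak has nonzero data."""
    return np.any(measured_stack > 0, axis=0)

def low_confidence_mask(measured, error, x0, q_all):
    """Flag voxels whose error exceeds x0 or which hold no data, within q_all."""
    return ((error > x0) | (measured == 0)) & q_all

def modulus_target(measured, projected_amp, mask, weight):
    """Amplitude used for the modulus constraint until the next peak switch."""
    if weight >= 1.0:
        # The most trusted peak keeps its measurement unchanged, so that no
        # voxel can be masked in every peak.
        return measured
    # Elsewhere, masked voxels take the amplitude projected from rho and u.
    return np.where(mask, projected_amp, measured)

def initial_projection_guess(measured_stack):
    """Voxel-wise median amplitude over all peaks, used for the first mask."""
    return np.median(measured_stack, axis=0)
```

Keeping the weight check inside `modulus_target` mirrors the rule that the unit-weight peak is never modified.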

In addition to the 'recipe' associated with nearly all iterative phase retrieval algorithms, adaptive coupling requires the user to set three parameters: (i) corresponding lists of iterations/values for updating w0, (ii) how often to update each peak's Qmask and (iii) the alien detection threshold X0. However, as is often the case, a single set of default parameters can be used for a wide range of datasets. As general guidelines, we recommend w0 be initially set to 0.5 and then increased periodically to a final value of 0.9; Qmask should be updated after the solver has cycled through each peak two to five times; and X0 ∈ [2, 6] depending on the desired sensitivity.

3. Reconstructions

The effectiveness of adaptive coupling is demonstrated by reconstructing an Au crystal (Fig. 3) from its Mathematical equation, Mathematical equation, Mathematical equation, Mathematical equation and 002 Bragg peaks. These data were collected at the Advanced Photon Source (APS) on beamline 34-ID-C in 2022, prior to the APS upgrade. Diffraction patterns were produced with a 9 keV beam and measured at a distance of 1 m on one 256 × 256 pixel quadrant of an ASI Quad Timepix detector at 200 points along a rocking curve (±0.5°). The images were resampled in the crystal's reference frame immediately prior to reconstruction but after all other pre-processing. The resampled arrays used for phasing were 240 × 240 × 240 voxels. Central slices of both the original and resampled diffraction images are shown in Fig. 3, and the morphology of the reconstructed crystal is given in Fig. 4.

Figure 3
Central slices of each measured Bragg peak, (top) in the original rocking curve coordinate system and (bottom) resampled into the crystal coordinate system for phase retrieval. Images are colored on a logarithmic scale. An additional diffraction pattern was added to the Mathematical equation peak image to simulate the presence of an alien. Due to the placement of this alien, it does not appear in the central frame of the rocking curve, though some fringes are visible to the left of the main peak in the resampled image. The alien can be more easily seen in Fig. 6(b).

Figure 4
Three views of a 60% density isosurface of an Au crystal as reconstructed using adaptive coupling. The dimensions given in (a) represent the height and width of the crystal at its largest. The cross section used in Fig. 8 is shown as a partially transparent plane in (b) and (c). The dark-gray plane in all three images represents the substrate on which this crystal sits. The crystal's unusual morphology is due to it being only one half of a bicrystal. The forward-facing facet in panel (a) is a grain boundary with a twinned crystal (not shown) of similar size and shape.

Before resampling, a second diffraction pattern was superimposed onto the Mathematical equation peak. This artificial alien, originally measured from a separate Au crystal, was scaled to have a peak intensity 10% that of the primary diffraction pattern, with Poisson noise regenerated to match the dynamic range. The alien was intentionally placed along one of the primary's diffraction bars, such that erasing the alien would also erase some of the intended measurement.

Each reconstruction was performed in Cohere, an open-source BCDI-focused phase retrieval program developed at the APS (Frosik et al., 2024), with 1000 total iterations of phase retrieval broken into alternating blocks of 75 hybrid input–output (HIO) steps and 25 error reduction (ER) steps (Fienup, 1982). Shrinkwrap (Marchesini et al., 2003) was applied with 1 voxel Gaussian pre-smoothing and 10% threshold after every HIO iteration (with exceptions described below). Peak switching was applied after every fifth iteration, with an initial value of β = 1 in equations (2) and (3), reduced to 0.8, 0.6, 0.5 and 0.4, respectively, after every 200 iterations.

We consider three reconstruction cases, each of which was repeated with 25 random initializations. In the first case, no alien was present and the five peaks were phased using CPR with static coupling. In the second case, the alien was added and then manually removed from the Mathematical equation peak prior to phasing by setting a rectangular volume of 92 × 256 × 84 voxels to zero (prior to resampling). Again, CPR was applied with static coupling. In the third case, the alien was added to the Mathematical equation peak and the crystal was phased using the adaptive coupling techniques described in this article. No significant difference was observed in the computational speeds of these three groups.

In the adaptive reconstructions (also implemented in Cohere), peak weights were initialized at unity and updated every 100 iterations, as shown in Fig. 5. Despite significant variability in the first few updates, the final weight assigned to each peak was highly consistent across the 25 reconstructions. As expected, the lowest weight was assigned to the alien-containing Mathematical equation peak. The Mathematical equation peak was also given a relatively low weight, which could be due to its reduced measurement volume compared with the others. The minimum weight required to apply Shrinkwrap to the shared support region while phasing was initially set to w0 = 0.5 and increased by 0.1 every 200 iterations to a final value of w0 = 0.9. The threshold for alien detection and removal was fixed at X0 = 5.
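The schedules quoted above can be collected into a single lookup, sketched here in Python. The function `recipe_at` is hypothetical and is not part of Cohere. It assumes zero-based iteration counting, so that 'after every fifth iteration' corresponds to iterations 4, 9, 14 and so on.

```python
BETA_SCHEDULE = (1.0, 0.8, 0.6, 0.5, 0.4)   # one value per block of 200 iterations
W0_SCHEDULE = (0.5, 0.6, 0.7, 0.8, 0.9)     # raised by 0.1 every 200 iterations
X0 = 5.0                                    # fixed alien detection threshold

def recipe_at(iteration):
    """Return the settings in force at a zero-based iteration (0 to 999)."""
    stage = min(iteration // 200, len(BETA_SCHEDULE) - 1)
    # Each block of 100 iterations holds 75 HIO steps followed by 25 ER steps.
    algorithm = "HIO" if iteration % 100 < 75 else "ER"
    return {
        "algorithm": algorithm,
        "shrinkwrap": algorithm == "HIO",            # applied after every HIO iteration
        "beta": BETA_SCHEDULE[stage],
        "w0": W0_SCHEDULE[stage],
        "switch_peak": (iteration + 1) % 5 == 0,     # after every fifth iteration
        "update_weights": (iteration + 1) % 100 == 0,
    }
```

For example, `recipe_at(450)` describes an HIO step with β = 0.6 and w0 = 0.7.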

Figure 5
Adaptive coupling weights assigned to each peak over the course of 25 reconstructions. Solid lines indicate median values, while lighter filled regions show the 25th to 75th percentile range. While early weights differ greatly from one reconstruction to the next, the values converge strongly by the end. As expected, the Mathematical equation peak (which contained the alien) was given the least weight.

Fig. 6 illustrates how adaptive coupling not only removes the alien but also fills in the space it had occupied. By contrast, manual data editing introduces a large region of zero intensity which, while probably better than the alien, still constrains the reconstruction with information known to be false. Fig. 7 shows how the alien-detecting part of Qmask, initially conservative, grows over the course of the reconstruction. Fig. 8 shows that static coupling also produces oscillations that radiate inward from each facet (Gibbs phenomena) in the reconstructed density and displacement fields. These artifacts are much less pronounced in reconstructions using adaptive coupling.

Figure 6
A 2D slice of the Au crystal's Mathematical equation Bragg peak amplitude profile resampled in the crystal reference frame and colored on a logarithmic scale, (a) as measured without an alien, (b) with an artificially added alien, (c) with the alien removed by manual editing and (d) with the alien removed by adaptive coupling. The irregular shape of the alien does not fit cleanly into the (originally) rectangular region cut out of panel (c), which erases some desired signal and leaves some alien signal. By contrast, adaptive coupling is able to mask a more tightly fitted region around the alien and replaces the erased intensity with projected values. The 3D image arrays were sliced through the center of the alien rather than the main peak.

Figure 7
A 2D slice (same index as Fig. 6) of the alien detection mask for the Mathematical equation Bragg peak after 0, 200 and 400 iterations of phase retrieval. The mask, initially covering only the brightest parts of the alien, grows as the reconstruction reaches a consensus on the merits of each measured voxel. In these tests, the mask typically stabilized after approximately 400 iterations and showed little change thereafter.

Figure 8
Table of cross-sectional images, showing the atomic density and displacement as reconstructed from five Bragg peaks using (a) static coupling with no alien present, (b) static coupling with an alien manually removed from one peak and (c) adaptive coupling with an alien present in one peak. An outline of the 60% density isosurface is overlaid on each image. Each displacement component is labeled, along with an arrow indicating the positive direction, and the size of the cross section is overlaid on the density of panel (a). The atomic displacement fields in rows (a) and (b) show strong oscillations normal to crystal facets – a common Fourier artifact, which is significantly less pronounced in panel (c). To see where the cross section is located in the 3D object, see Fig. 4.

We may obtain an estimate of resolution by examining the sharpness of reconstructed surfaces, similar to the 'knife-edge test' commonly done in 2D imaging with binary-resolution test patterns. In the present case, the density of the actual Au crystal is effectively binary – uniform within and dropping sharply to zero at the surface. The sharpness of the reconstructed density may therefore be treated as a true, if incomplete, accuracy metric. The knife-edge resolution was calculated by taking the median distance between each point in the 25% (outer) density isosurface and the nearest point in the 75% (inner) density isosurface. The median knife-edge widths are given in Table 2, along with the narrowest and widest (i.e. sharpest and blurriest) values in each group.

Table 2
Half-pitch resolution for the sharpest, median and blurriest density reconstructions in each set, estimated by the median distance between the 25% and 75% density isosurfaces

Because the transition from diffracting crystal to non-diffracting air is abrupt, this metric partially indicates the accuracy of a reconstruction. According to these tests, adaptive coupling produces sharper reconstructions than manual alien removal.

Method              Narrowest   Median    Widest
No alien            21.8 nm     22.9 nm   23.7 nm
Manual removal      24.7 nm     26.0 nm   26.6 nm
Adaptive coupling   23.1 nm     23.8 nm   24.4 nm
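The knife-edge width reported in Table 2 can be approximated on the voxel grid without building triangulated isosurfaces. The Python sketch below is a voxel-level approximation, not the procedure used to produce the table: it assumes that the outer isosurface can be represented by the boundary voxels of the 25% region and that distances to the 75% region can be taken from a Euclidean distance transform. The function name and the `voxel_size` parameter are hypothetical.

```python
import numpy as np
from scipy import ndimage

def knife_edge_width(density, voxel_size=1.0):
    """Median distance from the 25% density surface to the 75% density region.

    density    : 3D real array of reconstructed density (object away from array edges)
    voxel_size : edge length of one voxel, in the desired output units
    """
    normalized = density / density.max()
    outer = normalized >= 0.25
    inner = normalized >= 0.75
    # Distance from every voxel outside the inner region to its nearest inner voxel.
    distance_to_inner = ndimage.distance_transform_edt(~inner, sampling=voxel_size)
    # Boundary voxels of the outer region stand in for the 25% isosurface.
    outer_surface = outer & ~ndimage.binary_erosion(outer)
    return float(np.median(distance_to_inner[outer_surface]))
```

Because distances are measured between voxel centers, the result carries a discretization error of roughly one voxel, so it is better suited to comparing reconstructions than to quoting an absolute resolution.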

To test adaptive coupling further, we performed reconstructions on simulated data. The density profile of the simulated crystal was binary, convex and randomly faceted, and each component of the displacement field was generated from random noise passed through a Gaussian filter. Because the crystal was generated in the laboratory frame, its generated Bragg peaks were resampled to simulate different detector positions. While the resampled arrays were 200 × 256 × 256 voxels to simulate 200 angular positions on a 256 × 256 pixel detector, the crystal and its diffraction peaks were initially generated at the higher resolution of 512 × 512 × 512 to prevent aliasing.
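A test object of the kind described above might be generated as in the following Python sketch. It is an assumption-laden illustration: the crystal is built as an intersection of random half-spaces (which is always convex), six axis-aligned facets are added to keep it bounded, and the array size, facet count, smoothing width and displacement scale are arbitrary illustrative values. The function name `simulate_crystal` is hypothetical.

```python
import numpy as np
from scipy import ndimage

def simulate_crystal(n=64, n_random_facets=12, radius=16.0, sigma=6.0,
                     u_max=0.1, seed=0):
    """Return a binary, convex, randomly faceted density and a smooth random u."""
    rng = np.random.default_rng(seed)
    coords = np.indices((n, n, n)).astype(float) - n // 2       # (3, n, n, n)
    # Random facet normals and distances from the origin.
    normals = rng.normal(size=(n_random_facets, 3))
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    offsets = radius * rng.uniform(0.6, 1.0, size=n_random_facets)
    # Six axis-aligned facets guarantee that the polyhedron is bounded.
    normals = np.vstack([normals, np.eye(3), -np.eye(3)])
    offsets = np.concatenate([offsets, np.full(6, radius)])
    # A voxel is inside if it lies behind every facet plane.
    heights = np.tensordot(normals, coords, axes=(1, 0))        # (facets, n, n, n)
    rho = np.all(heights <= offsets[:, None, None, None], axis=0).astype(float)
    # Each displacement component is Gaussian-filtered random noise.
    noise = rng.normal(size=(3, n, n, n))
    u = np.stack([ndimage.gaussian_filter(component, sigma) for component in noise])
    u *= u_max / np.abs(u).max()
    return rho, u
```

Generating the object on a finer grid than the one used for phasing, as described above, would then only require a larger `n` followed by resampling.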

Five diffraction patterns were generated, representing the Mathematical equation, Mathematical equation, 002, 200 and Mathematical equation reflections. Aliens generated from other simulated crystals were superimposed onto the Mathematical equation and Mathematical equation peaks. The Mathematical equation alien was very close to the center of its image, while the Mathematical equation alien was nearer to the edge. Both aliens were adjusted to have 10% of the maximum intensity of their main peak. Two sets of reconstructions were performed: the first using only the Mathematical equation, 200 and Mathematical equation peaks, and the second using all five peaks.

The reconstruction parameters for both sets were the same as for the Au nanocrystal, with two exceptions: when sampling back into the detector frame, the diffraction patterns were sampled at 180 × 180 × 180 voxels and the total number of phasing iterations was reduced from 1000 to 500. Each dataset was reconstructed 50 times using adaptive coupling, 50 using manual data editing and 50 with the aliens simply left in place. Fig. 9 shows a cross section of the displacement field as reconstructed by each of these groups.

Figure 9
Central cross sections of the out-of-plane displacement field in a simulated crystal, (a) downsampled from the original ground truth to match the sampling rate used for phasing, (b)–(d) reconstructed from three peaks with one alien and (e)–(g) reconstructed from five peaks with two aliens. Three different reconstruction approaches were used: (b) and (e) adaptive coupling, (c) and (f) manual data editing, and (d) and (g) no alien removal.

Each reconstructed crystal was centered, flipped (if a twin image) and projected onto each of the three principal axes via equation (1). A Fourier shell correlation was then used to compare the resulting diffraction images with corresponding projections of the originally simulated crystal. Fig. 10 gives the median correlation over all three axes and all 50 reconstructions performed per dataset per method. In this test, reconstructions produced with adaptive coupling were significantly more correlated to the ground truth than those produced with manual data editing.
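A Fourier shell correlation of this kind can be computed as in the Python sketch below. It assumes that both inputs are complex reciprocal-space arrays on the same grid with the origin at the array center, and that shells have equal width in voxel units. The function name and the `n_shells` parameter are hypothetical.

```python
import numpy as np

def fourier_shell_correlation(F1, F2, n_shells=16):
    """Correlation between two centered reciprocal-space arrays, shell by shell."""
    axes = [np.arange(n) - n // 2 for n in F1.shape]
    grid = np.meshgrid(*axes, indexing="ij")
    radius = np.sqrt(sum(g.astype(float) ** 2 for g in grid)).ravel()
    # Equal-width shells out to the largest radius that fits inside the array.
    edges = np.linspace(0.0, min(F1.shape) // 2, n_shells + 1)
    shell = np.digitize(radius, edges) - 1
    keep = shell < n_shells          # discard the corners beyond the last shell

    def shell_sum(values):
        return np.bincount(shell[keep], weights=values.ravel()[keep],
                           minlength=n_shells)

    cross = shell_sum((F1 * np.conj(F2)).real)
    power1 = shell_sum(np.abs(F1) ** 2)
    power2 = shell_sum(np.abs(F2) ** 2)
    return cross / np.sqrt(power1 * power2)
```

A value of one in a given shell indicates perfect agreement at that spatial frequency. Values near zero indicate no correlation.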

Figure 10
Fourier shell correlations (FSCs) between the reconstructed and true density/displacement fields for various methods and datasets. Each line represents the median FSC across 50 reconstructions. Higher values represent better correlation with ground truth. By this metric, adaptive coupling leads to more accurate reconstruction than manual data editing by a significant margin across the entire range of spatial frequencies.

4. Conclusion

In this work, we have introduced two complementary adaptive coupling strategies – adaptive peak weighting and adaptive artifact removal – within a CPR framework for multi-peak BCDI. By dynamically adjusting the influence of each Bragg reflection according to its agreement with redundant measurements and by selectively overwriting unreliable voxels, these techniques substantially enhance reconstruction robustness in the presence of imperfect data. This added robustness reduces the need for extensive manual preprocessing of data and allows for rapid reconstructions using pre-transformed diffraction patterns. As a result, adaptive coupling could help beamlines perform high-throughput experiments and enable near-real-time analysis of multi-peak BCDI experiments.

APPENDIX A

Entropy and normalized mutual information

The great insight underlying most of information theory is that the entropy of a distribution – its randomness or unpredictability – is intrinsically related to the amount of information gained by sampling it (Shannon, 1948). This is why, before planning a picnic for tomorrow, one checks to see whether it will rain but not whether the sun will come up. When discussing the information content of an array A, we assume that each element of A has been drawn randomly from an unknown (and in the present case, continuous) probability distribution p. Fortunately for our numerical machines, p can be simultaneously estimated and discretized by taking the histogram of A. The Shannon entropy of A can then be defined in the usual way,

\[ H(A) = -\sum_{a \in \mathcal{A}} p(a) \log p(a), \]

where \(\mathcal{A}\) is the set of bins used for the histogram, a is an element whose value falls within a given bin and p(a) is the count for each bin divided by the total number of elements in A. Similarly, a joint (2D) histogram can be used to calculate the joint entropy of two arrays A and B,

\[ H(A, B) = -\sum_{a, b} p(a, b) \log p(a, b), \]

where p(a, b) is the probability of finding an a-valued element in A at the same index as a b-valued element in B. When calculating these histograms during phasing, the amplitudes in each image were divided into 100 log-spaced bins.

Entropy plays a key role in equation (4), where the similarity between the measured and calculated diffraction amplitudes is expressed using normalized mutual information. In its original form (Studholme et al., 1999), the NMI between two arrays A and B is given as

\[ \mathrm{NMI}(A, B) = \frac{H(A) + H(B)}{H(A, B)}. \]

By this definition, perfectly uncorrelated images have an NMI of 1, while perfectly correlated images have an NMI of 2. However, many implementations (including our own) shift this to the more intuitive range [0, 1] by subtracting unity (Feixas et al., 2014). In either case, the NMI is large when knowing the value of some element in A allows one to reliably predict the value of the corresponding element in B. As might be expected, an image has maximum NMI with a perfect copy of itself. Less intuitively, adding a constant offset to an image does not affect its NMI with any other image. In fact, NMI is insensitive to any operation that maps similar ranges to other similar ranges. For this reason, it was developed for use in aligning biomedical images, where two images of the same tissue, taken using different modalities, might appear drastically different yet have the same underlying structure (Studholme et al., 1999). In the present context this is a desirable feature, since it allows us to compare images that may have different actual intensities, and it does not favor high-intensity over low-intensity regions.
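The behavior described above can be checked numerically with a short NumPy sketch. The function `nmi`, its `shifted` flag, the natural logarithm, and the synthetic lognormal arrays are hypothetical choices made for illustration and are not the implementation used in this work. Bin edges are assumed to be log-spaced over each array's own range, so the inputs must be strictly positive. With that particular binning, rescaling or applying a power law to an array maps bins onto bins, which makes those two operations convenient examples of the insensitivity discussed above.

```python
import numpy as np


def _entropy(counts):
    """Shannon entropy (natural log) of a histogram given as raw counts."""
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log(p))


def nmi(a, b, n_bins=100, shifted=True):
    """Normalized mutual information, (H(A) + H(B)) / H(A, B).

    With shifted=True, unity is subtracted so that the result lies in [0, 1].
    Assumption: a and b are strictly positive and have the same shape.
    """
    a, b = np.ravel(a), np.ravel(b)
    edges_a = np.geomspace(a.min(), a.max(), n_bins + 1)
    edges_b = np.geomspace(b.min(), b.max(), n_bins + 1)
    joint, _, _ = np.histogram2d(a, b, bins=[edges_a, edges_b])
    h_a = _entropy(joint.sum(axis=1))  # marginal histogram of A
    h_b = _entropy(joint.sum(axis=0))  # marginal histogram of B
    value = (h_a + h_b) / _entropy(joint)
    return value - 1.0 if shifted else value


rng = np.random.default_rng(0)
A = rng.lognormal(size=(32, 32, 32))  # synthetic stand-in for an amplitude array
B = rng.lognormal(size=(32, 32, 32))  # independent of A

print("copy of itself:       ", nmi(A, A))
print("rescaled power law:   ", nmi(A, 5.0 * A**2))
print("independent array:    ", nmi(A, B))
print("independent, rescaled:", nmi(A, 3.0 * B))
```

The value for independent arrays is small but not exactly zero, because a histogram estimate from a finite number of samples always registers some chance correlation between sparsely populated bins.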

Acknowledgements

We thank the Brigham Young University Microscopy Laboratory and BYU Integrated Microfabrication Laboratory for help with the electron microscopy work and nanofabrication. We also thank the developers of NumPy (Harris et al., 2020), SciPy (Virtanen et al., 2020), Matplotlib (Hunter, 2007), CuPy (Okuta et al., 2017), Mayavi (Ramachandran & Varoquaux, 2011), ParaView (Fabian et al., 2011) and Vedo (Musy et al., 2025) for creating many of the computational tools used in this work.

Conflict of interest

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are available as supporting information to this article via Zenodo at https://doi.org/10.5281/zenodo.17503854.

Funding information

We acknowledge funding from the US Department of Energy (DOE), Office of Science, Basic Energy Sciences (award No. DE-SC0019096 and award No. DE-SC0022133); from the US Steel Chair of Metallurgical Engineering and Materials Science at Carnegie Mellon University; from the Advanced Photon Source, a US DOE Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory (contract No. DE-AC02-06CH11357); and from the College of Physical and Mathematical Sciences at Brigham Young University. The contribution of Stephan Hruszkewycz was supported by the US DOE, Office of Science, Basic Energy Sciences, Materials Science and Engineering Division. The contributions of Yuan Gao and Garth Williams were supported by the National Synchrotron Light Source II, a US DOE Office of Science User Facility operated for the DOE Office of Science by Brookhaven National Laboratory under contract No. DE-SC0012704.

References

Cherukara, M. J., Pokharel, R., O'Leary, T. S., Baldwin, J. K., Maxey, E., Cha, W., Maser, J., Harder, R. J., Fensin, S. J. & Sandberg, R. L. (2018). Nat. Commun. 9, 3776.
Fabian, N., Moreland, K., Thompson, D., Bauer, A. C., Marion, P., Gevecik, B., Rasquin, M. & Jansen, K. E. (2011). 2011 IEEE symposium on large data analysis and visualization, pp. 89–96. IEEE.
Favre-Nicolin, V., Mastropietro, F., Eymery, J., Camacho, D., Niquet, Y. M., Borg, B. M., Messing, M. E., Wernersson, L.-E., Algra, R. E., Bakkers, E. P. A. M., Metzger, T. H., Harder, R. & Robinson, I. K. (2010). New J. Phys. 12, 035013.
Feixas, M., Bardera, A., Rigan, J., Xu, Q. & Sbert, M. (2014). Information theory tools for image processing. Cham: Springer International Publishing.
Fienup, J. R. (1982). Appl. Opt. 21, 2758–2769.
Fienup, J. R. (1987). J. Opt. Soc. Am. A 4, 118–123.
Frosik, B., Harder, R. & Porter, J. N. (2024). Cohere, https://github.com/AdvancedPhotonSource/cohere
Gao, Y., Huang, X., Yan, H. & Williams, G. J. (2021). Phys. Rev. B 103, 014102.
Harder, R., Pfeifer, M. A., Williams, G. J., Vartaniants, I. A. & Robinson, I. K. (2007). Phys. Rev. B 76, 115425.
Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., del Río, J. F., Wiebe, M., Peterson, P., Gérard-Marchant, P., Sheppard, K., Reddy, T., Weckesser, W., Abbasi, H., Gohlke, C. & Oliphant, T. E. (2020). Nature 585, 357–362.
Hofmann, F., Phillips, N. W., Das, S., Karamched, P., Hughes, G. M., Douglas, J. O., Cha, W. & Liu, W. (2020). Phys. Rev. Mater. 4, 013801.
Hofmann, F., Tarleton, E., Harder, R. J., Phillips, N. W., Ma, P.-W., Clark, J. N., Robinson, I. K., Abbey, B., Liu, W. & Beck, C. E. (2017). Sci. Rep. 7, 45993.
Hruszkewycz, S. O., Maddali, S., Anderson, C. P., Cha, W., Miao, K. C., Highland, M. J., Ulvestad, A., Awschalom, D. D. & Heremans, F. J. (2018). Phys. Rev. Mater. 2, 086001.
Hunter, J. D. (2007). Comput. Sci. Eng. 9, 90–95.
Kawaguchi, T., Keller, T. F., Runge, H., Gelisio, L., Seitz, C., Kim, Y. Y., Maxey, E. R., Cha, W., Ulvestad, A., Hruszkewycz, S. O., Harder, R., Vartanyants, I. A., Stierle, A. & You, H. (2019). Phys. Rev. Lett. 123, 246001.
Maddali, S., Frazer, T. D., Delegan, N., Harmon, K. J., Sullivan, S. E., Allain, M., Cha, W., Dibos, A., Poudyal, I., Kandel, S., Nashed, Y. S. G., Heremans, F. J., You, H., Cao, Y. & Hruszkewycz, S. O. (2023). npj Comput. Mater. 9, 1–12.
Maddali, S., Li, P., Pateras, A., Timbie, D., Delegan, N., Crook, A. L., Lee, H., Calvo-Almazan, I., Sheyfer, D., Cha, W., Heremans, F. J., Awschalom, D. D., Chamard, V., Allain, M. & Hruszkewycz, S. O. (2020). J. Appl. Cryst. 53, 393–403.
Marchesini, S., He, H., Chapman, H. N., Hau-Riege, S. P., Noy, A., Howells, M. R., Weierstall, U. & Spence, J. C. H. (2003). Phys. Rev. B 68, 140101.
Miao, J., Ishikawa, T., Robinson, I. K. & Murnane, M. M. (2015). Science 348, 530–535.
Musy, M., Jacquenot, G., Dalmasso, G., Lee, J., Pujol, L., Soltwedel, J., de Bruin, R., Zhou, Z.-Q., Tulldahl, M., Poisonous, CorpsSansOrganes, RobinEnjalbert, Sol, A., Lu, X. S. U. E., Codacy Badger, Kunimune, J., Claudi, F., Hacha, B., Lee, A., Pollack, A., Schneider, O., daizhirui, RichardScottOZ, Mitrano, P., Brodersen, P., Schlömer, N., mkerrinrapid, Linus-Foley & JohnsWor (2025). Vedo, https://zenodo.org/doi/10.5281/zenodo.2561401
Newton, M. C. (2020). Phys. Rev. B 102, 014104.
Newton, M. C., Leake, S. J., Harder, R. & Robinson, I. K. (2010). Nat. Mater. 9, 120–124.
Okuta, R., Unno, Y., Nishino, D., Hido, S. & Loomis, C. (2017). Proceedings of workshop on machine learning systems (LearningSys), p. 7, Long Beach, California, USA.
Patterson, A. L. (1939). Phys. Rev. 56, 972–977.
Pfeifer, M. A., Williams, G. J., Vartanyants, I. A., Harder, R. & Robinson, I. K. (2006). Nature 442, 63–66.
Ramachandran, P. & Varoquaux, G. (2011). Comput. Sci. Eng. 13, 40–51.
Robinson, I. & Harder, R. (2009). Nat. Mater. 8, 291–298.
Shannon, C. E. (1948). Bell Syst. Tech. J. 27, 379–423.
Singer, A., Zhang, M., Hy, S., Cela, D., Fang, C., Wynn, T. A., Qiu, B., Xia, Y., Liu, Z., Ulvestad, A., Hua, N., Wingert, J., Liu, H., Sprung, M., Zozulya, A. V., Maxey, E., Harder, R., Meng, Y. S. & Shpyrko, O. G. (2018). Nat. Energy 3, 641–647.
Studholme, C., Hill, D. L. G. & Hawkes, D. J. (1999). Pattern Recognit. 32, 71–86.
Ulvestad, A., Sasikumar, K., Kim, J. W., Harder, R., Maxey, E., Clark, J. N., Narayanan, B., Deshmukh, S. A., Ferrier, N., Mulvaney, P., Sankaranarayanan, S. K. R. S. & Shpyrko, O. G. (2016). J. Phys. Chem. Lett. 7, 3008–3013.
Ulvestad, A., Welland, M. J., Cha, W., Liu, Y., Kim, J. W., Harder, R., Maxey, E., Clark, J. N., Highland, M. J., You, H., Zapol, P., Hruszkewycz, S. O. & Stephenson, G. B. (2017). Nat. Mater. 16, 565–571.
Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Millman, K. J., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., Carey, C. J., Polat, İ., Feng, Y., Moore, E. W., VanderPlas, J., Laxalde, D., Perktold, J., Cimrman, R., Henriksen, I., Quintero, E. A., Harris, C. R., Archibald, A. M., Ribeiro, A. H., Pedregosa, F., van Mulbregt, P., Vijaykumar, A., Bardelli, A. P., Rothberg, A., Hilboll, A., Kloeckner, A., Scopatz, A., Lee, A., Rokem, A., Woods, C. N., Fulton, C., Masson, C., Häggström, C., Fitzgerald, C., Nicholson, D. A., Hagen, D. R., Pasechnik, D. V., Olivetti, E., Martin, E., Wieser, E., Silva, F., Lenders, F., Wilhelm, F., Young, G., Price, G. A., Ingold, G., Allen, G. E., Lee, G. R., Audren, H., Probst, I., Dietrich, J. P., Silterra, J., Webber, J. T., Slavič, J., Nothman, J., Buchner, J., Kulick, J., Schönberger, J. L., de Miranda Cardoso, J. V., Reimer, J., Harrington, J., Rodríguez, J. L. C., Nunez-Iglesias, J., Kuczynski, J., Tritz, K., Thoma, M., Newville, M., Kümmerer, M., Bolingbroke, M., Tartre, M., Pak, M., Smith, N. J., Nowaczyk, N., Shebanov, N., Pavlyk, O., Brodtkorb, P. A., Lee, P., McGibbon, R. T., Feldbauer, R., Lewis, S., Tygier, S., Sievert, S., Vigna, S., Peterson, S., More, S., Pudlik, T., Oshima, T., Pingel, T. J., Robitaille, T. P., Spura, T., Jones, T. R., Cera, T., Leslie, T., Zito, T., Krauss, T., Upadhyay, U., Halchenko, Y. O. & Vázquez-Baeza, Y. (2020). Nat. Methods 17, 261–272.
Wilkin, M. J., Maddali, S., Hruszkewycz, S. O., Pateras, A., Sandberg, R. L., Harder, R., Cha, W., Suter, R. M. & Rollett, A. D. (2021). Phys. Rev. B 103, 214103.
Williams, G. J., Pfeifer, M. A., Vartanyants, I. A. & Robinson, I. K. (2003). Phys. Rev. Lett. 90, 175501.
Yang, W., Huang, X., Harder, R., Clark, J. N., Robinson, I. K. & Mao, H. (2013). Nat. Commun. 4, 1680.
Yau, A., Cha, W., Kanan, M. W., Stephenson, G. B. & Ulvestad, A. (2017). Science 356, 739–742.

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
