Translative lens-based full-field coherent X-ray imaging

A description and simulation of a full-field coherent imaging approach suitable for hard X-rays based on a classical (i.e. Galilean) X-ray microscope.


1. Introduction
Lens-based full-field X-ray microscopy, in which an objective lens between the object and detector creates a magnified image of the object, offers the possibility of imaging extended objects in a single acquisition. As such, it is well suited for investigating dynamic processes, such as in materials (Snigireva et al., 2018), chemical reactions (Meirer et al., 2011) and biological systems (Meyer-Ilse et al., 2001). However, the spatial resolution of a lens-based full-field microscope is physically limited by the finite numerical aperture (NA) of its objective lens, which tends to be small (0.01 or less) at hard X-ray energies (E > 15 keV). Recent developments in X-ray optics have yielded substantial improvements in NA (Schroer & Lengeler, 2005; Morgan et al., 2015; Mohacsi et al., 2017; Matsuyama et al., 2019), but often at the cost of reducing the working distance to an impractical degree.
Synthetic aperture microscopy offers an alternative route to increasing the NA. One such approach is Fourier ptychographic microscopy (FPM), which involves combining a series of low-resolution intensity images in Fourier space and subsequently back-propagating to the object plane to recover the exit surface complex wavefield. Varying the angle of the incident full-field illumination samples a wider range of scattering directions, thus improving the space-bandwidth product without the need to move the sample, objective lens or detector (Lohmann et al., 1996). FPM's image recovery procedure therefore differs from that of conventional X-ray ptychography (see, for example, Rodenburg & Bates, 1992; Rodenburg et al., 2007; Thibault et al., 2008; Maiden & Rodenburg, 2009; Dierolf et al., 2010; Maiden et al., 2010; Humphry et al., 2012) in that the object support constraints are imposed in Fourier space rather than real space. The original implementation of FPM used a conventional optical microscope (i.e. visible light) with a small magnification (2× objective) and NA (0.08) to achieve a synthetic NA of 0.5, resulting in a spatial resolution comparable with that of a 20× objective while maintaining the much larger field of view and depth of field of the original low-magnification configuration.
ISSN 1600-5775
Adapting FPM to the X-ray regime could potentially address two key shortcomings of lens-based full-field X-ray microscopy: the compound image corresponds to a larger, synthetic NA, while digital wavefront correction may be used during the image recovery procedure to compensate for lens aberrations, which may be appreciable (Koch et al., 2016). Furthermore, as this recovery procedure yields a complex image, one could exploit the phase contrast to dramatically increase sensitivity to weakly interacting objects (Förster et al., 1980; Snigirev et al., 1995; Cloetens et al., 1997). A practical X-ray implementation of FPM must differ subtly from the original approach, however, since at large-scale facilities (e.g. synchrotrons) one cannot directly rotate the incident beam angle in the manner originally proposed by Zheng et al. (2013). Very recently, Wakonig et al. (2019) experimentally demonstrated FPM in the X-ray regime by moving a pinhole positioned at the aperture of a condensing lens, thus steering the incident beam angle at the sample position. Here we describe an alternative approach, in which we instead move the lens, collecting images of various overlapping regions. Crucially, the lens and detector are moved transversely to the optical axis in order to avoid the mechanical complexity and imprecision associated with coupled translations/rotations. This approach is conceptually similar to downstream pinhole-scanning methods (e.g. Tsai et al., 2016; Guizar-Sicairos & Fienup, 2008), albeit using a focusing lens instead of a pinhole.
This paper sets out the theory and methodology of lens translation imaging (LTI) and is structured as follows. Section 2 describes the image formation problem using the mathematical formalism of scalar coherent wavefield propagation. The LTI image acquisition method is described in Section 3, accompanied by numerical simulation examples. In Section 4 we detail the iterative phase-retrieval process that reconstructs the wavefield at the exit surface of the imaged object (see Fig. 1), and present results from numerical simulations.
2. Theory of image formation (forward problem)
Fig. 1 illustrates the LTI imaging configuration. Monochromatic coherent X-ray plane-wave illumination with wavelength $\lambda$ propagates rightwards along the optical axis $z$. The incident radiation traverses an object or specimen, and the intensity and phase changes incurred are imprinted on the complex wavefield $\Psi_{\mathrm{obj}}$ exiting the object.
This exit field then propagates downstream a distance $z_1$, reaching the entry plane of an optically thin converging lens with finite aperture size and focal length $f$, where it is denoted $\Psi_{\mathrm{in}}$. The field transmitted through the lens, $\Psi_{\mathrm{out}}$, propagates a further distance $z_2$ to give $\Psi_{\mathrm{det}}$ at the plane of a spatially sensitive detector, which measures the square modulus (intensity) $|\Psi_{\mathrm{det}}|^2$ of the wavefield and thus excludes all phase information.
We derive an analytical expression for the wavefield $\Psi_{\mathrm{det}}$ using the linear operator theory of imaging (Nazarathy & Shamir, 1980). The formalism treats the propagation and passage of optical wavefields through a system as a linear operator acting on some input to yield an associated output. This enables the problem in Fig. 1 to be treated in a 'cascading' approach, resulting in the following expression for the wavefield at the detector plane,

$$\Psi_{\mathrm{det}}(r_2) = P_{z_2}\, T(r_1 - s_n)\, P_{z_1}\, \Psi_{\mathrm{obj}}(r_0). \tag{1}$$

Here, $r_0 = (x_0, y_0)$, $r_1 = (x_1, y_1)$ and $r_2 = (x_2, y_2)$ are the Cartesian coordinates normal to $z$ corresponding to the object ($\Psi_{\mathrm{obj}}$), lens ($\Psi_{\mathrm{in}}$ and $\Psi_{\mathrm{out}}$) and detector plane ($\Psi_{\mathrm{det}}$), respectively. Note that the x and y axes at each plane are assumed to be parallel to each other. $T(r_1 - s_n)$ is the transmission function of the lens, where the vector $s_n = (s_x, s_y)$ with magnitude $|s_n|$ represents the position of the center of the lens in the $r_1$ plane. The integer index n labels the lens position.

The operator $P_z$, which forward propagates a complex wavefield by a distance z, is the operator form of the Fresnel diffraction integral (Goodman, 2005; Paganin, 2006; Born & Wolf, 1999), given in equation (2). This operator maps a field in a plane defined by the coordinates $r_j = (x_j, y_j)$ onto a propagated field defined by the coordinates $r_{j+1} = (x_{j+1}, y_{j+1})$, where j takes on non-negative integer values (j = 0, 1). For example, j = 0 corresponds to the field mapping $\Psi(r_0) \to \Psi(r_1)$. The operation acts from right to left as follows: (i) multiply the input wavefield by a quadratic phase factor in $r_j$; (ii) take the 'scaled' Fourier transform, which projects a complex function from $r_j$ to $r_{j+1}$; (iii) multiply the result by the quadratic phase factor in $r_{j+1}$ and the constant complex phase shift set by z. The Fourier transform convention used here is that of equation (3). We use the term 'scaled' because this Fourier transform differs slightly from a conventional Fourier transform in that it maps a complex function from real space onto another complex function, also in real space, that is re-sampled by a scale factor $1/\lambda z$ (Goodman, 2005; Paganin, 2006).

Figure 1. Schematic of the lens translation imaging (LTI) setup.

The complex transmission function of the lens, $T(r_1 - s_n)$, can be decomposed into four separate functions corresponding to the phase shift $Q^{r_1}_{f}$, the absorption $A^{r_1}_{s_n}$, the lens aberration and the masking $H^{r_1}_{s_n}$ (due to the finite aperture size) of the transmitted wavefield [equation (4)]. The complex function $Q^{r_1}_{f}$ quantifies the phase shifts imparted by the lens, assumed to be thin, on the entering wavefield $\Psi_{\mathrm{in}}$ (Paganin, 2006). These phase shifts are consistent with the condition needed to create focused fields, where the phase exiting the surface of the lens must be such that a spherical wave is collapsed towards a point (Paganin, 2006). The amplitude attenuation suffered by $\Psi_{\mathrm{in}}$ is determined by the function $A^{r_1}_{s_n}$. It is important to note that in this section the amplitude function is kept arbitrary in order to accommodate the various types of X-ray focusing elements that exist. However, for the simulation shown in Section 3 this function takes the form of a Gaussian distribution corresponding to refractive optics such as compound refractive lenses (CRLs) (Simons et al., 2017). $H^{r_1}_{s_n}$ represents the finite aperture size of the lens and serves to transmit only the spatial frequencies of the wavefield $\Psi_{\mathrm{in}}$ within the radius of the physical aperture (Simons et al., 2017).
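The three-step action of $P_z$ described above can be sketched numerically as a single-FFT Fresnel propagator. This is a minimal sketch rather than the authors' implementation: it assumes the sign and prefactor conventions of Goodman (2005), a square grid and forward propagation only (z > 0). The function name `fresnel_propagate` and its arguments are hypothetical.

```python
import numpy as np

def fresnel_propagate(psi, wavelength, z, dx):
    """Single-FFT Fresnel propagation of a square field psi (n x n).

    Assumed convention (Goodman, 2005). Returns the propagated field and
    the pixel size in the destination plane, wavelength * z / (n * dx).
    """
    if z <= 0:
        raise ValueError("this sketch only handles forward propagation (z > 0)")
    n = psi.shape[0]
    k = 2 * np.pi / wavelength
    # Source-plane coordinates r_j, centered on the optical axis.
    x = (np.arange(n) - n // 2) * dx
    xx, yy = np.meshgrid(x, x)
    # Step (i): multiply by the quadratic phase factor in r_j.
    field = psi * np.exp(1j * k * (xx**2 + yy**2) / (2 * z))
    # Step (ii): 'scaled' Fourier transform; the FFT output is sampled at
    # r_{j+1} = wavelength * z * (spatial frequency).
    field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(field))) * dx**2
    dx_out = wavelength * z / (n * dx)
    u = (np.arange(n) - n // 2) * dx_out
    uu, vv = np.meshgrid(u, u)
    # Step (iii): quadratic phase in r_{j+1} and the constant phase set by z.
    prefactor = np.exp(1j * k * z) / (1j * wavelength * z)
    out = prefactor * np.exp(1j * k * (uu**2 + vv**2) / (2 * z)) * field
    return out, dx_out
```

Because the output grid is rescaled by $\lambda z$, the pixel size changes between planes. For example, assuming a 512-pixel grid with $\lambda$ = 0.75 Å, z = 0.264 m and a 50 nm input pixel, the returned pixel size is $\lambda z/(N\,\Delta x) \approx 0.77$ µm. Back propagation (negative z) would need the inverse transform and is not covered by this sketch.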
The aberration function of the lens depends on $r_1$ and is characterized by a set of aberration coefficients indexed by p and q, which are non-negative integers representing the aberration order (Born & Wolf, 1999). In the case of an ideal thin lens, as is considered in this study, we assume that no aberrations are present, i.e. the aberration function is zero. Note, however, that in practical settings these lens aberrations need to be either corrected using aberration balancing techniques or iteratively refined in the phase-retrieval process, similar to the determination of the illumination function in classical ptychography. This, however, is beyond the scope of this work.
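The decomposition of the lens transmission function into phase, absorption and aperture factors can be illustrated with a short numerical sketch. Several choices here are assumptions rather than the authors' definitions: aberrations are set to zero, the thin-lens phase $Q$ is taken as $\exp(-i\pi|r_1|^2/\lambda f)$ centered on the optical axis (as the notation $Q^{r_1}_{f}$ suggests), and the Gaussian attenuation $A$ and hard circular aperture $H$ are centered on the lens position $s_n$. The function name and arguments are hypothetical.

```python
import numpy as np

def lens_transmission(n, dx, wavelength, f, s_n, sigma, r_phys):
    """Sketch of T = Q * A * H on an n x n grid with pixel size dx (metres).

    Assumptions: zero aberrations; Q centered on the optical axis;
    A (Gaussian) and H (binary circular mask) centered on s_n = (s_x, s_y).
    """
    x = (np.arange(n) - n // 2) * dx
    xx, yy = np.meshgrid(x, x)
    # Q: converging quadratic phase of a thin lens with focal length f.
    q = np.exp(-1j * np.pi * (xx**2 + yy**2) / (wavelength * f))
    # Squared distance from the translated lens center.
    d2 = (xx - s_n[0])**2 + (yy - s_n[1])**2
    a = np.exp(-d2 / (2 * sigma**2))       # A: Gaussian amplitude attenuation
    h = (d2 <= r_phys**2).astype(float)    # H: finite physical aperture
    return q * a * h
```

Keeping the three factors separate makes it straightforward to swap in a different amplitude function for other focusing optics, or to multiply in an aberration phase term if one is to be refined.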
Given that all terms and symbols have been defined in operator form, equation (1) can now be written out explicitly as equation (5). Equation (5) is generally applicable to all systems in which a lens is placed between the object and detector. This includes two special cases.
(i) The Fourier transforming condition, where the object is placed very close to the lens plane ($z_1 \to 0$) such that $\Psi_{\mathrm{in}} = \Psi_{\mathrm{obj}}$, the detector is placed in the back focal plane ($z_2 = f$) and the lens is completely transparent ($A^{r_1}_{s_n} = 1$). In this configuration, the measured intensity of the 'focused field' becomes the squared modulus of the Fourier transform of the object field (i.e. $|\Psi_{\mathrm{det}}|^2 \propto |\mathcal{F}\{\Psi_{\mathrm{obj}}\}|^2$). This results in a variation of coherent diffraction imaging in which the lens is used to reduce the large propagation distances otherwise necessary to achieve far-field diffraction patterns in the short-wavelength regime (Quiney et al., 2006).
(ii) The imaging condition, considered in this study, where the object, lens and detector are placed according to the thin lens formula,

$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}. \tag{6}$$

In this condition the term $Q^{r_1}_{f,z_1,z_2}$ becomes unity, substantially simplifying equation (5) in the context of the forward problem. More importantly, the detected image will resemble an inverted version of the object's exit surface intensity ($I_{\mathrm{obj}}$), a valuable asset that will be exploited in the inverse problem described in Section 4.
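To see why the lens-plane quadratic phase term drops out under the imaging condition, one can collect the quadratic phase factors in $r_1$ that meet at the lens. The following is a sketch under assumed conventions (Goodman-type Fresnel phases and a thin-lens phase $\exp(-i\pi|r_1|^2/\lambda f)$ centered on the optical axis); the explicit form is an assumption and is not taken from the authors' equation (5).

```latex
% Quadratic phases at the lens plane: exit phase of P_{z_1},
% thin-lens phase, and entry phase of P_{z_2}.
Q^{r_1}_{f,z_1,z_2}
  = \exp\!\left(\frac{i\pi |r_1|^2}{\lambda z_1}\right)
    \exp\!\left(-\frac{i\pi |r_1|^2}{\lambda f}\right)
    \exp\!\left(\frac{i\pi |r_1|^2}{\lambda z_2}\right)
  = \exp\!\left[\frac{i\pi |r_1|^2}{\lambda}
      \left(\frac{1}{z_1} + \frac{1}{z_2} - \frac{1}{f}\right)\right]
  = 1
  \quad\text{when}\quad
  \frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}.
```

With this factor equal to unity, the lens plane carries only the amplitude and aperture terms, so the detector field becomes a band-limited, inverted and magnified copy of the object field, up to residual phase factors that do not affect the measured intensity.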

3. Methodology of LTI
This section describes the image acquisition method for LTI. Returning to the lens plane $r_1$ in Fig. 1, one sees that the finite size of the lens aperture means that a single $|\Psi_{\mathrm{det}}|^2$ measurement will only register information corresponding to a limited region of $|\Psi_{\mathrm{in}}|^2$. Therefore, acquiring multiple $|\Psi_{\mathrm{det}}|^2$ measurements at different lens translation positions is essential if one wishes to record a larger portion of the spatial frequency data (real and complex) and thereby improve the spatial resolution of the compound image. Fig. 2 depicts a flow-chart for the LTI methodology, in which a complex test image ($\Psi_{\mathrm{obj}}$) is successively forward propagated to the lens plane ($\Psi_{\mathrm{in}}$ and $\Psi_{\mathrm{out}}$) and to the detector plane ($\Psi_{\mathrm{det}}$). Importantly, the schematic embodies the key idea behind LTI, namely that the lens is translated to different positions $s_n$ such that several overlapping areas of $\Psi_{\mathrm{in}}$ are imaged at the detector.
The forward simulations shown in Fig. 2 were chosen to be representative of a typical full-field transmission X-ray microscope operating at hard X-ray energies (Snigireva et al., 2018), with an X-ray magnification of approximately 18×. The complex object wavefield [i.e. $\Psi_{\mathrm{obj}} = A_{\mathrm{obj}} \exp(i\phi_{\mathrm{obj}})$, see far left of Fig. 2] consisted of standard test images of a mandrill and peppers for the amplitude and phase, respectively. The physical size of this wavefield was 25.6 µm (W) × 25.6 µm (H), and the wavelength was chosen to be $\lambda$ = 0.75 Å, corresponding to a photon energy of 16.5 keV. This field is propagated by a distance $z_1$ = 0.264 m, where the output intensity $|\Psi_{\mathrm{in}}|^2$ lies in the far-field regime according to the Fresnel number $N_f = (\Delta x)^2/(\lambda z_1) \simeq 10^{-4}$ for the given object pixel size of $\Delta x$ = 50 nm (not to be confused with the detector pixel size of ~0.9 µm). For a focal length f = 0.25 m the detector distance will be $z_2$ = 4.736 m to satisfy the condition in equation (6), yielding a geometrical magnification of M = 17.9. To approximate experimental conditions, the detected intensity images incorporated Poisson noise, which varied from approximately 1.5 to 20% from the central to the outermost lens position. We note that the signal-to-noise ratio is expected to decrease as the corresponding image intensity decreases towards the lens positions most distant from the central axis, $(x_1, y_1) = (0, 0)$. The attenuation properties of the lens were approximated by an apodized Gaussian distribution,

$$A^{r_1}_{s_n} = \exp\left(-\frac{|r_1 - s_n|^2}{2\sigma^2}\right), \tag{7}$$

with a standard deviation of $\sigma$ = 25 µm and a physical aperture of radius $r_{\mathrm{phys}}$ = 75 µm, which is typical for commercially produced two-dimensional Be-based CRLs with this focal length and energy (Simons et al., 2017). The far right of Fig. 2 shows two examples of simulated detected intensity images, labeled as corresponding to lens positions $s_1$ and $s_n$.
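The derived quantities quoted for the simulation can be cross-checked with a few lines of arithmetic. This is a minimal sketch using only the rounded input values given in the text; the variable names are hypothetical and the value of hc is rounded.

```python
# Input values quoted in the text (SI units).
wavelength = 0.75e-10   # m (0.75 Angstrom)
z1 = 0.264              # object-to-lens distance, m
f = 0.25                # focal length, m
dx = 50e-9              # object pixel size, m

HC_KEV_ANGSTROM = 12.398  # h*c in keV * Angstrom (rounded)

energy_kev = HC_KEV_ANGSTROM / (wavelength * 1e10)
fresnel_number = dx**2 / (wavelength * z1)
z2 = 1.0 / (1.0 / f - 1.0 / z1)       # thin lens formula solved for z2
magnification = z2 / z1
detector_pixel = magnification * dx   # object pixel projected onto the detector

print(f"photon energy  : {energy_kev:.2f} keV")
print(f"Fresnel number : {fresnel_number:.2e}")
print(f"z2             : {z2:.3f} m")
print(f"magnification  : {magnification:.2f}")
print(f"detector pixel : {detector_pixel * 1e6:.3f} um")
```

With these rounded inputs the thin lens formula gives $z_2 \approx 4.71$ m and $M \approx 17.9$, which is within about 0.5% of the quoted $z_2$ = 4.736 m; the small difference may reflect rounding of the quoted f or $z_1$. The projected pixel size of about 0.89 µm agrees with the quoted detector pixel size of ~0.9 µm.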
As equation (5) predicts, the intensity corresponding to the central axis position, $|\Psi^{s_1}_{\mathrm{det}}|^2$, is approximately an inversion of $I_{\mathrm{obj}}$. The resolution, however, is considerably poorer owing to the masking of high spatial frequencies outside the aperture of $T_1$. Furthermore, residual features from sharp gradients in the phase map are visible as mild intensity variations in this centered image (Zernike, 1942). The off-centered image $|\Psi^{s_n}_{\mathrm{det}}|^2$ corresponds to a region of $\Psi^{s_n}_{\mathrm{out}}$ with mostly higher spatial frequency data. The intensity variations in this image thus reveal a higher fraction of morphological detail associated with the phase map, with visible features similar to those seen in differential interference contrast images (Kaulich et al., 2002; Ou et al., 2013). This type of contrast is typical of images obtained using X-ray imaging techniques such as diffraction-enhanced imaging (Förster et al., 1980), grating-based interferometry (Pfeiffer et al., 2006) and speckle-based phase contrast (Morgan et al., 2012; Bérujon et al., 2012).

4. Iterative phase-retrieval (inverse problem)
The iterative phase-retrieval procedure aims to recover the object wavefield [$\Psi_{\mathrm{obj}} = A_{\mathrm{obj}} \exp(i\phi_{\mathrm{obj}})$] from a series of spatially overlapping translated images, each of which contains only amplitude information.
We explain the phase-retrieval procedure with the aid of Fig. 3. While only two $s_n$ positions are used for explanatory purposes, the procedure generalizes trivially to arbitrarily many positions (n > 2). The procedure (described here for the case of $s_1$) is as follows.
(i) Make an initial guess of the object wavefield, $\Psi^{\mathrm{guess}}_{\mathrm{obj}}$.
(ii) Forward propagate $\Psi^{\mathrm{guess}}_{\mathrm{obj}}$ by a distance $z_1$ using equation (2) to give $\Psi_{\mathrm{in}}$.
(iii) Multiply $\Psi_{\mathrm{in}}$ by the lens transmission function for the position, $T_1$, to give $\Psi^{s_1}_{\mathrm{out}}$.
(iv) Forward propagate $\Psi^{s_1}_{\mathrm{out}}$ by a distance $z_2$ to determine $\Psi^{s_1}_{\mathrm{det}}$.
(v) Replace the amplitude with the square root of the measured intensity at that position.
(vi) Back propagate by a distance $-z_2$ to give an updated $\Psi^{s_1}_{\mathrm{out}}$.
(vii) Use the updated $\Psi^{s_1}_{\mathrm{out}}$ to update $\Psi_{\mathrm{in}}$.
(viii) Move to the neighboring position $s_2$ and repeat steps (iii)-(vii).
This process [steps (i)-(viii)] is then carried out up to $s_n$ and repeated for N iterations or until the error metric $E_0$ has reached a minimum value. The error metric used here, equation (8), is that defined by Maiden & Rodenburg (2009), in which $\Psi^{N}_{\mathrm{rec}}(r_0)$ is the reconstructed object wavefield after a particular iteration N. The final step (ix) involves back propagating the Nth iteration of $\Psi^{N}_{\mathrm{in}}$ by a distance $-z_1$, thereby fully recovering $\Psi^{\mathrm{rec}}_{\mathrm{obj}}$. The reconstruction procedure was applied to a series of n = 25 intensity images calculated for 25 different lens positions, corresponding to a 2.38× increase in NA, with an average overlap of 80% of their physical aperture (radius 150 µm).

Figure 2. Forward simulations and an illustration of the LTI data acquisition process, from the input amplitude and phase (left), to the intensity at the lens plane (center), to the resulting intensity on the detector (far right). Amplitude and intensity images are scaled from 0 to 1, while phase images are scaled from $-\pi$ to $+\pi$. Note that the images include a padding area of zeros around their perimeter.
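Two ingredients of the procedure above lend themselves to a compact sketch: the amplitude replacement of step (v) and the ground-truth error metric attributed to Maiden & Rodenburg (2009). The form of the metric below follows the definition commonly quoted from that reference, a normalized squared difference after removing an arbitrary constant complex factor $\gamma$; it is an assumption that the authors' equation (8) has exactly this form. Function names are hypothetical.

```python
import numpy as np

def replace_amplitude(psi_det, intensity_meas):
    """Step (v): keep the phase of psi_det, impose the measured amplitude."""
    return np.sqrt(intensity_meas) * np.exp(1j * np.angle(psi_det))

def error_metric(psi_true, psi_rec):
    """Normalized error in the style of Maiden & Rodenburg (2009).

    gamma removes the arbitrary constant complex factor (for example a
    global phase offset) between the reconstruction and the ground truth.
    """
    gamma = np.sum(psi_true * np.conj(psi_rec)) / np.sum(np.abs(psi_rec) ** 2)
    num = np.sum(np.abs(psi_true - gamma * psi_rec) ** 2)
    return num / np.sum(np.abs(psi_true) ** 2)
```

The factor $\gamma$ matters because a reconstruction that differs from the true object only by a global phase or scale is physically equivalent and should score an error of zero.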
The centered intensity measurement was used as the initial guess of the object's amplitude, $|\Psi^{\mathrm{guess}}_{\mathrm{obj}}| = \sqrt{I^{s_1}}$. Three initial guesses of the phase were compared: (i) a flat phase; (ii) a random phase [$\Psi^{\mathrm{guess}}_{\mathrm{obj}} = \sqrt{I^{s_1}} \exp(i\phi_{\mathrm{random}})$]; (iii) a phase grid constructed using the relation between intensity and phase based on the arguments made by Paganin et al. (2002) [equation (9)], in which the ratio $\delta/\beta$ relates to the object's complex refractive index distribution $n_r = 1 - \delta + i\beta$. A key assumption of this relation is that the value of $\delta/\beta$ is constant throughout the object's volume, implying that the object is largely composed of a single material. To test the effectiveness of each guess, the reconstruction $\Psi^{\mathrm{rec}}_{\mathrm{obj}}$ was obtained after N = 1, 10, 100 and 1000 iterations. The respective results are shown in Fig. 4.
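The three starting guesses can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the central intensity image is assumed to be normalized to the incident intensity and already inverted and rescaled to the object grid, and the single-material phase is taken as $\phi = (\delta/2\beta)\ln I$, which follows from $I = \exp(-2k\beta t)$ and $\phi = -k\delta t$ for a homogeneous object of projected thickness t. The function name and arguments are hypothetical.

```python
import numpy as np

def initial_guess(intensity_center, kind, delta_over_beta=None, rng=None):
    """Build a complex starting guess from the central intensity image.

    kind: 'flat', 'random' or 'single_material'.
    intensity_center is assumed normalized to the incident intensity.
    """
    amplitude = np.sqrt(intensity_center)
    if kind == "flat":
        phase = np.zeros_like(amplitude)
    elif kind == "random":
        rng = np.random.default_rng() if rng is None else rng
        phase = rng.uniform(-np.pi, np.pi, size=amplitude.shape)
    elif kind == "single_material":
        if delta_over_beta is None:
            raise ValueError("delta_over_beta is required for this guess")
        # Guard against log(0) in fully absorbing or zero-padded pixels.
        safe = np.maximum(intensity_center, np.finfo(float).tiny)
        phase = 0.5 * delta_over_beta * np.log(safe)
    else:
        raise ValueError(f"unknown guess type: {kind}")
    return amplitude * np.exp(1j * phase)
```

All three guesses share the same amplitude, so any difference in convergence between them can be attributed to the phase alone.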
In all three cases, the amplitude reconstructions presented in Fig. 4 require fewer iterations to reach an acceptable solution in comparison with the phase. This is primarily due to the understandably close resemblance of the initial guess to the true amplitude of the exit wavefield at the object. We additionally note that the speckle-like noise pollution for the random phase guess below N = 100 is likely due to the strong phase gradients being manifested in the intensity, and disappears by N = 1000.
The choice of guess clearly has a decisive effect on the quality and convergence rate of the final phase reconstructions, shown in Fig. 5. Both the convergence plot [Fig. 5(a)] and the quantitative accuracy of the amplitude and phase [Figs. 5(b) and 5(c)] strongly favor the flat phase or single material assumption over the random phase guess. This is a somewhat surprising observation, as the test phase image (peppers) contained large variations over $2\pi$, including significant phase gradients. These large phase gradients appeared to cause some significant errors in the recovered phase related to phase wrapping, though in general the recovered phase is quantitatively similar to the phase of the original test image. More surprising, however, is the improved convergence rate of the single material assumption for N > 200, given that there was no correlation between the phase and amplitude of the test image, which undoubtedly violates the key premise of equation (9). This observation supports recent work by Gureyev et al. (2015), which showed that the single material assumption can extend to a broader class of samples without significant loss of generality. For further details, the reader is encouraged to refer to Gureyev et al. (2015).

Figure 3. Flow-chart illustrating the iterative phase-retrieval procedure used in LTI.

5. Discussion and conclusion
Lens translation imaging provides a practical approach to synthetically increasing the numerical aperture, space-bandwidth product and phase sensitivity of classical full-field hard X-ray microscopes. The methodology is described here in detail using coherent scalar wave optics theory to derive a generalized mathematical expression for the wavefield as it traverses the entire LTI system, and includes a formulation of the iterative phase-retrieval algorithm based on the popular ePIE algorithm.
In addition to providing the mathematical framework for developing simulation and image recovery code, the analytical expressions also provide valuable physical insights into the image contrast mechanism. In particular, we note that the forward simulations [based on equation (5)] suggest that the off-axis intensity images contain clear contrast in the form of differential phase contrast (DPC). By paying specific attention to the lens amplitude function $A^{r_1}_{s_n}$, we note that the term $\exp[(s_n \cdot r_1)/\sigma^2]$ arises once the squared binomial $|r_1 - s_n|^2$ is expanded. Taylor expanding this term to first order and then invoking the Fourier derivative theorem explains the origin of the DPC signal and why its contribution is proportional to the shift $|s_n|$. From this it becomes clear that this type of contrast is the same as that observed and studied in visible-light FPM setups (Ou et al., 2013).
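The argument above can be written out in a few lines. This is a sketch under assumptions: the Gaussian amplitude function has the form $\exp(-|r_1 - s_n|^2/2\sigma^2)$, the scaled Fourier transform uses the kernel $\exp[-2\pi i\, r_1 \cdot r_2/(\lambda z_2)]$, and $G(r_1)$ is shorthand (not the authors' notation) for the remaining lens-plane field.

```latex
% Expand the squared binomial in the Gaussian amplitude function
A^{r_1}_{s_n}
  = \exp\!\left(-\frac{|r_1|^2 + |s_n|^2}{2\sigma^2}\right)
    \exp\!\left(\frac{s_n \cdot r_1}{\sigma^2}\right)
  \approx \exp\!\left(-\frac{|r_1|^2 + |s_n|^2}{2\sigma^2}\right)
    \left[1 + \frac{s_n \cdot r_1}{\sigma^2}\right],
  \qquad \frac{|s_n \cdot r_1|}{\sigma^2} \ll 1.

% Fourier derivative theorem for the scaled transform to the detector plane
\mathcal{F}\{ r_1\, G(r_1) \}(r_2)
  = \frac{i \lambda z_2}{2\pi}\, \nabla_{r_2}\, \mathcal{F}\{ G(r_1) \}(r_2).

% Hence the detector field carries a gradient term along s_n
\mathcal{F}\!\left\{ \left[1 + \frac{s_n \cdot r_1}{\sigma^2}\right] G \right\}
  = \mathcal{F}\{G\}
    + \frac{i \lambda z_2}{2\pi \sigma^2}\; s_n \cdot \nabla_{r_2}\, \mathcal{F}\{G\}.
```

The second term is a directional derivative of the image field along $s_n$ whose weight grows linearly with $|s_n|$, which is the DPC-like contribution. The first-order expansion is only indicative for small shifts; for lens positions far from the axis, higher-order terms are not negligible.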
Realizing LTI in practice means that the compound image must be recovered without access to the true amplitude and phase maps that served as benchmarks for convergence here in equation (8). To this end, we recommend calculating the error metric relative to the measured images taken at the various lens positions via equation (10), in which $I^{s_n}_{\mathrm{meas}}(r_2)$ is the measured intensity for a given $s_n$ and $\sqrt{I^{s_n}_{N}(r_2)}$ is the square root of the calculated intensity at the same $s_n$ for a particular iteration N. This error metric is averaged over all lens positions.

Figure 4. Reconstruction of the object's wavefield performed with N = 1, 10, 100 and 1000 iterations.

Our study of the iterative phase recovery revealed the importance of the initial guess for the rate of convergence and the ultimate quality of the compound image. Like FPM, LTI has the significant advantage that the central image may be used as an accurate guess for the amplitude. However, although the single-material assumption offers some advantages over a constant phase guess, we believe there is still room for improvement. Given that elements of DPC are clearly present in the off-axis images, it seems intuitive that this information could be used to provide a more accurate guess of the object phase that would significantly improve the convergence rate.
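A measurement-based error metric of the kind recommended above can be sketched as follows. The exact form of the authors' equation (10) is not assumed here; this is one plausible choice, a normalized squared difference between measured and calculated amplitudes, averaged over lens positions. The function name and the array layout are hypothetical.

```python
import numpy as np

def intensity_error(meas_stack, calc_stack):
    """Mean normalized amplitude mismatch over all lens positions.

    meas_stack, calc_stack: arrays of shape (n_positions, ny, nx) holding
    the measured and calculated intensities for each lens position s_n.
    """
    meas_amp = np.sqrt(meas_stack)
    calc_amp = np.sqrt(calc_stack)
    # Per-position squared amplitude difference, normalized by the
    # total measured intensity at that position.
    num = np.sum((meas_amp - calc_amp) ** 2, axis=(1, 2))
    den = np.sum(meas_stack, axis=(1, 2))
    return float(np.mean(num / den))
```

Normalizing each position by its own measured intensity keeps the dim off-axis images from being swamped by the bright central image when the average is taken.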
As a full-field X-ray imaging technique, LTI might have the potential to offer a significantly reduced radiation dose rate (as opposed to accumulated dose) in comparison with scanning-probe methods, such as conventional X-ray ptychography. While the mechanisms of radiation damage vary greatly between specimens, many have clear dependencies on the radiation dose rate (Berejnov et al., 2018). In the typical cases of a scanning nanoprobe with a 200 nm probe diameter and a full-field microscope with a 200 µm diameter, one could anticipate a reduction in dose rate of six orders of magnitude. In the case of the latest generation of high-brilliance coherent X-ray sources (Eriksson et al., 2014), this reduction may be an important consideration. On the other hand, the attenuation of X-rays by the objective lens implies that LTI will have a higher accumulated radiation dose than conventional (i.e. 'lensless') X-ray ptychography for a given field of view and spatial resolution. However, we speculate that this disadvantage may be mitigated to an extent by reducing exposure times at small scattering angles, where the intensity is high.
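The six-orders-of-magnitude estimate follows from the ratio of illuminated areas. The short check below assumes, for illustration only, that the same photon flux is spread uniformly over a circular area in both cases; the variable names are hypothetical.

```python
import math

probe_diameter = 200e-9   # scanning nanoprobe diameter, m
field_diameter = 200e-6   # full-field illumination diameter, m

# For equal total flux, the dose rate scales inversely with illuminated
# area, i.e. with the square of the diameter ratio.
area_ratio = (field_diameter / probe_diameter) ** 2
orders_of_magnitude = math.log10(area_ratio)

print(f"dose-rate reduction: {area_ratio:.0e} ({orders_of_magnitude:.1f} orders)")
```

In practice the flux delivered to a nanoprobe and to a full-field illumination will differ, so the figure is an upper-level estimate of the geometric effect rather than a prediction for a specific beamline.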
Perhaps most importantly, however, LTI has the potential to be both intuitive and widely accessible at synchrotron beamlines. By retaining the 'what you see is what you get' character of full-field microscopy, it provides users with the ability to quickly and decisively design and perform measurements, and to seamlessly switch from fast overviews of the entire specimen (narrow scan range) to detailed inspections of individual elements (broad scan range). Because LTI is based on the classical Galilean geometry used by full-field microscopes around the world, it can also be implemented with little to no additional hardware. Given the imminent improvements in brilliance and coherence of synchrotron sources, we believe LTI could be a convenient, valuable and effective new tool for the broad spectrum of X-ray microscopists.