Why is my image noisy? A look into the terms contributing to a time-resolved X-ray microscopy image

Through Monte Carlo simulations, we investigate how various experimental parameters influence the quality of time-resolved scanning transmission X-ray microscopy images. In particular, we investigate the effects of the X-ray photon flux, of the thickness of the investigated samples, and of the frequency of the dynamical process under investigation on the resulting time-resolved image. The ideal sample and imaging conditions that allow for an optimal image quality are then identified.


Introduction
The interactions between the large number of atoms that constitute real-world material systems lead to a multiplicity of collective effects with spatial and temporal scales ranging from the nanoscopic to the mesoscopic and macroscopic. In condensed-matter physics, such effects include, for example, ferromagnetism and antiferromagnetism, ferroelectricity, and superconductivity. The characterization of these effects at the nanoscale is of vital importance for their understanding, and has received much attention from the research community in recent times (Wen et al., 2019).
Various experimental techniques have been developed to tackle the challenge of investigating complex systems at ultrasmall length scales and ultrafast timescales (Wen et al., 2019). A particular effort has been dedicated to the development of experimental methods and protocols able to combine high spatial and temporal resolutions into time-resolved (TR) imaging techniques. Amongst such techniques, pump-probe TR X-ray microscopy has proven itself, since its inception in the early 2000s (Choe et al., 2004; Van Waeyenberge et al., 2006), to be a very powerful technique for the investigation of dynamical processes exhibiting features at the sub-nanosecond and nanometre scales. The pump-probe microscopy technique allows for the imaging of repetitive dynamical processes by exciting the dynamical process through a periodic 'pump' signal (e.g. electrical pulse, RF signal, optical excitation, etc.) and probing the configuration of the sample through a periodic 'probing' beam (e.g. an X-ray pulse). The time separation between the pump and probe signals can be varied and, by determining the state of the sample at each delay, a time-resolved series can be reconstructed. A successful pump-probe investigation therefore requires a sample exhibiting a reproducible dynamical behavior, precise timing between the pump and probe pulses, a probing pulse shorter than the timescale of the dynamical process of interest, a contrast mechanism to allow for the probing of the dynamical sample configuration, and the absence of interactions between the probing pulse and the sample that might cause an unwanted dynamical response.
For some specific sample systems, the acquisition of high-quality time-resolved scanning transmission X-ray microscopy (TR-STXM) images can be challenging. Examples of challenging sample systems are ultrathin magnetic films (Baumgartner et al., 2017), where the low contrast requires long integration times to acquire sufficient statistics; processes occurring at frequencies of several GHz (Dieterle et al., 2019), where the width of the X-ray pulses generated by the synchrotron light source becomes comparable with the period of the excitation; and long-lived processes that nonetheless need to be investigated with a fine time resolution (Finizio et al., 2019b), where the high number of frames of the time-resolved image imposes long integration times to acquire sufficient statistics. Given that synchrotron beam time is a limited and costly resource, optimizing the steps leading to a TR-STXM image (from the design of the sample to the choice of the specific imaging parameters) should lead to an improved success rate. It is therefore useful to provide the users of TR-STXM imaging with a tool allowing them to simulate how the TR-STXM images of the process to be investigated would appear under experimental conditions. In this work, we provide such a tool, based on Monte Carlo simulations. With these simulations, we then identify which parameters affect the quality of the final TR-STXM image, and find the optimal imaging conditions.

Time-resolved imaging model
To understand what parameters influence the quality of a TR-STXM image, we modeled each possible contribution to the TR-STXM image and simulated them with a Monte Carlo approach. The model employed for the simulations is schematically depicted in Fig. 1. Four different systems contribute to the final TR-STXM image:
(1) Probe beam. In our case, the probing beam is generated by the synchrotron light source. Here, we model the generation of the single X-ray photons, i.e. the probability of whether a photon is generated in a given bunch, its polarization, and the moment at which it is generated.
(2) Sample. The X-ray photons generated by the synchrotron are focused onto a nanometric spot on the surface of the sample (which is then raster scanned to form an image), and interact with the magnetic material. The probability that an X-ray photon is transmitted is dependent on the absorption probability.
(3) Electronics. The goal of time-resolved imaging with the pump-probe protocol is to image a reproducible dynamical process triggered by an external excitation. This excitation is typically generated using high-frequency waveform generators synchronized to the master clock of the synchrotron light source. The precision of this locking will affect the recorded dynamics, and needs therefore to be modeled.
(4) Detector. The X-ray photons transmitted across the sample need to be detected, in order to determine the local absorption cross section of the sample. For TR-STXM imaging, a fast avalanche photodiode (APD) is employed as X-ray detector. The detection process affects the recorded TR-STXM images, and therefore needs to be considered in our model. In particular, the linearity of the APD and the inability of the APD to recognize multiphoton and higher-energy photon events (due to the fact that each X-ray photon can be absorbed in different regions of the APD) need to be considered.

Figure 1. Schematic overview of the model employed for the Monte Carlo simulations reported in this work. Four different contributions were identified, given by the probing beam (in our case, synchrotron plus beamline optics and the Fresnel zone plate used to focus the beam onto the sample), modeled as a photon generator with Poisson statistics, the sample, where the X-ray absorption from the magnetic material takes place, the electronic setup, generating the excitation signal that drives the dynamical process on the sample, and the APD, used to detect the X-ray photons transmitted across the sample. The variables used for each of those contributions are also marked in the figure.
In the next sections, the model employed to describe each of the contributions introduced above will be discussed.

Probe beam
A synchrotron light source operates as an intrinsically pulsed X-ray source, generating an X-ray pulse roughly every 2-10 ns, depending on the frequency of the radio-frequency (RF) cavities employed to restore the energy lost by the electrons due to the X-ray emission. For the Swiss Light Source, the RF cavities operate at a frequency of about $5 \times 10^{8}$ Hz, leading to the emission of an X-ray pulse about every 2 ns. Within a single electron bunch, the generation of the single X-ray photons is a stochastic process, where the generation probability is determined by quantum-mechanical and relativistic effects, whose detailed description lies well beyond the scope of this work. For the purpose of the model described here, i.e. in the absence of coherent synchrotron radiation, the photon generation can be described according to Poisson statistics, where the probability mass function $P_{\mathrm{ph}}(N)$ of generating $N$ photons from the electron bunch is given by the following relation,

$$P_{\mathrm{ph}}(N) = \frac{\lambda^{N} e^{-\lambda}}{N!}, \qquad (1)$$

with $\lambda$ being the mean number of X-ray photons generated by the electron bunch.
The emitted X-ray photons are passed through a beamline that uses a combination of mirrors, slits and gratings (or crystals) to reject photons outside of the desired photon energy range and coherently illuminate a Fresnel zone plate that focuses the beam onto the sample. For the purpose of the simulations presented in this work, we can describe the whole synchrotron and beamline up to the sample (see Fig. 1) as an X-ray photon generator with Poisson statistics, where the mean number of monochromatic X-ray photons generated per bunch is given by

$$\lambda = \frac{\Phi}{f_{\mathrm{rev}} N_{\mathrm{fill}}}, \qquad (2)$$

with $\Phi$ being the photon flux at the focal spot of the Fresnel zone plate, $f_{\mathrm{rev}}$ the revolution rate of the synchrotron (i.e. the frequency at which an electron bunch completes a full revolution of the storage ring), and $N_{\mathrm{fill}}$ the number of filled bunches stored in the ring. In the case of the Swiss Light Source, the revolution rate $f_{\mathrm{rev}}$ is $1.049 \times 10^{6}$ Hz, and the number of filled bunches $N_{\mathrm{fill}}$ is 420 (in the multibunch operation mode), out of a total of 480 available bunch positions. Assuming a typical zone plate for TR-STXM (Ir zone plate with an outermost zone width of 25-30 nm, diameter around 240 µm) with an efficiency of about 10% (Jefimovs et al., 2007), the typical photon fluxes $\Phi$ that can be expected after the zone plate for a soft X-ray bend-magnet beamline at the $L_3$ absorption edges of the transition metal ferromagnetic elements are on the order of $10^{6}$-$10^{8}$ photons s$^{-1}$ with a resolving power of 2000 (Raabe et al., 2008). This leads to a probability of a bunch generating a photon that passes through the beamline to illuminate the sample in the range $10^{-3}$-$10^{-1}$. In the case of an undulator-based beamline with typical photon fluxes before the zone plate on the order of $10^{11}$ photons s$^{-1}$ (resolving power > 5000) (Flechsig et al., 2010), photon fluxes on the order of $10^{8}$-$10^{10}$ photons s$^{-1}$ can be expected to illuminate the sample.
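As a quick numerical check of the orders of magnitude quoted above, the following minimal Python sketch evaluates the mean number of photons per filled bunch and the Poisson probability that a bunch delivers at least one photon. The function names are hypothetical and are not taken from the simulation scripts of this work; the default values of the revolution rate and of the number of filled bunches are the Swiss Light Source values given in the text.

```python
import math

def mean_photons_per_bunch(flux, f_rev=1.049e6, n_fill=420):
    """Mean number of photons per filled bunch: flux / (f_rev * n_fill).

    flux is in photons per second; the defaults are the Swiss Light Source
    values quoted in the text.
    """
    return flux / (f_rev * n_fill)

def prob_at_least_one_photon(lam):
    """Poisson probability of N >= 1 photons in a bunch: 1 - exp(-lam)."""
    return 1.0 - math.exp(-lam)

# Fluxes spanning the bend-magnet and undulator ranges mentioned above.
for flux in (1e6, 1e8, 1e10):
    lam = mean_photons_per_bunch(flux)
    p1 = prob_at_least_one_photon(lam)
    print(f"flux = {flux:.0e} photons/s: mean per bunch = {lam:.3g}, P(N >= 1) = {p1:.3g}")
```

For fluxes at the upper end of the undulator range the mean number of photons per bunch exceeds one, which is the regime where multi-photon events become frequent.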
Besides the photons at the fundamental energy, X-ray photons at energies corresponding to the higher diffraction orders of the grating monochromator can also illuminate the sample. This additional, unwanted, photon flux is described by the spectral purity of the X-ray beam and, in the case of the PolLux bend-magnet beamline of the Swiss Light Source, the higher-order contribution can reach 10-20% of the total photon flux (Flechsig et al., 2007). For an undulator beamline equipped with a plane-grating monochromator, the higher-order suppression will depend on the fixed-focus constant ($c_{\mathrm{ff}}$) of the monochromator, but a value on the order of 1-5% can be assumed for the higher-order contribution (Sawhney et al., 1997). These higher-order photons are not affected by the magnetic configuration of the sample, and are transmitted across the sample with minimal absorption. As the APD is not able to determine the energy of the X-ray photon (see Section 2.4 for more details), the contribution of the higher-order photons to the final TR-STXM image needs to be considered. Here, the higher-order photons are generated in the same way as the fundamental energy photons, where the probability mass function for the photon generation is given by

$$P_{\mathrm{ph},H}(N) = \frac{\lambda_H^{N} e^{-\lambda_H}}{N!}, \qquad (3)$$

with $\lambda_H = H\lambda$, where $\lambda$ is the mean number of fundamental energy photons per bunch and $H$ is defined as the higher-order fraction. The total photon flux will therefore be given by $\Phi_{\mathrm{tot}} = (1 + H)\Phi$.
A further parameter that has to be considered here is the temporal width of the X-ray pulses generated by the synchrotron light source. The amplitude as a function of time of the X-ray pulses generated by the synchrotron light source can be described with a Gaussian distribution. The time instant at which the photon is generated can be described using a Gaussian probability density function, given by the following relation,

$$P_{x}(t) = \frac{1}{\sigma_x \sqrt{2\pi}} \exp\left(-\frac{t^{2}}{2\sigma_x^{2}}\right), \qquad (4)$$

with $\sigma_x$ being the $1\sigma$ width of the X-ray pulse. In the case of the Swiss Light Source operating in the multibunch filling pattern with normal optics, the width of the X-ray pulses is about 70 ps full width at half-maximum (FWHM), corresponding to a $\sigma_x$ of about 30 ps. Therefore, for the simulations presented here, the synchrotron generates, for each filled bunch, a number of photons according to the probability mass functions given in equations (1) and (3). As the interaction of the fundamental energy and higher-order photons with the sample is different, the two cases are simulated separately, each with their defined photon fluxes. For each of the photons at the fundamental energy, the time at which the photon is generated is determined according to the probability density function given in equation (4). This will be used, together with the time-jitter term described in Section 2.3, to determine the time-of-arrival of the X-ray photon on the sample (in the time frame of the dynamical process being investigated). Finally, the polarization of each X-ray photon at the fundamental energy is determined depending on the fraction of circular light generated by the synchrotron. In the case of a bend-magnet beamline such as PolLux, about 60% of the generated photons are circularly polarized (when tilting the electron orbit by 200 µrad inside the bend magnet) (Raabe et al., 2008), while for an APPLE-II type undulator operating at the first harmonic in the soft X-ray regime nearly all of the generated photons are circularly polarized.
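The per-photon draws described above (emission time within the bunch and polarization) can be sketched as follows. This is an illustrative Python sketch rather than the Matlab code used for this work; `sample_photons` and its parameter names are hypothetical, and the defaults are the PolLux values quoted in the text.

```python
import numpy as np

# Conversion factor between FWHM and standard deviation of a Gaussian.
FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # about 0.4247

def sample_photons(n_photons, pulse_fwhm=70e-12, circ_fraction=0.6, rng=None):
    """Draw emission times (s) and polarization flags for n_photons photons.

    Emission times follow a zero-centred Gaussian whose width is set by the
    pulse FWHM; each photon is circularly polarized with probability
    circ_fraction and linearly polarized otherwise.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma_x = pulse_fwhm * FWHM_TO_SIGMA
    emission_time = rng.normal(0.0, sigma_x, size=n_photons)
    is_circular = rng.random(n_photons) < circ_fraction
    return emission_time, is_circular

times, circ = sample_photons(100_000, rng=np.random.default_rng(0))
print(f"sigma_x = {times.std() * 1e12:.1f} ps, circular fraction = {circ.mean():.3f}")
```

The sampled standard deviation should be close to the 30 ps quoted for a 70 ps FWHM pulse.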

Sample
The sample is where the dynamical process that the user of the endstation wants to investigate takes place. Samples for TR-STXM imaging are fabricated on X-ray transparent substrates that allow for the transmission of the soft X-rays not absorbed by the magnetic material. To image the dynamical processes, the X-ray energy is tuned to one of the X-ray absorption edges of the element that is being investigated.
To image the magneto-dynamical processes, as considered in this work, the X-ray energy is tuned to a resonance peak that shows X-ray magnetic circular dichroism (XMCD) (Schütz et al., 1987) and belongs to an absorption edge of the element being investigated. This provides a contrast mechanism to discern variations in the projection of the local magnetization vector of the sample along the direction of the probing X-ray beam, which we define here as $m_x$. The discussion and model presented here can be extended also to other contrast mechanisms. The absorption probability is determined by the thickness of the material following the Lambert-Beer law, i.e.

$$P_{\mathrm{abs}} = 1 - \exp(-\mu d), \qquad (5)$$
with $d$ being the thickness of the magnetic material, and $\mu$ the attenuation coefficient of the material. The product of $d$ and $\mu$ is defined as the optical density (OD) of the material. If the X-rays are circularly polarized, the additional contribution to the photon absorption probability given by the XMCD effect needs to be considered. This can be described by an additional term $\mu_{\mathrm{mag}}$ that contributes to the absorption coefficient. This additional term is given by the product of the XMCD coefficient $\mu_{\mathrm{XMCD}}$, dependent on the specific magnetic material, with the projection of the local magnetization vector $\mathbf{m}$ along the direction indicated by the wavevector of the incoming X-radiation. Therefore, for circularly polarized photons, the absorption probability is given by the following relation,

$$P_{\mathrm{abs}}^{\mathrm{circ}} = 1 - \exp\left[-\mu d\,(1 + \mu_{\mathrm{XMCD}}\, m_x)\right], \qquad (6)$$

with $m_x$ being the projection of the local magnetization unit vector along the wavevector of the X-radiation, and $\mu_{\mathrm{XMCD}}$ the XMCD contrast in units of optical density. The value of $m_x$ will vary across the image (i.e. depending on the local magnetic configuration of the sample), and will also exhibit a time-dependent variation, describing the dynamical process being investigated. For the purpose of the simulations presented here, we consider the case of a purely sinusoidal excitation, i.e. where the magnetization configuration of the sample is excited by a sinusoidal signal at a defined frequency $f$. This means that the magnetization can be described with the following relation,

$$m_x(\mathbf{r}, t) = m_{x,0}(\mathbf{r}) + \Delta m_x(\mathbf{r}) \sin\left[2\pi f t + \varphi(\mathbf{r})\right], \qquad (7)$$

with $\mathbf{r}$ being the position on the sample probed by the beam, $m_{x,0}$ the time-averaged projection, $\Delta m_x$ the amplitude of the oscillation and $\varphi$ its phase. Equation (6) can therefore be rewritten as follows,

$$P_{\mathrm{abs}}^{\mathrm{circ}}(\mathbf{r}, t) = 1 - \exp\left\{-\mu d\left[1 + \mu_{\mathrm{XMCD}}\, m_x(\mathbf{r}, t)\right]\right\}. \qquad (8)$$

The absorption probability given in equation (8), and its counterpart for the linearly polarized light given in equation (5), are used to describe the absorption of the X-ray photons at the fundamental energy.
For the higher-order light, the absorption can be described as $P_{\mathrm{abs},H} = 1 - \exp(-\mu_H d)$, independently of the X-ray polarization, as the photon energy does not match a dichroic resonance of the magnetic material. As the effect of this absorption is to decrease the higher-order photon flux after the sample independently of the magnetic configuration of the sample, and considering that higher-energy photons typically have a high transmittance, we have made the simplifying assumption that all of the higher-order light is transmitted across the sample, as the variation of the higher-order photon flux is already being considered in the photon generation model.
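The absorption model of this section can be illustrated with a short Python sketch. It assumes that the XMCD term enters as a relative correction to the optical density, i.e. an absorption probability of 1 - exp[-OD (1 + XMCD m_x)] for circularly polarized photons; this form is an assumption chosen to agree with the XMCD contrast being quoted in units of optical density, and `absorption_probability` is a hypothetical name.

```python
import numpy as np

def absorption_probability(od, xmcd=0.0, m_x=0.0, circular=False):
    """Absorption probability of a photon at the fundamental energy.

    od       : optical density (attenuation coefficient times thickness)
    xmcd     : XMCD contrast, as a fraction of the optical density (assumed form)
    m_x      : projection of the magnetization unit vector on the beam direction
    circular : True for circularly polarized photons, False for linear ones
    """
    if circular:
        return 1.0 - np.exp(-od * (1.0 + xmcd * m_x))
    # Linearly polarized photons follow the plain Lambert-Beer law.
    return 1.0 - np.exp(-od)

od, xmcd = 1.0, 0.05
for m_x in (-1.0, 0.0, 1.0):
    p = absorption_probability(od, xmcd, m_x, circular=True)
    print(f"m_x = {m_x:+.0f}: P_abs = {p:.4f}, transmittance = {1.0 - p:.4f}")
```

In this sketch a 5% XMCD contrast at one optical density changes the transmittance by only a few percent between opposite magnetization directions, which is the small signal that the time-resolved measurement has to resolve.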

Electronics
In the previous section, it was mentioned that the local magnetization configuration of the sample, described by the local magnetization vector $\mathbf{m}$, oscillates with a sinusoidal modulation at a given frequency $f$. This dynamical process is triggered by an external excitation. As mentioned in the introductory section of this manuscript, many different excitation mechanisms can be employed to trigger magneto-dynamical processes. In the case of TR-STXM imaging, the most typical excitation mechanism is given by electrical signals, which can be employed directly for exciting the dynamical process [e.g. processes driven by spin-orbit torques (Finizio et al., 2019b; Litzius et al., 2017; Woo et al., 2018; Baumgartner et al., 2017)], or indirectly through the generation of oscillating magnetic fields when injecting the electrical signal across a nanostructured antenna (e.g. spin-wave processes). The arbitrary waveform generator is synchronized with the master clock of the storage ring. This allows the generation of a signal that is synchronous with each of the X-ray pulses generated by the storage ring, and hence the sampling of this signal at well-defined time instants (Puzic et al., 2010).
However, jitter in the synchronization between the master clock and the waveform generator can produce an additional uncertainty on the time instant that is being probed by a given X-ray photon. As electronic jitter is usually well described by a Gaussian probability density function, the effect of the electronic jitter can be described as an additional uncertainty on the photon arrival time (in the time frame of the dynamical process), described by the following Gaussian probability density function,

$$P_{t}(t) = \frac{1}{\sigma_t \sqrt{2\pi}} \exp\left(-\frac{t^{2}}{2\sigma_t^{2}}\right), \qquad (9)$$

with $\sigma_t$ being the standard deviation of the electronic jitter. A reasonable value for the electronic setup currently employed at PolLux is a jitter of about 50 ps FWHM, corresponding to a $\sigma_t$ of about 20 ps.
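Since the photon emission time and the electronic jitter are modeled as independent Gaussian terms that are summed, their combined effect is a Gaussian timing uncertainty whose variance is the sum of the two variances. The Python sketch below evaluates this combined width and, as an additional illustration that is not part of the model description above, the factor by which a Gaussian timing blur attenuates the amplitude of a sinusoid at a given frequency (a standard result for the convolution of a sinusoid with a Gaussian). The function names are hypothetical.

```python
import math

def combined_sigma(sigma_x, sigma_t):
    """Standard deviation of the sum of two independent Gaussian timing terms."""
    return math.hypot(sigma_x, sigma_t)

def sinusoid_attenuation(freq, sigma):
    """Amplitude factor of a sinusoid at freq after Gaussian blurring of width sigma.

    Convolving sin(2*pi*f*t) with a Gaussian of standard deviation sigma
    scales its amplitude by exp(-2 * (pi * f * sigma)**2).
    """
    return math.exp(-2.0 * (math.pi * freq * sigma) ** 2)

sigma = combined_sigma(30e-12, 20e-12)  # pulse width and jitter from the text
print(f"combined sigma = {sigma * 1e12:.1f} ps")
for freq in (5e8, 2e9, 1e10):
    print(f"f = {freq:.0e} Hz: amplitude factor = {sinusoid_attenuation(freq, sigma):.3f}")
```

Under these assumptions the timing blur is negligible at a few hundred MHz but suppresses most of the signal amplitude at 10 GHz, where the combined width becomes comparable with the excitation period.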

Detector
The final component of the TR-STXM imaging model presented in this work is the X-ray detector, which converts the X-ray photons into an electrical signal. This signal is then processed by the fast analog-to-digital converter installed in the setup employed for time-resolved imaging, which converts the voltage pulses into counts (Puzic et al., 2010). The detector employed for TR-STXM imaging is an APD with a bandwidth higher than the master clock frequency of the synchrotron. This allows the APD to resolve two X-ray photons emitted by neighboring bunches, which makes it possible to employ the entire filling pattern of the synchrotron to acquire a time-resolved image (Puzic et al., 2010; Finizio et al., 2018b). This is one of the main advantages of TR-STXM compared with other time-resolved X-ray microscopy techniques, which have to rely only on the photons emitted by an isolated electron bunch and therefore suffer from a strongly reduced photon flux.
The behavior of the APD detector can be modeled according to the following considerations:
(i) The APD cannot distinguish the 'type' of photon it is detecting, i.e. a circularly polarized photon is the same as a linearly polarized one for the APD.
(ii) The APD cannot resolve the energy of the photon, i.e. a higher-energy photon will be indistinguishable from a photon at the nominal energy.
(iii) The APD cannot distinguish between one or multiple photons within the same bunch. It can either count 0 (no photons detected) or 1 (one or more photons detected). This will affect the linearity of the APD at high photon fluxes.
(iv) The APD has a given quantum efficiency, defining the probability that a photon arriving at the APD will produce a voltage pulse.
It is noteworthy here that, in principle, if all photons interacted in the same manner with the APD, it would be possible to utilize the amplitude of the generated APD voltage pulse to resolve the energy of the incident photons. However, as there are several possible interaction mechanisms for the photon inside the APD, the amplitude of the generated APD voltage pulses exhibits a broad distribution, which hinders the determination of the original photon energy.
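Points (iii) and (iv) above imply a saturating detector response. For Poisson-distributed photons, thinning by the sample transmittance and by the quantum efficiency leaves a Poisson distribution with a reduced mean, so the probability that a bunch produces one count is one minus the probability of detecting zero photons. The following Python sketch illustrates this; `count_probability` is a hypothetical helper, not a function of the simulation scripts.

```python
import math

def count_probability(lam, transmittance=1.0, quantum_efficiency=1.0):
    """Probability that one bunch yields an APD count (0 or 1 count per bunch).

    lam is the mean number of photons per bunch reaching the sample. The number
    of detected photons is Poisson distributed with mean
    lam * transmittance * quantum_efficiency, and a count is registered
    whenever at least one photon is detected.
    """
    return 1.0 - math.exp(-lam * transmittance * quantum_efficiency)

for lam in (1e-3, 1e-1, 1.0, 10.0):
    p = count_probability(lam)
    # Ratio of registered counts to photons actually arriving (1 = linear).
    print(f"mean photons = {lam:g}: P(count) = {p:.4f}, linearity = {p / lam:.3f}")
```

The linearity ratio stays close to one while the mean number of photons per bunch is well below one, and drops once multi-photon events become frequent.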

Monte Carlo simulations
To investigate the influence of each of the components of the model introduced in the previous section, we performed a Monte Carlo simulation of a TR-STXM image, where a magneto-dynamical process simulated through micromagnetic modeling was employed as template for the TR-STXM image. The specific process that was utilized for the simulations presented in this work is the gyration of a magnetic vortex in a Landau pattern, simulated using the MuMax3 finite-difference simulation package (Vansteenkiste et al., 2014), and shown in the supporting information. To reduce computation time, a $16 \times 16$ pixel cropped area at the center of the Landau pattern was utilized for the simulation of the TR-STXM image. Here, it is assumed that the probing X-ray spot covers exactly one pixel of the reference image, without any signal originating from the neighboring pixels. The simulated TR-STXM image is composed of 14 time steps.
Before starting the Monte Carlo simulation, the number of bunches per pixel that contribute to the final image is calculated from the user-defined integration time. For each of these bunches, which interact with the sample every 2 ns, the following algorithm is executed:
(1) The number of photons emitted by the given bunch is determined using equations (1) and (3) for the photons of nominal energy and the higher-order photons, respectively, taking into consideration the specific filling pattern of the synchrotron. For each one of the photons at the nominal energy, the following is determined:
(1a) Polarization, depending on the probability of generation of circularly polarized photons.
(1b) The position (time) in the bunch from which the photon was emitted, using equation (4).
(2) The nominal energy photons then interact with the sample (it is assumed, for simplicity, that the higher-order photons do not interact with the sample), and the number of photons transmitted across the sample is determined. For this determination, the following is considered:
(2a) For the linearly polarized photons, the absorption probability is determined according to equation (5).
(2b) For the circularly polarized photons, the absorption probability is determined according to equation (8). The photon arrival time is calculated by summing the time at which the photon was generated from the bunch [equation (4)] with the jitter between the electronics and the synchrotron master clock, using equation (9).

(3) Each of the transmitted photons will independently interact with the APD. The number of photons that give rise to a signal from the APD is determined from the quantum efficiency of the diode. If at least one photon is detected, a single count will be added to the simulated TR-STXM image pixel.
The simulations shown in this work were performed using the commercial software Matlab. The scripts employed for the simulations are available in the supporting information.
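The algorithm described above can be sketched for a single pixel in Python. This is an illustrative re-implementation under simplifying assumptions, not the Matlab scripts of the supporting information. The assumptions are: every simulated bunch is a filled bunch, successive bunches are assigned cyclically to the time frames, the magnetization follows a pure sinusoid, and the XMCD term enters as a relative correction to the optical density. All names are hypothetical.

```python
import numpy as np

def simulate_pixel(flux, od, xmcd, freq, n_bunches=500_000, n_frames=14,
                   h_fraction=0.0, circ_fraction=0.6, qe=1.0,
                   m0=0.0, amp=1.0, sigma_x=30e-12, sigma_t=20e-12,
                   f_rev=1.049e6, n_fill=420, rng=None):
    """Return the simulated APD counts per time frame for one pixel."""
    rng = np.random.default_rng() if rng is None else rng
    lam = flux / (f_rev * n_fill)              # mean photons per filled bunch
    frame = np.arange(n_bunches) % n_frames    # assumed bunch-to-frame mapping

    # Step 1: photons per bunch (fundamental and higher-order), Poisson distributed.
    n_fund = rng.poisson(lam, n_bunches)
    n_high = rng.poisson(h_fraction * lam, n_bunches)

    # One array entry per fundamental photon, remembering its parent bunch.
    bunch = np.repeat(np.arange(n_bunches), n_fund)
    n_ph = bunch.size
    circular = rng.random(n_ph) < circ_fraction            # step 1a
    t = (frame[bunch] / (n_frames * freq)                  # nominal delay of the frame
         + rng.normal(0.0, sigma_x, n_ph)                  # step 1b: emission time
         + rng.normal(0.0, sigma_t, n_ph))                 # electronic jitter

    # Step 2: absorption. Linear photons see only the optical density,
    # circular photons also see the XMCD term (assumed relative form).
    m_x = m0 + amp * np.sin(2.0 * np.pi * freq * t)
    p_abs = 1.0 - np.exp(-od * (1.0 + xmcd * m_x * circular))
    transmitted = rng.random(n_ph) >= p_abs

    # Step 3: detection. Higher-order photons are assumed fully transmitted.
    detected = transmitted & (rng.random(n_ph) < qe)
    hits = np.bincount(bunch[detected], minlength=n_bunches)
    hits = hits + rng.binomial(n_high, qe)
    counted = hits > 0                         # the APD registers 0 or 1 per bunch
    return np.bincount(frame[counted], minlength=n_frames)

counts = simulate_pixel(flux=1e8, od=1.0, xmcd=0.05, freq=5e8,
                        rng=np.random.default_rng(1))
print(counts)
```

The sketch is vectorized over photons rather than looping over bunches, which keeps the run time short for a few hundred thousand bunches; a full image would repeat this call for every pixel with the local values of `m0` and `amp` taken from the micromagnetic template.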
To understand how different parameters contribute to the final TR-STXM image, the following variables were used:
(i) Photon flux. The photon flux for the nominal energy photons was varied on a logarithmic scale between $10^{6}$ and $10^{11}$ photons s$^{-1}$. For an undulator-based endstation, a higher photon flux can be considered.
(ii) Spectral purity. The fraction of higher-order light was varied in a logarithmic scale between 0.001 and 1 (0.05% to 50% of the total flux). This parameter depends on the specific beamline.
(iii) Sample thickness. The thickness of the sample was varied in a logarithmic scale between 0.1 and 7 optical densities.
(iv) XMCD contrast. The XMCD contrast of the sample was varied in a linear scale between 0.01 and 0.1.
(v) Excitation frequency. The frequency of the excitation of the magneto-dynamical process was varied on a logarithmic scale between $10^{8}$ and $10^{10}$ Hz.
These values span the typical ranges for experimental samples routinely investigated at the PolLux beamline [i.e. ranging from 0.8 nm-thick Co nanostructures (Baumgartner et al., 2017) to 200 nm-thick Fe$_x$Ni$_{1-x}$ microstructures (Finizio et al., 2018a)]. In addition to those variables, the following parameters have been kept constant (but can in principle be varied with minimal changes to the simulation script):
(i) Polarization. The purity of the circular polarization of the X-rays was selected to be 0.6, according to the value measured for the PolLux beamline (Raabe et al., 2008). For a STXM endstation operating at an APPLE-II undulator-based beamline, the purity of the circular polarization would be 1 (at the first harmonic).
(ii) Bunch width and jitter. The width of the X-ray pulse was selected to be 70 ps FWHM, and the electronic jitter was selected to be 50 ps FWHM, according to values suitable for the PolLux beamline. For different beamlines, this parameter would be determined by the specific filling pattern (and electron optics) used in the synchrotron.
(iii) Quantum efficiency. To reduce the computation time, the quantum efficiency of the APD was set to 1, i.e. that every photon that arrives at the APD will give rise to a detectable signal.
(iv) Integration time. The integration time was selected to be 10 ms, corresponding to about $5 \times 10^{6}$ bunches per pixel at the Swiss Light Source. For a different beamline, this parameter would have to be modified depending on the specific filling pattern used in the synchrotron.
In summary, the parameters depending on the specific beamline are the photon flux, its spectral purity, the polarization of the X-rays, the width of the X-ray bunches and the electronics jitter. The parameters depending on the specific sample are its thickness, the XMCD contrast, and the frequency of the signal driving the dynamical process.
The TR-STXM images simulated through the algorithm described above were then analyzed by determining the time-dependent change in contrast in a region of interest centered on the geometrical center of the structure, shown in the inset of Fig. 2. The predicted change in contrast, determined from the micromagnetic simulated template image, follows a sinusoidal curve, as shown in Fig. 2. Therefore, the change in contrast of the simulated TR-STXM images was fitted with the following function,

$$m_x(t) = A_0 + A_1 \sin(2\pi f t + \varphi), \qquad (10)$$

with $A_0$ being an offset, $f$ the excitation frequency and $\varphi$ a phase. The amplitude $A_1$ in equation (10) is then utilized as the metric for the quality of the TR-STXM image. To acquire sufficient statistics to determine a meaningful error for the fitted amplitude $A_1$, each simulation was repeated 100 times.
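A sinusoidal fit at a known excitation frequency can be carried out as a linear least-squares problem, by writing the sinusoid as a sum of a sine and a cosine term. The Python sketch below uses this approach; it is one possible way to extract the amplitude and is not necessarily the fitting routine used for the results of this work. The names `fit_sinusoid`, `a0`, `a1` and `phase` are hypothetical, and the noise level in the usage example is arbitrary.

```python
import numpy as np

def fit_sinusoid(t, y, freq):
    """Fit y(t) = a0 + a1 * sin(2*pi*freq*t + phase) at a known frequency.

    The model is rewritten as a0 + a*sin(w t) + b*cos(w t), which is linear
    in (a0, a, b), with a = a1*cos(phase) and b = a1*sin(phase).
    """
    w = 2.0 * np.pi * freq
    design = np.column_stack([np.ones_like(t), np.sin(w * t), np.cos(w * t)])
    (a0, a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    return a0, np.hypot(a, b), np.arctan2(b, a)

# Usage: 14 frames over one excitation period, repeated 100 times with noise
# to estimate the spread of the fitted amplitude.
freq, n_frames = 5e8, 14
t = np.arange(n_frames) / (n_frames * freq)
rng = np.random.default_rng(0)
amplitudes = []
for _ in range(100):
    y = 0.2 + 0.05 * np.sin(2.0 * np.pi * freq * t + 0.3) + rng.normal(0.0, 0.01, n_frames)
    amplitudes.append(fit_sinusoid(t, y, freq)[1])
print(f"fitted amplitude = {np.mean(amplitudes):.4f} +/- {np.std(amplitudes):.4f}")
```

Because the fitted amplitude is a magnitude, it is biased upwards when the noise is comparable with the true amplitude, which is worth keeping in mind when interpreting very noisy series.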

Results and discussion
In this section, the results of the Monte Carlo simulations described in the previous section will be presented. In particular, the contribution of each of the variables defined in the previous section to the final simulated TR-STXM image, using the amplitude of the sinusoidal fit described in Fig. 2 as metric, will be shown and discussed.

Sample thickness and XMCD contrast
In this section, the influence of the sample thickness and of the XMCD contrast magnitude on the TR-STXM image will be discussed.
STXM imaging measures the transmission of X-rays across a sample. For this reason, the intensity recorded in a STXM image is directly related to the X-ray transmittance of the sample, which can be determined from the Lambert-Beer law presented in equation (5). Ignoring, for the moment, the contribution of the higher-order light, it is possible to write the transmitted photon flux $\Phi_T(t)$ for circularly polarized X-rays at a given position on the sample as follows,

$$\Phi_T(t) = \Phi\, T(t) = \Phi \exp\left\{-\mathrm{OD}\left[1 + \mu_{\mathrm{XMCD}}\, m_x(t)\right]\right\}, \qquad (11)$$

where $T(t)$ has been defined as the transmittance of the sample, $\mathrm{OD} = \mu d$ as its optical density, and $\Phi$ as the incident photon flux. From equation (11), it is then possible to obtain the time-dependent variation of the projection of the magnetization vector $m_x(t)$ along the direction defined by the wavevector of the incoming X-ray beam by calculating the natural logarithm of the transmittance,

$$\mu_{\mathrm{XMCD}}\, m_x(t) = -\frac{\ln T(t)}{\mathrm{OD}} - 1.$$

Assuming that the magnetization has a sinusoidal time dependence according to equation (10), it is possible to express the time-dependent transmittance as follows,

$$T(t) = \exp\left\{-\mathrm{OD}\left[1 + \mu_{\mathrm{XMCD}}\left(A_0 + A_1 \sin(2\pi f t + \varphi)\right)\right]\right\}. \qquad (12)$$

Therefore, the amplitude of the dynamical signal $A_1$ can be determined from a sinusoidal fitting of the natural logarithm of the sample transmittance, normalized to the optical density of the sample. We calculated the amplitude $A_1 \mu_{\mathrm{XMCD}}$, shown in Fig. 3(a), from the simulated TR-STXM images. These images were simulated considering a photon flux of $10 \times 10^{6}$ photons s$^{-1}$ and an excitation frequency of $5 \times 10^{8}$ Hz. We report the product $A_1 \mu_{\mathrm{XMCD}}$ rather than $A_1$ alone, as the value of the XMCD contrast is typically not determined for experimental samples. As expected from intuitive considerations, a stronger XMCD contrast $\mu_{\mathrm{XMCD}}$ will give rise to a larger amplitude of the recorded time-resolved signal.
It is noteworthy that in many cases $m_x(t)\,\mu_{\mathrm{XMCD}} \ll 1$, due to the combination of a relatively weak effective XMCD contrast (i.e. normalized to the entire thickness of the sample, including its non-magnetic parts), and of a relatively small variation of the magnetization [e.g. in the case of spin waves, the canting of the spins from the surface plane is usually limited to a maximum of a few degrees (Wintz et al., 2016), leading to $m_x(t)\,\mu_{\mathrm{XMCD}} \simeq 10^{-3}$]. In this case, the transmittance given in equation (12) can be simplified by considering only the first order of its Taylor expansion,

$$T(t) \simeq \exp(-\mathrm{OD})\left[1 - \mathrm{OD}\,\mu_{\mathrm{XMCD}}\, m_x(t)\right]. \qquad (13)$$

Using equation (13) instead of equation (12), we obtain the amplitudes shown in Fig. 3(b). By comparing Figs. 3(a) and 3(b), it is possible to observe that the two calculations yield comparable results, indicating that the approximation of equation (12) with its first-order Taylor expansion is reasonable. This is an important consideration, as most experimental TR-STXM images are analyzed utilizing the relation given by equation (13), which we will use for the remainder of this manuscript. It is, however, worth noting that for some special applications, such as, for example, time-resolved laminographic imaging (Donnelly et al., 2020), where the quantitative determination of the changes in the orientation of the local magnetic vectors is of extreme importance, the formulation given in equation (12) needs to be employed. From the results shown in Fig. 3, it can be observed that an increase in the error of the determined amplitude occurs at the extreme ends of the optical density interval investigated here. This will influence the visibility of the dynamics in the time-resolved images, which can be quantified by the signal-to-noise ratio in the measured amplitude. We calculated this value from the simulations shown in Fig. 3(b), to find the optimal thickness of the magnetic material (as a function of its XMCD contrast). The results of this calculation are presented in Fig. 4, where it is possible to observe an optimal thickness of about 2 optical densities. It can also be observed that the signal-to-noise ratio has a relatively constant value for sample thicknesses between about 0.5 and 4 optical densities.
Figure 3
[...] (13), normalized to OD exp(−OD). The results are effectively equal to the ones shown in (a), indicating that the first-order Taylor expansion is a reasonable approximation.

Therefore, if the specific physical process to be investigated allows for a selectable range of thicknesses of the material [e.g. the investigation of magnetic vortex gyration processes], the choice of a thickness between 0.5 and 4 optical densities would be optimal in terms of signal-to-noise ratio. For processes with more stringent requirements on the thickness [e.g. spin-orbit torque switching processes, requiring ultrathin magnetic films (Baumgartner et al., 2017)], the signal-to-noise ratio will not be optimal (according to the values presented in Fig. 4), but the other parameters described in the next sections still provide degrees of freedom that can be used to optimize the image acquisition protocol.
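The position of this optimum can be rationalized with a simple shot-noise estimate. The sketch below is not the Monte Carlo model used for Fig. 4. It assumes the linear detector regime, a signal amplitude proportional to OD exp(−OD) (the normalization used for the first-order expansion), and Poisson noise proportional to the square root of the transmitted intensity. The function name `relative_snr` is introduced here for illustration only.

```python
import math

def relative_snr(od):
    """Shot-noise-limited SNR (arbitrary units) for a sample of thickness `od`.

    Assumptions (not taken from the simulations in the text):
      signal amplitude ~ od * exp(-od)   (first-order expansion)
      Poisson noise    ~ sqrt(exp(-od))  (square root of transmitted counts)
    """
    return od * math.exp(-od) / math.sqrt(math.exp(-od))

# Scan thicknesses from 0.1 to 6 optical densities in steps of 0.01.
thicknesses = [i / 100 for i in range(10, 601)]
best_od = max(thicknesses, key=relative_snr)

print(f"optimal thickness: {best_od:.2f} OD")
for od in (0.5, 1.0, 2.0, 4.0):
    ratio = relative_snr(od) / relative_snr(best_od)
    print(f"OD = {od:3.1f}: SNR / SNR_max = {ratio:.2f}")
```

Under these assumptions the signal-to-noise ratio scales as OD exp(−OD/2), which has its analytical maximum at OD = 2, in line with the optimum found in the simulations.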

Photon flux
It is reasonable to assume that a higher photon flux will provide higher-quality images thanks to the increased counting statistics. However, higher photon flux also brings an increased probability of multi-photon events. Since the APD is unable to recognize multi-photon events, a linear increase in the photon flux will not result in an equal increase of the recorded signal quality.
To demonstrate the influence of the photon flux on the recorded dynamics, we simulated TR-STXM images for varying photon fluxes and fitted them according to the relation given in equation (13). Here, it is of particular interest to vary both the photon flux and the optical density of the sample. The other parameters were kept constant: an XMCD contrast of 5%, an excitation frequency of $5 \times 10^{8}$ Hz, and no contribution from higher-order light. The results of these simulations, normalized to the amplitude expected from equation (13), are shown for a set of simulated optical densities in Fig. 5.
From the results shown in Fig. 5, it is then possible to compute the signal-to-noise ratio as a function of the photon flux for each of the simulated optical densities, which is shown in Fig. 6(a).
Figure 4
Signal-to-noise ratio as a function of the sample thickness for different values of the XMCD contrast. A high signal-to-noise ratio can be observed for thicknesses between about 0.5 and 4 optical densities, with an optimal thickness of about 2 optical densities, independently of the magnitude of the XMCD contrast.

Figure 5
Dependence of the measurable amplitude of the recorded time-resolved signal on the X-ray photon flux illuminating the sample, normalized to the maximum amplitude. A drop in the amplitude can be observed for photon fluxes higher than $10^{8}$ photons s$^{-1}$. The drop in the amplitude occurs at different photon fluxes for different thicknesses of the sample.

Figure 6
(a) Signal-to-noise ratio as a function of the photon flux illuminating the sample for different values of the sample thickness. The photon flux at which the maximum of the signal-to-noise ratio occurs scales with the thickness of the sample. (b) Signal-to-noise ratio as a function of the photon flux reaching the APD detector for the same values of the sample thickness as in (a). In this case, the maximum of the signal-to-noise ratio occurs at the same photon flux (reaching the APD detector), independently of the thickness of the sample. The white background identifies the linear regime, the light blue background the semi-linear regime, and the dark blue background the non-linear regime.

In Fig. 6(a), it is possible to observe that, for each sample thickness, the maximum of the signal-to-noise ratio occurs at a different photon flux, with larger photon fluxes required to obtain the optimal imaging conditions for thicker samples. If, however, we now display the signal-to-noise ratio as a function of the photon flux after the sample, determined by applying equation (5) to the photon flux illuminating the sample, we obtain the result shown in Fig. 6(b). Here, the optimal imaging conditions occur, independently of the sample thickness, at a photon flux of about $5 \times 10^{8}$ photons s$^{-1}$. Higher photon fluxes result in a sharp drop of the signal-to-noise ratio.
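Because the optimum is fixed at the detector, the incident flux needed to reach it grows exponentially with the sample thickness. The following sketch assumes that equation (5) is the Beer-Lambert attenuation, i.e. that the flux after the sample equals the incident flux multiplied by exp(−OD), and takes the optimal flux after the sample of about 5 × 10^8 photons s^−1 from the text. The function name is illustrative.

```python
import math

OPTIMAL_FLUX_AFTER_SAMPLE = 5e8  # photons/s, value quoted in the text

def required_incident_flux(od, flux_after_sample=OPTIMAL_FLUX_AFTER_SAMPLE):
    """Incident flux needed to obtain `flux_after_sample` behind a sample of
    optical density `od`, assuming Beer-Lambert attenuation exp(-od)."""
    return flux_after_sample * math.exp(od)

for od in (0.5, 1.2, 2.0, 4.0):
    flux = required_incident_flux(od)
    print(f"OD = {od:3.1f}: incident flux = {flux:.2e} photons/s")
```

Each additional optical density raises the required incident flux by a factor of e, which is why the maxima in Fig. 6(a) shift to higher fluxes for thicker samples.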
The reason why the optimal imaging conditions occur at a photon flux after the sample of about $5 \times 10^{8}$ photons s$^{-1}$ lies in the behavior of the APD detector. Because the APD is only able to distinguish between 0 and $\geq 1$ photons in a given bunch, its response exhibits a non-linear behavior at high photon fluxes. The response of the APD as a function of the incident photon flux is displayed in Fig. 7.
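The saturation can be illustrated with a minimal model of a detector that only registers whether at least one photon arrived in a bunch. The sketch assumes Poisson-distributed photon numbers per bunch and a bunch rate of 500 MHz (one bunch every 2 ns). The names `BUNCH_RATE` and `detected_rate` are hypothetical, and the model ignores dead time and the filling pattern of the storage ring.

```python
import math

BUNCH_RATE = 5e8  # bunches per second (assumed: one bunch every 2 ns)

def detected_rate(real_flux, bunch_rate=BUNCH_RATE):
    """Count rate of a detector that registers at most one count per bunch.

    With a Poisson mean of lam photons per bunch, the probability of
    registering a count is P(n >= 1) = 1 - exp(-lam).
    """
    lam = real_flux / bunch_rate
    return bunch_rate * (1.0 - math.exp(-lam))

for flux in (2.5e7, 5e8, 5e9):
    lam = flux / BUNCH_RATE
    ratio = detected_rate(flux) / flux
    print(f"flux = {flux:.1e} photons/s, lam = {lam:5.2f}, detected/real = {ratio:.3f}")
```

In this model the detected rate deviates from the real flux by about 2.5% at 0.05 photons per bunch, and it can never exceed the bunch rate, however high the real flux becomes.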
From the behavior of the APD shown in Fig. 7, three different regimes can be identified, depending on the photon flux at the APD:
(i) Linear regime, $\Phi_{\mathrm{APD}} < 2.5 \times 10^{7}$ photons s$^{-1}$ ($\lambda_{\mathrm{APD}} < 0.05$).
Here, $\lambda_{\mathrm{APD}}$ identifies the average number of photons per bunch that reach the APD, calculated as $\lambda_{\mathrm{APD}} = \lambda \exp(-\mathrm{OD})$, $\lambda$ being the average number of photons per bunch that illuminate the sample.
At low photon fluxes, the flux detected by the APD responds linearly to the incident flux. In this low-photon-flux regime, the amplitude of the time-resolved signal [Fig. 8(a)] remains constant, and its error, shown in Fig. 8(b), follows the $1/\sqrt{\lambda}$ dependence dictated by the Poisson statistics of the stochastic photon generation [linear slope in the log-log graph shown in Fig. 8(b)].
For higher photon fluxes, where the probability of multiphoton events within the same bunch becomes non-negligible, the detected photon flux starts to deviate from the purely proportional response observed for low photon fluxes. Here, a drop in the measured amplitude of the time-resolved signal can be observed, while the error on the amplitude still follows the expected behavior dictated by Poisson statistics.
Finally, for the very high photon fluxes, where the detected photon flux saturates to its maximum value of about $4.5 \times 10^{8}$ counts s$^{-1}$ (i.e. corresponding to an average number of photons per bunch reaching the APD of $\lambda_{\mathrm{APD}} \approx 1$) with a non-linear response with respect to the incident photon flux, a more substantial reduction of the measured amplitude of the time-resolved signal can be observed. In this case, it is also possible to observe [see Fig. 8] [...]

Figure 7
X-ray photon flux detected by the APD as a function of the real photon flux illuminating the APD (black line). A saturation of the detected photon flux can be observed for real photon fluxes above $10^{9}$ photons s$^{-1}$, due to the fact that the APD is not able to recognize multiphoton events within one bunch. The ideal purely linear response is marked by the dashed black line. The average number of photons emitted per bunch is shown by the red line. The white background identifies the linear regime, the light blue background the semi-linear regime, and the dark blue background the non-linear regime.

The two contributions to the signal-to-noise ratio that depend on the photon flux are therefore the error on the determined amplitude (following Poisson statistics for the lower photon fluxes), whose decrease with increasing photon flux raises the signal-to-noise ratio, and the drop in the detected amplitude arising from the multiphoton events, which lowers the signal-to-noise ratio with increasing photon flux. This implies that there is an optimal photon flux, at which the signal-to-noise ratio is maximum, as shown in Fig. 6. This optimal photon flux (after the sample) is about $5 \times 10^{8}$ photons s$^{-1}$, and the slit settings of the beamline should be selected in order to obtain a count rate as close as possible to this optimal value. It is, however, worth noting that the overall optimal photon flux will be a compromise between the signal-to-noise ratio presented in Fig. 6 and other beam-dependent parameters, to be determined on a case-by-case basis depending on the specific goals of the measurement and on the sample under investigation.
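The amplitude drop caused by multiphoton events can be reproduced with a noise-free toy calculation. It assumes a detector that registers a count with probability 1 − exp(−λ_APD) per bunch, a sinusoidal first-order modulation of the photon number with relative amplitude `eps`, and an amplitude extracted by projection onto the excitation sine instead of a fit. None of these names or choices come from the simulation code used for the figures, whose normalization may differ.

```python
import math

def amplitude_ratio(lam_apd, eps=0.05, n_phases=360):
    """Detected modulation amplitude divided by the amplitude that an ideal,
    perfectly linear photon counter would record (lam_apd * eps)."""
    projection = 0.0
    for k in range(n_phases):
        phase = 2.0 * math.pi * k / n_phases
        # Mean photon number per bunch at this phase of the excitation.
        lam_t = lam_apd * (1.0 + eps * math.sin(phase))
        # Probability that the detector registers a count in the bunch.
        p_count = 1.0 - math.exp(-lam_t)
        projection += p_count * math.sin(phase)
    detected_amplitude = 2.0 * projection / n_phases
    return detected_amplitude / (lam_apd * eps)

for lam_apd in (0.01, 0.05, 0.5, 1.0, 3.0):
    print(f"lam_APD = {lam_apd:4.2f}: amplitude ratio = {amplitude_ratio(lam_apd):.3f}")
```

In this toy model the ratio stays close to exp(−λ_APD): the modulation is nearly unaffected in the linear regime and strongly suppressed once several photons per bunch reach the detector, while the counting error keeps decreasing. The competition between these two trends is what produces an optimum.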

Higher-order light
In the previous sections, it was assumed that the X-ray photon flux that illuminates the sample comprises exclusively monochromatic photons at the specified energy. However, as mentioned in Section 2, a fraction of the photon flux comprises higher-order light. Since the energy of these photons is not at the elemental absorption edge of the magnetic material, their transmission across the sample will be independent of the local magnetization configuration of the sample, providing an additional background to the recorded images.
We performed simulations of the TR-STXM images with this additional contribution from the higher-order light. The simulations were performed for a sample thickness of 1.2 optical densities, with an XMCD contrast of 5%, an excitation frequency of $5 \times 10^{8}$ Hz, and considering a photon flux of $10^{7}$ photons s$^{-1}$ for the photons at the nominal energy. These parameters were selected as they describe a typical TR-STXM sample and the frequency range typically used to excite its dynamics. The results of this simulation, normalized to the amplitude measured in the absence of higher-order light, are shown by the red circles in Fig. 9, where it is possible to observe that the increase of the photon flux for the higher-order light gives rise to a reduction of the measurable amplitude.
This reduction in the measurable amplitude is due to the fact that the transmittance is calculated according to equation (11), i.e. by normalizing the transmitted photon flux to the photon flux at the sample surface. Using the definition for the higher-order fraction of the photon flux given in equation (3), the reduction in the measured amplitude due to the higher-order photons is given by 1/(1 + H). This factor is plotted with a black line in Fig. 9. It should, however, be noted that, for photon fluxes in the semi-linear and non-linear regimes defined in the previous section, an increase in the higher-order contribution will lead to a non-linear response in the signal-to-noise ratio. Therefore, considering the behavior of the APD detector shown in Fig. 7 and the results shown in Fig. 8, the photon flux at the detector (including the higher-order contribution) should be tuned to about $5 \times 10^{8}$ photons s$^{-1}$ to obtain the best imaging conditions, as the contribution of multiphoton events at higher fluxes will severely impact the attainable signal-to-noise ratio.
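The factor 1/(1 + H) can be checked with a short noise-free calculation in the linear detector regime. The sketch assumes that H is the higher-order flux in units of the flux at the fundamental energy, that the higher-order photons are transmitted with a constant, magnetization-independent transmittance, and that the amplitude is obtained from the transmittance normalized to the total incident flux. The variable names and the transmittance values are illustrative.

```python
import math

def measured_amplitude(h, t0=0.3, rel_mod=0.01, t_higher=0.8, n_phases=360):
    """Amplitude of the normalized transmittance for a higher-order fraction h.

    t0       : static transmittance at the fundamental energy (illustrative)
    rel_mod  : relative modulation of that transmittance by the dynamics
    t_higher : constant transmittance of the higher-order photons (illustrative)
    """
    projection = 0.0
    for k in range(n_phases):
        phase = 2.0 * math.pi * k / n_phases
        # Transmitted flux per unit of incident flux at the fundamental energy.
        transmitted = t0 * (1.0 + rel_mod * math.sin(phase)) + h * t_higher
        # Normalize to the total incident flux (fundamental + higher order).
        transmittance = transmitted / (1.0 + h)
        projection += transmittance * math.sin(phase)
    return 2.0 * projection / n_phases

reference = measured_amplitude(0.0)
for h in (0.0, 0.1, 0.5, 1.0):
    relative = measured_amplitude(h) / reference
    print(f"H = {h:3.1f}: relative amplitude = {relative:.3f}, 1/(1+H) = {1.0 / (1.0 + h):.3f}")
```

The constant higher-order background drops out of the projection, so only the normalization by the total flux remains and the relative amplitude follows 1/(1 + H) irrespective of the chosen transmittance values.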

Excitation frequency
In this section, the influence of the frequency of the excitation signal on the amplitude of the excitation measured through TR-STXM imaging will be investigated. Here, we have to consider two contributions, given by the width of the X-ray pulses generated by the synchrotron, and by the jitter of the synchronization between the synchrotron master clock and the excitation signal, both of which add an uncertainty on the arrival time of the photons with respect to the sampling point.
As described in the simulation model section, both of these contributions can be described by a Gaussian probability density function for the photon arrival time. Therefore, the combined effect of the X-ray pulse width and of the electronic jitter can be described by the convolution of the two probability density functions, which is also a Gaussian probability density function, given by the following relation,
$$P(t_P) = \frac{1}{\sqrt{2\pi(\sigma_x^2 + \sigma_t^2)}} \exp\left[-\frac{t_P^2}{2(\sigma_x^2 + \sigma_t^2)}\right], \tag{14}$$
where $\sigma_x$ and $\sigma_t$, respectively, identify the standard deviations caused by the width of the X-ray pulse and by the jitter between the excitation signal and the master clock of the synchrotron light source. At each sampling time $t_s$, the point of the sinusoidal excitation signal that will be sampled will be given by $t_s + t_P$, where $t_P$ is determined by the Gaussian probability density function given in equation (14). Assuming an infinite number of sampling points $t_s$, which allows for a continuous treatment, the measured signal after its sampling will be given by the convolution between the probability density function given in equation (14) and the signal itself. The result of this convolution is a sinusoidal function modulated by a frequency-dependent amplitude term $A_1(f)$, whose analytical formulation, relative to the amplitude of the excitation signal, is given by the following relation,
$$A_1(f) = \exp\left[-2\pi^2 f^2 (\sigma_x^2 + \sigma_t^2)\right]. \tag{15}$$
From equation (15), it is then possible to conclude that the detected amplitude is modulated by a Gaussian-shaped term, dependent on the X-ray bunch width and on the jitter between the excitation signal and the master clock of the synchrotron light source.

Figure 9
Simulated amplitude of a sinusoidal signal as a function of the higher-order flux (in units of the photon flux at the fundamental energy). A reduction of the measurable amplitude can be observed when increasing the photon flux of the higher-order light. The simulated amplitude is plotted by the red circles, while the predicted change in amplitude is plotted by the black continuous line.
Monte Carlo simulations of the TR-STXM images as a function of the frequency of the excitation signal confirm the analytical formulation of the detected amplitude given in equation (15), as shown by the comparison between simulations and analytical predictions in Fig. 10. The simulations shown in Fig. 10 were performed for a sample thickness of 1.2 optical densities, with an XMCD contrast of 5%, a photon flux of $10^{7}$ photons s$^{-1}$, and ignoring the contribution of higher-order light. Again, we selected these parameters as representative for a typical TR-STXM sample.
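The Gaussian attenuation can also be verified with a few lines of Monte Carlo code. The sketch below is independent of the simulation framework used for Fig. 10: it draws a Gaussian timing error for every sampled point of a unit-amplitude sine, recovers the amplitude by projection, and compares it with exp(−2π²f²σ²), the standard result for a sinusoid averaged over Gaussian timing noise of standard deviation σ. The value σ = 36.5 ps corresponds to 70 ps and 50 ps FWHM added in quadrature. The function names and the number of samples are arbitrary choices.

```python
import math
import random

def simulated_amplitude(freq, sigma, n_samples=50_000, seed=0):
    """Amplitude of a unit sine sampled with Gaussian timing errors."""
    rng = random.Random(seed)
    omega = 2.0 * math.pi * freq
    period = 1.0 / freq
    sin_sum = 0.0
    cos_sum = 0.0
    for k in range(n_samples):
        t_s = period * k / n_samples           # nominal sampling time
        t_p = rng.gauss(0.0, sigma)            # photon arrival-time error
        value = math.sin(omega * (t_s + t_p))  # signal actually probed
        sin_sum += value * math.sin(omega * t_s)
        cos_sum += value * math.cos(omega * t_s)
    # Projection onto sine and cosine gives the amplitude of the first harmonic.
    return 2.0 * math.hypot(sin_sum, cos_sum) / n_samples

def analytical_amplitude(freq, sigma):
    """Expected attenuation for Gaussian timing noise."""
    return math.exp(-2.0 * (math.pi * freq * sigma) ** 2)

sigma = 36.5e-12  # s, combined standard deviation (assumed, see lead-in)
for freq in (5e8, 2e9, 5e9, 9e9):
    sim = simulated_amplitude(freq, sigma)
    ana = analytical_amplitude(freq, sigma)
    print(f"f = {freq:.1e} Hz: simulated = {sim:.3f}, analytical = {ana:.3f}")
```

The two columns agree within the statistical scatter of the finite number of samples, which shrinks as `n_samples` is increased.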
From the analytical formulation of the detected amplitude in TR-STXM imaging as a function of the excitation frequency, it is possible to determine the frequency above which the detectable amplitude of time-resolved signals becomes too small for a meaningful measurement. Defining this threshold as the frequency at which 90% of the original amplitude of the excitation signal is lost, the critical frequency $f_c$ can be defined as follows,
$$f_c = \frac{1}{\pi} \sqrt{\frac{\ln 10}{2(\sigma_x^2 + \sigma_t^2)}}, \tag{16}$$
where $\sigma_x$ and $\sigma_t$ are the standard deviations associated with the X-ray pulse width and with the electronic jitter. For the PolLux beamline, the critical frequency $f_c$ would be at about $9 \times 10^{9}$ Hz, considering an X-ray bunch width of 70 ps FWHM and an electronic jitter of 50 ps FWHM. The critical frequency reported in equation (16) is inversely proportional to the combined width of the X-ray pulses and of the electronic jitter. Improvements in either of those will provide a substantial increase of the accessible measuring frequencies. The reduction of the X-ray pulse width can, for example, be achieved by operating the synchrotron light source using low-$\alpha$ optics, which allows for a reduction of the X-ray pulse width down to less than 10 ps FWHM (Goslawski et al., 2014), and the reduction of the electronic jitter can be achieved by improvements on the phase-locked loops used to reference the signal generators to the master clock of the synchrotron light source.
The method described above for the measurement of the transmitted X-ray photons is based on the fast detection of whether the voltage pulses generated by the APD cross a user-defined threshold. This detection is performed every 2 ns, synchronized with the master clock of the synchrotron light source, with the use of a custom-programmed field-programmable gate array setup (Puzic et al., 2010). Different detection methods, involving an improvement in the processing of the APD voltage pulses, are under development. An example is the measurement of the arrival time of the X-ray photons performed by a constant-fraction discrimination-based detection scheme. We have, however, chosen to concentrate in this manuscript on the simpler TR-STXM detection scheme that is currently available for regular user operation.
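The critical frequency quoted for the PolLux beamline can be reproduced with a short calculation. The sketch assumes a Gaussian attenuation of the form exp(−2π²f²σ²), with σ the quadrature sum of the standard deviations of the X-ray pulse and of the jitter, and defines the critical frequency as the point where 10% of the amplitude remains. The function and constant names are illustrative.

```python
import math

# Conversion factor from full width at half maximum to standard deviation.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def critical_frequency(pulse_fwhm, jitter_fwhm, remaining=0.1):
    """Frequency (Hz) at which only `remaining` of the amplitude is detected.

    Solves exp(-2 pi^2 f^2 sigma^2) = remaining for f, where sigma^2 is the
    sum of the variances of the X-ray pulse and of the electronic jitter.
    """
    sigma_x = pulse_fwhm * FWHM_TO_SIGMA
    sigma_t = jitter_fwhm * FWHM_TO_SIGMA
    sigma = math.hypot(sigma_x, sigma_t)
    return math.sqrt(-math.log(remaining) / 2.0) / (math.pi * sigma)

f_c = critical_frequency(70e-12, 50e-12)        # values quoted in the text
f_c_short = critical_frequency(10e-12, 50e-12)  # hypothetical 10 ps pulses
print(f"f_c (70 ps pulses): {f_c:.2e} Hz")
print(f"f_c (10 ps pulses): {f_c_short:.2e} Hz")
```

With the values quoted above this evaluates to roughly 9 × 10^9 Hz. Shortening the pulses to 10 ps FWHM while keeping the jitter at 50 ps FWHM raises it only to about 1.6 × 10^10 Hz, since the jitter then dominates the combined width.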

Conclusions
In conclusion, we have presented a model for the physical mechanisms that contribute to the formation of a TR-STXM image, and investigated, with the help of Monte Carlo simulations and analytical calculations, their influence on the quality of the acquired image. In particular, it has been observed that the ideal sample thickness can be found in a range between about 0.5 and 4 optical densities, with an optimal thickness of about 2 optical densities. An optimal photon flux after the sample, independent of the thickness, of the order of $5 \times 10^{8}$ photons s$^{-1}$ was determined, where care should be taken to minimize the contribution from higher-order light. Finally, the excitation frequency should be below the critical frequency defined by equation (16), and the measured amplitude of the time-resolved signal should be normalized according to the attenuation factor defined by equation (15) if measurements at different frequencies are to be compared.
With this work, we aim to give the users of time-resolved STXM imaging beamlines a tool for the verification of the feasibility of their proposed investigations, and for the estimation of the most efficient and effective imaging conditions. The tools and models described in this manuscript have been limited to the acquisition of time-resolved images, ignoring aspects such as the influence of energy and spatial resolution on the acquired images. The model can, however, be easily extended to include these additional restrictions, allowing users to perform a simulation of their specific processes that is as accurate as possible.

Figure 10
Simulated amplitude of a sinusoidal signal as a function of the frequency of the sinusoidal signal. A reduction of the measurable amplitude when increasing the frequency of the sinusoidal signal, due to the fact that the period of the signal becomes comparable with the uncertainty in the X-ray photon arrival time, can be observed. The simulated amplitude is plotted by the red circles, while the change in amplitude predicted by equation (15) is plotted by the black continuous line.