Improving reproducibility in synchrotron tomography using implementation-adapted filters

Dissimilar hardware and software conventions at various synchrotrons lead to quantitative differences in experimental results. This paper proposes a method to improve reproducibility of tomographic reconstructions by optimizing the filtering step in commonly used reconstruction algorithms.


Introduction
In several scientific disciplines, such as materials science, biomedicine and engineering, a quantitative three-dimensional representation of a sample of interest is crucial for characterizing and understanding the underlying system (Fusseis et al., 2014;Luo et al., 2018;Midgley & Dunin-Borkowski, 2009;Rubin, 2014). Such a representation can be obtained with the experimental technique of computerized tomography (CT). In this approach, a penetrating beam, such as X-rays, is used to obtain projection images of a sample at various angles. These projections are then combined by using a computational algorithm to give a 3D reconstruction (Buzug, 2011;Kak et al., 2002).
Different tomographic setups are used in various practical settings. Our focus here is on tomography performed with a parallel-beam X-ray source at synchrotrons. Synchrotrons provide a powerful source of X-rays for imaging, enabling a broad range of high-resolution and high-speed tomographic imaging techniques (Thompson et al., 1984;De Carlo et al., 2006;Stock, 2019).
A typical tomography experiment at the synchrotron can be described by a pipeline consisting of several sequential steps (see Fig. 1). First, a sample is prepared according to the experiment and imaging setup requirements. Then, the imaging system is aligned (Yang et al., 2017), and a series of projection images of the sample are acquired (Hintermü ller et al., 2010). These data are then processed for calibration, contrast improvement [e.g. phase retrieval (Paganin et al., 2002)] or removal of undesirable artefacts like rings or stripes (Massimi et al., 2018). Following pre-processing, the data are fed into a reconstruction software package that makes use of one or more standard algorithms to compute a 3D reconstruction (Gü rsoy et al., 2014;Pelt et al., 2016). The reconstruction volumes can then be further post-processed and analysed (Salomé et al., 1999;Bü hrer et al., 2020) to obtain parameter estimates of the system being studied. In some cases, systematic imperfections in the data can also be corrected by post-processing reconstructions. For example, ring artefacts, which are commonly observed in synchrotron data, can be corrected before or after reconstruction (Gü rsoy et al., 2014).
At various synchrotron facilities in the world, the pipeline described above is implemented using different instruments, protocols and methods specific for each facility (Kanitpanyacharoen et al., 2013). These differences are on the level of both hardware and software. Dissimilarities in the characteristics of the used X-ray source and detection system, including camera, visible light objective and scintillator screen, lead to differences in the acquired data. The differences in the data are then further compounded by variations in processing and reconstruction software, resulting in differences in voxel or pixel intensities, and eventually in variations in the output of post-processing and analysis routines.
For users, such differences pose several challenges. First, it is difficult to ensure that results and conclusions obtained from experiments at one facility are comparable and consistent with experiments from another facility. Second, other researchers seeking to reproduce the results of a previous work with their own software might not be able to do so, even if they have access to raw data. Kanitpanyacharoen et al. (2013) report quantitative differences at various stages of the pipeline when scanning the same object at different synchrotrons. Reproducibility and the ability to verify experimental findings is crucial for ascertaining the reliability of scientific results. Therefore, in order to ensure reproducibility for the synchrotron pipeline, it is important to quantify and mitigate differences in the acquired, processed and reconstructed data.
Hardware and software vary across synchrotrons for a number of reasons. Each synchrotron uses a pipeline that is optimized for its specific characteristics. In addition, legacy considerations play a role in the choice of components. Because of the variations across synchrotrons, any successful strategy for creating reproducible results must take this diversity into account. Ideally, the choices for specific implementations of each block in the synchrotron pipeline in Fig. 1 should not influence the final results of a tomography experiment. Following this strategy, each block can be optimized for reproducibility independently from the rest of the pipeline.
In this paper, we focus on improving the reproducibility of the reconstruction block in the pipeline. In most synchrotrons, fast analytical methods such as filtered backprojection (FBP) (Kak et al., 2002) and Gridrec (Dowd et al., 1999) are the most commonly used algorithms for reconstruction. This is primarily because such algorithms are fast and work out-ofthe-box without parameter tuning. These algorithms give accurate reconstructions when the projection data are well sampled, such as in microCT beamlines where thousands of projections can be acquired in a relatively short time.
Several open-source software packages for synchrotron tomography reconstruction are available, such as TomoPy, the ASTRA toolbox and scikit-image (Gü rsoy et al., 2014;Palenstijn et al., 2013;Van der Walt et al., 2014). Usually, an in-house implementation of FBP or Gridrec, or one of the open-source software packages, is used for reconstruction. Each of these implementations contains a filtering step that is applied to the projection data as part of the reconstruction. Filtering influences characteristics such as noise and smoothness, of the reconstructed volume. A sample-independent, pre-defined filter is generally used for reconstruction. Some filters used in this step have tunable parameters, but these are often tuned on-the-fly and are not recorded in metadata.
Reconstructions in analytical algorithms are obtained by inversion of the Radon transform (Natterer, 2001). Although this inversion is well defined mathematically in a continuous setting, software implementations invariably have to work in a discretized space. In software implementations, the measurements as well as the reconstructed volume are discrete. In a discretized space, inversion of the Radon transform often translates to a backprojection step, which makes use of a discretized projection kernel to simulate the intersection between the scanned object and X-rays (Batenburg et al., Schematic representation of a typical tomography pipeline at synchrotrons. Hardware differences play an important role during sample preparation and data acquisition. Software differences affect image preprocessing, reconstruction and post-processing. Together these lead to differences in the output of analysis and parameter estimation studies. In this paper we propose a filter optimization method that works as a wraparound routine on the reconstruction block. Our method only requires evaluations of the reconstruction routine and does not require any internal coding. The output of our method is a filter that can be used in the reconstruction block for more reproducible reconstructions. 2021). The backprojection operation can also be performed directly using interpolations in Fourier space (Kak et al., 2002).
Different choices of discretization and interpolation, in projection kernels and filters, are possible. These choices lead to quantitative differences between the reconstructions obtained from different software implementations. A simple example of this effect is shown in Fig. 2, where we consider a phantom of pixel size 33 Â 33 and data along 8 projection angles uniformly sampled in [0, ). We compare reconstructions of the same data using two different projection kernels and two different filtering methods. In both instances, the image to be reconstructed contains a single bright pixel at the centre of the field of view. The sinogram of such an image (i.e. the combined projection data for the full range of angles) was computed using a CPU strip kernel projector from the ASTRA toolbox (Palenstijn et al., 2013). Backprojections of this projection data using two other projectors -a CPU line projection kernel and a pixel-driven kernel implemented on a graphics processing unit (GPU) -show significant, radially symmetric differences. These differences are dependent on the number of projection angles used, and are highly structured, unlike differences due to random noise. We also observe structured differences between reconstructions when the same projection kernel (gpu-pixel) is used after different filtering operations in real and Fourier space. This example highlights the impact of discretization and interpolation choices on the final reconstruction obtained from identical raw data.
Our main contribution in this paper is a heuristic approach that can improve reproducibility in reconstructions.
Our method consists of optimizing the filter used in different software implementations of reconstruction methods. We call such optimized filters implementationadapted filters. The computation of our filters does not require knowledge of the underlying software implementation of the reconstruction algorithm. Instead, a wrapper routine around any black-box implementation can be used for filter computation. Once computed, these filters can be applied with the reconstruction software like any other standard filter.
Our paper is organized as follows. In Section 2, we formulate the reconstruction problem mathematically and discuss the effect of different software implementations. In Section 3, we describe our algorithm for computing implementationadapted filters. Numerical experiments described in Sections 4 and 5 demonstrate use cases for our filters on simulated and real data. Finally, we discuss extensions to the current work in Section 6 and conclude our paper in Section 7. Our opensource Python code for computing implementation-adapted filters is available on GitHub (https://github.com/poulamis ganguly/impl-adapted-filters).

Continuous reconstruction
Consider an object described by a two-dimensional attenuation function f: R 2 ! R. Mathematically, the tomographic projections of the object can be modelled by the Radon transform, Rðf Þ. The Radon transform is the line integral of f along parametrized lines l ;t = fðx; yÞ 2 R 2 j x cos þ y sin ¼ tg, where is the projection angle and t is the distance along the detector. Projection data p (t) along an angle are thus given by research papers Figure 2 Differences in reconstruction due to differences in backprojector and filter implementations. (a) A 33 Â 33 phantom with one bright pixel. (b) Sinogram of the phantom (computed using a strip kernel from the ASTRA toolbox). (c) Differences in (unfiltered) backprojection when using different backprojectors: (left to right) backprojection using a CPU line kernel from the ASTRA toolbox, backprojection using a GPU pixel-driven kernel from the ASTRA toolbox, absolute difference between the two backprojections. (d) Differences in reconstruction when using different filtering routines in FBP with the gpu-pixel kernel as backprojector: (left to right) reconstruction using filtering in real space with the Ram-Lak filter, reconstruction using the ramp filter in Fourier space, absolute difference between the two reconstructions.
f ðx; yÞ ðx cos þ y sin À tÞ dx dy: ð1Þ The goal of tomographic reconstruction is to obtain the function f(x, y) given the projections p (t) for various angles 2 Â. One way to achieve this is by direct inversion of the Radon transform. Given a complete angular sampling in [0, ), the Radon transform can be inverted giving the following relation (Kak et al., 2002), whereP P ð!Þ denotes the Fourier transform of the projection data p (t) and multiplication by the absolute value of the frequency |!| denotes filtering with the so-called ramp filter. For noiseless and complete data, the Radon inversion formula [equation (2)] provides a perfect analytical reconstruction of the function f(x, y) from its projections. However, in practice, tomographic projections are obtained on a discretized detector, consisting of individual pixels, and for a finite set of projection angles. Additionally, the reconstruction volume must be discretized in order to represent it on a computer. Therefore, in practical applications, a discretized version of equation (2) is used to obtain reconstructions.

Discrete reconstruction
Discretization of the reconstruction problem yields the following equation for the discrete reconstruction r(x d , y d ), where (x d , y d ), d and t d denote discretized reconstruction pixels, angles and detector positions, respectively, and h(t d ) is a discrete real-space filter. This inversion formula is known as the filtered backprojection (FBP) algorithm. The FBP equation (3) can be written algebraically as the composition of two matrix operations: filtering and backprojection. Filtering denotes convolution in real space (or, correspondingly, multiplication in Fourier space) with a discrete filter. Backprojection consists of a series of interpolation and numerical integration steps to sum contributions from different projection angles. These discretized operations can be implemented in a number of different ways and different software implementations often make use of different choices for discretization and interpolation. Consequently, the reconstruction obtained from a particular implementation is dependent on these choices. The reconstruction r I from an implementation I can thus be written as where W T I is the backprojector and M I ðÁ; ÁÞ is the (linear) filtering operation associated with implementation I. We denote the discrete filter by h.
In the following subsection, we discuss some common choices for projection and filtering operators in software implementations of analytical algorithms.

Differences in projectors and filtering
In order to discretize the Radon transform, we must choose a suitable discretization of the reconstruction volume, a discretization of the incoming ray and an appropriate numerical integration scheme. All these choices contribute to differences in different backprojectors W T I in (4). Voxels (or pixels in 2D) in the reconstruction volume can be considered either to have a finite size or to be spikes of infinitesimal size. Similarly, a ray can be discretized to have finite width (i.e. a strip) or have zero width (i.e. a line). The numerical integration scheme chosen might be piecewise constant, piecewise linear or continuous. All of these different choices have given rise to different software implementations of backprojectors . There exist different categorizations of backprojectors in the literature; for example, the linear kernel in the ASTRA toolbox is referred to as the slice-interpolated scheme by Xu & Mueller (2006) and the strip kernel is referred to as the box-beam integrated scheme in the same work. In this paper, we designate different backprojectors with the terms used in the software package where they have been implemented.
In addition to the choices mentioned above, backprojectors have also been optimized for the processing units on which they are used. For this reason, backprojectors that are optimized to be implemented on graphics processing units (GPUs) might be different from those that are implemented on a CPU due to speed considerations. In particular, GPUs provide hardware interpolation that is extremely fast, but can also be of limited accuracy compared with standard floating point operations.
So far, we have discussed real-space backprojectors. Fourier-domain algorithms such as Gridrec (Dowd et al., 1999) use backprojectors that operate in the Fourier domain. These operators are generally faster than real-space operators, and are therefore particularly suited for accelerating iterative algorithms (Arcadu et al., 2016). Unlike real-space backprojectors, Fourier-space backprojectors perform interpolation in the Fourier domain. As this might lead to non-local errors in the reconstruction, an additional filtering step is performed to improve the accuracy of the interpolation.
Apart from differences in backprojectors, different implementations also vary in the way they perform the filtering operation in analytical algorithms. Filtering can be performed as a convolution in real space or as a multiplication in Fourier space. Real-space filtering implementations can differ from each other in computational conventions, for example by the type of padding used (Marone & Stampanoni, 2012) to extend the signal at the boundary of the detector. Moreover, the zero-frequency filter component is treated in different ways between implementations. For example, the Gridrec implementation in TomoPy sets the zero-frequency component of the filter to zero.

research papers 3. Implementation-adapted filters
We now present the main contribution of our paper. In order to mitigate the differences between implementations discussed in the previous section, we propose to specifically tune the filter h for each implemented analytical algorithm. In the following, we describe an optimization scheme for the filter, which helps us to reduce the differences between reconstructions from various implementations.
We optimize the filter by minimizing the ' 2 difference with respect to the projection data p. This can be stated as the following optimization problem over filters h, where r I is the reconstruction from implementation I. Note that the forward projector W used above is chosen as a fixed operator in our method (the same for each implementation for which the filter is optimized) and does not have to be the transpose of the implementation-specific backprojection operator W T I . In order to improve stability and take additional prior knowledge of the scanned object into account, a regularization term can be added to the objective in (5).
The solution to the optimization problem above is an implementation-adapted filter h Ã I . Once the filter has been computed, it can be used in (4) to give an optimized reconstruction, pÞ: Out of all reconstructions that an implemented algorithm can produce for a given dataset p by varying the filter, this reconstruction, r Ã I , is the one that results in the smallest residual error. Such filters are known as minimum-residual filters and have previously been proposed to improve reconstructions of real-space analytical algorithms in low-dose settings (Pelt & Batenburg, 2014;Lagerwerf et al., 2020a).
Our implementation-adapted filters are thus minimumresidual filters that have been optimized to each implementation I. The main difference between the previous works (Pelt & Batenburg, 2014;Lagerwerf et al., 2020a) and our present study is that we use a fixed forward operator in our optimization problem, which is not necessarily the transpose of the backprojection operator. More importantly, our goal in this paper is not the improvement of reconstruction accuracy, but the reduction of differences in reconstruction between various software implementations.
We hypothesize that such minimum-residual reconstructions obtained using different implementations are closer (quantitatively more similar) to each other than reconstructions obtained using standard filters. As an example for motivating this choice, let us take an implementation of an analytical algorithm from both TomoPy and the ASTRA toolbox. Given a certain dataset, changing the reconstruction filter results in different reconstructed images, each with a different residual error. Even though the implementations used by TomoPy and ASTRA are fixed, the freedom in choosing a filter gives us an opportunity to reduce the difference between reconstructions from both implementations.
Tuning the filter is a way to optimize the reconstruction according to user-selected quality criteria. Choosing the minimum-residual reconstruction for each implementation results in reconstructions that are the closest possible to each other in terms of data misfit. Closeness in data misfit, under convexity assumptions, indicates closeness in pixel intensity values of reconstruction images. Hence, the minimum-residual reconstructions for the two implementations are closer to each other than reconstructions with standard filters offered by the implementations.
To compute the optimized filter (5), we use the fact that the reconstruction r I ðh; pÞ of data p obtained from an implementation of FBP or Gridrec is linear in the filter h. This means that we can write the reconstruction as where R I ðpÞ is the reconstruction matrix of implementation I given projection data p. Thus, the optimization problem (8) becomes The matrix F I ðpÞ has dimensions N p Â N f , where N p is the size of projection data and N f is the number of filter components. For a filter that is independent of projection angle, the number of filter components, N f , is equal to the number of discrete detector pixels, N d . The projection size N p := N d N , where N is the number of projection angles. F I ðpÞ can be constructed explicitly by assuming a basis for filter components. A canonical basis can be formed using N d unit vectors fe i ; i = 1; 2; . . . ; N d g, such that Using these basis filters, each column of F I ðpÞ can be computed by reconstructing p using the implementation I, followed by forward projection with W, f j ¼ Wr I ðe j ; pÞ; j 2 f1; 2; . . . ; N d g; We can then substitute for F I ðpÞ in (6) and solve for the optimized filter h Ã I . Note that our method only requires evaluations of the implementation I by using it as a black-box routine to compute the reconstructions r I ðe j ; pÞ above. In other words, no knowledge of the implementation I or any internal coding is required.
If we expand the filter in a basis of unit vectors, OðN p Þ reconstructions using the implementation I and OðN p Þ forward projections with W must be performed for filter optimization. In contrast, the complexity of a standard FBP reconstruction is of the order of a single backprojection. Choosing a smaller set of suitable basis functions would result in a reduction in the number of operations for filter optimi-research papers zation and, consequently, faster filter computations. One way to do this is by exponential binning (Pelt & Batenburg, 2014).
The idea of exponential binning is to assume that the realspace filter is a piecewise constant function with N b bins, where N b < N d . The bin width w i for i = 1, 2, . . . , N b is assumed to increase in an exponential fashion away from the centre of the detector, such that where N l is the number of large bins with width 1. Exponential binning is inspired by the observation that standard filters used in tomographic reconstruction, such as the Ram-Lak filter, are peaked at the centre of the detector and decay to zero relatively quickly towards the edges. Binning results in a reduction of free filter components from N d to N b . Moreover, despite the reduction in components, it does not typically result in a significant change in reconstruction quality (Pelt & Batenburg, 2014). The pseudocode for our filter computation method is shown in Algorithm 1 (see Fig. 3). Here we give further details of the routines used in the algorithm. The filter routine performs filtering in the Fourier domain, which is equivalent to multiplication by the filter followed by an inverse Fourier transform. The reconstruct I routine calls the function for reconstruction in implementation I with the internal filtering disabled. Finally, the lstsq routine calls a standard linear least-squares solver in NumPy (Harris et al., 2020) to compute filter coefficients.
Once a filter h Ã is computed, we can store it in memory, either as a filter in Fourier space or as a filter in real space after computing the Fourier transform of h Ã . Using the filter with a black-box software package involves calling the filter routine with the data and the computed filter as arguments, followed by one call of the reconstruct I routine in a chosen algorithm (with its internal filtering disabled). Thus, the complexity of a reconstruction using a computed implementation-adapted filter is the same as that of a reconstruction run using a standard filter.
In the following sections, we describe numerical experiments and the results of filter optimization on reconstructions.

Data and metrics
We performed a range of numerical experiments on real and simulated data to quantitatively assess (i) the effect of our proposed optimized filters on the variations between reconstructions from different implementations; (ii) the behaviour and dependence of our proposed filters on acquisition characteristics such as noise and sparse angular sampling; and (iii) the effect of our proposed filters on post-processing steps following the reconstruction block in Fig 1. In this section, we describe the software implementations used, data generation steps and the metric used to quantify intra-set variability of reconstructions.

Software implementations of analytical algorithms
We optimized filters to commonly used software implementations of FBP and Gridrec. For FBP, we considered different projector implementations in the ASTRA toolbox (Palenstijn et al., 2013) as well as the iradon backprojection function in scikit-image (Van der Walt et al., 2014). These implementations use different choices of volume and ray discretization as well as numerical integration schemes. From the ASTRA toolbox, we considered projectors implemented on the CPU (strip, line and linear) as well as a pixel-driven kernel on the GPU (gpu-pixel, called cuda in the ASTRA toolbox). For Fourier-space methods, we considered the Gridrec implementation in TomoPy. We used the ASTRA strip kernel as the forward projector W in (5) during filter computations.

Projection data
We performed experiments with both simulated and real data. Both data consisted of projections acquired in a parallelbeam geometry along a complete angular range in [0, ).
4.2.1. Simulated foam phantom data. Simulated data of foam-like phantoms were generated using the foam_ct_ phantom package in Python. This package generates 3D volumes of foam-like phantoms by removing, at random, a pre-specified number of non-overlapping spheres from a cylinder of a given material . The simulated phantoms are representative of real foam samples used in tomographic experiments and are challenging to reconstruct due to the presence of features at different length scales. At the same time, the phantoms are amenable to experimentation as data in different acquisition settings can be easily generated. Slices of one such phantom, which we used for the experiments in this paper, are shown in Figs. 4 and 6.
Ray tracing through the volume is used to generate projection data from a 3D foam phantom. To simulate realworld experimental setups, where detector pixels have a finite area, ray supersampling can be used. This amounts to averaging the contribution of n neighbouring rays within a single pixel, where n is called the supersampling factor.
For our experiments, we generated a 3D foam with 1000 non-overlapping spheres with varying radii. A parallel beam projection geometry, in line with synchrotron setups, was used to generate projection data. We used ray supersampling with a  Algorithm 1 -implementation-adapted filter computation.
supersampling factor of 4, and each 2D projection was discretized on a pixel grid of size 256 Â 256. We varied the number of projection angles, N , in our experiments in order to determine the effect of sparse sampling ranges on our filters.
Poisson noise was added to noiseless data by using the astra.add_noise_to_sino function in the ASTRA toolbox (Palenstijn et al., 2013). This function requires the user to specify a value for the photon flux I 0 . In an image corrupted with Poisson noise, each pixel intensity value k is drawn from a Poisson distribution, with / I 0 . High photon counts (and high values of ) correspond to low noise settings. All noise realizations in our experiments were generated with a pre-specified random seed. 4.2.2. Real data of shale. In order to validate the applicability of our method to real data, we performed numerical experiments using microCT data of the round-robin shale sample N1 from the tomographic data repository Tomobank . We used data acquired at the Advanced Photon Source for our experiments. The round-robin datasets were acquired for characterizing the porosity and microstructures of shale, and the same sample has been imaged at different synchrotrons (using the same experimental settings) for comparison of results (Kanitpanyacharoen et al., 2013). The dataset we used was acquired with a 10Â objective lens and had an effective pixel size of approximately 0.7 mm. Each projection in the dataset had pixel dimensions 2048 Â 2048, and data were acquired over 1500 projection angles. In order to simulate sparse angular range settings, we removed projections at intervals of m = 2, 3, 4, 5 and 10 from the complete data.

Quantitative metrics
Reconstructions of a 3D volume from parallel beam data can be done slice-wise, because data in different slices (along the rotation axis) are independent of each other in a parallel beam geometry. Therefore, all our quantitative metrics were computed on individual slices. Reconstructed slices of the simulated foam phantom were discretized on a pixel grid of size 256 Â 256. Reconstruction slices of the round-robin dataset were discretized on a pixel grid of size 2048 Â 2048. All CPU reconstructions were performed on an Intel(R) Core(TM) i7-8700K CPU with 12 cores. GPU reconstructions were performed on a single Nvidia GeForce GTX 1070 Ti GPU with CUDA version 10.0.
We were interested in comparing the similarity between reconstructions in a set of images, without having a reference reconstruction. We quantified the intra-set variability between reconstruction slices obtained from different implementations using the pixelwise standard deviation between these. For a set of reconstruction slices fr I ; I 2 Ig obtained using different implementations I, the standard deviation of a pixel j is given by where (r I ) j is the intensity value of pixel j in reconstruction r I and N I is the total number of implementations.
In our experiments, we reconstructed the same data using our set of implementations fI 2 Ig, by using the Ram-Lak filter and the Shepp-Logan filter as defined in different packages, and then by using filters fh Ã I ; I 2 Ig (5) that were optimized to those implementations. As a result, we achieved three sets of reconstructions: one set using the Ram-Lak filter, a second set using the Shepp-Logan filter and a third set using the implementation-adapted filters. We computed the pixelwise standard deviation (8) over slices for all sets.
The mean standard deviation of a slice S (with dimensions N Â N) is defined as the mean of pixelwise standard deviations in that slice, where J S is the list of pixels in slice S. In addition to the mean, the histogram of standard deviations (8) provides important information about the distribution of standard deviation values in a slice. The mode of this histogram is the value of standard deviation that occurs most, and the tail of the histogram indicates the number of large standard deviations observed. For reconstructions that are more similar to each other, we would expect the histogram to be peaked at a value close to 0 and have a small tail.
In order to quantify the difference between a reconstruction slice and the ground truth (in experiments where a ground truth was available), we used the root mean squared error (RMSE) given by where r gt is the ground truth reconstruction. For a set of reconstructions we used the squared bias defined below to quantify the difference with respect to the ground truth, h bias ÀÈ where " r r := P I 2 I ð1=N I Þ r I is the mean over the set of reconstructions. The squared bias, similar to the standard deviation in (8), is a pixelwise measure. The mean squared bias over a slice S is obtained by taking the mean of (11) over all pixels in the slice.
In our experiments, we also quantify the effect of filter optimization on later post-processing steps after reconstruction. To do this, we threshold a set of reconstructions using Otsu's method (Otsu, 1979), which picks a single threshold to maximize the variance in intensity between binary classes. To quantify the accuracy of the resulting segmentations and to compare the similarity in a set we used two standard metrics for segmentation analysis: the F 1 score and the Jaccard index. The F 1 score takes into account false positives (fp), true research papers positives (tp) and false negatives (fn) in binary segmentation and is given by The Jaccard index is the ratio between the intersection and union of two sets A and B. In our case, one set is the segmented binary image and the other set is the binary ground truth image,

Numerical experiments and results
In this section, we give details of our numerical experiments and discuss their results.

Foam phantom data
5.1.1. Reduction in differences between reconstructions. Figure 4 shows the central (ground truth) slice of the foam phantom. Data along N = 32 angles were reconstructed using all implementations using the Ram-Lak filter, the Shepp-Logan filter and our implementation-adapted filters. Recon-structions using the various filters are shown in Fig. 4. In order to highlight intra-set variability, we include heatmaps showing the absolute difference with respect to one (strip) reconstruction. Upon visual inspection, we see that discrepancies between reconstructions are smaller in the set obtained using implementation-adapted filters. An interesting point to note is that the Gridrec and iradon reconstructions show the largest differences from the ASTRA strip kernel reconstruction in both sets. This suggests that differences between different software packages are greater than differences between different projectors in the same software package.
To further investigate intra-set variability, we use pixelwise standard deviation maps for all sets of reconstructions. We observe that higher values of standard deviation are observed when using the Ram-Lak and Shepp-Logan filters. This indicates that quantitative differences between these reconstructions were more pronounced. In contrast, reconstructions using our implementation-adapted filters were more similar, resulting in low pixelwise standard deviations. Furthermore, the mode of the histogram of standard deviations (in the central slice) is shifted closer to zero for reconstructions with our filters, and the tail of the histogram is shorter. This highlights the fact that the maximum standard deviation between reconstructions with our filters is smaller than the maximum standard deviation in reconstructions with the Shepp-Logan or Ram-Lak filters.

Dependence of filters on noise and sparse angular
sampling. We consider the effect of noise and sparse sampling on our filters. For the central slice of the foam phantom shown in Fig. 4, we generated data by varying the number of projection angles N and the photon flux I 0 . For each of these settings, we computed the mean standard deviation (9) between reconstruction slices. Our results are shown in Fig. 5. For all noise and angular sampling settings, the mean standard deviation in the slice was reduced by using implementationadapted filters, with the difference being particularly prominent for noisy and smaller angular sampling settings. Shepp-Logan filter reconstructions had smaller mean standard deviation compared with Ram-Lak filter reconstructions, except in situations where many angles (N ! 256) were used. In the high angle regime, reconstructions using the Ram-Lak filter have a relatively small number of artefacts and improvements due to filter optimization are modest.
We also quantified the mean squared bias and the mean RMSE with respect to the ground truth for this slice. From these plots, we observe that reconstructions using implementation-adapted filters have lower mean squared bias and mean RMSE compared with those for reconstructions with standard filters. High noise (low I 0 ) and sparse angular sampling settings result in an increase in bias and RMSE for all filter types. However, the increase is sharper for the Shepp-Logan and Ram-Lak filters than for our implementationadapted filters. For every noise setting, the Ram-Lak filter results in the worst reconstructions in terms of bias and RMSE. Although both bias and RMSE increase as the number of projection angles is reduced in the noise-free setting, we observe a reduction in mean standard deviation for recon-structions using implementation-adapted filters. This suggests that in spite of a reduction in mean standard deviation due to effective suppression of high frequencies, the reconstructions produced by our implementation-adapted filters in this regime are incapable of mitigating the large number of low-angle artefacts. In effect, these settings show a limit where optimization of a linear filter is not sufficient for good reconstructions, and intra-set homogeneity is achieved at the expense of an increase in bias and RMSE.
In addition, we also show the shapes of the filters (computed for the strip kernel in the ASTRA toolbox) as a function of noise and angular sampling. As the number of projection angles is increased, the shape of implementationadapted filters approaches that of the ramp filter. In these regimes, reconstructions obtained using the Ram-Lak filter and the Shepp-Logan filter are nearly identical in terms of bias and RMSE. For different noise settings, the filters only vary at certain frequencies. It is possible that these frequencies are indicative of the main features in the foam phantom slice used.
5.1.3. Variation of filters with projection data. In order to understand how our filters change with changes in the data, we computed filters for all slices of our simulated foam phantom. Two such slices are shown in Fig. 6. These slices, although visually similar, have different features. Implementationadapted filters for all 256 slices of the foam phantom are shown in Fig. 6.
In order to study the applicability of the central slice filter to other slices, we performed the following experiment. First, we reconstructed all slices using the slice-specific filters, i.e. filters that had been optimized for each individual slice using Implementation-adapted filters for noisy and sparsely sampled data. (Top, left to right) Mean standard deviations " S for slice S = 128 as a function of the number of projection angles N , mean value of the squared bias, mean value of RMSE with respect to the ground truth slice, and optimized filters in Fourier space. (Bottom, left to right) Mean standard deviations in S = 128 as a function of photon flux I 0 (higher values of I 0 correspond to lower noise levels) using N = 64, mean value of the squared bias, mean value of RMSE with respect to the ground truth slice, and optimized filters in Fourier space. different implementations. Next, we reconstructed all slices with the central slice filter. As a baseline, all slices were also reconstructed using the Shepp-Logan filter. Pixelwise standard deviations (8) were computed for all pixels in the foam phantom volume for the three cases. The scatter plot in Fig. 6 shows that the pixelwise standard deviations with the central slice filter are nearly the same as those with the slice-specific filters. In fact, these points lie on a line with slope nearly equal to one. This indicates that using the central slice filter results in an equivalent reduction in differences between reconstructions as slice-specific filters. In contrast, the pixelwise standard deviations using the Shepp-Logan filter are, for a majority of pixels, larger than those obtained using slice-specific filters. This suggests that, for a majority of pixels in the reconstruction volume, smaller values of standard deviation are observed after filter optimization.
Our experiment suggests that using the central slice filter for all slices of the foam phantom results in an equivalent reduction in standard deviation as slice-specific filters. This paves the way to fast application of such filters in a real dataset. An implementation-adapted filter computed for one slice of such a dataset could be reused with all other slices with no additional computational cost, just like any of the standard filters in a software package.
5.1.4. Reduction in differences after thresholding. We investigated the effect of our filters on the results of a simple post-processing step. We reconstructed data (N = 32, no noise) from the central slice of the foam phantom and used Otsu's method in scikit-image (Van der Walt et al., 2014) to threshold reconstruction slices from different implementations. In Fig. 7, we show two sets of thresholded reconstructions, one obtained using the Shepp-Logan filter and the other obtained using our implementation-adapted filters. We show values for the Otsu threshold t, the F 1 score with respect to the ground truth slice and the Jaccard index in the figure. We used routines in scikit-learn (Pedregosa et al., 2011) to compute all segmentation metrics. For the set of Shepp-Logan filter reconstructions, the ranges of threshold values (0.32-0.36), F 1 scores (0.63-0.71) and Jaccard indices (0.46-0.55) were larger than the corresponding ranges for the implementationadapted filter reconstructions. For the latter set, the Otsu threshold varied between 0.32 and 0.33 for all reconstructions. The F 1 scores were between 0.81 and 0.83, and the Jaccard indices were in the range 0.69-0.72. Upon visual inspection of the zoomed-in insets we find greater differences between thresholded reconstructions in the set of Shepp-Logan filter reconstructions. These results suggest that post-processing steps such as segmentation may be rendered more reproducible and amenable to automation if reconstructions are obtained using implementation-adapted filters.
5.1.5. Optimizing to a reference reconstruction. Although we focus on filter optimization in sinogram space in this paper, a related optimization problem is one where reconstruction results from different implementations are optimized to a reference reconstruction. This type of optimization might be useful when the result of one specific implementation is preferred due to its superior accuracy and when the exact settings used with this algorithm are unknown.
In some cases, high-quality reconstructions might be computed with an unknown (possibly in-house) software package during the experiment by expert beamline scientists. When users reconstruct this data later at their home institutes, it might not be possible to use the same software packages with identical settings. Our approach would enable users to  Variation of filters with projection data. (Top) Two slices of a simulated foam phantom with differences in features. (Bottom left) Implementationadapted filters for all slices of the foam phantom (slice-specific filters). Central slice (slice No. 128) filters for each implementation are indicated with bold lines. (Bottom right) Scatter plot of pixelwise standard deviations using slice-specific filters, the central slice filter and the Shepp-Logan filter. Standard deviations using the central slice filter are almost the same as those using slice-specific filters (orange dots). These points lie on a straight line (shown in black) with slope $1 and intercept $0. In contrast, standard deviations using the Shepp-Logan filter are higher than those using slice-specific filters (blue dots) for most pixels.
reduce the difference between their reconstructions and the high-quality reference reconstructions.
Optimization in reconstruction space can be performed by modifying the objective in (5), where r ref is the reference reconstruction.
To illustrate filter optimization in reconstruction space, we performed the following experiment. Using the strip kernel reconstruction (with the Shepp-Logan filter) as a reference, we computed optimized filters for two other implementations (ASTRA line kernel and TomoPy Gridrec) for reconstructing the central slice of the foam phantom. Subsequently, we reconstructed the sinogram with the Shepp-Logan filter and our filters. These reconstructions are shown in the top row of Fig. 8. To quantify similarity with the reference reconstruction, we computed the pixelwise absolute difference between each reconstruction and the reference as well as the RMSE using the reference as ground truth, which we denote as RMSE r . For both line and Gridrec backprojectors, optimizing the filter to a reference reconstruction reduced the RMSE r and absolute difference. As a further test, we applied the filters computed for this slice to a different slice of the foam phantom, which did not have any overlaps with the slice used to compute the filters. For this test slice, we again observed the reduction in RMSE r and absolute error, suggesting that our filters were able to bring the resulting reconstructions closer to the reference reconstruction. Figure 9 shows the results of our method on the central slice (slice No. 896) of the round-robin dataset N1. These reconstructions were performed by discarding every second projection from the entire dataset. From the heatmaps of absolute difference with respect to the strip kernel reconstruction, we observe that intra-set differences are reduced by using implementation-adapted filters. This is further shown by the pixelwise standard deviation maps. Standard deviations between reconstructions using the Ram-Lak and Shepp-Logan filters are larger than those between reconstructions using implementation-adapted filters. Similar to the distributions in Fig. 4, we see that our implementation-adapted filters are able to shift the mode of the histogram of standard deviations towards zero and to reduce the number of large standard deviations in the slice. We also observe that the Ram-Lak filter reconstructions show higher standard deviations than the Shepp-Logan filter reconstructions.

Round-robin data
We also studied the effect of the number of projections used on the mean standard deviation (9) in this slice. To do this, we performed experiments with the whole dataset and also with parts of the data, where every 2, 3, 4, 5 and 10 projections were discarded. For each instance, the data were reconstructed using the Ram-Lak filter, the Shepp-Logan filter and our implementation-adapted filters. The plot of mean standard deviations is shown in Fig. 9. For all projection numbers, filter optimization reduced the mean standard deviation in the slice. The difference was smaller for higher projection numbers, indicating that our filters are especially useful in improving reproducibility of reconstructions when the number of projection angles is small. In practice, data along few angles may be acquired to reduce the X-ray dose on a sample or to speed up acquisition when the sample is evolving over time.

Discussion
In this paper, we have presented a method to improve the reproducibility of reconstructions in the synchrotron pipeline. Differences after thresholding using Otsu's method. Reconstructions shown in Fig. 4 were used as input to the thresholding routine. (Top row) Thresholded reconstructions obtained using different backprojector implementations and the Shepp-Logan filter. Corresponding Otsu thresholds t, F 1 scores and Jaccard indices are given for each image. (Bottom row) Thresholded reconstructions obtained using implementation-adapted filters.

Figure 10
Reconstructions of data corrupted with zingers showing an example where the Shepp-Logan filter reconstruction and corresponding segmentation are better than those using an implementation-adapted filter or an iterative method (SIRT). (Top row) Reconstructions of data from slice 128 (N = 512, no noise) corrupted with zingers. Zingers are more prominent in the reconstruction using an implementation-adapted filter and in the SIRT reconstruction (after 800 iterations). (Bottom row) Segmentations using Otsu's method of all three reconstructions. The Otsu threshold, F 1 score and Jaccard index for each image is given below. during the scanning process. In addition, beam-sensitive samples might deform due to irradiation. Such changes lead to differences in reconstructions that are similar to the differences due to software implementations, albeit less structured than those shown in Fig. 2. To improve hardware reproducibility, controlled phantom experiments might be performed to address differences in data acquisition. Finally, software and hardware solutions can be effectively linked by using approaches like reinforcement learning for experimental design and control (Recht, 2019;Kain et al., 2020). Such creative solutions might provide an efficient way for synchrotron users to perform reproducible experiments in the future.

Conclusion
In this paper, we proposed a filter optimization method to improve reproducibility of tomographic reconstructions at synchrotrons. These implementation-adapted filters can be computed for any black-box software implementation by using only evaluations of the corresponding reconstruction routine. We numerically demonstrated the properties of and use cases for such filters. In both real and simulated data, our implementation-adapted filters reduced the standard deviation between reconstructions from various software implementations of reconstruction algorithms. The reduction in standard deviation was especially evident when the data were noisy or sparsely sampled.
Our filter optimization technique can be used to reduce the effect of differences in discretization and interpolation in commonly used software packages and is a key building block towards improving reproducibility throughout the synchrotron pipeline. We make available the open-source Python code for our method, allowing synchrotron users to obtain reconstructions that are more comparable and reproducible.