computer programs
THORONDOR: a software for fast treatment and analysis of low-energy data
aDepartment of Chemistry, INSTM Reference Center and NIS and CrisDi Interdepartmental Centers, University of Torino, Via P. Giuria 7, Torino 10125, Italy, bThe Smart Materials Research Center, Southern Federal University, Sladkova 178/24, Rostov-on-Don 344090, Russian Federation, and cCNR-IOM, TASC Laboratory, SS 14 km 163.5, Trieste 34149, Italy
*Correspondence e-mail: david.simonne@unito.it, andrea.martini@unito.it
THORONDOR is a data treatment software with a graphical user interface (GUI) accessible via the browser-based Jupyter notebook framework. It aims to provide an interactive and user-friendly tool for the analysis of NEXAFS spectra collected during in situ experiments. The program allows on-the-fly representation and quick correction of large datasets from single or multiple experiments. In particular, it makes it possible to align several spectral profiles in energy on the basis of user-defined references. Various techniques for background subtraction and signal normalization are available. In this context, an innovation of this GUI is a slider-based approach that lets the user manipulate and visualize processed data instantly. Finally, the program includes an advanced fitting toolbox based on the lmfit package. It offers a large selection of fitting routines as well as different peak distributions and empirical step edges, which can be used to fit the NEXAFS rising-edge peaks. Statistical parameters describing the goodness of a fit, such as χ² or the R-factor, together with the parameter uncertainty distributions and the related correlations, can be extracted for each chosen model.
Keywords: NEXAFS; in situ measurements; data treatment; peak fitting; graphical user interface; Python.
1. Introduction
X-ray absorption spectroscopy (XAS) is a powerful tool for the characterization of a large variety of materials thanks to its chemical selectivity and high sensitivity in determining interatomic distances. Moreover, this technique can simultaneously provide information on the electronic and local structural properties of systems under study, clarifying the relationship between their atomic/electronic structure and their physicochemical properties (Mino et al., 2013). These facts render this technique powerful to study surface/interface phenomena such as those found in fuel cells or batteries (Guda et al., 2019; Lassalle-Kaiser et al., 2017). In these contexts, the usage of soft X-rays below 2.0 keV is extremely useful to study the oxidation state and the coordination geometry of both light elements (at K-edges) and transition metals (at e.g. L-edges), which play a fundamental role in these fields (Tamenori, 2013).
In the soft X-ray energy regime, the high X-ray absorption coefficients often make it necessary to work in low-pressure environments (Stöhr, 1992). Although high-vacuum conditions produce an ideally clean environment for the sample under study, a multitude of chemically relevant phenomena take place only under ambient pressure (Castan-Guerrero et al., 2018; Escudero et al., 2013). In an effort to bridge the pressure gap in this context, different gas and liquid cells were designed in recent years, enabling soft X-ray studies of different reactions under in situ conditions (Blum et al., 2009; Castan-Guerrero et al., 2018; Escudero et al., 2013; Forsberg et al., 2007; Fuchs et al., 2008; Guo & Luo, 2010; Hävecker et al., 1999; Knop-Gericke et al., 1998; Tamenori, 2013; Tokushima et al., 2009; Zheng et al., 2011; Beaumont, 2020). In general, their design implies that the X-ray beam penetrates the reaction volume through an Si3N4 membrane a few tens of nanometres thick (Castan-Guerrero et al., 2018). These membranes have sufficient mechanical resistance to the difference in pressure between the vacuum of the chamber, where the cell is situated, and the gas environment inside it, at atmospheric pressure (Escudero et al., 2013). Because of the high yield of the photoelectric effect in the soft X-ray range and pushed by the experimental simplicity, the so-called total electron yield (TEY) measured by the replacement current (or drain current) has emerged as the most popular approach to perform XAS in the soft X-ray range (below 2 keV).
This technique combines surface sensitivity, resulting from the short escape depth of photoelectrons in this energy range, and the practical advantage of minimizing the alignment problems with the detector (Escudero et al., 2013). The standard approach to acquiring a spectrum consists of moving the monochromator with a discrete step, recording the TEY intensity at the selected energy, and repeating this operation for the entire energy range of interest. Recently, the experimental practice has been improved by continuously scanning the grating monochromator through the desired energy range (and sometimes also the undulator gap) while collecting the signal in streaming mode. This last methodology, sometimes known as a `fast-scan' or an `on-the-fly scan', significantly improves the time resolution of the NEXAFS (near-edge X-ray absorption fine structure) measurements, allowing the user to follow different dynamic processes under in situ conditions (e.g. chemical reactions) (Castan-Guerrero et al., 2018). In line with the work by Stöhr (1992), we will use the term NEXAFS for soft X-ray absorption spectra (with an energy edge lower than 2 keV), and XANES (X-ray absorption near-edge structure) will be used to indicate absorption spectra referring to hard X-rays.
Although several software packages have been developed for the analysis of hard X-ray absorption data [e.g. GNXAS (Filipponi & DiCicco, 1995; Hatada et al., 2016), ATHENA (Ravel & Newville, 2005), VIPER (Klementev, 2001), EDAXAFS (Kuzmin, 1995), SIXPACK (Webb, 2005)], only a few have been specifically designed for data treatment of soft X-ray absorption spectra, such as QANT (Gann et al., 2016), Blueprint XAS (Delgado-Jaime et al., 2010) and KKCalc (Watts, 2014). The critical features necessary for accurate and efficient treatment of NEXAFS data are a user-friendly interface, a fast and straightforward installation of the program on any machine, and a versatile range of functions covering the whole data treatment, from the subtraction of the background to the fit of the spectrum. THORONDOR was made with the aim of analysing multiple NEXAFS spectra, and of being flexible enough to manage data collected in conventional ultra-high-vacuum (UHV) measurements as well as during more challenging experiments under environmental conditions. Equipped with an intuitive graphical user interface (GUI), this program, developed in Python, allows fast data treatment and the visualization of several spectral profiles collected under different working conditions, ranging from UHV to ambient-pressure atmosphere. Similarly to PyFitIt (Martini et al., 2020), one of its strengths is the possibility to quickly perform conventional data-handling procedures, such as spectral background subtraction and normalization, using an approach based on sliders and cursors. A peak-fitting toolbox characterized by a wide variety of peak functions and ionization step potentials is also included for in-depth studies. It is worth noting that users can exploit different minimization algorithms to perform the fit of a defined NEXAFS spectrum and evaluate, using different statistical criteria, the quality of the chosen model and the uncertainties associated with the parameters retrieved by the fit.
THORONDOR has been designed principally for the analysis of TEY measurements. However, its multiple functions can also be applied to spectra collected using alternative detection modes such as fluorescence yield (FY).
The discussion in this article is organized as follows: after a description of the software design, we discuss how to properly handle and correct experimental spectra in order to obtain a set of reliable and comparable data. Then, we focus on the description of the NEXAFS peak-fit toolbox and the variety of fitting options it provides to the user.
THORONDOR is freely distributed and can be downloaded at the following page together with more information and practical examples about its usage: https://pypi.org/project/THORONDOR/.
2. Software structure
THORONDOR is based on two Python objects: the classes `Dataset' and `GUI'. During the initialization procedure, a new instance of the class GUI, containing only temporary information, is generated. Here, the user can provide several datasets as input to the GUI, as long as they focus on the same absorption edge.
The term `dataset' herein refers to the n columns contained in a single experimental datafile, saved directly from the beamline with a minimum of two columns: the incoming photon energy and the corresponding NEXAFS intensity. The remaining columns can contain supplementary data, such as the intensity of the incoming beam or the NEXAFS spectrum of a reference compound. At present, despite the important efforts of several scientists to define a common data exchange and archival format for X-ray experiments, named NeXus-NXxas (Könnecke et al., 2015), there is no well established protocol for soft X-ray absorption measurements describing how an output file containing raw data should be formatted. The number of columns characterizing a dataset thus varies depending on the beamline where the measurements are taken.
If the datasets do not possess the same exact energy range and/or number of points, all the contained spectra will automatically be interpolated on the common energy range, with a step fixed by the user.
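The interpolation step can be illustrated with a minimal sketch. It assumes that the `common energy range' is the overlap of the individual ranges and that a linear interpolation is used; the function name `interpolate_on_common_grid` is hypothetical and is not part of the THORONDOR API.

```python
import numpy as np

def interpolate_on_common_grid(spectra, step):
    """Resample (energy, intensity) pairs on one shared energy grid.

    Assumption: the common range is the overlap of all energy ranges.
    """
    e_min = max(e.min() for e, _ in spectra)
    e_max = min(e.max() for e, _ in spectra)
    # Number of whole steps that fit inside the overlap
    n_steps = int(np.floor((e_max - e_min) / step + 1e-9))
    grid = e_min + step * np.arange(n_steps + 1)
    resampled = []
    for e, mu in spectra:
        order = np.argsort(e)  # np.interp needs increasing abscissae
        resampled.append(np.interp(grid, e[order], mu[order]))
    return grid, np.array(resampled)

# Two hypothetical scans with different ranges and numbers of points
e_a = np.linspace(280.0, 300.0, 41)
e_b = np.linspace(282.0, 305.0, 24)
grid, out = interpolate_on_common_grid([(e_a, 2 * e_a), (e_b, 2 * e_b)], 0.5)
```

After this call every row of `out` shares the energy axis `grid`, so the spectra can be compared point by point.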
Once one or more datasets have been selected, a new instance of the Dataset class, having the raw data as its first attribute, is created for each of them (see Fig. 1). If a logbook was compiled during the experiment and saved in .xlsx format (a common Excel file), it can be imported into the program too. Specific experimental parameters, such as the temperature, can then be extracted from it, saved as class attributes and used by the program. This method drastically simplifies the data analysis procedure that every scientist needs to follow after an experiment, allowing rapid visualization and manipulation of several datasets simultaneously. The Pandas package (McKinney, 2010) is employed to transform any common format of data into a DataFrame: a Python object that allows fast manipulation and visualization of the data as an array [provided by the NumPy package (Oliphant, 2006)].
It is worth noting that each new variable, parameter or model specific to one dataset will be automatically saved as an attribute of the associated class. Hence, the user can always come back to resume their work or to alternate between different datasets without losing progress.
The THORONDOR interface is based on the Jupyter widgets package (Perez & Granger, 2007). The GUI window is divided into multiple tabs, each built from ipywidgets.HBox and ipywidgets.VBox objects, which contain several widgets. The instance methods of the GUI class are used to perform the entire data analysis. They are controlled interactively through ipywidgets.interact, and their output is displayed in the notebook. The result is a user-friendly interface allowing quick data visualization, analysis and fitting in a Jupyter notebook environment (Kluyver et al., 2016). Each function can also be used outside the GUI by users who possess a deeper knowledge of Python and of the class-object functionalities. Finally, a documentation tab is provided in the GUI, along with extra information reported in the online repository.
2.1. Importing and handling raw data
The experimental datafiles, in .txt or .dat format, directly retrieved from the beamline, must be located inside a data folder, in the same directory as the Jupyter notebook working file, where the THORONDOR package is imported. It is assumed that the experimental files are stored in the same data folder and refer to the same energy edge. The spectral profiles produced by different experiments and belonging to different datasets can be processed only under the condition that they refer to the same absorbing element and that they share the same file architecture. As introduced in Section 2, the raw datafiles can be accompanied by a logbook from which the user can extract, through a filtering method provided by THORONDOR, specific experimental information associated with each dataset, such as the data collection temperature or the composition of the gas feed. These working parameters play a fundamental role in the gas X-ray absorbance correction (see Section 2.2.2) and in the analysis of the features.
To initialize the data treatment, the user needs to provide a name for each column of every dataset contained in the working directory. Each spectrum recorded during the experiment is imported into a pandas.DataFrame object under a specific column. This operation is performed using dropdown widgets. Once all the columns of a dataset have been renamed, the same nomenclature is applied directly to the columns of the remaining files. It is worth noting that THORONDOR requires that at least two columns for each dataset correspond to two specific channels: the photon energy (E) and the NEXAFS intensity (μ). The latter can be computed as the ratio of the intensity of the signal coming from the sample (Is) to the incident beam intensity (I0), see Fig. 2. The nature of I0 and Is clearly depends on the type of measurement. In the case of a TEY experiment, they consist of a current signal in the picoampere range (Castan-Guerrero et al., 2018). Recording the beam intensity before its incidence on the sample is necessary because the intensity of the beam is not constant over the energy range of the spectrum [due to the shape of the harmonic of the undulator and the transmission of the beamline optics (Stöhr, 1992)]. Moreover, the beam in the ring can vary in time (e.g. due to the top-up filling mode of modern synchrotrons). Thus, the division of the absorption signal from the sample by the beam intensity, typically measured on a fine wire mesh of some noble metal, removes those artefacts from the shape of μ and has become very popular among beamline experimental stations. In addition, if present as part of the experimental datafile, the user may also specify a reference column, which is useful for energy-alignment purposes (see Section 2.2.1), and a column containing the experimental uncertainties associated with the measurement, which can be used during the peak-fitting routine.
It is worth noting that the monitor mesh can sometimes be contaminated by elements that are also present in the sample under study. This problem is usually addressed by normalizing μ by the quantity μref, obtained as the ratio of the NEXAFS signal of a reference sample free of the target element, Iref, to the intensity of the beam collected on the mesh, I0 ref: μref = Iref/I0 ref. This approach, the `stable monitor method' (Watts et al., 2006), is enabled in THORONDOR by selecting the corresponding check box after the creation of the working datasets. Once this option is activated, the user can declare which columns of the dataset (i.e. Iref and I0 ref) are to be used to evaluate the μref spectrum. Afterwards the new normalized spectrum μS = μ/μref is added to the dataframe and can undergo further corrections.
In THORONDOR, the signals coming from channel Is are firstly normalized by I0, if such a procedure has not been performed beforehand, to produce μ. This procedure is the `first normalization' and the related spectral intensities will be indicated in the text as μ. At the end of this scaling process, each dataframe will possess an extra column containing the first normalized signal μ. A description of the signal background subtraction followed by a further data normalization is provided in Section 2.2.4. Finally, the plotting window tool of THORONDOR allows the user to graphically represent the information contained in each dataframe. Each NEXAFS spectrum can be plotted individually or together with the other signals collected during an experiment simultaneously. Herein, in order to gain a better visualization, the colour of each spectrum can be personalized by the user together with the energy range of plotting. A sketch of the program window is shown in Fig. 3.
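The column handling and the first normalization can be sketched with pandas as follows. The file content, the column names and the helper `first_normalization` are hypothetical, and the same monitor column is assumed to serve as I0 ref; this illustrates the arithmetic only, not the THORONDOR implementation.

```python
import io
import pandas as pd

# Hypothetical four-column datafile: energy, I0, Is and a reference channel
raw_text = """\
530.0 2.0e-10 1.0e-10 4.0e-11
530.5 2.1e-10 1.2e-10 4.2e-11
531.0 1.9e-10 1.5e-10 3.8e-11
"""

def first_normalization(df, sample="Is", monitor="I0", out="mu"):
    """Return a copy of df with the extra column out = sample / monitor."""
    result = df.copy()
    result[out] = result[sample] / result[monitor]
    return result

# Name the columns once, as done through the dropdown widgets in the GUI
df = pd.read_csv(io.StringIO(raw_text), sep=r"\s+", header=None,
                 names=["E", "I0", "Is", "Iref"])
df = first_normalization(df)

# Stable monitor method: mu_ref = Iref / I0_ref, then mu_S = mu / mu_ref
# (assumption: the same I0 column is used as I0_ref)
df = first_normalization(df, sample="Iref", monitor="I0", out="mu_ref")
df["mu_S"] = df["mu"] / df["mu_ref"]
```

Keeping every derived signal as an extra column of the same DataFrame mirrors the description above, where each dataframe gains a column after each correction.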
2.2. Data treatment
In general, an acquired NEXAFS scan requires some corrections in order to be converted from raw data to an interpretable spectrum. These spectral modifications in THORONDOR can be realized in four steps: (i) alignment of the measured spectrum to a determined reference and its subsequent calibration, (ii) removal of any glitches affecting the experimental data, (iii) membrane and gas transmittance correction, (iv) spectral background subtraction and `second normalization'. In the following sections, each of these steps and their implementation in the software will be described in detail.
2.2.1. Data energy alignment
It is quite common for monochromators to not retain a perfect energy calibration over the course of multiple measurements. It follows that, in some cases, there could be some drift or jump effects in energy within a range of a few electronvolts (Calvin, 2013). THORONDOR offers the possibility to align all datasets with respect to a common spectral feature.
If a reference spectrum of a well known compound (containing the same selected absorbing element) is collected simultaneously with the sample measurement during each scan, it can be used for the energy alignment procedure. The reference spectrum must be imported during the data-importing step as described in Section 2.1. Afterwards the user, by means of a cursor, can select the position of the same spectral feature in the reference spectrum of each dataset. This yields a list containing the position of the same feature, possibly slightly shifted, for each dataset.
Once this step has been completed, all the references are shifted by an energy equal to the difference between the position of their feature and that of the selected reference. The energy shift of each aligned reference is automatically applied to each spectrum of the corresponding dataset, which realizes their alignment. Finally, it may happen that the reference spectrum is not acquired during the measurement. In this case, the user can align the NEXAFS spectra over a feature belonging to the Is or μ channel.
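The alignment described above reduces to a subtraction of feature positions. A minimal sketch, in which the function names and the example positions are hypothetical:

```python
import numpy as np

def alignment_shifts(feature_positions, reference_index=0):
    """Energy shift (eV) to add to each dataset so that its reference
    feature coincides with that of the dataset chosen as reference."""
    positions = np.asarray(feature_positions, dtype=float)
    return positions[reference_index] - positions

def apply_shift(energy, shift):
    """Shift a whole energy axis by a constant amount."""
    return np.asarray(energy, dtype=float) + shift

# Cursor-selected positions of the same feature in three datasets
positions = [532.10, 532.35, 531.95]
shifts = alignment_shifts(positions)
aligned = [apply_shift([p], s)[0] for p, s in zip(positions, shifts)]
```

The same shift is then applied to every spectrum of the dataset it was derived from.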
2.2.2. Treating the effects of the window and gas X-ray absorption
Under UHV conditions, it is possible to measure the I0 impinging on the sample surface. This can be realized, for example, by measuring the TEY from a highly transparent metal grid intercepting a fraction of the incoming beam, typically located before the entrance of the experimental chamber (Castan-Guerrero et al., 2018). In the case of ambient-pressure measurements, this cannot be achieved because of the presence of both the cell membrane and the gas layer, which act as photon absorbers. However, considering these limitations, the intensity hitting the sample, I0 eff, can be estimated from a standard I0 measurement before the entrance to the reaction volume and from the transmittances of the window and gas slabs as follows,
$$I_0^{\mathrm{eff}} = I_0 \exp(-k_{\mathrm{w}} l_{\mathrm{w}}) \exp(-k_{\mathrm{g}} l_{\mathrm{g}}), \qquad (1)$$
where kw and lw together with kg and lg are the X-ray attenuation lengths and thickness of the membrane and of the gas, respectively. The attenuation length for an element in a given material (in the solid or gas state) is calculated as the product of the atomic density ρa by the atomic photo-absorption cross section σabs given by
$$\sigma_{\mathrm{abs}} = 2 r_0 \lambda f_{\mathrm{im}}, \qquad (2)$$
where r0 is the classical electron radius, λ is the X-ray wavelength and fim is the imaginary part of the atomic scattering factor of the element under analysis (Henke et al., 1993).
In THORONDOR, the attenuation length for the membrane refers to the Si3N4 compound and has been taken from the tabulated value in the work by Henke et al. (1993). The only free parameter that, in this case, can be managed by the user is the window thickness lw (in µm). Regarding the X-ray absorption phenomena caused by the gases inside the cell, the user can easily calculate the transmittance factor for any gas mixture with THORONDOR, see Fig. 4. In particular, given the working pressure p (in Pa) and the temperature T (in K) of one molecular component of an N-gas mixture, the related kg term used in equation (1) is derived using the following formula,
$$k_{\mathrm{g}} = \frac{p}{k_{\mathrm{B}} T} \sum_{j} h_j \sigma_{\mathrm{abs},j}, \qquad (3)$$
where hj is the stoichiometric index of the jth element composing the molecule and kB is the Boltzmann constant. Once recovered, the X-ray transmittances for each gas component are multiplied by their percentages, and their final product is then equal to the total gas-mixture transmittance. It is worth noting that the correction described is not suitable if the measurement is performed at the energy edge of the elements constituting the gas phase present in the cell.
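The chain of quantities described in this section can be sketched numerically. The sketch assumes that k is the product of number density and cross section (in inverse metres), so that each slab transmits exp(-k l), and that the gas behaves as an ideal gas. All function names and the numerical values of fim, of the window term and of the path lengths are hypothetical placeholders, not tabulated data.

```python
import numpy as np

R0 = 2.8179403262e-15     # classical electron radius (m)
KB = 1.380649e-23         # Boltzmann constant (J/K)
HC_EV_M = 1.239841984e-6  # h*c expressed in eV*m

def sigma_abs(energy_ev, f_im):
    """Atomic photo-absorption cross section (m^2): 2 * r0 * lambda * f_im."""
    wavelength = HC_EV_M / energy_ev
    return 2.0 * R0 * wavelength * f_im

def gas_attenuation(energy_ev, pressure_pa, temperature_k, composition):
    """k_g (1/m) for one molecular gas; composition = [(h_j, f_im_j), ...]."""
    number_density = pressure_pa / (KB * temperature_k)  # ideal gas law
    return number_density * sum(h * sigma_abs(energy_ev, f)
                                for h, f in composition)

def effective_i0(i0, k_w, l_w, k_g, l_g):
    """I0 reaching the sample behind a window and a gas slab."""
    return i0 * np.exp(-k_w * l_w) * np.exp(-k_g * l_g)

# Hypothetical diatomic gas with a placeholder f_im = 4.0 at 530 eV
k_g = gas_attenuation(530.0, 1.0e5, 300.0, [(2, 4.0)])
# Placeholder window term (1e6 1/m, 100 nm thick) and a 1 mm gas path
i0_eff = effective_i0(1.0, 1.0e6, 100e-9, k_g, 1.0e-3)
```

In practice the fim values would be read from tabulated scattering factors at each photon energy, which makes i0_eff an energy-dependent correction curve.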
2.2.3. Deglitching
At certain orientations, the diffraction peak being utilized by the monochromator can interfere with multiple reflections associated with another set of crystal planes (Calvin, 2013), resulting in a glitch in I0. Thermal (especially at high temperatures) and electrical noise can also cause some spikes in the Is signal. The presence of glitches can distort some fundamental procedures in the program such as the background subtraction and the spectral normalization (Calvin, 2013). In THORONDOR, it is possible to select, through a single slider, the energy region surrounding a glitch and to replace it with a set of points obtained using a spline interpolating function (linear, quadratic and cubic). This curve is generated considering a user-defined number of points, situated before and after the glitch, as shown in Fig. 5.
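A minimal sketch of such a replacement is given below, using scipy.interpolate.interp1d with the three interpolation kinds mentioned above. The function name `deglitch` and its arguments are hypothetical; the glitch is identified here by index rather than through a slider.

```python
import numpy as np
from scipy.interpolate import interp1d

def deglitch(energy, mu, start, stop, n_side=5, kind="cubic"):
    """Replace mu[start:stop] with values interpolated from n_side points
    on each side of the glitch (kind: 'linear', 'quadratic' or 'cubic')."""
    energy = np.asarray(energy, dtype=float)
    cleaned = np.array(mu, dtype=float)
    left = np.arange(max(start - n_side, 0), start)
    right = np.arange(stop, min(stop + n_side, len(energy)))
    support = np.concatenate([left, right])
    curve = interp1d(energy[support], cleaned[support], kind=kind)
    cleaned[start:stop] = curve(energy[start:stop])
    return cleaned

# Smooth synthetic signal with a three-point spike
energy = np.linspace(700.0, 720.0, 101)
mu = 0.01 * (energy - 700.0) ** 2
glitched = mu.copy()
glitched[50:53] += 5.0
cleaned = deglitch(energy, glitched, 50, 53)
```

The glitch must have valid points on both sides, since the interpolating curve is only defined between the first and last support points.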
2.2.4. Background subtraction and second normalization
The background removal procedure for spectra collected with hard X-rays in transmission mode (excluding phenomena of self-absorption) is well established and relatively easy. It aims to subtract a pre-edge background contribution μb, which is usually approximated by a spline function represented by a Victoreen polynomial curve p(E; a, b) = aE⁻³ + bE⁻⁴, whose coefficients (a, b) are obtained via least-squares methods (Klementev, 2001). Afterwards, the normalization is performed employing the scaling for the edge step defined by the following formula,
$$\mu_{\mathrm{norm}}(E) = \frac{\mu(E) - \mu_{\mathrm{b}}(E)}{\Delta\mu_0}, \qquad (4)$$
where μ(E) is the raw XANES spectrum, while the normalization constant Δμ0, shown in equation (4), is the edge-step parameter. This last term is computed as the difference between the pre-edge and post-edge curves (the latter approximated with a spline too) at the energy E0. This energy value is usually identified as the maximum of the first derivative of the XANES spectrum.
The application of the procedure to a NEXAFS spectrum may be problematic in some cases. A limitation of this method can be found when dealing with a spectrum that possesses a low edge or when two edges are situated close to each other, therefore limiting the pre- and post-edge energy ranges used to define the spline functions (see Fig. 6). The estimation of the edge jump will contain a larger uncertainty in this case.
Further problems can also emerge if the spectra have been acquired outside UHV conditions. In particular, the NEXAFS background can increase with the gas absorption of the X-ray beam and, at the same time, some signal features can be distorted if the gas concentration quickly changes during a spectrum acquisition (Castan-Guerrero et al., 2018).
Aside from these particular cases, the problem is caused by the electron detection mode that is so popular in the soft X-ray range. In fact, for one absorbed photon, n electrons are generated, a number that depends on many parameters that are not always constant over the energy range of the spectrum. This effect gives rise to slopes that are superimposed on the NEXAFS spectrum (often called background) and that alter the shape of the spectrum, thus making the extraction of the meaningful information difficult.
THORONDOR offers five different techniques which can be exploited to subtract the NEXAFS background. In the GUI, these methods are: Splines, Single spline, Polynomial curves, Asymmetric Least Squares and Chebyshev polynomials. The first method, Splines, was described earlier for hard X-rays. It is recommended only for those spectra which have been acquired in UHV or which refer to samples with a high concentration of the absorber element. The Single spline method is the fastest to execute and allows a quick visualization of the data quality during an experiment. The last three approaches are suitable for NEXAFS data characterized by a non-linear variation of the background and by an extremely small edge jump, similar to those reported in Fig. 6. Each of these techniques is described in detail in the following paragraphs.
In THORONDOR, for each method, the parameters regulating the generation of the background curves are completely accessible to the user through sliders. In particular, the program allows, for a user-defined energy range, the simultaneous visualization of the original (untreated) spectrum and the background-subtracted spectrum in two separate graphic windows. Once defined for a spectrum (e.g. μ) in a first dataset, the same background subtraction parameters can, if satisfactory, be applied to the other spectra of all the acquired datasets. This is an important feature of THORONDOR, since it quickly corrects the background for all datasets and allows a quick visualization of the corrected data.
In the case of a NEXAFS signal treated using the Splines method, the second normalization procedure is achieved using equation (4). For the other cases, the intensity of each NEXAFS point is divided by the total area under the background subtracted curve.
Splines method. The first step of this technique involves the identification of the absorption energy edge (E0) for the spectrum under analysis. This is carried out in the program by calculating the first-order derivative of the NEXAFS spectrum and taking the energy value of its maximum. The selection of the maximum of the derivative is done automatically by the program. However, the user has the possibility, through a cursor, to select a specific point of the derivative curve and save the related energy value as E0. Once the value of the edge energy position has been defined, the user can start to manipulate two sliders controlling the number of energy points situated in the pre-edge and post-edge part of the NEXAFS spectrum. These two sets of points are used to define the pre-edge and post-edge spline functions which are subsequently employed to remove the background and normalize the spectrum in accordance with equation (4). THORONDOR also offers different kinds of interpolating functions which can be used instead of the classical splines introduced in Section 2.2.4. These include linear, quadratic and cubic polynomial models which exploit the numpy.polyfit method (Oliphant, 2006). An example where this method is applied with success is shown in Fig. 7.
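The sequence of operations in the Splines method can be sketched with numpy.polyfit as follows. The function names are hypothetical, the pre- and post-edge curves are taken as polynomials of a chosen order, and the pre-edge curve is used as the background that is subtracted before scaling by the edge step.

```python
import numpy as np

def edge_energy(energy, mu):
    """E0 taken as the energy of the maximum of the first derivative."""
    return energy[np.argmax(np.gradient(mu, energy))]

def edge_step_normalize(energy, mu, n_pre, n_post, order=1):
    """Fit pre- and post-edge polynomials on the first n_pre and last
    n_post points, then subtract the pre-edge and scale by the edge step."""
    e0 = edge_energy(energy, mu)
    pre = np.polyfit(energy[:n_pre], mu[:n_pre], order)
    post = np.polyfit(energy[-n_post:], mu[-n_post:], order)
    step = np.polyval(post, e0) - np.polyval(pre, e0)
    return e0, (mu - np.polyval(pre, energy)) / step

# Synthetic edge at 540 eV on a sloping background
energy = np.linspace(520.0, 560.0, 401)
mu = (0.1 + 0.002 * (energy - 520.0)
      + 0.25 * (1.0 + np.tanh((energy - 540.0) / 0.5)))
e0, mu_norm = edge_step_normalize(energy, mu, n_pre=50, n_post=50)
```

In the GUI the two point counts correspond to the pre-edge and post-edge sliders, and E0 can be overridden with the cursor.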
Single spline method. This method can be used as an alternative normalization procedure for a NEXAFS spectrum whose background has been subtracted with the same kind of interpolating curves (splines or polynomials) employed in the Splines method described in the previous paragraph. Through a single slider, the user can select the number of points situated in the pre-edge of the NEXAFS spectrum. Once this step is completed, the range of points is fitted by a spline or a polynomial function, which is subsequently subtracted from the raw NEXAFS spectrum. Contrary to the `classic' Splines method, which foresees the edge-step normalization, this procedure is realized by scaling the background-subtracted NEXAFS spectrum to the magnitude of a point in the curve [e.g. the maximum peak intensity of the NEXAFS white line or a point corresponding to the maximum value of the energy range (Qayyum et al., 2013)], which is selected through the usage of a proper slider. A demonstrative representation of this approach is given in Fig. 8.
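A minimal sketch of this procedure, assuming a polynomial pre-edge curve and scaling to the maximum of the subtracted spectrum when no scaling point is given (the function name and arguments are hypothetical):

```python
import numpy as np

def single_curve_normalize(energy, mu, n_pre, order=1, scale_index=None):
    """Subtract a polynomial fitted on the first n_pre points, then scale
    the result to its value at scale_index (default: its maximum)."""
    coeffs = np.polyfit(energy[:n_pre], mu[:n_pre], order)
    subtracted = mu - np.polyval(coeffs, energy)
    if scale_index is None:
        scale_index = int(np.argmax(subtracted))
    return subtracted / subtracted[scale_index]

# Synthetic white line at 710 eV on a linear background
energy = np.linspace(700.0, 730.0, 301)
mu = (0.2 + 0.01 * (energy - 700.0)
      + 2.0 * np.exp(-0.5 * ((energy - 710.0) / 0.8) ** 2))
mu_norm = single_curve_normalize(energy, mu, n_pre=40)
```

Passing `scale_index=-1` would instead scale to the last point of the energy range, the second option mentioned above.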
Polynomial curves method. Given an experimental spectrum, a background curve is generated based on a determined number of points μ(Ei) belonging to the NEXAFS signal. The number of points and their positions in energy are user-defined. Through sliders, the user can distribute them along the entire spectrum, selecting specific energy positions which are characterized only by the signal background and not by real spectral features (e.g. some region of the spectrum without any peak), see Fig. 9. Once this step has been completed, the related background function, consisting of a third-order spline, is generated using the splrep method of the SciPy package (Virtanen et al., 2020) and directly subtracted from the raw data.
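This construction can be sketched with scipy.interpolate.splrep and splev. The function name and the anchor energies are hypothetical, and an interpolating spline (s=0) through the selected points is assumed.

```python
import numpy as np
from scipy.interpolate import splrep, splev

def anchor_point_background(energy, mu, anchor_energies, k=3):
    """Background from a spline of order k through the spectrum points
    closest to the user-selected anchor energies (needs > k anchors)."""
    idx = np.unique([np.argmin(np.abs(energy - e)) for e in anchor_energies])
    # s=0 forces the spline through the anchors instead of smoothing them
    tck = splrep(energy[idx], mu[idx], k=k, s=0)
    return splev(energy, tck)

# Synthetic peak at 300 eV on a curved background
energy = np.linspace(280.0, 320.0, 401)
background = 0.5 + 1.0e-3 * (energy - 280.0) ** 2
peak = np.exp(-0.5 * (energy - 300.0) ** 2)
anchors = [280.0, 285.0, 290.0, 310.0, 315.0, 320.0]
estimate = anchor_point_background(energy, background + peak, anchors)
subtracted = background + peak - estimate
```

The quality of the result depends entirely on placing the anchors in peak-free regions, which is why they are set interactively.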
Asymmetric least-squares method. Among all the approaches, the asymmetric least-squares (AsLS) method has proven to be the fastest and most accurate. This baseline subtraction approach was introduced by Eilers and Boelens and has been extensively used in the field of Raman spectroscopy (Baek et al., 2015; Eilers, 2003). The method aims to fit a smooth background f to an experimental spectrum μ(E). To do so, it is necessary to minimize the following objective function,
$$S = \sum_{i=1}^{N} w_i \left[\mu(E_i) - f_i\right]^2 + \lambda \sum_{i=3}^{N} \left(\Delta^2 f_i\right)^2. \qquad (5)$$
The first term of equation (5) expresses the goodness of the data fitting whereas the second is related to the smoothness of f. Herein, μ(Ei) and fi are the ith value of the experimental NEXAFS spectrum (μ having N values) and the smoothed function f evaluated at the ith energy point, respectively. The Δ²fi term is a difference operator defined as $\Delta^2 f_i = (f_i - f_{i-1}) - (f_{i-1} - f_{i-2}) = f_i - 2f_{i-1} + f_{i-2}$, λ is a regularization parameter and wi represents a set of weights chosen asymmetrically: wi = p if μ(Ei) > fi and wi = 1 − p otherwise. In THORONDOR, the user has direct access to λ and p and can therefore tune them within the recommended ranges, 10⁷–10⁹ for λ and 0.001–0.1 for p (Baek et al., 2015). Once the parameters have been chosen, the background function is automatically generated and subtracted from the experimental spectrum. The THORONDOR tab window designed for this approach is shown in Fig. 10.
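The minimization is commonly carried out by iteratively reweighted penalized least squares. The following is a minimal sketch of that scheme with SciPy sparse matrices; the function name and the fixed number of iterations are assumptions, and it is not claimed to reproduce the THORONDOR code.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def asls_baseline(mu, lam=1e7, p=0.01, n_iter=10):
    """Asymmetric least-squares baseline of a 1D spectrum."""
    mu = np.asarray(mu, dtype=float)
    n = len(mu)
    # Second-order difference operator, shape (n - 2, n)
    d2 = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    penalty = lam * (d2.T @ d2)
    weights = np.ones(n)
    for _ in range(n_iter):
        w_matrix = sparse.diags(weights, 0)
        # Solve (W + lam * D^T D) f = W mu for the current weights
        f = spsolve((w_matrix + penalty).tocsc(), weights * mu)
        # Asymmetric update: p above the baseline, 1 - p otherwise
        weights = np.where(mu > f, p, 1.0 - p)
    return f

# Synthetic peak on a linear background
x = np.linspace(0.0, 1.0, 200)
base = 1.0 + 2.0 * x
spectrum = base + 5.0 * np.exp(-0.5 * ((x - 0.5) / 0.03) ** 2)
baseline = asls_baseline(spectrum, lam=1e7, p=0.01)
```

A larger λ gives a stiffer baseline, while a smaller p makes the curve hug the lower envelope of the spectrum more closely.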
Chebyshev polynomials method. This method has been already applied with success to powder diffractograms (Simonne, 2019). The first kind of Chebyshev polynomials Ti can be derived using the following equation,
The term a is a vector containing a set of N + 1 coefficients ai determined by a weighted least-squares regression.
The degree N of the equation must be determined empirically. The weights (optional) can simply be taken as the square of the variance of the counting statistics, to prevent the function from fitting the spectral peaks. The background f(E, a) is then expressed as a summation of Chebyshev polynomials, where each of them fits a small area of the spectrum (see Fig. 11). The number of polynomials must be high enough to account for the baseline while avoiding the fitting of the existing curve. This method usually shows problems with peaks with a large full width at half-maximum (FWHM), where the polynomials tend to fit the peaks. On the contrary, the method is very effective for peaks possessing a small FWHM.
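A weighted Chebyshev regression of this kind can be sketched with numpy.polynomial.chebyshev. The function name is hypothetical, the energy axis is mapped onto [-1, 1] before fitting, and the weights used in the example are a simple mask that excludes the peak region, which is an assumption made for illustration rather than the weighting described above.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb

def chebyshev_background(energy, mu, degree, weights=None):
    """Weighted least-squares fit of a Chebyshev series of given degree.

    Returns the background on the energy axis and its degree + 1
    coefficients.
    """
    # Map the energy axis onto [-1, 1], the natural domain of T_i
    span = energy.max() - energy.min()
    x = 2.0 * (energy - energy.min()) / span - 1.0
    coeffs = cheb.chebfit(x, mu, degree, w=weights)
    return cheb.chebval(x, coeffs), coeffs

# Synthetic narrow peak at 410 eV on a cubic background
energy = np.linspace(400.0, 420.0, 201)
x = 2.0 * (energy - energy.min()) / (energy.max() - energy.min()) - 1.0
poly = 1.0 + 0.5 * x - 0.2 * x**2 + 0.1 * x**3
peak = 2.0 * np.exp(-0.5 * ((energy - 410.0) / 0.5) ** 2)
mask = (np.abs(energy - 410.0) > 3.0).astype(float)  # zero weight on peak
background, coeffs = chebyshev_background(energy, poly + peak, 3, mask)
```

Without the mask the low-order series would be pulled upwards by the peak, which illustrates why broad peaks are problematic for this method.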
3. Peak fitting
Once the data treatment procedure (described in Section 2.2) is complete, a NEXAFS spectrum can be further processed using the THORONDOR peak-fitting toolbox.
In general, a NEXAFS spectrum is always characterized by resonances corresponding to different transitions from an occupied core state to an unfilled final state (Gann et al., 2016). These resonances can usually be modelled as peak shapes, properly reproduced by Lorentzian peak functions (de Groot, 2005; Henderson et al., 2014; Stöhr, 1992; Watts et al., 2006). The procedure of peak decomposition becomes extremely important when one wants to decompose a NEXAFS spectrum into a set of peaks, each of which can be assigned to an existing and physically reasonable electronic transition. Finally, spectral energy shifts for a set of scans can be recovered from the fitting procedure too. They correspond to inflection points in the step function (i.e. the maximum of their first derivatives). The evaluation of these quantities is extremely important because they indicate the presence of reduction or oxidation phenomena involving the absorber atoms in the system under study.
THORONDOR offers a large class of peak functions including Gaussian, Lorentzian, Voigt and pseudo-Voigt profiles. The signal step can be modelled using an arctangent function (Poe et al., 2004) or an error function, both of which have proven suitable for this purpose (Henderson et al., 2014; Outka & Stöhr, 1988). In general, the user should pick a step function according to their knowledge prior to the fitting, since it has been shown that the width of the error function is related to the instrumental resolution (Outka & Stöhr, 1988), whereas the width of the arctangent is connected with the lifetime of the excited state. The step localization depends on the quality of the spectrum, usually several electronvolts below the core-level (Outka & Stöhr, 1988). Sometimes the background in the pre-edge can differ slightly from the step function due to features linked to the transition to the bound states in the system (de Groot, 2005). In THORONDOR, if one wishes to focus on that energy range, it is possible to use splines of a different order to fit the baseline for those energy values and then proceed to fit and normalize the pre-edge peaks (Wilke et al., 2001).
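The peak and step profiles mentioned above can be sketched with a few short functions. The parameterization below (amplitude, center, sigma) is one common convention and is assumed here for illustration; the function names are hypothetical and are not taken from the THORONDOR code.

```python
import math

def gaussian(x, amplitude, center, sigma):
    """Area-normalized Gaussian scaled by amplitude."""
    norm = amplitude / (sigma * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((x - center) / sigma) ** 2)

def lorentzian(x, amplitude, center, sigma):
    """Area-normalized Lorentzian scaled by amplitude (FWHM = 2 * sigma)."""
    return amplitude / math.pi * sigma / ((x - center) ** 2 + sigma ** 2)

def arctan_step(x, amplitude, center, sigma):
    """Arctangent edge step rising from 0 to amplitude."""
    return amplitude * (0.5 + math.atan((x - center) / sigma) / math.pi)

def erf_step(x, amplitude, center, sigma):
    """Error-function edge step rising from 0 to amplitude."""
    return amplitude * 0.5 * (1.0 + math.erf((x - center) / sigma))

def model_spectrum(x):
    """Illustrative model: one Lorentzian resonance on an arctangent edge."""
    return lorentzian(x, 3.0, 532.0, 0.6) + arctan_step(x, 1.0, 536.0, 0.8)

# Evaluate the model on a small energy grid (values in eV are arbitrary).
grid = [525.0 + 0.5 * i for i in range(41)]
values = [model_spectrum(e) for e in grid]
```

Both step functions reach half of their amplitude at the center parameter, which is the inflection point used to track edge shifts.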
In THORONDOR, the parameters associated with the peak and step profiles (i.e. the number of peaks and their energy position, their FWHM, the peak function amplitudes, the number of step functions and their slopes, etc.) are defined by the user via cursors and text-boxes (see Fig. 12). After the definition of a fitting model, the user needs to provide an initial guess to initialize the fitting routine. The sum of all the user-defined functions with the current guess for the parameters is plotted alongside the experimental spectrum by clicking the button 'See current guess'. By tuning the initial guess, the user can therefore visualize the agreement between the experimental curve and the reconstructed one. Once this step has been performed, the user-defined parameters are employed to initialize the fitting routine.
The fitting routine is based on the minimization of a squared-residual objective function Ξ, defined as

$$\Xi(\mathbf{p}) = \sum_{i=1}^{N} \frac{\left[ y_i^{\mathrm{exp}} - y_i^{\mathrm{th}}(\mathbf{p}) \right]^2}{\epsilon_i^2}, \qquad (7)$$

where $\mathbf{p}$ is the set of M parameters characterizing the selected peak and step functions, N is the number of energy points, $y_i^{\mathrm{exp}}$ and $y_i^{\mathrm{th}}(\mathbf{p})$ are the ith values of the experimental and theoretical spectra, respectively, and ∊i is the uncertainty weighting related to the ith experimental point. Equation (7) assumes that the experimental signal is only affected by random Gaussian noise with a standard deviation equal to ∊i around the true signal (Filipponi & Di Cicco, 1995).
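As a small worked example, and assuming that equation (7) takes the usual form of a sum of squared residuals, each divided by the squared uncertainty ∊i, the objective can be evaluated as follows. The function name and the numerical values are hypothetical.

```python
def objective(y_exp, y_th, eps):
    """Sum of squared residuals, each scaled by its uncertainty eps_i."""
    return sum(((e - t) / s) ** 2 for e, t, s in zip(y_exp, y_th, eps))

# Three experimental points, model values and uncertainties (made-up numbers).
y_exp = [1.0, 2.0, 3.0]
y_th = [1.1, 1.9, 3.2]
eps = [0.1, 0.1, 0.2]

xi_weighted = objective(y_exp, y_th, eps)

# Setting every uncertainty to one gives a non-weighted sum of squares.
xi_unweighted = objective(y_exp, y_th, [1.0, 1.0, 1.0])
```

Each residual in this example is exactly one standard deviation away from the model, so the weighted objective equals the number of points.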
Thanks to the lmfit package (Newville et al., 2014), THORONDOR provides different algorithms for minimizing equation (7). In particular, the Levenberg–Marquardt algorithm (Moré, 1978) is recommended if the user starts the analysis with a good initial guess, since this method is fast and converges quickly towards a local minimum. If the fitting routine does not succeed, additional algorithms are provided, such as the Nelder–Mead method (Nelder & Mead, 1965), which has been demonstrated to be more robust than the former (Newville et al., 2014).
3.1. Estimating experimental uncertainties
As shown in equation (7), the definition of Ξ requires the evaluation of the experimental errors ∊i. If these are not provided by the user, THORONDOR offers three different alternatives.
The first procedure is inspired by the work of Dent et al. (1992) and is employed in the GNXAS software to estimate the error associated with an experimental spectrum (Filipponi & Di Cicco, 1995). It consists of three steps. First, a few points (from three to twenty) are selected in the spectrum around a given point. Second, a low-order polynomial (degree one, two or three) is fitted to the selected data. Third, the root mean square (r.m.s.) deviation of all the data within the selected range from the polynomial curve is assigned to the selected NEXAFS point. This procedure is then repeated on several narrow intervals along the total spectrum. Finally, all the extracted r.m.s. values are interpolated with a smooth function and its inverse is used as the error term in equation (7).
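The three steps of this first procedure can be sketched as follows. The window size, the polynomial degree and the use of a low-order polynomial as the smooth interpolating function are assumptions made for this example; the function name is hypothetical and the code is not taken from GNXAS or THORONDOR.

```python
import numpy as np

def local_rms_noise(x, y, half_width=7, degree=2):
    """R.m.s. deviation from a local polynomial fit around each point."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rms = np.empty(x.size)
    for i in range(x.size):
        lo = max(0, i - half_width)
        hi = min(x.size, i + half_width + 1)
        xw = x[lo:hi] - x[i]  # centre the window for numerical stability
        coeffs = np.polyfit(xw, y[lo:hi], degree)
        resid = y[lo:hi] - np.polyval(coeffs, xw)
        rms[i] = np.sqrt(np.mean(resid ** 2))
    return rms

# Synthetic smooth signal with Gaussian noise of known standard deviation.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 300)
sigma_true = 0.05
y = np.exp(-0.5 * ((x - 5.0) / 2.0) ** 2) + rng.normal(0.0, sigma_true, x.size)

rms = local_rms_noise(x, y)

# Smooth the local estimates (assumption: a quadratic in x is smooth enough).
xc = x - x.mean()
eps_smooth = np.polyval(np.polyfit(xc, rms, 2), xc)

# The inverse of the smoothed noise estimate is used as the fit weight.
weights = 1.0 / eps_smooth
```

Because the local polynomial absorbs a few degrees of freedom, the r.m.s. estimate obtained in this way tends to be slightly smaller than the true noise level.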
The second method simply uses the errors from the user, imported along with the data, as uncertainty weights for equation (7).
Finally, if the errors provided by the first method seem under- or overestimated and the user is unable to quantify the uncertainty of the measurement, the errors can be set equal either to the inverse of the background-subtracted data or to one; the latter choice results in a non-weighted fitting routine.
3.2. Evaluation of the goodness of fit
Mismatch between data and fit can be measured in a number of ways (Calvin, 2013). One of the common methods implemented in THORONDOR is the R-factor. According to the International Society Standard and Criteria Committee (2000) it is defined as
When the signal-to-noise ratio of the data is good, the RIXS2(%) of an adequate fit can be expected to be of the order of a few percent (Calvin, 2013; International Society Standard and Criteria Committee, 2000).
Because of the presence of M parameters in the fit, the quantity $\Xi(\hat{\mathbf{p}})$, where the vector $\hat{\mathbf{p}}$ is the set of parameter values that minimizes equation (7), can be interpreted as a χ2 random variable with N − M degrees of freedom. Thus, the statistical χ2 test can be performed in THORONDOR to check whether the actual value of $\Xi(\hat{\mathbf{p}})$ is only due to the residual noise or whether it contains unexplained physical information (Filipponi & Di Cicco, 1995).
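A quick numerical check of this kind can be sketched without any fitting library. Under the Gaussian-noise assumption, a χ2 variable with N − M degrees of freedom has an expected value of N − M and a variance of 2(N − M), so the minimum of the objective can be compared with these two numbers. The function name and the input values below are hypothetical.

```python
import math

def chi_square_summary(xi_min, n_points, n_params):
    """Compare the minimum of the objective with its chi-square expectation."""
    dof = n_points - n_params
    return {
        "dof": dof,
        "reduced": xi_min / dof,  # close to 1 for an adequate fit
        "z": (xi_min - dof) / math.sqrt(2.0 * dof),  # deviation in standard deviations
    }

# Example: 401 energy points, 6 fitted parameters, objective minimum of 410.
summary = chi_square_summary(410.0, 401, 6)
print(summary)
```

A reduced value much larger than one suggests that the model misses part of the signal or that the uncertainties are underestimated, while a value much smaller than one suggests overestimated uncertainties.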
3.3. Finding uncertainties in fitted parameters
In THORONDOR, the parameter uncertainties retrieved by the fitting procedure can be estimated in different ways. In general, this is done by inverting the Hessian matrix of equation (7) to obtain the related covariance matrix, whose diagonal elements are the squared parameter errors (Bunker, 2010). However, sometimes the uncertainties cannot be estimated, which generally indicates that the Hessian matrix cannot be properly inverted because the fit is not actually sensitive to one of the variables that must be optimized. This can happen if a parameter is stuck at an upper or lower bound, if the variable is simply not used by the fit or if the value of that variable is such that it has no real influence on the fit (Newville et al., 2014). Moreover, as previously introduced in Section 3, the computation of the standard errors assumes that the residuals follow a normal distribution with a mean equal to zero, and that a map of probability distributions for pairs of parameters would be elliptical (the size of the ellipse provides the uncertainty and the eccentricity provides the correlation) (Bevington & Keith, 2003; Newville et al., 2014). The validity of this uncertainty estimation is open to discussion, since it ignores outliers, highly asymmetric uncertainties or complex correlations between the estimated parameters. Nevertheless, the results of this estimation are usually quite good when it is possible to determine them, which is usually the case if one starts the algorithm with an initial guess close enough to a local minimum.
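The relation between the covariance matrix, the standard errors and the parameter correlations can be illustrated with a short NumPy sketch. The covariance values and the singular matrix below are made-up numbers chosen for this example.

```python
import numpy as np

# Hypothetical covariance matrix for three parameters
# (for example peak center, peak width and peak amplitude).
cov = np.array([
    [4.0e-4, 1.0e-4, 0.0],
    [1.0e-4, 1.0e-4, -5.0e-5],
    [0.0, -5.0e-5, 2.5e-3],
])

# Standard errors are the square roots of the diagonal elements.
stderr = np.sqrt(np.diag(cov))

# Correlation matrix: covariance scaled by the product of standard errors.
corr = cov / np.outer(stderr, stderr)

# A matrix with an all-zero row and column (a parameter that has no influence
# on the fit) is rank deficient, so it cannot be inverted.
insensitive = np.array([[2.0, 0.0], [0.0, 0.0]])
is_singular = np.linalg.matrix_rank(insensitive) < insensitive.shape[0]
```

The rank check at the end mirrors the failure case described above, in which one variable does not affect the objective and no uncertainty can be reported.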
A more detailed investigation of the probability distribution of the parameters can be performed a posteriori by exploring the parameter space with the emcee Markov chain Monte Carlo package (Foreman-Mackey et al., 2013) (version 3 or later). This additional step is recommended, especially if the estimation of the covariance matrix fails; such roadblocks can occur with models composed of numerous parameters and bounds or constraints. In this way, one can estimate the uncertainties and find the correlations between pairs of parameters. A corner plot can be drawn using the corner package (Foreman-Mackey, 2016).
As described above, in THORONDOR, confidence intervals are determined with both methods, providing a clear idea of the uncertainties associated with each parameter. Overall, the fitting module of THORONDOR allows one to quickly fit specific features or entire spectra using different approaches, without neglecting the statistical analysis of the fit quality.
4. Conclusions and perspectives
In this paper, we have presented THORONDOR, a free software package written in Python, suitable for the quick analysis of large series of NEXAFS spectra collected under UHV or during in situ experiments. The program allows the user to correct and normalize the acquired spectra using various fast techniques that are directly and interactively accessible via sliders and cursors. After the selection of a NEXAFS spectrum, by exploiting the THORONDOR fitting toolbox, the user can recover the energies and intensities of the most prominent absorption features together with their uncertainties. In particular, different peak functions (Gaussian, Lorentzian, Voigt, etc.) and absorption-edge step functions (arctangent, error functions, etc.) can be employed for this purpose.
Regarding the future development of this software, we are going to implement three new tools. (i) First, we are going to make THORONDOR able to read and handle the NeXus-NXxas data file format (Könnecke et al., 2015), which is establishing itself as one of the most commonly used data formats in the community. In particular, the software will be able to extract, visualize and save the metadata contained in each file, which can be rich in information about the experimental conditions of the NEXAFS measurements (e.g. sample temperature, sample positions in the beamline end-station, etc.). (ii) We intend to insert a section dedicated to the compositional analysis of the acquired experimental spectra. The new module will allow the user to perform a linear combination fit of a NEXAFS spectrum on the basis of user-defined references. Moreover, we are also considering implementing a second module to perform spectral decomposition based on the multivariate curve resolution-alternating least-squares (MCR-ALS) algorithm (Jaumot et al., 2005). (iii) Finally, since it is progressively becoming standard procedure for understanding the collected NEXAFS data, we are going to interface THORONDOR with simulated spectroscopic data coming directly from different time-dependent density functional theory (TD-DFT) calculations [e.g. the Amsterdam Density Functional (Atkins et al., 2013; te Velde et al., 2001)] or from atomic multiplet simulations (de Groot, 2005) [e.g. Quanty (Haverkort, 2016; Haverkort et al., 2012)], which can be compared with the experimental spectra obtained after the data treatment procedure.
Acknowledgements
We are grateful to E. Groppo (University of Turin), F. Tavani, P. D'Angelo (University of Rome, La Sapienza) and P. Ghigna (University of Pavia) for the several fruitful discussions about the NEXAFS data treatment procedures and for their contribution in the field of soft X-rays analysis. We thank A. A. Guda, S. A. Guda and A. V. Soldatov (The Smart Materials Research Center, Southern Federal University) for the useful advice connected with the architecture of the THORONDOR code and S. Zafeiratos (CNRS and University of Strasbourg) for providing us with a set of samples which also represented a useful testing ground for the program. We are deeply indebted to Professor C. Lamberti, an amazing mentor and a brilliant guide who, unfortunately, left us too early. This work has received support from project PRIN-2017 MOSCATo (cutting-edge X-ray methods and models for the understanding of surface site reactivity in heterogeneous catalysts and sensors).
References
Atkins, A. J., Bauer, M. & Jacob, C. R. (2013). Phys. Chem. Chem. Phys. 15, 8095–8105.
Baek, S. J., Park, A., Ahn, Y. J. & Choo, J. (2015). Analyst, 140, 250–257.
Beaumont, S. K. (2020). Phys. Chem. Chem. Phys. Advance Article.
Bevington, P. R. & Keith, R. D. (2003). Data Reduction and Error Analysis for the Physical Sciences, 3rd ed. New York: McGraw-Hill.
Blum, M., Weinhardt, L., Fuchs, O., Bar, M., Zhang, Y., Weigand, M., Krause, S., Pookpanratana, S., Hofmann, T., Yang, W., Denlinger, J. D., Umbach, E. & Heske, C. (2009). Rev. Sci. Instrum. 80, 6.
Bunker, G. (2010). Introduction to XAFS: a Practical Guide to X-ray Absorption Fine Structure Spectroscopy. Cambridge University Press.
Calvin, S. (2013). XAFS for Everyone. Boca Raton: CRC Press.
Castan-Guerrero, C., Krizmancic, D., Bonanni, V., Edla, R., Deluisa, A., Salvador, F., Rossi, G., Panaccione, G. & Torelli, P. (2018). Rev. Sci. Instrum. 89, 8.
Delgado-Jaime, M. U., Mewis, C. P. & Kennepohl, P. (2010). J. Synchrotron Rad. 17, 132–137.
Dent, A. J., Stephenson, P. C. & Greaves, G. N. (1992). Rev. Sci. Instrum. 63, 856–858.
Eilers, P. H. C. (2003). Anal. Chem. 75, 3631–3636.
Escudero, C., Jiang, P., Pach, E., Borondics, F., West, M. W., Tuxen, A., Chintapalli, M., Carenco, S., Guo, J. & Salmeron, M. (2013). J. Synchrotron Rad. 20, 504–508.
Filipponi, A. & Di Cicco, A. (1995). Phys. Rev. B, 52, 15135–15149.
Foreman-Mackey, D. (2016). J. Open Source Softw. 1, 24.
Foreman-Mackey, D., Hogg, D. W., Lang, D. & Goodman, J. (2013). Publ. Astron. Soc. Pac. 125, 306–312.
Forsberg, J., Duda, L. C., Olsson, A., Schmitt, T., Andersson, J., Nordgren, J., Hedberg, J., Leygraf, C., Aastrup, T., Wallinder, D. & Guo, J. H. (2007). Rev. Sci. Instrum. 78, 083110.
Fuchs, O., Maier, F., Weinhardt, L., Weigand, M., Blum, M., Zharnikov, M., Denlinger, J., Grunze, M., Heske, C. & Umbach, E. (2008). Nucl. Instrum. Methods Phys. Res. A, 585, 172–177.
Gann, E., McNeill, C. R., Tadich, A., Cowie, B. C. C. & Thomsen, L. (2016). J. Synchrotron Rad. 23, 374–380.
Groot, F. de (2005). Coord. Chem. Rev. 249, 31–63.
Guda, A. A., Guda, S. A., Lomachenko, K. A., Soldatov, M. A., Pankin, I. A., Soldatov, A. V., Braglia, L., Bugaev, A. L., Martini, A., Signorile, M., Groppo, E., Piovano, A., Borfecchia, E. & Lamberti, C. (2019). Catal. Today, 336, 3–21.
Guo, J. H. & Luo, Y. (2010). J. Electron Spectrosc. Relat. Phenom. 177, 181–191.
Hatada, K., Iesari, F., Properzi, L., Minicucci, M. & Di Cicco, A. (2016). J. Phys. Conf. Ser. 712, 012002.
Hävecker, M., Knop-Gericke, A. & Schedel-Niedrig, T. (1999). Appl. Surf. Sci. 142, 438–442.
Haverkort, M. W. (2016). J. Phys. Conf. Ser. 712, 012005.
Haverkort, M. W., Zwierzycki, M. & Andersen, O. K. (2012). Phys. Rev. B, 85, 165113.
Henderson, G. S., de Groot, F. M. F. & Moulton, B. J. A. (2014). Rev. Mineral. Geochem. 78, 75–138.
Henke, B. L., Gullikson, E. M. & Davis, J. C. (1993). Atom. Data Nucl. Data Tables, 55, 349.
International Society Standard and Criteria Committee (2000). Report of the International XAFS Society Standards and Criteria Committee.
Jaumot, J., Gargallo, R., de Juan, A. & Tauler, R. (2005). Chemom. Intell. Lab. Syst. 76, 101–110.
Klementev, K. V. (2001). J. Phys. D Appl. Phys. 34, 209–217.
Kluyver, T., Ragan-Kelley, B., Perez, F., Granger, B., Bussonnier, M., Frederic, J., Kelley, K., Hamrick, J., Grout, J., Corlay, S., Ivanov, P., Avila, D., Abdalla, S. & Willing, C. (2016). Positioning and Power in Academic Publishing: Players, Agents and Agendas, pp. 87–90. Amsterdam: IOS Press.
Knop-Gericke, A., Hävecker, M., Neisius, T. & Schedel-Niedrig, T. (1998). Nucl. Instrum. Methods Phys. Res. A, 406, 311–322.
Könnecke, M., Akeroyd, F. A., Bernstein, H. J., Brewster, A. S., Campbell, S. I., Clausen, B., Cottrell, S., Hoffmann, J. U., Jemian, P. R., Männicke, D., Osborn, R., Peterson, P. F., Richter, T., Suzuki, J., Watts, B., Wintersberger, E. & Wuttke, J. (2015). J. Appl. Cryst. 48, 301–305.
Kuzmin, A. (1995). Physica B, 208–209, 175–176.
Lassalle-Kaiser, B., Gul, S., Kern, J., Yachandra, V. K. & Yano, J. (2017). J. Electron Spectrosc. Relat. Phenom. 221, 18–27.
Martini, A., Guda, S. A., Guda, A. A., Smolentsev, G., Algasov, A., Usoltsev, O., Soldatov, M. A., Bugaev, A., Rusalev, Y., Lamberti, C. & Soldatov, A. V. (2020). Comput. Phys. Commun. 250, 107064.
McKinney, W. (2010). Proceedings of the 9th Python in Science Conference, edited by S. van der Walt & J. Millman, pp. 51–56. Austin: SciPy Society.
Mino, L., Agostini, G., Borfecchia, E., Gianolio, D., Piovano, A., Gallo, E. & Lamberti, C. (2013). J. Phys. D Appl. Phys. 46, 423001.
Moré, J. J. (1978). Numerical Analysis, edited by G. A. Watson, pp. 105–116. Berlin, Heidelberg: Springer.
Nelder, J. A. & Mead, R. (1965). Comput. J. 7, 308–313.
Newville, M., Stensitzki, T., Allen, D. B. & Ingargiola, A. (2014). LMFIT: Non-Linear Least-Square Minimization and Curve-Fitting for Python, https://doi.org/doi:10.5281/zenodo.11813.
Oliphant, T. E. (2006). A Guide to NumPy. Trelgol Publishing.
Outka, D. A. & Stöhr, J. (1988). J. Chem. Phys. 88, 3539–3554.
Perez, F. & Granger, B. E. (2007). Comput. Sci. Eng. 9, 21–29.
Poe, B. T., Romano, C. & Henderson, G. (2004). J. Non-Cryst. Solids, 341, 162–169.
Qayyum, M. F., Sarangi, R., Fujisawa, K., Stack, T. D. P., Karlin, K. D., Hodgson, K. O., Hedman, B. & Solomon, E. I. (2013). J. Am. Chem. Soc. 135, 17417–17431.
Ravel, B. & Newville, M. (2005). J. Synchrotron Rad. 12, 537–541.
Simonne, H. D. (2019). Masters Thesis, Technische Universitat Munchen, Germany.
Stöhr, J. (1992). NEXAFS Spectroscopy. Berlin Heidelberg: Springer-Verlag.
Tamenori, Y. (2013). J. Synchrotron Rad. 20, 419–425.
te Velde, G., Bickelhaupt, F. M., Baerends, E. J., Fonseca Guerra, C., van Gisbergen, S. J. A., Snijders, J. G. & Ziegler, T. (2001). J. Comput. Chem. 22, 931–967.
Tokushima, T., Horikawa, Y., Harada, Y., Takahashi, O., Hiraya, A. & Shin, S. (2009). Phys. Chem. Chem. Phys. 11, 1679–1682.
Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Millman, K. J., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., Carey, C. J., Polat, İ., Feng, Y., Moore, E. W., VanderPlas, J., Laxalde, D., Perktold, J., Cimrman, R., Henriksen, I., Quintero, E. A., Harris, C. R., Archibald, A. M., Ribeiro, A. H., Pedregosa, F. & van Mulbregt, P. (2020). Nat. Methods, 17, 261–272.
Watts, B. (2014). Opt. Express, 22, 23628–23639.
Watts, B., Thomsen, L. & Dastoor, P. C. (2006). J. Electron Spectrosc. Relat. Phenom. 151, 105–120.
Webb, S. M. (2005). Phys. Scr. 115, 1011–1014.
Wilke, M., Farges, F., Petit, P. E., Brown, G. E. & Martin, F. (2001). Am. Mineral. 86, 714–730.
Zheng, F., Alayoglu, S., Guo, J. H., Pushkarev, V., Li, Y. M., Glans, P. A., Chen, J. L. & Somorjai, G. (2011). Nano Lett. 11, 847–853.
© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.