Journal of Synchrotron Radiation

Volume 21, Part 5 (September 2014)

research papers

J. Synchrotron Rad. (2014). 21, 1140-1147    [ doi:10.1107/S1600577514013526 ]

Estimating the number of pure chemical components in a mixture by X-ray absorption spectroscopy

A. Manceau, M. Marcus and T. Lenoir

Abstract: Principal component analysis (PCA) is a multivariate data analysis approach commonly used in X-ray absorption spectroscopy to estimate the number of pure compounds in multicomponent mixtures. This approach seeks to describe a large number of multicomponent spectra as weighted sums of a smaller number of component spectra. These component spectra are in turn considered to be linear combinations of the spectra from the actual species present in the system from which the experimental spectra were taken. The dimension of the experimental dataset is given by the number of meaningful abstract components, as estimated by the cascade or variance of the eigenvalues (EVs), the factor indicator function (IND), or the F-test on reduced EVs. It is shown on synthetic and real spectral mixtures that the performance of the IND and F-test depends critically on the amount of noise in the data, and may result in considerable underestimation or overestimation of the number of components even for a signal-to-noise (s/n) ratio of the order of 80 (σ = 20) in a XANES dataset. For a given s/n ratio, the accuracy of the component recovery from a random mixture depends on the size of the dataset and on the number of components, which is not known in advance, and deteriorates for larger datasets because the analysis picks up more noise components. The scree plot of the EVs for the components yields one or two values close to the significant number of components, but the result can be ambiguous and its uncertainty is unknown. A new estimator, NSS-stat, which incorporates the experimental error into XANES data analysis, is introduced and tested. It is shown that NSS-stat produces superior results compared with the three traditional forms of PCA-based component-number estimation. A graphical user-friendly interface for the calculation of EVs, IND, F-test and NSS-stat from a XANES dataset has been developed under LabVIEW for Windows and is supplied in the supporting information. Its possible application to EXAFS data is discussed, and several XANES and EXAFS datasets are also included for download.
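
As an illustration of two of the traditional estimators named in the abstract, the following is a minimal Python sketch, not the authors' LabVIEW program. It computes the eigenvalues used for a scree plot and the factor indicator function (IND) in its standard Malinowski form, IND(k) = RE(k)/(c - k)^2, on a synthetic dataset. The function names, the Gaussian "pure spectra", the random mixing weights and the noise level are all assumptions made for this example. The data are not mean-centred here, which is also an assumption. The F-test and NSS-stat are not implemented.

```python
import numpy as np


def pca_eigenvalues(spectra):
    """Eigenvalues of D^T D for a data matrix D of shape (n_points, n_spectra).

    They are obtained as squared singular values, sorted in descending order.
    """
    singular_values = np.linalg.svd(spectra, compute_uv=False)
    return singular_values ** 2


def ind_function(eigenvalues, n_points):
    """Malinowski factor indicator function IND(k) for k = 1 .. c-1.

    RE(k) = sqrt(sum of the c-k smallest eigenvalues / (n_points * (c - k)))
    IND(k) = RE(k) / (c - k)**2
    """
    ev = np.asarray(eigenvalues, dtype=float)
    c = ev.size
    ind = np.empty(c - 1)
    for k in range(1, c):
        residual = ev[k:].sum()  # eigenvalues attributed to noise
        real_error = np.sqrt(residual / (n_points * (c - k)))
        ind[k - 1] = real_error / (c - k) ** 2
    return ind


def make_synthetic_mixtures(n_points=200, n_pure=3, n_spectra=12,
                            sigma=1e-3, seed=0):
    """Hypothetical test data: random mixtures of Gaussian 'pure spectra'."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    centers = np.linspace(0.3, 0.7, n_pure)
    # Columns are the pure component spectra, shape (n_points, n_pure).
    pure = np.stack([np.exp(-((x - c0) / 0.08) ** 2) for c0 in centers], axis=1)
    # Each mixture has non-negative fractions summing to one.
    weights = rng.dirichlet(np.ones(n_pure), size=n_spectra).T
    noise = rng.normal(0.0, sigma, size=(n_points, n_spectra))
    return pure @ weights + noise


data = make_synthetic_mixtures()
ev = pca_eigenvalues(data)
ind = ind_function(ev, n_points=data.shape[0])
# The IND estimate is the number of components at which IND is minimal.
estimated = int(np.argmin(ind)) + 1
print("eigenvalues (scree plot values):", ev)
print("IND estimate of the number of components:", estimated)
```

The noise level in this sketch is deliberately low. As the abstract states, the IND estimate can underestimate or overestimate the number of components at higher noise levels, which can be explored by raising `sigma`.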

Keywords: XANES; EXAFS; PCA; factor analysis; F-test.

Portable Document Format (PDF) file (6635.9 kbytes)
[ doi:10.1107/S1600577514013526/hf5263sup1.pdf ]
Supplementary figures S1, S2 and S3

Portable Document Format (PDF) file (198.1 kbytes)
[ doi:10.1107/S1600577514013526/hf5263sup2.pdf ]
Information about the workings of the NSS analysis referred to in the main text and included in the PCA_Estimator.exe programme supplied in the zip file of the supplementary material

Zip compressed file (16699.9 kbytes)
[ doi:10.1107/S1600577514013526/ ]
Graphical data-analysis programme and user manual for the calculation of the number of pure component XAFS spectra from a dataset of multicomponent spectra. The datasets listed in Table 1 are also included


Copyright © International Union of Crystallography