research papers
On the neglecting of higher-order cumulants in EXAFS data analysis
^{a}Università degli Studi di Verona, Facoltà di Scienze, Dip. di Informatica, Strada le Grazie 15, I-37134 Verona, Italy
^{*}Correspondence e-mail: andrea.sanson@univr.it
The cumulant expansion is one of the most powerful and useful methods for EXAFS data analysis, in which the higher-order cumulants allow deviations from a simple Gaussian distribution to be taken into account. In this work, analytical expressions have been derived to show the effects of neglecting higher-order cumulants in EXAFS analysis by the ratio method. With these expressions, the errors in the best-fitting procedure owing to the omission of the higher-order cumulants, as well as of the coordination number, can be determined.
Keywords: EXAFS; cumulant analysis of EXAFS.
1. Introduction
The cumulant method is a model-independent technique based on the expansion of extended X-ray absorption fine-structure (EXAFS) amplitudes and phases as a series of cumulants of the interatomic distance distribution (Teo, 1986). In the EXAFS analysis, the contributions of the different coordination shells are singled out by Fourier filtering and separately analyzed. The Fourier filtering process allows the phase Φ(k) and amplitude A(k) of the single shell to be separated. The difference between the phases and the logarithm of the amplitude ratio can be written through the ratio method as (Bunker, 1983; Fornasini et al., 2001)
$$\Phi_s(k) - \Phi_r(k) = 2k\,\Delta C_1 - \tfrac{4}{3}k^3\,\Delta C_3 + \tfrac{4}{15}k^5\,\Delta C_5 - \dots \qquad (1)$$
$$\ln\frac{A_s(k)}{A_r(k)} = \ln N - 2k^2\,\Delta C_2 + \tfrac{2}{3}k^4\,\Delta C_4 - \tfrac{4}{45}k^6\,\Delta C_6 + \dots \qquad (2)$$
where k is the photoelectron wavevector, N = N_{s}/N_{r} is the coordination number ratio, ΔC_{n} indicates the cumulant difference C_{n}^{s} − C_{n}^{r}, and the subscripts s and r refer to the sample and reference, respectively. The cumulant method allows the characterization of the first coordination shell in terms of parameters which describe the distance distribution: the first cumulant C_{1} is the mean value, C_{2} is the variance, C_{3} measures the distribution asymmetry and C_{4} measures its flatness. For a Gaussian distribution the cumulants C_{n} are zero for n > 2.
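The two ratio-method series can be evaluated numerically with a few lines of Python. This is a minimal sketch, assuming the standard cumulant-series coefficients (2, −4/3, 4/15 for the phase difference; −2, 2/3, −4/45 for the logarithm of the amplitude ratio) and truncation at the fifth and sixth order; the function names and the sample values are illustrative only.

```python
import math

def phase_difference(k, dC1, dC3=0.0, dC5=0.0):
    """Phase difference truncated at fifth order (odd cumulant differences)."""
    return 2 * k * dC1 - (4 / 3) * k**3 * dC3 + (4 / 15) * k**5 * dC5

def log_amplitude_ratio(k, N, dC2, dC4=0.0, dC6=0.0):
    """Logarithm of the amplitude ratio truncated at sixth order (even cumulants)."""
    return math.log(N) - 2 * k**2 * dC2 + (2 / 3) * k**4 * dC4 - (4 / 45) * k**6 * dC6

# Illustrative values: k in inverse angstrom, cumulant differences in angstrom^n.
k = 10.0
gaussian = phase_difference(k, dC1=0.01)                 # Gaussian case: dC3 = 0
anharmonic = phase_difference(k, dC1=0.01, dC3=0.0005)   # same dC1, asymmetric distribution
print(gaussian, anharmonic)
```

At high k the cubic term dominates the linear one, which is why a small third cumulant can noticeably distort a fit that only allows for ΔC_{1}.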
Anharmonicity effects on EXAFS were detected quite early (Eisenberger & Brown, 1979). After the first pioneering studies on AgI (Boyce et al., 1981) and CuBr (Tranquada & Ingalls, 1983), it has been shown that anharmonicity cannot be neglected even in systems like germanium (Dalba et al., 1995) or copper (a Beccara et al., 2003), where the third cumulant has been taken into account in the analysis to obtain accurate values of the first cumulant. This holds even more strongly for systems affected by structural disorder, such as liquids, glasses, molten salts and alloys, where the higher-order cumulants must be taken into account when fitting the data (Wei et al., 2000; Sanson et al., 2008; Swilem et al., 2005). In strongly disordered systems, where the convergence of the cumulant series is in principle questionable (Filipponi, 2001), the cumulants can sometimes be considered to parameterize only the short-range component of the whole distance distribution, as tested in α-AgI (Boyce et al., 1977, 1981) and more recently in silver molybdate glasses (Sanson, Rocca, Dalba et al., 2007).
The importance of including higher-order cumulants in EXAFS analysis has been recognized in many works (Yokoyama et al., 1997; Soldo et al., 1998; Bus et al., 2006; Vaccari et al., 2007; Ahmed et al., 2009). Other groups recognized the asymmetry in the distance distribution, but did not use the cumulants beyond the second order (Diaz-Moreno et al., 1997; Berlier et al., 2002; Katsikini et al., 2008; Chu et al., 2009), with the consequence that the resulting errors in the fit parameters may have drastic effects on the structural parameters. In some specific cases, the errors owing to the use of a Gaussian pair distribution have been estimated (Mobilio & Incoccia, 1984; Wei et al., 2000), with the result that it produces a significant error for the distance and coordination number. At present, a general treatment of this problem is still lacking.
In this work, for the first time, analytical expressions have been derived to determine the errors in the best-fitting procedure owing to the neglect of the higher-order cumulants (up to the sixth order). The paper is organized as follows: the procedure to derive these expressions is briefly described in §2; the results are reported in §3 and discussed in §4; §5 is dedicated to conclusions.
2. Procedure
Let us consider the best fit of the phases difference, assuming that it is sufficient to truncate equation (1) at the third order (ΔC_{3}) to have a good fit. In order to evaluate the resulting error on the relative first cumulant (i.e. on the bond distance variation) owing to the neglect of the third cumulant, we can solve the following equation with respect to ΔC_{1}′,
$$\frac{\partial}{\partial \Delta C_1'} \int_{k_m}^{k_M} \left[ 2k\,\Delta C_1 - \tfrac{4}{3}k^3\,\Delta C_3 - 2k\,\Delta C_1' \right]^2 dk = 0, \qquad (3)$$
which corresponds to minimizing the fitting difference between 2kΔC_{1} − (4/3)k^{3}ΔC_{3} and 2kΔC_{1}′; k_{m} and k_{M} are the minimum and maximum values of the fitting interval, respectively. Expanding (3) we obtain
$$\tfrac{2}{3}\left(\Delta C_1 - \Delta C_1'\right)\left(k_M^3 - k_m^3\right) - \tfrac{4}{15}\,\Delta C_3\left(k_M^5 - k_m^5\right) = 0, \qquad (4)$$
and so
$$\Delta C_1' = \Delta C_1 - \tfrac{2}{5}\,\Delta C_3\,\frac{k_M^5 - k_m^5}{k_M^3 - k_m^3}. \qquad (5)$$
As a result, from (5) it can be observed that for ΔC_{3} > 0 the neglect of the third cumulant gives an underestimation of the relative first cumulant. On the contrary, for ΔC_{3} < 0 the relative first cumulant is overestimated. More importantly, equation (5) allows the error on ΔC_{1} to be quantitatively estimated. For example, by (5), the neglect of a third cumulant ΔC_{3} ≃ 0.0005 Å^{3} in the fitting interval k = 2–10 Å^{−1} (i.e. k_{m} = 2 and k_{M} = 10 Å^{−1}) gives an underestimation of ΔC_{1} of about 0.020 Å.
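The numerical example can be checked with a short script. This is a sketch under the assumption that the fit minimizes the unweighted integral of the squared fitting difference between 2kΔC_{1} − (4/3)k^{3}ΔC_{3} and 2kΔC_{1}′ over the interval k_{m}–k_{M}; the closed form used below is what that assumption yields, and the function names are illustrative. A discretized least-squares fit is included as an independent cross-check.

```python
def first_cumulant_shift(dC3, k_min, k_max):
    """Closed-form shift dC1' - dC1 when the third cumulant is neglected."""
    return -(2 / 5) * dC3 * (k_max**5 - k_min**5) / (k_max**3 - k_min**3)

def first_cumulant_shift_numeric(dC3, k_min, k_max, n=20000):
    """Same shift from a one-parameter least-squares fit on a midpoint grid.

    The model 2k*x is fitted to the neglected term -(4/3) k^3 dC3, so that
    x = sum(model * target) / sum(model^2) is the shift absorbed by dC1'.
    """
    h = (k_max - k_min) / n
    num = den = 0.0
    for i in range(n):
        k = k_min + (i + 0.5) * h
        num += (2 * k) * (-(4 / 3) * k**3 * dC3)
        den += (2 * k) ** 2
    return num / den

shift = first_cumulant_shift(0.0005, 2.0, 10.0)
check = first_cumulant_shift_numeric(0.0005, 2.0, 10.0)
print(f"closed form: {shift:.4f} A, numerical: {check:.4f} A")
```

Both routes give a shift of about −0.020 Å, i.e. the underestimation quoted in the text. The result depends on both ends of the fitting interval, so the same ΔC_{3} produces a larger bias when the fit extends to higher k.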
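As a second example, let us consider the best fit of the amplitudes ratio, assuming that it is sufficient to truncate (2) at the fourth order (ΔC_{4}) to have a good fit. To evaluate the resulting error on the coordination number and on the second cumulant owing to the neglect of the fourth cumulant, we can solve the following system of equations with respect to N′ and ΔC_{2}′,
$$\frac{\partial}{\partial \ln N'} \int_{k_m}^{k_M} \left[ \ln N - 2k^2\,\Delta C_2 + \tfrac{2}{3}k^4\,\Delta C_4 - \ln N' + 2k^2\,\Delta C_2' \right]^2 dk = 0, \qquad (6)$$
$$\frac{\partial}{\partial \Delta C_2'} \int_{k_m}^{k_M} \left[ \ln N - 2k^2\,\Delta C_2 + \tfrac{2}{3}k^4\,\Delta C_4 - \ln N' + 2k^2\,\Delta C_2' \right]^2 dk = 0, \qquad (7)$$
whose solutions are
$$\ln N' = \ln N - \frac{2}{35}\,\Delta C_4\,\frac{25\left(k_M^3 - k_m^3\right)\left(k_M^7 - k_m^7\right) - 21\left(k_M^5 - k_m^5\right)^2}{9\left(k_M - k_m\right)\left(k_M^5 - k_m^5\right) - 5\left(k_M^3 - k_m^3\right)^2} \qquad (8)$$
and
$$\Delta C_2' = \Delta C_2 - \frac{1}{7}\,\Delta C_4\,\frac{15\left(k_M - k_m\right)\left(k_M^7 - k_m^7\right) - 7\left(k_M^3 - k_m^3\right)\left(k_M^5 - k_m^5\right)}{9\left(k_M - k_m\right)\left(k_M^5 - k_m^5\right) - 5\left(k_M^3 - k_m^3\right)^2}. \qquad (9)$$
As a result, from (8)–(9) it can be seen that the neglect of the fourth cumulant gives an underestimation or overestimation of the relative second cumulant and of the coordination number ratio, according to the sign of ΔC_{4}. These errors, which depend on the fitting interval (k_{m}–k_{M}), can be quantitatively estimated by (8) and (9). For example, the neglect of the fourth cumulant ΔC_{4} ≃ 0.0001 Å^{4} in the fitting interval 2–10 Å^{−1} gives an underestimation of ΔC_{2} of about 0.0032 Å^{2} and of the logarithm of N of about 0.096.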
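The amplitude-ratio example can be reproduced by solving the 2×2 normal equations directly. The sketch below assumes an unweighted integral least-squares fit of ln N′ − 2k^{2}ΔC_{2}′ to a signal that also contains the term (2/3)k^{4}ΔC_{4}; the helper `moment` and the function name are introduced here for illustration.

```python
def moment(n, k_min, k_max):
    """Integral of k**n over the fitting interval."""
    return (k_max ** (n + 1) - k_min ** (n + 1)) / (n + 1)

def amplitude_shifts(dC4, k_min, k_max):
    """Return (ln N' - ln N, dC2' - dC2) when the fourth cumulant is neglected."""
    I0, I2, I4, I6 = (moment(n, k_min, k_max) for n in (0, 2, 4, 6))
    c = (2 / 3) * dC4                    # coefficient of the neglected k^4 term
    det = I0 * I4 - I2 ** 2              # determinant of the normal equations
    a = c * (I4 ** 2 - I2 * I6) / det    # constant absorbed by the fit: ln N' - ln N
    b = c * (I0 * I6 - I2 * I4) / det    # k^2 coefficient absorbed: -2 (dC2' - dC2)
    return a, -b / 2

d_lnN, d_C2 = amplitude_shifts(1e-4, 2.0, 10.0)
print(f"shift of ln N: {d_lnN:.3f}, shift of dC2: {d_C2:.4f} A^2")
```

Under these assumptions both shifts are negative for ΔC_{4} > 0, with magnitudes of about 0.096 and 0.0032 Å^{2}, in agreement with the values quoted in the text.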
3. Results
Following the procedure described in the previous section, the errors in the cumulant analysis have been derived for different cases. The results are listed below.
3.1. Phases difference
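As a sketch of how the procedure of §2 carries over to the phases difference when the fifth cumulant enters, assume the same unweighted integral least squares over the interval k_{m}–k_{M} and a fifth-order term +(4/15)k^{5}ΔC_{5} in the phase series (the standard cumulant-expansion coefficient, taken here as an assumption). The shorthand I_{n} is introduced only for this sketch and is not notation from the paper; the expressions are one consistent way of writing the result, not necessarily the form used by the author.

```latex
% Shorthand (assumption of this sketch):
%   I_n = \int_{k_m}^{k_M} k^n \, dk = \frac{k_M^{n+1} - k_m^{n+1}}{n+1}

% Only \Delta C_1' is fitted; third and fifth cumulants are neglected:
\Delta C_1' = \Delta C_1
  - \frac{2}{3}\,\Delta C_3\,\frac{I_4}{I_2}
  + \frac{2}{15}\,\Delta C_5\,\frac{I_6}{I_2}

% \Delta C_1' and \Delta C_3' are fitted; the fifth cumulant is neglected:
\Delta C_1' = \Delta C_1
  - \frac{2}{15}\,\Delta C_5\,\frac{I_4 I_8 - I_6^2}{I_2 I_6 - I_4^2}

\Delta C_3' = \Delta C_3
  - \frac{1}{5}\,\Delta C_5\,\frac{I_2 I_8 - I_4 I_6}{I_2 I_6 - I_4^2}
```

With ΔC_{5} = 0 the first expression reduces to the result of §2 for the neglect of the third cumulant, since I_{4}/I_{2} = (3/5)(k_{M}^{5} − k_{m}^{5})/(k_{M}^{3} − k_{m}^{3}). In the two-parameter case both ratios of moments are positive, so for ΔC_{5} > 0 both fitted cumulants are underestimated.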
3.2. Amplitudes ratio
3.2.1. Neglect of the fourth and sixth cumulant
In the case that the coordination number, fourth and sixth cumulant are neglected, the second cumulant becomes
3.2.2. Neglect of the fourth and sixth cumulant
Neglecting both the fourth and sixth cumulant, the coordination number and the second cumulant become
When the sixth cumulant is negligible (ΔC_{6} = 0), equations (14) and (15) reduce to (8) and (9), respectively.
3.2.3. Neglect of the sixth cumulant
When the sixth cumulant is neglected, the coordination number, the second and the fourth cumulant change as
Accordingly, they are underestimated when ΔC_{6} > 0 and overestimated when ΔC_{6} < 0.
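The cases involving the sixth cumulant can be explored numerically with a small general routine. The sketch below assumes the same unweighted integral least squares as in §2 and the series coefficients 1, −2, 2/3 and −4/45 for ln N, ΔC_{2}, ΔC_{4} and ΔC_{6}; the function names, the dictionary `COEFF` and the sample value of ΔC_{6} are hypothetical and serve only as an illustration.

```python
def moment(n, k_min, k_max):
    """Integral of k**n over the fitting interval."""
    return (k_max ** (n + 1) - k_min ** (n + 1)) / (n + 1)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# Coefficient multiplying each parameter in the amplitude series, keyed by power of k:
# 0 -> ln N, 2 -> dC2, 4 -> dC4, 6 -> dC6.
COEFF = {0: 1.0, 2: -2.0, 4: 2 / 3, 6: -4 / 45}

def amplitude_shifts(fitted, neglected, k_min, k_max):
    """Shift of each fitted parameter caused by the neglected terms.

    fitted:    powers of k kept in the fit, e.g. (0, 2, 4)
    neglected: {power: true parameter value} for the omitted terms
    """
    A = [[moment(p + q, k_min, k_max) for q in fitted] for p in fitted]
    b = [sum(COEFF[m] * v * moment(p + m, k_min, k_max) for m, v in neglected.items())
         for p in fitted]
    absorbed = solve(A, b)  # polynomial coefficients absorbed by the fit
    return {p: c / COEFF[p] for p, c in zip(fitted, absorbed)}

# Sixth cumulant neglected, ln N, dC2 and dC4 fitted (dC6 value is illustrative).
three = amplitude_shifts((0, 2, 4), {6: 1e-6}, 2.0, 10.0)
# Fourth and sixth cumulant neglected, with dC6 = 0.
two = amplitude_shifts((0, 2), {4: 1e-4, 6: 0.0}, 2.0, 10.0)
print(three, two)
```

For ΔC_{6} > 0 all three shifts come out negative, consistent with the underestimation stated above, and with ΔC_{6} = 0 the two-parameter case returns the same shifts as the fourth-cumulant example of §2.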
3.2.4. Neglect of the coordination number
Let us consider the case that (2) truncated at the fourth order (ΔC_{4}) gives a good fit. If the variation of the coordination number is neglected, the second and the fourth cumulant result as follows,
Equation (19) shows the correlation between coordination number and Debye–Waller factor. As expected, a decrease (or increase) of the coordination number, if neglected, leads to an increase (or decrease) of the second cumulant, according to the amplitude of the signal. This is particularly important in studies of glasses or disordered systems, where coordination number and Debye–Waller factor play a key role (Kuzmin et al., 2006; Sanson, Rocca, Dalba et al., 2007; Sanson, Rocca, Fornasini et al., 2007).
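The direction of this correlation can be illustrated numerically. The sketch below assumes an unweighted integral least-squares fit in which only ΔC_{2}′ and ΔC_{4}′ are free while the data contain a constant term ln N that is not fitted; the function names and the values N = 0.9 and N = 1.1 are illustrative assumptions, not values from the paper.

```python
import math

def moment(n, k_min, k_max):
    """Integral of k**n over the fitting interval."""
    return (k_max ** (n + 1) - k_min ** (n + 1)) / (n + 1)

def shifts_without_N(N, k_min, k_max):
    """Return (dC2' - dC2, dC4' - dC4) when ln N is present but not fitted."""
    I2, I4, I6, I8 = (moment(n, k_min, k_max) for n in (2, 4, 6, 8))
    det = I4 * I8 - I6 ** 2
    b = math.log(N) * (I2 * I8 - I4 * I6) / det   # k^2 coefficient absorbed by the fit
    c = math.log(N) * (I4 ** 2 - I2 * I6) / det   # k^4 coefficient absorbed by the fit
    # The series uses -2 k^2 dC2 and (2/3) k^4 dC4, hence the conversion factors.
    return -b / 2, 1.5 * c

lower = shifts_without_N(0.9, 2.5, 12.0)   # coordination number decreases
higher = shifts_without_N(1.1, 2.5, 12.0)  # coordination number increases
print(lower, higher)
```

A decrease of the coordination number (N < 1) produces a positive shift of the fitted second cumulant and an increase (N > 1) a negative one, which is the behaviour described in the text; under the same assumptions the fourth cumulant shifts in the same direction.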
4. Discussion
Let us test the equations calculated in the previous sections through an experimental example. To this aim, let us consider the phases difference and the logarithm of the amplitudes ratio of the EXAFS signals measured in silver molybdate glasses at room temperature against the same glass at 25 K used as reference (Sanson, Rocca, Dalba et al., 2007).
Fig. 1 shows the difference of the phases and the corresponding best fits in the range k = 2.5–12 Å^{−1}. The fits were performed (a) using only the first cumulant (ΔC_{1}), (b) including the third cumulant (ΔC_{1} + ΔC_{3}) and (c) including the fifth cumulant (ΔC_{1} + ΔC_{3} + ΔC_{5}). The corresponding fitting results are reported in the first part of Table 1. It can be observed that the third cumulant is essential to obtain accurate relative values of the first cumulant. In this example, the discrepancy on ΔC_{1} between fit (a) and fit (b) [or fit (c)] is about 0.02 Å. The discrepancy on ΔC_{3} between fit (b) and fit (c), although less important, is about 10^{−4} Å^{3}. These discrepancies can be directly estimated by (5), (10) and (11)–(12) (depending on the fit case) with k_{m} = 2.5 and k_{M} = 12 Å^{−1}. The results are listed in the second part of Table 1. It can be seen that the agreement between predicted values (i.e. second part of Table 1) and best-fit values (i.e. first part of Table 1) is excellent.
Analogously, Fig. 2 shows the logarithm of the amplitude ratios and the corresponding best fits in the same interval k = 2.5–12 Å^{−1}. The fits were performed (a) using only the second cumulant (ΔC_{2}), (b) including the fourth cumulant (ΔC_{2} + ΔC_{4}), (c) including only the coordination number and the second cumulant (N + ΔC_{2}) and (d) including the coordination number and the second and fourth cumulants (N + ΔC_{2} + ΔC_{4}). For simplicity, the best fits that include the sixth cumulant are not reported, but the reliability of the corresponding equations (14)–(18) is assured anyway.
The best-fitting results are listed in the first part of Table 2. It can be seen, for example, that the changes of the coordination number, when neglected, drastically affect the values of the second cumulant, as well as of the fourth cumulant. The discrepancy on ΔC_{2} between fit (b) and fit (d) is almost 0.004 Å^{2}, and about 7 × 10^{−5} Å^{4} on ΔC_{4}. The cumulant differences among the fits can be directly estimated from the equations of §2 and §3. The results are listed in the second part of Table 2. The agreement with the best-fit values (i.e. with the first part of Table 2) confirms the validity of the analytical expressions derived in this paper.

Before the conclusions, let us make a final observation. The experimental data cannot be fitted using an unrestricted number of fitting parameters: the quality of the fit improves, but the essential parameters (i.e. distance, Debye–Waller factor, coordination number) can become less reliable. On the other hand, in many cases the main higher-order cumulants cannot be neglected, so it is necessary to find a good balance.
5. Conclusions
In this work, analytical expressions have been derived to determine the errors in the EXAFS analysis, by the ratio method, owing to the neglect of the higher-order cumulants. The reliability of the present results has been tested on experimental data. The importance of the higher-order cumulants to obtain accurate values of the lower-order cumulants, i.e. bond distance, coordination number and Debye–Waller factor, is demonstrated.
References
Ahmed, S. I. S., Dalba, G., Fornasini, P., Vaccari, M., Rocca, F., Sanson, A., Li, J. & Sleight, A. W. (2009). Phys. Rev. B, 79, 104302.
Beccara, S. a, Dalba, G., Fornasini, P., Grisenti, R., Pederiva, F., Sanson, A., Diop, D. & Rocca, F. (2003). Phys. Rev. B, 68, 140301.
Berlier, G., Spoto, G., Fisicaro, P., Bordiga, S., Zecchina, A., Giamello, E. & Lamberti, C. (2002). Microchem. J. 71, 101–116.
Boyce, J. B., Hayes, T. M. & Mikkelsen, J. C. Jr (1981). Phys. Rev. B, 23, 2876–2896.
Boyce, J. B., Hayes, T. M., Stutius, W. & Mikkelsen, J. C. Jr (1977). Phys. Rev. Lett. 38, 1362–1365.
Bunker, G. (1983). Nucl. Instrum. Methods Phys. Res. 207, 437–444.
Bus, E., Miller, J. T., Kropf, A. J., Prins, R. & van Bokhoven, J. A. (2006). Phys. Chem. Chem. Phys. 8, 3248–3258.
Chu, W. S., Zhang, S., Yu, M. J., Zheng, L. R., Hu, T. D., Zhao, H. F., Marcelli, A., Bianconi, A., Saini, N. L., Liu, W. H. & Wu, Z. Y. (2009). J. Synchrotron Rad. 16, 30–37.
Dalba, G., Fornasini, P., Grazioli, M. & Rocca, F. (1995). Phys. Rev. B, 52, 11034–11043.
Diaz-Moreno, S., Koningsberger, D. C. & Munoz-Pàez, A. (1997). Nucl. Instrum. Methods Phys. Res. B, 133, 15–23.
Eisenberger, P. & Brown, G. S. (1979). Solid State Commun. 29, 481–484.
Filipponi, A. (2001). J. Phys. Condens. Matter, 13, R23–R60.
Fornasini, P., Monti, F. & Sanson, A. (2001). J. Synchrotron Rad. 8, 1214–1220.
Katsikini, M., Pinakidou, F., Paloura, E. C., Komninou, Ph., Georgakilas, A. & Welter, E. (2008). Phys. Status Solidi A, 205, 2611–2614.
Kuzmin, A., Dalba, G., Fornasini, P., Rocca, F. & Sipr, O. (2006). Phys. Rev. B, 73, 174110.
Mobilio, S. & Incoccia, L. (1984). Nuovo Cimento D, 3, 846–866.
Sanson, A., Rocca, F., Armellini, C., Dalba, G., Fornasini, P. & Grisenti, R. (2008). Phys. Rev. Lett. 101, 155901.
Sanson, A., Rocca, F., Dalba, G., Fornasini, P. & Grisenti, R. (2007). New J. Phys. 9, 88.
Sanson, A., Rocca, F., Fornasini, P., Dalba, G., Grisenti, R. & Mandanici, A. (2007). Philos. Mag. 87, 769–777.
Soldo, Y., Hazemann, J. L., Aberdam, D., Inui, M., Tamura, K., Raoux, D., Pernot, E., Jal, J. F. & Dupuy-Philon, J. (1998). Phys. Rev. B, 57, 258–268.
Swilem, Y., Sobczak, E., Nietubyć, R. & Ślawska-Waniewska, A. (2005). Physica B, 364, 71–77.
Teo, B. K. (1986). EXAFS: Basic Principles and Data Analysis. Berlin: Springer-Verlag.
Tranquada, J. M. & Ingalls, R. (1983). Phys. Rev. B, 28, 3520–3528.
Vaccari, M., Grisenti, R., Fornasini, P., Rocca, F. & Sanson, A. (2007). Phys. Rev. B, 75, 184307.
Wei, S., Oyanagi, H., Liu, W., Hu, T., Yin, S. & Bian, G. (2000). J. Non-Cryst. Solids, 275, 160–168.
Yokoyama, T., Ohta, T. & Sato, H. (1997). Phys. Rev. B, 55, 11320–11329.
© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.