Volume 36 | Received 15 March 2002

Improved measures of quality for the atomic pair distribution function

(a) Department of Physics and Astronomy and Center for Fundamental Materials Research, Michigan State University, East Lansing, MI 48824-1116, USA, and (b) Los Alamos National Laboratory, LANSCE-12, Mailstop H805, Los Alamos, NM 87544, USA

The introduction of neutron spallation-source instruments, such as the General Materials Diffractometer (GEM) at ISIS, allows measurement of pair distribution function (PDF) data at significantly higher rates than previously possible. As a result of the increased rate, a single experiment can produce over a hundred individual runs. Manual processing of all these data using traditional methods becomes inconvenient and inefficient. This article presents quality criteria that help produce automated direct Fourier transformed PDFs of quality similar to hand-processed data, and compares optimization methods.

Keywords: pair distribution function; total scattering; neutron diffraction.
Different techniques for structure determination have developed over the years. X-ray and neutron diffraction studies are useful since they allow structure measurements on the ångström length scale. The technique discussed in this article is the atomic pair distribution function (PDF) method. The PDF is widely used to study glasses and amorphous materials (Wagner, 1978; Warren, 1990), but more recently it has been used to study the structure of crystalline materials (Egami, 1990, 1998). An important aspect of every structural technique is to obtain the best results possible from a given measurement. This can be achieved either by improving the measurement method or by processing the data better; we will be discussing the latter.
The PDF, G(r), is obtained from the experimentally determined total-scattering structure function, S(Q), by a sine Fourier transform,

$$G(r) = 4\pi r[\rho(r) - \rho_0] = \frac{2}{\pi}\int_0^{\infty} Q[S(Q) - 1]\sin(Qr)\,dQ, \qquad (1)$$

where r is the distance between two atoms and Q is the momentum transfer. ρ(r), defined as (Faber & Ziman, 1965)

$$\rho(r) = \frac{1}{4\pi N r^2}\sum_{i}\sum_{j\neq i}\frac{b_i b_j}{\langle b\rangle^2}\,\delta(r - r_{ij}), \qquad (2)$$

is the microscopic pair density, and ρ0 is the average number density of the sample. The sum is taken over all N atoms in the sample, b_i is the scattering length of atom i, and rij = |ri - rj| is the distance separating atoms i and j. ρ(r) gives the probability of finding two atoms separated by a distance r, weighted by the scattering lengths, and averaged over all pairs of atoms in the sample.
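The direct sine Fourier transform can be illustrated numerically. The following is a minimal Python sketch, not the processing code used for the data in this article: it assumes S(Q) is tabulated on an increasing Q grid, truncates the integral at the last grid point (Qmax), and uses the trapezoid rule. The function name `pdf_from_sq` is hypothetical.

```python
import math

def pdf_from_sq(q, s, r_values):
    """Return G(r) = (2/pi) * integral of Q[S(Q) - 1] sin(Qr) dQ.

    q: increasing list of Q values; s: S(Q) on that grid.
    The integral is evaluated by the trapezoid rule and stops at q[-1].
    """
    g = []
    for r in r_values:
        # Integrand Q[S(Q) - 1] sin(Qr) on the tabulated grid.
        f = [qi * (si - 1.0) * math.sin(qi * r) for qi, si in zip(q, s)]
        area = 0.0
        for k in range(1, len(q)):
            area += 0.5 * (f[k] + f[k - 1]) * (q[k] - q[k - 1])
        g.append(2.0 / math.pi * area)
    return g
```

Because the integral stops at the last tabulated Q, any residual offset of S(Q) from unity at high Q enters the result directly, which is why the asymptote matters for the transform.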
In a real experiment, a number of corrections to the intensity data must be carried out in order to obtain the structure function, S(Q), normalized to the total-scattering cross section of the sample. In principle, the data corrections are well known and well understood (Klug & Alexander, 1974; Wagner, 1978; Soper et al., 1989; Hannon et al., 1990; Warren, 1990; Billinge & Egami, 1993; Wright et al., 1995) and the data analysis can be carried out with no adjustable parameters. In practice, a number of approximations must be made in calculating these corrections and certain parameters are not known with high accuracy. Using these approximations results in a corrected normalized S(Q) that contains distortions. The distortions are usually dealt with in somewhat arbitrary ways, as described below. Fortunately, the structural information in the PDF is rather robust with respect to these distortions. Inadequacies in the corrections tend to result in very long wavelength distortions of S(Q), giving rise to nonphysical features in G(r) at low values of atomic separation, r, too small for real atomic separations. The distortions do not affect the data except insofar as ripples from these features propagate into the high-r region. Frequently, an expert eye is needed to minimize distortions of S(Q) so that the physics can be studied. Nonetheless, it is clearly of interest to find more quantitative criteria for assessing the quality of a PDF and to minimize these distortions and obtain the most accurate PDFs possible. This becomes more important as new instruments, such as the General Materials Diffractometer (GEM) at ISIS, come online, and increased data acquisition rates necessitate an automated data processing method. Though the data corrections to obtain S(Q) from X-ray data are different from the neutron case described above and discussed in detail in the paper below, the S(Q) function before direct Fourier transformation is important in both cases. The quality criteria and optimization procedures described here are applicable to both X-ray and neutron-derived S(Q) functions.
A number of indirect methods have been proposed for the Fourier transform, such as the use of the Reverse Monte Carlo method (Zetterstrom & McGreevy, 2000) and Bayesian methods (Terwilliger, 1994). Here we consider how to obtain the best S(Q) possible for direct Fourier transform. Also, in most experiments multiple diffraction patterns are combined to form a single S(Q). The process of combining multiple spectra, from different scans or banks, is a sizable topic on its own and will be dealt with in a separate article.
The S(Q) and G(r) functions exhibit certain known properties that can be made use of to assess and optimize the data corrections. With real data, the easiest problem to detect is when S(Q) does not asymptote to unity as Q → ∞. In practice, the data are adjusted to obtain the right asymptote. However, it is not a priori clear whether the correction should be to add a constant, multiply by a factor, or apply some other correction. This is because it is often not clear which data distortion, or distortions, are primarily to blame for the problem. Some data distortions are additive, such as background, empty can, multiple scattering and incoherent scattering subtractions, and others are multiplicative, such as normalization for flux and number of atoms in the illuminated sample volume, and absorption corrections.
A number of different approaches are often taken at this point. The most common is to make the sample density a parameter and vary it until the asymptotic behavior of S(Q) is correct [S(Q) → 1 as Q → ∞]. This practice is somewhat arbitrary, since in many cases this will not be the limiting factor in the accuracy of the corrections as the sample density is easy to determine with reasonable accuracy. It should be noted that varying the sample density applies a predominantly multiplicative correction (it strongly affects the sample normalization and the absorption correction) to the data with a small additive part (from the multiple-scattering correction and an incorrect background subtraction due to the incorrect absorption correction). Another commonly varied parameter is the effective beam width. The beam size is known from the collimation of the instrument, but the beam may not be homogeneous (Soper et al., 1989; Hannon et al., 1990). Therefore, the effective beam width will be different from the physical beam size due to varying intensity across the beam profile. This beam size is a predominantly multiplicative correction due to flux normalization; however, it will also have an additive component due to differently evaluated multiple-scattering corrections. Less commonly used parameters are sample height or effective beam height.
Since the choice of parameters to vary is mostly arbitrary, it is interesting to see whether taking a completely arbitrary approach of simply multiplying the data by a constant and/or adding a constant results in PDFs of equally good quality. Here we compare a number of different approaches for data normalization. Since our primary interest is crystalline materials, we have chosen to study a neutron data set from pure germanium. We systematically apply both a multiplicative and an additive correction to the processed S(Q), obtain the PDF by direct Fourier transform, and analyse the results using a number of PDF quality criteria defined below. The PDF is very sensitive to the asymptotic behavior of S(Q) so we confine our interest to data where the high-Q average of S(Q) is 1.00 (2). Since germanium is crystalline with a well defined structure, by modeling we can easily determine when the data are properly normalized. It is then possible to compare different approaches that satisfy these criteria; for example, varying a multiplicative constant, α, and an additive constant, β, varying the sample density ρeff and β, and so on. The resulting PDFs are then compared using the quality factors. Finally, we suggest a protocol for automatically obtaining the best PDF given an initial experimentally derived S(Q). While the analysis is only carried out for a single data set, the methods described below can be used to automate the analysis of several data sets.
As discussed above, the corrections to the data are incomplete. This is not only due to the simplifications made to the data corrections but also because characterization runs do not yield the complete picture of the physical setup. Multiple-scattering events in which an incident neutron or X-ray scatters off the sample and then off the sample environment before entering a detector cannot be fully subtracted using characterization runs, since the sample is not present in those runs. These inadequacies in the measurement and processing result in a total-scattering structure function, S'(Q), that is different from the true S(Q). In general, the measured S'(Q) can be written in terms of the true S(Q) as
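$$S'(Q) = \alpha(Q)\,S(Q) + \beta(Q), \qquad (3)$$

where α(Q) and β(Q) are dimensionless functions. The simplest approximation is to take α(Q) and β(Q) as being independent of momentum transfer, Q, when equation (3) becomes

$$S'(Q) = \alpha S(Q) + \beta. \qquad (4)$$

Then the PDF associated with S'(Q) can be written in terms of S(Q) as

$$G'(r) = \frac{2}{\pi}\int_0^{\infty} Q[\alpha S(Q) + \beta - 1]\sin(Qr)\,dQ = \alpha G(r) + (\alpha + \beta - 1)\,\frac{2}{\pi}\int_0^{\infty} Q\sin(Qr)\,dQ. \qquad (5)$$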
In addition to α and β, the other experimental effect is a finite measurement range. The PDF then becomes

$$G'(r) = \alpha G_c(r) + \epsilon(r), \qquad \epsilon(r) = (\alpha + \beta - 1)\,\frac{2}{\pi}\int_0^{Q_{max}} Q\sin(Qr)\,dQ, \qquad (6)$$

where Gc(r) is the ideal G(r) convoluted with a termination function and ε(r) is defined in equation (6). The term ε(r) is the result of an improper high-Q asymptote of the S(Q) and gives rise to low-r ripples often seen in experimental PDFs. Both Gc(r) and ε(r) are discussed in more detail in Appendix A. For compactness, in the rest of the article G(r) will refer to the measured PDF approximated by equation (6), G'(r).
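The low-r ripple term can be made explicit. As a sketch, assume constant distortions written as S'(Q) = αS(Q) + β, where α and β denote the multiplicative and additive constants (notation assumed here), and a sharp cutoff of the transform at Qmax. Since S'(Q) − 1 = α[S(Q) − 1] + (α + β − 1), the extra contribution to the PDF is obtained by integrating by parts:

```latex
\frac{2}{\pi}(\alpha+\beta-1)\int_0^{Q_{max}} Q\sin(Qr)\,dQ
  = \frac{2}{\pi}(\alpha+\beta-1)\,
    \frac{\sin(Q_{max}r) - Q_{max}r\cos(Q_{max}r)}{r^{2}}
```

Under these assumptions the term vanishes when α + β = 1, and for Qmax r much larger than one its envelope falls off as Qmax/r, which is consistent with ripples that are largest at low r and decay into the physical region of the PDF.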
Quantifying the difference between the measured and ideal PDF is of great interest when the PDF is unknown. G(r) has certain known properties that can be used to assess its quality. Here we list several criteria used for determining an optimal PDF that is closest to the real structural PDF. The quality criteria presented here are derived and symbols are defined in Appendix B.
In the case where the crystallographic structure of the sample is being studied there are well established methods of determining the quality of data. Programs such as PDFFIT (Proffen & Billinge, 1999) and GSAS (Larson & Von Dreele, 1994; Toby, 2001) allow a structural model to be refined and then find how well the model and data agree. Usually this is used to determine how correct a model is; however, if the structure is already well known then these programs can be used to test the quality of measured data. In PDFFIT two criteria that tell us about data quality are the weighted profile agreement factor, Rwp, and the scale factor, Nm. Rwp is defined as (Proffen & Billinge, 1999)

$$R_{wp} = \left\{\frac{\sum_{i} w(r_i)\,[G_{obs}(r_i) - G_{calc}(r_i)]^2}{\sum_{i} w(r_i)\,G_{obs}^2(r_i)}\right\}^{1/2}, \qquad (7)$$

where Gobs(r) and Gcalc(r) are the measured and model PDFs, respectively, and w(r) is the weighting factor. This definition is used in standard crystal structure refinements. Note that the definition of Rwp, equation (7), is valid as it stands and provides a useful way of optimizing a model and of comparing the goodness-of-fit of different models (Peterson et al., 2001). However, because points in the PDF are statistically correlated, Rwp does not have the same statistical significance as crystallographic Rwp functions. For example, it is not possible to obtain a reliable χ² value from it. The scale factor, Nm, is the factor by which the model must be multiplied to give good agreement with the data. We therefore define the factor Nd = N = 1/Nm as the factor by which the data must be multiplied to result in a properly scaled G(r) (Nm = 1). These two criteria, and others resulting from such fitting, require significant knowledge of the material a priori. In general they will not be used to determine data quality but instead model quality. For this reason they are not discussed further. The remaining criteria presented will require knowledge of the chemical composition and average number density, ρ0.
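The weighted profile agreement factor can be evaluated in a few lines. The sketch below assumes the usual form, the square root of the weighted sum of squared differences between observed and calculated PDFs divided by the weighted sum of squared observed values, with unit weights when none are supplied. The function name `r_wp` is hypothetical and this is not PDFFIT code.

```python
import math

def r_wp(g_obs, g_calc, weights=None):
    """Weighted profile agreement factor between observed and model PDFs.

    Assumes Rwp = sqrt( sum w (Gobs - Gcalc)^2 / sum w Gobs^2 ).
    Unit weights are used if none are given.
    """
    if weights is None:
        weights = [1.0] * len(g_obs)
    num = sum(w * (o - c) ** 2 for w, o, c in zip(weights, g_obs, g_calc))
    den = sum(w * o ** 2 for w, o in zip(weights, g_obs))
    return math.sqrt(num / den)
```

A perfect model gives 0 and a model that is identically zero gives 1, so values are conveniently read as a fractional misfit.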
Using only the information about the general behavior of S(Q) and the PDF, there are three criteria that S(Q) should conform to. The first is the equality (Appendix B)

$$S_{int} = \int_0^{\infty} Q^2[S(Q) - 1]\,dQ = -2\pi^2\rho_0. \qquad (8)$$

Secondly, the high-Q portion of S(Q) should approach unity. This is seen in

$$S_{avg} = \langle S(Q)\rangle = 1, \qquad (9)$$

where the angle brackets indicate an average over a range in Q. In this article the average is taken over 24 Å⁻¹ < Q < Qmax = 40 Å⁻¹ for the synthetic data in §2.2, and 15 Å⁻¹ < Q < Qmax = 25 Å⁻¹ for the measured data presented in §3.1. This range, 0.6Qmax < Q < Qmax, was determined empirically. Thirdly, one can also look at the total dispersion of S(Q) between the low-Q and high-Q asymptotes to find Sdisp [equation (10), Appendix B], in which b is the neutron scattering length and the averages are over isotopes and elements. An analog of equation (10) exists for X-rays where the neutron scattering lengths are replaced with X-ray form factors evaluated at Q = 0 Å⁻¹, f(0). Both Savg and Sdisp are calculated by averaging S(Q) over a range. While the principles behind them are sound, their values could be determined more reliably, for example through the use of Bayesian statistics (David & Sivia, 2001), though this is beyond the scope of this article.
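The high-Q average is straightforward to compute. The following sketch assumes S(Q) is tabulated on a uniform Q grid, so that a simple mean over the window 0.6Qmax < Q ≤ Qmax stands in for the average; the function name `s_avg` and the `fraction` parameter are hypothetical.

```python
def s_avg(q, s, fraction=0.6):
    """Mean of S(Q) over fraction*Qmax < Q <= Qmax, with Qmax = max(q).

    Assumes a uniform Q grid, so an unweighted mean is adequate.
    """
    q_max = max(q)
    window = [si for qi, si in zip(q, s) if fraction * q_max < qi <= q_max]
    if not window:
        raise ValueError("no points in the averaging window")
    return sum(window) / len(window)
```

For data with Qmax = 25 Å⁻¹ the default window starts at 15 Å⁻¹, matching the range quoted for the measured data.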
Similar to the previous criteria of S(Q), the PDF can be looked at without prior structural knowledge. The real-space analog of equation (8) is the integral criterion Gint, equation (11) (Appendix B). The largest effect of distortions in S(Q) is to introduce ripples in G(r) at low r. It is therefore reasonable to adjust α and β in such a way as to minimize these ripples. We propose the criterion ΔGlow, equation (12), to accomplish this. In it, ρfit is the average number density determined by fitting the low-r region of the PDF and rlow lies before the first peak. For the data presented here, rlow of 2 Å was used. ΔGlow is designed as a robust criterion for automatically estimating (i.e. with no user input) the magnitude of ripples in the unphysical low-r region of the PDF below the first atom-pair peak. The exact form of ΔGlow is justified in Appendix B. While many criteria presented in this section are defined for an infinite range, for real data they are evaluated over a finite range. In the next section these criteria will be further explored using synthetic data.
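The idea behind a low-r ripple criterion can be sketched as follows. This is an illustrative measure only, not necessarily the form of equation (12): it assumes that below the first peak an ideal PDF follows the straight line −4πρr, fits the slope by least squares through the origin to obtain a fitted density, and reports the summed squared residual relative to the summed squared baseline. The function name `low_r_ripple` is hypothetical.

```python
import math

def low_r_ripple(r, g, r_low=2.0):
    """Return (rho_fit, ripple) for the region 0 < r < r_low.

    Assumption: the ideal low-r PDF is G(r) = -4*pi*rho*r, so a
    least-squares line through the origin gives rho_fit, and the
    ripple is sum(residual^2) / sum(baseline^2).
    """
    pts = [(ri, gi) for ri, gi in zip(r, g) if 0.0 < ri < r_low]
    if not pts:
        raise ValueError("no points below r_low")
    slope = sum(ri * gi for ri, gi in pts) / sum(ri * ri for ri, _ in pts)
    rho_fit = -slope / (4.0 * math.pi)
    resid = sum((gi - slope * ri) ** 2 for ri, gi in pts)
    base = sum((slope * ri) ** 2 for ri, _ in pts)
    if base == 0.0:
        raise ValueError("fitted baseline is zero; ripple is undefined")
    return rho_fit, resid / base
```

With this normalization, multiplying G(r) by a constant rescales the fitted density but leaves the ripple measure unchanged, which mirrors the scale behavior reported for ΔGlow and ρfit in the text.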
To understand further the quality measures presented in the previous section, we test them against synthetic PDFs of varying quality. All of the test PDFs were generated from the same initial PDF. The initial PDF was created by calculating G(r) from the known structure of germanium (at 10 K) using PDFFIT. Instrumental parameters in PDFFIT were chosen to be appropriate for the Glass, Liquid and Amorphous Materials Diffractometer (GLAD) at the Intense Pulsed Neutron Source (IPNS), so comparisons can be made with measured data (the instrument resolution, σQ, is 0.0657 Å⁻¹) (Proffen & Billinge, 1999). This PDF was Fourier transformed to produce an ideal S(Q) using

$$S(Q) = 1 + \frac{1}{Q}\int_0^{r_{max}} G(r)\sin(Qr)\,dr. \qquad (13)$$

At r = 60 Å, the PDF has already reached its asymptotic value of zero, with this instrument resolution, as seen in Fig. 1(a). The S(Q) produced by this method can be Fourier transformed back from 0 to 100 Å⁻¹ to reproduce the initial PDF. A higher instrument resolution could be created by calculating to larger r values. For example, to synthesize GEM data (σQ ≈ 0.035 Å⁻¹), where the PDF reaches its asymptotic value near 160 Å, the integral in equation (13) would need to be evaluated to a significantly higher upper limit.
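Generating a synthetic S(Q) from a model PDF uses the inverse sine transform. The sketch below assumes the standard relation S(Q) = 1 + (1/Q)∫G(r) sin(Qr) dr, a tabulated G(r) that has decayed to zero by the end of the r grid, and trapezoid integration. The function name `sq_from_pdf` is hypothetical.

```python
import math

def sq_from_pdf(r, g, q_values):
    """Return S(Q) = 1 + (1/Q) * integral of G(r) sin(Qr) dr.

    r: increasing list of distances; g: G(r) on that grid.
    Every Q in q_values must be non-zero. The integral is taken over
    the tabulated r range by the trapezoid rule.
    """
    s = []
    for q in q_values:
        f = [gi * math.sin(q * ri) for ri, gi in zip(r, g)]
        area = sum(0.5 * (f[k] + f[k - 1]) * (r[k] - r[k - 1])
                   for k in range(1, len(r)))
        s.append(1.0 + area / q)
    return s
```

The upper limit of the r grid plays the role described in the text: a PDF that decays more slowly (higher instrument resolution) needs a longer grid before the integral converges.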
Figure 1. Synthetic 10 K germanium PDF: (a) as calculated by PDFFIT and resulting S(Q); (b) calculated by Fourier transform. Insets are a reduced vertical scale of the functions. The synthetic data were calculated using parameters to mimic a GLAD measurement.
The utility of the quality factors can be seen by looking at the synthetic data for different values of Qmax representing ideal and measured data. The different quality factors calculated for Qmax values of 100 and 40 Å⁻¹ (rmax = 100 Å) are shown in Table 1. In the table, the reader will quickly notice that the quality factors in equations (8) and (11) vary significantly from their theoretical values, especially as Qmax is reduced to 40 Å⁻¹, even in the current case where there are no distortions to the data. This is because of the instrument resolution, σQ. In real space, the instrument resolution damps the peaks, removing weight from the PDF. This damping is the reason for the PDF reaching its high-r behavior at the small distance of 60 Å. This was seen as Sint and Gint varied with the instrument resolution. Since the peaks are missing weight, Gint and Sint are not their ideal values. For this reason, they will not be considered further. Apart from Sint and Gint, the values of the other criteria are not more than 3% different from their theoretical values. This also gives a measure of the minimum significant uncertainty in the quality criteria, 3%.
Measured spectra tend to have errors which are both random and systematic in nature. Random noise originates from measurement statistics. There are many sources of systematic errors, from bad detectors to an unstable source. Systematic errors are normally dealt with by determining where they come from, fixing the problem and remeasuring. Frequently, it is not possible to remeasure a data set, or the systematic error is subtle enough not to be noticed. In these cases one can either disregard the data or get an idea of their quality and try to understand the underlying physics, knowing that the data do have a systematic error and knowing its effect on the data. The following examples all started from the synthetic data set described above with known errors introduced into S(Q). The S(Q) functions were Fourier transformed using a Qmax of 40 Å-1, obtainable at spallation-source neutron and synchrotron X-ray instruments.
The effect of random noise is handled in two ways: by adding constant noise and noise that increases with Q. Noise is introduced by adding a random number between ±0.5 dS, where dS is shown in the insets to Fig. 2, along with the resulting S(Q) functions and PDFs. The effect of random noise on the criteria can be seen in Table 2. As seen in Fig. 2, the effect of random noise is most easily seen in the regions where there are no PDF peaks. Being average criteria, the values of Savg and Sdisp do not appreciably change, within the prescribed 3%; the value of ΔGlow does.
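The noise model can be sketched in a few lines. The example assumes uniformly distributed noise in the interval ±0.5 dS at each point, with dS supplied per point so that both the constant and the Q-dependent case are covered. The actual dS profiles are those shown in the figure insets; the function name `add_uniform_noise` and the fixed seed are hypothetical choices for reproducibility.

```python
import random

def add_uniform_noise(s, ds, seed=0):
    """Add a random number between -0.5*dS and +0.5*dS to each S(Q) point.

    s: S(Q) values; ds: noise width dS at each point (constant list for
    constant noise, increasing list for Q-dependent noise).
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return [si + rng.uniform(-0.5, 0.5) * di for si, di in zip(s, ds)]
```

Passing a list such as `[0.02] * len(s)` gives constant noise, while a list that grows with Q mimics the poorer counting statistics at high Q.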
Figure 2. Reduced total-scattering structure function (left), form of noise dS added to S(Q) (inset), and associated PDFs (right). From top to bottom the synthetic data are pure, constant-noise added, Q-dependent-noise added. The abscissae in the insets are Q (Å⁻¹).
One might expect the effect of systematic errors on data to be more dramatic than random noise. Since systematic errors carry information, their effect is more complicated than random noise. Three types of systematic errors will be presented here. While the source of the errors is not mentioned, they are typical of errors that have been seen in real data.
The most common systematic error is a scaling error. This comes from the fact that diffraction data are inherently arbitrarily scaled. Therefore, this type of systematic error will always be encountered. Many authors have described various techniques for finding either an absolute scale factor (Kartha, 1953; Krogh-Moe, 1956; Norman, 1957; Kaszkur, 1990; Cumbrera et al., 1995; Leadbetter & Wright, 1972) or a relative scale factor (Hannon et al., 1990; Soper et al., 1989; Louca & Egami, 1999) to compare data sets. When S(Q) is scaled, this changes the asymptotic behavior. To ensure proper asymptotic behavior we set β = 1 − α; hence

$$S'(Q) = \alpha S(Q) + (1 - \alpha). \qquad (14)$$

The effect of scaling can be seen in Fig. 3 and Table 3. As expected from the analytic result, equation (6), G'(r) = αGc(r) and G(r) remains undistorted but changes its scale. By definition, Savg does not change for the three cases within reasonable accuracy, while it is also noticed that ΔGlow is scale invariant as well. The other two criteria, Sdisp and ρfit, do vary with scale, as seen in the second half of Table 3. In principle, therefore, Sdisp and ρfit could be used to determine the scale of the data. Sdisp only requires knowledge of the sample chemical composition and ρfit only the average sample number density. This result shows that the 'quality' of the PDF [i.e. spurious ripples and distortions of G(r)] does not change with scale factor, provided that the asymptotic behavior of S(Q) is satisfied. Obtaining the correct absolute scale factor, or relative scale factor between data sets, is important when carrying out model-independent analyses of data such as peak integrations or peak height analyses. However, when fitting models to data, provided that the model PDF can be scaled, it is not necessary to satisfy both Savg and N independently.
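The asymptote-preserving rescaling is simple to demonstrate. The sketch below assumes the scaled structure function has the form S'(Q) = αS(Q) + (1 − α), with α the scale (notation assumed here), so that S'(Q) − 1 = α[S(Q) − 1] and the transform is multiplied by α without distortion. The function name `rescale_sq` is hypothetical.

```python
def rescale_sq(s, alpha):
    """Apply S'(Q) = alpha*S(Q) + (1 - alpha).

    The additive part (1 - alpha) keeps the high-Q limit at unity,
    so only the amplitude of S(Q) - 1 changes.
    """
    return [alpha * si + (1.0 - alpha) for si in s]
```

For example, `rescale_sq(s, 0.5)` halves every excursion of S(Q) from 1 while leaving points where S(Q) = 1 untouched.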
Figure 3. S(Q) (left) and associated PDFs (right). From top to bottom the scale, α, is 0.5, 1.0 and 2.0. Both S(Q) and G(r) are offset for clarity.
More interesting and widely encountered in real data are Q-dependent additive and multiplicative distortions. We now consider a slowly oscillating additive sine wave that might originate from an imperfectly corrected background (Fig. 4a). As before, we want S'(Q) to have the right asymptotic form, Savg = 1, so the constant α is changed in such a way that this is satisfied at our chosen Qmax. This is a common situation in real experimental PDFs: an unknown slowly oscillating additive correction is arbitrarily corrected with a (mostly) multiplicative correction. The exact form of S'(Q) used here is given in equation (15). The choices of the amplitude (0.1) and wavelength (100 Å⁻¹) were made to produce S'(Q) similar to what might be encountered in real measurements. The reduced total-scattering structure function is shown in Fig. 4(a) and the associated PDF can be seen in Fig. 5(a), with the quality factors being listed in Table 4, column (a). From equation (6) we expect to see the PDF scaled and low-r ripples due to the additive term in that equation. While there is little noticeable change in S(Q), the effect on the PDF is quite large. This type of systematic error shows the behavior often seen in PDF data of large spurious peaks near r = 0 Å with small oscillations extending into the physical portion of the PDF. Savg is within a reasonable range of its ideal value, while ΔGlow varies significantly. This is expected because ΔGlow is a quantification of the low-r noise in the PDF.
Figure 4. Q[S(Q) - 1] for three types of systematic errors: (a) additive long-wavelength sine oscillation, (b) Gaussian multiplicative function, and (c) additive sine oscillation with Q-dependent random noise. Below each structure function is the difference between the data with and without errors. The insets are the same data plotted from 0 to 100 Å⁻¹.
Figure 5. G(r) for three types of errors: (a) long-wavelength sine oscillation, (b) step function, and (c) sine oscillation with Q-dependent random noise. Below each PDF is the difference between the data with and without errors.
The next type of systematic error to be discussed here is a slowly varying multiplicative factor. This type of systematic error might result from an improper absorption correction. The exact form of the function used here is given in equation (16). The reduced total-scattering structure function is in Fig. 4(b) and the associated PDF can be seen in Fig. 5(b), with the quality factors being listed in Table 4, column (b). In this case the value of Sdisp is less than one because this systematic error suppresses the high-Q intensity. There is also a clear effect on the low-r portion of the PDF.
For completeness, the fourth spoiling of the synthetic data was achieved by introducing both Q-dependent random noise and a sine wave oscillation. The results for these synthetic data can be seen in Figs. 4(c) and 5(c), with some of the quality factors listed in Table 4, column (c). As expected, the results are qualitatively similar to those without the random noise [Fig. 5(a) and Table 4, column (a)].
From these seven examples we know which quality criteria are most useful. Sint and Gint do not work on the test data due to finite instrument resolution, and even if they did they are still not useful for measurements of crystalline materials due to the large measurement range required to determine them accurately. As expected, Sdisp and ρfit are scale dependent. They are then not useful as data quality criteria since, in general, the absolute scale of the data is not known a priori. However, they are potentially useful independent measures for the scale of the data. This leaves Savg and ΔGlow as the preferred criteria. For perfect data, Savg and ΔGlow are equivalent. However, for real data including systematic errors they are not. We have found from the test data that smaller low-r ripples can sometimes be obtained when Savg is actually not equal to unity. Given the problems with determining Savg accurately, especially if as sometimes happens S(Q) is curved in the high-Q region, we believe that ΔGlow is a more robust quality criterion.
In this section we use ΔGlow to compare the quality of PDFs determined from real data but where different data analysis parameters were varied to obtain the optimal S(Q). As we discussed earlier, the parameters that are normally varied for this purpose, such as sample density, are somewhat arbitrary. We would like to determine if better results are achieved by varying a particular parameter or whether arbitrary additive and/or multiplicative factors could be used.
The data are time-of-flight neutron powder diffraction data from germanium collected on the GLAD at IPNS. Finely powdered germanium was sealed inside an extruded cylindrical vanadium container with helium exchange gas. The sample weighed 4.472 g and filled a container (0.9272 cm diameter and 5.4 cm high) to a height of 4.0 cm. We therefore estimate the mass density of the powder sample to be 1.7 g cm⁻³. This was mounted on a closed-cycle helium refrigerator. Neutron powder diffraction data were measured at 10 K for 4 h with a collimator of width 0.4636 cm. Parasitic scattering from the heat shields was estimated by taking data with the sample environment in place but no sample at the sample position. Scattering from the sample container was measured from an empty container. The scattering from a vanadium rod was also measured to allow the data to be normalized with respect to the incident spectrum and detector efficiencies. Standard data corrections were carried out as described elsewhere (Wagner, 1978; Billinge & Egami, 1993) using the program PDFgetN (Peterson et al., 2000). A representative reduced total-scattering structure factor {Q[S(Q) - 1]} is shown in Fig. 6 and was Fourier transformed using Qmax = 25 Å⁻¹ to produce the PDF shown in Fig. 7 as the open circles.
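The quoted powder density follows from the sample mass and the filled volume of the cylindrical container. The short check below assumes the powder fills a right circular cylinder of the stated inner diameter to the stated height; the function name `powder_density` is hypothetical.

```python
import math

def powder_density(mass_g, diameter_cm, fill_height_cm):
    """Mass density (g/cm^3) of powder filling a cylinder to a given height."""
    volume_cm3 = math.pi * (diameter_cm / 2.0) ** 2 * fill_height_cm
    return mass_g / volume_cm3

# Values quoted in the text: 4.472 g in a 0.9272 cm diameter can, 4.0 cm fill.
density = powder_density(4.472, 0.9272, 4.0)
print(f"{density:.2f} g/cm^3")  # prints 1.66 g/cm^3, i.e. about 1.7
```

The filled volume is about 2.70 cm³, so the result rounds to the 1.7 g cm⁻³ used as the starting sample density.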
Figure 6. Representative reduced total-scattering structure factor for germanium at 10 K measured using GLAD at IPNS.
Figure 7. Representative PDF (circles), PDFFIT model (line) and difference curve with 2σ error bars as dotted lines (offset for clarity) for germanium at 10 K measured using GLAD at IPNS.
The data were modeled using PDFFIT. Pure germanium has a diamond structure (F4̄3m with atoms at 0 0 0 and ¼ ¼ ¼). The structure refinement was carried out over the range 2 < r < 15 Å using the following method. The starting structure was a crystal structure with lattice parameter of 5.66 Å and germanium atoms only at the symmetric sites. The anisotropic displacement factors were set to be equal (U11 = U22 = U33 = 0.0021 Å²). Then the refinement proceeded by varying parameters in five steps.
(i) The scale factor (Nm), Q-resolution (qsig[1]), and r-dependent sharpening (delt[1]) are varied.
(ii) Nm, delt[1] and the lattice parameter (latt[i]) are varied.
(iii) Nm, qsig[1], delt[1] and latt[i] are varied.
(iv) The isotropic displacement factor (u[i,j]) is varied.
(v) Nm, latt[i] and u[i,j] are varied.
This method was repeated for all data, however processed, to obtain values of N and Rwp that were reproducible. A representative fit is shown in Fig. 7 as the solid line. A difference curve is shown beneath the data. Notice the nonphysical peak in the experimental PDF at very low r coming from the imperfect data corrections. Also, note that the structural information is not affected in a significant way by this feature.
When processing a given run there are multiple ways of varying the processing parameters to minimize ΔGlow. This discussion is better understood by looking at Table 5. The top section of Table 5 gives the values used in PDFgetN to analyse the data and obtain the data PDF. The middle section has the values of the quality criteria, Savg and ΔGlow, for the resulting data PDFs. The bottom section of the table gives refined values for Nm, Rwp, a and <U> from the best model fit to the data PDFs. The first column gives the theoretical values of parameters that should be used in an ideal data analysis, and the resulting optimal values for the quality criteria. In subsequent columns, different analysis parameters from the possibilities α, β, ρeff and the effective beam width have been varied in such a way as to minimize ΔGlow. The second column is for the PDF obtained using the 'ideal' parameters. While this does represent the known experimental setup, due to problems and approximations in the data analysis, it clearly is not the best PDF: Savg is far from one, ΔGlow is very large, indicating large low-r ripples, Rwp is very large, again indicating significant fluctuations from the best fit model in the region of the PDF containing structural information, and finally the scale factor, Nm, is far from the optimal value of unity. Interestingly, the refined structural parameters, a and <U>, are the same, within the estimated uncertainties, as those determined from the best PDFs. This illustrates the robustness of the Fourier transform and the highly constrained modeling in preserving and extracting the structural information, even in the presence of significant systematic errors. Nonetheless, it is our objective here to minimize these errors.
Traditionally, one iteratively adjusts the effective sample density, ρeff, the effective beam width, or both, in such a way as to make S(Q) asymptote to one at high Q. Generally this is done 'by eye'. The third and fourth columns of Table 5 show the results of varying these traditional parameters, but here it was done in such a way as to minimize ΔGlow. In the third column, ρeff was varied; in the fourth column the effective beam width was varied to reflect a realistic beam profile and then ρeff was varied. These approaches result in adequate PDFs as evidenced by the low ΔGlow and Rwp values. Use of a more realistic beam profile results in smaller low-r ripples but a comparable, and slightly worse, Rwp. However, these corrections did not result in PDFs that had the right scale. This affected neither Rwp nor the refined structural parameters because the model contains a refinable scale factor, supporting the notion that the scale of the data is not closely related to its 'quality', except when model-independent analyses are being carried out.
The reason for the improper scale is not known. Clearly, it is a significant effect in the sense that when reasonable values are used for the known analysis parameters it results in a PDF that has only 60% of its proper weight. The best quality PDF is obtained when the asymptotic behavior of S(Q) is best satisfied, but clearly this criterion has been satisfied with predominantly multiplicative corrections when a mixture of multiplicative and additive corrections was called for. This effect can be corrected in an arbitrary way by introducing a constant additive correction. The data can then be corrected using either of the two effective parameters together with this additive constant to achieve the proper scale as well as the proper asymptotic behavior. Finally, we introduce a correction by an arbitrary multiplicative constant together with the additive constant. This has the advantage that, during the iterative process of optimizing S(Q), application of these two constants does not require a lengthy re-analysis of the data. The results are shown in columns 5-7 of Table 5. In this case the relevant parameters were varied both to minimize Glow and to make Nm = 1. All of the data sets have comparable or lower Glow and Rwp values than those corrected using traditional methods in columns 3 and 4. Interestingly, this occurred with uniformly lower values of Savg. The data corrected with the arbitrary constants gave Rwp values that are indistinguishable from those obtained using the other methods. All of the different analysis methods resulted in PDFs yielding identical structural parameters within the errors.
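The need for both kinds of constant can be illustrated with a minimal numerical sketch. It assumes a corrected structure function of the form S'(Q) = alpha * S(Q) + beta; the names `alpha`, `beta`, `apply_constants`, `beta_for_asymptote` and `recovered_weight`, the synthetic S(Q) and the distortion applied to it are all hypothetical choices made for illustration, not the implementation used in this work. For any chosen multiplicative constant, an additive constant can be found that makes the high-Q average equal to one, so the asymptote alone does not fix the scale.

```python
import numpy as np

def apply_constants(S, alpha, beta):
    """Corrected structure function S'(Q) = alpha * S(Q) + beta (assumed form)."""
    return alpha * np.asarray(S, dtype=float) + beta

def beta_for_asymptote(Q, S, alpha, q_min):
    """Additive constant that makes the mean of alpha*S + beta equal to 1 for Q >= q_min."""
    Q = np.asarray(Q, dtype=float)
    S = np.asarray(S, dtype=float)
    return 1.0 - alpha * S[Q >= q_min].mean()

def recovered_weight(S_corr, S_ref):
    """Least-squares scale of (S_corr - 1) relative to (S_ref - 1)."""
    a = np.asarray(S_corr, dtype=float) - 1.0
    b = np.asarray(S_ref, dtype=float) - 1.0
    return float(np.dot(a, b) / np.dot(b, b))

# Synthetic single-distance S(Q) and a hypothetical distortion of it.
Q = np.linspace(0.5, 30.0, 600)
S_true = 1.0 + np.sin(2.5 * Q) / (2.5 * Q) * np.exp(-0.005 * Q**2)
S_meas = 0.6 * S_true + 0.1

summary = {}
for label, alpha in (("additive only", 1.0), ("both constants", 1.0 / 0.6)):
    beta = beta_for_asymptote(Q, S_meas, alpha, q_min=20.0)
    S_corr = apply_constants(S_meas, alpha, beta)
    asymptote = float(S_corr[Q >= 20.0].mean())
    weight = recovered_weight(S_corr, S_true)
    summary[label] = (asymptote, weight)
    print(f"{label}: beta = {beta:.4f}, high-Q mean = {asymptote:.4f}, weight = {weight:.3f}")
```

In this sketch both choices satisfy the asymptotic criterion, but only the combination of multiplicative and additive constants restores the full weight of the oscillating part of S(Q). Because the operation is a single array expression, it can be repeated inside an optimization loop without re-running the full data reduction.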
Errors in scattering data, both random and systematic, have an effect on the PDF. Here we introduced and tested various quantitative measures of the quality of a PDF in the presence of systematic errors. The quality criteria (Savg, Sdisp, Glow and fit) were evaluated by comparison with synthetic data with different known systematic errors introduced. The parameters Sdisp and fit vary proportionally with the scale of the data. This makes them useful for determining an absolute scale factor for the data. The parameters Savg and Glow (which is a quantitative measure of the ripples in the PDF in the low-r region) are scale invariant and are useful for determining the quality of the S(Q) and resulting PDF. Either one can be optimized to yield the best possible PDF from a given data set through direct Fourier transform. Our tests suggest that Glow is the more robust of the two criteria.
Data can be optimized using the above criteria by varying the processing parameters. We have investigated a number of traditional methods (for example, varying the sample density) as well as the use of arbitrary multiplicative and additive parameters. All yield essentially comparable results and structural parameters that are identical within the estimated uncertainties. We show that it is the high-Q asymptotic behavior of S(Q) that is the main determinant of the quality of the PDF (freedom from artificial ripples), regardless of the scale of the data. If properly scaled data are desired, it is necessary to use both multiplicative and additive corrections to the data. The straightforward use of additive and multiplicative constants gave results comparable to those of the traditional methods, making this a desirable alternative because of its computational speed.
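The optimization described above can be sketched numerically under stated assumptions. The sketch uses the sine Fourier transform G(r) = (2/pi) * integral of Q[S(Q) - 1] sin(Qr) dQ and a hypothetical stand-in for a low-r ripple measure: the mean-square deviation of G(r) from the baseline -4*pi*rho0*r below a cutoff `r_low`. This stand-in is not the definition of Glow used in this article, and the function names, the synthetic single-distance S(Q) (for which rho0 is taken as zero) and the offset error are all illustrative assumptions.

```python
import numpy as np

def pdf_from_sq(Q, S, r):
    """G(r) = (2/pi) * integral of Q [S(Q) - 1] sin(Q r) dQ by the trapezoid rule."""
    Q = np.asarray(Q, dtype=float)
    F = Q * (np.asarray(S, dtype=float) - 1.0)        # reduced structure function
    integrand = F[None, :] * np.sin(np.outer(r, Q))   # shape (len(r), len(Q))
    dQ = np.diff(Q)
    area = np.sum(0.5 * (integrand[:, 1:] + integrand[:, :-1]) * dQ, axis=1)
    return (2.0 / np.pi) * area

def low_r_ripple(r, G, r_low, rho0=0.0):
    """Mean-square deviation of G(r) from -4*pi*rho0*r for r < r_low (assumed proxy)."""
    r = np.asarray(r, dtype=float)
    mask = r < r_low
    resid = np.asarray(G, dtype=float)[mask] + 4.0 * np.pi * rho0 * r[mask]
    return float(np.mean(resid**2))

# Synthetic S(Q) for a single pair distance r0 with Gaussian broadening sigma.
r0, sigma = 2.5, 0.1
Q = np.linspace(0.0, 30.0, 1501)
S_true = 1.0 + np.sinc(Q * r0 / np.pi) * np.exp(-0.5 * (sigma * Q) ** 2)
S_meas = S_true - 0.05            # hypothetical additive systematic error

r = np.arange(0.02, 5.0, 0.02)
betas = np.linspace(0.0, 0.1, 21)  # trial additive constants
ripples = np.array(
    [low_r_ripple(r, pdf_from_sq(Q, S_meas + b, r), r_low=1.5) for b in betas]
)

best_beta = float(betas[np.argmin(ripples)])
ripple_measured = float(ripples[0])     # beta = 0, uncorrected data
ripple_corrected = float(ripples.min())
print(f"best additive constant: {best_beta:.3f}")
print(f"ripple before: {ripple_measured:.3e}, after: {ripple_corrected:.3e}")
```

A simple grid scan is used here only to keep the sketch transparent; any one-dimensional minimizer could replace it. The same loop could be applied to each run of a multi-run experiment in turn.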
This work was motivated principally by the advent of next-generation neutron powder diffractometers that can produce hundreds of data sets in a single experiment. By automating the data optimization process it is possible to obtain properly scaled high-quality PDFs from multiple data sets in an efficient manner. Use of the quality-criteria analysis protocols described here should help this process enormously. These procedures have been incorporated into the data analysis program PDFgetN (Peterson et al., 2000).
As discussed in the main text, data are always measured over a finite range. This appendix determines the effect of finite measurement range on the PDF. In addition, the form of (r) in equation (6) will be determined. To reiterate equation (6)
where Gc(r) is the PDF convoluted with a termination function:
The termination function
has the Fourier transform
where j0(Qr) denotes the zeroth-order spherical Bessel function, shown for completeness. Then the convoluted PDF can be written as