conference papers
New two-dimensional data treatment software for small-angle scattering
aInstitute of Solid State Physics, SzFKI, POB 49, H1525-Budapest, Hungary, and bLaboratoire Léon Brillouin (LLB), CEA Saclay, 91191 Gif sur Yvette CEDEX, France
*Correspondence e-mail: gerard.pepy@cea.fr
A new program is presented which performs data treatment in one, two or three dimensions. While one-dimensional data treatment can be performed easily by many programs (canSAS, EMBL, ISIS, ILL, NIST), few deal efficiently with more dimensions. Indeed, specific attention has to be paid to the selection of the relevant data, and both their display and the models are relatively complex. This new program has been developed according to the needs of small-angle scattering users, but is not limited to this field. Its original purpose was to model forward anisotropic scattering and diffuse scattering typically produced by large structures such as polymers, aggregates, self-assembly systems or micellar solutions. It is also suited to modelling Bragg scattering. With time, many filter configurations (rectangles or sectors, with possible symmetries and display along various coordinates) and many model functions (centred or not, possibly with cofactors, with Cartesian, polar, or three-dimensional coordinates) have been added. Models are fitted by the steepest descent method with χ2 as the minimization function. The software is written in Fortran with the PGPLOT graphics package. It runs with the Windows operating system.
Keywords: data treatment; two-dimensional data; small-angle scattering; software.
1. Introduction
Sometimes complex pictures are obtained while performing small-angle (and large-angle) scattering, for instance if the sample shows anisotropic or diffuse scattering or Bragg peaks: the data-treatment problem becomes two-dimensional; meanwhile most available programs [canSAS (Ghosh, 2006), EMBL (Svergun, 2006), ISIS (Heenan, 2004), ILL (Ghosh & Rennie, 2007), NIST (Munter, 2004)] are one-dimensional. Moreover, it may be of interest to consider simultaneously files obtained while varying the temperature or rotating an angle: a third dimension comes in. Such problems are very often solved by the restriction of the fit to narrow bands, each of which may be considered as one-dimensional. This is reasonable only if the measured intensity changes slowly along the neglected dimension (Len et al., 2007); in any case most pixels are just thrown away even though they have valuable information. Correlation problems may appear for parameters appearing in distinct fits [for instance, different fits along different directions of reciprocal space usually do not produce the same value for I(0), the intensity at zero scattering vector]. Multidimensional fitting is needed.
The representation of the data, the way they are extracted by filters, how the fit is handled and finally how the results are saved are all complex. For the new program presented here, named PXY, a lot of attention has been given to these matters, and several solutions were tried by users before the most efficient one was selected. Alternative ways are often kept in order to accommodate different user preferences. Two features, simplicity and uniquity, are specific to this program:
Simplicity: from the start we assumed that a single program with a central menu with many functionalities was easiest to handle with a minimum of learning by a casual user.
Uniquity: it was a crucial requirement that every action leading to the fit results should be displayed in the working window and that this same window could be saved and printed.
This software is written in Fortran, with the PGPLOT graphics package from Caltech (Pearson, 2002), and runs with the Windows operating system. The PXY program displays only two windows on the computer screen: one is a PGPLOT graphics window where the mouse is active, the other one provides short guidelines for keyboard input.
Data treatment consists of five steps: reading the data files, displaying the data, applying filters, fitting models and recording the results. We shall review these steps and get a general feeling for the program by processing a case study. Then we shall look at some examples that exhibit some interesting characteristics.
2. Processing a case
The PGPLOT graphics window (Fig. 1) is divided into four parts: the upper left (Fig. 1a) is devoted to the input of the data file(s) and gathers their characteristics, the lower right (Fig. 1b) maps the data, the lower left (Fig. 1b) displays the intensity from filtered pixels and the upper right (Fig. 1a) contains information about the fit. Submenus appear according to the functionality selected in the general menu. The fifth step, recording the results, consists of a copy of the graphics window as a PostScript or GIF file (without the menu).
Throughout the rest of this paper it will be assumed that the multidetector is flat and perpendicular to the incoming beam, with wavevector ki whose modulus is 2π/λ, where λ is the incoming radiation wavelength. Every detector pixel determines a scattered beam direction with wavevector kf and scattering angle θ between ki and kf. As the scattering is assumed to be elastic, ki and kf have the same magnitude. The scattering vector Q is defined as Q = kf − ki. Its modulus is $Q = (4\pi/\lambda)\sin(\theta/2)$. Its coordinates Qx, Qy, Qz are best defined in an orthogonal frame with the X and Y directions in the detector plane, X along the horizontal direction and Y along the vertical direction (Fig. 1b).
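The geometry above can be turned into a short calculation. The following is a minimal sketch, not the PXY implementation: it assumes additionally that Z points along the incoming beam and that the sample-to-detector distance is known. The function name `pixel_to_q` and its arguments are hypothetical. It returns the components of Q = kf − ki for a pixel at position (x, y) in the detector plane, measured from the beam centre.

```python
import math


def pixel_to_q(x, y, distance, wavelength):
    """Return (Qx, Qy, Qz, |Q|, theta) for a pixel of a flat detector.

    x, y     : pixel position in the detector plane, from the beam centre
    distance : sample-to-detector distance (same length unit as x and y)
    Q is expressed in the inverse of the wavelength unit.
    Assumption: Z is the direction of the incoming beam.
    """
    k = 2.0 * math.pi / wavelength            # |ki| = |kf| (elastic scattering)
    r = math.sqrt(x * x + y * y + distance * distance)
    # kf points from the sample to the pixel, ki points along Z.
    qx = k * x / r
    qy = k * y / r
    qz = k * distance / r - k                 # Q = kf - ki
    q = math.sqrt(qx * qx + qy * qy + qz * qz)
    theta = math.atan2(math.hypot(x, y), distance)  # angle between ki and kf
    return qx, qy, qz, q, theta


qx, qy, qz, q, theta = pixel_to_q(30.0, 40.0, 1000.0, 6.0)
# The modulus agrees with (4 pi / lambda) sin(theta / 2).
print(q, 4.0 * math.pi / 6.0 * math.sin(theta / 2.0))
```

For small angles Qz is of second order in θ, which is why the detector map is usually read as a (Qx, Qy) plane.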
2.1. Reading data files
Currently it is possible to read data files with formats from various laboratories: small-angle neutron scattering (SANS) spectrometers from Institut Laue Langevin (ILL), Laboratoire Léon Brillouin (LLB) and Budapest Neutron Center (BNC), Hahn Meitner Institut (HMI), Forschungszentrum Jülich (including USANS), SAND at the Argonne National Laboratory, YuMO at the Frank Laboratory for Neutron Physics, small-angle X-ray scattering (SAXS) beamlines ID01 and ID02 at the European Synchrotron Radiation Facility (ESRF), and some Bruker machines. We also implemented general formats like FITS (an astronomy format for charge-coupled-device cameras) and any image format (like BMP or TIFF) after conversion to ASCII. The PXY program was initially developed for SAXS, therefore the data file input may include the classical correction files for calibration, reference and background. PXY then immediately calculates the uncertainties. Alternatively, the sample data file may be provided with its own uncertainty table (SAND at Argonne). PXY may handle several files for comparison, display and calculation: from four for 1250 × 1250 pixel files up to 16 if their size is less than 128 × 128 pixels.
The main menu (Fig. 2a) is divided into chapters corresponding to the above four parts. From it one may navigate either directly to simple functionalities (reset picture…) or to submenus (pick a filter, image intensity range, make a fit…).
2.2. Mapping the data
The intensity (I) map associates a colour palette (among 25 predefined palettes) to 84 intensity levels. The number and range of these levels may be adjusted and one may associate the palette to functions of I, ln(I) or sqrt(I) in order to enhance details. It is possible to bore holes (rectangular, circular, elliptic), to remove or to restore given data.
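As an illustration of this mapping, here is a minimal sketch that converts an intensity into one of 84 palette indices using I, ln(I) or sqrt(I). The function name, the clipping to the displayed range and the equal-width binning are assumptions made for the example; the source does not describe how PXY bins the levels.

```python
import math


def intensity_to_level(value, vmin, vmax, n_levels=84, scale="linear"):
    """Map an intensity to a palette index in [0, n_levels - 1].

    scale selects the function of I used for the palette:
    "linear" -> I, "log" -> ln(I), "sqrt" -> sqrt(I).
    """
    if scale == "linear":
        f = lambda v: v
    elif scale == "log":
        if vmin <= 0.0:
            raise ValueError("a logarithmic scale needs vmin > 0")
        f = math.log
    elif scale == "sqrt":
        if vmin < 0.0:
            raise ValueError("a square-root scale needs vmin >= 0")
        f = math.sqrt
    else:
        raise ValueError("unknown scale: " + scale)
    clipped = min(max(value, vmin), vmax)     # keep the value in the shown range
    t = (f(clipped) - f(vmin)) / (f(vmax) - f(vmin))
    return min(int(t * n_levels), n_levels - 1)


# A weak feature sits in the lowest level on a linear scale,
# but near the middle of the palette on a logarithmic scale.
print(intensity_to_level(100.0, 1.0, 10000.0, scale="linear"))
print(intensity_to_level(100.0, 1.0, 10000.0, scale="log"))
```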
2.3. Filtering the data
Selecting pixels in filters has a double role:
(i) only pixels inside the filters will be fitted to a model;
(ii) the intensity of the pixels inside a filter will be displayed through a projection on an axis corresponding to the filter (X, Y, ρ, θ).
Indeed, not all pixels are always relevant to the problem. Most of all, when starting to analyse data it is useful to limit the number of data to accelerate the process and to get a fast understanding of it, therefore narrow filters are useful at the beginning while larger filters including all significant data are used at a later stage. It is equally important to use filters with adequate geometry. For this purpose various filters are available (Fig. 2b): horizontal and vertical rectangles, sectors with display versus the radius, crowns with display versus the angle, oblique rectangle or cross, as in Fig. 1. A `spike' filter is under development to match star-like scattering occurring in foams or polycrystals. All these filters may benefit from symmetry operations versus the X or Y axis, or central symmetry (Fig. 2b). In the case of sectors (or rings) as many as six may be used, with or without central symmetry, periodically spaced or individually defined.
It is easier to consider a graph of a one-dimensional function than any other representation. Moreover, it is nearly impossible to compare calculated curves with data points otherwise. Therefore the program includes a one-dimensional representation of the pixels inside a filter. This is obtained through a projection of the pixels on the relevant coordinate axis. For instance, the pixels inside a rectangle elongated along X will be displayed versus X, and the pixels inside a sector will be displayed along ρ (or θ according to the filter type). The pixels inside each of the two rectangles of a `cross' filter will be displayed versus X and Y, respectively.
It should be stressed that this one-dimensional representation is a mere aid to the user. In any case, each pixel is individually fitted, most of the time by a two-dimensional function. After a fit the calculated points are projected and displayed in the same way as the data points in order to ease the evaluation of the fit. However, especially at a preliminary stage of a fit process, one may use a one-dimensional function in a narrow filter, but this should not be the normal procedure for treating data exhibiting rapidly varying scattering functions (Len et al., 2007). On the other hand, one may use three-dimensional models to fit data in a set of two-dimensional data files depending on a parameter like temperature or pressure.
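The selection and projection described above can be sketched as follows for a sector filter displayed versus the radius ρ. This is a hedged illustration only: the function names, the pixel representation as (x, y, intensity) triples and the use of degrees are assumptions, not the PXY data structures.

```python
import math


def in_sector(x, y, phi_min, phi_max, rho_min, rho_max, central_symmetry=False):
    """True if the pixel (x, y) lies in the sector, angles in degrees.

    With central_symmetry the sector at phi + 180 degrees is included too.
    """
    rho = math.hypot(x, y)
    if not rho_min <= rho <= rho_max:
        return False
    phi = math.degrees(math.atan2(y, x))
    angles = [phi, phi + 180.0] if central_symmetry else [phi]
    width = phi_max - phi_min
    # The modulo makes sectors that straddle 0 degrees work as expected.
    return any((a - phi_min) % 360.0 <= width for a in angles)


def project_on_rho(pixels, **sector):
    """Return (rho, intensity) pairs of the selected pixels, sorted by rho."""
    return sorted((math.hypot(x, y), i)
                  for x, y, i in pixels if in_sector(x, y, **sector))


pixels = [(1, 0, 10.0), (-2, 0, 20.0), (0, 3, 30.0)]
sector = dict(phi_min=-15.0, phi_max=15.0, rho_min=0.5, rho_max=5.0)
print(project_on_rho(pixels, **sector))
print(project_on_rho(pixels, central_symmetry=True, **sector))
```

Only the display is one-dimensional here: each selected pixel keeps its own (x, y) position, so a two-dimensional model can still be evaluated pixel by pixel.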
It is also possible to store in a file the intensities of the projected pixels seen in the display as a one-dimensional set. Users can then use any data treatment or display program (e.g. ORIGIN, EXCEL, KALEIDAGRAPH and so on).
Many other functions are available. For instance, it is possible to write a title or a comment on the graphics page. It is also possible with the mouse to select a pixel and get its position in reciprocal-space coordinates, with its intensity, and the average intensity over its next neighbours, or over the next and the second nearest neighbours. It is also possible to view the contents of a filter with complementary images: one resulting from a projection along X and the other versus Y (or ρ and θ).
2.4. Fitting the data
The fitting process is the well known `steepest descent' minimization of a function (Bevington, 1969; Press et al., 1992). The most common function is
$$\chi^2(\{P\}) = \frac{1}{N-p}\sum_{i=1}^{N}\left[\frac{I_i - Y_i(\{P\})}{\Delta I_i}\right]^2,$$
where Ii is the intensity in pixel i, N is the number of data points, p is the number of free parameters, {P} is the set of parameters, Yi is the calculated intensity in pixel i and ΔIi is the uncertainty over Ii. If the random variable Ii follows a normal probability law and the model is adequate, χ2 is expected to be close to 1, which is a good test of the fit.
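To make the procedure concrete, the sketch below evaluates a χ2 of this kind, taken here in its reduced form (the sum of squared, uncertainty-normalized residuals divided by N − p), and minimizes it by steepest descent. It is an illustration under stated assumptions, not the PXY code: the finite-difference gradient, the backtracking step rule and all names (`reduced_chi2`, `steepest_descent`, `precision`) are choices made for the example.

```python
def reduced_chi2(params, coords, intensities, sigmas, model):
    """Sum over pixels of ((I_i - Y_i) / dI_i)^2, divided by N - p."""
    total = 0.0
    for c, i_obs, sigma in zip(coords, intensities, sigmas):
        r = (i_obs - model(c, params)) / sigma
        total += r * r
    return total / (len(intensities) - len(params))


def gradient(f, params, h=1e-6):
    """Central finite-difference gradient of f at params."""
    grad = []
    for j in range(len(params)):
        up, down = list(params), list(params)
        up[j] += h
        down[j] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def steepest_descent(f, start, iterations=200, precision=1e-12):
    """Minimize f by moving against its gradient with a backtracking step."""
    params, value, step = list(start), f(start), 1.0
    for _ in range(iterations):
        g = gradient(f, params)
        g2 = sum(gj * gj for gj in g)
        if g2 < precision:              # the gradient has vanished: stop
            break
        # Halve the step until f decreases sufficiently (Armijo condition).
        while step > 1e-12:
            trial = [p - step * gj for p, gj in zip(params, g)]
            trial_value = f(trial)
            if trial_value <= value - 1e-4 * step * g2:
                params, value = trial, trial_value
                break
            step *= 0.5
        else:
            break                       # no acceptable step was found
        step *= 2.0                     # try a longer step next time
    return params, value


# Synthetic 3 x 3 "detector": I = a + b * x with a = 2, b = 3, unit uncertainties.
coords = [(x, y) for x in (-1.0, 0.0, 1.0) for y in (-1.0, 0.0, 1.0)]
plane = lambda c, p: p[0] + p[1] * c[0]
data = [plane(c, (2.0, 3.0)) for c in coords]
sigmas = [1.0] * len(data)
best, chi2 = steepest_descent(
    lambda p: reduced_chi2(p, coords, data, sigmas, plane), [0.0, 0.0])
print(best, chi2)
```

Because the synthetic data contain no noise, χ2 goes to zero here; with real counting statistics a value near 1 would be the expected outcome of a good fit.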
The first step in the fit general menu (Fig. 2c) is to choose the dimensionality. The other main selections are the number of iterations, the precision (for the test ending the fit) and the level of information during the fit. Once the general characteristics of the fit are defined, another menu allows the introduction of the background and model functions, which can be combined linearly or with cofactors (two cofactors are available: magnetic modulation in the case of a saturated ferromagnetic sample and a correlation factor). Selecting the function box opens a specific menu (Fig. 2d). The model functions may be centred on the main beam or not centred; in the latter case they include two more parameters for their origin. Many functions are predefined, such as Gauss–Gauss, Gauss–Lorentz, the scattering by a sphere, an ellipsoid, a cylinder and so on. The mixed function Gauss–Lorentz is a Gaussian versus the first variable and a Lorentzian versus the second variable. Most of these functions include the possibility to apply a dispersion to one parameter. The built-in dispersions are Gaussian, square, triangle, log-normal and Schulz–Zimm. Finally, one runs the fit and hopefully reaches a good result. If this is not the case, several functionalities may help. For instance, it is possible to draw a map of the minimized function, χ2, over two variables. Probably the most useful functionality is the map of the difference between the calculation and the data. The differences are calibrated for each pixel by the mean-square deviation in order to build a meaningful picture.
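Two of the ingredients named above lend themselves to a short sketch: a Gauss–Lorentz model function with optional centre parameters, and a difference map scaled pixel by pixel. Both are assumptions about the form, not the PXY definitions: the mixed function is written here as a simple product of a Gaussian and a Lorentzian, and the scaling is taken to be a division by each pixel's uncertainty.

```python
import math


def gauss_lorentz(x, y, amplitude, sigma_x, gamma_y, x0=0.0, y0=0.0):
    """Gaussian versus x times Lorentzian versus y, centred on (x0, y0).

    Leaving x0 = y0 = 0 gives the function centred on the main beam;
    fitting x0 and y0 adds the two origin parameters.
    """
    gauss = math.exp(-0.5 * ((x - x0) / sigma_x) ** 2)
    lorentz = 1.0 / (1.0 + ((y - y0) / gamma_y) ** 2)
    return amplitude * gauss * lorentz


def normalized_difference(calc, data, sigma):
    """(calculated - measured) / uncertainty for every pixel of a 2D map."""
    return [[(c - d) / s for c, d, s in zip(row_c, row_d, row_s)]
            for row_c, row_d, row_s in zip(calc, data, sigma)]


# Tiny 2 x 2 example: calculated map, "measured" map and uncertainties.
points = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (1.0, 1.0)]]
calc = [[gauss_lorentz(x, y, 5.0, 1.0, 1.0) for x, y in row] for row in points]
data = [[5.5, 3.0], [2.0, 1.6]]
sigma = [[0.5, 0.5], [0.5, 0.5]]
print(normalized_difference(calc, data, sigma))
```

Values of the scaled difference well beyond ±1 over a connected region point to a systematic misfit rather than to statistical noise.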
2.5. Recording the results
Once the fitting is satisfactory, the `output' functionality in the main menu allows the graphics page with all its information (without the menus) to be recorded as a PostScript or a GIF file (Fig. 1). Along with this, a text file is created with the full history of how this page was reached. At a later time, it is possible to rerun the `history' file up to the step where the page was recorded, and to resume the data treatment from this point. Note that the `output' includes a short menu allowing some of the four elements of the page to be selected and the typography of the lines and symbols to be changed.
3. Examples
3.1. Difficulty with power law
In Fig. 1b we had selected narrow rectangles so that the projected display would not be too different from the well known shapes of Lorentzian (scattering by a polymer coil) and Gaussian (resolution-widened Bragg peak) curves. For the case in Fig. 3, the rectangles have been tailored to include as many of the relevant pixels as possible, while avoiding a perturbed part of the picture. The fit was not considered to be good enough. Indeed, the map of the difference between calculation and data in Fig. 4 revealed a mis-setting of the central position. This effect was particularly strong because of the presence in the model of a power law; close to the centre it is very steep and sensitive to a centre-position error. In this case, it was enough to use functions with fitted centre parameters, instead of the functions centred on the assumed beam position, to reach an excellent fit.
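The sensitivity of a power law to the centre position can be checked numerically. The sketch below is an illustration with made-up numbers (amplitude, exponent and a centre offset of 0.2 pixel are all hypothetical): it compares a power law evaluated around the assumed centre with "data" generated around a slightly shifted centre.

```python
import math


def power_law(x, y, amplitude, exponent, x0=0.0, y0=0.0):
    """I = amplitude * rho**(-exponent), rho measured from the centre (x0, y0)."""
    rho = math.hypot(x - x0, y - y0)
    return amplitude / rho ** exponent


def relative_difference(x, true_centre=0.2, exponent=4.0):
    """(calculation - data) / data along the X axis for a mis-set centre."""
    data = power_law(x, 0.0, 1000.0, exponent, x0=true_centre)
    calc = power_law(x, 0.0, 1000.0, exponent)   # centre assumed at 0
    return (calc - data) / data


# The difference changes sign across the centre and fades away from it.
for x in (-20.0, -2.0, 2.0, 20.0):
    print(x, relative_difference(x))
```

An antisymmetric pattern of this kind in the difference map is the signature that suggests freeing the centre parameters.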
3.2. A complex model
The data file displayed in Fig. 5 shows the SAXS by nanochannels in a track-etched membrane. It has more than 1.5 million pixels. However, thanks to the sample orientation, the Ewald sphere cut of the scattering function reduces the number of interesting pixels. It was mandatory to perform a fit including the two dimensions in order to verify the model. Some samples required a small dispersion over the channel radius along with the beamline resolution. Note that the model depends strongly on the angle of the channel axis versus the main beam. It has been possible to fit simultaneously two files with very different pictures corresponding to two different angular positions of the sample: a first step towards three-dimensional fitting.
3.3. A convenient display
The picture in Fig. 6 is complex because it is the superposition of two anisotropic functions. In this case, the best approach to the problem was a polar representation. Most of the pixels are selected for the fit. They belong to four sectors (with their centrosymmetric partners), which allow a pertinent representation of the anisotropy and an appreciation of the fit quality.
3.4. Data evolution
The case considered in Fig. 7 was designed to explore the evidence for small changes (like anisotropy) by fitting several files simultaneously. It is possible to provide each file with the same set of model functions. The parameters of subsequent files may be linked by a linear relation to the parameters of the previous file. Thus, it is possible to fit some parameters independently for each file and to fit others with linear constraints between subsequent files.
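The linking of parameters between subsequent files can be sketched as follows. This is a hypothetical data layout, not the PXY one: each file's parameters are held in a dictionary, linked parameters follow p(k) = slope × p(k − 1) + offset, and the remaining parameters are supplied independently for each file. In a real fit the slope and offset, like the independent values, would be adjusted by the minimizer.

```python
def build_file_parameters(first_file, independent, links):
    """Return one parameter dictionary per file.

    first_file  : parameters of the first file
    independent : one dictionary per subsequent file with the parameters
                  that are fitted independently for that file
    links       : {name: (slope, offset)} for parameters tied to the
                  previous file by p_k = slope * p_(k-1) + offset
    """
    files = [dict(first_file)]
    for own in independent:
        previous = files[-1]
        current = dict(own)
        for name, (slope, offset) in links.items():
            current[name] = slope * previous[name] + offset
        files.append(current)
    return files


# Three files: the amplitude is free in each file, the width grows linearly.
params = build_file_parameters(
    first_file={"amplitude": 10.0, "width": 2.0},
    independent=[{"amplitude": 9.0}, {"amplitude": 8.5}],
    links={"width": (1.0, 0.5)},
)
for p in params:
    print(p)
```

Tying a parameter in this way replaces one free value per file by two shared values, which is what makes a small systematic trend detectable across the series.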
4. Developments
Throughout the development of this program, it was a constant concern to keep it user-friendly. Therefore the immediate projects are to write menus as Visual Basic objects to improve their convenience and to give users the possibility to link their own entry format and model function.
Further improvements will include the implementation of the MINUIT fitting kernel from the European Organization for Nuclear Research (CERN) and the extension of the functionalities to grazing-incidence SAXS and reflectometry, each with its own style. Suggestions and contributions are welcome.
Acknowledgements
The author thanks C. Trautmann and B. Schiedt (GSI Darmstadt), who provided the nanochannel sample, and L. Noirez (LLB, CEA Saclay), who provided all the liquid-crystal polymer examples. Moreover, the constant interest and encouragement of L. Noirez had a fundamental influence on the development of this program.
References
Bevington, P. R. (1969). Data Reduction and Error Analysis for the Physical Sciences. New York: McGraw-Hill.
Gautier, P., Brunet, M., Grupp, J., Noirez, L. & Anglaret, E. (2003). Phys. Rev. E, 68, 11709-1–11709-12.
Ghosh, R. (2006). https://www.ill.fr/lss/canSAS/.
Ghosh, R. & Rennie, A. (2007). https://www.ill.fr/data_treat/newsans.html.
Heenan, R. K. (2004). FISH, https://www.isis.rl.ac.uk/largescale/.
Len, A., Pépy, G., Rosta, L. & Harmat, P. (2007). Private communication.
Mendil, H., Baroni, P., Grillo, I. & Noirez, L. (2006). Phys. Rev. Lett. 96(7), 077801-1–077801-12.
Munter, A. (2004). Simulator, https://www.ncnr.nist.gov/resources/simulator.html; https://www.ncnr.nist.gov/programs/sans/data/red_anal.html.
Noirez, L. (1999). Europhys. Lett. 46(6), 728–734.
Noirez, L. & Lapp, A. (1997). Phys. Rev. Lett. 78(1), 70–73.
Pearson, T. J. (2002). PGPLOT, https://www.astro.caltech.edu/~tjp/pgplot/.
Pépy, G., Boesecke, P., Kuklin, A., Manceau, E., Schiedt, B., Siwy, Z., Toulemonde, M. & Trautmann, C. (2007). J. Appl. Cryst. 40, s388–s392.
Press, W. H., Flannery, B. P., Teukolsky, S. A. & Vetterling, W. T. (1992). Numerical Recipes in FORTRAN77 – The Art of Scientific Computing. USA: Cambridge University Press.
Svergun, D. (2006). https://www.embl-hamburg.de/ExternalInfo/Research/Sax/software.html.
© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.