A posteriori determination of the useful data range for small-angle scattering experiments on dilute monodisperse systems
^{a}Hamburg Outstation, European Molecular Biology Laboratory, Notkestrasse 85, Hamburg 22607, Germany, and ^{b}Laboratory of Reflectometry and Small-angle Scattering, Institute of Crystallography of the Russian Academy of Sciences, Leninsky prospekt 59, Moscow 119333, Russian Federation
^{*}Correspondence e-mail: svergun@embl-hamburg.de
Small-angle X-ray and neutron scattering (SAXS and SANS) experiments on solutions provide rapidly decaying scattering curves, often with a poor signal-to-noise ratio, especially at higher angles. On modern instruments, the noise is partially compensated for by oversampling, thanks to the fact that the angular increment in the data is small compared with that needed to describe adequately the local behaviour and features of the scattering curve. Given a (noisy) experimental data set, an important question arises as to which part of the data still contains useful information and should be taken into account for the interpretation and model building. Here, it is demonstrated that, for monodisperse systems, the useful experimental data range is defined by the number of meaningful Shannon channels that can be determined from the data set. An algorithm to determine this number and thus the data range is developed, and it is tested on a number of simulated data sets with various noise levels and with different degrees of oversampling, corresponding to typical SAXS/SANS experiments. The method is implemented in a computer program and examples of its application to analyse the experimental data recorded under various conditions are presented. The program can be employed to discard experimental data containing no useful information in automated pipelines, in modelling procedures, and for data deposition or publication. The software is freely accessible to academic users.
Keywords: small-angle scattering; WAXS; SAXS; solution scattering; protein structure; Shanum.
1. Introduction
Small-angle scattering (SAS) of X-rays (SAXS) and neutrons (SANS) is a powerful method for the analysis of biological macromolecules in solution (Svergun et al., 2013). Over the last decade, major advances in instrumentation and computational methods have led to new and exciting applications of SAXS to structural biology (Graewert & Svergun, 2013). However, for biological systems the contrast of the particles in aqueous solution is rather small and the useful signal may be weak compared with the background (Jacques et al., 2012). This leads to a low signal-to-noise ratio for the data, especially at higher scattering angles. A question arises as to how to determine the useful angular data range of the experimental scattering pattern that can be taken for subsequent interpretation and model building. A common practice is to use only that portion of the scattering curve where the signal-to-noise ratio exceeds a certain threshold (Skou et al., 2014), but the choice of the threshold remains a rather subjective procedure. Also, relying only on the signal-to-noise ratio does not take into account the degree of oversampling of the data.
The problem of assessing the useful data range is also pertinent for other diffraction techniques, e.g. X-ray crystallography. Accepted criteria for data quality and accuracy include the signal-to-noise ratio of the intensities in the highest resolution shell [〈I/σ(I)〉] and the spread function of the equivalent reflections (R_{merge}) (Wlodawer et al., 2008). In SAS data analysis, no agreed criteria exist and, in view of the recent standardization developments of SAS publications (Jacques et al., 2012; Trewhella et al., 2013) and efforts towards making experimental data and models publicly available (Valentini et al., 2015), the absence of an objective method to assess the useful range of a data set is a serious drawback.
Here, we present an approach using Shannon sampling (Shannon & Weaver, 1949) to determine the useful range in a given experimental scattering data set from a dilute monodisperse system via the number of Shannon channels that can be determined from this data set. To establish a robust algorithm for the determination of this number, simulated data sets with different signal-to-noise ratios and different oversampling corresponding to typical X-ray and neutron scattering experiments are generated and analysed. The algorithm is implemented in a computer program and applied to experimental SAXS and SANS data sets recorded under various conditions and on various instruments. The proposed method is easy to incorporate into automated analysis pipelines, and it can also be employed to select a fitting range in modelling procedures, especially those relying on higher resolution data, and during data deposition or publication to discard the portions of the (higher-angle) SAS data containing no useful information.
2. Truncated Shannon approximation
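The scattering intensity I(s) from a dilute system of identical particles (e.g. a monodisperse solution of macromolecules) is related to the distance distribution function p(r) in real space as
$$I(s) = 4\pi \int_{0}^{D_{max}} p(r) \frac{\sin(sr)}{sr} \, dr, \qquad (1)$$
where s = 4πsin(θ)/λ, 2θ is the scattering angle and λ is the radiation wavelength. Equation (1) takes into account the fact that the p(r) function has finite support and it is equal to zero for all r > D_{max} (where D_{max} is the maximum size of the particle). If I(s) is known, p(r) can be calculated by the inverse transformation
$$p(r) = \frac{r^{2}}{2\pi^{2}} \int_{0}^{\infty} s^{2} I(s) \frac{\sin(sr)}{sr} \, ds. \qquad (2)$$
From equations (1) and (2), one can easily see that the functions sI(s) and p(r)/r are Fourier mates related by a sine transformation, and that p(r) is conveniently represented as a Fourier sine series
$$p(r) = \frac{r}{2 D_{max}^{2}} \sum_{n=1}^{\infty} n \, a_{n} \sin\left(\frac{\pi n r}{D_{max}}\right), \qquad (3)$$
where n is an integer. Substituting equation (3) into equation (1) gives the Shannon interpolation formula (Shannon & Weaver, 1949)
$$U(s) \equiv sI(s) = \sum_{n=1}^{\infty} s_{n} a_{n} \left[ \frac{\sin D_{max}(s - s_{n})}{D_{max}(s - s_{n})} - \frac{\sin D_{max}(s + s_{n})}{D_{max}(s + s_{n})} \right], \qquad (4)$$
where s_{n} = nπ/D_{max} are the positions of the Shannon channels and a_{n} = I(s_{n}) are the intensities at these positions.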
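As an illustration of the interpolation formula, the following minimal sketch rebuilds the intensity of a homogeneous sphere at an arbitrary s from its values at the Shannon channels only. It assumes the usual sinc form of equation (4) with U(s) = sI(s) and a_{n} = I(s_{n}); the sphere radius, the test point and the cut-off of 200 channels are hypothetical choices made for the example.

```python
import math

def sphere_intensity(s, radius):
    """Scattering intensity of a homogeneous sphere, normalized to I(0) = 1."""
    x = s * radius
    if x < 1e-6:
        return 1.0
    return (3.0 * (math.sin(x) - x * math.cos(x)) / x**3) ** 2

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limit 1 at x = 0."""
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x

def shannon_interpolate(s, d_max, channel_values):
    """Evaluate I(s) from a_n = I(s_n) sampled at s_n = n*pi/d_max."""
    u = 0.0
    for n, a_n in enumerate(channel_values, start=1):
        s_n = n * math.pi / d_max
        u += s_n * a_n * (sinc(d_max * (s - s_n)) - sinc(d_max * (s + s_n)))
    return u / s  # the series represents U(s) = s * I(s)

radius = 5.0            # nm, hypothetical test particle
d_max = 2.0 * radius    # nm, maximum size of a sphere
channels = [sphere_intensity(n * math.pi / d_max, radius) for n in range(1, 201)]

s_test = 0.47           # nm^-1, deliberately between two channels
approx = shannon_interpolate(s_test, d_max, channels)
exact = sphere_intensity(s_test, radius)
print(f"interpolated {approx:.6f}, exact {exact:.6f}")
```

The channel values alone are enough to recover the curve between the channels, which is the property the rest of the method builds on.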
Equation (4) contains, generally speaking, an infinite number of Shannon channels. However, for experimental data measured over a limited range of scattering vectors (s < s_{max}), the contribution of the channels beyond this range (i.e. with indices n > s_{max}D_{max}/π) to the fit in this range is relatively small. The number of Shannon channels in the measured range, N_{S} = s_{max}D_{max}/π, was therefore suggested (Damaschun et al., 1968; Taupin & Luzzati, 1982) as an estimate of the information content of the scattering data. Methods have been proposed to calculate the p(r) function (Moore, 1980) and to assess fits to experimental data (Rambo & Tainer, 2013) based on the Shannon representation.
Although larger values of N_{S} do generally indicate a greater information content, it is clear that this value alone cannot provide an ultimate estimate, due to the fact that the signal-to-noise ratio is not taken into account. Furthermore, SAS data are usually oversampled, i.e. measured with an angular increment Δs much smaller than the distance between the Shannon channels π/D_{max}. The amount of information in the data must be related to both the level of experimental error and the degree of oversampling.
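The two quantities discussed here, the nominal channel count and the degree of oversampling, are simple to evaluate. The sketch below does so for two angular steps; the particle size D_{max} = 10 nm is a hypothetical value, and the range and steps are illustrative.

```python
import math

def shannon_sampling_summary(s_max, d_max, delta_s):
    """Return (N_S, channel spacing, number of data points per Shannon channel)."""
    n_s = s_max * d_max / math.pi      # nominal number of Shannon channels
    spacing = math.pi / d_max          # distance between adjacent channels
    return n_s, spacing, spacing / delta_s

# d_max = 10 nm is a hypothetical particle size; s_max and the steps are illustrative
for label, delta_s in (("dense grid", 0.0025), ("sparse grid", 0.042)):
    n_s, spacing, oversampling = shannon_sampling_summary(4.0, 10.0, delta_s)
    print(f"{label}: N_S = {n_s:.1f}, channel spacing = {spacing:.3f} nm^-1, "
          f"points per channel = {oversampling:.1f}")
```

Both grids cover the same N_{S}, but the dense one has many more measured points per channel, which is what allows noise to be averaged out.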
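When the summation index in equations (3) and (4) is limited by an integer number M, the corresponding truncated expressions are denoted p_{M}(r) and U_{M}(s), respectively. Given an experimental data set, one can construct its truncated approximation U_{M}(s) using M Shannon channels by minimizing the discrepancy
$$\chi^{2} = \frac{1}{N-1} \sum_{i=1}^{N} \left[ \frac{s_{i} I(s_{i}) - U_{M}(s_{i})}{s_{i} \sigma(s_{i})} \right]^{2}, \qquad (5)$$
where the summation index i runs over N experimental points and σ(s_{i}) is the standard deviation for the measured intensity at s_{i}. The best least-squares solution should meet the condition ∂χ^{2}/∂a_{m} = 0, leading to the system of normal equations
$$\sum_{n=1}^{M} A_{mn} a_{n} = b_{m}, \qquad m = 1, \ldots, M, \qquad (6)$$
where
$$A_{mn} = \sum_{i=1}^{N} \frac{B_{m}(s_{i}) B_{n}(s_{i})}{s_{i}^{2} \sigma^{2}(s_{i})}, \qquad b_{m} = \sum_{i=1}^{N} \frac{I(s_{i}) B_{m}(s_{i})}{s_{i} \sigma^{2}(s_{i})}, \qquad B_{n}(s) = s_{n} \left[ \frac{\sin D_{max}(s - s_{n})}{D_{max}(s - s_{n})} - \frac{\sin D_{max}(s + s_{n})}{D_{max}(s + s_{n})} \right]. \qquad (7)$$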
For solution scattering experiments, the experimental data I(s_{i}) represent the difference between the scattering from the solute and the pure solvent, and may show negative values due to experimental errors. These negative values should enter equations (5) and (6). However, the computed SAS intensity U_{M}(s_{i}) must always be non-negative, and equation (6) can be solved using standard methods under the constraint of non-negativity of a_{n} (Lawson & Hanson, 1974).
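A truncated fit of this kind can be sketched with a non-negative least-squares solver. The example below is an illustration under stated assumptions, not the Shanum implementation: it assumes the sinc form of the Shannon basis with U(s) = sI(s), weights the residuals by 1/[s σ(s)], normalizes χ^{2} by N − 1, and uses `scipy.optimize.nnls`. The helper names and the noise-free sphere test data are hypothetical.

```python
import numpy as np
from scipy.optimize import nnls

def sphere_intensity(s, radius):
    """Homogeneous-sphere intensity normalized to I(0) = 1 (test data only)."""
    x = np.asarray(s, dtype=float) * radius
    return (3.0 * (np.sin(x) - x * np.cos(x)) / x**3) ** 2

def shannon_basis(s, d_max, m):
    """Columns s_n [sinc(D(s - s_n)) - sinc(D(s + s_n))] for n = 1..m."""
    s = np.asarray(s, dtype=float)[:, None]
    s_n = (np.arange(1, m + 1) * np.pi / d_max)[None, :]
    # np.sinc(x) is sin(pi x)/(pi x), so the argument is divided by pi
    return s_n * (np.sinc(d_max * (s - s_n) / np.pi)
                  - np.sinc(d_max * (s + s_n) / np.pi))

def fit_truncated_shannon(s, intensity, sigma, d_max, m):
    """Non-negative weighted least-squares estimate of a_1..a_m and chi-square."""
    weights = 1.0 / (s * sigma)                  # residuals are formed in s * I(s)
    design = shannon_basis(s, d_max, m) * weights[:, None]
    target = s * intensity * weights
    coeffs, _ = nnls(design, target)             # enforces a_n >= 0
    chi2 = np.sum((design @ coeffs - target) ** 2) / (len(s) - 1)
    return coeffs, chi2

radius, d_max = 5.0, 10.0                        # nm, hypothetical sphere
s = np.arange(0.01, 4.0 + 1e-9, 0.01)            # nm^-1
intensity = sphere_intensity(s, radius)
sigma = np.full_like(s, 1e-3)                    # constant error, an assumption
coeffs, chi2 = fit_truncated_shannon(s, intensity, sigma, d_max, m=12)
print("first channels:", coeffs[:3], "chi2:", chi2)
```

For noise-free data the fitted a_{n} stay close to the intensities at the channel positions; negative input intensities are allowed, since only the coefficients are constrained.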
The truncated Shannon approximation provides a way of assessing the information content and useful range of an experimental data set. Indeed, if M is too small, this approximation will not have a sufficient number of terms to fit the experimental data. With increasing M one will improve the fit, but at some stage an overfitting would be observed where the determined a_{n} values will not significantly improve the discrepancy, being poorly defined by the experimental data. There should therefore be an optimum (effective) value of the channels M_{S} reflecting the information content of the data, and the useful range of the given experimental data set will be defined as πM_{S}/D_{max}. Note that M_{S} does not necessarily coincide with N_{S}, and the following sections will present a procedure for a reliable automated determination of the effective number of Shannon channels.
3. Noise level and oversampling
In order to test how the truncated Shannon approximation is influenced by noise and oversampling, we have simulated a number of scattering patterns from various geometric bodies (see Table 1). The data were generated with a fixed momentum transfer value up to s_{max} = 4 nm^{−1} and containing varying numbers of Shannon channels for different bodies due to their different size. A dense grid with an angular step Δs = 0.0025 nm^{−1} was used to simulate typical synchrotron X-ray data collection, and a sparse grid with Δs = 0.042 nm^{−1} (i.e. having about 17 times fewer points in the same angular range) emulated SANS data. For each intensity point, random Gaussian noise was added, with the relative error of the simulated noise varying from 1 to 400% for the different data sets.

For each simulated data set, Shannon fits were calculated with increasing M according to equations (4)–(6), and the quality of the approximation was assessed by the R factor between the ideal theoretical curve without noise I_{ref}(s) and the corresponding Shannon fit U_{M}(s)/s, according to the formula
The simulated data sets and the best Shannon fits (corresponding to the minimum R factors) are shown in Fig. 1 and in the supporting information (Figs. S1–S4). The optimum number of Shannon channels M_{B} providing the best agreement with the ideal curve depends on both the noise level and the angular step (see Table 1). One should also note that the quality of the fits from the truncated Shannon approximation depends on the anisometry of the object. For very anisometric particles, high noise levels (100% noise in Fig. 1a; 20 and 100% noise in Fig. 1b) lead to significant oscillations in the Shannon approximations. Still, all the fits in Fig. 1, even those with oscillations, provide the best agreement with the ideal curve compared with the Shannon fits with other M, and are therefore best fits in terms of the truncated Shannon approximation.
As is evident from Table 1(a), for oversampled and accurate (1–5% noise) data the best Shannon fits sometimes require more channels M_{B} than N_{S}, indicating that the amount of information in the data warrants extrapolation beyond the available range. This possibility reflects the well known property of oversampled measurements of analytical functions [and the scattering intensity, being a Fourier transform of a p(r) function having a finite support, is an analytical function according to the Wiener–Paley–Schwartz theorem (Schwartz, 1952)]. The effect is utilized e.g. for 'super-resolution' in optical image reconstruction (Frieden, 1971) but can clearly be observed only for very accurate data. Obviously, M_{B} decreases with an increasing level of added noise, but interestingly and somewhat unexpectedly, for oversampled data, even at a very high (100% and above) noise level, M_{B} may still be essentially equal to N_{S} (taking into account the ±1 uncertainty of determination of M_{B}). In other words, oversampled data, even if looking very noisy (e.g. Fig. 1c, bottom curve), still contain useful information about the ideal scattering curve over the entire measured range. In contrast, for data simulated on a sparse angular grid, M_{B} starts to decrease at a noise level of 20–50% (Table 1b), indicating an insufficient quantity of information to define N_{S} channels for sparse noisy data.
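A simulation of this type can be sketched as follows for a single body. This is an illustration only: the sphere, the floor placed on σ near the intensity minima, the R-factor definition (normalized sum of absolute deviations from the ideal curve) and the scanned range of M are all assumptions of the example and need not match those used for Table 1.

```python
import numpy as np
from scipy.optimize import nnls

def sphere_intensity(s, radius):
    """Ideal curve I_ref(s) of a homogeneous sphere, normalized to I(0) = 1."""
    x = np.asarray(s, dtype=float) * radius
    return (3.0 * (np.sin(x) - x * np.cos(x)) / x**3) ** 2

def shannon_basis(s, d_max, m):
    """Columns s_n [sinc(D(s - s_n)) - sinc(D(s + s_n))] for n = 1..m."""
    s = np.asarray(s, dtype=float)[:, None]
    s_n = (np.arange(1, m + 1) * np.pi / d_max)[None, :]
    return s_n * (np.sinc(d_max * (s - s_n) / np.pi)
                  - np.sinc(d_max * (s + s_n) / np.pi))

def best_shannon_fit(s, i_ref, noise_level, d_max, m_values, rng):
    """Add relative Gaussian noise, fit every M, return (M_B, R factor at M_B)."""
    # The floor avoids vanishing errors at the sphere minima (assumption)
    sigma = noise_level * np.maximum(i_ref, 1e-4 * i_ref.max())
    i_noisy = i_ref + rng.normal(0.0, sigma)
    weights = 1.0 / (s * sigma)
    r_factors = []
    for m in m_values:
        basis = shannon_basis(s, d_max, m)
        coeffs, _ = nnls(basis * weights[:, None], s * i_noisy * weights)
        fit = basis @ coeffs / s                 # back from U_M(s) to intensity
        # Assumed R factor: normalized sum of absolute deviations from I_ref
        r_factors.append(np.sum(np.abs(i_ref - fit)) / np.sum(np.abs(i_ref)))
    best = int(np.argmin(r_factors))
    return m_values[best], r_factors[best]

rng = np.random.default_rng(0)
radius, d_max, s_max = 5.0, 10.0, 4.0            # hypothetical sphere, N_S ~ 12.7
m_values = list(range(3, 16))
for label, step in (("dense", 0.0025), ("sparse", 0.042)):
    s = np.arange(step, s_max + 1e-9, step)
    i_ref = sphere_intensity(s, radius)
    for noise in (0.01, 1.0):
        m_b, r_b = best_shannon_fit(s, i_ref, noise, d_max, m_values, rng)
        print(f"{label} grid, {noise:.0%} noise: M_B = {m_b}, R = {r_b:.4f}")
```

The comparison against the ideal curve is only possible in simulations; the printed values depend on the random seed and on the assumptions listed above.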
4. Determination of the effective number of Shannon channels
In a real experiment, the ideal scattering curve and thus M_{B} are of course not available, and M_{S} should be determined based on experimental data only. The extensive simulations described in the previous section allowed us to define quantitative criteria for the selection of M_{S}. In principle, the choice could be performed by monitoring the discrepancy χ^{2} of the Shannon fit as a function of M, given that the poorly defined channels would not significantly improve the fit. Such a procedure is employed to determine the number of independent components in singular-value decomposition (Golub & Reinsch, 1970), although formalization of the 'non-significant' condition is not trivial and the results are not always accurate. Fortunately, a reliable estimate of M_{S} is obtained by combining reciprocal and real-space criteria. Indeed, each Shannon approximation U_{M}(s) expressed by a set of coefficients a_{n} corresponds to a distance distribution in real space p_{M}(r) according to equation (3). Increasing M adds extra terms to p_{M}(r), oscillating with a higher and higher frequency πM/D_{max}. One would expect that the unreliably determined Shannon channels a_{n} will provide nothing but increasing oscillations in the p_{M}(r) function, and this can be captured by a measure of the integral derivative Ω(p)
The quality of the Shannon representation can be characterized by a combined measure
$$f(M) = \chi^{2}(M) + \alpha \, \Omega(p_{M}), \qquad (10)$$
where the coefficient α ensures proper scaling of the two metrics (see below). The procedure to determine the optimum number of Shannon channels M_{S} is therefore formulated as follows:
(i) Given an experimental data set, estimate the maximum particle size D_{max} [this is done e.g. by the programs AutoRG and AutoGnom (Petoukhov et al., 2007)].
(ii) Calculate the nominal number of Shannon channels as N_{S} = s_{max}D_{max}/π, and set up the search range. In practical applications, we use M_{min} = max(3, 0.2N_{S}), M_{max} = 1.25N_{S}.
(iii) For M_{min} < M < M_{max}, calculate the coefficients of the Shannon approximation a_{n} (n = 1, …, M) by solving equation (6) using a non-negative linear least-squares procedure (Lawson & Hanson, 1974).
(iv) For each Shannon fit, calculate the discrepancy χ^{2}(M) and the integral derivative Ω(p_{M}).
(v) Evaluate the scaling coefficient α as the ratio between χ^{2}(M_{max}) and Ω[p(M_{min})].
(vi) Determine the optimum value M_{S} corresponding to the minimum of the target function f(M) as defined in equation (10).
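Steps (ii) to (vi) can be sketched in a few lines once D_{max} is known. The code below is a hedged illustration and not the Shanum source: the roughness measure used for Ω (sum of squared derivative of p_{M} divided by sum of squared p_{M} on a uniform grid), the rounding of the search limits to integers, the combination f(M) = χ^{2}(M) + αΩ(p_{M}) and all helper names are assumptions. Step (i) is skipped and D_{max} is passed in directly.

```python
import math
import numpy as np
from scipy.optimize import nnls

def shannon_basis(s, d_max, m):
    """Columns s_n [sinc(D(s - s_n)) - sinc(D(s + s_n))] for n = 1..m."""
    s = np.asarray(s, dtype=float)[:, None]
    s_n = (np.arange(1, m + 1) * np.pi / d_max)[None, :]
    return s_n * (np.sinc(d_max * (s - s_n) / np.pi)
                  - np.sinc(d_max * (s + s_n) / np.pi))

def integral_derivative(coeffs, d_max, n_points=501):
    """Assumed roughness measure of p_M(r): sum (dp/dr)^2 / sum p^2."""
    r = np.linspace(0.0, d_max, n_points)
    s_n = np.arange(1, len(coeffs) + 1) * np.pi / d_max
    p = r / (2.0 * np.pi * d_max) * (np.sin(np.outer(r, s_n)) @ (s_n * coeffs))
    dp = np.gradient(p, r)
    return np.sum(dp**2) / np.sum(p**2)

def effective_shannon_channels(s, intensity, sigma, d_max):
    """Scan M, combine chi^2 and Omega, return (M_S, useful range in s)."""
    n_s = s.max() * d_max / np.pi
    m_lo = math.ceil(max(3.0, 0.2 * n_s))        # integer rounding is an assumption
    m_hi = math.floor(1.25 * n_s)
    weights = 1.0 / (s * sigma)
    target = s * intensity * weights
    m_values, chi2, omega = [], [], []
    for m in range(m_lo, m_hi + 1):
        design = shannon_basis(s, d_max, m) * weights[:, None]
        coeffs, _ = nnls(design, target)
        m_values.append(m)
        chi2.append(np.sum((design @ coeffs - target) ** 2) / (len(s) - 1))
        omega.append(integral_derivative(coeffs, d_max))
    alpha = chi2[-1] / omega[0]                  # chi^2(M_max) / Omega(p at M_min)
    target_function = np.array(chi2) + alpha * np.array(omega)
    m_s = m_values[int(np.argmin(target_function))]
    return m_s, np.pi * m_s / d_max

# Hypothetical test data: sphere of radius 5 nm with 5% relative noise
rng = np.random.default_rng(0)
s = np.arange(0.0025, 4.0 + 1e-9, 0.0025)
x = s * 5.0
i_ref = (3.0 * (np.sin(x) - x * np.cos(x)) / x**3) ** 2
sigma = 0.05 * np.maximum(i_ref, 1e-4)
i_exp = i_ref + rng.normal(0.0, sigma)
m_s, s_useful = effective_shannon_channels(s, i_exp, sigma, d_max=10.0)
print(f"M_S = {m_s}, useful range up to s = {s_useful:.2f} nm^-1")
```

Because α is fixed from the two ends of the search range, the two terms enter the target function on a comparable scale without any user-supplied parameter.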
Typical examples of fits with different numbers of Shannon channels and the corresponding p(r) functions are shown in Figs. 2(a) and 2(b) for the case of an oblate ellipsoid. As expected, the χ^{2} values decrease with increasing Shannon channel number (Fig. 3, blue curve), reaching a plateau when approaching M_{B} (which, for this example, coincides with N_{S}). The integral derivative Ω(p_{M}) increases slightly with increasing M and displays a sharp upturn when M exceeds M_{B} (Fig. 3, green curve). This behaviour further confirms the fact that, beyond the range of their reliable definition, the Shannon channels do not significantly improve the fit by the interpolated curve but, at the same time, they lead to strong oscillations in the p(r) function (clearly seen in Fig. 2b). The target function f(M) is dominated by the discrepancy term χ^{2}(M) (misfit to the data) at smaller M, and by the rapidly increasing integral derivative Ω(p_{M}), due to an oscillating p_{M}(r) function at larger M (Fig. 3, red curve). This leads to a characteristic U-shaped profile of f(M) and allows for a straightforward localization of M_{S} corresponding to the minimum of the target function.
A computer program, Shanum, was written to perform the selection of M_{S} following the above algorithm. To verify its performance, Shanum was applied to the simulated scattering curves described in the previous section, and it determined M_{S} values coinciding with M_{B} within one Shannon channel for all cases (Table 1). These extensive test calculations indicated that the proposed algorithm allows one to determine reliably the effective number of Shannon channels in a data set M_{S} and therefore the useful range of the experimental data (since s = πM_{S}/D_{max}).
5. Examples of practical application
After validation using simulated data, the method was applied to a number of experimental X-ray and neutron data sets collected over different angular ranges from macromolecular solutions containing particles of various sizes at different concentrations. Some of these examples are presented below to illustrate the capacity of the method to detect the useful data range. The X-ray synchrotron scattering data were recorded in collaborative user projects on the X33 beamline of the EMBL (Blanchet et al., 2012) at the storage ring DORIS-III (DESY, Hamburg). Fig. 4(a) presents the X-ray scattering data from an Importin α/β complex with a molecular mass (M_{r}) of 160 kDa and D_{max} = 19 nm (Falces et al., 2010). Due to the low protein concentration (0.5 mg ml^{−1}), the scattering data are extremely noisy at higher angles. Despite the fact that the measured range of scattering vectors (up to s_{max} = 6 nm^{−1}) nominally contains N_{S} = 36 Shannon channels, the algorithm returns M_{S} = 9, indicating that the high-angle data beyond s = 1.5 nm^{−1} contain no useful information. The scattering pattern from the DNA methyltransferase SsoII (M_{r} = 45 kDa, D_{max} = 11 nm) displayed in Fig. 4(b) (Konarev et al., 2014) appears rather noisy starting from s = 2 nm^{−1}, but the algorithm indicates that the data contain useful information up to 4 nm^{−1}. The data from LSAQIDEA Lumazine synthase (Zhang et al., 2006), which forms icosahedral assemblies in solution (with M_{r} = 2 MDa and D_{max} = 33 nm), display a good signal-to-noise ratio over the entire range displayed in Fig. 4(c), and the algorithm does find the full data range, with 20 Shannon channels, to contain useful information. Interestingly, the Shanum estimates correlate well with the data ranges actually used for data analysis in the above-mentioned publications.
It was also interesting to check whether the method is applicable to wide-angle X-ray scattering (WAXS) data. WAXS curves provide higher-resolution information and generally contain larger numbers of Shannon channels compared with SAXS data. We applied Shanum to WAXS data from a concentrated (28 mg ml^{−1}) solution of myoglobin [downloaded from the Small-Angle Scattering Data Bank (SASBDB), www.sasbdb.org, entry SASDAK2] and from a dilute (2 mg ml^{−1}) solution of cytochrome c (recorded at X33; unpublished data). Whereas for the former case the entire measured WAXS range was selected as useful, only about half of this range was deemed informative for the latter case (Fig. 5).
Finally, we shall illustrate the use of the algorithm on several published neutron scattering data sets. Fig. 6(a) displays SANS data from thioredoxin reductase, a dimeric protein with M_{r} = 68 kDa and D_{max} = 11 nm, recorded on the D22 instrument at the Institut Laue–Langevin, Grenoble, France (Svergun et al., 1998). The two data sets, collected in H_{2}O and in D_{2}O over the same angular range (up to s_{max} = 5.2 nm^{−1}), nominally both cover N_{S} = 17 Shannon channels. However, the H_{2}O data are noisier, due to the lower contrast and the incoherent background, such that the algorithm returns 14 effective channels for the H_{2}O data and 16 channels for the D_{2}O data. The next example demonstrates that the approach is not limited to biological macromolecules in aqueous solutions. The SANS data in Fig. 6(b) were collected on the KWS-2 beamline (Jülich Centre for Neutron Science, FRM-II reactor, TU München, Germany) from hybrid gold nanoparticles protected by dodecanethiol (C_{12}) or hexanethiol (C_{6}) dissolved in deuterated chloroform (Moglianetti et al., 2014). The top and bottom curves were recorded on the hybrid particles with specifically deuterated dodecanethiol or hexanethiol, respectively. The composite nanoparticle solutions are close to monodisperse, with a diameter of 8 nm, as shown by the shapes of the scattering curves and also by complementary methods. Shanum provides feasible results, suggesting that most of the dodecanethiol curve is informative, whereas the last third of the noisier hexanethiol curve bears no useful information. Given that chemically synthesized nanoparticles inevitably have a certain degree of polydispersity, the presented example indicates the applicability of Shanum not only for a non-biological system but also for a slightly polydisperse one.
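The numbers quoted for the Importin α/β complex follow directly from N_{S} = s_{max}D_{max}/π and s = πM_{S}/D_{max}. A short check, using only the values given in the text (the function names are hypothetical):

```python
import math

def nominal_channels(s_max, d_max):
    """Nominal number of Shannon channels in the measured range."""
    return s_max * d_max / math.pi

def useful_range(m_s, d_max):
    """Upper limit of the useful data range for M_S effective channels."""
    return math.pi * m_s / d_max

# Importin alpha/beta: s_max = 6 nm^-1, D_max = 19 nm, M_S = 9
print(f"N_S = {nominal_channels(6.0, 19.0):.1f}")                     # N_S = 36.3
print(f"useful range up to s = {useful_range(9, 19.0):.2f} nm^-1")   # 1.49 nm^-1
```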
6. Discussion and conclusions
Until now, no established procedure was available to assess the useful range of experimental SAXS and SANS data. The main problems of assessment based on the signal-to-noise ratio are a lack of objectivity in the selection of the threshold and the fact that the degree of oversampling is not taken into account. The proposed method overcomes both problems and offers an objective procedure to determine the useful range. The procedure, implemented in the program module Shanum included in the ATSAS package (http://www.embl-hamburg.de/biosaxs/software.html), is freely available to academic users, together with other ATSAS programs as from the 2.6 release.
Given an experimental data set, the program requires only the maximum size of the particle, D_{max}, to determine the useful range. By default, the programs AutoRG and AutoGnom (Petoukhov et al., 2007) are employed to estimate D_{max}, but if this value is known a priori (e.g. when analysing data from a protein with a known structure) it can be specified by the user. Importantly, the Shannon formalism [equations (4)–(6)] is valid not only for the maximum size D_{max} but also for any value D > D_{max}. This makes the entire procedure even more robust, allowing one safely to use a somewhat overestimated maximum size and also to handle slightly polydisperse systems (see the nanoparticle example presented above). In the test and practical calculations presented in this work, the use of 5–10% overestimated values yielded practically the same useful data ranges.
In X-ray crystallography, the useful data range assessed by I/σ and R_{merge} determines the set of reflections to enter the refinement and therefore directly defines the resolution of the model. In SAS, cutting out higher-angle data would not influence the accuracy of some parameters, e.g. the radius of gyration R_{g} determined from low-angle data by the Guinier approximation (Guinier, 1939). Obviously, the removal of meaningless data is expected to improve the results of indirect transformation analysis and of the fitting procedures making use of WAXS data (e.g. shape determination using GASBOR; Svergun et al., 2001), and also of the calculation of overall particle parameters such as the Porod volume V_{p}. This last represents the excluded particle volume and is computed as (Porod, 1982)
$$V_{p} = \frac{2\pi^{2} I(0)}{Q}, \qquad Q = \int_{0}^{\infty} s^{2} I(s) \, ds. \qquad (11)$$
In practical applications, the Porod invariant Q is calculated over a finite range [0, s_{m}] and appropriate corrections are applied to compensate for the missing data from s_{m} to infinity (e.g. in the POROD module of PRIMUS; Konarev et al., 2003). The lower panel of Fig. 4(a) presents the Porod volume of the Importin α/β complex as a function of the upper integration limit s_{m}. Given an empirical relation V_{p} (in nm^{3}) ≃ 1.7–1.8M_{r} (in kDa) (Petoukhov et al., 2012), the expected Porod volume of the complex is about 280 nm^{3}. The volume computed directly by the POROD module provides stable values, with moderate variations in the useful data range detected by Shanum (i.e. up to s_{m} ≃ 1.3 nm^{−1}), and starts to oscillate wildly as soon as higher-angle data are taken into account. Similarly, for DNA methyltransferase SsoII, V_{p} reveals meaningful values of around 75 nm^{3} when s_{m} stays within the useful data range and unreasonable oscillations beyond this range (lower panel of Fig. 4b). Note that, in practice, the above empirical relation is used in the opposite direction and V_{p} is considered to be one of the ways of assessing M_{r} without absolute calibration. These examples illustrate the importance of the removal of meaningless data for preventing potential problems in the determination of basic particle parameters.
We should underline that the proposed algorithm is not intended to serve as a low-pass filter to provide noise reduction by fitting of the experimental data. As evident from Fig. 1, at high noise levels the Shannon fits may display noticeable artificial oscillations, especially for anisometric particles. Further, the truncated Shannon representations inevitably display a termination effect due to the missing higher orders [in particular, U_{M}(s) exhibits unphysical negative values oscillating around zero for arguments exceeding πM/D_{max}]. The method is developed as a means of assessing the information content, and not as a smoothing tool for noisy data.
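The effect of the finite integration limit can be illustrated on noise-free model data. The sketch below assumes the standard Porod relation V_{p} = 2π^{2}I(0)/Q with Q the integral of s^{2}I(s), and uses a simple I(s) ≈ K/s^{4} tail beyond s_{m} as the correction. That correction, the sphere test data and the function names are assumptions of the example and do not describe the POROD module.

```python
import math

def sphere_intensity(s, radius):
    """Homogeneous-sphere intensity normalized to I(0) = 1 (test data only)."""
    x = s * radius
    if x < 1e-6:
        return 1.0
    return (3.0 * (math.sin(x) - x * math.cos(x)) / x**3) ** 2

def porod_volume(s_values, intensities, i_zero, tail_correction=True):
    """V_p = 2 pi^2 I(0) / Q, with Q integrated by the trapezoid rule up to s_m."""
    q = 0.0
    for k in range(1, len(s_values)):
        f_prev = s_values[k - 1] ** 2 * intensities[k - 1]
        f_curr = s_values[k] ** 2 * intensities[k]
        q += 0.5 * (f_prev + f_curr) * (s_values[k] - s_values[k - 1])
    if tail_correction:
        # Assumed asymptote I(s) ~ K / s^4 beyond s_m, which adds K / s_m to Q;
        # K is the mean of s^4 I(s) over the last 20% of the range
        s_m = s_values[-1]
        tail = [s**4 * i for s, i in zip(s_values, intensities) if s >= 0.8 * s_m]
        q += (sum(tail) / len(tail)) / s_m
    return 2.0 * math.pi**2 * i_zero / q

radius = 5.0                                     # nm, hypothetical sphere
s_grid = [0.001 * k for k in range(4001)]        # 0 to 4 nm^-1
i_grid = [sphere_intensity(s, radius) for s in s_grid]
true_volume = 4.0 / 3.0 * math.pi * radius**3
v_raw = porod_volume(s_grid, i_grid, 1.0, tail_correction=False)
v_corr = porod_volume(s_grid, i_grid, 1.0)
print(f"true {true_volume:.1f}, uncorrected {v_raw:.1f}, corrected {v_corr:.1f} nm^3")
```

With ideal data the truncated integral overestimates the volume by a few per cent and the tail term removes most of that bias. With noisy high-angle data the same tail estimate becomes unreliable, which is why the choice of s_{m} matters.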
In cases where the experimental errors in the data set are not available and the value of χ^{2} cannot be reliably calculated, one can use a recently developed correlation map test instead (Franke et al., 2015). In this approach, the agreement between the experimental data and the Shannon approximation is measured by the longest contiguous stretch of the same sign of the residuals, whereby the length of this stretch can be translated into a statistical probability value. We have implemented the correlation map criterion in Shanum as an alternative to χ^{2} in equation (5) and found similar results to the use of discrepancy, allowing one to evaluate reliably the range of useful data also when the experimental errors are not available. The present version of Shanum uses the correlation map if the associated errors are not provided in the input experimental data set.
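The quantity on which this criterion rests, the longest contiguous stretch of residuals with the same sign, is easy to compute. The following minimal sketch covers only that step; the conversion of the run length into a probability value is not reproduced here, and treating exact zeros as breaking a run is an assumption.

```python
def longest_same_sign_run(residuals):
    """Length of the longest contiguous stretch of residuals sharing one sign."""
    longest = current = 0
    previous_sign = 0
    for value in residuals:
        sign = (value > 0) - (value < 0)
        if sign != 0 and sign == previous_sign:
            current += 1
        else:
            current = 1 if sign != 0 else 0      # zeros break a run (assumption)
        previous_sign = sign
        longest = max(longest, current)
    return longest

# Hypothetical residuals I(s_i) - U_M(s_i)/s_i for seven data points
residuals = [0.3, -0.1, -0.4, -0.2, 0.5, 0.1, -0.7]
print(longest_same_sign_run(residuals))          # 3
```

A fit with too few channels leaves systematic deviations and hence long runs, whereas an adequate fit gives residuals that change sign frequently.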
Importantly, the method proposed here does not require user input and is thus applicable in automated pipelines for data analysis. Further, Shanum is being implemented in a suite of validation tools for the deposited experimental SAXS/SANS data in SASBDB. The principle of assessment of the useful data range proposed here might be useful for other types of scattering or spectroscopic experiments yielding discrete oversampled data.
Acknowledgements
The authors acknowledge the support of the Bundesministerium für Bildung und Forschung (BMBF), project BIOSCAT, grant No. 05K20912, and of the European Commission, FP7 Infrastructure Programme grant BioStructX, project No. 283570.
References
Blanchet, C. E., Zozulya, A. V., Kikhney, A. G., Franke, D., Konarev, P. V., Shang, W., Klaering, R., Robrahn, B., Hermes, C., Cipriani, F., Svergun, D. I. & Roessle, M. (2012). J. Appl. Cryst. 45, 489–495.
Damaschun, G., Mueller, J. J. & Puerschel, H. V. (1968). Monatsh. Chem. 99, 2343–2348.
Falces, J., Arregi, I., Konarev, P. V., Urbaneja, M. A., Svergun, D. I., Taneva, S. G. & Bañuelos, S. (2010). Biochemistry, 49, 9756–9769.
Franke, D., Jeffries, C. & Svergun, D. I. (2015). Nat. Methods, 12, doi: 10.1038/nmeth.3358.
Frieden, B. R. (1971). Evaluation, Design and Extrapolation Methods for Optical Signals, Based on the Use of the Prolate Functions. Progress in Optics, edited by E. Wolf, pp. 312–407. Amsterdam: North Holland.
Golub, G. H. & Reinsch, C. (1970). Numer. Math. 14, 403–420.
Graewert, M. A. & Svergun, D. I. (2013). Curr. Opin. Struct. Biol. 23, 748–754.
Guinier, A. (1939). Ann. Phys. (Paris), 12, 161–237.
Jacques, D. A., Guss, J. M., Svergun, D. I. & Trewhella, J. (2012). Acta Cryst. D68, 620–626.
Konarev, P. V., Kachalova, G. S., Ryazanova, A. Y., Kubareva, E. A., Karyagina, A. S., Bartunik, H. D. & Svergun, D. I. (2014). PLoS One, 9, e93453.
Konarev, P. V., Volkov, V. V., Sokolova, A. V., Koch, M. H. J. & Svergun, D. I. (2003). J. Appl. Cryst. 36, 1277–1282.
Lawson, C. L. & Hanson, R. J. (1974). Solving Least Squares Problems. Englewood Cliffs, New Jersey, USA: Prentice–Hall Inc.
Moglianetti, M., Ong, Q. K., Reguera, J., Harkness, K. M., Mameli, M., Radulescu, A., Kohlbrecher, J., Jud, C., Svergun, D. I. & Stellacci, F. (2014). Chem. Sci. 5, 1232–1240.
Moore, P. B. (1980). J. Appl. Cryst. 13, 168–175.
Petoukhov, M. V., Franke, D., Shkumatov, A. V., Tria, G., Kikhney, A. G., Gajda, M., Gorba, C., Mertens, H. D. T., Konarev, P. V. & Svergun, D. I. (2012). J. Appl. Cryst. 45, 342–350.
Petoukhov, M. V., Konarev, P. V., Kikhney, A. G. & Svergun, D. I. (2007). J. Appl. Cryst. 40, s223–s228.
Porod, G. (1982). Small-angle X-ray Scattering, edited by O. Glatter and O. Kratky, pp. 17–51. London: Academic Press.
Schwartz, L. (1952). Comm. Sém. Math. Univ. Lund, Tome supplémentaire, 196–206.
Shannon, C. E. & Weaver, W. (1949). The Mathematical Theory of Communication. Urbana: University of Illinois Press.
Skou, S., Gillilan, R. E. & Ando, N. (2014). Nat. Protoc. 9, 1727–1739.
Svergun, D. I., Koch, M. H. J., Timmins, P. A. & May, R. P. (2013). Small-angle X-ray and Neutron Scattering from Solutions of Biological Macromolecules. Oxford University Press.
Svergun, D. I., Petoukhov, M. V. & Koch, M. H. J. (2001). Biophys. J. 80, 2946–2953.
Svergun, D. I., Richard, S., Koch, M. H. J., Sayers, Z., Kuprin, S. & Zaccai, G. (1998). Proc. Natl Acad. Sci. USA, 95, 2267–2272.
Taupin, D. & Luzzati, V. (1982). J. Appl. Cryst. 15, 289–300.
Trewhella, J., Hendrickson, W. A., Kleywegt, G. J., Sali, A., Sato, M., Schwede, T., Svergun, D. I., Tainer, J. A., Westbrook, J. & Berman, H. M. (2013). Structure, 21, 875–881.
Valentini, E., Kikhney, A. G., Previtali, G., Jeffries, C. M. & Svergun, D. I. (2015). Nucleic Acids Res. 43, D357–D363.
Wlodawer, A., Minor, W., Dauter, Z. & Jaskolski, M. (2008). FEBS J. 275, 1–21.
Zhang, X., Konarev, P. V., Petoukhov, M. V., Svergun, D. I., Xing, L., Cheng, R. H., Haase, I., Fischer, M., Bacher, A., Ladenstein, R. & Meining, W. (2006). J. Mol. Biol. 362, 753–770.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.