research papers

FOUNDATIONS
ADVANCES
ISSN: 2053-2733

The generalized F constraint in the maximum-entropy method – a study on simulated data

Laboratory of Crystallography, University of Bayreuth, Germany
*Correspondence e-mail: [email protected]

(Received 2 April 2002; accepted 28 August 2002)

One of the classical problems in the application of the maximum-entropy method (MEM) to electron-density reconstructions is the uneven distribution of the normalized residuals of the structure factors, \(\Delta F/\sigma\), of the resulting electron density. This distribution does not correspond to the expected Gaussian distribution and it leads to erroneous features in the MEM reconstructions. It is shown that the classical \(\chi^2\) constraint is only one of many possible constraints, and that it is too weak to restrict the resulting distribution to the expected Gaussian shape. It is proposed that constraints should be used that are based on the higher-order central moments of the distribution of the structure-factor residuals. In this work, the influence of different constraints on the quality of the MEM reconstruction is investigated. It is proposed that the use of a combined constraint on more than one central moment simultaneously would improve the results further. Oxalic acid dihydrate was chosen as model structure, from which several data sets with different resolutions and different levels of noise were calculated and subsequently used in the MEM. The results clearly show that the use of constraints on higher-order moments leads to significantly improved results.

1. Introduction

The maximum-entropy method (MEM) is used as a powerful tool for model-free image reconstruction in many scientific applications (von der Linden et al., 1998[Linden, W. von der, Dose, V., Fisher, R. & Preuss, R. (1998). Editors. Maximum Entropy & Bayesian Methods. Dordrecht: Kluwer Academic Publishers.]). In crystallography, one particular application is the investigation of the electron density in the crystal structure. After the first promising applications in this field (Collins, 1982[Collins, D. M. (1982). Nature (London), 298, 49-51.]; Sakata & Sato, 1990[Sakata, M. & Sato, M. (1990). Acta Cryst. A46, 263-270.]), several warnings concerning the reliability and possible pathologies of the method appeared (Jauch, 1994[Jauch, W. (1994). Acta Cryst. A50, 650-652.]; de Vries et al., 1996[Vries, R. Y. de, Briels, W. J. & Feil, D. (1996). Phys. Rev. Lett. 77, 1719-1722.]). One of the obvious problems was that the distribution of the normalized residuals of the structure factors, \(\Delta F/\sigma\), strongly deviated from the expected Gaussian distribution. Some of the reflections – usually strong reflections at low angles – had very large \(|\Delta F/\sigma|\), while the others were fitted almost exactly. The large deviation of the histogram of \(\Delta F/\sigma\) from the Gaussian distribution was responsible for unphysical features in the corresponding electron density. A solution to this problem was proposed by de Vries et al. (1994[Vries, R. Y. de, Briels, W. J. & Feil, D. (1994). Acta Cryst. A50, 383-391.]), who employed an ad hoc weighting scheme within the classical \(\chi^2\) constraint. However, a theoretical basis for this weighting scheme does not exist.

Here we propose new constraints based on the higher-order central moments of the distribution of \(\Delta F/\sigma\). We show that the use of these constraints produces results with better distributions of \(\Delta F/\sigma\) and with fewer artifacts in the reconstructed electron density than the classical \(\chi^2\) constraint.

The method is tested against data sets of various resolutions and with various noise levels that were computed for a known electron density of oxalic acid dihydrate.

2. The method

The basic principle of the MEM is that the optimal image is defined to be the image with the maximum value of the entropy functional S, while one or more constraints of the type \(C = 0\) are fulfilled. For our purposes, the image is the electron density (\(\rho\)) in the unit cell, which is defined by its values \(\rho_i\) on a grid of \(N_p = N_1 \times N_2 \times N_3\) points. The entropy is defined as

\[ S = -\sum_{i=1}^{N_p} \rho_i \ln\left(\frac{\rho_i}{\tau_i}\right), \qquad (1) \]

where the values \(\tau_i\) define the prior or reference electron density. For an overview of the crystallographic applications of the MEM, see Gilmore (1996[Gilmore, C. J. (1996). Acta Cryst. A52, 561-589.]). The constraints should be selected so as to define which image is in agreement with the observed data. The first reasonable constraint is the normalization of \(\rho\) to the expected number of electrons per unit-cell volume:

\[ \frac{V}{N_p} \sum_{i=1}^{N_p} \rho_i = N_{\rm el}. \qquad (2) \]
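As a minimal sketch of the two quantities introduced so far, the following Python fragment evaluates an entropy of the commonly used form \(S = -\sum_i \rho_i \ln(\rho_i/\tau_i)\) and rescales a gridded density to a given number of electrons. The function names, the toy four-pixel grid and all numerical values are illustrative assumptions; they are not taken from the paper or from any MEM program.

```python
import math

def entropy(rho, tau):
    """S = -sum_i rho_i * ln(rho_i / tau_i); assumes all values are strictly positive."""
    return -sum(r * math.log(r / t) for r, t in zip(rho, tau))

def normalize(rho, n_electrons, cell_volume):
    """Rescale the grid values so that (V / N_p) * sum(rho) equals n_electrons."""
    n_pix = len(rho)
    scale = n_electrons * n_pix / (cell_volume * sum(rho))
    return [r * scale for r in rho]

# Toy example (hypothetical values): a flat prior and two candidate densities.
tau = [1.0, 1.0, 1.0, 1.0]          # uniform prior
rho_flat = [1.0, 1.0, 1.0, 1.0]     # identical to the prior
rho_peaked = [2.5, 0.5, 0.5, 0.5]   # same total content, more structure

s_flat = entropy(rho_flat, tau)      # zero: no deviation from the prior
s_peaked = entropy(rho_peaked, tau)  # negative: structure lowers the entropy
```

With this sign convention the entropy is largest when the density equals the prior, so any structure in the reconstruction has to be enforced by the data constraints.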

Traditionally, the constraint to the scattering data is the least-squares likelihood criterion \(\chi^2 = 1\), with

\[ \chi^2 = \frac{1}{N_F} \sum_{i=1}^{N_F} \left( \frac{|F_{\rm obs}^i| - |F_{\rm MEM}^i|}{\sigma_i} \right)^2, \qquad (3) \]

where the summation runs over all measured structure factors NF. This definition of the constraint cannot be used directly, since it does not contain the information about the phases of the structure factors and does not lead to convergence. Therefore, the so-called F constraint is usually employed:

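\[ C_F = -1 + \frac{1}{N_F} \sum_{i=1}^{N_F} \frac{|F_{\rm obs}^i - F_{\rm MEM}^i|^2}{\sigma_i^2}. \qquad (4) \]

The value of \(C_F\) depends on both the amplitudes and phases of \(F_{\rm obs}\) and \(F_{\rm MEM}\). \(C_F\) is minimal if the phases of all \(F_{\rm obs}^i\) are equal to those of the corresponding \(F_{\rm MEM}^i\). In that case, \(C_F = \chi^2 - 1\).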

The use of the \(\chi^2\) statistics (and its phased modification in the \(C_F\) constraint) is based on an assumption that the experimental errors on \(F_{\rm obs}\) are random with a Gaussian distribution:

\[ F_{\rm obs}^i = F_{\rm true}^i + \sigma_i \xi_i, \qquad (5) \]

where \(\xi_i\) is a sample of the random variable with normalized Gaussian distribution. Since the resulting electron density \(\rho_{\rm MEM}\) should be the best estimate of the true density, the corresponding calculated structure factors \(F_{\rm MEM}\) should be the best estimate of \(F_{\rm true}\) and the distribution of the normalized residuals should be Gaussian too.

It is obvious that the Gaussian distribution of errors does imply the validity of the \(\chi^2\) (or \(C_F\)) constraint, but not vice versa. Constraining only \(\chi^2\) is not sufficient to ensure the proper Gaussian form of the resulting error distribution.

A probability distribution of a random variable x is characterized by the values of its central moments \(m_n\). For the normalized Gaussian distribution, the central moments are

\[ m_n = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^n \exp\left(-\frac{x^2}{2}\right)\,{\rm d}x. \qquad (6) \]

The values of the moments of odd order are all zero and the moments of even order are:

\[ m_n = \prod_{i=1}^{n/2} (2i-1) = (n-1)!! \qquad (7) \]

In the case of N samples of the variable x, the central moments \(m_n\) can be computed from

\[ m_n = \frac{1}{N} \sum_{i=1}^{N} x_i^n. \qquad (8) \]

It follows from (3) and (8) that \(\chi^2\) is the \(m_2\) central moment of the distribution of \(\Delta F/\sigma\). Thus, the concept of generalized F constraint can be introduced, with F2 referring to the classical constraint on the second-order moment, and with Fn defining a constraint based on the moment of order n:

\[ C_{F_n} = -1 + \frac{1}{m_n} \frac{1}{N_F} \sum_{i=1}^{N_F} \left( \frac{|F_{\rm obs}^i - F_{\rm MEM}^i|}{\sigma_i} \right)^n. \qquad (9) \]

Only the constraints with n even restrict the width of the histogram; constraints with n odd are sensitive only to the symmetry of the distribution with respect to the origin. Therefore, only the constraints with n even are used in this work.
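The difference between constraining only the second moment and constraining a higher even moment can be illustrated numerically. The sketch below assumes a constraint of the form \(C_{F_n} = -1 + m_n({\rm sample})/m_n({\rm Gauss})\), which is satisfied when it reaches zero. The function names and the artificial set of residuals (one large outlier, all other reflections fitted tightly) are hypothetical and serve only as an illustration.

```python
import math

def gaussian_moment(n):
    """Moment m_n of the unit Gaussian: 0 for odd n, (n-1)!! for even n."""
    if n % 2:
        return 0
    result = 1
    for k in range(1, n, 2):   # product 1 * 3 * 5 * ... * (n-1)
        result *= k
    return result

def sample_moment(x, n):
    """m_n = (1/N) * sum(x_i**n), taken about zero (the expected mean)."""
    return sum(v ** n for v in x) / len(x)

def generalized_constraint(residuals, n):
    """Assumed form: C_Fn = -1 + m_n(sample) / m_n(Gauss), for even n."""
    return -1.0 + sample_moment(residuals, n) / gaussian_moment(n)

# Hypothetical residuals: one outlier at 8 sigma, 99 values chosen so that
# the second moment of the whole set is exactly one.
small = math.sqrt(36.0 / 99.0)
residuals = [8.0] + [small] * 99

c_f2 = generalized_constraint(residuals, 2)   # essentially zero: F2 is satisfied
c_f4 = generalized_constraint(residuals, 4)   # far from zero: F4 is violated
```

This toy distribution passes the second-order test although it is clearly not Gaussian, whereas the fourth-order value is far from zero. That is the kind of distribution the higher-order constraints are meant to exclude.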

It has been suggested that more simultaneous constraints (up to the number of independent observations) of the form Mathematical equation could be used instead of the single \(\chi^2\) constraint (Carvalho et al., 1996[Carvalho, C., Hashizume, H., Stevenson, A. & Robinson, I. (1996). Physica (Utrecht), B221, 469-486.]). This requires some additional criterion for defining the point of convergence and strongly restricts the role of the MEM as a noise filter. We suggest that the use of several Fn constraints simultaneously is the proper way to handle noisy data, since the expected shape of the histogram is the only information about the noise that is available. However, the available algorithms do not allow such a generalization. Therefore, at the present stage of the work, the influence of different choices of a single constraint based on (9) on the result of the MEM was investigated.

3. Computational details

The method was tested on the structure of oxalic acid dihydrate. The main reason for this choice was that this compound has become a kind of standard for charge-density studies. In addition, the structure of oxalic acid dihydrate is very suitable for this type of work, since it is centrosymmetric and the central molecule is planar. This allows an easy interpretation of the majority of the features using only one section of the electron density. The basic characteristics of the structure are summarized in Table 1.

Table 1
Basic characteristics of the structure of oxalic acid dihydrate

Chemical formula HOOC—COOH·2H2O
Chemical formula weight 126.06
Cell setting, space group Monoclinic, unique axis b, P21/n
a, b, c (Å) 6.101, 3.500, 11.955
β (°) 105.78
V (Å3) 245.64
Z 2

First, the electron density of the procrystal structure (superposition of independent atoms, \(\rho_{\rm pro}\)) was created. This was done by a method due to Papoular et al. (2002[Papoular, R. J., Collin, G., Colson, D. & Viallet, V. (2002). In Proceedings of the 21st Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, edited by R. Fry. Melville, NY: American Institute of Physics. To be published.]). The analytical approximation to spherical atomic scattering factors (Su & Coppens, 1997[Su, Z. & Coppens, P. (1997). Acta Cryst. A53, 749-762.]) for each atom of the structure was multiplied by the anisotropic displacement factor of that atom. The resulting three-dimensional distribution in reciprocal space was then transformed by means of the analytical Fourier transform to obtain the electron density of that atom. The density was sampled on a 64 × 32 × 128 pixel grid, which corresponds to a pixel size of approximately 0.1 × 0.1 × 0.1 Å. The positional and displacement parameters from the refinement due to Šlouf (2001[Šlouf, M. (2001). PhD Thesis, Charles University, Prague, Czech Republic.]) were used. The electron densities of the individual atoms were then summed to obtain \(\rho_{\rm pro}\). The `true' electron density \(\rho_{\rm true}\) was then constructed by summing \(\rho_{\rm pro}\) with the dynamic deformation density, as determined by the multipole refinement of Šlouf (2001[Šlouf, M. (2001). PhD Thesis, Charles University, Prague, Czech Republic.]) (Fig. 1a). This caused 1.65% of the pixels of the resulting electron density to be negative. The lowest density was −0.021 e Å−3. The negative areas were located in the low-density intermolecular regions. This unphysical feature probably originates from the inaccuracy of the multipole expansion in these very low density regions. The MEM cannot handle these negative regions, and very low density regions unduly increase the dynamic range of the electron density. Therefore, the pixels with \(\rho_{\rm true} < 0.005\) e Å−3 were set to 0.005 e Å−3; in total, 2.45% of the pixels were corrected.
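The quoted pixel size and cell volume can be checked directly from the cell parameters in Table 1 and the grid dimensions given above. The following short Python check is only a consistency calculation; the variable names are arbitrary.

```python
import math

# Cell parameters from Table 1 (lengths in angstrom, angle in degrees).
a, b, c = 6.101, 3.500, 11.955
beta_deg = 105.78
grid = (64, 32, 128)   # number of pixels along a, b and c

# Monoclinic cell volume: V = a * b * c * sin(beta).
volume = a * b * c * math.sin(math.radians(beta_deg))

# Pixel edge lengths along the three cell axes.
pixel = tuple(length / n for length, n in zip((a, b, c), grid))

n_pixels = grid[0] * grid[1] * grid[2]
```

The pixel edges come out at about 0.095, 0.109 and 0.093 Å, in line with the stated size of approximately 0.1 Å. The volume evaluates to about 245.66 Å3, which agrees with the tabulated 245.64 Å3 within the rounding of the cell parameters.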

[Figure 1]
Figure 1
The sections of the true electron density showing the oxalic acid molecule. (a) The dynamic deformation density obtained by the multipole refinement (Šlouf, 2001[Šlouf, M. (2001). PhD Thesis, Charles University, Prague, Czech Republic.]). (b) The total electron density \(\rho_{\rm true}\). Scale in Å, contours 0.07 e Å−3, cut-off 2.0 e Å−3, zero contour omitted. Maximum of the deformation density 0.56 e Å−3, maximum of the total density 56.79 e Å−3.

The electron density obtained by this procedure is certainly not the true electron density of oxalic acid dihydrate. The analytical approximation used in the first step is not absolutely accurate and the structure parameters and multipole deformation density can contain a substantial degree of inaccuracy, too. However, this model of electron density is good enough to be used as the reference electron density for MaxEnt calculations and will be denoted as \(\rho_{\rm true}\) (Fig. 1b).

The structure factors corresponding to the original map were calculated by means of a numerical Fourier transform. To investigate the influence of noise and resolution on the quality of the MEM reconstruction, 16 different data sets were created. The value \((\sin\theta/\lambda)_{\max}\) is used as a measure of resolution in this paper. It was chosen to be 0.5, 0.75, 1.0 and 1.25 Å−1 for the respective data sets, and for each resolution four different levels of Gaussian noise were added to the calculated structure factors. To simulate the error distribution in real experimental data, the standard deviations \(\sigma\) were calculated from

Mathematical equation

where Mathematical equation defines the noise level, Mathematical equation simulates the influence of non-zero background and p is the commonly used instability factor. The noisy `observed' structure factors were then calculated to fulfil the equation

Mathematical equation

Here, Mathematical equation is a random variable with normalized Gaussian probability distribution. Three different non-zero noise levels were created this way. The noiseless data sets at each resolution were included for checking purposes. Although the structure factors in the noiseless data sets were exact, which means they should be assigned a zero standard deviation, this is not possible owing to the nature of the constraints [equation (9)[link]]. Therefore, the value of Mathematical equation was set to 0.005 for all structure factors so as to be low enough and to allow the computations to finish in a reasonable time. The parameters of different noise levels and resolutions are summarized in Table 2[link] and Fig. 2[link].
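The generation of noisy `observed' structure factors can be sketched as follows, assuming additive noise of the form \(F_{\rm obs} = F_{\rm true} + \sigma\xi\) with \(\xi\) drawn from a unit Gaussian. The standard deviations are passed in as plain numbers here, because the noise model of equation (10) is not reproduced; all names and toy values are hypothetical.

```python
import random

def add_gaussian_noise(f_true, sigma, rng):
    """Return F_obs = F_true + sigma * xi, with xi drawn from a unit Gaussian."""
    return [f + s * rng.gauss(0.0, 1.0) for f, s in zip(f_true, sigma)]

def normalized_residuals(f_obs, f_true, sigma):
    """Normalized residuals (F_obs - F_true) / sigma."""
    return [(o - t) / s for o, t, s in zip(f_obs, f_true, sigma)]

# Toy real-valued structure factors, as for a centrosymmetric structure.
rng = random.Random(0)              # fixed seed for reproducibility
f_true = [50.0, 12.0, 0.3, -0.2]
sigma = [0.5, 0.5, 0.5, 0.5]        # placeholder values, not from equation (10)
f_obs = add_gaussian_noise(f_true, sigma, rng)
```

Because the noise is added to the signed structure factor, a weak reflection whose magnitude is comparable to its standard deviation can change sign. In a centrosymmetric structure this corresponds to a change of phase.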

Table 2
Parameters of the data sets

Noise levels

  Level 0 Level 1 Level 2 Level 3
Mathematical equation 0.005 0.025 0.1 0.25
Mathematical equation 0 1 10 15
p 0 0.0001 0.0001 0.0001

Resolution

Shells in sinθ/λ (Å−1)   Independent reflections   Observed/unobserved
                                                   Level 1    Level 2    Level 3
0.00–0.50                258                       253/5      235/23     217/41
0.50–0.75                608                       574/34     468/140    358/250
0.75–1.00                1182                      1042/140   714/468    425/757
1.00–1.25                1981                      1480/501   604/1377   165/1816
[Figure 2]
Figure 2
Distribution of Mathematical equation as a function of the resolution for different noise levels. Note that for uniform prior Fprior = 0 for all structure factors except F(000). Black: Mathematical equation; dark gray: Mathematical equation; light gray: Mathematical equation; white: Mathematical equation.

It is interesting to compare the phases of structure factors corresponding to \(\rho_{\rm true}\) with the phases corresponding to \(\rho_{\rm pro}\). In the present case, which is representative for investigations of accurate electron densities, the amount of the unknown structure is minute and the phases of the true structure factors are very well estimated by the phases of the structure factors of \(\rho_{\rm pro}\). Among all 4029 structure factors, up to \((\sin\theta/\lambda)_{\max} = 1.25\) Å−1, only nine have different phases for \(\rho_{\rm true}\) and \(\rho_{\rm pro}\). Moreover, equation (11) allows for changes of phases between Fobs and Ftrue. As a consequence of the introduction of the noise, many more than nine phases have changed in each noisy data set. Thus, the results presented here are not influenced by the preliminary multipole refinement and can be regarded as being obtained using just the standard refinement.

We have developed our own computer program BayMEM (first version by Schneider, 2001[Schneider, M. (2001). PhD thesis, University of Bayreuth, Germany.]) for the application of the MEM in charge-density analysis. This program is designed to work in general n-dimensional space to allow computations of the MEM electron density of incommensurately modulated structures, but can be used for standard three-dimensional structures too without any restrictions. BayMEM can use both the algorithm of Sakata & Sato (1990[Sakata, M. & Sato, M. (1990). Acta Cryst. A46, 263-270.]) and the MEMSys5 package (Gull & Skilling, 1999[Gull, S. F. & Skilling, J. (1999). MemSys5 v1.2 Program Package. Suffolk, United Kingdom.]). The program was extended to deal with the generalized F constraint. For the present study, the algorithm by Sakata & Sato (1990[Sakata, M. & Sato, M. (1990). Acta Cryst. A46, 263-270.]) was used.

The following characteristics are used to compare the quality of the MaxEnt reconstructions:

  • (i) the values of the even central moments of the distribution of normalized residuals;

  • (ii) the overall shape of the histogram;

  • (iii) the section through \(\rho_{\rm MEM}\) in the plane of the HOOC—COOH molecule;

  • (iv) the section through the difference map \(\rho_{\rm MEM} - \rho_{\rm true}\) in the plane of the HOOC—COOH molecule;

  • (v) the MEM deformation density \(\rho_{\rm MEM} - \rho_{\rm pro}\) in the plane of the HOOC—COOH molecule;

  • (vi) the coincidence factor C, which allows for an easy comparison among different reconstructions by one number:

    Mathematical equation
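Criterion (ii), the overall shape of the histogram, amounts to comparing the observed counts of normalized residuals in each class with the counts expected for a unit Gaussian. A minimal sketch of such a comparison is given below; the bin edges, the function names and the sample values are arbitrary illustrative choices.

```python
import math

def gaussian_cdf(x):
    """Cumulative distribution function of the unit Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_counts(n_residuals, edges):
    """Expected number of residuals in each bin [edges[k], edges[k+1])."""
    return [n_residuals * (gaussian_cdf(hi) - gaussian_cdf(lo))
            for lo, hi in zip(edges, edges[1:])]

def histogram(residuals, edges):
    """Observed counts per bin; values outside the outermost edges are ignored."""
    counts = [0] * (len(edges) - 1)
    for r in residuals:
        for k in range(len(counts)):
            if edges[k] <= r < edges[k + 1]:
                counts[k] += 1
                break
    return counts

edges = [-10.0, -1.0, 0.0, 1.0, 10.0]
ideal = expected_counts(1000, edges)                # Gaussian reference counts
observed = histogram([-0.5, 0.5, 2.0, -3.0], edges)
```

A histogram dominated by a few very large residuals shows an excess in the outer classes and a peak that is too narrow in the central classes relative to the reference counts.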

For all n1, n2 and n3 data sets, the computations using the Fn constraints of order 2 to 8 were performed; for the n0 data set, only the orders 2 to 6 were used, since there was no visible influence of the constraint on the results. For comparison, the computations using the ad hoc weighting (de Vries et al., 1996[Vries, R. Y. de, Briels, W. J. & Feil, D. (1996). Phys. Rev. Lett. 77, 1719-1722.], referred to as static weighting hereafter) were performed on the noisy data sets. The F constraint with additional static weighting is defined as

Mathematical equation

Weights of the form \(w = 1/|\mathbf{H}|^n\) (\(|\mathbf{H}|\) is the length of the diffraction vector) with n equal to 3, 4 and 5 were used in this work. To investigate the influence of the prior electron density, two series of calculations were performed. The first series was made with the uniform prior, the second series with the procrystal prior \(\rho_{\rm pro}\).

The quality of the MEM reconstructions can be compared with Fourier maps. The Fourier transform of the observed structure factors with calculated phases results in a Fourier map that can be compared with the MEM density obtained with the uniform prior. Inspection of the Fourier map shows that its noise is much larger than that of the MEM density. This is quantified by the C values (Table 4).

The classical method to derive information about electron densities beyond the model is the difference Fourier. We have computed the difference Fourier map for Fobs − Fpro. To be able to compare the result with the MEM density, we have added the procrystal density to the difference Fourier map. Again, the noise in the resulting map is significantly larger than in the MEM density (Table 5).

4. Results and discussion

4.1. The uniform prior

In the first series of calculations, a uniform electron density was used as prior. The dominating structure of the MEM reconstruction is the oscillatory electron density around each atomic position (Fig. 3). Its presence is independent of the constraint and of the noise level. However, at high noise levels these features are partly camouflaged by the noise of the reconstruction itself. The oscillations are most pronounced at the zero noise level. Clearly, this effect is a demonstration of the series-termination error intrinsically present in the method, as pointed out already by Jauch (1994[Jauch, W. (1994). Acta Cryst. A50, 650-652.]) and later discussed in detail by Roversi et al. (1998[Roversi, P., Irwin, J. J. & Bricogne, G. (1998). Acta Cryst. A54, 971-996.]). The present results show the extent of this effect and its dependence on the resolution of the data set. The amplitude of the artifacts decreases with increasing resolution, but even at a resolution of 1.25 Å−1 it remains significant (Fig. 3, Table 3). Further lowering of the artifacts by increasing the resolution is in practice not possible owing to experimental limitations. Possible ways to overcome this problem are summarized in §5.

Table 3
Extrema of the artifacts at different resolutions for n0 noise level and F2 constraint

  max min
r0.50 4.36 −28.62
r0.75 3.32 −10.87
r1.00 4.94 −1.84
r1.25 3.42 −0.95
[Figure 3]
Figure 3
Sections through the difference electron-density map \(\rho_{\rm MEM} - \rho_{\rm true}\) showing one COOH group. Uniform prior. (a) n0r0.75, contours 0.2 e Å−3, cut-off 3.0 e Å−3. (b) n0r1.00, contours 0.05 e Å−3, cut-off 1.0 e Å−3. (c) n0r1.25, contours as in (b). The decreasing width of the waves of the difference density with increasing resolution and the interference of the waves are clearly visible.

The \(\rho_{\rm MEM}\) obtained for different noise levels and different resolutions is characterized by the C values (Table 4), by the shapes of the histograms of \(\Delta F/\sigma\) (Fig. 4), and by the values of the central moments of the distribution of \(\Delta F/\sigma\) (Fig. 5). The following conclusions can be drawn from the table and the figures:

  • (i) The use of the higher-order constraints significantly improves the quality of \(\rho_{\rm MEM}\). The improvement is largest between the F2 and F4 constraints. Only for the noiseless data sets does the use of different constraints not have any effect on the resulting C value, although the effect on the histogram is large. This is because at this noise level the C value is determined mainly by the series-termination artifacts, which are almost independent of the particular constraint. The improvement generally becomes larger with increasing resolution. The probable reason for this is not the higher resolution itself but rather the higher number of reflections in the data set.

  • (ii) The histograms of the higher-order constraints are much closer to the ideal Gaussian distribution than the F2 histograms and the number of very large normalized residuals is reduced (Fig. 4). On the other hand, these histograms are not free of systematic errors either. The histograms of the higher-order constraints tend to be slightly asymmetric towards positive differences. For a smaller number of reflections and/or lower noise level, the histograms tend to have a flatter peak with respect to the ideal shape and in the extreme case split into two distinct peaks (Fig. 4). The two peaks tend to be at the positions \(\pm (m_n)^{1/n}\), which correspond to the average value of normalized residual necessary to fulfil the given constraint. This is not an exclusive property of higher-order constraints; similar splitting can appear in the F2 histograms, too, although only in very extreme cases (n0r0.50).

  • (iii) The quality of the result (measured by the C value) is perfectly correlated with the quality of the histogram expressed by the values of its central moments. The best results are obtained with the constraint that produces a histogram closest to the expected normalized Gaussian (compare Table 4 and Fig. 5). With increasing order of the constraint, the resulting histograms get better first (the large positive slope of the curve in Fig. 5 gets smaller) and then the high-order central moments of the histograms become overestimated (the slope of the curves in Fig. 5 becomes negative). The best result is obtained when the slope of the curve is close to zero. We suggest that, if there are two constraints close to the optimal slope, the one with positive slope should be preferred. This can be understood to be a choice between slightly underestimating and slightly overestimating the data. Using the constraint with positive slope means possibly losing some information present in the data; using the one with negative slope means letting the MEM fit some noise and thus introducing some false features in the resulting \(\rho_{\rm MEM}\). In practice, however, the difference between the two results is negligible.

The improvement of \(\rho_{\rm MEM}\) is visible in both the total and difference electron-density maps \(\rho_{\rm MEM}\) and \(\rho_{\rm MEM} - \rho_{\rm true}\) (Fig. 6). The waviness of the low-density contours in \(\rho_{\rm MEM}\) is suppressed and the overall amount of the residual structure in \(\rho_{\rm MEM} - \rho_{\rm true}\) decreases. It should be noted that the total density maps do not give sufficient insight into the accuracy of the result and cannot be used as a single criterion of the quality of the MaxEnt reconstruction. This can be seen from the comparison of the total and difference maps (Fig. 6). The largest errors occur in the medium and high density levels, where the total density map seems to be smooth and well behaved. This is especially true for the low-resolution maps, which seem to be smooth at first sight, but which exhibit large differences in comparison to the original map.

Table 4
The coincidence factors C for MaxEnt calculations using the uniform prior and for the Fourier map

Fn denotes the generalized F constraint of order n, swn denotes the static weighting with weight \(1/|\mathbf{H}|^n\) [for definition see equation (13), for definition of shorthand notation of different data sets see Table 2]. Note: Some calculations could not be finished using the algorithm of Sakata & Sato (1990[Sakata, M. & Sato, M. (1990). Acta Cryst. A46, 263-270.]) due to convergence problems. For static weighting computations, this could be overcome by using the MEMSys5 package (Gull & Skilling, 1999[Gull, S. F. & Skilling, J. (1999). MemSys5 v1.2 Program Package. Suffolk, United Kingdom.]). These results are shown in italic. Generally, the differences between the results of the two algorithms are not very large, but the results of the latter algorithm seem to be slightly better. The calculation with the F6 constraint on the n0r0.50 data set did not converge (denoted by n.c.).

Data set F2 F4 F6 F8   sw3 sw4 sw5   Fourier
n3r0.50 0.3515 0.2971 0.2942 0.2961   0.2884 0.2631 0.2548   1.3375
n3r0.75 0.3455 0.2237 0.2180 0.2230   0.1836 0.1546 0.1567   1.2187
n3r1.00 0.4137 0.2021 0.1873 0.1885   0.1569 0.1119 0.1179   1.1329
n3r1.25 0.4880 0.2316 0.1976 0.1970   0.1709 0.1073 0.1046   1.1434
                     
n2r0.50 0.2730 0.2498 0.2515 0.2539   0.2447 0.2353 0.2326   1.3323
n2r0.75 0.2126 0.1476 0.1469 0.1502   0.1359 0.1212 0.1209   1.2073
n2r1.00 0.2250 0.1059 0.1010 0.1033   0.0935 0.0661 0.0685   1.1000
n2r1.25 0.2755 0.1063 0.0967 0.0969   0.1018 0.0632 0.0629   1.0440
                     
n1r0.50 0.2287 0.2250 0.2254 0.2260   0.2233 0.2221 0.2457   1.3290
n1r0.75 0.1186 0.1026 0.1026 0.1033   0.1017 0.0998 0.1303   1.2061
n1r1.00 0.0815 0.0458 0.0448 0.0456   0.0451 0.0382 0.0708   1.0977
n1r1.25 0.0952 0.0365 0.0343 0.0353   0.0355 0.0255 0.0247   1.0339
                     
n0r0.50 0.2199 0.2199 n.c.              
n0r0.75 0.0949 0.0949 0.0950              
n0r1.00 0.0286 0.0289 0.0290              
n0r1.25 0.0147 0.0151 0.0155              
[Figure 4]
Figure 4
The histograms of \(\Delta F/\sigma\) for different constraints. Uniform prior. For the F2 histograms, only the central section is shown for good comparability; the full histogram is shown in the inset. The ideal Gaussian shape is shown as the grey area in each histogram. The counts of normalized residuals in classes higher than 4.0 are multiplied by 10.
[Figure 5]
Figure 5
The even central moments m2 to m16 of the histograms of all MEM runs on the n2 data sets. Uniform prior. Horizontal axis = order of the moment, vertical axis = normalized values of the moments mn(MEM)/mn(Gauss) on a logarithmic scale. Each curve corresponds to one histogram and is labeled with the constraint used for the MaxEnt calculation.
[Figure 6]
Figure 6
\(\rho_{\rm MEM}\) and \(\rho_{\rm MEM} - \rho_{\rm true}\) obtained with the n2r1.00 data set and with the uniform prior. (a) \(\rho_{\rm MEM}\), F2 constraint. (b) \(\rho_{\rm MEM}\), F6 constraint. (c) \(\rho_{\rm MEM} - \rho_{\rm true}\), F2 constraint. (d) \(\rho_{\rm MEM} - \rho_{\rm true}\), F6 constraint. All contours as in Fig. 1.

Despite the significant improvement of the MEM reconstructions obtained with the constraints on the higher-order moments, the quality of the reconstructions using the static weighting was in our case even better (Table 4). This surprising effectiveness of the idea of the static weighting suggests that there might exist some fundamental reason for it. A closer investigation of possible theoretical foundations of this type of weighting is desirable.

The systematic investigation of the large number of different data sets allows one to draw some general conclusions about the influence of the noise and the resolution on the quality of the result. The expected improvement of the C factors with decreasing noise level is clearly visible. The improvement with increasing resolution is visible, too, but not as an absolute rule (compare C values of n3r1.00 and n3r1.25, n2r1.00 and n2r1.25 in Table 4). This can be correlated with Fig. 2. The larger the fraction of unobserved reflections present in the outer shell, the smaller is the amount of information it contains. In the data sets with the high noise level, almost all reflections in the outer shells are unobserved (`less-thans'), and they cannot contribute to the improvement of the MEM reconstruction.
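The statement about the outer shells can be made quantitative with the observed/unobserved counts listed in Table 2. The short Python fragment below computes the fraction of unobserved reflections per resolution shell; the counts are copied from the table, while the variable and function names are arbitrary.

```python
# Observed/unobserved counts per resolution shell, copied from Table 2.
# Shells are listed from the innermost to the outermost.
counts = {
    "Level 1": [(253, 5), (574, 34), (1042, 140), (1480, 501)],
    "Level 2": [(235, 23), (468, 140), (714, 468), (604, 1377)],
    "Level 3": [(217, 41), (358, 250), (425, 757), (165, 1816)],
}

def unobserved_fraction(observed, unobserved):
    """Fraction of reflections in a shell that are classified as unobserved."""
    return unobserved / (observed + unobserved)

fractions = {
    level: [unobserved_fraction(o, u) for o, u in shells]
    for level, shells in counts.items()
}

# Total number of independent reflections (the same for every noise level).
total = sum(o + u for o, u in counts["Level 1"])
```

For the outermost shell this gives roughly 25%, 70% and 92% unobserved reflections at noise levels 1, 2 and 3, respectively. The total of 4029 reflections matches the number of structure factors quoted in §3.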

4.2. The procrystal prior

In the second series of calculations, the procrystal electron density \(\rho_{\rm pro}\) was used as prior. The summary of the resulting C values is given in Table 5. The deformation density \(\rho_{\rm MEM} - \rho_{\rm pro}\) obtained with data sets n2r1.00 and n1r0.75 is shown in Fig. 7. We believe that these examples are quite close to the data sets obtainable in practice.

Table 5
The coincidence factors C for MaxEnt calculations using the procrystal prior and for the difference Fourier map with the procrystal density added

For explanation of the symbols see Table 4. The C factor of the procrystal prior is 0.0598.

Data set F2 F4 F6 F8   sw3 sw4 sw5   Diff. Fourier
n3r0.50 0.0538 0.0560 0.0589 0.0585   0.0554 0.0574 0.0575   0.1015
n3r0.75 0.0554 0.0552 0.0580 0.0574   0.0534 0.0513 0.0533   0.2023
n3r1.00 0.0598 0.0597 0.0590 0.0592   0.0598 0.0598 0.0545   0.3308
n3r1.25 0.0598 0.0598 0.0598 0.0598   0.0598 0.0598 0.0555   0.4856
                     
n2r0.50 0.0421 0.0423 0.0443 0.0458   0.0400 0.0403 0.0386   0.0598
n2r0.75 0.0434 0.0404 0.0414 0.0433   0.0361 0.0353 0.0328   0.1016
n2r1.00 0.0496 0.0447 0.0453 0.0466   0.0420 0.0358 0.0372   0.1744
n2r1.25 0.0545 0.0491 0.0486 0.0496   0.0483 0.0473 0.0350   0.2702
                     
n1r0.50 0.0285 0.0259 0.0258 0.0262   0.0253 0.0248 0.0236   0.0340
n1r0.75 0.0275 0.0233 0.0219 0.0224   0.0209 0.0184 0.0172   0.0206
n1r1.00 0.0290 0.0220 0.0208 0.0211   0.0205 0.0170 0.0157   0.0339
n1r1.25 0.0321 0.0245 0.0229 0.0229   0.0218 0.0174 0.0150   0.0563
                     
n0r0.50 0.0224 0.0223 0.0223              
n0r0.75 0.0106 0.0105 0.0104              
n0r1.00 0.0057 0.0056 0.0057              
n0r1.25 0.0038 0.0041 0.0045              
[Figure 7]
Figure 7
MEM deformation electron density, \(\rho_{\rm MEM} - \rho_{\rm pro}\). Calculations with the \(\rho_{\rm pro}\) prior. (a) n2r0.75 data set, F4 constraint. (b) n1r1.00 data set, F6 constraint. All contours as in Fig. 1.

As expected, the artifacts are strongly reduced and visible only in the vicinity of the atomic center. The deformation density resembles the true deformation density quite well even for the medium noise level. The differences in C factors among the different Fn constraints and the different static weighting are much smaller than in the case of the uniform prior, but they are still significant, especially for the low noise levels.

With increasing noise level, the outer shells of structure factors contain so much noise that it masks their statistical difference from the prior structure factors. Such reflections do not improve the result and can even lead to a slightly worse Mathematical equation (compare Table 5 and Fig. 2). In an extreme case – noise level 3 – the reflections do not provide any additional information at all and Mathematical equation is almost identical to the prior. In other words, the MEM indicates that the data contain no evidence for a deviation from the prior.

The results confirm that, with procrystal prior information, the MEM is able to reveal the deformation electron density even from the medium-resolution data, provided they are sufficiently accurate.

5. Conclusions

The series-termination effect is shown to be intrinsic to the crystallographic applications of the MEM. The extent of this effect depends on the resolution of the data set and on the kind of prior electron density. For the uniform prior, the artifacts are significantly higher than the bonding electron-density level and make this version of the MEM unsuitable for the investigation of fine features in the electron density. Nevertheless, it is still a useful method for the investigation of more robust features such as anharmonic atomic movement or disorder (Bagautdinov et al., 1998; Dinnebier et al., 1999; Wang et al., 2001).

The procrystal prior electron density reduces the artifacts, and the reconstructions with this prior contain information about the fine features of the electron density. A further reduction of the artifacts could probably be achieved with the two-channel MEM (Papoular et al., 1996) or with the valence-only MEM proposed by Roversi et al. (1998). The latter method uses the refined structure parameters to create a core electron-density fragment, which is then considered to be known and is not included in the MaxEnt optimization. Only the valence electron density is modified. However, this method is of practical use only for extremely accurate data from simple structures, since it relies on knowledge of the temperature parameters, which are often inaccurate and correlated with systematic errors in the data sets.

The use of the generalized F constraint substantially improves the quality of the MEM results. The proper order is selected as the one for which the histogram of the normalized residuals agrees best with the expected Gaussian distribution. In our experience, order 4 or 6 gives the best results.
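The selection criterion described above can be illustrated with a minimal sketch. It assumes that the normalized residuals of each MEM run are already available as plain lists of numbers, and it quantifies the "coincidence" of the histogram with the Gaussian by a root-mean-square difference between the histogram density and the standard normal density. This misfit measure, the function names and the synthetic residuals are assumptions made for illustration only; the text does not prescribe how the agreement is to be measured.

```python
import math
import random


def mean_square(residuals):
    """Mean squared normalized residual (equals 1 when the classical F2 constraint is met)."""
    return sum(r * r for r in residuals) / len(residuals)


def gaussian_histogram_misfit(residuals, bins=20, limit=4.0):
    """RMS difference between the histogram density of the residuals
    and the standard normal density, evaluated at the bin centres.

    Residuals outside [-limit, limit) fall in no bin but still count in
    the normalization, so outliers increase the misfit.
    """
    width = 2.0 * limit / bins
    counts = [0] * bins
    for r in residuals:
        k = math.floor((r + limit) / width)
        if 0 <= k < bins:
            counts[k] += 1
    total = len(residuals)
    squared = 0.0
    for k, count in enumerate(counts):
        centre = -limit + (k + 0.5) * width
        expected = math.exp(-0.5 * centre * centre) / math.sqrt(2.0 * math.pi)
        squared += (count / (total * width) - expected) ** 2
    return math.sqrt(squared / bins)


def select_order(residuals_by_order):
    """Return the constraint order whose residual histogram is closest to Gaussian."""
    return min(
        residuals_by_order,
        key=lambda n: gaussian_histogram_misfit(residuals_by_order[n]),
    )


# Synthetic, purely illustrative residuals (not data from this study).
rng = random.Random(0)
# Well-behaved case: residuals drawn from a standard normal distribution.
gauss = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# Pathological case: most reflections fitted almost exactly, a few with
# very large residuals, yet the mean square is still close to 1.
peaked = [rng.gauss(0.0, 0.05) for _ in range(1980)] + [10.0] * 20

print("mean square (peaked):", mean_square(peaked))
print("misfit (gauss): ", gaussian_histogram_misfit(gauss))
print("misfit (peaked):", gaussian_histogram_misfit(peaked))
print("selected order:", select_order({2: peaked, 4: gauss}))
```

The second synthetic set has a mean squared residual close to 1 although its histogram is far from Gaussian, which is the kind of distribution that a constraint on the second moment alone cannot exclude.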

Static weighting still gives better results than the non-weighted Fn constraints. However, this type of weighting lacks any theoretical foundation, and the choice of the best weighting depends strongly on the data set (Yamamoto et al., 1996). On the other hand, the constraints based on the expected moments of the distribution of Mathematical equation have a clear interpretation. One can expect that new algorithms allowing the simultaneous use of several constraints in the MEM will lead to further improvement of the results.
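The interpretation of the moment-based constraints can be written out explicitly. The block below is a sketch in assumed notation: $\Delta F_i/\sigma_i$ denotes the normalized residual of reflection $i$, $N_F$ the number of reflections and $m_n$ the $n$th central moment of the standard Gaussian distribution. These symbols are chosen here for illustration and may differ from those used in the defining equations of the paper; only the values of the Gaussian moments are standard results.

```latex
% Even central moments of the standard Gaussian distribution
m_n = (n-1)!! \quad (n\ \text{even}), \qquad
m_2 = 1,\quad m_4 = 3,\quad m_6 = 15,\quad m_8 = 105 .

% An F_n constraint (assumed form) requires the n-th sample moment of the
% normalized residuals to equal its Gaussian expectation value
\frac{1}{N_F} \sum_{i=1}^{N_F}
  \left( \frac{|\Delta F_i|}{\sigma_i} \right)^{n} = m_n .
```

For n = 2 this reduces to the classical condition that the mean squared normalized residual equals 1. Higher orders weight large individual residuals more strongly.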

A further advantage of the higher-order F constraints over the classical F2 constraint or static weighting is faster convergence, which significantly shortens the computation time.

Acknowledgements

Financial support by the Deutsche Forschungsgemeinschaft is gratefully acknowledged.

References

Bagautdinov, B., Luedecke, J., Schneider, M. & van Smaalen, S. (1998). Acta Cryst. B54, 626–634.
Carvalho, C., Hashizume, H., Stevenson, A. & Robinson, I. (1996). Physica (Utrecht), B221, 469–486.
Collins, D. M. (1982). Nature (London), 298, 49–51.
Dinnebier, R. E., Schneider, M., van Smaalen, S., Olbrich, F. & Behrens, U. (1999). Acta Cryst. B55, 35–44.
Gilmore, C. J. (1996). Acta Cryst. A52, 561–589.
Gull, S. F. & Skilling, J. (1999). MemSys5 v1.2 Program Package. Suffolk, United Kingdom.
Jauch, W. (1994). Acta Cryst. A50, 650–652.
Linden, W. von der, Dose, V., Fisher, R. & Preuss, R. (1998). Editors. Maximum Entropy & Bayesian Methods. Dordrecht: Kluwer Academic Publishers.
Papoular, R. J., Vekhter, Y. & Coppens, P. (1996). Acta Cryst. A52, 397–407.
Papoular, R. J., Collin, G., Colson, D. & Viallet, V. (2002). In Proceedings of the 21st Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, edited by R. Fry. Melville, NY: American Institute of Physics. To be published.
Roversi, P., Irwin, J. J. & Bricogne, G. (1998). Acta Cryst. A54, 971–996.
Sakata, M. & Sato, M. (1990). Acta Cryst. A46, 263–270.
Schneider, M. (2001). PhD thesis, University of Bayreuth, Germany.
Šlouf, M. (2001). PhD thesis, Charles University, Prague, Czech Republic.
Su, Z. & Coppens, P. (1997). Acta Cryst. A53, 749–762.
Vries, R. Y. de, Briels, W. J. & Feil, D. (1994). Acta Cryst. A50, 383–391.
Vries, R. Y. de, Briels, W. J. & Feil, D. (1996). Phys. Rev. Lett. 77, 1719–1722.
Wang, C.-R., Jai, T., Tomiyama, T., Yoshida, T., Kobayashi, Y., Nishibori, E., Takata, M., Sakata, M. & Shinohara, H. (2001). Angew. Chem. Int. Ed. Engl. 40/2, 397–399.
Yamamoto, K., Takahashi, Y., Ohshima, K., Okamura, F. P. & Yukino, K. (1996). Acta Cryst. A52, 606–613.

© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.
