Received 29 June 2001
A new interpretation and practical aspects of the direct-methods modulus sum function. VIII
Since the first publication of the direct-methods modulus sum function [Rius (1993). Acta Cryst. A49, 406-409], its application to a variety of situations has been shown in a series of seven subsequent papers, and much experience with the function and its practical use has been gained. The authors consider that this is the right moment to publish a more complete study of the function that also incorporates most of this practical knowledge. The first part of the study uses a new interpretation to relate the function to other existing phase-refinement functions. The second part shows, with the help of test calculations on a selection of crystal structures, the behaviour of the function for two different control parameters. The principal interest is focused on the function itself and not on the optimization procedure, which is based on a conventional sequential tangent-formula refinement. The results are quite satisfactory and suggest that, when combined with more sophisticated optimization algorithms, the function could be applied to larger structures than those used for the test calculations.
Direct methods solve crystal structures by combining the information contained in the measured intensities with the positivity and atomicity of the electron-density distribution. Conventional multisolution direct methods refine the collectivity Φ of phases φh of the normalized structure factors Eh of the strong reflections h by searching for an extremum of a given target (phase-refinement) function. For simplicity, an equal-atom structure with N atoms in the unit cell belonging to space group P1 will be assumed throughout this paper.
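As a concrete illustration of the quantities involved, the following minimal Python sketch computes normalized structure factors for an equal-atom P1 structure made of point atoms, using the standard expression Eh = N^(-1/2) Σj exp(2πi h·rj). The function name, the random test structure and the index range are assumptions made for this illustration and are not taken from the paper.

```python
import numpy as np

def normalized_structure_factors(frac_coords, hkl):
    """E_h = N**-0.5 * sum_j exp(2*pi*i * h.r_j) for N equal point atoms in P1."""
    frac_coords = np.asarray(frac_coords, dtype=float)  # shape (N, 3), fractional
    hkl = np.asarray(hkl, dtype=float)                  # shape (M, 3), Miller indices
    phase_args = 2.0 * np.pi * hkl @ frac_coords.T      # shape (M, N)
    return np.exp(1j * phase_args).sum(axis=1) / np.sqrt(len(frac_coords))

# Hypothetical test structure: 50 atoms at random positions (illustration only).
rng = np.random.default_rng(0)
coords = rng.random((50, 3))
hkl = rng.integers(-8, 9, size=(2000, 3))
hkl = hkl[np.any(hkl != 0, axis=1)]          # drop the 000 reflection

E = normalized_structure_factors(coords, hkl)
moduli, phases = np.abs(E), np.angle(E)      # |E_h| and phi_h
mean_E2 = float(np.mean(moduli**2))          # expected to lie close to 1
```

With this normalization the mean of |E|² over many reflections is close to 1, and Friedel mates satisfy E(-h) = E(h)*.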
From a practical point of view, there exist two very important types of phase-refinement functions:
The residual R(Φ) can be expressed in an alternative way by making full use of the atomicity constraint. Indeed, if the atomicity condition is fulfilled, then, as predicted by the acentric probability distribution of the |E| values, the moduli |E| have restricted values close to 1 (Wilson, 1949; Woolfson, 1970). Consequently, the following approximate relationship between the moduli |E| and their squares |E|² will hold (Fig. 1), namely
By introducing in |E| the dependence on Φ and taking the average over H, C(Φ) may be approximated by
S(Φ) is the so-called direct-methods modulus sum function (Rius, 1993; Rius, Torrelles & Miravitlles, 2000). From (10), it follows that, as long as the atomicity condition is satisfied, minimizing R(Φ) is equivalent to maximizing S(Φ) and that no estimation of scaling constants is necessary. By use of the same procedure as Debaerdemaeker et al. (1985), the new phase estimates maximizing S(Φ) can be solved for the limit of S(Φ),
Figure 1
|E|² approximated by a linear equation (straight lines) using experimental mean values taken from MBH2; dashed line for Nweak ≈ Nlarge; continuous line with all reflections.
Two important parameters in the implementation of the function S(Φ) have been investigated with the help of some test calculations: (i) the value of |E|min, i.e. the cut-off value for considering a reflection as large, and (ii) once the optimum |E|min is fixed, the ratio r of the number of weak reflections to the number of large reflections (r = Nweak/Nlarge).
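The role of the two control parameters can be sketched in a few lines of Python. This is an illustrative reading of the definitions above and not the XLENSTM implementation: reflections with |E| ≥ |E|min are taken as large, and the Nweak = r·Nlarge reflections with the smallest |E| are taken as weak. The function name and the sample values are hypothetical.

```python
import numpy as np

def split_reflections(e_moduli, e_min, ratio=1.0):
    """Return index arrays (large, weak) for a cut-off e_min and ratio r = Nweak/Nlarge."""
    e_moduli = np.asarray(e_moduli, dtype=float)
    large = np.flatnonzero(e_moduli >= e_min)        # reflections treated as large
    # Requested number of weak reflections, capped so that the two sets never overlap.
    n_weak = min(int(round(ratio * large.size)), e_moduli.size - large.size)
    weak = np.argsort(e_moduli)[:n_weak]             # the n_weak smallest moduli
    return large, weak

# Hypothetical |E| values, for illustration only.
e_values = [0.1, 0.2, 0.5, 0.9, 1.3, 1.7, 2.0]
large, weak = split_reflections(e_values, e_min=1.25, ratio=1.0)
```

Lowering e_min enlarges both sets at once, because the number of weak reflections is tied to the number of large ones through r.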
This value controls the number of large reflections in S(Φ) and, consequently, the number of phases to be refined. It also has a direct influence on the accuracy of |EH(Φ)| since, as indicated in (4), |EH(Φ)| is expressed as a function of Φ using Sayre's equation. It is tempting to select a large |E|min to reduce the number of refined phases as much as possible. However, since this comes at the cost of the accuracy of |EH(Φ)|, it is first necessary to study to what extent the variation of |E|min affects the accuracy at different data resolutions dmin. This study has been performed with the data of the organic structure PEP1. The indicator selected for measuring the evolution of the accuracy for different |E|min and dmin values is the correlation coefficient CC between the values |Eh| and Xh = |Eh(Φ)| cos(-h + h), the latter calculated with correct phases (Fig. 2),
The principal conclusion from inspection of Fig. 2 is that, for dmin ≤ 1 Å, |E|min values up to approximately 1.6 can be tolerated (CC = 0.96). At this resolution, the accuracy degrades relatively slowly. This is in clear contrast with the situation for dmin ≈ 1.2 Å. Here, the accuracy deteriorates markedly, as evidenced by the value CC = 0.89 obtained for |E|min = 1.55. This low CC value means that, at this resolution, purely organic compounds will hardly be solved with the S(Φ) function.
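The exact expression used for CC is not reproduced in this text. The sketch below assumes the usual Pearson (linear) correlation coefficient between the observed moduli |Eh| and the values Xh; the function name and the toy numbers are illustrative only.

```python
import numpy as np

def correlation_coefficient(e_obs, x_calc):
    """Pearson correlation between observed moduli |E_h| and the values X_h."""
    e_obs = np.asarray(e_obs, dtype=float)
    x_calc = np.asarray(x_calc, dtype=float)
    de = e_obs - e_obs.mean()                 # deviations from the mean
    dx = x_calc - x_calc.mean()
    return float((de * dx).sum() / np.sqrt((de**2).sum() * (dx**2).sum()))

# Toy data: X_h reproduces |E_h| up to a small perturbation (illustration only).
e_obs = np.array([1.3, 1.5, 1.7, 2.0, 2.4])
x_calc = 0.8 * e_obs + np.array([0.05, -0.03, 0.02, -0.04, 0.01])
cc = correlation_coefficient(e_obs, x_calc)
```

A study like the one described would evaluate such a coefficient repeatedly, once for each combination of |E|min and dmin.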
Figure 2
Correlation coefficient CC between |Eh| and Xh = |Eh(Φ)| cos(-h + h) for different cut-off values |E|min and data resolutions dmin. Xh computed with correct phases. Test data taken from a purely organic compound. For dmin ≤ 1.0 Å, CC is relatively insensitive to the cut-off value, so that values as large as |E|min ≈ 1.6 can be tolerated. For dmin > 1 Å, deterioration progresses very quickly.
Once the upper cut-off value of |E|min is fixed at 1.6-1.7 for dmin values close to 1 Å, tests with smaller |E|min values were performed on a variety of diffraction data sets. Their degree of difficulty is variable, and in most cases the classical tangent formula (Karle & Hauptman, 1956; Yao, 1981) cannot solve them in a reasonable number of trials. The diffraction data are almost complete up to dmin = 1 Å and the number of weak reflections has been estimated from Nweak ≈ Nlarge. The information about the selected compounds is summarized in Table 1, which also contains the PDB file code when protein data have been deposited in the Protein Data Bank. The largest structures are at the bottom of the table. All calculations have been carried out with a new version of the program XLENSTM which, after clustering and sorting the direct-methods solutions according to the refined S(Φ) values, automatically performs, for each solution, a Fourier-recycling step followed by the corresponding R-value calculation for the measured reflections. The direct-methods solutions are analysed until the correct solution is found or a given cut-off value of S(Φ) is reached. The R values in Table 2 have been computed with
assuming an overall thermal parameter. A solution has been considered correct if, after 10 Fourier cycles, most atoms show up in the E map. The number of cycles was increased to 40 for the largest structures tested (and for PGE2).
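The expression for the R values mentioned above is not reproduced in this text. As a hedged illustration, the sketch below assumes the conventional crystallographic residual R = Σ||Fo| - k|Fc|| / Σ|Fo| with a single scale factor k, and applies an overall thermal parameter B through the factor exp(-B sin²θ/λ²). The function and argument names are hypothetical, and the paper's own definition may differ.

```python
import numpy as np

def r_value(f_obs, f_calc, b_overall=0.0, stol_sq=None):
    """Conventional residual between observed and calculated amplitudes (assumed form)."""
    f_obs = np.abs(np.asarray(f_obs, dtype=float))
    f_calc = np.abs(np.asarray(f_calc, dtype=float))
    if stol_sq is not None:
        # Overall isotropic thermal factor; stol_sq holds (sin(theta)/lambda)**2.
        f_calc = f_calc * np.exp(-b_overall * np.asarray(stol_sq, dtype=float))
    scale = f_obs.sum() / f_calc.sum()        # single scale factor k
    return float(np.abs(f_obs - scale * f_calc).sum() / f_obs.sum())

# Toy amplitudes (illustration only): k = 0.5, scaled |Fc| = [0.5, 1.5], R = 0.5.
r_demo = r_value([1.0, 1.0], [1.0, 3.0])
```

Under this definition, a model whose calculated amplitudes are proportional to the observed ones gives R = 0, whatever the scale.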
The three analysed |E|min values were 1.25, 1.45 and 1.65. Inspection of the data listed in Table 2 clearly shows that the best results are obtained for |E|min = 1.25. Not only is the ranking number of the correct solutions lower, but the number of correct solutions also increases. In most cases, the correct solution is the top solution. This result is understandable, since lower |E|min values produce more accurate |E(H, Φ)| and, owing to the size of these structures, the increase in the number of variables is still manageable.
For the larger structures, only tests with |E|min in the range 1.45-1.59 have been performed, owing to the large number of triplets. For APP, rubredoxin and alpha1, the results look very promising. One surprise has been the rather low ranking number of the correct solutions for these three compounds. For pheromone, however, the ranking number of the correct solution is 65 (out of 4000 trials); it could be identified because it has the best PSIZERO and RESID figures of merit. Tests with larger structures that have the origin fixed in all three directions, e.g. P212121, have been avoided. As shown in Rius et al. (1994), the number of correct solutions decreases drastically in comparison with space groups having the origin floating in at least one direction. Since the maximization procedure used by the current version of XLENSTM is a conventional tangent formula refinement, it is very improbable that correct solutions can be found for these compounds.
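For orientation, a sequential tangent-formula pass can be sketched as follows. The sketch uses the classical Karle & Hauptman form, in which the new estimate of φh is the phase of Σk Ek Eh-k, and not the S(Φ)-based variant whose equations are not reproduced in this text. All names are hypothetical, and the dictionary of reflections is assumed to exclude the 000 term.

```python
import numpy as np

def triplet_sum(h, e_values):
    """Sum of E_k * E_(h-k) over all k for which both factors are available."""
    total = 0j
    for k, e_k in e_values.items():
        h_minus_k = tuple(a - b for a, b in zip(h, k))
        e_hk = e_values.get(h_minus_k)
        if e_hk is not None:
            total += e_k * e_hk
    return total

def tangent_cycle(e_values):
    """One sequential pass of the classical tangent formula (moduli kept fixed)."""
    refined = dict(e_values)
    # Visit reflections from the largest to the smallest modulus.
    for h in sorted(refined, key=lambda idx: -abs(refined[idx])):
        total = triplet_sum(h, refined)
        if abs(total) > 0.0:
            # The new phase estimate is the phase of the triplet sum.
            refined[h] = abs(refined[h]) * np.exp(1j * np.angle(total))
            friedel = tuple(-a for a in h)
            if friedel in refined:
                refined[friedel] = np.conj(refined[h])  # keep Friedel mates consistent
    return refined

# Consistency check on a one-atom structure, where E_k * E_(h-k) = E_h exactly,
# so correct phases must be left unchanged by a refinement pass.
atom = np.array([0.10, 0.20, 0.30])
indices = [(h, k, l) for h in range(-2, 3) for k in range(-2, 3)
           for l in range(-2, 3) if (h, k, l) != (0, 0, 0)]
e_true = {idx: np.exp(2j * np.pi * np.dot(idx, atom)) for idx in indices}
e_new = tangent_cycle(e_true)
max_shift = max(abs(e_new[idx] - e_true[idx]) for idx in indices)
```

In a multisolution run, such passes would be repeated from many random starting phase sets until the phases stop changing, which is where the cost of a large number of triplets becomes apparent.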
Since the first applications of the S(Φ) function, the number of weak reflections has been estimated by setting Nweak ≈ Nlarge. Owing to the good results obtained from the beginning, no further investigations were carried out on this point. Recently, in order to complete the study of the function, XLENSTM has been slightly modified to allow for different ratios r. The results of a series of test calculations for |E|min = 1.25 with ratios ranging from 0.5 to 2 are listed in Table 3. These results indicate that the efficiency of the function is rather insensitive to ratio variations within the studied interval.
Finally, a series of test calculations has been performed including not only the large and weak reflections but all measured reflections. As can be seen in Table 4, the results are also excellent, although the practical importance of these results is limited because the computing effort is much higher than for lower ratios.
Work supported by the Ministerio de Educación y Cultura (Project PB98-0483).
Anderson, D. H., Weiss, M. S. & Eisenberg, D. (1996). Acta Cryst. D52, 469-480.
Antel, J., Sheldrick, G. M., Bats, J. W., Kessler, H. & Müller, A. (1995). Unpublished.
Bhat, T. N. & Ammon, H. L. (1990). Acta Cryst. C46, 112-116.
Bhuiya, A. K. & Stanley, E. (1963). Acta Cryst. 16, 981-984.
Burla, M. C., Camalli, M., Carrozzini, B., Cascarano, G. L., Giacovazzo, C., Polidori, G. & Spagna, R. (2000). Acta Cryst. A56, 451-457.
Butters, T., Hütter, P., Jung, G., Pauls, N., Schmitt, H., Sheldrick, G. M. & Winter, W. (1981). Angew. Chem. 93, 904-905.
Debaerdemaeker, T., Tate, C. & Woolfson, M. M. (1985). Acta Cryst. A41, 286-290.
Debaerdemaeker, T. & Woolfson, M. M. (1983). Acta Cryst. A39, 193-196.
DeTitta, G. T., Langs, D. A., Edmonds, J. W. & Duax, W. L. (1980). Acta Cryst. B36, 638-645.
Glover, I., Haneef, I., Pitts, J.-E., Wood, S. P., Moss, D., Tickle, I. J. & Blundell, T. L. (1983). Biopolymers, 22, 293-304.
Irngartinger, H., Reibel, W. R. K. & Sheldrick, G. M. (1981). Acta Cryst. B37, 1768-1771.
Karle, J. & Hauptman, H. (1956). Acta Cryst. 9, 635-651.
Oliver, J. D. & Strickland, L. C. (1984). Acta Cryst. C40, 820-824.
Pedio, M., Felici, R., Torrelles, X., Rudolf, P., Capozi, M., Rius, J. & Ferrer, S. (2000). Phys. Rev. Lett. 85, 1040-1043.
Poyser, J. P., Edwards, P. L., Anderson, J. R., Hursthouse, M. B., Walker, N. P., Sheldrick, G. M. & Whalley, A. J. S. (1986). J. Antibiot. 39, 167-169.
Privé, G. G., Anderson, D. H., Wesson, L., Cascio, D. & Eisenberg, D. (1999). Protein Sci. 8, 1-9.
Rius, J. (1993). Acta Cryst. A49, 406-409.
Rius, J. (2000). Powder Diffr. 14, 267-273.
Rius, J. & Miravitlles, C. (1991). Acta Cryst. A47, 567-571.
Rius, J., Miravitlles, C. & Allmann, R. (1996). Acta Cryst. A52, 634-639.
Rius, J., Miravitlles, C., Gies, H. & Amigó, J. M. (1999). J. Appl. Cryst. 32, 89-97.
Rius, J., Sañé, J., Miravitlles, C., Amigó, J. M. & Reventós, M. M. (1994). Acta Cryst. A51, 268-270.
Rius, J., Sañé, J., Miravitlles, C., Gies, H., Marler, B. & Oberhagemann, U. (1995). Acta Cryst. A51, 840-845.
Rius, J., Torrelles, X. & Miravitlles, C. (2000). EPDIC-7: Abstracts Book, p. 59.
Rius, J., Torrelles, X., Miravitlles, C., Ochando, L. E., Reventós, M. M. & Amigó, J. M. (2000). J. Appl. Cryst. 33, 1208-1211.
Sheldrick, G. M., Dauter, Z., Wilson, K. S., Hope, H. & Sieker, L. C. (1993). Acta Cryst. D49, 18-23.
Smith, G. D., Duax, W. L., Langs, D. A., DeTitta, G. T., Edmonds, J. W., Rohrer, D. C. & Weeks, J. (1975). J. Am. Chem. Soc. 97, 7242-7247.
Suck, D., Manor, P. C. & Saenger, W. (1976). Acta Cryst. B32, 1727-1737.
Szeimies-Seebach, U., Harnisch, J., Szeimies, G., Van Meersche, M., Germain, G. & Declercq, J. P. (1978). Angew. Chem. Int. Ed. Engl. 17, 848-850.
Torrelles, X., Rius, J., Boscherini, F., Heun, S., Mueller, B. H., Ferrer, S., Alvarez, J. & Miravitlles, C. (1998). Phys. Rev. B, 57, R4281-4284.
Torrelles, X., Rius, J., Miravitlles, C. & Ferrer, S. (1998). Surf. Sci. 423, 338-345.
Weeks, C. M., DeTitta, G. T., Hauptman, H. A., Thuman, P. & Miller, R. (1994). Acta Cryst. A50, 210-220.
Weeks, C. M., DeTitta, G. T., Miller, R. & Hauptman, H. A. (1993). Acta Cryst. D49, 179-181.
Williams, D. J. & Lawton, D. (1975). Tetrahedron Lett. pp. 111-114.
Wilson, A. J. C. (1949). Acta Cryst. 2, 318-321.
Woolfson, M. M. (1970). An Introduction to X-ray Crystallography, pp. 234-240. Cambridge University Press.
Yao, J. X. (1981). Acta Cryst. A37, 642-644.