research papers
Highorder analytic translation matrix elements for realspace sixdimensional polar Fourier correlations
^{a}Department of Computing Science, University of Aberdeen, Aberdeen AB24 3UE, Scotland
^{*}Correspondence email: dritchie@csd.abdn.ac.uk
Analytic expressions are presented for calculating translations of highorder threedimensional expansions of orthonormal real spherical harmonic and Gaussiantype or exponentialtype radial basis functions. When used with real spherical harmonic rotation matrices, the resulting translation matrices provide a fully analytic method of calculating sixdimensional realspace rotational–translational correlations. The correlation algorithm is demonstrated by using an exhaustive search to superpose the steric density functions of a pair of similar globular proteins in a matter of seconds on a contemporary personal computer. It is proposed that the techniques described could be used to accelerate the calculation of e.g. realspace electron density correlations in docking proteins into density maps, and searching the Protein Data Bank for structural homologues.
Keywords: polar Fourier correlation; 6D correlation; fast Fourier transform; fast Hartley transform; spherical Bessel transform; Gaussiantype orbitals; Slatertype orbitals; exponentialtype orbitals; Laguerre polynomials; protein shape; protein docking.
1. Introduction
Solving protein crystal structures by , 2001), fitting protein structures into (EM) density maps (Roseman, 2000; Rossmann, 2000; Frank, 2002), and predicting the docking mode of a pair of proteins (KatchalskiKatzir et al., 1992; Ritchie & Kemp, 2000) are three examples of tasks which typically involve searching for the coordinate transformation that maximizes a sixdimensional correlation. Often, such correlations can be expressed as the overlap between one or more pairs of threedimensional scalar functions, such as electron densities, and it is primarily this type of correlation that is considered here. For example, if some phase information is available from multiple (MAD) or multipe (MIR) data, or from a partial model, the phased translation function may be used to correlate a lowresolution with the density from a model structure to help locate the model in the crystal (Read & Schierbeek, 1988). This type of translational correlation may be accelerated using a threedimensional fast Fourier transform (FFT). However, the correct solution can sometimes be missed if the rotational solution is in error. This can often be corrected by varying the rotational orientations (Fujinaga & Read, 1987) or by translating multiple rotational peaks (Navaza, 2001), but both these techniques entail additional computational cost. Hence a more thorough but economical method of correlating electron densities could be beneficial. We note that combined sixdimensional rotationaltranslational Patterson correlation (PC) approaches have recently been successful in solving difficult MR cases without using phase information (Chang & Lewis, 1997; Sheriff et al., 1999). However, these approaches are based on pointwise evaluations of the PC in (Chang & Lewis, 1997; Kissinger et al., 1999; Jamrog & Zhang, 2003) and hence appear not to be amenable to the type of correlation discussed here. Correlating electron densities may also be used to locate secondary structure fragments in density maps (Cowtan, 1998), and has been proposed as a novel way to search the Protein Data Bank (PDB) (Sussman et al., 1998) for candidate models (Cowtan, 2001). Similarly, protein–protein docking algorithms often use a threedimensional translational FFT to search for good docking orientations (KatchalskiKatzir et al., 1992; Vakser, 1995). However, in each of these examples the translational FFTs must be repeated for a large number of rotational orientations in order to cover the remaining three Hence, more efficient correlation techniques would be useful in these areas as well.
(MR) (Rossmann, 1990The FFT may also be used to accelerate rotational searches. For example, the MR fast rotation function represents a ). Vagin & Isupov (2001) adapted the fast rotation function to implement a phased rotation function (rotational density correlation) which allows a spherically averaged search model to be located in the using the phased translation function before solving its rotational orientation. However, this approach still requires rotational and translational correlations to be treated separately. More recently, Kovacs et al. (2003) employed an elegant Euler angle factorization in order to fit proteins into EM density maps using a fivedimensional rotational FFT. However, their approach entails calculating certain residual overlap integrals which must be computed numerically.
of selfvectors as an expansion of spherical Bessel and spherical harmonic functions and uses a twodimensional FFT to orient a search model with respect to the observed diffraction data (Navaza, 1987In our own areas of interest, namely protein docking and shape comparison, we developed a sixdimensional spherical polar Fourier (SPF) correlation approach (Ritchie & Kemp, 2000) to try to address the limitations of the Cartesian FFT docking methods. In the SPF representation, each threedimensional scalar property of interest is written as an expansion of real spherical harmonic plus orthonormal radial basis functions. This allows the correlation between a pair of proteins to be expressed naturally as a search over one translational and five rotational Because the rotational properties of the spherical harmonics have been described extensively elsewhere (see e.g. Rose, 1957; Biedenharn & Louck, 1981a), this article focuses on calculating translations for SPF expansions. Formerly, we calculated translations by numerical integration in the (r, θ) plane (Ritchie & Kemp, 2000). Here, we give analytic expressions for efficiently calculating highorder translation matrix elements for both harmonic oscillator (HO) and orthogonal Coulombtype radial basis functions using spherical Bessel transforms. The HO functions, sometimes referred to as Gaussiantype orbitals (GTOs), have long been used in calculations because of the ease of calculating overlap integrals between pairs of such functions (Boys, 1950). The Coulombtype functions, sometimes referred to as exponentialtype orbitals (ETOs) (Guseinov, 2001, 2002), have an exponential factor in place of the Gaussian and correspond to orthogonalized Slatertype atomic orbitals (STOs) (Barnett & Coulson, 1951) and Bfunctions (Filter & Steinborn, 1978), or equivalently Besseltype orbitals (BTOs) (Ozdogan et al., 2005).
In what follows, the abbreviations GTO and ETO will be used to refer to the radial functions used here for consistency with the recent quantum mechanics literature. Despite much prior work, there is still considerable interest in developing efficient and stable methods of calculating integrals over GTOs (Arakane & Matsuoka, 1999; Chiu & Moharerrzadeh, 1999), and especially STOs (Filter & Steinborn, 1980; Weniger & Steinborn, 1983; Hierse & Oppeneer, 1994; Ozdogan et al., 2005; Rico et al., 2005). Nonetheless, to our knowledge, the analytic expressions presented here to translate ETO basis functions and to relate ETOs to BTOs are novel. We also describe nonorthogonal expansions of GTOs and ETOs which give at least a twofold speedup compared with calculating translations in the orthogonal bases.
Our expressions have been tested for numerical accuracy up to order N = 32, which is sufficient for highresolution protein–protein docking correlations (Ritchie, 2003). We find it is necessary to use an extendedprecision arithmetic library to calculate highorder translation matrix elements analytically, although all subsequent calculations may be performed in ordinary doubleprecision arithmetic. However, because much of our docking algorithm has been described previously (Ritchie & Kemp, 2000), the calculations are illustrated here by superposing a pair of similar superantigen proteins. In contrast to multidimensional FFT approaches, our correlation is calculated as a combinatorial search over pairs of rotated and translated coefficient vectors. With loworder polynomial expansions there is no advantage in using an FFT, although for highorder correlations our algorithm can be accelerated using a onedimensional real fast Hartley transform (FHT) (Bracewell, 1999). In the example given, a good superposition is achieved using loworder expansions to N = 6, with a total calculation time of under 8 s for an exhaustive sixdimensional search on a 2 GHz Pentium Xeon processor. This corresponds to evaluating approximately 6 × 10^{6} trial orientations s^{−1}. This demonstrates that the SPF correlation technique offers a novel and powerful approach for problems that involve calculating realspace sixdimensional rotational–translational correlations.
2. Methods
2.1. Polar Fourier expansions
In the SPF approach, each scalar property of interest, A(r), is represented as an infinite expansion, truncated to order N as
where a_{nlm} are the expansion coefficients, = are real normalized spherical harmonics, and R_{nl}(r) are orthonormal GTO or ETO radial basis functions. Generally, we use GTOs to represent steric shape and the more diffuse ETOs to represent electrostatic properties. We follow the quantum chemistry convention in which the radial index n, or principal quantum number, counts from unity. Hence the highest harmonic order and highest polynomial power in any individual coordinate is L = N − 1. An expansion to order N involves N(N + 1)(2N + 1)/6 coefficients. These are calculated just once for each property as described previously (Ritchie & Kemp, 2000).
2.2. Overlap Integrals as translation matrix elements
First, we consider the correlation between a fixed `body' A, or scalar function A(r), and a moving body B, or function B(r), under an active translation of B by T = (R, 0, 0) along the positive z axis:
Substituting the expansions of A(r) and B(r) gives
where the shorthand notation etc. is used to indicate summation over the subscript ranges given in equation (1), and where r = and r′ = r − T = . In this case, and remain coincident, so the circular functions, , may be integrated out and equation (3) reduces to a sum over twodimensional integrals in the plane. Because these integrals clearly depend only on the distance R, we write
and interpret each as a matrix element of the translation operator. For example, from equation (4) it can be seen that the two sums
and (after relabeling the subscripts)
represent a positive translation of the body B, or equivalently a negative translation of the body A, respectively. The translation matrices are obviously fivedimensional quantities. However, because they do not depend on the sign of m (see below), it is useful to consider each matrix as being composed of = N twodimensional arrays, each indexed by = N(N + 1)/2 possible values for each pair of nl subscripts. The matrix elements vanish trivially where m > l. The notation used here is intended to be consistent with the usual convention for the complex and real spherical harmonic rotation matrix elements, and , respectively, in the sense that a positive z translation of the basis functions is expressed as
2.3. Spherical Bessel transform method of calculation
If the spherical Bessel transform of R_{nl}(r) is defined as
where j_{l}(z) is the spherical Bessel function, then the inverse transform is given by (Hochstadt, 1971):
It is shown in Appendix A that the translation matrix elements for SPF basis functions may be calculated as a sum over onedimensional inverse Bessel transforms
where the coefficients are given by
From the permutational symmetries of the second 3–j symbol, the righthand side is independent of the sign of m, hence justifying the use of m to label the matrix elements. Because the first 3–j symbol vanishes whenever l + l′ + k is odd, it can be seen that the nonvanishing coefficients are always real and that the summation in equation (10) need only be calculated for even increments: k = l − l′, l − l′ + 2, …, l + l′. By similar arguments (see Appendix A), it can also be shown that
Consequently, nearly half of all matrix elements can be found by symmetry. Given that the original basis functions form a complete orthonormal set, it is straightforward to show that the translation matrices are also orthonormal in the sense that
Evaluating this equation provides a convenient way to verify the following calculations.
2.4. GTO translation matrix elements
The normalized GTO radial functions are given by (Biedenharn & Louck, 1981b)
where = r^{2}/ with scale factor . For protein shape representations we set = 20. Throughout this article (x)_{n} = x(x + 1)…(x + n − 1) = (x + n)/(x) denotes a rising factorial and the substitution (n + 1/2) = (1/2)_{n} is frequently employed for numerical accuracy. The generalized Laguerre polynomials, , are defined by the binomial expansion (Erdelyi et al., 1953a)
Highorder polynomials may be calculated efficiently using the stable recursion
along with the identities = 1 and = + 1 − x. The GTO functions are eigenfunctions of the spherical Bessel transform (derived from Erdelyi et al., 1953b, p. 42, equation 3 therein):
where x^{2} = . Here, it is convenient to use equation (15) to expand equation (17) as a power series:
where serves as shorthand notation for and where the coefficients X_{nlj} are given by
Substituting equation (18) twice into equation (10) and collecting coefficients of x^{2k} using
(where is the Kronecker delta) gives for the GTO translation matrix elements:
Applying the relation (from Erdelyi et al., 1953b, p. 30, equation 13 therein)
then gives the final analytic result
where M = j + (l + l′ − k)/2.
2.5. ETO translation matrix elements
The normalized ETO radial functions are given by (Biedenharn & Louck, 1981b)
where = 2r with scale factor . We set = 1/2 for protein–protein electrostatic calculations (Ritchie & Kemp, 2000). Using an argument based on orthogonality, Keister & Polyzou (1997) recently proved that the spherical Bessel transform of these functions may be written in terms of the Jacobi polynomials, :
where s = /. Following a similar treatment to the GTO case, the shifted series expansion (Keister & Polyzou, 1997)
may be used to collect factors of 1/(s^{2} + 1) = (1 − t)/2 to write equation (25) as a power series:
where
Substituting equation (27) twice into equation (10) and collecting coefficients of 1/(s^{2} + 1)^{k} using
gives for the ETO translation matrix elements:
where M = (l + l′ − k)/2 and J = j + l + l′ + 2. It is shown in Appendix B that the remaining integral may be calculated as
where is a reduced Bessel function of the second kind. For halfintegral degree, these functions may be calculated using the recurrence relations (Weniger & Steinborn, 1983):
and
Thus, the ETO translation matrix elements may also be calculated analytically, although compared with the GTO basis an additional inner summation is necessary.
2.6. Nonorthogonal translation matrices
Translations of SPF expansions in both the GTO and ETO bases can be computed more economically by eliminating the inner summation on the subscript j in equations (23) and (30). This is equivalent to calculating overlap integrals that correspond to expansions of nonorthogonal radial basis functions. For example, substituting equation (18) into equation (10) and applying equation (22) directly gives the factorization
where each is an overlap integral in a nonorthogonal basis,
now with M = j + j′ + (l + l′ − k)/2. This corresponds to expanding R_{nl}(r) as a sum of nonorthogonal functions, :
It can be shown that these functions correspond to the nonorthogonal Laguerre–Gaussian basis proposed by Chiu & Moharerrzadeh (1999). With this factorization, translated expansion coefficients, a^{R}_{nlm}, in the original orthogonal basis may be calculated using the sequence
In a similar manner, ETO translations may be calculated in a nonorthogonal basis by substituting equation (27) into equation (10) and collecting powers of 1/(s^{2} + 1) directly to give
where the nonorthogonal matrix elements, , are given by
with M = (l + l′  k)/2 and now J = j + j′ + l + l′ + 2.
Finally, using equation (31) with M = 0 and J = l + j, it is straightforward to apply the inverse Bessel transform to equation (27) to give
where the are the well known Bfunctions, or BTOs (Filter & Steinborn, 1978). In other words, equation (43) shows that ETOs correspond to orthogonalized BTOs.
2.7. Correlation algorithm and complexity analysis
To illustrate an application of our approach, we demonstrate the superposition of a pair of similar proteins, A and B, with each protein initially centred at the origin and represented as a single threedimensional function [equation (1)], e.g. steric density or electrostatic charge. Letting and represent the pair of properties to be correlated, it is then convenient to express the correlation as
where represents an Euler rotation operator, , and where translates the rotated by R along the negative z axis. We could have chosen instead to translate the rotated along the positive z axis. We use an icosahedral tessellation of the sphere to generate nearregular rotational samples for each molecule (Ritchie & Kemp, 1999). If the proteins are similar, the translational part of the search will normally be small. In any case, we seek to find the maximum of the expansion
The real rotation matrix elements, , may be calculated efficiently using recursion formulae (Navaza, 1990; Ritchie & Kemp, 1999). However, the cost of computing equation (45) as it stands scales in the order of O(N^{7}) operations for each trial orientation. A much more efficient strategy is to compute the sum in stages using precalculated rotation and translation matrix elements. For example, vectors of coefficients representing different threedimensional orientations of molecule A may be calculated in O(N^{3}) × [O(N) + O(N^{2})] = O(N^{5}) operations per vector (orientation) using
and
Similarly, vectors of rotated instances of molecule B may be calculated in O(N^{3}) × O(N) = O(N^{4}) operations per orientation using
The final degree of freedom is a twist rotation about the z axis, which could be applied to molecule B as
However, it is more efficient to complete the correlation by iterating combinatorially over all pairs of the abovecalculated A and B orientations in O(N) × O(N^{2}) = O(N^{3}) operations per iteration using
and
followed by an inner iteration over using
which clearly scales as O(N) per trial orientation. The calculation of equation (53) for multiple angular samples, M, where M N, may be performed in O(M logM) operations by using a onedimensional FFT. However, because all quantities here are real, when N 16, we use a single real FHT instead of the complex FFT to obtain around a 10% speedup. Otherwise, it is faster to compute equation (53) explicitly for each value of in O(MN) time. Thus, despite the relatively high cost of rotating [O(N^{4})] and translating [O(N^{5})] individual threedimensional coefficient vectors, it can be seen that the above scheme reduces the cost of the combinatorial part of a sixdimensional correlation to just two inexpensive nested iterations, with the inner cycle costing only O(N) operations or less per trial orientation.
3. Results
3.1. Verification and numerical precision of expressions
In order to verify our implementation, numerical results from the analytic translation matrix element expressions were compared with those from a twodimensional numerical integration of equation (3) and a onedimensional integration of equation (10). The twodimensional integration was calculated over a regular 200 × 200 grid in the plane with r ranging from 0 to 50 Å. The onedimensional integration used 200 steps in in a lognumerical scheme using
with = and = . In both cases, increasing the number of integration steps or integration range gave a negligible effect on the numerical results. When calculating the angular coefficients [equation (11)] for the onedimensional integration, we found it became essential to use extendedprecision arithmetic for N 16 due to the many large factorials in Wigner's formula for the 3–j symbols (Biedenharn & Louck, 1981a). Hence an array of coefficients was precalculated using 256bit arithmetic using the GMP arbitrary precision math library (http://www.swox.com/gmp/ ). We note that Tuzan et al. (1998) describe a procedure for calculating 3–j symbols using only singleprecision arithmetic, but this requires significantly more programming than our current approach. Nonetheless, the remaining calculations do not require such highprecision arithmetic. For example, when using 128bit arithmetic, the analytic calculations agreed with the numerical integration results to within three decimal digits or more for all matrix elements up to N = 32 for all translations up to R = 9 Å in the GTO basis, and up to R = 90 Å in the ETO basis. As an additional check, the orthonormality of both the GTO and ETO translation matrices was verified by evaluating the orthogonality formula equation (13) at a reduced distance of = 0.1. This gave zero or unity, as appropriate, to within at least eight decimal digits up to N = 20, and to within at least two decimal digits for all translation matrices up to N = 32. Hence we are confident that our formulae and implementation are correct.
However, the above tests indicate that numerical precision can fall off significantly at high order. Table 1 presents further details on the overall computational cost and numerical precision for calculating GTO translations. Table 2 gives the corresponding results for the ETO basis. These tables show that lognumerical integration is the fastest method and that this gives better numerical precision than using 64bit (i.e. Fortran or C `double precision') arithmetic to evaluate the analytic expressions for highorder expansions. The tables also show that the numerical accuracy of the analytic calculations falls off markedly with highorder expansions unless highprecision arithmetic is used. This is presumably due to summing many terms of very different magnitudes. Overall, it can be seen that the ETO basis gives somewhat longer calculation times and slightly larger rounding errors than the corresponding GTO functions. The final columns of each table show that the nonorthogonal expansion method is at least twice as fast as using the orthogonal analytic formulae and gives equal or better precision. To achieve doubleprecision programming accuracy for calculations to N = 32, Tables 1 and 2 indicate that translation matrix elements should be calculated using 160bit and 192bit arithmetic for the GTO and ETO bases, respectively. Our correlation algorithm has the option to use either nonorthogonal expansions for `on the fly' translations or to calculate and store orthogonal translation matrices. In the latter case, the matrix elements may be stored and subsequently used in ordinary doubleprecision arithmetic.

3.2. Protein superposition correlation example
Fig. 1(A) shows some GTO steric density representations of a pair of globular protein domains, namely the superantigens Streptococcal pyrogenic exotoxin A1 (PDB code 1B1Z) (Papageorgiou et al., 1995) and the Staphylococcus aureus exotoxin SEC3 (PDB code 1 JCK) (Fields et al., 1996). These globular proteins have a relatively low sequence identity of 46%, but share a highly similar fold. Hence they may be superposed well by conventional leastsquares fitting of conserved C_{α} coordinates. However, in this illustration the superposition was performed by maximizing the overlap between the respective GTO steric density expansions using correlations to N = 6. A nearidentical superposition (not shown) was also achieved by correlating electrostatic charge densities in the ETO basis. Fig. 1(B) shows the corresponding backbone traces in the calculated superposition, along with the molecular surfaces from which the GTO expansions were derived. It can be seen that the highorder expansions capture the detailed shape of each protein remarkably well, although the loworder expansions still encode sufficient steric information to allow a very good global superposition to be calculated. The superposition shown was calculated by searching over some 21 × 10^{6} trial orientations generated from 162 icosahedral angular samples for each protein, 128 twist samples in , and 40 distance steps of 0.25 Å. After each cycle over the twist angle, the best overlap score was saved along with the corresponding coordinate values in a block of memory sufficient to store around 10^{6} orientations. As the search progressed, this memory block was periodically sorted and culled. Hence there is essentially no practical limit on the number of trial orientations that may be evaluated and subsequently stored. In this example, the total memory usage never exceeded 15 Mbyte and the overall correlation time was under 8 s on a 2 GHz Pentium Xeon processor. This corresponds to a rate of 6 × 10^{6} trial orientations s^{−1}, or just a few hundred arithmetic operations per orientation.
4. Discussion
Conceptually, our approach follows in the spirit of Crowther's original spherical harmonic plus spherical Bessel method of correlating Patterson maps (Crowther, 1972). However, by replacing the Bessel functions with a set of orthonormal radial basis functions which are eigenfunctions of the spherical Bessel transform, our SPF approach extends this rotationcentric representation in a way which also allows the effect of translations to be calculated analytically. We have presented efficient spherical Bessel transform methods of calculating translation matrix elements for both GTO and ETO radial basis functions. Although it is well known that the Laguerre–Gaussian functions are eigenfunctions of the spherical Bessel transform, we believe the methods of calculation and the expressions presented here, especially for the ETO basis, are straightforward yet novel. However, it is necessary to call upon a significant body of special function theory to derive analytic forms for the SPF translation matrices, and the resulting expressions require considerably more programming to implement than the FFT. Nonetheless, the computational `payoff' that follows is that once an initial transform into the SPF basis has been calculated for each molecular property of interest, a full sixdimensional correlation search may be performed using only the initial expansion coefficients. Because all subsequent expressions are entirely real, no inverse transform is required. On the other hand, computing highorder matrix elements is relatively expensive. However, this overhead can be alleviated by calculating nonorthogonal expansions or by precalculating and storing orthogonal translation matrices for subsequent use.
There are also close parallels between our technique and the complex fivedimensional FFTbased EM density fitting approach of Kovacs et al. (2003). However, their approach is currently limited to relatively loworder (L = 16) spherical harmonic expansions in order to stay within the memory limits of contemporary 32bit computers, and their use of discrete radial envelope functions requires the translational component of the correlation to be computed by twodimensional numerical integration (Kovacs et al., 2003). Unlike the translation matrix elements used here, these integrals cannot be precomputed as a oneoff cost because they depend intrinsically on the shapes to be compared.
In general, FFTbased correlation algorithms use a power of two for both the polynomial order of the representation and the number of search increments in each degree of freedom in the correlation. For example, for angular searches, a rotational FFT step size of at least 64 is desirable in order to give reasonably small search increments (e.g. 360°/64 5.6°). Similarly, Cartesian FFT docking algorithms often use grids of 64^{3} or 128^{3} elements in order to accommodate all possible translations of one protein about the other with a subÅ step size (KatchalskiKatzir et al., 1992). Hence, unless a truncation technique is employed (Segal & Eisenstein, 2005), most FFT docking algorithms implicitly use a very highorder polynomial representation for each protein. In contrast, in the SPF approach the polynomial order and search step size may easily be varied independently. In fact, because our polynomial order, N, is generally significantly less than the corresponding FFT order (say M) for any given coordinate, we typically have O(MN) ≲ O(M lnM). Thus, in practice, it is not a disadvantage that our correlation algorithm uses an FFT (actually a real FHT) in at most just one of the search coordinates. Indeed, by calculating and reusing vectors of rotated and translated expansion coefficients, the complexity analysis given above shows that the inner iteration of the combinatorial part of a sixdimensional search scales as O(N) or better per trial orientation.
Given the modest memory overhead of under 15 Mbyte of our loworder N = 6 superposition calculation (even our most exhaustive highorder N = 32 docking correlations use under 200 Mbyte), and considering that our superposition correlation time compares very favourably with the timings in Table I of Kovacs et al. (2003), it would seem that the SPF approach could offer significant computational advantages to both EM and MR applications. The straightforward analytic method of calculating ETO translation matrix elements presented here could also provide a useful new alternative to using BTO expansions to calculate STO overlap integrals in quantum mechanics. In principle, GTO and ETO expansions should give comparable results if suitable scale factors are chosen and if a sufficient number of terms are used. However, in practice, the ETOs decay much more gradually with distance than the GTOs. In our experience, the GTO basis works well for both superposing and docking globular protein domains, whereas the ETO basis is more appropriate for rapidly calculating electrostatic interactions (Ritchie & Kemp, 2000). However, ETOs might also be a good choice for correlating very large lowresolution macromolecular structures. The results here indicate that either type could be used to calculate e.g. realspace electron density correlations. Although the example presented here shows that SPF expansions have good computational properties, it is not yet clear how the algorithm might perform when the translational component is large or when very high angular resolution is required. It would also be interesting to test the algorithm on a wider range of examples. Hence a more extensive study of the utility of the approach is planned. Developing a fast and robust sixdimensional correlation algorithm would be useful in several areas including e.g. docking proteins into density maps, and searching the PDB for structural homologues.
5. Conclusion
Spherical Bessel transform formulae have been given for the analytic calculation of translation matrices for SPF expansions using GTO and ETO radial basis functions. To our knowledge, the formulae presented to translate ETO basis functions and to relate ETOs to BTOs are novel. It has been shown that translation matrices for both the GTO and ETO bases may be calculated accurately up to N = 32 using an extendedprecision arithmetic library. The calculation may be accelerated by over a factor of two by using nonorthogonal translation matrix elements, which give equal or better precision. Although our SPF correlation algorithm uses an FFT in only one of the search coordinates, this is not a practical disadvantage for relatively lowresolution superposition and docking correlations. On the contrary, the SPF approach is very economical in both CPU time and computer memory. By ordering the subexpressions according to computational cost, an exhaustive sixdimensional superposition correlation of a pair of protein shapes can be performed in a matter of seconds on a contemporary personal computer. This demonstrates the utility of the SPF approach. It is proposed that the techniques described here, which have been implemented in the protein docking and superposition program Hex (http://www.csd.abdn.ac.uk/hex/ ), may be of value in other fields, such as MR and EM, that need to calculate realspace sixdimensional correlations.
APPENDIX A
Derivation of equation (10)
The twocoordinate systems r = and r′ = r − T = may be related functionally by multiplying the vector equation
by an arbitrary complex vector and by exponentiating each side to give
Then, substituting k = , r = , r′ = , and T = into Raleigh's plane wave equation (Bransden & Joachain, 1997)
gives
where etc. are complex spherical harmonics. Here, each summation has an infinite range, subject to , etc. Using the identity = , multiplying both sides by , and integrating over gives
Substituting the standard formula for Gaunt's integral on the right gives
Now, the first 3–j symbol vanishes when l + s + k is odd, so the phase factor for the surviving terms is always real. Furthermore, the second 3–j symbol vanishes unless m + + = 0. Hence the expression may be reduced to a triple infinite sum by substituting j = m − t:
When the translation is only in the z direction, = 0 and either = 0 or = , and = , which entails t = m. This allows the summation over t to be eliminated, and the expression becomes valid for real as well as complex spherical harmonics. For a positive z translation, the harmonic Y_{k,0}( = 0, 0) reduces to
Then, collecting the angular factors, relabelling , and restricting the range of k according to the triangle rule for the 3–j symbols, gives the plane wave addition theorem:
where the angular coefficient is given by
For a negative translation, an additional factor of appears in the above expression because Y_{k,0}( = , 0) = (1)^{k} Y_{k,0} (0,0) and because is even.
Now, multiplying both sides of equation (63) by and integrating over [i.e. applying the inverse Bessel transform, equation (9)] gives
Then, multiplying each side by and integrating over all space in the corresponding variables gives
Finally, recognizing the integral in as the spherical Bessel transform of , the result reduces to
This generalizes the expression given by Danos & Maximon (1965) for translating multipole expansions to the more general case for arbitrary orthonormal radial functions, R_{nl}(r).
APPENDIX B
Derivation of equation (31)
Following Weniger & Steinborn (1983), but correcting a mistake in their working [here, equation (75) onwards], a closedform expression for the integral
may be obtained with the help of the basic relation (Erdelyi et al., 1953b, p. 24, equation 20 therein):
where is a general Bessel function of the first kind, and is a modified Bessel function of the second kind. When the degree is halfintegral, which is the case here, has a closed form:
However, this function has a singularity at the origin, so it is useful to use instead the reduced Bessel function
which is well behaved for all z. Equation (69) can then be cast in the desired form using the standard relation (Hochstadt, 1971):
Hence putting = + m and applying to each side of equation (69) gives
or in terms of the reduced Bessel function [equation (71)]
It can be shown (Weniger & Steinborn, 1983) that
Hence setting = 2 and substituting equation (75) into equation (74) gives
Here, is an integer so the rising factorial may be recast as
We also have + 2m and so to keep the degree of the Bessel function positive it is convenient to use the identity
Some further working then gives
Finally, putting = n + 1, = k + 1/2, a = 1, and replacing J_{k+1/2}(xy) by the corresponding spherical Bessel function gives
Acknowledgements
Part of this work was funded by the BBSRC (Grant Ref. 1/B10454).
References
Arakane, F. & Matsuoka, O. (1999). Int. J. Quant. Chem. 66, 273–279. Web of Science CrossRef Google Scholar
Barnett, M. P. & Coulson, C. A. (1951). Philos. Trans. R. Soc. London A, 243, 221–249. CrossRef Google Scholar
Biedenharn, L. C. & Louck, J. C. (1981a). Angular Momentum in Quantum Physics. Reading, MA: AddisonWesley. Google Scholar
Biedenharn, L. C. & Louck, J. C. (1981b). The Racah–Wigner Algebra in Quantum Theory. Reading, MA: AddisonWesley. Google Scholar
Boys, S. F. (1950). Proc. R. Soc. London A, 200, 542–554. CrossRef CAS Google Scholar
Bracewell, R. N. (1999). The Fourier Transform and its Applications, 3rd ed. New York: McGrawHill. Google Scholar
Bransden, B. H. & Joachain, C. J. (1997). Introduction to Quantum Mechanics. Harlow, UK: Addison Wesley Longman. Google Scholar
Chang, G. & Lewis, M. (1997). Acta Cryst. D53, 279–289. CrossRef CAS Web of Science IUCr Journals Google Scholar
Chiu, L. Y. C. & Moharerrzadeh, M. (1999). Int. J. Quant. Chem. 73, 265–273. Web of Science CrossRef CAS Google Scholar
Cowtan, K. (1998). Acta Cryst. D54, 750–756. Web of Science CrossRef CAS IUCr Journals Google Scholar
Cowtan, K. (2001). Acta Cryst. D57, 1435–1444. Web of Science CrossRef CAS IUCr Journals Google Scholar
Crowther, R. A. (1972). The Molecular Replacement Method, edited by M. G. Rossmann, pp. 173–178. New York: Gordon and Breach. Google Scholar
Danos, M. & Maximon, L. C. (1965). J. Math. Phys. 6, 766–778. CrossRef Web of Science Google Scholar
Erdelyi, A., Magnus, W., Oberhettinger, F. & Tricomi, F. G. (1953a). Higher Transcendental Functions, Vol 2. New York: McGrawHill. Google Scholar
Erdelyi, A., Magnus, W., Oberhettinger, F. & Tricomi, F. G. (1953b). Tables of Integral Transforms, Vol. 2. New York: McGrawHill. Google Scholar
Fields, B. A., Malchiodi, E. L., Li, H., Ysern, X., Stauffacher, C. V., Schlievert, P. M., Karjalainen, K. & Mariuzza, R. A. (1996). Nature (London), 384, 188–192. CrossRef CAS PubMed Web of Science Google Scholar
Filter, E. & Steinborn, O. (1978). Phys. Rev. A, 18, 1–11. CrossRef CAS Web of Science Google Scholar
Filter, E. & Steinborn, O. (1980). J. Math. Phys. 21, 2725–2736. CrossRef Web of Science Google Scholar
Frank, J. (2002). Ann. Rev. Biophys. Biolmol. Struct. 31, 303–319. Web of Science CrossRef CAS Google Scholar
Fujinaga, M. & Read, R. J. (1987). J. Appl. Cryst. 20, 517–521. CrossRef Web of Science IUCr Journals Google Scholar
Guseinov, I. I. (2001). Int. J. Quant. Chem. 81, 126–129. CrossRef CAS Google Scholar
Guseinov, I. I. (2002). Int. J. Quant. Chem. 90, 114–118. Web of Science CrossRef CAS Google Scholar
Hierse, W. & Oppeneer, P. M. (1994). Int. J. Quant. Chem. 52, 1249–1265. CrossRef CAS Web of Science Google Scholar
Hochstadt, H. (1971). The Functions of Mathematical Physics. New York: Wiley. Google Scholar
Jamrog, D. C., Zhang, Y. & Philips, G. N. Jr (2003). Acta Cryst. D59, 304–314. Web of Science CrossRef CAS IUCr Journals Google Scholar
KatchalskiKatzir, E., Shariv, I., Eisenstein, M., Friesem, A. A. & Aflalo, C. (1992). Proc. Natl Acad. Sci. 89, 2195–2199. CrossRef PubMed CAS Web of Science Google Scholar
Keister, B. D. & Polyzou, W. N. (1997). J. Comput. Phys. 134, 231–235. CrossRef CAS Web of Science Google Scholar
Kissinger, C. R., Gelhaar, D. K. & Fogel, D. B. (1999). Acta Cryst. D55, 484–491. Web of Science CrossRef CAS IUCr Journals Google Scholar
Kovacs, J. A., Chacon, P., Cong, Y., Metwally, E. & Wriggers, W. (2003). Acta Cryst. D59, 1371–1376. Web of Science CrossRef CAS IUCr Journals Google Scholar
Navaza, J. (1987). Acta Cryst. A43, 645–653. CrossRef Web of Science IUCr Journals Google Scholar
Navaza, J. (1990). Acta Cryst. A46, 619–620. CrossRef Web of Science IUCr Journals Google Scholar
Navaza, J. (2001). Acta Cryst. D57, 1367–1372. Web of Science CrossRef CAS IUCr Journals Google Scholar
Ozdogan, T., Orbay, M. & Degirmenci, S. (2005). J. Math. Chem. 37, 27–35. Web of Science CrossRef Google Scholar
Papageorgiou, A. C., Collins, C. M., Gutman, D. M., Kline, J. B., O'Brien, S. M., Tranter, H. S. & Acharya, K. R. (1995). EMBO J. 18, 9–21. Web of Science CrossRef Google Scholar
Read, R. J. & Schierbeek, A. J. (1988). J. Appl. Cryst. 21, 490–495. CrossRef CAS Web of Science IUCr Journals Google Scholar
Rico, J. F., López, R., Ema, I. & Ramírez, G. (2005). J. Comput. Chem. 26, 846–855. Web of Science CrossRef PubMed CAS Google Scholar
Ritchie, D. W. (2003). Proteins Struct. Func. Genet. 52, 98–106. Web of Science CrossRef CAS Google Scholar
Ritchie, D. W. & Kemp, G. J. L. (1999). J. Comput. Chem. 20, 383–395. Web of Science CrossRef CAS Google Scholar
Ritchie, D. W. & Kemp, G. J. L. (2000). Proteins Struct. Func. Genet. 39, 178–194. CrossRef CAS Google Scholar
Rose, M. E. (1957). Elementary Theory of Angular Momentum. New York: Wiley. Google Scholar
Roseman, A. M. (2000). Acta Cryst. D56, 1332–1340. Web of Science CrossRef CAS IUCr Journals Google Scholar
Rossmann, M. G. (1990). Acta Cryst. A46, 73–82. CrossRef CAS Web of Science IUCr Journals Google Scholar
Rossmann, M. G. (2000). Acta Cryst. D56, 1341–1349. Web of Science CrossRef CAS IUCr Journals Google Scholar
Rossmann, M. G. (2001). Acta Cryst. D57 1360–1366. Web of Science CrossRef CAS IUCr Journals Google Scholar
Segal, D. & Eisenstein, M. (2005). Proteins Struct. Func. Bioinf. 59, 580–591. Web of Science CrossRef CAS Google Scholar
Sheriff, S., Klei, H. E. & Davis, M. E. (1999). J. Appl. Cryst. 32, 98–101. Web of Science CrossRef CAS IUCr Journals Google Scholar
Sussman, J. L., Lin, D., Jiang, J., Manning, N. O., Prilusky, J., Ritter, O. & Abola, E. E. (1998). Acta Cryst. D54, 1078–1084. Web of Science CrossRef CAS IUCr Journals Google Scholar
Tuzan, R. E., Burkhardt, P. & Secrest, D. (1998). Comput. Phys. Commun. 112, 112–148. Web of Science CrossRef Google Scholar
Vagin, A. A. & Isupov, M. N. (2001). Acta Cryst. D57, 1451–1456. Web of Science CrossRef CAS IUCr Journals Google Scholar
Vakser, I. A. (1995). Protein Eng. 8, 371–377. CrossRef CAS PubMed Web of Science Google Scholar
Weniger, E. J. & Steinborn, E. O. (1983). J. Chem. Phys. 78, 6121–6132. CrossRef CAS Web of Science Google Scholar
© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited. For more information, click here.