
Volume 69 
Part 1 
Pages 98-107  
January 2013  

Received 25 May 2012
Accepted 25 September 2012
Online 14 November 2012

On the use of the C map in Patterson deconvolution procedures

aIstituto di Cristallografia, CNR, Via G. Amendola 122/O, 70126, Bari, Italy, and bDipartimento di Chimica, Università della Basilicata, 85100, Potenza, Italy
Correspondence e-mail: carmelo.giacovazzo@ic.cnr.it

The cross-correlation function between the target and a model electron density, denoted as the C map, has been crystallographically characterized. In particular, a study of its interatomic vectors and of their relation with the Patterson vectors has been undertaken. Since the C map is not available during the phasing process, the C' map, its centric modification, is considered. It may be computed at any stage of the phasing process and shows properties that are very useful for the crystal structure determination process. It has been combined with the implication transformation method and with vector-superposition techniques for performing the Patterson deconvolution and obtaining an initial model for dual-space recycling. While Patterson methods are traditionally considered to be more efficient for structures containing heavy atoms, the C map extends their potential to light-atom structures (i.e. containing atoms not heavier than O).

1. Notation

N, Np: number of atoms in the unit cell of the target structure (the one we want to phase) and of the model structure, respectively.

F, E, Fp, Ep: structure factors and normalized structure factors of the target and of the model structure, respectively.

[\textstyle\sum_N {} = \textstyle\sum_{j = 1}^N {f_j^2}, \textstyle\sum_{{N_p}} = \textstyle\sum_{j = 1}^{{N_p}} {f_j^2}]: fj is the scattering factor of the jth atom, thermal factor included.

Ii(x): modified Bessel function of order i.

[D = \langle \cos (2\pi {\bf h}\Delta {\bf r}) \rangle]: the average is performed per resolution shell.

[{\sigma _A} = D(\Sigma _{N_p}/{\Sigma _N})^{1/2}.]

[{\bf C} \equiv ({\bf R},{\bf T})]: symmetry operator. R and T are the rotation and the translation matrix, respectively.

2. Introduction

In a recent paper (Carrozzini et al., 2010[Carrozzini, B., Cascarano, G. L. & Giacovazzo, C. (2010). J. Appl. Cryst. 43, 221-226.]) the cross-correlation function [C({\bf r})] was crystallographically characterized. We briefly recall its definition. Given two functions f(x) and g(x), their cross-correlation function is defined by

[C(y) = f(x) \otimes g(x) = \textstyle\int\limits_{ - \infty }^{ + \infty } {f^*}(x)g(x + y)\,{\rm d}x, \eqno (1)]

where the star indicates the complex conjugate. The cross correlation is both associative and distributive but not commutative: owing to the classical Wiener-Khinchin theorem, it satisfies the relations

[T[\,f(x) \otimes g(x)] = {[Tf(x)]^*}[Tg(x)] \eqno (2)]

and

[T[g(x) \otimes f(x)] = {[Tg(x)]^*}[Tf(x)]. \eqno (3)]

The right-hand sides of equations (2)[link] and (3)[link] are complex conjugates.
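Relations (2) and (3) can be checked numerically on a discrete, periodic analogue of equation (1). The sketch below is only an illustration: the sequences f and g are made-up values, the integral is replaced by a sum over one period, and the helper names dft and cross_correlation are not taken from the paper.

```python
import cmath

def dft(x):
    """Discrete Fourier transform T, using the exp(+2*pi*i*h*k/n) kernel."""
    n = len(x)
    return [sum(x[k] * cmath.exp(2j * cmath.pi * h * k / n) for k in range(n))
            for h in range(n)]

def cross_correlation(f, g):
    """Periodic analogue of equation (1): C(y) = sum_x conj(f(x)) * g(x + y)."""
    n = len(f)
    return [sum(complex(f[x]).conjugate() * g[(x + y) % n] for x in range(n))
            for y in range(n)]

# Two arbitrary real sequences standing in for sampled densities (made-up values).
f = [1.0, 2.0, 0.0, 0.5, 0.0, 3.0]
g = [0.0, 1.0, 1.5, 0.0, 2.0, 0.0]

Tf, Tg = dft(f), dft(g)
lhs_fg = dft(cross_correlation(f, g))                  # T[f (x) g]
rhs_fg = [a.conjugate() * b for a, b in zip(Tf, Tg)]   # [Tf]* [Tg], equation (2)
lhs_gf = dft(cross_correlation(g, f))                  # T[g (x) f], equation (3)

# Largest deviation from equation (2), and from the complex-conjugate relation
# between the right-hand sides of equations (2) and (3).
err_eq2 = max(abs(a - b) for a, b in zip(lhs_fg, rhs_fg))
err_conj = max(abs(a - b.conjugate()) for a, b in zip(lhs_fg, lhs_gf))
print(err_eq2, err_conj)  # both are expected to be at rounding-error level
```

The same identities hold with the opposite sign in the Fourier kernel, provided one convention is used throughout.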

The above definition was applied to the following two functions: the electron density,

[\rho ({\bf{r}}) = \textstyle\sum\limits_{j = 1}^N {{\rho _j}({\bf r} - {{\bf r}_j})} = (1/ V)\textstyle\sum\limits_{\bf h} {F_{\bf h}}\exp (- 2\pi i{\bf h} \cdot {\bf r}), \eqno (4)]

(from now on referred to as the density of the target structure, because we are interested in knowing this), and the model density (presumed to be known),

[{\rho _p}({\bf r}) = \textstyle\sum\limits_{j = 1}^{{N_p}} {{\rho _{{p_j}}}({\bf r} - {\bf r}^\prime_j)} = (1 / V)\textstyle\sum\limits_{\bf h} {F_{p{\bf h}}}\exp (- 2\pi i{\bf h} \cdot {\bf r}). \eqno (5)]

[{\rho _j}] and [{\rho _{{p_j}}}] are atomic electron densities, the first centered on [{{\bf r}_j}], the second centered on [{\bf r}{'_j} = {{\bf r}_j} + \Delta {{\bf r}_j}]. If Np/N is sufficiently high and the [\Delta {{\bf r}_j}]'s are sufficiently small, the two functions [\rho ({\bf r})] and [{\rho _p}({\bf r})] are highly correlated with each other. The target and model structure are assumed to show the same space-group symmetry.

According to equation (1)[link] we have

[C({\bf u}) = \rho ({\bf{r}}) \otimes {\rho _p}({\bf{r}}) = \textstyle\int\limits_S {} \rho ({\bf{r}}){\rho _p}({\bf{r}} + {\bf{u}})\,{\rm d}{\bf{r}} \eqno (6)]

and, in accordance with relations (2)[link] and (3)[link],

[C({\bf{u}}) = (1 / V)\textstyle\sum\limits_{\bf{h}} |{F_{\bf{h}}}{F_{p{\bf{h}}}}|\exp i({\varphi _{\bf{h}}} - {\varphi _{p{\bf{h}}}})\exp (- 2\pi i{\bf{h}} \cdot {\bf{u}}), \eqno (7)]

where [{F_{\bf{h}}}=|{F_{\bf{h}}}|\exp (i{\varphi _{\bf{h}}})] and [{F_{p{\bf{h}}}} = |{F_{p{\bf{h}}}}|\exp (i{\varphi _{p{\bf{h}}}})] are the structure factors of [\rho ({\bf{r}})] and [{\rho _p}({\bf{r}})], respectively. [|{F_{\bf{h}}}{F_{p{\bf{h}}}}|\exp i({\varphi _{\bf{h}}} - {\varphi _{p{\bf{h}}}})] is the Fourier coefficient of [C({\bf{u}})]: since it is a complex number, [C({\bf{u}})] is acentric [[C({\bf{u}})] is centric only if both [\rho ({\bf{r}})] and [{\rho _p}({\bf{r}})] are centric]. It was shown by Carrozzini et al. (2010[Carrozzini, B., Cascarano, G. L. & Giacovazzo, C. (2010). J. Appl. Cryst. 43, 221-226.]) that the space group of the C function is the symmorphic variant of the space group of the target structure (e.g. P222 as opposed to P212121).

If the model and target structures are correlated then:

(a) [{\varphi _{\bf{h}}} \simeq {\varphi _{p{\bf{h}}}}]. Then the C map will show a peak at the origin, the amplitude of which increases with the correlation:

[C({\bf{0}}) = (1 / V)\textstyle\sum\limits_{\bf{h}}|{F_{\bf{h}}}{F_{p{\bf{h}}}}|\exp i({\varphi _{\bf{h}}} - {\varphi _{p{\bf{h}}}}). \eqno (8)]

(b) If [{N_p} \to N] and [\langle |\Delta {{\bf{r}}_j}| \rangle \to 0], then [{\varphi _{p{\bf{h}}}} \to {\varphi _{\bf{h}}}] and [C({\bf{u}}) \to P({\bf{u}})], where [P({\bf{u}})] is the Patterson function (Patterson, 1934a[Patterson, A. L. (1934a). Phys. Rev. 45, 763.],b[Patterson, A. L. (1934b). Phys. Rev. 46, 372-376.]).

The map [C({\bf{u}})] cannot be computed during the phasing process, essentially because the [{\varphi _{\bf{h}}}]'s are unknown. Fortunately, the approximating function [C'({\bf{u}})], given by

[C'({\bf{u}}) = (1 /V)\textstyle\sum\limits_{\bf{h}} {m_{\bf{h}}}|{F_{\bf{h}}}{F_{p{\bf{h}}}}|\exp (- 2\pi i{\bf{h}} \cdot {\bf{u}}), \eqno (9)]

is easily computable. In equation (9)[link] [m = \langle \cos (\varphi - {\varphi _p}) \rangle] = I1(X)/I0(X) and [X = 2{\sigma _A}|E{E_p}|/(1 - \sigma _A^2)] (Sim, 1959[Sim, G. A. (1959). Acta Cryst. 12, 813-815.]; Srinivasan & Ramachandran, 1965[Srinivasan, R. & Ramachandran, G. N. (1965). Acta Cryst. 19, 1008-1014.]; Read, 1986[Read, R. J. (1986). Acta Cryst. A42, 140-149.]). [C'({\bf{u}})] is a useful approximation to [C({\bf{u}})] and shows remarkable properties: the Fourier coefficients of [C'({\bf{u}})] are real numbers, and consequently the space group of [C'({\bf{u}})] is centric, so coinciding with the Patterson space group (e.g., Pmmm if the space group of the target is P212121).
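The weight m entering equation (9) can be evaluated without special-function libraries. The following minimal sketch computes I1(X)/I0(X) from the integral representation of the modified Bessel functions, sampled uniformly over one period; the function names bessel_ratio and weight_m, the number of sampling points and the numerical inputs are assumptions made for illustration only.

```python
import math

def bessel_ratio(x, n_points=4096):
    """I1(x)/I0(x) from I_n(x) = (1/pi) * integral_0^pi exp(x cos t) cos(n t) dt.

    The integrand is rescaled by exp(-x) so that large x does not overflow;
    the common factor cancels in the ratio.
    """
    num = 0.0
    den = 0.0
    for k in range(n_points):
        t = 2.0 * math.pi * k / n_points
        w = math.exp(x * (math.cos(t) - 1.0))
        num += w * math.cos(t)
        den += w
    return num / den

def weight_m(e_obs, e_calc, sigma_a):
    """m = I1(X)/I0(X) with X = 2 sigma_A |E Ep| / (1 - sigma_A^2); needs sigma_A < 1."""
    x = 2.0 * sigma_a * abs(e_obs * e_calc) / (1.0 - sigma_a ** 2)
    return bessel_ratio(x)

# Hypothetical reflections: a poor model with weak amplitudes, a good model with strong ones.
m_weak = weight_m(0.5, 0.5, 0.3)
m_strong = weight_m(2.0, 2.0, 0.9)
print(m_weak, m_strong)
```

As expected from the definition, m tends to zero when the model carries no information (sigma_A close to zero) and approaches unity for strong reflections and a good model.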

It was observed by Carrozzini et al. (2010[Carrozzini, B., Cascarano, G. L. & Giacovazzo, C. (2010). J. Appl. Cryst. 43, 221-226.]) that since both [\rho ({\bf{r}})] and [{\rho _p}({\bf{r}})] are non-negative definite functions, [C({\bf{u}})] and therefore [C'({\bf{u}})] are also non-negative definite. The map [C'({\bf{u}})] may therefore be suitably modified and Fourier inverted, as in the usual electron-density modification (EDM) procedures (Cowtan, 1999[Cowtan, K. (1999). Acta Cryst. D55, 1555-1567.]; Abrahams, 1997[Abrahams, J. P. (1997). Acta Cryst. D53, 371-376.]; Abrahams & Leslie, 1996[Abrahams, J. P. & Leslie, A. G. W. (1996). Acta Cryst. D52, 30-42.]; Refaat & Woolfson, 1993[Refaat, L. S. & Woolfson, M. M. (1993). Acta Cryst. D49, 367-371.]; Giacovazzo & Siliqi, 1997[Giacovazzo, C. & Siliqi, D. (1997). Acta Cryst. A53, 789-798.]), so leading to better estimates of the invariants [({\varphi _{\bf{h}}} - {\varphi _{p{\bf{h}}}})]. The first applications of the procedure showed that it is able to drive to convergence sets of phases that are far away from the correct values.

In this paper we suggest applying the [C'] map to the Patterson deconvolution process, with particular interest in its integration with implication transformation (Simpson et al., 1965[Simpson, P. G., Dobrott, R. D. & Lipscomb, W. N. (1965). Acta Cryst. 18, 169-179.]; Pavelcík et al., 1992[Pavelcík, F., Kuchta, L. & Sivý, J. (1992). Acta Cryst. A48, 791-796.]) and superposition methods (Buerger, 1959[Buerger, M. J. (1959). Vector Space, ch. 11. New York: Wiley.]; Richardson & Jacobson, 1987[Richardson, J. W. & Jacobson, R. A. (1987). Patterson and Pattersons, edited by J. P. Glusker, B. K. Patterson & M. Rossi, pp. 310-317. Oxford University Press.]; Sheldrick, 1992[Sheldrick, G. M. (1992). Crystallographic Computing 5, edited by D. Moras, A. D. Podjarny & J. C. Thierry, pp. 145-157. Oxford University Press.]).

Implication transformation and Patterson superposition methods are traditionally applied to structures containing heavy atoms. It has also been shown (Pavelcík, 1988[Pavelcík, F. (1988). Acta Cryst. A44, 724-729.]; Pavelcík & Pivovarcikova, 2002[Pavelcík, F. & Pivovarcikova, O. (2002). J. Appl. Cryst. 35, 526-532.]) that they may be successfully applied to solve small (e.g., less than 80 atoms in the asymmetric unit) light-atom structures (i.e., no atoms heavier than O). No attempt has been made so far to solve medium-size (i.e., from 80 to 400 non-H atoms in the asymmetric unit) light-atom structures. In this paper we:

(a) characterize the interatomic peak distribution in a C' map and its relationship with Patterson interatomic peaks;

(b) describe a new procedure for ab initio phasing, involving the C' map rather than the Patterson map; and

(c) apply the new procedure to a large set of small- and medium-size structures, so demonstrating the greater efficiency of the new method.

We will not consider in this paper the application of the C' map to proteins, because it requires the use of some supplementary filtering techniques and will be described in a separate paper. However, as a general consequence of the results described here, Patterson deconvolution techniques, possibly integrated with the use of the C' map, should be considered as the most versatile tool for the solution of the crystallographic phase problem. This conclusion is supported by the following considerations: they succeed even in the case of light-atom structures (as shown in this paper), in the case of powder data (Burla et al., 2007[Burla, M. C., Caliandro, R., Carrozzini, B., Cascarano, G. L., De Caro, L., Giacovazzo, C., Polidori, G. & Siliqi, D. (2007). J. Appl. Cryst. 40, 834-840.]), and in the case of large proteins containing heavy atoms (up to 7900 atoms in the asymmetric unit with a data resolution of 1.65 Å, or also at 1.92 Å resolution for a protein with about 1300 non-H atoms in the asymmetric unit) (Caliandro et al., 2008[Caliandro, R., Carrozzini, B., Cascarano, G. L., De Caro, L., Giacovazzo, C., Mazzone, A. & Siliqi, D. (2008). J. Appl. Cryst. 41, 548-553.]).

3. Patterson deconvolution and implication transformations

To clarify the role of C' as a powerful substitute for the P map, we recap the most advanced procedures for deconvolving the Patterson function by implication transformation and superposition methods. The typical procedure may be summarized in three steps:

(i) Given the symmetry operator [{{\bf{C}}_s}], the related implication transformation Is(r) is calculated: it is a function of the atomic position r defined over the corresponding Harker section.

[{I_s}\left({\bf{r}} \right) = {{P\left({{\bf{r}} - {{\bf{C}}_s}{\bf{r}}} \right)} /{{n_s}}}, \eqno (10)]

where P is the Patterson function and ns is the number of symmetry operators that give rise to the same Harker section (Harker, 1936[Harker, D. (1936). J. Chem. Phys. 4, 381-390.]).

(ii) The symmetry minimum function is calculated, given by

[{\rm SMF}\left({\bf{r}} \right) = \mathop {\rm min}\limits_{s = 1}^{\overline m} [{I_s}\left({\bf{r}} \right)], \eqno (11)]

where min indicates that SMF assumes in r the minimum among the values of the [\overline m] independent functions [{I_s}\left({\bf{r}} \right)].

(iii) The largest SMF peaks are used in turn to calculate the minimum superposition function

[S({\bf{r}}) = {\rm min}[P({\bf{r}} - {{\bf{r}}_q}), {\rm SMF}({\bf{r}})]. \eqno (12)]

Sometimes more than one superposition vector is used, according to

[S({\bf{r}}) = {\rm min}[P({\bf{r}} - {{\bf{r}}_q}),\ldots, P({\bf{r}} - {{\bf{r}}_n}), {\rm SMF}({\bf{r}})]. \eqno (13)]

From now on we will refer to the vectors [{{\bf{r}}_q},\ldots,{{\bf{r}}_n}] as pivot vectors, because they regulate the map overlap. As stated in §2[link], the above techniques have recently been revisited, with a dramatic increase in efficiency (Caliandro et al., 2008[Caliandro, R., Carrozzini, B., Cascarano, G. L., De Caro, L., Giacovazzo, C., Mazzone, A. & Siliqi, D. (2008). J. Appl. Cryst. 41, 548-553.]). The key to the improvements lies in the following two additional steps:

(iv) Efficient filtering algorithms are applied to break down the additional crystallographic symmetry present in the function SMF and the residual Patterson symmetry in the S(r) map.

(v) Subsequent cycles of electron density modification-difference electron density modification (EDM-DEDM) are automatically applied to the current maps to obtain higher-quality model maps.
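Steps (i)-(iii) can be sketched on a toy point-atom structure. The example below is an illustration under stated assumptions, not the authors' implementation: a two-dimensional cell sampled on a 24 x 24 grid, plane group p2mm, ns = 1 for every operator, made-up atomic weights and positions, and a pivot fixed at the known heavy-atom site. Grid points whose Harker vector coincides with the Patterson origin peak are skipped, which is a simplification introduced here. The filtering and EDM-DEDM steps (iv) and (v) are not modelled.

```python
from itertools import product

N = 24  # grid points per cell edge (toy value)

# Plane group p2mm: identity, (-x, y), (x, -y), (-x, -y), stored as sign pairs.
OPS = [(1, 1), (-1, 1), (1, -1), (-1, -1)]

def apply(op, r):
    return ((op[0] * r[0]) % N, (op[1] * r[1]) % N)

def diff(a, b):
    return ((a[0] - b[0]) % N, (a[1] - b[1]) % N)

def patterson(atoms):
    """Point-atom Patterson map: weight w_i * w_j at every vector r_j - r_i."""
    p = {}
    for (ri, wi), (rj, wj) in product(atoms, repeat=2):
        u = diff(rj, ri)
        p[u] = p.get(u, 0.0) + wi * wj
    return p

def symmetry_minimum_function(p):
    """Equations (10)-(11) with n_s = 1; only positive values are stored."""
    out = {}
    for r in product(range(N), repeat=2):
        harker = [diff(r, apply(op, r)) for op in OPS[1:]]
        if (0, 0) in harker:
            continue  # Harker vector on the origin peak: skipped (assumption)
        value = min(p.get(u, 0.0) for u in harker)
        if value > 0.0:
            out[r] = value
    return out

def minimum_superposition(p, smf, pivot):
    """Equation (12): S(r) = min[P(r - r_q), SMF(r)], kept where positive."""
    out = {}
    for r, s in smf.items():
        value = min(p.get(diff(r, pivot), 0.0), s)
        if value > 0.0:
            out[r] = value
    return out

# Hypothetical asymmetric unit: one heavy atom and two light atoms, general positions.
asym = [((3, 5), 16.0), ((7, 2), 6.0), ((10, 9), 6.0)]
atoms = [(apply(op, r), w) for r, w in asym for op in OPS]

p_map = patterson(atoms)
smf_map = symmetry_minimum_function(p_map)
pivot = (3, 5)  # known heavy-atom site; in practice taken from the largest SMF peaks
s_map = minimum_superposition(p_map, smf_map, pivot)
print(len(p_map), len(smf_map), len(s_map))  # number of peaks in P, SMF and S
```

In this ideal setting every atomic site survives in S(r), together with whatever false peaks are common to the shifted Patterson map and the SMF.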

In the next sections we will describe a simple algebra showing the potential advantages obtainable by replacing the Patterson by the function C' in step (iii) of the algorithm described above. The new algorithm for Patterson deconvolution will be described in §6[link] and its practical applications in §7[link].

4. C, C' and P maps

Let us analyze, in the C, C' and P maps, those features of the interatomic vectors that are particularly relevant for the application of implication transformation and superposition methods. In the algebraic calculations below we will emphasize the role of the heavy atoms, because they are often part of the model structure. The relations we will obtain are, however, quite general (the reader can set the number of heavy atoms equal to zero in our expressions if the structure is composed only of light atoms). We will suppose that:

(a) the target unit cell contains N atoms, NH of which are heavy atoms;

(b) the positions of the heavy atoms are known, with negligible errors, and constitute the model;

(c) [{{\bf{r}}_{Hi}}], [i = 1,\ldots, NH] are the heavy-atom positions and [{{\bf{r}}_{l\nu }}] are the light-atom positions. We order the light atoms in such a way that they follow the heavy-atom list (i.e., [\nu] goes from NH + 1 to N).

We want to obtain, under the above hypotheses, a more complete model of the target structure by replacing, in equations (12)[link] and (13)[link], [P({\bf{r}} - {{\bf{r}}_q})] by [C({\bf{r}} - {{\bf{r}}_q})] or [C'({\bf{r}} - {{\bf{r}}_q})].

For ideal diffraction data, Patterson peak positions will be the union of the following three sets:

[\eqalignno{&\{ {{\bf{r}}_{Hi}} - {{\bf{r}}_{Hj}},\ {\rm{ }}i,j = 1, \ldots, NH\} &(14a)\cr &\{ \pm ({{\bf{r}}_{Hi}} - {{\bf{r}}_{l\nu }}),\ i = 1,\ldots NH,\ \nu = NH + 1, \ldots, N\} & (14b)\cr &\{{{\bf{r}}_{l\nu }} - {{\bf{r}}_{l\mu }},\ \nu, \mu = NH + 1, \ldots, N\}. &(14c)}]

In this order they correspond to heavy-heavy, heavy-light and light-light atom distances.

Let us now shift the Patterson map by the pivot vector [{{\bf{r}}_{Hq}}]: we should obtain a noisy image of the structure. The following peak sets arise:

[\eqalignno{&\{{{\bf{r}}_{Hi}} - {{\bf{r}}_{Hj}} + {{\bf{r}}_{Hq}},\ i,j = 1, \ldots, NH\} & (15a)\cr &\{ \pm ({{\bf{r}}_{Hi}} - {{\bf{r}}_{l\nu }}) + {{\bf{r}}_{Hq}},\ i = 1, \ldots NH,\ \nu = NH + 1, \ldots, N\} &\cr &&(15b)\cr &\{ {{\bf{r}}_{l\nu }} - {{\bf{r}}_{l\mu }} + {{\bf{r}}_{Hq}},\ {\rm{ }}\nu, \mu = NH + 1, \ldots, N\}. & (15c)\cr}]

Emphasizing the case j = q for the subset (15a)[link] and the case i = q for the subset (15b)[link] allows us to rewrite the peaks (15)[link] as

[\eqalignno{&\{ {{\bf{r}}_{Hi}},\ i = 1, \ldots, NH\semi\ {{\bf{r}}_{Hi}} - {{\bf{r}}_{Hj}} + {{\bf{r}}_{Hq}},\ i,j = 1, \ldots, NH,\cr&\quad j \ne q\} &(16a)\cr &\{ {{{\bf{r}}_{l\nu }}}, {\rm{ }}\nu = NH + 1, \ldots, N \semi\ ({{\bf{r}}_{l\nu }} - {{\bf{r}}_{Hi}}) + {{\bf{r}}_{Hq}},\ i = 1, \ldots , NH,\cr&\quad i \ne q,\ \nu = NH + 1, \ldots, N\semi\ ({{\bf{r}}_{Hi}} - {{\bf{r}}_{l\nu }}) + {{\bf{r}}_{Hq}},\cr&\quad i = 1,\ldots\,NH,\ \nu = NH + 1, \ldots, N \} & (16b)\cr &\{{{\bf{r}}_{l\nu }} - {{\bf{r}}_{l\mu }} + {{\bf{r}}_{Hq}},\ \nu, \mu = NH + 1,\ldots, N\}. &(16c)}]

If the peaks (16)[link] are overlapped with the SMF map, the set (16a)[link] would provide the heavy-atom substructure plus noise, the set (16b)[link] would generate all the light-atom positions plus noise, and the set (16c)[link] would produce only noise.

Let us now describe what we should obtain if the C map is shifted by the pivot vector [{{\bf{r}}_{Hq}}]. The interatomic vectors present in the C map will be the union of two sets: according to the chosen enantiomorph [which depends on whether in equation (7)[link] we use [\exp i({\varphi _{\bf{h}}} - {\varphi _{p{\bf{h}}}})] or [\exp i({\varphi _{p{\bf{h}}}} - {\varphi _{\bf{h}}})]] the following sets arise (see Appendix A[link]):

[\eqalignno{&\{ {{\bf{r}}_{Hi}} - {{\bf{r}}_{Hj}},\ i,j = 1, \ldots, NH\} \cup \{ {{\bf{r}}_{l\nu }} - {{\bf{r}}_{Hi}},\ i = 1, \ldots, NH,\cr&\quad \nu = NH + 1, \ldots, N\} & (17a)}]

or

[\eqalignno{&\{{{\bf{r}}_{Hi}} - {{\bf{r}}_{Hj}},\ i,j = 1, \ldots, NH\} \cup \{ {{\bf{r}}_{Hi}} - {{\bf{r}}_{l\nu }},\ i = 1, \ldots, NH,\cr&\quad \nu = NH + 1, \ldots, N\}. & (17b)}]

The reader may notice that, in accordance with Appendix A[link], the heavy-atom-heavy-atom distances (more generally speaking, the distances between the atoms included in the model) constitute a centric set, even if the target space group is acentric, while the distances between heavy and light atoms form an acentric set. In our examples we will comply with (17a)[link].

If the heavy-atom position [{{\bf{r}}_{Hq}}] is added to the interatomic vectors in equations (17), we obtain the following vectorial sets:

[\eqalignno{&\{{{\bf{r}}_{Hi}},\ i = 1, \ldots, NH\semi\ {{\bf{r}}_{Hi}} - {{\bf{r}}_{Hj}} + {{\bf{r}}_{Hq}},\ i,j = 1, \ldots, NH,\cr&\quad j \ne q\} & (18a)\cr &\{ {{{\bf{r}}_{l\nu }}} ,\ \nu = NH + 1, \ldots, N\semi\ {\rm{ }}({{\bf{r}}_{l\nu }} - {{\bf{r}}_{Hi}}) + {{\bf{r}}_{Hq}},\ i = 1,\ldots,NH,\cr&\quad i \ne q,\ \nu = NH + 1, \ldots, N\}. & (18b)}]

We observe: (i) the set (18a)[link] provides the heavy-atom substructure plus the same amount of noise included in the set (16a)[link]; (ii) the set (18b)[link] generates all the light-atom positions, but less noise than in the set (16b)[link]; (iii) the set (16c)[link], corresponding to light-light interatomic vectors, is absent when the C map is used.

As a consequence, the integration of the C map in the superposition method may offer a model structure less noisy than the Patterson map. Unfortunately, the C map is unknown during the phasing process, but we may replace it by its approximation: the C' map. If m is a good approximation of [\cos (\varphi - {\varphi _p})] for a sufficiently large set of reflections (for the moment we will assume that this condition is satisfied), then the C' peaks will be located at

[\eqalignno{&\{{{\bf{r}}_{Hi}} - {{\bf{r}}_{Hj}},\ i,j = 1, \ldots, NH\} \cup \{ \pm ({{\bf{r}}_{l\nu }} - {{\bf{r}}_{Hi}}),\ i = 1,\ldots, NH,\cr &\quad \nu = NH + 1, \ldots, N\}. & (19)}]

By adding the heavy-atom position [{{\bf{r}}_{Hq}}] to the interatomic vectors in (19)[link] we obtain

[\eqalignno{&\{ {{\bf{r}}_{Hi}},\ i = 1, \ldots, NH,\ {{\bf{r}}_{Hi}} - {{\bf{r}}_{Hj}} + {{\bf{r}}_{Hq}},\ i,j = 1, \ldots, NH,\cr&\quad j \ne q\} &(20a)\cr &\{ {{\bf{r}}_{l\nu }},\ \nu = NH + 1, \ldots, N\semi\ {{\bf{r}}_{l\nu }} - {{\bf{r}}_{Hi}} + {{\bf{r}}_{Hq}},\ i = 1, \ldots ,NH,\cr &\quad i \ne q,\ \nu = NH + 1, \ldots, N\semi\ {{\bf{r}}_{Hi}} - {{\bf{r}}_{l\nu }} + {{\bf{r}}_{Hq}},\cr &\quad i = 1, \ldots , NH,\ \nu = NH + 1, \ldots, N\}. &(20b)}]

Let us compare sets (20)[link], obtained by the use of the C' map, with sets (16)[link] obtained by using the P map:

(i) (20a)[link] and (20b)[link] coincide with (16a)[link] and (16b)[link], respectively. Therefore both set (20) and set (16) provide the atomic positions of the full target structure, with the same amount of noise.

(ii) The C' map has no noise term corresponding to (16c)[link]. This property may be very important when the target structure contains a large number of light atoms, as frequently occurs for medium-size structures and for proteins.
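The peak sets compared above can be enumerated directly for a small point-atom structure. The sketch below assumes ideal conditions (error-free heavy-atom model, no peak overlap handling, symmetry ignored) and uses made-up grid positions; it builds the Patterson vectors (14), the C vectors (17a) and the C' vectors (19), shifts each set by the pivot, and counts the peaks that do not correspond to atoms.

```python
N = 20  # grid points per cell edge (toy value)

def add(a, b):
    return ((a[0] + b[0]) % N, (a[1] + b[1]) % N)

def sub(a, b):
    return ((a[0] - b[0]) % N, (a[1] - b[1]) % N)

def vectors(sources, targets):
    """All interatomic vectors r_t - r_s from a source atom to a target atom."""
    return {sub(t, s) for s in sources for t in targets}

def shifted(vector_set, pivot):
    """Vector set translated by the pivot position, as in sets (16), (18), (20)."""
    return {add(v, pivot) for v in vector_set}

heavy = [(2, 3), (11, 7)]            # model atoms, assumed error-free
light = [(5, 14), (17, 9), (8, 1)]   # atoms still to be located
target = heavy + light
pivot = heavy[0]                     # pivot vector r_Hq

p_set = vectors(target, target)                     # sets (14a)-(14c)
c_set = vectors(heavy, target)                      # set (17a)
cp_set = c_set | {sub((0, 0), v) for v in c_set}    # set (19): C vectors and inverses

atoms = set(target)
noise = {name: len(shifted(s, pivot) - atoms)
         for name, s in (("P", p_set), ("C", c_set), ("C'", cp_set))}
print(noise)  # number of non-atomic peaks in each shifted map
```

Because the C set is contained in the C' set, which in turn is contained in the Patterson set, the number of false peaks can only decrease or stay equal along the sequence P, C', C, while all atomic positions are retained in each case.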

This superior characteristic of the C' map requires that one condition is satisfied: the good quality of the model. To clarify this point, let us summarize the results of our algebraic analysis and combine them with the conclusions described in Appendix A[link]. The C map may be calculated only if the cosines [\cos (\varphi - {\varphi _p})] are known: it is acentric if the target structure is acentric, with symmetry defined by the symmorphic variant of the space group of the target. Since the target phases are usually unknown, [\cos (\varphi - {\varphi _p})] may be replaced by its statistical estimate m. The corresponding map is now the C' map, centrosymmetric, with the same space group as the Patterson map. The C peaks (ideally) are also peaks of the C' map.

The C' map, ideally, will only show interatomic vectors relating the Np atoms to the N atoms of the target, regardless of the value of Np. The reader should not think that this property is guaranteed by the condition that m is a good approximation of [\cos (\varphi - {\varphi _p})]: this conclusion is clearly demonstrated in Appendix A[link]. As a consequence, the property will also hold when Np is very small, for example when Np is equal to one or two (see the practical applications described in §7[link]). However, if Np is very small, one wrong atomic position in the model structure may lead to wrong calculated amplitudes [|{F_p}|] and therefore to wrong C' maps. The assumption that the quality of the model is a basic condition for the success of the C'-based procedure is therefore demonstrated.

There is another question which deserves to be discussed: are the C or the C' maps new variants (weighted) of the Patterson functions? The answer may be summarized as follows:

(i) The C map involves phases of the target and of the model structures, and therefore cannot be considered a variant of the Patterson function. Indeed, the C-map symmetry does not coincide with the Patterson symmetry.

(ii) The C' map is part of the Patterson map, showing only model-target vectors. The weight m privileges the structure factors with large observed and calculated amplitudes. From a formal point of view, the C' function may be considered as a special Patterson function with weight m|Fp|/|F|, but this equivalence does not grasp the essence of the C' map. Indeed, while the Patterson function is calculated from the observations, the C'-map calculation needs a model, and therefore corresponding amplitudes and phases.

(iii) Some similarity exists between C' and the sum function (Buerger, 1959[Buerger, M. J. (1959). Vector Space, ch. 11. New York: Wiley.]), where a weight is necessary: such a weight, however, is a complex number which is calculated from the translational vectors used for the image superposition.

The above considerations describe the ideal properties of the C' map. In practice, m is only a statistical estimate of [\cos (\varphi - {\varphi _p})]: therefore the light-atom-light-atom peaks may be present in the C' map, but their intensity is expected to be much smaller compared with the corresponding peaks in the Patterson map.

A further property makes the use of the C' map more interesting: while the Patterson map is invariant during the phasing process, the C' map changes with the current model: we will show that this characteristic is crucial for a successful crystal structure determination.

A simple graphical demonstration of the algebra described above is shown in Fig. 1[link]: the example is didactic rather than realistic, but it is useful for checking the properties of the vectorial sets. In Fig. 1[link](a) the target electron density is shown at 1 Å data resolution: a two-dimensional unit cell is used, with a = b = 20 Å and plane group pg; S is at (0.1, 0.1) and the O atoms are at (0.165, 0.302) and (0.302, 0.185). In Fig. 1[link](b) the Patterson map is depicted, plane group p2mm. In Fig. 1[link](c) the C map is shown when S is the only atom in the model: the map is acentric, plane group pm, while the component substructure corresponding to S-S distances is centric, with plane group p2mm (since the model substructure contains only one symmetry-independent atom, the resulting vector substructure consists of two Harker peaks). The peaks corresponding to light-atom-light-atom distances are absent. The C' map is shown in Fig. 1[link](d): its plane group is p2mm, and the intensities corresponding to O-O distances are very weak.

[Figure 1]
Figure 1
(a) A three-atom two-dimensional target structure: plane group pg; (b) the corresponding Patterson map, plane group p2mm; (c) the C map when the S atom is the only atom in the model structure. Its symmetry group is pm, but the Patterson S-atom substructure satisfies the p2mm symmetry; (d) the corresponding C' map, with p2mm symmetry.

An additional two-dimensional example (simulating a centric arrangement of the atoms in three dimensions) is shown in Fig. S1: a two-dimensional unit cell is used, with a = b = 20 Å, plane group p2, with S at (0.1, 0.1), and O atoms at (0.1, 0.3) and (0.3, 0.1). In this case P, C and C' maps show the same p2 symmetry: C and C' are more closely related, and show very faint peaks corresponding to O-O distances.

In §2[link] we anticipated that the new phasing procedure, exploiting the properties of the C' map, may succeed even when applied to light-atom structures. We therefore need to generalize to light-atom structures the algebraic results described above, so far tailored for structures containing heavy atoms. For this we will suppose that the current model is composed of NH light atoms (instead of NH heavy atoms). Then:

(a) in a C map the vectors between model atoms will show the Laue symmetry, the vectors between model and non-model atoms should satisfy the symmorphic variant of the target space group and, finally, the vectors between non-model atoms and non-model atoms should be absent; and

(b) in a C' map the vectors between model atoms and the vectors between model and non-model atoms should satisfy the Laue symmetry, and the vectors between non-model atoms and non-model atoms should be weak or absent.

5. The C' map deconvolution

The standard Patterson superposition is usually carried out by equations (11)[link]-(13)[link]. In our applications equation (13)[link] was not effective, for the following reasons. Multiple map superposition, as symbolically represented in equation (13)[link], has been attempted by several workers: the risk, frequently met in practical applications, is that the resulting [S({\bf{r}})] map becomes poorer and poorer as the number of overlapped maps increases. Indeed, as an effect of resolution bias and of the vector overlap in the Patterson map, Patterson peaks are shifted from their ideal positions and the symmetry minimum function may vanish even in sites corresponding to heavy-atom positions in the electron-density map. A straightforward implementation of the superposition technique, based on the shifted C' map rather than on the shifted Patterson map, would lead to

[S({\bf{r}}) = {\rm min}[C'({\bf{r}} - {{\bf{r}}_q}), {\rm SMF}({\bf{r}})]. \eqno (21)]

To more clearly compare the expected features of (12)[link] and (21)[link] we observe:

(i) adding [{{\bf{r}}_q}] to the set of interatomic vectors (19)[link] provides the set (20)[link]: target positional vectors and noise are obtained, as described in §4[link];

(ii) if equation (12)[link] is applied, the noise reduction may not be equally efficient. Indeed, the P map always contains, particularly for large structures, a huge amount of noise corresponding to light-atom distances.

Some graphical examples (Fig. 2 and Figs. S2, S3) can illustrate the potential superiority of the C'-map-based deconvolution: we will use the ideal structures depicted in Figs. 1(a) and S1(a).

In Fig. 2[link](a) we show the SMF map corresponding to the two-dimensional structure illustrated in Fig. 1[link](a). According to the plane-group symmetry pg, the SMF peaks are of columnar type. The four strongest columns correspond to the S-S Harker peaks: the number four is due to the origin ambiguity along the x axis. For the x coordinate of the S atom we freely choose that corresponding to the column highlighted by the yellow arrow: the y coordinate is free, and we choose, for simplicity, the true y coordinate of the sulfur. In Fig. 2[link](b) we show the Patterson map shifted by [{{\bf{r}}_q} = {{\bf{r}}_{\rm S}}]; the minimum superposition functions (12)[link] and (21)[link] are shown, respectively, in Figs. 2[link](c) and 2[link](d). It can be noted that in both cases the full structure is obtained but some false peaks are present, and using (12)[link] or (21)[link] leads to equivalent results.

[Figure 2]
Figure 2
(a) SMF map corresponding to the structure illustrated in Fig. 1[link](a); (b) Patterson map shifted by rS (rS is the sulfur position, chosen along the columnar peak highlighted by the yellow arrow). For simplicity, the y coordinate of the sulfur has been arbitrarily fixed to the true one; (c,d) minimum superposition functions obtained according to equations (12)[link] and (21)[link], respectively.

In Fig. S2 we show the results obtained by applying the Patterson deconvolution to the target structure depicted in Fig. S1(a). Since the strongest Patterson peak, say the S-S peak, is at (0.2, 0.2), the SMF map will show a strong peak at (0.1, 0.1) (see Fig. S2a), which is used as a pivot vector [{{\bf{r}}_q} = {{\bf{r}}_{\rm S}}] in map superposition. In Fig. S2(b) we show the Patterson map shifted by [{{\bf{r}}_{\rm S}}] and in Figs. S2(c) and S2(d) the minimum superposition functions (12)[link] and (21)[link], respectively: the full structure is again obtained with some false peaks, but the use of (21)[link] provides a cleaner map.

In more detail, we observe:

(a) The [S({\bf{r}})] maps obtained via equation (12)[link] (say Figs. 2c and S2c) and equation (21)[link] (say Figs. 2[link]d and S2d) have very sharp peak domains, a symptom of partial peak overlap at the atom sites (even in this ideal condition).

(b) None of the [S({\bf{r}})] maps is completely deconvoluted, because in all the cases a double superposition is calculated: a triple one would be necessary to obtain a single image of the structure.

(c) The larger efficiency of the C'-map-based deconvolution with respect to that based on the Patterson map is expected to increase with the structure complexity, where the disturbance from light-atom-light-atom vectors is larger: this is confirmed by the experimental tests described in §6[link].

An alternative approach can also be followed, by replacing the SMF with the atomic minimum superposition (AMS) introduced by Simpson et al. (1965[Simpson, P. G., Dobrott, R. D. & Lipscomb, W. N. (1965). Acta Cryst. 18, 169-179.]) and Pavelcík (1986[Pavelcík, F. (1986). J. Appl. Cryst. 19, 488-491.]). Let [{{\bf{r}}_q}] be the pivot vector and [{{\bf{C}}_\tau }{{\bf{r}}_q}] a position symmetry equivalent to [{{\bf{r}}_q}] (i.e., [{{\bf{C}}_\tau } \ne {\bf{I}}] is a symmetry operator of the target space group). Then the AMS is calculated as

[S({\bf{r}}) = \mathop {\mathop {\rm min}\limits^m }\limits_{\tau = 2} [P({\bf{r}} - {{\bf{r}}_q}), P({\bf{r}} - {{\bf{C}}_\tau }{{\bf{r}}_q})], \eqno (22)]

which can be modified (by introducing the C' map) according to

[S({\bf{r}}) = \mathop {\mathop {\rm min}\limits^m }\limits_{\tau = 2} [C'({\bf{r}} - {{\bf{r}}_q}), C'({\bf{r}} - {{\bf{C}}_\tau }{{\bf{r}}_q})]. \eqno (23)]

The reader should observe that in equation (23): (i) in accordance with the reasons mentioned above, the two C' maps are not superimposed with the SMF map: the SMF map should only be used for finding the atomic positions to be employed as pivots; (ii) overlapping maps by using pivot vectors which are related by symmetry reduces the risk of obtaining poor [S({\bf{r}})] functions: such a risk is higher if two symmetry-independent pivot vectors are used. The deconvolutions performed according to (22) and (23) are reported in Figs. S3(a,c) and S3(b,d) for the pg and p2 examples, respectively. The resulting S maps are no longer constituted by sharp peaks: again, using (22) or (23) leads to equivalent results for the pg case, whereas using (23) provides a cleaner map for the p2 case.

The use of (22) and (23) implies a large degree of map superposition for high-symmetry space groups, with the consequent risk of obtaining a final map that is too poor. To overcome this problem, we preferred to obtain a partial decomposition and to use only two pivot vectors related by symmetry, according to

[S({\bf{r}}) = {\rm min}[P({\bf{r}} - {{\bf{r}}_q}),P({\bf{r}} - {{\bf{C}}_2}{{\bf{r}}_q})], \eqno (24)]

[S({\bf{r}}) = {\rm min}[C'({\bf{r}} - {{\bf{r}}_q}),C'({\bf{r}} - {{\bf{C}}_2}{{\bf{r}}_q})]. \eqno (25)]
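Equations (22)-(25) share one computational pattern: a periodic vector map is shifted by the pivot and by its symmetry copies, and the pointwise minimum is taken. The following is a minimal Python sketch of that pattern, not the SIR2011 implementation. The function names, the nearest-grid-point shift and the two-dimensional toy structure are assumptions made for illustration only.

```python
import numpy as np

def shift_map(vector_map, frac_shift):
    """Return map(r - shift) on a periodic grid. The fractional shift is
    rounded to the nearest grid point (an illustrative simplification)."""
    steps = tuple(int(round(s * n)) for s, n in zip(frac_shift, vector_map.shape))
    return np.roll(vector_map, steps, axis=tuple(range(vector_map.ndim)))

def minimum_superposition(vector_map, r_q, operators):
    """Pointwise minimum of the map shifted by C r_q for every operator C = (R, T).

    Identity plus all other operators mimics equations (22)/(23);
    identity plus a single operator mimics equations (24)/(25).
    """
    r_q = np.asarray(r_q, dtype=float)
    shifted = [shift_map(vector_map, (R @ r_q + T) % 1.0) for R, T in operators]
    return np.minimum.reduce(shifted)

# Toy p2 structure: four point atoms on a 20 x 20 grid (hypothetical positions).
n = 20
atoms = [(2, 2), (18, 18), (6, 12), (14, 8)]
rho = np.zeros((n, n))
for i, j in atoms:
    rho[i, j] = 1.0

# Patterson map as the autocorrelation of the density, computed by FFT.
patterson = np.fft.ifft2(np.abs(np.fft.fft2(rho)) ** 2).real

identity = (np.eye(2), np.zeros(2))
twofold = (-np.eye(2), np.zeros(2))
s_map = minimum_superposition(patterson, (0.1, 0.1), [identity, twofold])
```

Replacing `patterson` by a C' map sampled on the same grid would give the C'-based variants; how that map is computed is outside the scope of this sketch.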

As we shall see in the next section, equations (12), (21), (24) and (25) are tools of our deconvolution algorithm.

6. The new deconvolution algorithm

Using the C' map in superposition techniques requires a preliminary condition: a model structure should already be available. That may be achieved (staying with direct-space techniques) in a simple way. First, the Patterson map is computed, then the implication transformation [{I_s}({\bf{r}} )] as defined by equation (10) is obtained, then the symmetry minimum function [{\rm SMF}({\bf{r}})] as defined by equation (11) is derived, from which a starting model may be extracted. This corresponds to the block SMF in Fig. 3, where a schematic view of our algorithm and of the whole phasing procedure is given. Owing to the dynamic nature of the C' map, the quality and the size of the model may change during the deconvolution process. To take this feature into account, we designed a three-step algorithm, in which the [S({\bf{r}})] map obtained at a given step is used to generate a new C' map to be used in the next step (see Fig. 3a).

[Figure 3]
Figure 3
Schematic view of the new Patterson deconvolution algorithm in the first (a1) and second (a2) iteration and of the ab initio structure solution strategy of SIR2011 (b).

Step A submits the coordinates and thermal factor of the pivot peak [{{\bf{r}}_q}] obtained in the block SMF to least-squares refinement (only if an atom heavier than Ca is present in the structure) and applies equation (12).

In step B the highest peak of the S(r) map obtained at the end of step A, possibly submitted to least-squares refinement, is chosen as pivot peak [{{\bf{r}}_p}]: the corresponding peak is used as a model to build the C' map. Equation (21) is then applied to obtain a new S(r) map, where interatomic vectors corresponding to distances between atoms not in the model have reduced intensity.

In step C two peaks are selected from the [S({\bf{r}})] map obtained at the end of step B: [{{\bf{r}}_{p1}}], the highest one, and [{{\bf{r}}_{p2}}], chosen as the highest peak at least 2 Å away from [{{\bf{r}}_{p1}}] (to avoid taking a ripple of [{{\bf{r}}_{p1}}]). [{{\bf{r}}_{p1}}] and [{{\bf{r}}_{p2}}] are used as a model for building a further C' map, which, shifted by [{{\bf{r}}_{p1}}], is overlapped with the SMF map according to equation (21). The resulting [S({\bf{r}})] map is subjected to cycles of EDM in which filtering algorithms are applied to break down the residual Patterson symmetry. These filtering algorithms have not been changed with respect to our previous implementation (Caliandro et al., 2007), where they were based on the properties of the FF and of the Patterson maps. The only modification introduced here is that now the C' map calculated in step C plays the role of the Patterson map in the old algorithm.
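The choice of the two pivots in step C can be sketched as follows. This is an illustrative Python fragment, not the SIR2011 code: it works on grid points rather than on interpolated peak positions and assumes an orthogonal cell. The function name, its arguments and the demonstration map are hypothetical.

```python
import numpy as np

def select_two_pivots(s_map, cell_lengths, min_separation=2.0):
    """Return fractional coordinates of the highest grid point of s_map and of
    the highest grid point at least min_separation (in Angstrom) away from it.

    Assumes an orthogonal cell with edges cell_lengths and periodic boundaries.
    """
    shape = np.array(s_map.shape)
    cell = np.asarray(cell_lengths, dtype=float)
    idx1 = np.unravel_index(np.argmax(s_map), s_map.shape)
    r_p1 = np.array(idx1) / shape
    # Fractional coordinates of every grid point.
    axes = [np.arange(n) / n for n in s_map.shape]
    frac = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    # Shortest periodic separation from r_p1 (minimum-image convention).
    delta = frac - r_p1
    delta -= np.round(delta)
    dist = np.linalg.norm(delta * cell, axis=-1)
    # Exclude everything closer than min_separation, including r_p1 itself.
    candidates = np.where(dist >= min_separation, s_map, -np.inf)
    idx2 = np.unravel_index(np.argmax(candidates), s_map.shape)
    r_p2 = np.array(idx2) / shape
    return r_p1, r_p2

# Demonstration: a main peak, a ripple 1 Angstrom away and a distant second peak.
demo = np.zeros((10, 10, 10))
demo[5, 5, 5] = 10.0
demo[5, 5, 6] = 8.0
demo[1, 1, 1] = 6.0
r_p1, r_p2 = select_two_pivots(demo, cell_lengths=(10.0, 10.0, 10.0))
```

In the demonstration the ripple at grid point (5, 5, 6) is skipped although it is the second highest value, because it lies within 2 Å of the main peak.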

The algorithm described so far may also be applied to the space group P1, provided that the SMF map is replaced by the Patterson map in equations (12) and (21).

The above algorithm has been embedded in the SIR2011 framework for ab initio crystal structure solution (Burla et al., 2012), as sketched in Fig. 3(b). The trials progressively obtained are ordered by a figure of merit (FOM) (Burla et al., 2004), and then submitted to phase refinement, which is based on cycles of EDM and DEDM, followed by routines that interpret the electron-density map in terms of a structural model. The RELAX procedure (Burla et al., 2005) is also used at this stage for the most promising trials. The structural model is then refined by diagonal least squares (block LSQ in Fig. 3), after which the crystallographic residual is calculated. If it falls below a given threshold (0.25 by default), the program stops.
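The stopping test can be illustrated with the conventional crystallographic residual, R = Σ||Fo| - k|Fc|| / Σ|Fo|. The text does not spell out the formula or the scaling used by SIR2011, so the definition, the single overall scale factor k and the function names below are assumptions; only the default threshold of 0.25 is taken from the text.

```python
import numpy as np

DEFAULT_THRESHOLD = 0.25  # default stopping value quoted in the text

def crystallographic_residual(f_obs, f_calc):
    """Conventional R factor with one overall scale factor (assumed form)."""
    f_obs = np.abs(np.asarray(f_obs, dtype=float))
    f_calc = np.abs(np.asarray(f_calc))  # moduli; complex input is accepted
    scale = f_obs.sum() / f_calc.sum()
    return float(np.abs(f_obs - scale * f_calc).sum() / f_obs.sum())

def is_solved(f_obs, f_calc, threshold=DEFAULT_THRESHOLD):
    """True when the residual falls below the stopping threshold."""
    return crystallographic_residual(f_obs, f_calc) < threshold

# Hypothetical amplitudes, used only to exercise the functions.
r_good = crystallographic_residual([10, 20, 30, 40], [12, 18, 33, 37])
r_bad = crystallographic_residual([10, 20, 30, 40], [40, 30, 20, 10])
```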

If the solution is not obtained at the end of the whole procedure, a second iteration is performed using a different version of the deconvolution module: equations (24) and (25) are used in steps A and B instead of equations (12) and (21), respectively.

A last point deserves to be clarified. The reader has certainly noticed that we choose a one- or two-atom symmetry-independent model structure, while the target structure may contain a large number of atoms. According to Appendix A, the C' map will show only the interatomic vectors relating model to target atoms, and therefore is a small or a large part of the Patterson map depending on whether Np is small or large. If Np is very small, the number of interatomic vectors in the unit cell is much smaller than in the Patterson map: this strongly reduces the vector overlap, improves the peak location and, consequently, minimizes the error in the model structure.
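A rough count gives a feeling for the size of this reduction. The sketch below is only an order-of-magnitude illustration under stated assumptions: it counts non-origin vectors, ignores peak overlap, the doubling introduced by the centric modification and the residual reduced-intensity peaks, and uses as an example a cell with 816 non-H atoms and a one-atom model repeated by four symmetry operators (both numbers are illustrative choices, and the function name is hypothetical).

```python
def vector_counts(n_target, n_model):
    """Return (Patterson vectors, model-to-target vectors), origin peaks excluded.

    Assumes every model atom coincides with one target atom, so that each
    model atom contributes n_target - 1 non-origin vectors.
    """
    patterson = n_target * (n_target - 1)
    model_to_target = n_model * (n_target - 1)
    return patterson, model_to_target

n_patterson, n_cross = vector_counts(n_target=816, n_model=4)
fraction = n_cross / n_patterson  # equals n_model / n_target
```

Under these assumptions the model-to-target vectors are a fraction Np/N of the Patterson vectors, here 4/816, i.e. about 0.5%.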

7. Applications

Any new algorithm should be checked against a large number of test structures to prove its general applicability. We used a total of 188 test structures, whose code names, space groups, chemical compositions and references are listed as supplementary information (Tables S1 and S2). To obtain statistical figures summarizing the results of the phasing process, we divided the set of test structures into subsets according to the number of non-H atoms in the asymmetric unit (Nasym) and their heaviest atomic species (L = light, H = heavy): for practical purposes an atom is considered heavy if its atomic number is larger than that of Ca. The number of test structures in each subset is reported in Table 1, where we also show the average phasing efficiencies of two deconvolution methods, the first C'-based (as described in §6), the second Patterson-based, carried out by using equation (12). For each structure subset EffC' and EffP are the ratios (number of solved structures)/(number of structures), expressed as percentages, when the C'-based and the Patterson-based algorithms are used, respectively. In Fig. 4 we show, for the C'-based algorithm and for the H and L subsets, the average values of: the crystallographic residual (Rcr), the sequential number of the pivot peak for which the solution is obtained (np), and the cpu time necessary to reach the solution (t in min) versus the average values of Nasym (<Nasym>). The averages have been calculated over the test structures contained in each of the four subsets considered in Table 1.

Table 1
Statistics for the test structures

Values are grouped according to the number of non-H atoms in the asymmetric unit (Nasym) and the presence or absence of atoms heavier than Ca (H/L). Entries correspond to the number of structures in each class (Num), and the phasing efficiencies (in %) of the C'-based (EffC') and of the Patterson-based (EffP) deconvolution algorithms.

                      H                               L
                      Num   EffC' (%)   EffP (%)      Num   EffC' (%)   EffP (%)
Nasym ≤ 20            23    100         100           14    100         100
20 < Nasym ≤ 80       21    100         100           45    100         96
80 < Nasym ≤ 150      14    100         100           46    100         93
150 < Nasym           4     100         100           21    90          81

[Figure 4]
Figure 4
For H and L structures we show, versus <Nasym>: (a) the average values of the crystallographic residual Rcr; (b) the pivot peak sequential order np at which the solution has been obtained; and (c) the cpu time (t in min) needed to reach the solution.

It can be noted that:

(i) The deconvolution algorithm based on the C' map is more efficient than that based on the Patterson map for structures containing only light atoms, and in particular for difficult cases, i.e. for structures with Nasym > 150. If heavy atoms are present, both algorithms work with full efficiency.

(ii) On average, the crystallographic residual does not depend on the structure complexity within the range of Nasym considered.

(iii) For structures containing heavy atoms, the solution is obtained from the very first (highest) SMF peaks, indicating that the true heavy-atom position is always found in the very early stage of the phasing procedure. As a consequence, the cpu time needed to reach the solution simply scales with the size of the structure. The efficiency of the deconvolution algorithm does not depend on the size of the structure in this case.

(iv) For structures containing only light atoms, a steady increase of the pivot peak order number as a function of the structure complexity is seen, indicating that locating light atoms in their true positions by superposition techniques is more and more challenging as the size of the structure increases. Therefore the cpu time increases with Nasym more rapidly than for the heavy-atom case.
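The subset efficiencies of Table 1 can be pooled into overall figures with a few lines of arithmetic. The sketch below transcribes the table and recovers the number of solved structures per subset by rounding. That rounding is an assumption about how the tabulated percentages were produced, and the names used are hypothetical.

```python
# (Num, EffC' in %, EffP in %) for each Nasym range, transcribed from Table 1.
TABLE_1 = {
    "H": [(23, 100, 100), (21, 100, 100), (14, 100, 100), (4, 100, 100)],
    "L": [(14, 100, 100), (45, 100, 96), (46, 100, 93), (21, 90, 81)],
}

def pooled(rows, column):
    """Return (estimated solved structures, total structures) for one method.

    column = 1 selects the C'-based efficiency, column = 2 the Patterson-based
    one. Solved counts are recovered by rounding Num * Eff / 100.
    """
    solved = sum(round(row[0] * row[column] / 100) for row in rows)
    total = sum(row[0] for row in rows)
    return solved, total

all_rows = TABLE_1["H"] + TABLE_1["L"]
c_solved, n_total = pooled(all_rows, 1)
p_solved, _ = pooled(all_rows, 2)
l_c_solved, l_total = pooled(TABLE_1["L"], 1)
l_p_solved, _ = pooled(TABLE_1["L"], 2)
```

Under the rounding assumption, the pooled figures are 186 of 188 structures for the C'-based algorithm and 179 of 188 for the Patterson-based one; within the L class they are 124 and 117 of 126, respectively.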

Most of the medium-size test structures are a challenge for any ab initio phasing program, especially if they include only light atoms: thus the results obtained for the L structures with Nasym > 150 and no atom heavier than O are reported in Table 2. The efficiency of the algorithm does not seem to depend on the crystal symmetry. The only structures resistant to the new phasing method are ceho2z and cemc2z: the failure may be attributed to the large measurement errors for the high-resolution spots (Rint > 45% for data at atomic resolution), which strongly affect the Patterson or C' map quality. Only one structure (cyclo_bnz) has been solved by iterating the whole decomposition procedure (part of the standard approach): after all 23 SMF peaks had been explored by using the SMF approach, for a total of 1472 s, the AMS approach took 180 s to reach the solution. It is worth noting that the AMS approach is less efficient (it has 5 failures), but it constitutes an alternative to the SMF approach.

Table 2
L test structures with Nasym > 150 and no atom heavier than O

The number of the pivot peak at which the solution is obtained by the C'-based algorithm, the final crystallographic residual and the cpu time necessary for achieving the solution are given.

Name Reference Space group Nasym Cell content Pivot peak R (%) Time (min)
babu (a) Cc 204 C528H576N96O192 1 13.1 0.8
ceho2z (b) P21 232 C264H304N32O168 - - -
cemc2z (c) P21 207 C210H444N4O200 - - -
cyclo_bnz (d) P21 206 C216H224N8O189 18 13.8 27.5
cyclo_dba (d) P212121 219 C508H448N16O352 43 13.8 39.5
dodecak (e) P212121 200 C568H912N96O136 4 17.5 1.4
helix (f) C2 164 C472H736N92O92 45 19.6 19.2
inclu_rt (g) P1 194 C97H157O97 1 13.1 0.6
ohba_p1 (h) P1 188 C166H158N2O20 1 13.5 0.6
tb02rlmk (i) P1 197 C96H224O101 1 12 0.3
tb (j) P212121 186 C520H784O224 22 11.2 52.5
tp (j) P21 161 C210H308O112 17 9.1 12.2
tval (k) P1 156 C108H180N12O36 1 11.9 0.3
References: (a) Narendra Babu et al. (2009); (b) Onagi et al. (2003); (c) Edwards (2003); (d) Giastas et al. (2003); (e) Saviano et al. (2004); (f) Rudresh et al. (2004); (g) Makedonopoulou & Mavridis (2000); (h) Ohba & Miyamoto (2004); (i) H. Gornitzka, personal communication; (j) K. Gessle, personal communication; (k) Karle (1975).

8. Conclusions

The C map, the Fourier transform of the product [|{F_{\bf{h}}}{F_{p{\bf{h}}}}|\exp i({\varphi _{\bf{h}}} - {\varphi _{p{\bf{h}}}})], has been crystallographically characterized, together with its centric modification C', the Fourier transform of the product [{m_{\bf{h}}}|{F_{\bf{h}}}{F_{p{\bf{h}}}}|]. The C' map, easily computable as soon as a model is available, shows, with reduced intensity, peaks corresponding to interatomic vectors between atoms not belonging to the model. This property may be very useful when the C' map (instead of the Patterson map) is combined with implication transformation methods and superposition techniques, because it is less subject to noise.

Our applications show that superposition methods, combined with the C' map, can solve small- and medium-size structures (up to about 400 non-H atoms in the asymmetric unit) even when no heavy atom (i.e., no atom heavier than O) is present. This result dramatically changes the common judgement about the versatility of Patterson techniques. In International Tables for Crystallography Volume B, Rossmann & Arnold (1993) write

The feasibility of structure solution by the heavy-atom method depends on a number of factors which include the relative size of the heavy atom and the extent and quality of the data. A useful rule of thumb is that the ratio

[r = {{\textstyle\sum_{\rm heavy} {{Z^2}} } \over {\textstyle\sum_{\rm light} {{Z^2}} }}]

should be near unity if the heavy atom is to provide useful starting phase information (Z is the atomic number of an atom). The condition that r > 1 normally guarantees interpretability of the Patterson function in terms of the heavy-atom positions.

They also state that the rule is rather conservative and quote as an outstanding example vitamin B12 with formula C62H88CoO14P (Hodgkin et al., 1957), which gave r = 0.14 for the Co atom alone.

Our applications clearly show that heavy atoms are no longer strictly necessary for the success of Patterson-C' procedures, which may also be applied successfully to structures with atoms not heavier than O. This result makes Patterson techniques probably the most versatile phasing method, given also their ability to solve protein structures containing heavy atoms with up to 7900 atoms in the asymmetric unit at a data resolution of 1.65 Å (Caliandro et al., 2008).

Appendix A

Symmetry properties of the C map

The symmetry properties of the C map may be obtained by a simple analysis of the product [{F_{\bf{h}}}{F_{ - p{\bf{h}}}}]. In accordance with §2, we will suppose that the unit cells of the target and of the model structure contain N and Np atoms, respectively, and that t and tp are the corresponding numbers of atoms in the asymmetric units. Let [{\bf{C}} = ({\bf{R}},{\bf{T}})] be the generic symmetry operator of the space group and m its order; then

[{F_{\bf{h}}}{F_{ - p{\bf{h}}}} = \textstyle\sum\limits_{i = 1}^t \textstyle\sum\limits_{j = 1}^{{t_p}} \textstyle\sum\limits_{s,k = 1}^m {{f_i}\,{f_j}} \exp [2\pi i{\bf{h}} \cdot ({{\bf{C}}_s}{{\bf{r}}_i} - {{\bf{C}}_k}{{\bf{r}}_{pj}})]. \eqno (26)]

We subdivide the summation over the target atoms into two parts:

[\eqalignno{ {F_{\bf{h}}}{F_{ - p{\bf{h}}}} &= \textstyle\sum\limits_{i = 1}^{{t_p}} \textstyle\sum\limits_{j = 1}^{{t_p}} \textstyle\sum\limits_{s,k = 1}^m {{f_i}\,{f_j}} \exp [2\pi i{\bf{h}} \cdot ({{\bf{C}}_s}{{\bf{r}}_i} - {{\bf{C}}_k}{{\bf{r}}_{pj}})] \cr &\quad+ \textstyle\sum\limits_{i = {t_p} + 1}^t \textstyle\sum\limits_{j = 1}^{{t_p}} {} \textstyle\sum\limits_{s,k = 1}^m {{f_i}\,{f_j}} \exp [2\pi i{\bf{h}} \cdot ({{\bf{C}}_s}{{\bf{r}}_i} - {{\bf{C}}_k}{{\bf{r}}_{pj}})].& (27)}]

Both terms on the right-hand side of equation (27) involve acentric distributions of interatomic distances. However, if the model and target structures are highly correlated, the first term is pseudo-centric: it becomes completely centric when D = 1 (that is, when [{{\bf{r}}_{pj}} \equiv {{\bf{r}}_j},j = 1,\ldots,{N_p}]). In this case

[\eqalignno{ {F_{\bf{h}}}{F_{ - p{\bf{h}}}}& = \textstyle\sum\limits_{i = 1}^{{t_p}} \textstyle\sum\limits_{j = 1}^{{t_p}} \textstyle\sum\limits_{s,k = 1}^m {{f_i}\,{f_j}} \exp [2\pi i{\bf{h}} \cdot ({{\bf{C}}_s} - {{\bf{C}}_k}){{\bf{r}}_{pj}}] \cr &\quad+ \textstyle\sum\limits_{i = {t_p} + 1}^t \textstyle\sum\limits_{j = 1}^{{t_p}} \textstyle\sum\limits_{s,k = 1}^m {{f_i}\,{f_j}} \exp [2\pi i{\bf{h}} \cdot ({{\bf{C}}_s}{{\bf{r}}_i} - {{\bf{C}}_k}{{\bf{r}}_{pj}})].& (28)}]

This feature was not noticed in the symmetry analysis described by Carrozzini et al. (2010). Let us now derive the symmetry of the product [{F_{\bf{h}}}{F_{ - p{\bf{h}}}}], and therefore of the C map. Defining

[{{\bf{C}}_s} = {{\bf{C}}_k}{{\bf{C}}_\mu } = ({{\bf{R}}_k}{{\bf{R}}_\mu },{{\bf{R}}_k}{{\bf{T}}_\mu } + {{\bf{T}}_k})]

gives

[{F_{\bf{h}}}{F_{ - p{\bf{h}}}} = \textstyle\sum\limits_{i = 1}^t \textstyle\sum\limits_{j = 1}^{{t_p}} \textstyle\sum\limits_{s,k = 1}^m {{f_i}\,{f_j}} \exp [2\pi i{\bf{h}}{{\bf{R}}_k} \cdot ({{\bf{C}}_\mu }{{\bf{r}}_i} - {{\bf{r}}_{pj}})]. \eqno (29)]
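The step from equation (26) to equation (29) rests on the identity C_s r_i - C_k r_pj = R_k (C_μ r_i - r_pj), valid when C_s = C_k C_μ. The following Python check verifies it numerically. The choice of the two operators of P2_1 and of random positions is only an example, and the helper names are hypothetical.

```python
import numpy as np

def apply(op, r):
    """Apply the symmetry operator C = (R, T) to the position r."""
    R, T = op
    return R @ r + T

def compose(op_k, op_mu):
    """Product C_k C_mu = (R_k R_mu, R_k T_mu + T_k)."""
    R_k, T_k = op_k
    R_mu, T_mu = op_mu
    return R_k @ R_mu, R_k @ T_mu + T_k

# Operators of P2_1 (unique axis b), used here only as an example.
identity = (np.eye(3), np.zeros(3))
screw = (np.diag([-1.0, 1.0, -1.0]), np.array([0.0, 0.5, 0.0]))
ops = [identity, screw]

rng = np.random.default_rng(0)
r_i, r_pj = rng.random(3), rng.random(3)

max_error = 0.0
for C_k in ops:
    for C_mu in ops:
        C_s = compose(C_k, C_mu)
        lhs = apply(C_s, r_i) - apply(C_k, r_pj)
        rhs = C_k[0] @ (apply(C_mu, r_i) - r_pj)
        max_error = max(max_error, float(np.abs(lhs - rhs).max()))
```

The translation T_k cancels in the difference, which is why only the rotational part R_k multiplies the interatomic vector in equation (29).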

If the model is highly correlated with the target structure, equation (29) may be approximated by

[\eqalignno{ {F_{\bf{h}}}{F_{ - p{\bf{h}}}} &= \textstyle\sum\limits_{i = 1}^{{t_p}} \textstyle\sum\limits_{j = 1}^{{t_p}} \textstyle\sum\limits_{s,k = 1}^m {{f_i}\,{f_j}} \exp [2\pi i{\bf{h}}{{\bf{R}}_k} \cdot ({{\bf{C}}_\mu }{{\bf{r}}_{pi}} - {{\bf{r}}_{pj}})] \cr &\quad+ \textstyle\sum\limits_{i = {t_p} + 1}^t \textstyle\sum\limits_{j = 1}^{{t_p}} \textstyle\sum\limits_{s,k = 1}^m {{f_i}\,{f_j}} \exp [2\pi i{\bf{h}} \cdot ({{\bf{C}}_s}{{\bf{r}}_i} - {{\bf{C}}_k}{{\bf{r}}_{pj}})]. &(30)}]

Each interatomic vector in equations (29) or (30) depends on the space-group symmetry of the target structure, but the symmetry of the interatomic vectors obeys that of the corresponding symmorphic space group. The reader, however, should notice that, according to equation (28), the interatomic vectors among model atoms will show the symmetry of the Laue group, because their distribution is centric.

References

Abrahams, J. P. (1997). Acta Cryst. D53, 371-376.
Abrahams, J. P. & Leslie, A. G. W. (1996). Acta Cryst. D52, 30-42.
Buerger, M. J. (1959). Vector Space, ch. 11. New York: Wiley.
Burla, M. C., Caliandro, R., Camalli, M., Carrozzini, B., Cascarano, G. L., De Caro, L., Giacovazzo, C., Polidori, G. & Spagna, R. (2005). J. Appl. Cryst. 38, 381-388.
Burla, M. C., Caliandro, R., Camalli, M., Carrozzini, B., Cascarano, G. L., Giacovazzo, C., Mallamo, M., Mazzone, A., Polidori, G. & Spagna, R. (2012). J. Appl. Cryst. 45, 357-361.
Burla, M. C., Caliandro, R., Carrozzini, B., Cascarano, G. L., De Caro, L., Giacovazzo, C. & Polidori, G. (2004). J. Appl. Cryst. 37, 791-801.
Burla, M. C., Caliandro, R., Carrozzini, B., Cascarano, G. L., De Caro, L., Giacovazzo, C., Polidori, G. & Siliqi, D. (2007). J. Appl. Cryst. 40, 834-840.
Caliandro, R., Carrozzini, B., Cascarano, G. L., De Caro, L., Giacovazzo, C., Mazzone, A. & Siliqi, D. (2008). J. Appl. Cryst. 41, 548-553.
Caliandro, R., Carrozzini, B., Cascarano, G. L., De Caro, L., Giacovazzo, C. & Siliqi, D. (2007). J. Appl. Cryst. 40, 883-890.
Carrozzini, B., Cascarano, G. L. & Giacovazzo, C. (2010). J. Appl. Cryst. 43, 221-226.
Cowtan, K. (1999). Acta Cryst. D55, 1555-1567.
Edwards, A. (2003). Personal communication.
Giacovazzo, C. & Siliqi, D. (1997). Acta Cryst. A53, 789-798.
Giastas, P., Yannakopoulou, K. & Mavridis, I. M. (2003). Acta Cryst. B59, 287-299.
Harker, D. (1936). J. Chem. Phys. 4, 381-390.
Hodgkin, D. C., Kamper, J., Lindsey, J., MacKay, M., Pickworth, J., Robertson, J. H., Shoemaker, C. B., White, J. G., Prosen, R. J. & Trueblood, K. N. (1957). Proc. R. Soc. London Ser. A, 242, 228-263.
Karle, I. L. (1975). J. Am. Chem. Soc. 97, 4379-4386.
Makedonopoulou, S. & Mavridis, I. M. (2000). Acta Cryst. B56, 322-331.
Narendra Babu, S. N., Abdul Rahim, A. S., Osman, H., Jebas, S. R. & Fun, H.-K. (2009). Acta Cryst. E65, o1560-o1561.
Ohba, S. & Miyamoto, H. (2004). Acta Cryst. E60, o216-o218.
Onagi, H., Carrozzini, B., Cascarano, G. L., Easton, C. J., Edwards, A. J., Lincoln, S. F. & Rae, A. D. (2003). Chem. Eur. J. 9, 5971-5977.
Patterson, A. L. (1934a). Phys. Rev. 45, 763.
Patterson, A. L. (1934b). Phys. Rev. 46, 372-376.
Pavelcík, F. (1986). J. Appl. Cryst. 19, 488-491.
Pavelcík, F. (1988). Acta Cryst. A44, 724-729.
Pavelcík, F., Kuchta, L. & Sivý, J. (1992). Acta Cryst. A48, 791-796.
Pavelcík, F. & Pivovarcikova, O. (2002). J. Appl. Cryst. 35, 526-532.
Read, R. J. (1986). Acta Cryst. A42, 140-149.
Refaat, L. S. & Woolfson, M. M. (1993). Acta Cryst. D49, 367-371.
Richardson, J. W. & Jacobson, R. A. (1987). Patterson and Pattersons, edited by J. P. Glusker, B. K. Patterson & M. Rossi, pp. 310-317. Oxford University Press.
Rossmann, M. G. & Arnold, E. (1993). International Tables for Crystallography, Vol. B, Reciprocal Space, edited by U. Shmueli, ch. 2.3. Dordrecht: Kluwer.
Rudresh, R. S., Ramagopal, U. A., Inai, Y., Goel, S., Sahal, D. & Chauhan, V. S. (2004). Structure, 12, 389-396.
Saviano, M., Improta, R., Benedetti, E., Carrozzini, B., Cascarano, G. L., Didierjean, C., Toniolo, C. & Crisma, M. (2004). ChemBioChem, 5, 541-544.
Sheldrick, G. M. (1992). Crystallographic Computing 5, edited by D. Moras, A. D. Podjarny & J. C. Thierry, pp. 145-157. Oxford University Press.
Sim, G. A. (1959). Acta Cryst. 12, 813-815.
Simpson, P. G., Dobrott, R. D. & Lipscomb, W. N. (1965). Acta Cryst. 18, 169-179.
Srinivasan, R. & Ramachandran, G. N. (1965). Acta Cryst. 19, 1008-1014.


Acta Cryst. (2013). A69, 98-107 [doi:10.1107/S0108767312040469]