How to assign a (3 + 1)-dimensional superspace group to an incommensurately modulated biological macromolecular crystal

(3 + 1)-dimensional superspace groups are explained for incommensurately modulated macromolecular crystals with an example.
ISSN 1600-5767


Introduction
Our laboratory is interested in solving the structures of incommensurately modulated protein crystals. These crystals have a fascinating diffraction pattern with satellite reflections surrounding the main reflections. Commensurate and incommensurate macromolecular crystallography, with examples of such effects, as well as twinning and multiple crystal cases, were reviewed by Helliwell (2008) and are also discussed in Chapter 8 of Rupp's Biomolecular Crystallography textbook (Rupp, 2010). This paper concerns our symmetry analysis of the diffraction from (3 + 1)-dimensionally incommensurately modulated crystals of profilin:actin (PA) (Lovelace et al., 2008; Porta et al., 2011). This publication relies heavily on our study of an article by van Smaalen (2005), Chapters 1, 2 and 3 of van Smaalen's textbook on Incommensurate Crystallography (van Smaalen, 2007) and International Tables for Crystallography, Volume C, Chapter 9.8, Incommensurate and Commensurate Modulated Structures, by Janssen et al. (1999). We also studied Schönleber's lectures on Introduction to Superspace Symmetry from the Workshop on Structural Analysis of Aperiodic Crystals held in Bayreuth, Germany, and an article by Wagner & Schönleber (2009). Although these are excellent sources, they were written for small-molecule crystallographers and physicists and use language and examples that are not encountered in macromolecular crystallography. Therefore, we decided to write this paper for the next biological crystallographer who chooses to solve a modulated crystal, so that they can more easily understand the superspace formalism and confidently assign a superspace group to their crystal diffraction data.
In this article the nomenclature common to periodic three-dimensional (3D) crystals is used with adaptations to a fourth dimension as needed (Janssen et al., 1999). It is noteworthy that in much of the aperiodic literature another formalism is used, where subscripts i = 1, 2, 3 indicate the space directions (van Smaalen, 2007). This makes it easier to add more dimensions as needed. Hence, the symbols (a, b, c), (x, y, z), (hkl) and (α, β, γ) used in this publication correspond to (a₁, a₂, a₃), (x₁, x₂, x₃), (h₁h₂h₃) and (σ₁, σ₂, σ₃), respectively, in aperiodic crystallography. Vectors are in bold and scalar coefficients are in italics. This is pointed out here to help avoid confusion when reading the aperiodic literature.
Crystal periodicities can be categorized into three types (Fig. 1). The first is the most common case, where the crystal is periodic and the unit-cell contents are duplicated closely by the lattice translations (Fig. 1a). The second type is the case of a commensurate modulation. Here, the spacing of the satellite reflections relative to the main reflections is a rational value. The diffraction pattern can be indexed and integrated as a supercell, with three integer indices, using any protein crystallography data reduction software. In Fig. 1(b), the q vector, which is used in aperiodic crystallography to index the satellite reflections relative to the main reflection, has a rational value of 0.25 (or 1/4) and the modulation of the structure repeats every four unit cells. This means that the lattice parameters for indexing the satellite reflections are integer multiples (1, 2, ..., n) of the basic cell parameters and the crystal structure can be described with a supercell (Wagner & Schönleber, 2009). The third type is the case of an incommensurately modulated crystal. Here, at least one component of the q vector is irrational and cannot be expressed as a simple fraction (Fig. 1c). An accurate description of an incommensurately modulated crystal can only be obtained by describing the diffraction pattern with q vectors.
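As a rough illustration of the distinction between the commensurate and incommensurate cases, the following Python sketch looks for the closest small-denominator fraction to a measured q component. The function names, the denominator cutoff and the tolerance are illustrative assumptions and are not taken from any data reduction package. A measurement of finite precision can never prove that a value is irrational; it can only show that no simple fraction fits within the experimental error.

```python
from fractions import Fraction


def closest_simple_fraction(q_component, max_denominator=10):
    """Return the closest fraction with a small denominator and its deviation."""
    frac = Fraction(q_component).limit_denominator(max_denominator)
    return frac, abs(q_component - float(frac))


def looks_commensurate(q_component, max_denominator=10, tolerance=1e-3):
    """Hypothetical check; the tolerance should reflect the error of the fitted q."""
    _, deviation = closest_simple_fraction(q_component, max_denominator)
    return deviation <= tolerance


if __name__ == "__main__":
    # 0.25 is the commensurate example of Fig. 1(b); the other two values
    # are the extremes of the q range quoted for the PA crystals.
    for q in (0.25, 0.2628, 0.2829):
        frac, dev = closest_simple_fraction(q)
        print(q, frac, round(dev, 4), looks_commensurate(q))
```

With these assumed settings a value of 0.25 is matched exactly by 1/4, whereas a value such as 0.2829 is only approximated by 2/7.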
When an incommensurately modulated diffraction pattern is observed in protein crystallography, the sample is typically discarded in favour of a better behaving sample that can be processed with standard macromolecular crystallography software. As a consequence, incommensurately modulated macromolecular crystals are rarely reported and these types of structural modulations in the context of a macromolecular crystal are poorly understood. PA crystals can be chemically induced to form a peculiar incommensurately modulated diffraction pattern. More than 28 years ago (Schutt et al., 1989) it was found that when PA crystals are driven to a phase transition boundary by exposing the crystals to conditions known to promote actin filament formation they transform into an incommensurately modulated state that is thought to contain a superstructure of structural intermediates. By varying the solution conditions PA can be crystallized in either an 'open' or a 'closed/tight' state that corresponds to the nucleotide binding site opening and closing (Chik et al., 1996; Porta, 2011; Schutt et al., 1993). These two states are accompanied by a change in the c unit-cell dimension from 186 to [...]. [...] using precession photography at room temperature from either open or closed states by shifting the pH to 6.0, a condition known to cause profilin to diffuse away from actin and actin filaments to form in vitro (Carlsson, 1979; Oda et al., 2001; Chik, 1996). This research provided the foundation for our continued studies.
Incommensurate modulations within crystals are a result of a displacement modulation that forms but does not align with the spacing of the basic unit cell. In the resulting diffraction pattern satellite reflections appear near the normal main reflections (see Fig. 2). In the periodic state, all reflections can be indexed by the three integer indices h, k and l such that

$$\mathbf{H} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*, \qquad (1)$$

where a*, b* and c* are the reciprocal lattice vectors of the main reflections and basic unit cell. With satellite reflections, the diffraction pattern becomes (3 + d) dimensional, where d is the number of satellite directions. The most common form of modulation is in only one extra direction (d = 1), and the diffraction patterns for these crystals have satellite reflections on either side of the main reflection (see Figs. 2a and 2b). This is called a (3 + 1)D modulated crystal. The diffraction pattern for this case can be indexed by the introduction of a single q vector such that

$$\mathbf{H} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^* + m\mathbf{q}. \qquad (2)$$

The positions of the satellite reflections are given by the q vector

$$\mathbf{q} = \alpha\mathbf{a}^* + \beta\mathbf{b}^* + \gamma\mathbf{c}^*. \qquad (3)$$

A modulation wave can be parallel to one of the reciprocal lattice vectors, and in this case two of the scalar q coefficients in (3) would be zero. In more complicated cases two or three of the q coefficients can be nonzero (Fig. 2b). Also, multiple-order satellites evenly spaced from the main reflections can exist (see Figs. 2b and 2c). This is represented by the integer value m in equation (2). Interestingly, satellites and multi-order satellites are predominantly in the high-resolution bins of data (see Fig. 5 of Lovelace et al., 2010).
In 2008, the Borgstahl laboratory was able to reproduce the incommensurately modulated PA crystals from the Schutt laboratory and measured a single-rotation-style diffraction image from a room-temperature protein crystal (Lovelace et al., 2008). The data were indexed and the first q vector was measured for a macromolecular crystal.
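The indexing relation described above, in which a reflection with indices (h, k, l, m) lies at ha* + kb* + lc* + mq, can be sketched numerically. The Python example below is a minimal illustration that assumes an orthogonal cell (all angles 90 degrees) so that a* = 1/a, b* = 1/b and c* = 1/c. The function names, the cell dimensions and the q value are hypothetical placeholders and are not the PA values.

```python
import math


def reciprocal_position(h, k, l, m, cell, q):
    """Position h a* + k b* + l c* + m q for an orthogonal cell.

    cell: (a, b, c) in angstroms, all angles assumed to be 90 degrees.
    q:    (alpha, beta, gamma), the coefficients of q on a*, b* and c*.
    Returns the three components in reciprocal angstroms.
    """
    a, b, c = cell
    alpha, beta, gamma = q
    return ((h + m * alpha) / a, (k + m * beta) / b, (l + m * gamma) / c)


def d_spacing(position):
    """Resolution in angstroms of a reflection at the given reciprocal position."""
    length = math.sqrt(sum(x * x for x in position))
    return math.inf if length == 0 else 1.0 / length


if __name__ == "__main__":
    cell = (40.0, 70.0, 180.0)  # placeholder cell, not the PA cell
    q = (0.0, 0.27, 0.0)        # placeholder modulation along b*
    for m in (-1, 0, 1):
        pos = reciprocal_position(0, 4, 0, m, cell, q)
        print(m, pos, round(d_spacing(pos), 2))
```

For a modulation along b* only the k component is shifted, so the m = -1 and m = +1 satellites fall on either side of the main reflection along that axis.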
Research progress was hindered by the reversibility of the modulation at room temperature, perhaps due to crystal heating or radiation damage from the SuperBright FRE X-ray generator, which prevented the collection of a full set of diffraction data. We have since learned to cryocool crystals that were first crosslinked with acidic glutaraldehyde at room temperature and then cryopreserved with sodium formate (named xMod1 and xMod2, Table 1) and, more recently, crystals that were not crosslinked and were cryopreserved with d-glucose (gMod3, Table 1).

Table 1
Eval15 data processing statistics for incommensurately modulated PA diffraction data. (Porta & Borgstahl, 2012)

All of the (3 + d)D superspace groups have been tabulated for d = 1, 2 or 3. For d = 1 there are 775 groups, for d = 2 there are 3338 groups and for d = 3 there are 12 584 groups (Stokes et al., 2011). A web site has been developed for searching all 775 (3 + 1)D superspace groups listed in International Tables for Crystallography (http://it.iucr.org/resources/finder/; Orlov et al., 2008). These numbers are greatly reduced for biological crystals as there are only 65 chiral (or biological) three-dimensional space groups. Correspondingly, there are only 135 (3 + 1)D, 368 (3 + 2)D and 1019 (3 + 3)D chiral superspace groups (van Smaalen et al., 2013). An excellent primer on the three-dimensional space groups was written by Dauter & Jaskolski (2010) and can be used to review the symmetry elements found in protein crystals. Modulated PA crystals have a (3 + 1)D-type superspace group because they are modulated in only one direction. The three cryocooled modulated PA data sets (Table 1) all have basic three-dimensional unit cells like that of the PA open-state crystals and have satellite reflections along b* (Fig. 3). The crystals differ in their resolution of diffraction and the extent of modulation, as indicated by their q vector and satellite intensity strength.
The q spacing of the satellites from the main reflections varies from 0.2628 to 0.2829. A demonstration of their similarity to and differences from each other and from open-state crystals was made by calculating Rmerge between data sets (Table 2).

[...] Lovelace et al. (2008). Note that the crystal diffraction data were processed with Eval15, not the CrystalClear software. CrystalClear was used here for the purposes of illustration only.

Figure 4
Pseudo-precession photographs of diffraction data from the gMod3 crystal (see Table 1). Diffraction data, main reflections and satellites, were integrated using Eval15 as described previously. In part (a) the 0kl plane is displayed with portions of the k axis, l axis and centre portion magnified. Systematic absences along k are highlighted with green ovals and along l with red circles. In the centre portion zoom, red rectangles highlight reflections where the satellites do not have equal intensity, green rectangles show where there is just a main reflection with no satellites, and blue rectangles show examples where the satellites extinguish the main reflection. Part (b) shows the h0l plane and part (c) the hk0 plane, with systematic absences along h circled in blue, along l highlighted with red arrows and along k with green ovals. To prepare these pseudo-precession photographs, the reflections were reindexed to a supercell using an Awk script to reindex the k reflections under the supercell condition k′ = 7k + 2m, where k′ is the supercell index and m is the satellite order (m = ±1 in this study). Reindexing the data into a supercell is illustrated in Fig. S1 of the supporting information. The reindexed reflections were converted to realistic pseudo-precession photographs with the MLFSOM software, which applies a point-spread function (Holton, 2008; Holton et al., 2012).

[...] reflections the Rmerge values improve and fall in a range of 25-37%, still not isomorphous. Clearly the three modulated structures are significantly different from each other and from the periodic crystal. The intensity of the modulation is also indicated by the strength of the satellite intensities (e.g. in Table 1 I/σ for the satellite reflections ranks their strength as follows: gMod3 > xMod2 = xMod1).
We have three cryocooled incommensurately modulated PA structures of varying modulation strength to solve (Fig. 3). When we look at the gMod3 crystal more closely in pseudo-precession photographs, it can be seen how the satellites relate to the main reflections (Fig. 4). Satellites are not always present (green rectangles, Fig. 4a), do not have to be of equal intensity (red rectangles, Fig. 4a) and can extinguish the main reflections (blue rectangles, Fig. 4a). The relative intensities between the satellite and the main reflections are analysed by resolution bin in Table 3 for the gMod3 crystal (see also Fig. 7 and Table 2 of Porta et al., 2011). The ratio of the satellite to main reflection intensity is lower in low-resolution bins and increases in the high-resolution bins. This is a general feature of modulated PA crystals.
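A per-bin comparison of satellite and main reflection intensities of the kind described above could be computed along the following lines. This Python sketch is only an illustration: the function name, the input layout (resolution, satellite order, intensity) and the use of a ratio of mean intensities are assumptions and are not necessarily how Table 3 was prepared.

```python
def satellite_to_main_ratio(reflections, bin_edges):
    """Mean satellite intensity divided by mean main intensity per resolution bin.

    reflections: sequence of (d_spacing, m, intensity); m == 0 marks a main reflection.
    bin_edges:   resolution limits in angstroms from low to high resolution,
                 e.g. [50.0, 6.0, 4.0, 3.0] defines three bins.
    Returns a list of (d_max, d_min, ratio). The ratio is None when a bin lacks
    either class of reflection or the summed main intensity is zero.
    """
    results = []
    for d_max, d_min in zip(bin_edges[:-1], bin_edges[1:]):
        main, sat = [], []
        for d, m, intensity in reflections:
            if d_min < d <= d_max:
                (main if m == 0 else sat).append(intensity)
        if main and sat and sum(main) != 0:
            ratio = (sum(sat) / len(sat)) / (sum(main) / len(main))
        else:
            ratio = None
        results.append((d_max, d_min, ratio))
    return results


if __name__ == "__main__":
    # Synthetic numbers chosen only to exercise the function.
    data = [(10.0, 0, 100.0), (10.0, 1, 10.0), (3.5, 0, 20.0), (3.5, -1, 10.0)]
    for row in satellite_to_main_ratio(data, [50.0, 6.0, 3.0]):
        print(row)
```

A ratio that rises from the low-resolution to the high-resolution bins would correspond to the trend reported for the modulated PA crystals.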

Assignment of a superspace group to a protein crystal
A general procedure for the assignment of a superspace group is given by Janssen et al. (1999). These steps are analysed here with our PA diffraction data and the description of the process is streamlined to include only the symmetry elements found in chiral molecule crystals. Hopefully this example will make these methods more accessible to protein crystallographers.

Determine the Laue class and crystallographic point group
The Laue group of the diffraction pattern is the point group in three dimensions that transforms every diffraction peak into a peak of the same intensity (except for deviations from Friedel's law caused by dispersion) (Rupp, 2010). For biological crystals there are 11 Laue symmetry classes and 11 chiral crystallographic point groups (32 point groups for small-molecule crystals). These are triclinic 1, monoclinic 2, orthorhombic 222, tetragonal 4 or 422, trigonal 3 or 32, hexagonal 6 or 622, and cubic 23 or 432 (Table S1 in the supporting information). Processing of the main diffraction data with D*TREK (Table 4) or with Eval15 (Table 5) indicates orthorhombic symmetry, with Laue class mmm and point group 222 (Pflugrath, 1999; Schreurs et al., 2010). There are only 23 (3 + 1)D superspace groups with this symmetry.

Find the basic unit cell for the main reflections and a modulation wavevector
The main reflections are separated from the satellites, usually by intensity, and indexed. Reflection extinctions are used to select the Bravais class for the main reflections (Fig. S2). Note that only noncubic classes are possible for (3 + 1)D modulations because a one-dimensional incommensurate modulation is incompatible with cubic symmetry. The satellites are usually assigned to the main reflections that they are closest to (these main reflections can themselves be extinct). Then the direction and dimensions of the q vector are determined by fitting the satellites. If possible, it is preferable to place the q vector along a reciprocal lattice vector.
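The assignment of satellites to their nearest main reflection and the estimation of the q spacing can be sketched for the simplest case of first-order satellites along a single reciprocal axis. The Python example below is a simplified stand-in for the three-dimensional fit performed by integration software. The function name, the tolerance used to recognize a main reflection and the synthetic peak positions are assumptions made for illustration.

```python
def assign_and_estimate_q(k_observed, main_tolerance=0.05):
    """Assign peaks along b* to main reflections and estimate the q spacing.

    k_observed: fractional k coordinates of peaks, in units of b*.
    Returns (assignments, q_estimate). Each assignment is (k, m), with m = 0
    for a main reflection and m = +1 or -1 for a first-order satellite.
    Only valid for first-order satellites with a q spacing below 0.5.
    """
    assignments, offsets = [], []
    for k_obs in k_observed:
        k_main = round(k_obs)            # index of the closest main reflection
        offset = k_obs - k_main
        if abs(offset) <= main_tolerance:
            assignments.append((k_main, 0))
        else:
            assignments.append((k_main, 1 if offset > 0 else -1))
            offsets.append(abs(offset))
    # Simple average of the satellite offsets; None if no satellites were found.
    q_estimate = sum(offsets) / len(offsets) if offsets else None
    return assignments, q_estimate


if __name__ == "__main__":
    peaks = [1.0, 1.27, 0.73, 2.27]      # synthetic positions, not measured data
    print(assign_and_estimate_q(peaks))
```

Averaging the offsets is the crudest possible estimate; a least-squares fit over all satellite positions in three dimensions would normally be used instead.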
PA crystals are of the primitive orthorhombic Bravais lattice. This can be seen in the analysis of just the main reflections (Table 4, solution 11). Centring-type P orthorhombic has a low least-squares residual, almost as low as P triclinic or P monoclinic. C centring is ruled out by the large least-squares residual. Eval15 processing also selects primitive orthorhombic as the Bravais lattice (Table 5). This narrows the assignment down to 15 superspace groups. Eval15 was used to define the q vector, which is in the direction of b*, for each crystal. We note that the magnitude of the q vector for xMod1 is close to 2/7 = 0.2857... and so 2/7 was used as an approximation when the diffraction was reindexed for display as a pseudo-precession photograph in Fig. 4 (see also Fig. S1).
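The supercell approximation with q taken as 2/7 along b* amounts to mapping each (h, k, l, m) index onto a three-integer index with k multiplied by 7 and shifted by 2m. The reindexing in this work was done with an Awk script; the Python sketch below is an equivalent illustration rather than that script, and the function names and default arguments are hypothetical.

```python
def to_supercell(h, k, l, m, multiplier=7, step=2):
    """Map a (3 + 1)D index (h, k, l, m) onto a supercell index (h, k', l).

    Uses k' = multiplier * k + step * m, i.e. q approximated by
    step/multiplier along b* (2/7 with the defaults).
    """
    return (h, multiplier * k + step * m, l)


def from_supercell(h, k_super, l, multiplier=7, step=2, max_order=1):
    """Recover (h, k, l, m) from a supercell index, or None if there is no match.

    Only satellite orders up to max_order are tried.
    """
    for m in range(-max_order, max_order + 1):
        remainder = k_super - step * m
        if remainder % multiplier == 0:
            return (h, remainder // multiplier, l, m)
    return None


if __name__ == "__main__":
    for m in (-1, 0, 1):
        print(m, to_supercell(1, 2, 3, m))
    print(from_supercell(1, 16, 3))
```

Supercell positions that are not reached by any main reflection or first-order satellite return None from the inverse mapping, which reflects the fact that most of the supercell reciprocal lattice is empty.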

Determine the 3D space group of the average structure
The average structure is commonly found by using the main reflections only and corresponds to averaging the contents of several unit cells in three dimensions. The space group of the average structure is determined from the main reflections, and this helps in making a good choice for the starting structure in superspace refinement. Tables 4 and 5 show that the three-dimensional symmetry is consistent with P222. In our case, checks for centring rule out C, F or I and the lattice is primitive. Extinctions along h, k and l (Fig. 4) indicate the presence of screw axes along all three axes. Since the data are of fairly low resolution, the assignment of the space group was checked by performing molecular replacement with just the main reflections using MOLREP (Table 6) (Vagin & Teplyakov, 2000). This settles any uncertainty, and the 3D space group of the average structure is P2₁2₁2₁.
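The screw-axis reasoning above rests on the standard reflection conditions for a twofold screw axis: along a, the h00 reflections with odd h are absent, and likewise for 0k0 and 00l. The following Python sketch illustrates the check for the main reflections only. The function names, the input layout and the I/σ threshold are assumptions chosen for illustration, and real data would call for a statistical treatment rather than a hard cutoff.

```python
def absent_for_screw_axes(h, k, l, screw=(True, True, True)):
    """True if an axial main reflection is systematically absent.

    screw: flags for a twofold screw axis along a, b and c respectively.
    """
    if (h, k, l) == (0, 0, 0):
        return False
    if k == 0 and l == 0:
        return screw[0] and h % 2 != 0
    if h == 0 and l == 0:
        return screw[1] and k % 2 != 0
    if h == 0 and k == 0:
        return screw[2] and l % 2 != 0
    return False


def screw_axis_flags(axial_reflections, threshold=3.0):
    """Guess which axes are screw axes from axial reflections.

    axial_reflections: list of (h, k, l, i_over_sigma).
    An axis is flagged when odd-index axial reflections were measured
    along it and every one of them is weaker than the threshold.
    """
    flags = []
    for axis in range(3):
        odd = [r[3] for r in axial_reflections
               if r[axis] % 2 != 0
               and all(r[j] == 0 for j in range(3) if j != axis)]
        flags.append(bool(odd) and all(value < threshold for value in odd))
    return tuple(flags)


if __name__ == "__main__":
    # Synthetic I/sigma values, not PA data.
    data = [(1, 0, 0, 0.5), (2, 0, 0, 20.0), (0, 1, 0, 0.2),
            (0, 2, 0, 15.0), (0, 0, 1, 0.1), (0, 0, 2, 30.0)]
    print(screw_axis_flags(data))
```

Three flagged axes correspond to P2₁2₁2₁, whereas fewer flagged axes would point towards P222, P222₁ or P2₁2₁2 in the appropriate setting.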
nUni is the number of unique reflections and nRsym is the number used to calculate the R values.

At this point a refinement of the average structure can be performed and the resulting electron density observed. The average structures were refined with REFMAC to crystallographic R values of 27-28% (Murshudov et al., 1997). The electron density of the average structure (Fig. 5) reveals that some parts of the structure are modulated more than others. The average electron density for profilin and subdomains 1 and 3 of actin is fairly well ordered. Actin subdomains 2 and 4 have very weak density, and this indicates that their motions are more dramatic in the modulation wave. The modulation function in actin appears to be more pronounced than that in profilin.
An illustration of the crystal contacts in PA crystals shows how the modulated regions correspond to the crystal directions (Fig. 6). For the PA case, the indexing showed that the modulation is along b (collinear with y) in the crystal, which corresponds to an 'actin ribbon' formed by the crystal lattice (Schutt et al., 1991). It is likely that the protein undergoes a conformational change that affects the neighbouring PA molecules in such a way as to produce the observed modulation in the diffraction pattern (Schutt et al., 1991). The structural basis for the modulated PA diffraction pattern has not yet been determined.

Identification of the (3 + 1)D Bravais lattice type
The (3 + 1)D Bravais class is determined by the 3D Bravais class and the components α, β, γ of q. Next we find the superspace group compatible with the previously derived results and with the special extinctions observed in the diffraction pattern. In Table S2 there are 15 orthorhombic (3 + 1)D superspace groups (Nos. 16.1-19.1). From the main reflections we know our lattice is primitive and there is no centring. There are three screw axes. Applying the screw axes narrows the selection down to one, and the superspace group for the incommensurately modulated PA crystals is number 19.1 with notation P2₁2₁2₁(00γ). There are actually three related versions of this superspace group: P2₁2₁2₁(00γ), P2₁2₁2₁(0β0) and P2₁2₁2₁(α00). The direction of the modulation is shown by the position of the coefficients. For PA the orientation is P2₁2₁2₁(0β0). The symmetry operators need to be modified to work with modulation along this axis rather than along c* as reported in the tables. Details of this transformation were reported by Lovelace et al. (2013).
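The relationship between the direction of the modulation and the setting of the superspace group symbol can be written down as a small lookup. The Python sketch below is a hypothetical helper, valid only for superspace group No. 19.1 with q along a single reciprocal axis; the function name and the zero tolerance are assumptions.

```python
def superspace_symbol_19_1(q, zero_tolerance=1e-6):
    """Return the setting of superspace group 19.1 that matches the q direction.

    q: (alpha, beta, gamma), the coefficients of q on a*, b* and c*.
    Exactly one coefficient must be nonzero.
    """
    labels = ("α", "β", "γ")
    nonzero = [i for i, component in enumerate(q) if abs(component) > zero_tolerance]
    if len(nonzero) != 1:
        raise ValueError("expected a modulation along exactly one reciprocal axis")
    parts = ["0", "0", "0"]
    parts[nonzero[0]] = labels[nonzero[0]]
    return "P2₁2₁2₁(" + "".join(parts) + ")"


if __name__ == "__main__":
    # A modulation along b*, as found for the PA crystals; the magnitude is illustrative.
    print(superspace_symbol_19_1((0.0, 0.27, 0.0)))
```

The helper only names the setting; transforming the tabulated symmetry operators from the c* setting to the b* setting is a separate step.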

Conclusions
After integration of reflections in Eval15 the unit-cell dimensions and q vector length and direction are known. The final test of the superspace group assignment comes in the next step, when it is applied to the integrated diffraction data via the SADABS software (Sheldrick, 1996). The program workflow is presented in Fig. 3 of Porta et al. (2011). It can be seen in Table 1 that the Rsym values obtained from SADABS look reasonable and are quite good for the well measured data of <4 Å resolution.
During data collection it was noticed that particularly strong satellite reflections were associated with extinguished main reflections (Fig. 4). As it turns out, this is indicative of large movements in the structural modulation (Janssen et al., 1999). It is interesting to note that, when normal periodic actin in PA crystals undergoes a transition from the 'open' to 'tight' state, the unit-cell dimension c changes by 14 Å, yet the crystals are stable (Chik, 1996). It is therefore possible that the structural transitions needed to bring about such a large modulation might be on a similar scale, especially those involving actin subdomains 1 and 4. Refinement of the incommensurate PA structures will inevitably shed light on the nature of these higher-order actin structures and provide insight into the early stages of actin filament formation. This is the next step in our research and involves further software development for crystallographic refinement of a protein in a (3 + 1)D superspace group.

Table 6
Check of space-group screw axes with Molrep with main reflections for crystal xMod1.
The score is the product of the correlation coefficient (CC) and packing function [Q(s)]. † The contrast is the ratio of the top score to the mean score. $\mathrm{CC} = \langle |F_o||F_c| - \langle |F_o| \rangle \langle |F_c| \rangle \rangle / ( \langle |F_o|^2 - \langle |F_o| \rangle^2 \rangle \langle |F_c|^2 - \langle |F_c| \rangle^2 \rangle )^{1/2}$.

Figure 6
The crystal lattice of profilin:actin crystals (a) looking down x and (b) looking down z. In both views the direction of the actin ribbon formed in these crystals is along y and vertical. In (a) one complete actin ribbon is seen on the right half of the figure, composed of four profilins (black) and five actins (cyan, red, green, orange and blue). Domain 2 in the actin small domain is indicated with a dashed circle in magenta in part (a) and red in part (b).