A unified convention for biological assemblies with helical symmetry

A new representation of helical structure by four parameters, [n_1, n_2, twist, rise], is able to generate an entire helical construct from asymmetric units, including cases of helical assembly with a seam.


Introduction
Under physiological conditions, many biomolecules are either organized in functional tubular forms or aggregated in disease-related filaments. Tubular and filamentous structures grow with a helical symmetry. The determination of helical structures is important because it provides clues to functional regulation and to the mechanisms of polymerization and depolymerization, and can help to identify ways of preventing unwanted disease-related fibril aggregation. The most famous example is the determination of the Watson-Crick double-helical DNA structure in 1953 (Watson & Crick, 1953), which created a new era in the history of molecular biology. In proteins, even a slight difference in the interactions between molecules is sufficient to create similar filamentous or tubular structures with distinct helical symmetries. For this reason, structural polymorphism is a common characteristic of tubular or fibril entities. Depending on the specificity and rigidity of the interacting molecules, some, such as the amyloidogenic peptide Aβ(1-40) (Sachse et al., 2008; Schmidt et al., 2009), can exhibit a broad spectrum of polymorphic assemblies, whereas others only show limited variability, as in the case of microtubules (Sui & Downing, 2010). This underscores the importance of revealing the structural characteristics of helical assemblies directly from a simple helical symmetry description.
Helical symmetry can be formulated in many different ways. Helical transformations can be classified into two categories: one-dimensional (1-D) helical systems and two-dimensional (2-D) helical systems. In structural determination by X-ray fibre diffraction (Klug et al., 1958), a helical structure is described as a set of n 1-D molecular helices related by an n-fold axial symmetry. In both systems, if the helically repeating motif has C_2 symmetry, the helical structure has an additional dyad symmetry (Klug et al., 1958). Given the asymmetric units, the helical assembly can be constructed by rotohelical transformations which are defined by a specified line group (Damnjanović et al., 2007). Because of the prevalence of structural polymorphism (DeRosier et al., 1999) and helical discontinuities (seams; Kikkawa, 2004) in electron-microscopy (EM) images, a description of helical assemblies by a rolled planar 2-D lattice sheet was devised to solve the EM structure in reciprocal space. In this representation, the framework of a helical structure is viewed as a helical net; that is, a set of equivalent points wrapped around a cylindrical surface. Various 2-D lattice wrappings were defined by a circumference vector c = n_1 a + n_2 b, where n_1 and n_2 are two integer constants and a and b are the 2-D lattice vectors. On the other hand, the reconstruction of helical structures in real space is typically based on rotohelical transformations which are applied iteratively using the single-particle method (Sachse et al., 2007; Egelman, 2007, 2010). A simple convention for defining the helical symmetry of biological assemblies has been suggested in the remediated Protein Data Bank (PDB; Lawson et al., 2008) and EM Data Bank (Heymann et al., 2005) archives. Both have used a definition of rotohelical transformation that does not fully capture the underlying symmetry properties of helical assemblies.
It is therefore not surprising that two very similar tubular structures might be described by very different helical parameters that provide no clue to the fact that they are actually quite similar. This is the case for the two bacteriophage major coat protein helical tubes determined by X-ray fibre diffraction [PDB entries 1hgv (Pederson et al., 2001) and 1ifd (Marvin, 1990)]; the first is presented as a one-start and the second as a five-start helical tube with each helix related by a fivefold rotational symmetry. The discrepancy is understandable because the two PDB structures were the outcome of structure-determination procedures in which the helical symmetry was preset in the minimization procedure. This shortcoming underscores the importance of a standard system that would report helical structures and provide parameters that reflect their structural characteristics. It appears that to date an unambiguous, simple and systematic standard for defining a unique helical specification for constructing helical assemblies from asymmetric units is lacking.
In this paper, we present a new unified convention for the construction of helical assemblies from asymmetric units determined by X-ray fibre diffraction and EM imaging. The unification is made possible by an augmented 1-D helical system (described below) that extends the traditional 1-D helical scheme to adopt the helical symmetry descriptor [n_1, n_2] which is used in the 2-D helical system. A helical structure can be prepared by rolling a planar sheet composed of identical 2-D unit cells (Stewart, 1988). In order to create a seamless 2-D lattice tube, two integer constants [n_1, n_2] define the wrapping process: n_1 refers to the number of cells that are needed to complete a full round of cylinder wrapping and n_2 to the number of cells sliding along the cell edge after the wrapping. The helical symmetry of the tubular structure is explicitly determined by [n_1, n_2] and the corresponding 2-D wrapping transformations can be found in the literature (Tsai et al., 2006; Kikkawa, 2004).
In a traditional 1-D helical system, a helical structure is depicted as either a one-start or an n-start helical structure (Egelman, 2007; Klug et al., 1958). For a one-start helical structure, the assembly consists of only a single helix with two helical parameters, twist (φ) and rise (Δ); these denote the transformation of the 1-D unit cell which is used to build the entire structure. In fibre diffraction, a one-start helix is formulated as u units in v turns with a helical repeat distance of c, which straightforwardly gives φ = 2πv/u and Δ = c/u. An n-start helical structure has n helices related by an n-fold axial symmetry (C_n), with the axis coinciding with the helical axis. In the augmented 1-D helical system described below, in addition to the rotational operation of the C_n symmetry there is an extra translational operation along the helical axis. In the 2-D helical system this extra translational operation is implicitly included in the 2-D wrapping transformation; however, it is ignored in the traditional 1-D helical scheme. This prevents the 1-D and 2-D systems from being unified in a common helical symmetry description. In contrast, the helical symmetry in the augmented 1-D helical system with the four parameters [n_1, n_2, twist, rise] is defined by two consecutive helical (screw) operations: the first helical operation is specified by the two helical parameters [twist, rise] exactly as in the traditional 1-D helical transformation and the second screw operation is defined by the two constants [n_1, n_2]. Here n_1 refers to the n_1-fold rotational symmetry exactly as in the traditional 1-D helical transformation and n_2 specifies the translational part of the second screw operation. We will illustrate the operational transformations as well as the interconversion between the augmented 1-D helical system and the 2-D helical system below.

Methods
In our definition, a helical structure is composed of repetitive identical units, similar to a single crystal which is built from a three-dimensional (3-D) lattice. The definition of repetitive relates to the entire helical structures, which are built from a unit cell with a specified helical symmetry. The unit cell is either a 2-D lattice or a 1-D line segment. The definition of identical implies that each repetitive unit in the construct has exactly the same environment; that is, the independent variables of a helical structure include only parameters involved in helical transformation and coordinates of asymmetric units within a unit cell. Identical implies that when evaluating the energy of the assembly there is no need to include the interactions between all units but just the interactions between one unit cell and its surrounding cells.
Given a 3-D unit cell, it is straightforward to generate the entire single crystal with fractional coordinates. The newly generated fractional coordinates are the number of cells (n_a, n_b, n_c) away from the origin (0, 0, 0) along each edge. The Cartesian coordinates of a new cell can be converted from the fractional coordinates by an orthogonalization matrix (Evans, 2001) computed from the 3-D lattice constants a, b, c, α, β and γ. However, unlike the 3-D crystal system, a specified helical transformation is needed to generate the helical assembly from a given 1-D or 2-D unit cell. In the following, the equations for the 2-D wrapping and the augmented 1-D helical transformation will be derived and explained in detail.
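As a point of comparison for the helical case, the crystallographic conversion mentioned above can be sketched in a few lines. This is a minimal illustration that assumes the common convention with a along x and b in the xy plane; the function names are hypothetical and the matrix is the standard textbook form, not necessarily the exact one given by Evans (2001).

```python
import math

def orthogonalization_matrix(a, b, c, alpha, beta, gamma):
    """Rows of the 3x3 matrix taking fractional to Cartesian coordinates.

    Angles are in radians. Assumed convention: a lies along x and b lies
    in the xy plane.
    """
    ca, cb, cg = math.cos(alpha), math.cos(beta), math.cos(gamma)
    sg = math.sin(gamma)
    # Unit-cell volume divided by a*b*c.
    v = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    return [
        [a, b * cg, c * cb],
        [0.0, b * sg, c * (ca - cb * cg) / sg],
        [0.0, 0.0, c * v / sg],
    ]

def frac_to_cart(frac, matrix):
    """Multiply the matrix by a fractional coordinate (n_a, n_b, n_c)."""
    return tuple(sum(matrix[i][j] * frac[j] for j in range(3)) for i in range(3))

# Origin of the cell that is 1, 2 and 3 cells away along a, b and c
# in an orthorhombic lattice.
m = orthogonalization_matrix(10.0, 20.0, 30.0, math.pi / 2, math.pi / 2, math.pi / 2)
cell_origin = frac_to_cart((1, 2, 3), m)
```

For a helical assembly no such single matrix exists, which is why the transformations in the following sections are needed.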

2-D helical system
As stated in §1, a planar sheet of 2-D lattice can be wrapped into a tube. If all sheet units are identical, the two constant integers n_1 and n_2 are sufficient to define all possible distinct tubes obtained by rolling it. Here, n_1 is the number of cells along one edge of the 2-D lattice (a) which are required to make a full round of wrapping and n_2 refers to the number of cells sliding along the other edge of the lattice (b) after wrapping.
If we place edge a along the x axis of the Cartesian coordinate system and the 2-D lattice (a, b, γ) on the xy plane, the wrapping equations for an [n_1, n_2] tube map (x_c, y_c, z_c), the Cartesian coordinates of the associated 2-D sheet, onto (x_w, y_w, z_w), the wrapped Cartesian coordinates of the tube. In the wrapping equations, the 2-D lattice sheet is at a distance of the tube radius (r) from the tube axis (t_x, t_y, 0). The helical transformation is implicitly specified by the helical twist θ and the helical rise h = x_c t_x + y_c t_y. Fig. 1 provides a graphical summary of the 2-D helical transformation. A more detailed description has been given previously (Tsai et al., 2006). Given asymmetric units in a 2-D lattice and a helical symmetry specified by the 2-D helical system in five parameters [n_1, n_2, a, b, γ], one can build a complete helical construct based on the 2-D helical transformation equations.
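The wrapping geometry described above can be sketched as follows. This is an illustration under stated assumptions rather than the published wrapping equations: the circumference vector is taken as n_1 a + n_2 b, the tube axis as the unit vector perpendicular to it (its sign is an assumption), the twist as arc length along the circumference divided by the radius, and the rise as the projection onto the tube axis. For simplicity the wrapped coordinates are returned in a frame whose tube axis lies along y, which is a choice made for this sketch; the function name is hypothetical.

```python
import math

def wrap_sheet_point(xc, yc, zc, n1, n2, a, b, gamma):
    """Wrap a planar sheet point onto an [n1, n2] tube.

    Returns (theta, h, (xw, yw, zw)): twist in radians, rise, and Cartesian
    coordinates in a frame with the tube axis along y (an assumption).
    """
    # Circumference vector w = n1*a + n2*b, with a along x.
    wx = n1 * a + n2 * b * math.cos(gamma)
    wy = n2 * b * math.sin(gamma)
    circumference = math.hypot(wx, wy)
    r = circumference / (2.0 * math.pi)            # tube radius
    wux, wuy = wx / circumference, wy / circumference
    tx, ty = -wuy, wux                             # unit tube axis, perpendicular to w
    theta = (xc * wux + yc * wuy) / r              # twist: arc length / radius
    h = xc * tx + yc * ty                          # rise: projection on the tube axis
    rho = r + zc                                   # the sheet sits at radius r
    return theta, h, (rho * math.cos(theta), h, rho * math.sin(theta))

# The sheet point at the tip of the circumference vector of a [7, 4] tube
# (a = b = 1, gamma = 60 degrees) should land back on the sheet origin.
tip = (7 * 1.0 + 4 * math.cos(math.pi / 3), 4 * math.sin(math.pi / 3), 0.0)
theta_tip, h_tip, xyz_tip = wrap_sheet_point(*tip, 7, 4, 1.0, 1.0, math.pi / 3)
```

The closure check in the example (a twist of one full turn and zero rise at the tip of the circumference vector) is what makes the tube seamless in this construction.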

Augmented 1-D helical system
Instead of rolling a planar 2-D sheet, a helical structure can also be expressed by a single helix or n helices, with the n helices related by an n-fold screw axis instead of just a rotational axis. Because the helical assembly must consist of identical subunits, the rotational part of the screw axis must display a C_n rotational symmetry and the translational part is limited to a discrete set of values. In the augmented 1-D helical system, the four parameters [n_1, n_2, φ, Δ] indicate that there are n_1 helices in the assembly, with each individual helix characterized by a unit twist (φ) and a unit rise (Δ). Because the helices are also related by an n_1-fold screw axis, each helix denoted by m_1 = 0, 1, 2, ..., n_1 − 1 has an additional twist of m_1(2π/n_1) and a rise of m_1(n_2Δ/n_1). Note that the rise, which is specified by n_2 with a quantity of n_2Δ/n_1, was not included in the traditional 1-D helical system. Note also that m_1 = n_1 refers back to the first helix as specified by (φ, Δ), which will give a rise of n_2Δ after a complete round of n_1 rotations. In the 2-D helical system, this corresponds to the number of cells involved in the helix sliding after a complete wrapping.
A cell in the symmetrical construct is identified by the cell coordinates [m_1, m_2]. The asymmetric units are given in cell [0, 0] and an [m_1, m_2] cell is located in the m_1 helix, m_2 units away from the cell [m_1, 0] along the helix. If the helical axis is parallel to the y axis and passes through the origin (0, 0, 0), the helical transformation equations for an [m_1, m_2] cell in an [n_1, n_2] helical construct map (x_c, y_c, z_c), the Cartesian coordinates of the asymmetric units in cell [0, 0], onto (x_w, y_w, z_w), the transformed Cartesian coordinates for the cell [m_1, m_2]. Here h and θ specify the overall rise and twist for the [m_1, m_2] cell as specified by an n_1-fold screw axis with an n_2 unit shift. In the case of a helical structure with a single helix, in which n_1 = 1 and m_1 = 0, the helical transformation reduces to a simple helical operation defined by [φ, Δ] only. A graphical summary of the augmented 1-D helical transformation is given in Fig. 2.
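A literal reading of the description above can be sketched as follows. The sketch assumes that the overall twist of cell [m_1, m_2] is the sum of the screw-axis contribution, m_1 times one n_1-th of a full turn, and m_2 unit twists, and that the overall rise is the sum of m_1 times n_2/n_1 unit rises and m_2 unit rises. The rotation is taken as right-handed about the y axis, which is an assumption of this sketch; the function name is hypothetical.

```python
import math

def cell_transform(point, m1, m2, n1, n2, twist, rise):
    """Map a point of cell [0, 0] to the equivalent point of cell [m1, m2].

    twist is in radians and the helical axis is the y axis through the origin.
    """
    x, y, z = point
    # Overall twist: screw-axis part plus m2 steps along the helix.
    theta = m1 * (2.0 * math.pi / n1) + m2 * twist
    # Overall rise: screw-axis shift plus m2 steps along the helix.
    h = m1 * (n2 * rise / n1) + m2 * rise
    c, s = math.cos(theta), math.sin(theta)
    # Right-handed rotation about y (assumed sign convention), then the rise.
    return (c * x + s * z, y + h, -s * x + c * z)

def build_assembly(points, n1, n2, twist, rise, m2_range):
    """Generate all cells [m1, m2] for m1 in 0..n1-1 and m2 in m2_range."""
    return {
        (m1, m2): [cell_transform(p, m1, m2, n1, n2, twist, rise) for p in points]
        for m1 in range(n1)
        for m2 in m2_range
    }

# With n1 = 1 and m1 = 0 the operation is a plain one-start helix.
one_start = cell_transform((1.0, 0.0, 0.0), 0, 1, 1, 0, math.pi / 2, 2.0)
```

Keeping the screw-axis part and the along-helix part as separate terms mirrors the two consecutive screw operations that define the augmented system.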

2-D helical system → augmented 1-D helical system
There are four ways to convert a helical system from 2-D to 1-D: view the continuation of lattice edge b as a helix, view the continuation of lattice edge a as a helix, or view the continuation along the vector a + b or along the vector a − b. The first is the most convenient choice. By selecting the vector b as an individual helix, the new 1-D helical system retains the same symmetry notation as the 2-D helical system [n_1, n_2]. The unit twist φ (in radians) and rise Δ of the n_1-start helices are calculated from (t_x, t_y), the helical axis of the 2-D helical system, and (x_c, y_c), the planar Cartesian coordinates at the origin of cell (0, 1). Because the tube axis of the 1-D helical tube (along the y axis) is different from that of the 2-D helical tube (on the xy plane), the Cartesian coordinates referenced in the 2-D system require some transformations in order to correspond to the new 1-D system. This can be performed either directly in wrapped Cartesian coordinates or in native fractional coordinates. In Cartesian coordinates, the transformation is equivalent to aligning the 2-D helical axis on the xy plane back onto the y axis, with the z axis (0, 0, 1) as the rotational axis. Under the right-handed rotational system, the angle between the old tube axis and the y axis is calculated as atan(−t_x/t_y), and the transformations map (x_w2, y_w2), the wrapped Cartesian coordinates of the old 2-D helical system, onto (x_w1, y_w1), the wrapped Cartesian coordinates of the new 1-D helical system.
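The conversion of lattice steps into helical parameters can be sketched as follows. This is an illustration under assumptions and not the published equations: the twist of a lattice step is taken as its projection onto the unit circumference vector divided by the tube radius, and the rise as its projection onto the tube axis, whose sign is chosen here so that a tube wrapped along a has its axis along +y. The function name is hypothetical.

```python
import math

def lattice_step_to_helix(i, j, n1, n2, a, b, gamma):
    """Twist (radians) and rise of the lattice step i*a + j*b on an [n1, n2] tube."""
    # Circumference vector w = n1*a + n2*b, with a along x.
    wx = n1 * a + n2 * b * math.cos(gamma)
    wy = n2 * b * math.sin(gamma)
    circumference = math.hypot(wx, wy)
    r = circumference / (2.0 * math.pi)
    wux, wuy = wx / circumference, wy / circumference
    tx, ty = -wuy, wux                      # unit tube axis (assumed sign)
    # Planar Cartesian coordinates of the step.
    xc = i * a + j * b * math.cos(gamma)
    yc = j * b * math.sin(gamma)
    twist = (xc * wux + yc * wuy) / r
    rise = xc * tx + yc * ty
    return twist, rise

# Unit twist and rise of the helices running along b, i.e. the step to cell (0, 1),
# for a [7, 4] tube with a = b = 1 and gamma = 60 degrees.
unit_twist, unit_rise = lattice_step_to_helix(0, 1, 7, 4, 1.0, 1.0, math.pi / 3)
```

The same function evaluated for the steps (1, 0), (1, 1) and (1, −1) gives the parameters for the other three choices of helix listed above.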

Augmented 1-D helical system → 2-D helical system
Figure 1
A brief graphic summary of the 2-D helical system. On the left, a 2-D lattice wrapping specified by a circumference vector, w = n_1 a + n_2 b, is sketched with an example of n_1 = 7 and n_2 = 4. The 2-D lattice is highlighted in color with the angle γ between the two axes a and b. The lattice lies on the Cartesian xy plane and the axis a lies along the Cartesian x axis. Thus, the axis a in Cartesian coordinates is (a, 0, 0) and b is (b cos γ, b sin γ, 0). The wrapped helical coordinates (x_w, y_w, z_w) of the Cartesian coordinates (x_c, y_c, z_c) in the 2-D planar system can then be calculated by the helical transformation in terms of the helical axis t (which is perpendicular to w) with the two parameters θ and h, the twist angle and rise distance, respectively. It is then straightforward to determine the circumferential unit vector w_u, the helical radius r and the helical axis t as formulated on the right-hand side of the sketch, with the summarized wrapping equations highlighted in color at the bottom. Note that the vector (t_x, t_y, t_z) as calculated from w_u is also a unit vector.

Figure 2
A brief graphic summary of the 1-D helical system. For simplicity, on the left-hand side of the figure, only a single helix is drawn to illustrate the 1-D helical transformation. The helical axis t is along the Cartesian y axis and the 1-D cell (asymmetric subunits) is sitting at (x_c, y_c, z_c), which is labeled as the (0, 0) cell. The transformed Cartesian coordinates (x_w, y_w, z_w), labeled accordingly as (0, m_2), are calculated by the matrix operations with a rotational angle of m_2φ and a translational rise of m_2Δ in the upper right-hand side of the figure. For the augmented 1-D helical system with four parameters [n_1, n_2, φ, Δ], the helical transformation equation expressed by matrix operations is given in the lower right corner highlighted in color. θ and h respectively specify the overall helical twist and helical rise for the (m_1, m_2) subunits with respect to the (0, 0) asymmetric subunits, with m_1 referring to the n_1-order helix and m_2 referring to the subunits along the denoted helix.

The conversion from a 1-D to a 2-D helical system is not as straightforward as the opposite conversion. This is partly because a 2-D lattice is loosely defined by a single helix structure and partly because of the necessity to revert from the wrapped tube coordinates back to 2-D planar Cartesian coordinates. Given a 1-D helical structure, we first calculate the implicit helical radius from the center of mass of the representative units by assuming that the center of the 1-D helical assembly is located at the origin (0, 0, 0). Secondly, we either use the original n_1 of the 1-D system or determine a new n_1 for the 2-D helical system and define accordingly two wrapped coordinates, (x_w1, y_w1, z_w1) and (x_w2, y_w2, z_w2), from the 1-D helical system to serve as the origin of cells (1, 0) and (0, 1) of the new 2-D helical system, respectively. Thirdly, we reverse the two wrapped coordinates back to unwrapped planar Cartesian coordinates, (x_c1, y_c1, z_c1) and (x_c2, y_c2, z_c2). With the newly calculated planar coordinates, it is straightforward to calculate the 2-D lattice constants a, b and γ. Fourthly, given the newly determined r, a, b, γ and n_1, n_2, the new 2-D helical tube can be determined by solving the quadratic equation b²x² + 2n_1ab cos γ x + (n_1a)² − (2πr)² = 0.
Finally, all Cartesian coordinates of the old 1-D system are reversed back to the new 2-D planar coordinates.
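The quadratic equation of the fourth step can be solved numerically as sketched below. The sketch interprets the unknown x as the value of n_2 for which the circumference vector n_1 a + x b has length 2πr; this interpretation follows from expanding the squared length of that vector. The function names are hypothetical.

```python
import math

def tube_radius(n1, n2, a, b, gamma):
    """Radius of the tube whose circumference vector is n1*a + n2*b."""
    length_sq = (n1 * a) ** 2 + 2.0 * n1 * n2 * a * b * math.cos(gamma) + (n2 * b) ** 2
    return math.sqrt(length_sq) / (2.0 * math.pi)

def solve_for_n2(r, a, b, gamma, n1):
    """Roots x of b^2 x^2 + 2 n1 a b cos(gamma) x + (n1 a)^2 - (2 pi r)^2 = 0."""
    qa = b * b
    qb = 2.0 * n1 * a * b * math.cos(gamma)
    qc = (n1 * a) ** 2 - (2.0 * math.pi * r) ** 2
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0.0:
        raise ValueError("no real solution: the radius is too small for this n1")
    root = math.sqrt(disc)
    return ((-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa))

# Round trip for the [7, 4] example with a = b = 1 and gamma = 60 degrees.
r_example = tube_radius(7, 4, 1.0, 1.0, math.pi / 3)
roots_example = solve_for_n2(r_example, 1.0, 1.0, math.pi / 3, 7)
```

For a measured radius the roots will in general not be exact integers, so in practice one would round to the nearest integer and check the residual; that rounding step is not part of the description above.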

Properties of [n_1, n_2] helical system
We have shown that both the 1-D and 2-D helical systems can be represented by two integers, [n_1, n_2], and that the helical assembly can be built through the helical transformation with the associated parameters. However, the helical symmetry specified by these two integers can also be interpreted in a way different from the helical systems' definitions. In the traditional helical description, an assembly with [n_1, n_2] symmetry can be viewed as two sets of n-start helices in which either the arrangement of the n_1-start helices is specified by n_2 or, vice versa, that of the n_2-start helices is specified by n_1. The best way to illustrate [n_1, n_2] helical symmetry is by using a helical net: an unwrapped flattened 2-D net bound by the circumference in one direction and extended to infinity parallel to the helical axis. Figs. 3(a) and 3(b) illustrate an example of wrapping and unwrapping of the helical net with the EM structure of a microtubule with [11, 3] symmetry (Sui & Downing, 2010). The colored circular dots in the helical net represent asymmetric units and a line passing through a set of dots is a helix. The number of intersections (n) between the set of parallel lines and the circumference is exactly the number n of helices that are required to fill the helical assembly. This is the origin of the n-start helices definition. In terms of a helical net description, the helical symmetry can be specified by picking a particular set of two intersecting lines (helices) corresponding to n_1-start and n_2-start lines. The intersections define the locations of repeating asymmetric units in the [...] Fig. 3(c) for the same helical structure. In the special case of a one-start helix, the entire assembly is built from a single helix instead of a set of helices (n-start helices). Although n_2 is not required in helical symmetry denoted as a one-start helix, it is still represented in the [1, n_2] notation.

Figure 3
Illustrations of [n_1, n_2] helical symmetry with respect to helical nets. In (a), an EM segment of the microtubule structure is shown in a wrapped helical net with [11, 3] symmetry. The corresponding flattened unwrapped helical net is demonstrated in (b). A section of the corresponding [11, 3] helical net is drawn in (c), with the x axis covering the helical circumference, a twist range of 2π, and the y axis parallel to the helical axis, corresponding to the helical rise. The colored circular dots in the net are the asymmetric subunits and a solid line that passes through a set of dots is a helix. A helical net can be redefined by any two sets of lines with their intersections covering all dots. With n_2 fixed at 3, there are ten additional sets of helices which can be used to define the same helical structure. See text for an explanation of why only limited sets of helices are feasible with n_2 fixed at 3. In (c), feasible sets of helices are marked beside the dots with the value of n_1 colored red.

Based on the augmented 1-D helical system, it is not difficult to realise that a helical symmetry with [n_1, n_2, twist, rise] is equivalent to [−n_1, −n_2, twist, rise], [−n_1, n_2, −twist, −rise] and [n_1, −n_2, −twist, −rise]. To reduce the redundancy in the helical symmetry representation, we set several simple rules. Firstly, n_1 is always positive. For consistency in the interconversion between the 1-D and 2-D helical systems (with the sign of n_2 kept unchanged), we choose the rise to also be positive. Secondly, the absolute value of n_2 is always smaller than n_1. In this way, the handedness of the n_1 helices is determined by the sign of the twist: if positive the n_1 helix is right-handed, otherwise it is left-handed. The sign of n_2 gives the handedness of the n_2 helices: if negative it is a right-handed helix, otherwise it is left-handed. To calculate the [twist, rise] of the n_2-start helices, the helical symmetry can be swapped from [n_1, n_2] to [n_2, n_1].
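The first rule and the choice of a positive rise can be applied mechanically using the equivalences listed above. The following is a minimal sketch with hypothetical function names; it only fixes the signs and does not enforce the second rule on the relative size of the two integers, which requires choosing a different pair of helices.

```python
def reduce_signs(n1, n2, twist, rise):
    """Return the equivalent [n1, n2, twist, rise] with n1 > 0 and rise > 0."""
    if n1 < 0:
        # [n1, n2, twist, rise] is equivalent to [-n1, -n2, twist, rise].
        n1, n2 = -n1, -n2
    if rise < 0:
        # [n1, n2, twist, rise] is equivalent to [n1, -n2, -twist, -rise].
        n2, twist, rise = -n2, -twist, -rise
    return n1, n2, twist, rise

def handedness(n1, n2, twist, rise):
    """Handedness of the n1 and n2 helices for a sign-reduced symmetry."""
    n1_hand = "right" if twist > 0 else "left"
    n2_hand = "right" if n2 < 0 else "left"
    return n1_hand, n2_hand

# All four equivalent forms reduce to the same representation.
reduced = reduce_signs(-11, 3, -0.1, -8.0)
```

Applying the two sign flips in sequence covers the remaining equivalence, in which both n_1 and the rise are negative.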

Local C_2 (dyad) symmetry
Above, the helical symmetry operation has been applied to asymmetric subunits without first assigning a plausible local symmetry. In order to generate all symmetric subunits in a planar 2-D lattice, the local symmetry can be specified by one of the 17 wallpaper groups. However, the local symmetry in the planar 2-D lattice is largely lost by the [n_1, n_2] helical transformation; therefore, we only describe pseudo-local symmetry. A local C_2 symmetry operation is an exception: not only is it maintained between asymmetric subunits within the unit cell, but also in the entire helical construct. In terms of the 1-D helical system, the C_2 symmetry is defined as an additional dyad symmetry (with axial C_2 along the z axis and the helical axis along the y axis). To include a local C_2 symmetry operation, we can assign a wallpaper group p2 before applying the helical transformation.

Manipulation of [n_1, n_2] symmetry
A helical symmetry can be described by many [n_1, n_2] combinations. For a particular preset n_2 there are a limited number of n_1; similarly, a preset n_1 will have a limited selection of n_2 for the same helical assembly. To understand the manipulation and limitations of changing from one [n_1, n_2] to another, the helical net is the best reference. For illustration, we use the [11, 3] helical symmetry of the polymorphic helical structure of the microtubule. In terms of the (h, k; n) notation (Toyoshima & Unwin, 1990; Toyoshima, 2000), the set of n-start helices can be specified by the equation n = h n_10 − k n_01, where n_10 = 11 and n_01 = 3 for the [11, 3] symmetry. Note that the (h, k) index has to be confined within the circumference range. Starting with the [11, 3] symmetry and a fixed n_2 = 3, the redundant helical symmetry [n_1, 3] can have n_1 = 11h − 3k, with h = ±1. On the other hand, given a fixed n_1 = 11, we can have many redundant [11, n_2] helical symmetries with n_2 = 11h − 3k and k = ±1. In the case of (h = 1, k = ±1), we have new redundant helical symmetries of [8, 3] and [14, 3] (Fig. 3c). [...] there is an infinite number of helical symmetries in a planar 2-D lattice rather than just the 11 sets restricted by the circumference. In Fig. 3(c), the corresponding redundant sets of helical symmetries are noted next to the helical dots. For the redundant symmetry [−1, 3], we have created an equivalence between 11-start helices [11, 3] and one-start helix [−1, 3] (or [1, −3]) symmetry.
In order to check whether the (h, k) index is within the circumference, we convert the 1-D system to its equivalent planar 2-D helical net. It is then straightforward to calculate the new helical parameter φ for the new n-start helix. If |φ| is less than π then it falls within the circumference range.
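The arithmetic of the (h, k; n) relation for the [11, 3] example can be checked with a short sketch. The function names are hypothetical, and the enumeration below lists candidate start numbers only; it does not perform the circumference check described above, which needs the twist of each candidate helix.

```python
def start_number(h, k, n10, n01):
    """Start number n of the helix family with index (h, k): n = h*n10 - k*n01."""
    return h * n10 - k * n01

def candidate_n1(n10, n01, k_values, h=1):
    """Candidate n1 values for a fixed n2 = n01, before any circumference check."""
    return {k: start_number(h, k, n10, n01) for k in k_values}

# For [11, 3]: k = 1 and k = -1 with h = 1 give the 8-start and 14-start families,
# and k = 4 gives the one-start family written as [-1, 3].
candidates = candidate_n1(11, 3, range(-1, 5))
```

Each candidate would then be kept only if the twist of the corresponding helix falls within the circumference range.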

Relevance to X-ray fibre diffraction and the EM method
In helical structure determination by X-ray fibre diffraction or EM based on the Fourier-Bessel method (in reciprocal space), the first step is indexing the layer-line diffraction pattern to a specified helical symmetry. There are two possible systems for indexing a diffraction pattern. In the first, assuming that a helical assembly can be described by a single (one-start) helix, the 'selection rule' l = tn + um can be utilized to assign (n, l) pairs to layer-lines in which each layer-line is associated with a set of n-start helices. A more general formalism using (n, Z_l) instead of (n, l), which removes the requirement for t/u to be a rational number, is more appropriate for fibre diffraction. However, for simplicity, we prefer to use the (n, l) system here. A successful layer-line indexing then gives the helical organization as the selection rule implies: u units require t turns of the one-start helix to complete a true repeat with a rise distance of c. A second, more systematic (h, k; n) indexing system (Toyoshima & Unwin, 1990; Toyoshima, 2000) interprets diffraction patterns based on the helical surface lattice. In a planar 2-D lattice, diffraction by a set of lines gives a row of dots in reciprocal space. Therefore, it is straightforward to determine a 2-D planar symmetry from an ideal diffraction pattern. For a helical structure which is obtained by wrapping of a 2-D lattice, a set of lines now becomes a set of helices and the corresponding diffraction dots become layer-lines. To define a surface lattice, two indices, (1, 0; n_10) and (0, 1; n_01), are first assigned, where n is the start number of the associated helices, which can be estimated from its peak position in the layer-line diffraction (Toyoshima & Unwin, 1990; Toyoshima, 2000). If the remaining layer-lines can be indexed and related by the equation n = h n_10 − k n_01 then the helical symmetry is determined. Fig. 4 illustrates the relationship between the new [n_1, n_2] helical scheme and the symmetry in both indexing systems. The nodes in Fig. 4 represent (i) asymmetric units based on a simple helical structure which is described by a one-start helix (t = 4 and u = 13), i.e. with 13 units and four turns completing a true repeat of distance c, and (ii) a simplified diffraction pattern of the same helical structure. However, instead of the layer-line pattern for n-order Bessel diffraction, each dot gives the position of (n, l) diffraction where the layer-line pattern has a maximum diffraction peak at the ~n + 2 position (Diaz et al., 2010). In the figure, [n_1, n_2] of the new helical scheme correspond to the choice of n_10 and n_01 in the (h, k; n) indexing system. The same (t = 4 and u = 13) structure is related by two different helical systems, [3, 1] and [3, −2], in Figs. 4(a) and 4(b), respectively, for a simplified diffraction pattern in terms of n and −l. The figure only shows one fourth of the diffraction pattern. For example, the n = 4, l = 3 diffraction is at the upper-left corner of the figure without a label of (h, k; n, l, m).
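The selection rule for the (t = 4, u = 13) example can be explored with a short sketch that lists, for a given layer-line l, the Bessel orders n for which an integer m exists. The function name and the search limit n_max are hypothetical choices made for this illustration.

```python
def allowed_orders(l, t, u, n_max):
    """Pairs (n, m) with |n| <= n_max that satisfy l = t*n + u*m for integer m."""
    pairs = []
    for n in range(-n_max, n_max + 1):
        remainder = l - t * n
        if remainder % u == 0:          # an integer m exists
            pairs.append((n, remainder // u))
    return pairs

# Layer-line l = 3 of the helix with 13 units in four turns.
orders_l3 = allowed_orders(3, 4, 13, 13)
```

For l = 3 the search returns n = 4 with m = −1, which matches the n = 4, l = 3 diffraction mentioned in the text, together with n = −9 with m = 3.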

General guidelines for presenting a helical structure
There is no clear-cut advantage in treating a helical assembly as a 1-D or a 2-D system; both systems have pluses and minuses. However, we believe that the augmented 1-D helical system is more suitable than the 2-D system for describing a helical structure, even though the two systems are equivalent and interchangeable. There are two reasons for favoring the 1-D helical system for describing a helical symmetry. Firstly, the 1-D helical system is simpler than the 2-D system, with one fewer parameter. Secondly, the 1-D helical scheme is independent of the helical radius, while the surface lattice parameters (a, b, γ) in the 2-D system will change with different radii. Here, based on a 1-D helical system, we suggest general guidelines for helical structure representation. With our guidelines, if the assembly units can be unambiguously defined and follow the helical paths, each helical assembly is expected to provide a unique symmetry [n_1, n_2, φ, Δ] that also explicitly reflects the helical structural characteristics.
In terms of a helical net, a helical structure is composed of a set of n helices in which the individual helix is named an n-start helix. If a helical structure can be expressed by just a single helix, it is a one-start helical structure. In the augmented 1-D helical system, the [n_1, n_2] representation implies that the organization of the n_1 helices is specified by n_2, with the individual n_1-start helix defined by [φ, Δ]. We can also swap the representation to say that there are n_2 helices in the structure related by n_1. Since there are many [n_1, n_2] combinations for a particular helical structure, the first and most important guideline is to define the rule for choosing a unique [n_1, n_2] specification. In order to reflect the helical structural characteristics, the rule states that only protofilaments will be candidates for the [n_1, n_2] selection. In our definition, if adjacent asymmetric subunits in an assigned helix are in physical contact, this helix is a protofilament. Therefore, we first sort protofilaments according to the extent of contacts between adjacent asymmetric subunits. Of the best four protofilaments, the one with the twist angle closest to zero is set as the primary protofilament n_1 and the next best protofilament is selected as the secondary protofilament n_2. Note that n_1 is always larger than |n_2| under this guideline.
To ensure a unique helical symmetry representation for a helical assembly, redundancy needs to be reduced to a single [n_1, n_2, φ, Δ]. The reduction guideline requires that n_1, Δ > 0 and n_1 > |n_2|. In the case of n_1 < 0, one can apply the equivalence rule that the new [n_1, n_2] = [−n_1, −n_2]. If Δ is negative, one can simply apply the equivalence rule [n_1, n_2, φ, Δ] = [n_1, −n_2, −φ, −Δ] to make it a positive value.

Figure 4
The relationship between the [n_1, n_2] helical scheme and the helical symmetry utilized in two common indexing systems. The dots in the figure represent two properties. Firstly, they represent asymmetric units based on a simple helical structure which is described by a one-start helix (t = 4 and u = 13) with 13 subunits and four turns completing a repeat of distance c. Secondly, they describe a simplified diffraction pattern of the same helical structure. Thus, instead of showing the layer-line pattern for n-order Bessel diffraction, each dot gives the position of (n, l) diffraction where the layer-line pattern has a maximum diffraction peak at the ~n + 2 position (Diaz et al., 2010). The diffraction pattern in terms of −l and n is related to the helical net description with 13 subunits enclosed by orange lines as a repeating unit. The two implicit helical symmetries, l = tn + um and n = h n_10 − k n_01, are then related by the [3, 1] and [3, −2] helical symmetry in (a) and (b), respectively, with n_10 = n_1 and n_01 = n_2. In the figure, each dot is labeled with an (h, k; n, l, m) index and the helical lines are in terms of the [n_1, n_2] helical symmetry.

Results
There are many helical filaments and tubular structures in the PDB which have been solved either directly by X-ray fibre diffraction or by fitting individual crystal structures into cryo-EM density maps. Similar to X-ray crystal structures where only the coordinates of the asymmetric units are included in the PDB file, most of the helical structures deposited in the PDB also contain only asymmetric units. Therefore, in principle, the entire helical structures should be constructed from the deposited asymmetric units by a specified helical symmetry. In the case of crystal structures, a space group and six lattice constants are defined in the keyword 'CRYST1' in the PDB for calculating all symmetric units in the unit cell. However, owing to the lack of a simple, complete and widely accepted system for helical symmetry, no keyword has been set to define the helical symmetry and helical parameters are implicitly stated in the comments. Furthermore, the creation of the entire helical structure relies on a set of translational and rotational matrices which are hard-coded in the PDB.
We have applied the augmented 1-D helical scheme along with the suggested guidelines to all helical structures deposited in the PDB. A small portion of the results is given in Table 1 and a complete list is available on the web at http://protein3d.ncifcrf.gov/helicalSymmetry/table1.html. The newly determined helical parameters [n 1, n 2, twist, rise] not only directly reflect the helical characteristics but also provide sufficient information for constructing an entire helical structure from given asymmetric units. Here, we propose four helical parameters in a new keyword named HELSYM in the PDB for the specified helical symmetry to avoid using matrices and comments when specifying a helical symmetry.
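As a rough illustration of how four parameters can place every subunit, the following Python sketch assumes one particular convention: subunit (m1, m2) is rotated about the z axis by m1·twist + m2·(360 + n2·twist)/n1 degrees and shifted along z by (m1 + m2·n2/n1)·rise. This convention is an assumption chosen because it reproduces the parameter sets quoted in this section; it is not the transformation matrix of Fig. 2, and the function names are hypothetical.

```python
import math


def subunit_placement(n1, n2, twist, rise, m1, m2):
    """Return (rotation in degrees, z shift) for subunit (m1, m2).

    ASSUMED convention: m1 steps along the primary n1-start helix,
    m2 steps from one primary helix to the next around the axis.
    """
    angle = m1 * twist + m2 * (360.0 + n2 * twist) / n1
    z = (m1 + m2 * n2 / n1) * rise
    return angle, z


def apply_placement(point, angle_deg, z_shift):
    """Rotate a point (x, y, z) about the z axis, then translate along z."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z + z_shift)
```

Under this assumption, one m2 step in a [10, −5, 5.5, 32.0] description is a rotation of 33.25° with a shift of −16.0 Å, and thirteen m2 steps in a [13, 3/2, 0.0, 80.0] description return to the starting angle with a shift of 120 Å, which is one and a half times the rise.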
In the PDB, the axial symmetry of a helical structure is conventionally along the z axis and passes through the origin (0, 0, 0). The helical symmetry specified in the PDB usually follows the rotohelical description, which provides the helical parameters (twist, rise) for a single helix or n helices related by a C n rotational axis. The manually extracted data, the helical twist and the helical rise, are first verified against the helical transformation matrices if these are also given in the PDB file. The corresponding augmented 1-D helical parameters will then be either [1, 0, twist, rise] or [n, 0, twist, rise] for one-start helices or n-start helices, respectively. We then use our graphics tool named PNAS (Protein Nanoscale Architecture by Symmetry), in-house software running under both Linux and Windows, to search for the first four protofilaments with the largest contact between the asymmetric units and determine their n 1, n 2, twist and rise values accordingly. Next, we select among them the helical protofilament with the lowest absolute value of twist angle as the primary n 1 helix and its associated twist and rise are set as the 1-D helical parameters [twist, rise]. Finally, the highest contact protofilament other than the chosen primary helix is assigned as the secondary n 2 helix to complete the determination of [n 1, n 2, twist, rise].

Table 1
Helical parameters for helical structures solved by X-ray fibre diffraction and EM imaging.
The first and second columns provide the PDB code and molecular name of the helical structure. The third column indicates whether the helical assembly was solved by X-ray fibre diffraction (XFD) or cryo-EM imaging (CryoEM). If the structure was solved by cryo-EM imaging, the PDB file records atomic models which have been docked into the corresponding EM map. If the EM map was deposited in EMDB, the entry gives its EMDB ID code. The next three columns (C n, twist, rise) report the helical symmetry if specified in the PDB or EMDB. The following two columns give the unified helical symmetry of [n 1, n 2] by following the symmetry-determination guideline (provided in the text). Next are the newly determined helical parameters [twist, rise] of the 1-D helical system and the lattice constants (a, b, γ) of the 2-D helical system.

Figure 5
A helical structure with various distinct descriptions of helical assembly. The helical structure of bacteriophage major coat protein (PDB entry 1ifd) is used as an example here to illustrate the variations. In the figure, an individual asymmetric subunit is presented by a color isosurface entity and each color represents a helix specified by the helical symmetry. The first [5, 0]
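The selection steps described above (rank by contact, keep the best four, take the smallest absolute twist as primary, then the highest-contact remaining helix as secondary) can be sketched as follows. This is not the PNAS implementation; the dictionary keys and the function name are hypothetical, and the sketch returns only the chosen helices, not the sign of n 2.

```python
def choose_primary_secondary(candidates):
    """Pick primary and secondary protofilaments from candidate helices.

    Each candidate is a dict with hypothetical keys:
      'start'    number of helices in the set (the n of an n-start helix)
      'twist'    twist angle in degrees
      'contacts' extent of contact between adjacent asymmetric subunits
    """
    # Sort by extent of contact, largest first, and keep the best four.
    ranked = sorted(candidates, key=lambda c: c["contacts"], reverse=True)
    best_four = ranked[:4]
    # Primary: the one among the best four with twist closest to zero.
    primary = min(best_four, key=lambda c: abs(c["twist"]))
    # Secondary: the highest-contact protofilament other than the primary.
    secondary = next(c for c in ranked if c is not primary)
    return primary, secondary
```

With made-up contact counts and twists for three candidates (a six-start helix with the most contacts, a five-start helix, and an 11-start helix with the smallest absolute twist), the function returns the 11-start helix as primary and the six-start helix as secondary.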
In Fig. 5, the helical assembly of the bacteriophage major coat protein (PDB entry 1ifd; Marvin, 1990) is assigned three different 1-D helical symmetries: [5, 0, −33.2, 16.0], [5, 0, 38.8, 16.0] and [10, −5, 5.5, 32.0]. The first symmetry [5, 0] corresponds to the rotohelical assignment in the PDB. In this assignment, an individual helix is not a protofilament, since no contact between helical subunits (shown in the same color) is observed. Therefore, the helical structural characteristics are not conveyed clearly by its helical parameters. The second [5, 0] symmetry is based on the first protofilament, the one with the largest number of contacts between helical subunits. However, the guideline suggests using the [10, −5] helical symmetry to represent this structure. This symmetry is advantageous for three reasons: firstly, the [10, −5] symmetry corresponding to the second and first protofilaments in the structure presents the best structural characteristics, unlike the second [5, 0] assignment which only contains information for the first protofilament; secondly, by looking down the helical axis the structure is composed of ten helices, not just five; and thirdly, another inovirus coat protein (PDB entry 1hgv; Pederson et al., 2001) also gives a similar structure with [11, −6] helical symmetry. Here, the primary (11-start) helix is the first protofilament and the secondary (six-start) helix is the second protofilament. These two examples show that the augmented 1-D helical representation not only describes similar helical structures by similar parameters but at the same time also differentiates between similar helical organizations. In Fig. 6, three additional helical structures are depicted in 1-D symmetry. Pictorial descriptions with 1-D helical symmetry for the complete list of known helical structures can be accessed from links on the webpage http://protein3d.ncifcrf.gov/helicalSymmetry/table1.html.
For helical structures deposited in the EMDB (Lawson et al., 2011) only the helical classification is indicated; no helical symmetry is explicitly given in the data bank. However, it is not difficult to deduce the 1-D helical parameters if the helical axis can be determined from the EM density map. The graphics tool PNAS can be utilized to assign 1-D helical symmetry to the EM structure. Firstly, we determine the location of the primary protofilament by visual inspection of the density map when shown in various isosurface presentations. Looking down the EM map along the helical axis, the number of assigned protofilaments can be counted to give the n 1 helical parameter. PNAS then determines the position of the helical axis using either the given map center or the calculated coordinates of the center of density. Next, we determine the [twist, rise] pair for the visually assigned primary protofilament by calculating the correlation coefficient (Grubisic et al., 2010) between the original map density and the helically transformed density specified by a pair of manually adjustable parameters [twist, rise]. In this procedure, we follow the guideline to keep the helical twist as close to zero as possible and at the same time change the twist and rise to reach the best match as guided by visual superimposition between the original and the transformed density map in isosurface presentation. The optimal correlation coefficient should be very close to 1.0. Now, we can use the newly determined helical parameters n 1, twist and rise to determine n 2: simply try integer numbers between −n 1 and n 1 and perform the 1-D helical transformation to determine n 2 from the result of the superimposition as stated above.

Figure 6
Pictorial 1-D helical symmetry description of three helical structures determined by X-ray fibre diffraction. The three depicted helical structures are the filamentous bacteriophage ph75 (PDB entry 1hgv), F-actin (PDB entry 2zwh) and the cucumber green mottle mosaic virus (PDB entry 1cgm). The same isosurface and color definition described in Fig. 5 is used for the three helical assemblies. The pictorial presentation directly indicates the helical signatures of the three helical structures in 11, two and 16 colored protofilaments, which are respectively implied by the specified [11, −5], [2, −1] and [16, −1] helical symmetry.

Figure 7
The procedure of serial density-map superimpositions illustrates the determination of the 1-D helical symmetry from a helical EM structure. The bacteriophage fd coat protein (EMD-1240) is used here for demonstration. The original EM structure is displayed as a yellow isosurface and the symmetry-transformed density map is shown as a red isosurface. A first superimposition, denoted by the 1-D helical symmetry operation [1, 0, 0.0, 34.8], gives the result of a 34.8 Å translation only. Following an additional rotation of 2.6° denoted by [1, 0, 2.6, 34.8], the outcome of perfect superposition determines the [twist, rise] of 1-D helical symmetry for the set of ten-start primary protofilaments. Applying a tenfold rotation [10, 0, 2.6, 34.8] then gives the third superimposition. Finally, an extra translation defined by n 2 = 5 completes the determination of 1-D helical symmetry [10, 5, 2.6, 34.8] for the EMD-1240 structure.
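The two numerical ingredients of the EM procedure, a correlation coefficient between two density samples and a scan over integer n 2 candidates, can be sketched as follows. A plain Pearson correlation stands in here for whichever coefficient PNAS actually computes, and `score` is a hypothetical callback that would superimpose the original and transformed maps for a trial n 2 and return their correlation.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equally sized lists of density values."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def scan_n2(n1, score):
    """Try every integer n2 with -n1 < n2 < n1; return the best-scoring one."""
    return max(range(-n1 + 1, n1), key=score)
```

In practice `score` would wrap the map transformation and the call to `correlation`; a value close to 1.0 for the selected n 2 corresponds to the near-perfect superimposition described in the text.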
To illustrate the procedure of 1-D helical symmetry determination, four superimpositions between the original EM (EMD-1240; bacteriophage fd coat protein B; Wang et al., 2006) and transformed density maps relating to the four stages are given in Fig. 7. The first two superimpositions illustrate the determination of [twist, rise] for the assigned primary protofilament. The last two superimpositions illustrate the determination of the secondary protofilament n 2, giving the 1-D helical symmetry [10, 5, 2.6, 34.8]. The 1-D helical symmetry for each helical EM map deposited in the EMDB has been determined with the graphics tool PNAS. Some of the results are listed in Table 1 and a complete list is reported on the webpage http://protein3d.ncifcrf.gov/helicalSymmetry/table1.html. To highlight the importance of a comprehensive helical scheme, Table 2 provides a comparison between the reported helical symmetries determined in the EM reconstruction and the new helical symmetries for six polymorphic helical structures of the microtubule. The results clearly show that the inherent structural characteristics of the microtubule obtained by the new helical scheme can directly discover polymorphic ensembles. The very similar surface lattice within different helical symmetries implies very similar subunit-subunit interactions which the microtubule uses to assemble into divergent helical organizations.
A helical description using four parameters [n 1, n 2, twist, rise], determined according to the augmented 1-D helical symmetry guidelines (in §2), provides the helical signature of the structure. This is because the two sets of defined helices, the n 1-start and the n 2-start, correspond to the two sets of protofilaments. However, will the guidelines also always give a unique [n 1, n 2] combination for a given helical structure? The answer is yes, as illustrated by the example below. The docked atomic model of the bacterial flagellar hook (Fujii et al., 2009; PDB entry 3a69) contains an asymmetric subunit with three protein domains spanning the inner, middle and outer layers of the helical cryo-EM map. In terms of individual protein domains, the best protofilament of each domain yields an 11-start, five-start and six-start helix, respectively, from the inner to the outer layers. Even though different helical descriptions of different layers are observed, the guidelines still give an unambiguous helical symmetry of [11, −6, −7.31, 45.32] for this structure. The assignment is based on two clear elements in the structural data: the 11-start helix (the third protofilament in the protein) has a twist angle closest to zero and the six-start helix is the first protofilament. In Fig. 8

Table 2
Comparison of helical parameters between the reported symmetries determined in the EM reconstruction and the new helical symmetries for six polymorphic microtubule structures.
See Table 1 for column name description. The last column gives the helical radius where the 2-D lattice was defined.

Figure 8
1-D helical symmetry determination for a complicated helical structure. This example illustrates that the guideline defined for 1-D helical symmetry determination is capable of giving a unique symmetry assignment [n 1, n 2, twist, rise] to an intricate helical structure. The asymmetric subunit of the bacterial flagellar hook (Fujii et al., 2009; PDB entry 3a69; EMD-1647) contains three protein domains spanning the inner, middle and outer layers of the helical structure. Based on individual domains, the best protofilament in each domain forms a set of six-start, five-start and 11-start helices, respectively, from the outer to the inner layer of the helical structure. The pictorial helical descriptions for three different symmetry assignments are given under [6, −1], [5, −1] and [11, −6] symmetry. The guideline prefers the [11, −6] symmetry assignment simply because the 11-start helix has a twist angle closest to zero and the six-start helix is the protofilament with the largest number of contacts between the asymmetric units along the protofilament.
Not all helical structures have unambiguous primary protofilaments, especially when the growth mechanism does not follow a helical path. The tubular structure of the HIV-1 capsid protein (CA; Byeon et al., 2009) is such an example. In solution, CA forms a dimer via the association of its C-terminal domain (CTD). The cryo-EM tubular structure (EMD-5136) reveals that the basic unit is a trimer of CA dimers with a pseudo-threefold at the CTD-CTD interfaces and the CA dimer is shared between two trimers. Following our guideline, we obtain a helical symmetry of [24, 13, 7.39, 165.78] for the CA tubular structure. The unit cell depicted by the [24, 13] symmetry does not correspond to the observed CA hexamer; however, after applying the symmetry-manipulation rules (n 1 = n 1 − n 2, [n 1, n 2] swapping and n 2 = n 2 + n 1) the new helical symmetry of [13, 2, −11.00, 89.80] gives the cell dimensions of the hexamer. The surface lattices for both helical symmetries are highlighted in red in Fig. 9. The fact that the assigned asymmetric units in both helical symmetries ([24, 13] and [13, 2]) do not correspond to the assembly unit implies that the path of a trimer of CA dimers is not helical. In this case, our guidelines will fail to offer an unambiguous helical specification.
The guidelines have two limitations in fulfilling the aim that every helical structure would have a unique [n 1, n 2, twist, rise] helical symmetry. The first arises when the primary protofilament is ambiguous, as discussed above, and the second is encountered when there is a continuous helical density along a protofilament in the cryo-EM structure rather than a clear boundary between asymmetric units. Under such circumstances, for a determined [n 1, n 2] symmetry the helical structure can be described by an infinite number of [twist, rise] pairs, which are always related by a constant. The helical structure of the tubular Aβ 1-42 amyloid with a hollow core (Miller et al., 2010; Zhang et al., 2009) is an example of this limitation. The cryo-EM structure gives a [2, 0] (or [2, 1]) helical symmetry and the two helical parameters [twist, rise] = [−3.75c, 4.8c], where c is a constant.

Relevance to experimental diffraction patterns
The diffraction patterns of helical structures consist of a series of layer-lines. Assuming that the layer-lines do not overlap, each layer-line is the result of diffraction by a set of n-start helices. The position of the peak with the maximum diffraction intensity in each layer-line can be indexed to correspond to a node in the helical net. The relationship between the 1-D helical system [n 1, n 2] symmetry and the diffraction pattern is detailed in Figs. 4(a) and 4(b). The two figures illustrate the different assignments of n 10 and n 01, which give different helical symmetries, [3, 1] and [3, −2], for the same structure that has a simple helical symmetry of t = 4 and u = 13. The assignment of n 10 with the first peak close to the equator (n) of the diffraction pattern is consistent with the guideline for selecting the n 1-start helices with a twist angle closest to zero. The assignment of n 01 to the position of the diffraction which is close to the origin is also likely to constitute a main protofilament of the given helical structure.
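For the example with t = 4 and u = 13, the Bessel orders permitted on a given layer-line by the selection rule l = tn + um can be enumerated with a short Python sketch. The function name and the cutoff `n_max` are illustrative choices, not part of the original work.

```python
def bessel_orders(l, t=4, u=13, n_max=13):
    """Bessel orders n with |n| <= n_max allowed on layer-line l.

    An order n is allowed when l = t*n + u*m has an integer solution m,
    i.e. when (l - t*n) is divisible by u.
    """
    return [n for n in range(-n_max, n_max + 1) if (l - t * n) % u == 0]
```

With these defaults, the order n = 3 appears on layer-line l = −1 (next to the equator) and the order n = 1 appears on l = 4, which matches the [3, 1] assignment of n 10 and n 01 when the pattern is read in terms of −l.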

The minimal number of helices needed for a complete helical structure description
From the rotohelical transformation, we learnt that a single (one-start) helix description is not always sufficient to generate the entire helical structure from given asymmetric units. The helix may need to be related by a C n rotational symmetry, which implies that a minimum of n helices are required to cover the entire helical assembly. Given an [n 1 , n 2 ] symmetry, there are n 1 or n 2 assigned helices for the entire helical description. To determine the minimal number of helices for complete structural description (or to be correlated with the rotohelical transformation), the [n 1 , n 2 ] symmetry is reduced to an equivalent symmetry with n 2 = 1 or 0. In the case of a reduced [n 1 , 0] symmetry, the new n 1 is the minimal number of helices.
It is straightforward to deduce the minimal number of helices for a given [n 1, n 2] helical symmetry. If the numbers n 1 and n 2 do not have a common factor, the symmetry can always be reduced to a one-start helix description by using a combination of the swap and the equivalence rules of n = h n 10 − k n 01 as described in §2. For example, [7, 3] can be reduced to [1, 3] with h = 0, k = 2. Otherwise, the largest common factor between n 1 and n 2 is the minimal number of helices for a complete helical structure description. For example, [8, 4] can be reduced to [4, 0] with the number 4, the largest common factor of 4 and 8.
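Since the minimal number of helices is the largest common factor of n 1 and n 2, it can be computed directly with the greatest common divisor. The function name below is a hypothetical label for this one-line sketch.

```python
import math


def minimal_number_of_helices(n1, n2):
    """Largest common factor of n1 and n2 (signs are ignored by math.gcd)."""
    return math.gcd(n1, n2)
```

For the examples in the text this gives 1 for [7, 3] and 4 for [8, 4]; for the [10, −5] description of PDB entry 1ifd it gives 5, in line with the fivefold [5, 0] rotohelical assignment of the same structure.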

A new description of helical symmetry
Despite the fact that so many helical structures have been determined, a universal formulation for representing helical symmetry is still lacking. The absence of agreement in the community has been attributed to three main reasons. The first apparent reason is a consequence of the fact that helical symmetries have been formulated in distinct ways to fulfill a particular requirement or convenience in different structure-determination methods. The diversified helical representations can be classified into two commonly adopted helical schemes named the 1-D and 2-D helical systems. In this study, the two helical schemes were unified into a single helical specification by two constants [n 1, n 2] and we have shown that the two systems are interchangeable and complementary to each other. Because of the simplicity of using one less parameter and the lack of involvement of the axial radius, we suggest using the augmented 1-D helical system with four parameters [n 1, n 2, twist, rise] for representing a helical structure.
The second hurdle for defining a helical description is that a helical structure can be pictured in many ways, i.e. in many [n 1 , n 2 ] combinations as two (n 1 -start and n 2 -start) sets of helices. However, in principle, the generalized guidelines for describing a helical symmetry are expected to give a unique [n 1 , n 2 ] specification that reflects the characteristics of the structure, although in a limited number of cases a unique specification is impossible.
The fact that no standard helical symmetry has been accepted so far can be attributed to the last obstacle: a complete coverage of helical description includes the capability of handling helical discontinuity (a seam). However, building an entire helical construct with a seam from given asymmetric units requires no additional modification in our formulation of helical transformation. Instead, a helical structure with a seam is simply reflected in the value of n 2 . By definition, the helical discontinuity indicates that n 2 is no longer an integer but a rational number.

Presentation of a structure with a helical discontinuity
An implicit requirement of the 2-D helical system (§2.1) is that in a seamless helical arrangement [n 1, n 2] must be specified by integer numbers. By treating two consecutive asymmetric subunits in the primary protofilament as a new single asymmetric subunit, the new augmented 1-D helical symmetry becomes [n 1, n 2/2, 2 × twist, 2 × rise], which is equivalent to the original helical symmetry [n 1, n 2, twist, rise] except that the asymmetric units are doubled in size. When the tubulin subunit is treated not as a dimer of subunits but as a single subunit by ignoring the small difference between the α and β subunits (Sui & Downing, 2010), we do not encounter the microtubule seam problem. However, when treating the dimer as an asymmetric subunit in the new 1-D helical symmetry, helical structures with an odd number for the n 2 symmetry (in single subunit representation) create a seam with a new rational n 2.
The microtubule EM structure (Cochran et al., 2009; EMD-5038) presents such a helical discontinuity when treating the dimer of subunits as the asymmetric unit. The augmented 1-D helical symmetry in four parameters [13, 3/2, 0.0, 80.0] is sufficient to generate the entire helical structure with a seam, based on the helical transformation matrix summarized in Fig. 2. Owing to the helical discontinuity, the repetitive asymmetric unit is no longer an identical unit. Instead, a complete round of n 1 subunits (13 dimers in the microtubule case) now constitutes the identical unit in the helical structure with a seam. Therefore, the subunit coordination index [m 1, m 2] can no longer have an index with m 1 ≥ n 1 when applying the helical transformation to generate the repetitive subunits for a helical structure with a seam.
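The doubling of the asymmetric unit and the resulting rational n 2 can be sketched with exact fractions. The function names are hypothetical, and the single-subunit input [13, 3, 0.0, 40.0] used in the note below is an assumption obtained by halving the dimer values quoted in the text.

```python
from fractions import Fraction


def merge_pairs(n1, n2, twist, rise):
    """Treat two consecutive subunits of the primary protofilament as one.

    [n1, n2, twist, rise] becomes [n1, n2/2, 2*twist, 2*rise];
    n2 is kept as an exact fraction so that a half-integer stays visible.
    """
    return n1, Fraction(n2, 2), 2 * twist, 2 * rise


def has_seam(n2):
    """A non-integer (rational) n2 signals a helical discontinuity."""
    return Fraction(n2).denominator != 1
```

Starting from the assumed single-subunit description [13, 3, 0.0, 40.0], `merge_pairs` returns [13, 3/2, 0.0, 80.0] and `has_seam` reports a seam, whereas an even single-subunit n 2 would yield an integer and no seam.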
A seam in a helical structure can be classified visually with respect to its helical axis into a strictly vertical seam or a seam that wraps around the helix. The microtubule case above is an example of a vertical seam. Under the restriction that only a rational n 2 and integer n 1 > |n 2 | are allowed in the augmented 1-D helical representation, the corresponding helical structure always produces a vertical seam and the handedness of the seam is determined by the sign of n 2 , with positive indicating a left-handed seam and negative a right-handed seam. In contrast, a seam described by a rational n 1 > |n 2 | and integer n 2 should correspond to the type of seam that wraps around the helix.

Application to polymorphic structural assemblies
Both the 1-D and 2-D helical systems are designed to create helical assemblies from asymmetric subunits with specified helical parameters. The conformational heterogeneity of molecular assemblies is known to set limits on solving cryo-EM structures at high resolution. Polymorphism is particularly problematic in the determination of structures with helical symmetry since even a slight deviation in the interactions between two asymmetric subunits will create distinct structures with different symmetries. We have seen such an example in Table 2 for the microtubule structure. The question is can all such polymorphic structures be generated based on a single helical structure which is given in an atomic model or an EM map? The answer is yes, because the interactions between the asymmetric subunits are preserved in the definition of the 2-D helical system. Thus, to create distinct polymorphic structures with almost the same subunit-subunit interactions we only need to change the specific [n 1 , n 2 ] helical symmetry.

Conclusions
In this paper, we give two helical formulations (augmented 1-D and 2-D) to describe a helical structure. Unlike the rotohelical transformation (1-D formulation) with a helical plus an additional rotational operation, a new augmented 1-D formulation with two consecutive helical operations enables unification with the widely adopted 2-D formulation, giving a common helical symmetry descriptor with two integers [n 1, n 2]. The new formulation requires only four parameters [n 1, n 2, twist, rise] for the augmented 1-D helical system and five parameters [n 1, n 2, a, b, γ] for a 2-D helical system to generate the entire structural assembly from given asymmetric units. We propose using the augmented 1-D helical system with four parameters to describe a helical structure owing to its simplicity and independence from the helical radius compared with the 2-D helical system.

In terms of a helical net representation, a helical structure with an [n 1 , n 2 ] symmetry indicates that its organization is specified by two sets of helices (n 1 -start and n 2 -start). Because many different [n 1 , n 2 ] combinations exist for the same structure, we suggest general guidelines for selecting a unique [n 1 , n 2 ] symmetry which reflects the structural characteristics of a given helical structure. We provide a computational graphics tool for this purpose which can be used for any helical structure determined by X-ray fibre diffraction or EM imaging.
While there are multiple ways to construct equations that generate the same helical structure, an [n 1 , n 2 , twist, rise] description provides the following advantages: firstly, it provides full helical coverage, including a helical discontinuity (seam) which is indicated by a rational n 2 ; secondly, it reflects the structural characteristics of the assembly (formation mechanism) directly by four helical parameters; that is, similar structures give similar parameters; thirdly, the unnecessary error in reproducing the entire helical structures, such as editing wrong transformation matrices in the PDB or in the deposited EM parameters in the EMDB, will be prevented; and lastly, the new helical symmetry is expected to be useful for maintaining a pre-determined helical symmetry in structural refinement as well as for the generation of all 'meaningful' polymorphic structural assemblies from a given helical atomic model or EM density map.