A unified convention for biological assemblies with helical symmetry
^{a}Basic Science Program, SAIC-Frederick Inc., Center for Cancer Research Nanobiology Program, NCI-Frederick, Frederick, MD 21702, USA, and ^{b}Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
^{*}Correspondence email: tsaic@mail.nih.gov
Assemblies with helical symmetry can be conveniently formulated in many distinct ways. Here, a new convention is presented which unifies the two most commonly used helical systems for generating helical assemblies from asymmetric units determined by X-ray fibre diffraction and EM imaging. A helical assembly is viewed as being composed of identical repetitive units in a one- or two-dimensional lattice, named 1D and 2D helical systems, respectively. The unification suggests that a new helical description with only four parameters [n_{1}, n_{2}, twist, rise], which is called the augmented 1D helical system, can generate the complete set of helical arrangements, including coverage of helical discontinuities (seams). A unified four-parameter characterization implies similar parameters for similar assemblies, can eliminate errors in reproducing structures of helical assemblies and facilitates the generation of polymorphic ensembles from helical atomic models or EM density maps. Further, guidelines are provided for such a unique description that reflects the structural signature of an assembly, as well as rules for manipulating the helical symmetry presentation.
Keywords: symmetry; X-ray fibre diffraction; EM density maps; helical assemblies.
1. Introduction
Under physiological conditions, many biomolecules are either organized in functional tubular forms or aggregated in disease-related filaments. Tubular and filamentous structures grow with a helical symmetry. The determination of helical structures is important because it provides clues to functional regulation and to the mechanisms of polymerization and can help in figuring out how to prevent unwanted disease-related fibril aggregation. The most famous example is the determination of the Watson–Crick double-helical DNA structure in 1953 (Watson & Crick, 1953), which created a new era in the history of molecular biology. In proteins, even a slight difference in the interactions between molecules is sufficient to create similar filamentous or tubular structures with distinct helical symmetries. For this reason, structural polymorphism is a common characteristic of tubular or fibril entities. Depending on the specificity and rigidity of the interacting molecules, some, such as the amyloidogenic peptide Aβ_{1–40} (Sachse et al., 2008; Schmidt et al., 2009), can exhibit a broad spectrum of polymorphic assemblies, whereas others only show limited variability, as in the case of microtubules (Sui & Downing, 2010). This underscores the importance of revealing the structural characteristics of helical assemblies directly from a simple helical symmetry description.
Helical symmetry can be formulated in many different ways. Helical transformations can be classified into two categories: one-dimensional (1D) helical systems and two-dimensional (2D) helical systems. In structural determination by X-ray fibre diffraction (Klug et al., 1958), a helical structure is described as a set of n 1D molecular helices related by an n-fold axial symmetry. However, in both systems, if the helically repeating motif has C_{2} symmetry, the helical structure has an additional dyad symmetry (Klug et al., 1958). Given the asymmetric units, the helical assembly can be constructed by rotohelical transformations which are defined by a specified line group (Damnjanović et al., 2007). Because of the prevalence of structural polymorphism (DeRosier et al., 1999) and helical discontinuities (seams; Kikkawa, 2004) in electron-microscopy (EM) images, a description of helical assemblies by a rolled planar 2D lattice sheet was devised to solve the EM structure in reciprocal space. In this representation, the framework of a helical structure is viewed as a helical net; that is, a set of equivalent points wrapped around a cylindrical surface. Various 2D lattice wrappings were defined by a circumference vector c = n_{1}a + n_{2}b, where n_{1} and n_{2} are two integer constants and a and b are the 2D lattice vectors. On the other hand, the reconstruction of helical structures in real space is typically based on rotohelical transformations which are applied iteratively using the single-particle method (Sachse et al., 2007; Egelman, 2007, 2010).
A simple convention for defining the helical symmetry of biological assemblies has been suggested in the remediated Protein Data Bank (PDB; Lawson et al., 2008) and EM Data Bank (Heymann et al., 2005) archives. Both have used a definition of rotohelical transformation that does not fully capture the underlying symmetry properties of helical assemblies. It is therefore not surprising that two very similar tubular structures might be described by very different helical parameters that provide no clue to the fact that they are actually quite similar. This is the case for the two bacteriophage major coat protein helical tubes determined by X-ray fibre diffraction [PDB entries 1hgv (Pederson et al., 2001) and 1ifd (Marvin, 1990)]; the first is presented as a one-start and the second as a five-start helical tube with each helix related by a fivefold rotational symmetry. The discrepancy is understandable because the two PDB structures were the outcome of structure-determination procedures in which the helical symmetry was preset in the minimization procedure. This shortcoming underscores the importance of a standard system that would report helical structures and provide parameters that reflect their structural characteristics. It appears that to date an unambiguous, simple and systematic standard for defining a unique helical specification for constructing helical assemblies from asymmetric units is lacking.
In this paper, we present a new unified convention for the construction of helical assemblies from asymmetric units determined by X-ray fibre diffraction and EM imaging. The unification is made possible by an augmented 1D helical system (described below) that extends the traditional 1D helical scheme to adopt the helical symmetry descriptor [n_{1}, n_{2}] which is used in the 2D helical system. A helical structure can be prepared by rolling a planar sheet composed of identical 2D unit cells (Stewart, 1988). In order to create a seamless 2D lattice tube, two integer constants [n_{1}, n_{2}] define the wrapping process: n_{1} refers to the number of cells that are needed to complete a full round of cylinder wrapping and n_{2} to the number of cells sliding along the cell edge after the wrapping. The helical symmetry of the tubular structure is explicitly determined by [n_{1}, n_{2}] and the corresponding 2D wrapping transformations can be found in the literature (Tsai et al., 2006; Kikkawa, 2004).
In a traditional 1D helical system, a helical structure is depicted as either a one-start or an n-start helical structure (Egelman, 2007; Klug et al., 1958). For a one-start helical structure, the assembly consists of only a single helix with two helical parameters, twist (φ) and rise (δ); these denote the transformation of the 1D unit cell which is used to build the entire structure. In fibre diffraction, a one-start helix is formulated as u units in v turns with a helical repeat distance of c, which straightforwardly gives φ = 2πv/u and δ = c/u. An n-start helical structure has n helices related by an n-fold axial symmetry (C_{n}), with the axis coinciding with the helical axis. In the augmented 1D helical system described below, in addition to the rotational operation of the C_{n} symmetry there is an extra translational operation along the helical axis. In the 2D helical system this extra translational operation is implicitly included in the 2D wrapping transformation; however, it is ignored in the traditional 1D helical scheme. This prevents the 1D and 2D systems from being unified in a common helical symmetry description. In contrast, the helical symmetry in the augmented 1D helical system with the four parameters [n_{1}, n_{2}, twist, rise] is defined by two consecutive helical (screw) operations: the first helical operation is specified by two helical parameters [twist, rise] exactly as in the traditional 1D helical transformation and the second screw operation is defined by the two constants [n_{1}, n_{2}]. n_{1} refers to the n_{1}-fold rotational symmetry exactly as in the traditional 1D helical transformation and n_{2} specifies the translation part of the second screw operation. We will illustrate the operational transformations as well as the interconversion between the augmented 1D helical system and the 2D helical system below.
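As a small numerical illustration of the fibre-diffraction description of a one-start helix, the sketch below converts u units in v turns over a repeat distance c into a twist and a rise. It assumes that the repeat c spans all u units, so that the rise per unit is c/u; the function name and the repeat distance of 130 are hypothetical and chosen only for the example.

```python
import math

def one_start_parameters(u, v, c):
    """Return (twist in radians, rise) for u units in v turns over repeat c."""
    twist = 2.0 * math.pi * v / u   # v full turns shared among u units
    rise = c / u                    # assumed: repeat c shared among u units
    return twist, rise

# Example: 13 units in 4 turns with a hypothetical repeat distance of 130.
twist, rise = one_start_parameters(u=13, v=4, c=130.0)
print(math.degrees(twist), rise)
```

With these inputs each unit is rotated by 4/13 of a turn and translated by one thirteenth of the repeat.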
2. Methods
In our definition, a helical structure is composed of repetitive identical units, similar to a single crystal which is built from a three-dimensional (3D) lattice. The definition of repetitive relates to the entire helical structures, which are built from a unit cell with a specified helical symmetry. The unit cell is either a 2D lattice or a 1D line segment. The definition of identical implies that each repetitive unit in the construct has exactly the same environment; that is, the independent variables of a helical structure include only parameters involved in helical transformation and coordinates of asymmetric units within a unit cell. Identical implies that when evaluating the energy of the assembly there is no need to include the interactions between all units but just the interactions between one unit cell and its surrounding cells.
Given a 3D unit cell, it is straightforward to generate the entire single crystal with fractional coordinates. The newly generated fractional coordinates are the number of cells (n_{a}, n_{b}, n_{c}) away from the origin (0, 0, 0) along each edge. The Cartesian coordinates of a new cell can be converted from the fractional coordinates by an orthogonalization matrix (Evans, 2001) computed from the 3D lattice constants a, b, c, α, β and γ. However, unlike the 3D unit cell, a specified helical transformation is needed to generate the helical assembly from a given 1D or 2D unit cell. In the following, the equations for the 2D wrapping and the augmented 1D helical transformation will be derived and explained in detail.
2.1. 2D helical system
As stated in §1, a planar sheet of 2D lattice can be wrapped into a tube. If all sheet units are identical, the two constant integers n_{1} and n_{2} are sufficient to define all possible distinct tubes obtained by rolling it. Here, n_{1} is the number of cells along one edge of the 2D lattice (a) which are required to make a full round of wrapping and n_{2} refers to the number of cells sliding along the other edge of the lattice (b) after wrapping.
If we place edge a along the x axis of the Cartesian coordinate and a 2D lattice (a, b, γ) is placed on the xy plane, the wrapping equations for an [n_{1}, n_{2}] tube are
where
(x_{w}, y_{w}, z_{w}) are the wrapped Cartesian coordinates of the tube and (x_{c}, y_{c}, z_{c}) are the Cartesian coordinates of the associated 2D sheet. In the wrapping equations, the 2D lattice sheet is at a distance of the tube radius (r) from the tube axis (t_{x}, t_{y}, 0). The helical transformation is implicitly specified by the helical twist α and the helical rise x_{c}t_{x} + y_{c}t_{y}. Fig. 1 provides a graphical summary of the 2D helical transformation. A more detailed description has been given previously (Tsai et al., 2006). Given asymmetric units in a 2D lattice and a helical symmetry specified by the 2D helical system in five parameters [n_{1}, n_{2}, a, b, γ], one can build a complete helical construct based on the 2D helical transformation equations formulated above.
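The wrapping described above can be sketched in cylindrical coordinates. This is only an illustration under stated assumptions, not the wrapping equations of this section: the circumference vector is taken as c = n_{1}a + n_{2}b with length 2πr, the tube axis is taken as the in-plane unit vector perpendicular to c (its orientation is an assumption), and z_{c} is treated as a radial offset from the sheet. The function name and the lattice constants in the example are hypothetical.

```python
import math

def wrap_to_cylinder(point, a, b, gamma, n1, n2):
    """Wrap a sheet point (x_c, y_c, z_c) onto an [n1, n2] tube.

    Returns (radius, angle, height) about the tube axis; gamma is in radians.
    """
    # Lattice vectors with edge a along x and the sheet in the xy plane.
    ax, ay = a, 0.0
    bx, by = b * math.cos(gamma), b * math.sin(gamma)
    # Circumference vector c = n1*a + n2*b; its length equals 2*pi*r.
    cx, cy = n1 * ax + n2 * bx, n1 * ay + n2 * by
    c_len = math.hypot(cx, cy)
    r = c_len / (2.0 * math.pi)
    ux, uy = cx / c_len, cy / c_len   # unit vector around the tube
    tx, ty = -uy, ux                  # unit tube axis (sign is an assumption)
    x, y, z = point
    alpha = (x * ux + y * uy) / r     # helical twist of this point
    height = x * tx + y * ty          # helical rise x_c*t_x + y_c*t_y
    return r + z, alpha, height

# Example: the sheet point n1*a + n2*b closes the tube (full turn, no rise).
a, b, gamma, n1, n2 = 50.0, 40.0, math.radians(80.0), 11, 3
end = (n1 * a + n2 * b * math.cos(gamma), n2 * b * math.sin(gamma), 0.0)
radius, angle, height = wrap_to_cylinder(end, a, b, gamma, n1, n2)
print(radius, angle, height)
```

The closing point returning to the same height after exactly one turn is the condition for a seamless tube.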
2.2. Augmented 1D helical system
Instead of rolling a planar 2D sheet, a helical structure can also be expressed by a single helix or n helices, with the n helices related by an n-fold screw axis instead of just a rotational axis. Because the helical assembly must consist of identical subunits, the rotational part of the screw axis must display a C_{n} rotational symmetry and the translational part should be limited by some discrete numbers. In the augmented 1D helical system, the four parameters [n_{1}, n_{2}, φ, δ] indicate that there are n_{1} helices in the assembly, with each individual helix characterized by a unit twist (φ) and a unit rise (δ). Because the helices are also related by an n_{1}-fold screw axis, each helix denoted by m_{1} = 0, 1, 2, …, n_{1} − 1 has an additional twist of m_{1}(2π/n_{1}) and a rise of m_{1}(n_{2}/n_{1})δ. Note that the rise, which is specified by n_{2} with a quantity of (n_{2}/n_{1})δ, was not included in the traditional 1D helical system. Note also that m_{1} = n_{1} refers back to the first helix as specified by (φ, δ), which will give n_{2}δ rise after a complete round of n_{1} rotations. In the 2D helical system, this corresponds to the number of cells involved in the helix sliding after a complete wrapping.
A helical structure in the symmetrical construct is identified by the cell coordinates [m_{1}, m_{2}]. The asymmetric units are given in cell [0, 0] and an [m_{1}, m_{2}] cell is located in the m_{1} helix m_{2} units away from the cell [m_{1}, 0] along the helix. If the helical axis is parallel to the y axis and passes through the origin (0, 0, 0), the helical transformation equations for an [m_{1}, m_{2}] cell in an [n_{1}, n_{2}] helical construct are
where
(x_{w}, y_{w}, z_{w}) are the transformed Cartesian coordinates for the cell [m_{1}, m_{2}] and (x_{c}, y_{c}, z_{c}) are the Cartesian coordinates of asymmetric units in cell [0, 0]. h_{φ} and α specify the overall rise and twist for the [m_{1}, m_{2}] cell as specified by an n_{1}-fold screw axis with an n_{2}-unit shift. In the case of a helical structure with a single helix, in which n_{1} = 1 and m_{1} = 0, the helical transformation above reduces to a simple helical operation defined by [φ, δ] only. A graphical summary of the augmented 1D helical transformation is given in Fig. 2.
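The transformation of cell [0, 0] into cell [m_{1}, m_{2}] can be sketched as follows. This is a minimal illustration and not a reproduction of the equations of this section. It assumes one particular reading of the n_{1}-fold screw axis: helix m_{1} is displaced by a fractional number of steps, m_{1}n_{2}/n_{1}, along the helical path (so the shift carries a proportional share of the twist) in addition to the rotation of m_{1}(2π/n_{1}). Under that assumption cell [n_{1}, 0] coincides with cell [0, n_{2}]. The function name, the variable `steps` and the test point are hypothetical.

```python
import math

def transform_cell(point, m1, m2, n1, n2, twist, rise):
    """Place a point of cell [0, 0] into cell [m1, m2]; helical axis is y."""
    # Assumed reading: helix m1 is shifted by m1*n2/n1 steps along the
    # helical path, plus a rotation of m1*2*pi/n1 about the axis.
    steps = m2 + m1 * n2 / n1
    alpha = steps * twist + m1 * 2.0 * math.pi / n1   # overall twist
    h = steps * rise                                  # overall rise
    x, y, z = point
    ca, sa = math.cos(alpha), math.sin(alpha)
    # Right-handed rotation about y followed by a translation along y.
    return (x * ca + z * sa, y + h, -x * sa + z * ca)

unit = (30.0, 0.0, 0.0)
params = dict(n1=10, n2=-5, twist=math.radians(5.5), rise=32.0)
closing = transform_cell(unit, 10, 0, **params)   # m1 = n1: back on helix 0
along = transform_cell(unit, 0, -5, **params)     # n2 steps along helix 0
print(closing, along)
```

For n_{1} = 1 and m_{1} = 0 the function reduces to the plain helical operation defined by the twist and rise alone.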
2.3. 2D helical system → augmented 1D helical system
There are four ways to convert a helical system from 2D to 1D: view the continuation of lattice edge b as a helix, view the continuation of lattice edge a as a helix or view the continuation along the vector of a + b or along the vector of a − b. The first is the most convenient choice. By selecting the vector b as an individual helix, the new 1D helical system retains the same symmetry notation as the 2D helical system [n_{1}, n_{2}]. The unit twist φ (in units of radians) and rise δ of the n_{1}-start helices are calculated as
where t_{x}, t_{y} are the components of the helical axis of the 2D helical system and x_{c}, y_{c} are the planar Cartesian coordinates at the cell origin (0, 1). Because the tube axis of the 1D helical tube (along the y axis) is different from that of the 2D helical tube (on the xy plane), the Cartesian coordinates referenced in the 2D system require some transformations in order to correspond to the new 1D system. This can be performed either directly in wrapped Cartesian coordinates or in native fractional coordinates. In Cartesian coordinates, the transformation is equivalent to aligning the 2D helical axis of the xy plane back onto the y axis with the z axis (0, 0, 1) as the rotational axis. Under the right-handed rotational system, the angle θ between the old tube axis and the y axis is calculated as atan(−t_{x}/t_{y}) and the transformations are
where x_{w1}, y_{w1} are the wrapped Cartesian coordinates of the new 1D helical system and x_{w2}, y_{w2} are the wrapped Cartesian coordinates of the old 2D helical system.
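To illustrate the first conversion choice, with lattice edge b viewed as the helix, the sketch below derives a radius, a unit twist and a unit rise from the 2D parameters [n_{1}, n_{2}, a, b, γ]. It assumes that the twist is the component of b along the circumference divided by the radius and that the rise is the component of b along the tube axis; the orientation chosen for the tube axis is also an assumption, so the signs may differ from the convention of this section. The function name is hypothetical.

```python
import math

def helix_along_b(a, b, gamma, n1, n2):
    """Return (radius, twist, rise) of the helix that follows lattice edge b."""
    bx, by = b * math.cos(gamma), b * math.sin(gamma)   # origin of cell (0, 1)
    cx, cy = n1 * a + n2 * bx, n2 * by                  # circumference vector
    c_len = math.hypot(cx, cy)
    r = c_len / (2.0 * math.pi)
    ux, uy = cx / c_len, cy / c_len    # direction around the tube
    tx, ty = -uy, ux                   # tube axis (orientation assumed)
    twist = (bx * ux + by * uy) / r    # radians
    rise = bx * tx + by * ty
    return r, twist, rise

# Example: a unit square lattice wrapped as a [4, 1] tube.
r, twist, rise = helix_along_b(1.0, 1.0, math.pi / 2.0, 4, 1)
print(r, twist, rise)
```

In this example the circumference is √17 lattice units, so the helix along b advances by 2π/17 radians and 4/√17 length units per cell.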
2.4. Augmented 1D helical system → 2D helical system
The conversion from a 1D to a 2D helical system is not as straightforward as the opposite conversion. This is partly because a 2D lattice is loosely defined by a single helix structure and partly because of the necessity to revert from the wrapped tube coordinates back to 2D planar Cartesian coordinates. Given a 1D helical structure, we first calculate the implicit helical radius from the center of mass of the representative units by assuming that the center of the 1D helical assembly is located at the origin (0, 0, 0). Secondly, we either use the original n_{1} of the 1D system or determine a new n_{1} for the 2D helical system and define accordingly two wrapped coordinates, (x_{w1}, y_{w1}, z_{w1}) and (x_{w2}, y_{w2}, z_{w2}), from the 1D helical system to serve as the origin of cells (1, 0) and (0, 1) of the new 2D helical system, respectively. Thirdly, we reverse the two wrapped coordinates back to unwrapped planar Cartesian coordinates, (x_{c1}, y_{c1}, z_{c1}) and (x_{c2}, y_{c2}, z_{c2}). With the new calculated planar coordinates, it is straightforward to calculate the 2D lattice constants as follows:
Fourthly, given the newly determined r, a, b, γ and n_{1}, n_{2}, the new 2D helical tube can be determined by solving the quadratic equation b^{2}x^{2} + 2n_{1}ab cos γ x + (n_{1}a)^{2} − (2πr)^{2} = 0. Finally, all Cartesian coordinates of the old 1D system are reversed back to the new 2D planar coordinates.
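The quadratic equation in the fourth step is what results from requiring the circumference vector n_{1}a + xb to have length 2πr, so its root x is read here as the candidate n_{2}. The sketch below solves it numerically; treating x as n_{2} and rounding it to the nearest integer are assumptions of this illustration, and the function name is hypothetical.

```python
import math

def solve_n2(a, b, gamma, n1, r):
    """Real roots x of b^2 x^2 + 2 n1 a b cos(gamma) x + (n1 a)^2 - (2 pi r)^2 = 0."""
    qa = b * b
    qb = 2.0 * n1 * a * b * math.cos(gamma)
    qc = (n1 * a) ** 2 - (2.0 * math.pi * r) ** 2
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0.0:
        return []   # the radius is too small for this n1
    root = math.sqrt(disc)
    return sorted([(-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa)])

# Example: unit square lattice, n1 = 4, radius of a tube of circumference sqrt(17).
r = math.sqrt(17.0) / (2.0 * math.pi)
roots = solve_n2(1.0, 1.0, math.pi / 2.0, 4, r)
print(roots, [round(x) for x in roots])
```

The two roots correspond to the two wrappings with the same radius; which one applies depends on the handedness of the original 1D structure.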
2.5. Properties of [n_{1}, n_{2}] helical system
We have shown that both the 1D and 2D helical systems can be represented by two integers, [n_{1}, n_{2}], and that the helical assembly can be built through the helical transformation with the associated parameters. However, the helical symmetry specified by these two integers can also be interpreted in a way different from the helical systems' definitions. In the traditional helical description, an assembly with [n_{1}, n_{2}] symmetry can be viewed as two sets of n-start helices in which either the arrangement of the n_{1}-start helices is specified by n_{2} or, vice versa, that of the n_{2}-start helices is specified by n_{1}. The best way to illustrate [n_{1}, n_{2}] helical symmetry is by using a helical net: an unwrapped flattened 2D net bound by the circumference in one direction and extended to infinity parallel to the helical axis. Figs. 3(a) and 3(b) illustrate an example of wrapping and unwrapping of the helical net with the EM structure of a microtubule with [11, 3] symmetry (Sui & Downing, 2010). The colored circular dots in the helical net represent asymmetric units and a line passing through a set of dots is a helix. The number of intersections (n) between the set of parallel lines with the circumference is exactly the number n of helices that are required to fill the helical assembly. This is the origin of the n-start helices definition. In terms of a helical net description, the helical symmetry can be specified by picking a particular set of two intersecting lines (helices) corresponding to n_{1}-start and n_{2}-start lines. The intersections define the locations of repeating asymmetric units in the helical structure. In addition to the [11, 3] symmetry, two feasible helical nets with symmetries [8, 3] and [14, 3] are also depicted in Fig. 3(c) for the same helical structure. In the special case of a one-start helix, the entire assembly is built from a single helix instead of a set of helices (n-start helices). Although n_{2} is not required in helical symmetry denoted as a one-start helix, it is still represented in the [1, n_{2}] notation.
Based on the augmented 1D helical system, it is not difficult to realise that a helical symmetry with [n_{1}, n_{2}, twist, rise] is equivalent to [−n_{1}, −n_{2}, twist, rise], [−n_{1}, n_{2}, −twist, −rise] and [n_{1}, −n_{2}, −twist, −rise]. To reduce the redundancy in the helical symmetry representation, we set several simple rules. Firstly, n_{1} is always positive. For consistency in the interconversion between the 1D and 2D helical systems (with the sign of n_{2} kept unchanged), we choose the rise to also be positive. Secondly, the value of n_{2} is always smaller than that of n_{1}. In this way, the handedness of the n_{1} helices is determined by the sign of twist: if positive the n_{1} helix is right-handed, otherwise it is left-handed. The sign of n_{2} gives the handedness of the n_{2} helices: if negative it is a right-handed helix, otherwise it is left-handed. To calculate the [twist, rise] of the n_{2}-start helices the helical symmetry can be swapped from [n_{1}, n_{2}] to [n_{2}, n_{1}].
2.6. Local C_{2} (dyad) symmetry
Above, the helical transformation has been applied to asymmetric subunits without first assigning a plausible space group. In order to generate all symmetric subunits in a planar 2D lattice, the space group can be specified by one of the 17 wallpaper groups. However, the space-group symmetry in the planar 2D lattice is largely lost by the [n_{1}, n_{2}] helical transformation; therefore, we only describe pseudo-local symmetry. A local C_{2} is an exception: not only is it maintained between asymmetric subunits within the unit cell but also in the entire helical construct. In terms of the 1D helical system, the C_{2} symmetry is defined as an additional dyad symmetry (with axial C_{2} along the z axis and the helical axis along the y axis). To include a local C_{2} we can assign a wallpaper group p2 before applying the helical transformation.
2.7. Manipulation of [n_{1}, n_{2}] symmetry
A helical symmetry can be described by many [n_{1}, n_{2}] combinations. For a particular preset n_{2} there are a limited number of n_{1}; similarly, a preset n_{1} will have a limited selection of n_{2} for the same helical assembly. To understand the manipulation and limitations of changing from one [n_{1}, n_{2}] to another, the helical net is the best reference. For illustration, we use the [11, 3] helical symmetry of the polymorphic helical structure of the microtubule. In terms of the (h, k; n) notation (Toyoshima & Unwin, 1990; Toyoshima, 2000), the set of n-start helices can be specified by the equation
n = hn_{10} − kn_{01},
where n_{10} = 11 and n_{01} = 3 for the [11, 3] symmetry. Note that the (h, k) index has to be confined within the circumference range. Starting with the [11, 3] symmetry and a fixed n_{2} = 3, the redundant helical symmetry [n_{1}, 3] can have n_{1} = 11h − 3k, with h = ±1. On the other hand, given a fixed n_{1} = 11, we can have many redundant [11, n_{2}] helical symmetries with n_{2} = 11h − 3k and k = ±1. In the case of (h = 1, k = ±1), we have new redundant helical symmetries of [8, 3] and [14, 3] (Fig. 3c), respectively. The complete list of redundant [n_{1}, 3] symmetries with h = 1 is [26, 3], [23, 3], [20, 3], [17, 3], [14, 3], [11, 3], [8, 3], [5, 3], [2, 3], [−1, 3] and [−4, 3]. For [n_{1}, 3] there is an infinite number of helical symmetries in a planar 2D lattice rather than just the 11 sets restricted by the circumference. In Fig. 3(c), the corresponding redundant sets of helical symmetries are noted next to the helical dots. For the redundant symmetry [−1, 3], we have created an equivalence between the 11-start helices [11, 3] and the one-start helix [−1, 3] (or [1, −3]) symmetry.
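The list of redundant [n_{1}, 3] symmetries can be reproduced from the relation n = hn_{10} − kn_{01} quoted in §2.8. The sketch below simply enumerates that relation for h = 1; the function name and the range of k are choices made for this example, and the check against the circumference range is not included.

```python
def redundant_starts(n10, n01, h, k_values):
    """Start numbers n = h*n10 - k*n01 for each k in k_values."""
    return [h * n10 - k * n01 for k in k_values]

# [11, 3] symmetry with h = 1 and k running from -5 to 5.
starts = redundant_starts(11, 3, 1, range(-5, 6))
print(starts)
```

The values for k = 1 and k = −1 are 8 and 14, the two alternative nets mentioned for Fig. 3(c).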
In order to check whether the (h, k) index is within the circumference, we convert the 1D system to its equivalent planar 2D helical net. It is then straightforward to calculate the new helical parameter φ for the new n-start helix. If φ is less than π then it falls within the circumference range.
2.8. Relevance to Xray fibre diffraction and the EM method
In helical structure determination by X-ray fibre diffraction or EM based on the Fourier–Bessel method (in reciprocal space), the first step is indexing the layer-line diffraction pattern to a specified helical symmetry. There are two possible systems for indexing a diffraction pattern. In the first, assuming that a helical assembly can be described by a single (one-start) helix, the `selection rule' l = tn + um can be utilized to assign (n, l) pairs to layer-lines in which each layer-line is associated with a set of n-start helices. A more general formalism using (n, Z_{l}) instead of (n, l), which removes the requirement for t/u to be a rational number, is more appropriate for fibre diffraction. However, for simplicity, we prefer to use the (n, l) system here. A successful layer-line indexing then gives the helical organization as the selection rule implies: u units require t turns of the one-start helix to complete a true repeat with a rise distance of c. A second more systematic (h, k; n) indexing system (Toyoshima & Unwin, 1990; Toyoshima, 2000) interprets diffraction patterns based on the helical surface lattice. In a planar 2D lattice, diffraction by a set of lines gives a row of dots in reciprocal space. Therefore, it is straightforward to determine a 2D planar symmetry from an ideal diffraction pattern. For a helical structure which is obtained by wrapping of a 2D lattice, a set of lines now becomes a set of helices and the corresponding diffraction dots become layer-lines. To define a surface lattice, two indices, (1, 0; n_{10}) and (0, 1; n_{01}), are first assigned where n is the start number of the associated helices, which can be estimated from its peak position in the layer-line diffraction (Toyoshima & Unwin, 1990; Toyoshima, 2000). If the remaining layer-lines can be indexed and related by the equation n = hn_{10} − kn_{01} then the helical symmetry is determined. Fig. 4 illustrates the relationship between the new [n_{1}, n_{2}] helical scheme and the symmetry in both indexing systems. The nodes in Fig. 4 represent (i) asymmetric units based on a simple helical structure which is described by a one-start helix (t = 4 and u = 13), i.e. with 13 units and four turns completing a true repeat of distance c, and (ii) a simplified diffraction pattern of the same helical structure. However, instead of the layer-line pattern for n-order Bessel diffraction, each dot gives the position of (n, l) diffraction where the layer-line pattern has a maximum diffraction peak at the ∼n + 2 position (Diaz et al., 2010). In the figure, [n_{1}, n_{2}] of the new helical scheme correspond to the choice of n_{10} and n_{01} in the (h, k; n) indexing system. The same (t = 4 and u = 13) structure is related by two different helical systems [3, 1] and [3, −2] in Figs. 4(a) and 4(b), respectively, for a simplified diffraction pattern in terms of n and −l. The figure only shows one quarter of the diffraction pattern. For example, the n = 4, l = 3 diffraction is at the upper-left corner of the figure without a label of (h, k; n, l, m).
2.9. General guidelines for presenting a helical structure
There is no clear-cut advantage in treating a helical assembly as a 1D or a 2D system; both systems have pluses and minuses. However, we believe that the augmented 1D helical system is more suitable than the 2D system for describing a helical structure, even though the two systems are equivalent and interchangeable. There are two reasons for favoring the 1D helical system for describing a helical symmetry. Firstly, the 1D helical system is simpler than the 2D system, with one fewer parameter. Secondly, the 1D helical scheme is independent of the helical radius, while the surface lattice parameters (a, b, γ) in the 2D system will change with different radii. Here, based on a 1D helical system we suggest general guidelines for helical structure representation. With our guidelines, if the assembly units can be unambiguously defined and follow the helical paths, each helical assembly is expected to provide a unique symmetry [n_{1}, n_{2}, φ, δ] that also explicitly reflects the helical structural characteristics.
In terms of a helical net, a helical structure is composed of a set of n helices in which the individual helix is named an n-start helix. If a helical structure can be expressed by just a single helix, it is a one-start helical structure. In the augmented 1D helical system, the [n_{1}, n_{2}] representation implies that the organization of the n_{1} helices is specified by n_{2}, with the individual n_{1}-start helix defined by [φ, δ]. We can also swap the representation to say that there are n_{2} helices in the structure related by n_{1}. Since there are many [n_{1}, n_{2}] combinations for a particular helical structure, the first and the most important guideline is to define the rule for choosing a unique [n_{1}, n_{2}] specification. In order to reflect the helical structural characteristics, the rule states that only protofilaments will be candidates for the [n_{1}, n_{2}] selection. In our definition, if adjacent asymmetric subunits in an assigned helix are in physical contact, this helix is a protofilament. Therefore, we first sort protofilaments according to the extent of contacts between adjacent asymmetric subunits. Of the best four protofilaments, the one with the twist angle closest to zero is set as the primary protofilament n_{1} and the next best protofilament is selected as the secondary protofilament n_{2}. Note that n_{1} is always larger than n_{2} under this guideline.
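The selection rule can be sketched as a short ranking procedure. This is only an illustration: each candidate protofilament is assumed to be described by its start number, its twist and a single contact score, the "next best" protofilament is taken to be the highest-contact candidate other than the primary one, and all numbers in the example are made up. The sign of n_{2}, which encodes handedness, is not determined here.

```python
def choose_protofilaments(candidates):
    """Pick (primary, secondary) from dicts with 'starts', 'twist', 'contact'."""
    # Keep the four candidates with the largest contact between subunits.
    best = sorted(candidates, key=lambda c: c["contact"], reverse=True)[:4]
    # Primary: among those four, the twist angle closest to zero.
    primary = min(best, key=lambda c: abs(c["twist"]))
    # Secondary: the highest-contact candidate other than the primary.
    secondary = next(c for c in best if c is not primary)
    return primary, secondary

# Hypothetical candidates (twist in degrees, contact in arbitrary units).
candidates = [
    {"starts": 5, "twist": 38.8, "contact": 120.0},
    {"starts": 10, "twist": 5.5, "contact": 90.0},
    {"starts": 15, "twist": -27.7, "contact": 20.0},
    {"starts": 20, "twist": 11.0, "contact": 5.0},
    {"starts": 1, "twist": 2.0, "contact": 1.0},
]
primary, secondary = choose_protofilaments(candidates)
print(primary["starts"], secondary["starts"])
```

In this toy input the low-twist candidate with almost no contact is discarded before the twist criterion is applied, which is the purpose of ranking by contact first.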
To ensure a unique helical symmetry representation for a helical assembly, redundancy needs to be reduced to singular [n_{1}, n_{2}, φ, δ]. The reduction guideline requires that n_{1}, δ > 0 and n_{1} > n_{2}. In the case of n_{1} < 0, one can apply the equivalent rule that the new [n_{1}, n_{2}] = [−n_{1}, −n_{2}]. If δ is negative, one can simply apply the equivalent rule [n_{1}, n_{2}, φ, δ] = [n_{1}, −n_{2}, −φ, −δ] to make it a positive value.
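The two sign rules of the reduction guideline can be written as a small helper. The sketch applies only the stated equivalences; it does not enforce n_{1} > n_{2}, which depends on the choice of protofilaments rather than on signs. The function name is hypothetical.

```python
def reduce_signs(n1, n2, twist, rise):
    """Apply the sign rules so that n1 > 0 and rise > 0."""
    if n1 < 0:
        # [n1, n2] is replaced by [-n1, -n2].
        n1, n2 = -n1, -n2
    if rise < 0:
        # [n1, n2, twist, rise] is replaced by [n1, -n2, -twist, -rise].
        n2, twist, rise = -n2, -twist, -rise
    return n1, n2, twist, rise

print(reduce_signs(-11, 3, 10.0, -5.0))
```

In the example both rules fire in turn, so n_{2} changes sign twice and ends with its original value while the twist is negated.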
3. Results
There are many helical filaments and tubular structures in the PDB which have been solved either directly by X-ray fibre diffraction or by fitting individual crystal structures into cryo-EM density maps. Similar to X-ray crystal structures where only the coordinates of the asymmetric units are included in the PDB file, most of the helical structures deposited in the PDB also contain only asymmetric units. Therefore, in principle, the entire helical structures should be constructed from the deposited asymmetric units by a specified helical symmetry. In the case of crystal structures, a space group and six lattice constants are defined in the keyword `CRYST1' in the PDB for calculating all symmetric units in the unit cell. However, owing to the lack of a simple, complete and widely accepted system for helical symmetry, no keyword has been set to define the helical symmetry and helical parameters are implicitly stated in the comments. Furthermore, the creation of the entire helical structure relies on a set of translational and rotational matrices which are hard-coded in the PDB.
We have applied the augmented 1D helical scheme along with the suggested guidelines to all helical structures deposited in the PDB. A small portion of the results are given in Table 1 and a complete list is available on the web at http://protein3d.ncifcrf.gov/helicalSymmetry/table1.html. The newly determined helical parameters [n_{1}, n_{2}, twist, rise] not only directly reflect the helical characteristics but also provide sufficient information for constructing an entire helical structure from given asymmetric units. Here, we propose four helical parameters in a new keyword named HELSYM in the PDB for the specified helical symmetry to avoid using matrices and comments when specifying a helical symmetry.

In the PDB, the axial symmetry of a helical structure is conventionally along the z axis and passes through the origin (0, 0, 0). The helical symmetry specified in the PDB usually follows the rotohelical description, which provides the helical parameters (φ, δ) for a single helix or n helices related by a C_{n} rotational axis. The manually extracted data, the helical twist φ and the helical rise δ, are first verified against the helical transformation matrices if also given in the PDB file. The corresponding augmented 1D helical parameters will then be either [1, 0, φ, δ] or [n, 0, φ, δ] for one-start helices or n-start helices, respectively. We then use our graphics tool named PNAS (Protein Nanoscale Architecture by Symmetry), in-house software running under both Linux and Windows, to search for the first four protofilaments with the largest contact between the asymmetric units and determine their n_{1}, n_{2}, φ, δ values accordingly. Next, we select among them the helical protofilament with the lowest absolute value of twist angle as the primary n_{1} helix and its associated twist and rise are set as the 1D helical parameters [φ, δ]. Finally, the highest contact protofilament other than the chosen primary helix is assigned as the secondary n_{2} helix to complete the determination of [n_{1}, n_{2}, φ, δ].
In Fig. 5, the helical assembly of the bacteriophage major coat protein (PDB entry 1ifd; Marvin, 1990) is assigned into three different 1D helical symmetries: [5, 0, −33.2, 16.0], [5, 0, 38.8, 16.0] and [10, −5, 5.5, 32.0]. The first symmetry [5, 0] corresponds to the rotohelical assignment in the PDB. Each individual helix in this assignment is evidently not a protofilament, since no contact between helical subunits (shown in the same color) is observed. Therefore, the helical structural characteristics are not conveyed clearly by its helical parameters. The second [5, 0] symmetry is based on the first protofilament with the largest number of contacts between helical subunits. However, the guideline suggests using the [10, −5] helical symmetry to represent this structure. This symmetry is advantageous for three reasons: firstly, the [10, −5] symmetry corresponding to the second and first protofilaments in the structure presents the best structural characteristics, unlike the second [5, 0] assignment which only contains information for the first protofilament; secondly, by looking down the helical axis the structure is composed of ten helices, not just five; and thirdly, another inovirus coat protein (PDB entry 1hgv; Pederson et al., 2001) also gives a similar structure with [11, −6] helical symmetry. Here, the primary (11-start) helix is the first protofilament and the secondary (six-start) helix is the second protofilament. These two examples show that the augmented 1D helical representation not only describes similar helical structures by similar parameters but at the same time also differentiates between similar helical organizations. In Fig. 6, three additional helical structures are depicted in 1D symmetry. Pictorial descriptions with 1D helical symmetry for the complete list of known helical structures can be accessed from links on the webpage http://protein3d.ncifcrf.gov/helicalSymmetry/table1.html.
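The selection rule applied to 1ifd can be sketched in a few lines of Python. This is only an illustration of the two stated steps (lowest absolute twist for the primary helix, highest contact among the rest for the secondary helix); the `Protofilament` class and the `contact_rank` field are hypothetical names, and the sign of n_{2} is not derived here.

```python
from dataclasses import dataclass

@dataclass
class Protofilament:
    starts: int        # number of helices in this family (n-start)
    twist: float       # degrees
    rise: float        # same length unit as the model
    contact_rank: int  # 1 = largest contact between asymmetric units

def pick_primary_secondary(candidates):
    """Apply the two selection steps to a list of candidate protofilaments."""
    # Primary helix: twist angle closest to zero.
    primary = min(candidates, key=lambda p: abs(p.twist))
    # Secondary helix: best-contact protofilament other than the primary.
    others = [p for p in candidates if p is not primary]
    secondary = min(others, key=lambda p: p.contact_rank)
    return primary, secondary

# The two protofilaments discussed for PDB entry 1ifd (twist and rise from the text).
cands = [
    Protofilament(starts=5, twist=38.8, rise=16.0, contact_rank=1),
    Protofilament(starts=10, twist=5.5, rise=32.0, contact_rank=2),
]
primary, secondary = pick_primary_secondary(cands)
print(primary.starts, secondary.starts, primary.twist, primary.rise)  # 10 5 5.5 32.0
```

The result matches the magnitudes of the recommended [10, −5, 5.5, 32.0] assignment: the ten-start family becomes the primary helix because its twist is closest to zero, and the five-start family, which has the largest contact, becomes the secondary helix.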
For helical structures deposited in the EMDB (Lawson et al., 2011), only the helical classification is indicated and no helical symmetry is explicitly given in the data bank. However, it is not difficult to deduce the 1D helical parameters if the helical axis can be determined from the EM density map. The graphics tool PNAS can be utilized to assign 1D helical symmetry to the EM structure. Firstly, we determine the location of the primary protofilament by visual inspection of the density map when shown in various isosurface presentations. Looking down the EM map along the helical axis, the number of assigned protofilaments can be counted to give the n_{1} helical parameter. PNAS then determines the position of the helical axis using either the given map center or the calculated coordinates of the center of density. Next, we determine the [φ, δ] pair for the visually assigned primary protofilament by calculating the correlation coefficient (Grubisic et al., 2010) between the original map density and the helically transformed density specified by a pair of manually adjustable parameters [φ, δ]. In this procedure, we follow the guideline to keep the helical twist as close to zero as possible and at the same time change the twist and rise to reach the best match as guided by visual superimposition between the original and the transformed density map in isosurface presentation. The optimal correlation coefficient should be very close to 1.0. Now, we can use the newly determined helical parameters n_{1}, φ and δ to determine n_{2}: simply try integer numbers between −n_{1} and n_{1} and perform the 1D helical transformation to determine n_{2} from the result of the superimposition as stated above.
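The n_{2} search described above can be outlined as follows. This is a minimal sketch under two assumptions: the match between the original and the transformed density is scored with a Pearson-style correlation coefficient, and the trial values are the integers strictly between −n_{1} and n_{1}. The `score` callable is hypothetical; in practice it would apply the 1D helical transformation for a trial n_{2} and return the correlation with the original map.

```python
import math

def correlation(a, b):
    """Pearson correlation coefficient between two equally sized density lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def scan_n2(n1, score):
    """Return the integer n2 with -n1 < n2 < n1 that maximizes score(n2)."""
    return max(range(-n1 + 1, n1), key=score)

# Toy stand-in for the real map comparison: the score peaks at n2 = 5.
toy_score = lambda n2: -abs(n2 - 5)
print(scan_n2(10, toy_score))  # 5
```

Because the scan covers fewer than 2n_{1} integers, an exhaustive search is cheap compared with the manual adjustment of [φ, δ] that precedes it.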
To illustrate the procedure of 1D helical symmetry determination, four superimpositions between the original EM (EMD-1240; bacteriophage fd coat protein B; Wang et al., 2006) and transformed density maps relating to the four stages are given in Fig. 7. The first two superimpositions illustrate the determination of [φ, δ] for the assigned primary protofilament. The last two superimpositions illustrate the determination of the secondary protofilament n_{2}, giving the 1D helical symmetry [10, 5, 2.6, 34.8]. The 1D helical symmetry for each helical EM map deposited in the EMDB has been determined with the graphics tool PNAS. Some of the results are listed in Table 1 and a complete list is reported on the webpage http://protein3d.ncifcrf.gov/helicalSymmetry/table1.html. To highlight the importance of a comprehensive helical scheme, Table 2 provides a comparison between the reported helical symmetries determined in the EM reconstruction and the new helical symmetries for six polymorphic helical structures of the microtubule. The results clearly show that the inherent structural characteristics of the microtubule obtained by the new helical scheme directly reveal polymorphic ensembles. The very similar surface lattice within different helical symmetries implies very similar subunit–subunit interactions which the microtubule uses to assemble into divergent helical organizations.

4. Discussion
4.1. Will the guidelines give a unique [n_{1}, n_{2}, φ, δ] helical system?
A helical description using four parameters [n_{1}, n_{2}, φ, δ], determined according to the augmented 1D helical symmetry guidelines (in §2), provides the helical signature of the structure. This is because the two sets of defined helices, the n_{1}-start and the n_{2}-start, correspond to the two sets of protofilaments. However, will the guidelines also always give a unique [n_{1}, n_{2}] combination for a given helical structure? The answer is yes, as illustrated by the example below. The docked atomic model of the bacterial flagellar hook (Fujii et al., 2009; PDB entry 3a69) contains an asymmetric subunit with three protein domains spanning the inner, middle and outer layers of the helical cryo-EM map. In terms of individual protein domains, the best protofilament of each domain yields an 11-start, five-start and six-start helix, respectively, from the inner to the outer layers. Even though different helical descriptions of different layers are observed, the guidelines still give an unambiguous helical symmetry of [11, −6, −7.31, 45.32] for this structure. The assignment is based on two clear elements in the structural data: the 11-start helix (the third protofilament in the protein) has a twist angle closest to zero and the six-start helix is the first protofilament. In Fig. 8, each color represents an assigned helix and the assembly illustrates six, five and 11 helices, respectively, for the [6, −1], [5, −1] and [11, −6] symmetries.
Not all helical structures have unambiguous primary protofilaments, especially when the growth mechanism does not follow a helical path. The tubular structure of the HIV-1 capsid protein (CA; Byeon et al., 2009) is such an example. In solution, CA forms a dimer via the association of its C-terminal domain (CTD). The cryo-EM tubular structure (EMD-5136) reveals that the basic unit is a trimer of CA dimers with a pseudo-threefold at the CTD–CTD interfaces and the CA dimer is shared between two trimers. Following our guideline, we obtain a helical symmetry of [24, 13, 7.39, 165.78] for the CA tubular structure. The unit cell depicted by the [24, 13] symmetry does not correspond to the observed CA hexamer; however, after applying the symmetry-manipulation rules (n_{1} = n_{1} − n_{2}, [n_{1}, n_{2}] swapping and n_{2} = n_{2} + n_{1}) the new helical symmetry of [13, 2, −11.00, 89.80] gives the cell dimensions of the hexamer. The surface lattices for both helical symmetries are highlighted in red in Fig. 9. The fact that the assigned asymmetric units in both helical symmetries ([24, 13] and [13, 2]) do not correspond to the assembly unit implies that the path of a trimer of CA dimers is not helical. In this case, our guidelines will fail to offer an unambiguous helical specification.
The guidelines have two limitations in fulfilling the aim that every helical structure would have a unique [n_{1}, n_{2}, φ, δ] helical symmetry. The first arises when the primary protofilament is ambiguous, as discussed above, and the second is encountered when there is a continuous helical density along a protofilament in the cryo-EM structure rather than a clear boundary between asymmetric units. Under such circumstances, for a determined [n_{1}, n_{2}] symmetry the helical structure can be described by an infinite number of [φ, δ] pairs, which are always related by a constant. The helical structure of the tubular Aβ_{1–42} amyloid with a hollow core (Miller et al., 2010; Zhang et al., 2009) is an example of this limitation. The cryo-EM structure gives a [2, 0] (or [2, 1]) helical symmetry and the two helical parameters [φ, δ] = [−3.75c, 4.8c], where c is a constant.
4.2. Relevance to experimental diffraction patterns
The diffraction patterns of helical structures consist of a series of layer lines. Assuming that the layer lines do not overlap, each layer line is the result of diffraction by a set of n-start helices. The position of the peak with the maximum diffraction intensity in each layer line can be indexed to correspond to a node in the helical net. The relationship between the 1D helical system [n_{1}, n_{2}] symmetry and the diffraction pattern is detailed in Figs. 4(a) and 4(b). The two figures illustrate the different assignments of n_{10} and n_{01}, which give different helical symmetries, [3, 1] and [3, −2], for the same structure that has a simple helical symmetry of t = 4 and u = 13. The assignment of n_{10} with the first peak close to the equator (n) of the diffraction pattern is consistent with the guideline for selecting the n_{1}-start helices with a twist angle closest to zero. The assignment of n_{01} to the position of the diffraction which is close to the origin is also likely to constitute a main protofilament of the given helical structure.
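The link between the simple helix t = 4, u = 13 and its three-start family can be checked with standard helical-net arithmetic: in a one-start helix, subunits i and i + n lie on an n-start helix whose twist is n times the one-start twist (modulo 360°) and whose rise is n times the one-start rise. The sketch below assumes that u and t denote 13 subunits in 4 turns and uses a unit rise per subunit; it is not the paper's own formulation, and the function name is illustrative.

```python
def n_start_helix(n, t, u, rise):
    """Twist (degrees, wrapped to (-180, 180]) and rise of the n-start family
    in a simple helix with u subunits in t turns and axial rise per subunit `rise`."""
    twist = (n * 360.0 * t / u) % 360.0
    if twist > 180.0:
        twist -= 360.0
    return twist, n * rise

for n in (1, 3, 4):
    print(n, n_start_helix(n, t=4, u=13, rise=1.0))
```

Under these assumptions the three-start family has a twist of about −27.7°, much closer to zero than the one-start (about 110.8°) or four-start (about 83.1°) families, which is in line with choosing the three-start helices as the primary set.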
4.3. The minimal number of helices needed for a complete helical structure description
From the rotohelical transformation, we learnt that a single (one-start) helix description is not always sufficient to generate the entire helical structure from given asymmetric units. The helix may need to be related by a C_{n} rotational symmetry, which implies that a minimum of n helices are required to cover the entire helical assembly. Given an [n_{1}, n_{2}] symmetry, there are n_{1} or n_{2} assigned helices for the entire helical description. To determine the minimal number of helices for complete structural description (or to be correlated with the rotohelical transformation), the [n_{1}, n_{2}] symmetry is reduced to an equivalent symmetry with n_{2} = 1 or 0. In the case of a reduced [n_{1}, 0] symmetry, the new n_{1} is the minimal number of helices.
It is straightforward to deduce the minimal number of helices for a given [n_{1}, n_{2}] helical symmetry. If the numbers n_{1} and n_{2} do not have a common factor, the symmetry can always be reduced to a one-start helix description by using a combination of the swap and the equivalence rules of n = h n_{10} − k n_{01} as described in §2. For example, [7, 3] can be reduced to [1, 3] with h = 0, k = 2. Otherwise, the largest common factor of n_{1} and n_{2} is the minimal number of helices for a complete helical structure description. For example, [8, 4] can be reduced to [4, 0], where 4 is the largest common factor of 8 and 4.
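The rule above amounts to taking the greatest common divisor of n_{1} and n_{2}. A minimal Python sketch, assuming integer n_{1} and n_{2} (the seam case with a rational n_{2} is not handled) and using an illustrative function name:

```python
from math import gcd

def minimal_helices(n1, n2):
    """Smallest number of helices needed, taken as gcd(|n1|, |n2|)."""
    return gcd(abs(n1), abs(n2))

for sym in ([7, 3], [8, 4], [5, 0], [10, -5], [11, -6]):
    print(sym, minimal_helices(*sym))
```

For the symmetries quoted in this paper this gives 1 for [7, 3] and [11, −6], 4 for [8, 4], and 5 for both [5, 0] and [10, −5], consistent with the latter two describing the same 1ifd assembly.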
4.4. A new description of helical symmetry
Despite the fact that so many helical structures have been determined, a universal formulation for representing helical symmetry is still lacking. The absence of agreement in the community has been attributed to three main reasons. The first apparent reason is a consequence of the fact that helical symmetries have been formulated in distinct ways to fulfill a particular requirement or convenience in different structure-determination methods. The diversified helical representations can be classified into two commonly adopted helical schemes named the 1D and 2D helical systems. In this study, the two helical schemes were unified into a single helical specification by two constants [n_{1}, n_{2}] and we have shown that the two systems are interchangeable and complementary to each other. Because of the simplicity of using one less parameter and the lack of involvement of the axial radius, we suggest using the augmented 1D helical system with four parameters [n_{1}, n_{2}, twist, rise] for representing a helical structure.
The second hurdle for defining a helical description is that a helical structure can be pictured in many ways, i.e. in many [n_{1}, n_{2}] combinations as two (n_{1}-start and n_{2}-start) sets of helices. However, in principle, the generalized guidelines for describing a helical symmetry are expected to give a unique [n_{1}, n_{2}] specification that reflects the characteristics of the structure, although in a limited number of cases a unique specification is impossible.
The fact that no standard helical symmetry has been accepted so far can be attributed to the last obstacle: a complete coverage of helical description includes the capability of handling helical discontinuity (a seam). However, building an entire helical construct with a seam from given asymmetric units requires no additional modification in our formulation of helical transformation. Instead, a helical structure with a seam is simply reflected in the value of n_{2}. By definition, the helical discontinuity indicates that n_{2} is no longer an integer but a rational number.
4.5. Presentation of a structure with a helical discontinuity
An implicit requirement of the 2D helical system (§2.1) is that in a seamless helical arrangement [n_{1}, n_{2}] must be specified by integer numbers. By treating two consecutive asymmetric subunits in the primary protofilament as a new single asymmetric subunit, the new augmented 1D helical symmetry becomes [n_{1}, n_{2}/2, 2φ, 2δ], which is equivalent to the original helical symmetry [n_{1}, n_{2}, φ, δ] except that the asymmetric units are doubled in size. When the tubulin subunit is treated not as a dimer of αβ subunits but as a single subunit by ignoring the small difference between the α and β subunits (Sui & Downing, 2010), we do not encounter the microtubule seam problem. However, when treating the αβ dimer as an asymmetric subunit in the new 1D helical symmetry, helical structures with an odd number for the n_{2} symmetry (in single subunit representation) create a seam with a new rational n_{2}.
The microtubule EM structure (Cochran et al., 2009; EMD-5038) presents such a helical discontinuity when treating the dimer of αβ subunits as the asymmetric unit. The augmented 1D helical symmetry in four parameters [13, 3/2, 0.0, 80.0] is sufficient to generate the entire helical structure with a seam, based on the helical transformation matrix summarized in Fig. 2. Owing to the helical discontinuity, the repetitive asymmetric unit is no longer an identical unit. Instead, a complete round of n_{1} subunits (13 dimers in the microtubule case) now constitutes the identical unit in the helical structure with a seam. Therefore, the subunit coordination index [m_{1}, m_{2}] can no longer have an index with m_{1} ≥ n_{1} when applying the helical transformation to generate the repetitive subunits for a helical structure with a seam.
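The unit-doubling rule [n_{1}, n_{2}, φ, δ] → [n_{1}, n_{2}/2, 2φ, 2δ] and the resulting seam test can be sketched with exact rational arithmetic. The single-subunit input [13, 3, 0.0, 40.0] used below is an assumption, back-calculated from the quoted dimer symmetry [13, 3/2, 0.0, 80.0]; the function names are illustrative.

```python
from fractions import Fraction

def double_asymmetric_unit(n1, n2, twist, rise):
    """Merge two consecutive subunits of the primary protofilament into one unit."""
    return n1, Fraction(n2) / 2, 2 * twist, 2 * rise

def has_seam(n2):
    """A non-integer (rational) n2 signals a helical discontinuity."""
    return Fraction(n2).denominator != 1

# Assumed single-subunit description of the 13-protofilament microtubule.
sym = double_asymmetric_unit(13, 3, 0.0, 40.0)
print(sym, has_seam(sym[1]))
```

An odd n_{2} in the single-subunit description yields a half-integer n_{2} after doubling and therefore a seam, whereas an even n_{2} stays an integer and the lattice remains seamless.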
A seam in a helical structure can be classified visually with respect to its helical axis into a strictly vertical seam or a seam that wraps around the helix. The microtubule case above is an example of a vertical seam. Under the restriction that only a rational n_{2} and integer n_{1} > n_{2} are allowed in the augmented 1D helical representation, the corresponding helical structure always produces a vertical seam and the handedness of the seam is determined by the sign of n_{2}, with positive indicating a left-handed seam and negative a right-handed seam. In contrast, a seam described by a rational n_{1} > n_{2} and integer n_{2} should correspond to the type of seam that wraps around the helix.
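The seam classification in this paragraph can be written as a small decision function. This is a sketch of the stated rules only; it does not check the additional restriction n_{1} > n_{2}, and the function name and return strings are illustrative.

```python
from fractions import Fraction

def classify_seam(n1, n2):
    """Classify a seam from the integrality and sign of [n1, n2]."""
    n1, n2 = Fraction(n1), Fraction(n2)
    n1_int, n2_int = n1.denominator == 1, n2.denominator == 1
    if n1_int and n2_int:
        return "no seam"
    if n1_int:  # rational n2, integer n1
        hand = "left-handed" if n2 > 0 else "right-handed"
        return "vertical seam, " + hand
    if n2_int:  # rational n1, integer n2
        return "seam wrapping around the helix"
    return "not covered by the stated rules"

print(classify_seam(13, Fraction(3, 2)))  # vertical seam, left-handed
```

For the microtubule symmetry [13, 3/2] the function reports a vertical, left-handed seam, which follows directly from the positive rational n_{2}.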
4.6. Application to polymorphic structural assemblies
Both the 1D and 2D helical systems are designed to create helical assemblies from asymmetric subunits with specified helical parameters. The conformational heterogeneity of molecular assemblies is known to set limits on solving cryo-EM structures at high resolution. Polymorphism is particularly problematic in the determination of structures with helical symmetry since even a slight deviation in the interactions between two asymmetric subunits will create distinct structures with different symmetries. We have seen such an example in Table 2 for the microtubule structure. The question is whether all such polymorphic structures can be generated from a single helical structure given as an atomic model or an EM map. The answer is yes, because the interactions between the asymmetric subunits are preserved in the definition of the 2D helical system. Thus, to create distinct polymorphic structures with almost the same subunit–subunit interactions we only need to change the specific [n_{1}, n_{2}] helical symmetry.
5. Conclusions
In this paper, we give two helical formulations (augmented 1D and 2D) to describe a helical structure. Unlike the rotohelical transformation (1D formulation) with a helical plus an additional rotational operation, a new augmented 1D formulation with two consecutive helical operations enables unification with the widely adopted 2D formulation, giving a common helical symmetry descriptor with two integers [n_{1}, n_{2}]. The new formulation requires only four parameters [n_{1}, n_{2}, twist, rise] for the augmented 1D helical system and five parameters [n_{1}, n_{2}, a, b, γ] for a 2D helical system to generate the entire structural assembly from given asymmetric units. We propose using the augmented 1D helical system with four parameters to describe a helical structure owing to its simplicity and independence from the helical radius compared with the 2D helical system.
In terms of a helical net representation, a helical structure with an [n_{1}, n_{2}] symmetry indicates that its organization is specified by two sets of helices (n_{1}-start and n_{2}-start). Because many different [n_{1}, n_{2}] combinations exist for the same structure, we suggest general guidelines for selecting a unique [n_{1}, n_{2}] symmetry which reflects the structural characteristics of a given helical structure. We provide a computational graphics tool for this purpose which can be used for any helical structure determined by X-ray fibre diffraction or EM imaging.
While there are multiple ways to construct equations that generate the same helical structure, an [n_{1}, n_{2}, twist, rise] description provides the following advantages: firstly, it provides full helical coverage, including a helical discontinuity (seam) which is indicated by a rational n_{2}; secondly, it reflects the structural characteristics of the assembly (formation mechanism) directly by four helical parameters; that is, similar structures give similar parameters; thirdly, unnecessary errors in reproducing entire helical structures, such as those caused by wrongly edited transformation matrices in the PDB or wrong deposited EM parameters in the EMDB, will be prevented; and lastly, the new helical symmetry is expected to be useful for maintaining a predetermined helical symmetry in structural refinement as well as for the generation of all `meaningful' polymorphic structural assemblies from a given helical atomic model or EM density map.
Acknowledgements
We would like to thank Dr Edward Egelman for discussions and in particular for his insightful comments, which helped us in improving the paper. This project has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health under contract No. HHSN261200800001E. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government. This research was supported (in part) by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research.
References
Byeon, I.J., Meng, X., Jung, J., Zhao, G., Yang, R., Ahn, J., Shi, J., Concel, J., Aiken, C., Zhang, P. & Gronenborn, A. M. (2009). Cell, 139, 780–790.
Cochran, J. C., Sindelar, C. V., Mulko, N. K., Collins, K. A., Kong, S. E., Hawley, R. S. & Kull, F. J. (2009). Cell, 136, 110–122.
Damnjanović, M., Nikolić, B. & Milošević, I. (2007). Phys. Rev. B, 033403.
DeRosier, D., Stokes, D. L. & Darst, S. A. (1999). J. Mol. Biol. 289, 159–165.
Diaz, R., Rice, W. J. & Stokes, D. L. (2010). Methods Enzymol. 482, 131–165.
Egelman, E. H. (2007). J. Struct. Biol. 157, 83–94.
Egelman, E. H. (2010). Methods Enzymol. 482, 167–183.
Evans, P. R. (2001). Acta Cryst. D57, 1355–1359.
Fujii, T., Kato, T. & Namba, K. (2009). Structure, 17, 1485–1493.
Grubisic, I., Shokhirev, M. N., Orzechowski, M., Miyashita, O. & Tama, F. (2010). J. Struct. Biol. 169, 95–105.
Heymann, J. B., Chagoyen, M. & Belnap, D. M. (2005). J. Struct. Biol. 151, 196–207.
Kikkawa, M. (2004). J. Mol. Biol. 343, 943–955.
Klug, A., Crick, F. H. C. & Wyckoff, H. W. (1958). Acta Cryst. 11, 199–213.
Lawson, C. L. et al. (2011). Nucleic Acids Res. 39, D456–D464.
Lawson, C. L., Dutta, S., Westbrook, J. D., Henrick, K. & Berman, H. M. (2008). Acta Cryst. D64, 874–882.
Marvin, D. A. (1990). Int. J. Biol. Macromol. 12, 125–138.
Miller, Y., Ma, B., Tsai, C.J. & Nussinov, R. (2010). Proc. Natl Acad. Sci. USA, 107, 14128–14133.
Pederson, D. M., Welsh, L. C., Marvin, D. A., Sampson, M., Perham, R. N., Yu, M. & Slater, M. R. (2001). J. Mol. Biol. 309, 401–421.
Sachse, C., Chen, J. Z., Coureux, P. D., Stroupe, M. E., Fändrich, M. & Grigorieff, N. (2007). J. Mol. Biol. 371, 812–835.
Sachse, C., Fändrich, M. & Grigorieff, N. (2008). Proc. Natl Acad. Sci. USA, 105, 7462–7466.
Schmidt, M., Sachse, C., Richter, W., Xu, C., Fändrich, M. & Grigorieff, N. (2009). Proc. Natl Acad. Sci. USA, 106, 19813–19818.
Stewart, M. (1988). J. Electron Microsc. Tech. 9, 325–358.
Sui, H. & Downing, K. H. (2010). Structure, 18, 1022–1031.
Toyoshima, C. (2000). Ultramicroscopy, 84, 1–14.
Toyoshima, C. & Unwin, N. (1990). J. Cell Biol. 111, 2623–2635.
Tsai, C.J., Zheng, J. & Nussinov, R. (2006). PLoS Comput. Biol. 2, 311–319.
Wang, Y. A., Yu, X., Overman, S., Tsuboi, M., Thomas, G. J. Jr & Egelman, E. H. (2006). J. Mol. Biol. 361, 209–215.
Watson, J. D. & Crick, F. H. (1953). Nature (London), 171, 737–738.
Zhang, R., Hu, X., Khant, H., Ludtke, S. J., Chiu, W., Schmid, M. F., Frieden, C. & Lee, J.M. (2009). Proc. Natl Acad. Sci. USA, 106, 4653–4658.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.