research papers
xMDFF: molecular dynamics flexible fitting of low-resolution X-ray structures
a Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA, b Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, IL 60637, USA, c Department of Computer Science, University of Missouri, Columbia, MO 65211, USA, and d Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
*Correspondence e-mail: kschulte@ks.uiuc.edu
X-ray crystallography remains the most dominant method for solving atomic structures. However, for relatively large systems, the availability of only medium-to-low-resolution diffraction data often limits the determination of all-atom details. A new molecular dynamics flexible fitting (MDFF)-based approach, xMDFF, for determining structures from such low-resolution crystallographic data is reported. xMDFF employs a real-space refinement scheme that flexibly fits atomic models into an iteratively updating electron-density map. It addresses significant large-scale deformations of the initial model to fit the low-resolution density, as tested with synthetic low-resolution maps of D-ribose-binding protein. xMDFF has been successfully applied to re-refine six low-resolution protein structures of varying sizes that had already been submitted to the Protein Data Bank. Finally, via systematic refinement of a series of data from 3.6 to 7 Å resolution, xMDFF refinements together with electrophysiology experiments were used to validate the first all-atom structure of the voltage-sensing protein Ci-VSP.
Keywords: xMDFF; molecular dynamics flexible fitting.
1. Introduction
X-ray crystallography is arguably the most versatile and dominant technique for delivering atomic structures of biomolecules. An increasing number of structures are submitted each year to the Protein Data Bank, with over 90% of the current entries coming from X-ray crystal structures. Traditional methods for determining X-ray structures include least-squares refinement with gradient descent (Hendrickson, 1985), maximum-likelihood refinement (Pannu & Read, 1996; Bricogne & Irwin, 1996; Murshudov et al., 1997), simulated annealing (Brünger, 1988) and knowledge-based conformational sampling (Depristo et al., 2005). However, investigating the structure of large biomolecular complexes has posed a serious challenge to traditional crystallographic techniques. The inherent flexibility of such large systems and the presence of disordered solvent and/or ligands often cause the crystals to diffract at low resolutions. Furthermore, in the low-resolution limit the number of atomic coordinates to be determined often exceeds the number of observed diffraction intensities. At moderate to low resolutions, knowledge of the stereochemistry of the system must be incorporated to achieve accurate atomic positions. Lower resolutions, >5 Å, pose a greater challenge to refinement; however, even at ∼7 Å resolution there are in principle enough independent Bragg reflections to determine the backbone torsion angles of protein crystal structures (Brunger et al., 2012).
Solving structures from low-resolution diffraction data is a difficult, time-consuming process. Consequently, low-resolution data sets are often discarded (Karmali et al., 2009).
However, new methods are being developed to better handle low-resolution data. For example, DEN refinement incorporates deformable elastic network models with generic stereochemistry and homology information (Schröder et al., 2010) to address the issue. Other notable recent developments applicable to refining low-resolution structures include normal-mode refinement (Delarue, 2008), the Rosetta implementation of physical energy functions (DiMaio et al., 2011) and its combination with reciprocal-space X-ray refinement in PHENIX (DiMaio et al., 2013), torsional optimization protocols (Haddadian et al., 2011) and external structure restraints or jelly-body refinement in REFMAC (Murshudov et al., 2011). Indeed, the number of low-resolution X-ray structures has grown rapidly in recent years (Karmali et al., 2009).
Here, we present a new method, xMDFF (molecular dynamics flexible fitting for X-ray crystallography), for real-space (Diamond, 1971; Chapman, 1995; Chapman & Blanc, 1997) X-ray refinement. Our method specifically targets the handling of low-resolution data and the large-scale deformations that often separate the target and reference models. To create this method, we extended a previous hybrid method, molecular dynamics flexible fitting (MDFF), developed to solve atomic models from cryo-EM densities. In MDFF, an initial atomic model is subjected to a molecular dynamics (MD) simulation with a modified potential-energy function that includes a term derived from the cryo-EM density map (Trabuco et al., 2008, 2009). The accuracy and robustness of MDFF have been widely demonstrated in many applications solving structural models for the ribosome (Villa et al., 2009; Gumbart et al., 2009, 2011; Becker et al., 2009; Seidelt et al., 2009; Trabuco et al., 2010; Frauenfeld et al., 2011; Agirrezabala et al., 2011; Li et al., 2011), photosynthetic proteins (Hsin et al., 2009; Sener et al., 2009) and the first all-atom structure of the HIV capsid (Zhao et al., 2013). In xMDFF, the MDFF protocol is modified to use iteratively updated model-phased maps to fit and subsequently refine densities derived from low-resolution X-ray diffraction data.
xMDFF was tested via the refinement of a model with a known final structure at resolutions of 3.5–5 Å. Next, xMDFF successfully further refined six low-resolution (4–4.5 Å) protein structures of varying sizes that had already been submitted to the PDB (Fig. 1). Finally, xMDFF was applied in parallel with an independent experimental investigation to resolve crystallographic uncertainty in the three-dimensional structure of Ciona intestinalis voltage-sensing protein (Ci-VSP). xMDFF refinements were evaluated by (i) Rfree, (ii) r.m.s.d. to known targets (test case and voltage-sensing protein) and (iii) improvements in structural geometry of the model. In all cases xMDFF successfully refined the structures and demonstrated an ability to work at very low resolution (7 Å) and with starting models that are very divergent from the target (>5 Å).
2. Methods
2.1. Concept
xMDFF is derived from the MDFF method, which solves atomic models of biomolecules imaged by cryo-electron microscopy. In MDFF, an initial atomic model is subjected to an MD simulation with a modified potential-energy function that includes a term derived from the cryo-EM density map (Trabuco et al., 2008, 2009). Through the density-dependent term, atoms experience steering forces, ffit, that locally drive them towards high-density regions, thereby fitting the atoms to the map. For use in low-resolution X-ray crystallography, the MDFF protocol was modified to work with model-phased densities, using the phases φcalc calculated from a tentative model and the amplitudes Fobs from the X-ray diffraction data to produce a 2mFobs − DFcalc density map. The density is biased by the model, but contains sufficient information from the Fobs to determine the experimental structure. Next, the tentative model is flexibly fitted into the electron-density map using MDFF. In addition to the steering forces derived from the density data, structural restraints are applied to preserve the secondary structure of proteins and nucleic acids (Trabuco et al., 2008, 2009), as well as to ensure stereochemical correctness (Schreiner et al., 2011), thus avoiding overfitting the model into the map. The xMDFF-fitted structure provides new φcalc that, together with Fobs, are used to regenerate the electron-density map. The fitted structure is then employed as an updated search model to be driven into the new model-phased density map, and this process continues iteratively. In effect, ffit drives the structure in a direction biased by the Fobs contribution to the density. Consequently, φcalc improvement is indicated by a decrease in R factors with each subsequent iteration. The iterations continue until the Rfree and Rwork values reach a minimum or become lower than a predefined tolerance.
The quality of the xMDFF-refined all-atom structures is further analyzed by computing correlation coefficients (CCs) between the electron-density map generated from the refined φcalc with Fobs and a simulated map at the target resolution.
Next, the MDFF method is briefly discussed in addition to the algorithmic and computational extensions required to address the crystallographic aspects of xMDFF. All of the software required to use xMDFF is currently distributed in released versions of NAMD (Phillips et al., 2005), VMD (Humphrey et al., 1996) and PHENIX (Adams et al., 2010). A script for running xMDFF can be found as part of the Supporting Information for this article.
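The iterative cycle described in this section can be summarized in a short Python sketch. It is only an illustration of the control flow: the callables `compute_map`, `flexible_fit` and `r_factors`, the tolerance value and the exact stopping rule (stop when Rfree no longer improves by more than the tolerance) are hypothetical placeholders, not part of the distributed xMDFF script, in which map generation is handled by PHENIX and fitting by NAMD.

```python
from typing import Callable, Tuple, TypeVar

Model = TypeVar("Model")
DensityMap = TypeVar("DensityMap")


def iterate_refinement(
    model: Model,
    compute_map: Callable[[Model], DensityMap],          # phases from model + Fobs -> map
    flexible_fit: Callable[[Model, DensityMap], Model],  # MDFF-style fit into the map
    r_factors: Callable[[Model], Tuple[float, float]],   # returns (Rwork, Rfree)
    tolerance: float = 1e-3,                             # hypothetical stopping tolerance
    max_iterations: int = 50,
) -> Tuple[Model, float, float, int]:
    """Alternate map regeneration and fitting until Rfree stops improving.

    Returns the last accepted model, its Rwork and Rfree, and the number
    of accepted iterations.
    """
    r_work, r_free = r_factors(model)
    for iteration in range(1, max_iterations + 1):
        density = compute_map(model)              # new model-phased map
        candidate = flexible_fit(model, density)  # drive the model into the map
        new_r_work, new_r_free = r_factors(candidate)
        if r_free - new_r_free < tolerance:
            # No meaningful improvement: keep the previous model and stop.
            return model, r_work, r_free, iteration - 1
        model, r_work, r_free = candidate, new_r_work, new_r_free
    return model, r_work, r_free, max_iterations
```

Monitoring Rfree rather than Rwork in the stopping rule is a design choice made here to reduce the risk of overfitting; a production protocol would also inspect the gap between Rwork and Rfree and the model geometry.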
2.2. MDFF
xMDFF is a real-space refinement technique and relies on a previously developed method, molecular dynamics flexible fitting (MDFF), which uses a modified potential-energy function, Utotal, during MD to fit a structure into a low-resolution cryo-EM density (Trabuco et al., 2008, 2009). This function has three terms, namely
$$U_{\mathrm{total}} = U_{\mathrm{MD}} + U_{\mathrm{EM}} + U_{\mathrm{SS}}. \qquad (1)$$
The first term, UMD, is the conventional MD potential-energy function. The second term, UEM, is a potential-energy function derived from the electron density that is used to drive the structure from areas of low density to areas of high density. The third term, USS, is a potential which helps to preserve the secondary structure of proteins and nucleic acids through harmonic restraints (Trabuco et al., 2008, 2009) in addition to ensuring stereochemical correctness in chirality and peptide-bond conformations (Schreiner et al., 2011). Symmetry restraints can also be introduced if the system exhibits symmetry (Chan et al., 2011).
The VMD (Humphrey et al., 1996) plugin mdff can be employed to generate a potential-energy function UEM defined on a three-dimensional grid based on the cryo-EM density map,
$$U_{\mathrm{EM}}(\mathbf{R}) = \sum_{j} w_{j} V_{\mathrm{EM}}(\mathbf{r}_{j}), \qquad (2)$$
where
$$V_{\mathrm{EM}}(\mathbf{r}) = \begin{cases} \xi \left[ 1 - \dfrac{\Phi(\mathbf{r}) - \Phi_{\mathrm{thr}}}{\Phi_{\max} - \Phi_{\mathrm{thr}}} \right] & \text{if } \Phi(\mathbf{r}) \geq \Phi_{\mathrm{thr}}, \\ \xi & \text{if } \Phi(\mathbf{r}) < \Phi_{\mathrm{thr}}. \end{cases} \qquad (3)$$
Here Φ(r) is the density at position r; it and its maximum value Φmax are obtained from the cryo-EM data. In equation (3), a threshold Φthr is introduced to clamp the Φ(r) values that are lower than Φthr to the Φthr value, effectively removing the solvent contribution from the map and creating a flat potential for those regions. The global scaling factor ξ uniformly adjusts the strength of the influence of the cryo-EM map on the molecular system. In addition, VEM(r) has a weight wj for each atom j present at position rj. Generally, wj is set to the atomic mass; this weighting avoids strong differences in the acceleration of atoms owing to mass disparities, ensuring stability of the simulation. This choice of weighting factor is also in line with the rough correspondence between the mass of atoms and their density in a cryo-EM map.
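As an illustration of the clamping and scaling just described, the following Python sketch evaluates the density-derived potential for a handful of atoms. It assumes the linear rescaling commonly used in MDFF, in which the potential equals ξ in regions below the threshold and falls to zero at Φmax; the function names and the numerical values in the usage note are hypothetical.

```python
from typing import Sequence


def v_em(phi: float, phi_max: float, phi_thr: float, xi: float) -> float:
    """Per-position potential: xi in solvent, decreasing to 0 at the density maximum."""
    if phi_max <= phi_thr:
        raise ValueError("phi_max must be larger than phi_thr")
    if phi < phi_thr:
        return xi  # flat potential where the density is below the threshold
    return xi * (1.0 - (phi - phi_thr) / (phi_max - phi_thr))


def u_em(
    densities_at_atoms: Sequence[float],  # density sampled at each atom position
    weights: Sequence[float],             # per-atom weights w_j (e.g. atomic masses)
    phi_max: float,
    phi_thr: float,
    xi: float,
) -> float:
    """Weighted sum of the per-atom potentials."""
    if len(densities_at_atoms) != len(weights):
        raise ValueError("one weight per atom is required")
    return sum(
        w * v_em(phi, phi_max, phi_thr, xi)
        for phi, w in zip(densities_at_atoms, weights)
    )
```

For example, with `phi_max=1.0`, `phi_thr=0.2` and `xi=0.3`, an atom sitting at the density maximum contributes no energy, whereas an atom in a solvent region contributes its full weight times ξ, so the fit is rewarded for moving atoms into high density.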
The UEM potential-energy function defined by (2) and (3) is incorporated into an MD simulation using the gridForces feature (Wells et al., 2007) of NAMD (Phillips et al., 2005). The gridForces feature allows an arbitrary potential defined on a three-dimensional grid to be added to an MD simulation. The gradient of the potential is calculated by finite-difference methods and forces are applied to each atom depending on its position on the grid using an interpolation scheme. In the case of MDFF, for each atom i in the system, the resulting force is given by
$$\mathbf{f}_{i}^{\mathrm{EM}} = -\frac{\partial}{\partial \mathbf{r}_{i}} U_{\mathrm{EM}}(\mathbf{R}) = -w_{i} \frac{\partial}{\partial \mathbf{r}_{i}} V_{\mathrm{EM}}(\mathbf{r}_{i}). \qquad (4)$$
The force fiEM can be tuned via the scaling factor ξ (3), which is the same for all atoms, and the weight wi, which can be defined on a per-atom basis.
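To make the finite-difference and interpolation steps concrete, here is a minimal one-dimensional Python sketch of a grid-based steering force. It is a simplification under stated assumptions: the grid is uniform with its origin at zero, the gradient is interpolated linearly, and the function names are hypothetical. The interpolation scheme actually used by the NAMD gridForces feature is not reproduced here.

```python
from typing import List, Sequence


def grid_gradient(values: Sequence[float], spacing: float) -> List[float]:
    """Central differences inside the grid, one-sided differences at the two ends."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two grid points are required")
    gradient = []
    for k in range(n):
        lo = max(k - 1, 0)
        hi = min(k + 1, n - 1)
        gradient.append((values[hi] - values[lo]) / ((hi - lo) * spacing))
    return gradient


def interpolate(grid_values: Sequence[float], spacing: float, x: float) -> float:
    """Linear interpolation at position x; positions outside the grid are clamped."""
    t = min(max(x / spacing, 0.0), float(len(grid_values) - 1))
    k = min(int(t), len(grid_values) - 2)
    frac = t - k
    return (1.0 - frac) * grid_values[k] + frac * grid_values[k + 1]


def fit_force(potential: Sequence[float], spacing: float, x: float, weight: float) -> float:
    """Force on one atom: minus the weight times the interpolated potential gradient."""
    return -weight * interpolate(grid_gradient(potential, spacing), spacing, x)
```

With a potential well such as `[1.0, 0.5, 0.0, 0.5, 1.0]`, an atom to the left of the minimum feels a positive force and an atom to the right a negative one, so both are steered towards the low-energy (high-density) region.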
2.3. xMDFF
To extend MDFF to low-resolution X-ray crystallography and create xMDFF, the MDFF protocol was adjusted to work with electron densities produced from X-ray diffraction instead of a cryo-EM source (Fig. 2). To this end, xMDFF uses model-phased maps which incorporate the phases φcalc from a tentative model and the amplitudes Fobs from the X-ray diffraction data. Ideally, this approach produces a density which, although biased by the phasing model, is expected to contain a sufficient contribution from the diffraction data such that the density can be used as a target for refinement. However, this may not be the case if the experimental data are too low resolution (>∼7 Å as seen in the Ci-VSP case below) or excessively noisy, or if the phasing model greatly differs from the experimental structure. The model-phased density can be used as a potential (2) to steer the structure into the appropriate locations using MDFF forces (4), now termed ffit. Once the structure is fitted into the density, it provides new phases φcalc to be used with the Fobs to generate an updated density. This structure is then fitted into the new map using MDFF, and the process proceeds iteratively until a sufficiently low Rfree is obtained.
To create the densities, xMDFF utilizes tools in the PHENIX software suite (Adams et al., 2010) to generate 2mFobs − DFcalc maps. These maps highlight the areas of the density where the difference between Fobs and Fcalc is greatest, suggesting that these parts of the structure require refinement. xMDFF employs additional features of PHENIX to improve the densities, such as bulk-solvent correction and B-factor sharpening, which improves the maps, particularly at low resolutions (DeLaBarre & Brunger, 2003). All density maps generated for use in xMDFF exclude the Rfree reflections. This results in poorer quality maps, but allows proper use of the Rfree metric for an unbiased evaluation, which is especially useful for low-resolution data. To help correct for model bias caused by using a homology model to supply phase information, xMDFF employs kicked maps (Pražnikar et al., 2009), which are produced by randomly perturbing the structure multiple times, calculating densities with the new φcalc and averaging the results. Additionally, xMDFF can make use of inherent sampling owing to the MD-based nature of the method and can increase or lower the temperature to control the thermal fluctuations of the system. Conventional MD simulations generate ensembles of atomic structures under constraints such as constant pressure (P), volume (V), temperature (T) and number of particles (N). Generally, xMDFF is compatible with any such ensemble-generation scheme, e.g. constant NPT or NVT, and microenvironmental conditions, such as vacuum, explicit/implicit solvent or membrane, achievable within typical NAMD simulations. For the examples in the subsequent sections, constant volume and temperature, i.e. NVT, ensembles were chosen in vacuum. Details of MD conditions are provided in the xMDFF scripts in the Supporting Material.
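The perturb-and-average idea behind kicked maps can be illustrated with a deliberately simplified Python sketch. It is a toy model, not the PHENIX implementation: real kicked maps are computed from structure factors and phases, whereas here a one-dimensional density is built directly from Gaussian "atoms". The kick size, number of kicks, Gaussian width and all function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def toy_density(positions: Sequence[float], grid: Sequence[float], sigma: float = 1.0) -> List[float]:
    """One-dimensional density: a sum of unit-height Gaussians centred on the atoms."""
    return [
        sum(math.exp(-0.5 * ((g - p) / sigma) ** 2) for p in positions)
        for g in grid
    ]


def kicked_density(
    positions: Sequence[float],
    grid: Sequence[float],
    max_kick: float = 0.5,  # hypothetical maximum random displacement
    n_kicks: int = 20,      # hypothetical number of perturbed copies
    seed: int = 0,
) -> List[float]:
    """Average the densities of randomly perturbed copies of the model."""
    rng = random.Random(seed)
    total = [0.0] * len(grid)
    for _ in range(n_kicks):
        kicked = [p + rng.uniform(-max_kick, max_kick) for p in positions]
        for i, value in enumerate(toy_density(kicked, grid)):
            total[i] += value
    return [t / n_kicks for t in total]
```

Averaging over many perturbed copies smooths features that depend on the exact placement of individual atoms, which is the intuition for why such maps are less biased by the phasing model.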
Much of the analysis of the xMDFF-refined structures was performed with the phenix.model_vs_data package in PHENIX, including the computation of Rwork, Rfree and MolProbity (Chen et al., 2010) statistics. All analysis was performed on structures with B factors obtained through individual ADP refinement in phenix.refine.
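For readers unfamiliar with the two R factors, the sketch below computes them from lists of observed and calculated structure-factor amplitudes using the standard definition R = Σ| |Fobs| − k|Fcalc| | / Σ|Fobs|, evaluated separately on the working and free reflections. It is a simplified illustration: the single overall scale factor k derived from the working set is an assumption made here, and the bulk-solvent and anisotropic scaling performed by PHENIX are omitted.

```python
from typing import Sequence, Tuple


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float], scale: float) -> float:
    """R = sum(|Fobs - k*Fcalc|) / sum(Fobs) for amplitude lists of equal length."""
    denominator = sum(f_obs)
    if denominator <= 0.0:
        raise ValueError("observed amplitudes must have a positive sum")
    return sum(abs(o - scale * c) for o, c in zip(f_obs, f_calc)) / denominator


def r_work_and_free(
    f_obs: Sequence[float],
    f_calc: Sequence[float],
    free_flags: Sequence[bool],  # True marks a reflection held out for Rfree
) -> Tuple[float, float]:
    """Split reflections into working and free sets and return (Rwork, Rfree)."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    held_out = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    if not work or not held_out:
        raise ValueError("both working and free sets must be non-empty")
    # Assumption: one overall scale factor, derived from the working set only.
    scale = sum(o for o, _ in work) / sum(c for _, c in work)
    r_work = r_factor([o for o, _ in work], [c for _, c in work], scale)
    r_free = r_factor([o for o, _ in held_out], [c for _, c in held_out], scale)
    return r_work, r_free
```

Because the free reflections are never used to build the maps or to set the scale, a large gap between the two values is a warning sign of overfitting.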
2.3.1. Refinement protocol
xMDFF refinement is performed in multiple stages, tweaking the parameters as outlined here, which follows the general protocol employed for the cases presented in this paper. If the initial phasing model is thought to differ from the reference model by large-scale conformational changes (>∼2 Å r.m.s.d.), it is best to first only couple the backbone atoms to the density-derived potential. Furthermore, the global scaling factor ξ (3) is set to a low value, ∼0.1, which helps to reduce the overall force felt by each of the selected atoms owing to the density. Both of these settings allow the system to remain more flexible and not be heavily constrained to the map, which is likely to be quite noisy at the early stage of the refinement. The flexibility is also required for adequate sampling of the density map. As the r.m.s.d. of the system stabilizes with time, the side-chain atoms can be coupled to the density-derived potential and ξ can be increased to ∼0.3–0.5. Once the r.m.s.d. stabilizes again, it can be beneficial to further increase ξ and begin reducing the temperature of the system to 0 K. The protocol is illustrated using the change of the r.m.s.d. of the phasing model relative to a known reference structure as shown in Supplementary Fig. S1 for a simple test system as described in §3.1.
As in other refinement techniques, xMDFF can also perform simulated annealing by increasing and subsequently decreasing the temperature of the system for multiple iterations to help avoid the structure becoming trapped in any local energy minimum. At this stage of the refinement it is important to frequently analyze the geometry, Rwork and Rfree of the structure. Poor geometries such as bad dihedral angles and a large difference between Rwork and Rfree can be indicative of overfitting a structure to the density and should be avoided.
Real-space refinement methods are expected to have a wide convergence radius, as has been formally shown, provided that the initial phases are of good quality (Diamond, 1971). MD and simulated-annealing protocols further improved the convergence radius of real-space refinements (Brünger et al., 1987). However, MD sampling of the side chains at low resolutions becomes computationally expensive. Historically, such issues have been addressed with dihedral sampling techniques (Rice & Brünger, 1994). As noted here, xMDFF uses simulated annealing to address the sampling of side chains. To increase the speed of fitting side chains and to improve their placement, future versions of xMDFF will incorporate data from rotamer libraries (Subramaniam & Senes, 2012). Improved fitting of side chains can also be achieved by using more realistic simulation environments. All of the refinements discussed here were performed in vacuum; however, it has been shown that MDFF-derived structures can be improved by the inclusion of explicit water molecules during the simulation, or with the use of a generalized Born implicit solvent for better computational performance with similar results (Tanner et al., 2011). Additionally, membrane proteins can be simulated in a membrane, as in the previous case of MDFF studies of the ribosome (Frauenfeld et al., 2011).
The computational cost of a typical xMDFF refinement will vary based on the system size and the number of iterations required, which is very system-dependent. Generally, we find that xMDFF can be computationally more demanding than other refinement software owing to the full MD nature of the method. However, because xMDFF is an extension of MDFF, which was developed as part of NAMD, it is able to scale up from a single CPU core to thousands, potentially allowing the protocol to be applied to large systems such as the multi-million-atom HIV capsid (Zhao et al., 2013). Furthermore, NAMD, and thus xMDFF, can utilize GPU acceleration to improve simulation performance (Stone et al., 2007, 2010).
3. Results
3.1. Proof of principle
The performance of xMDFF was evaluated on a test structure, ribose-binding protein, with two known conformations at high resolution. The open conformation (PDB entry 1urp; 2.3 Å resolution) was used as an initial phasing model, with the closed conformation (PDB entry 2dri; 1.6 Å resolution) as a target model (Fig. 3). The refinement was performed using the diffraction data for the target model at four resolutions (3.5, 4, 4.5 and 5 Å), created by truncating the original intensities at each resolution limit. The four refinements began with the same initial phasing model and were evaluated against the same target; the final refined structures were evaluated using the overall improvement in Rfree as well as the root-mean-squared deviation (r.m.s.d.) from the target model. xMDFF refinements improved the Rfree value dramatically at every resolution, with an initial value of 0.57 and a final value of 0.23 at a resolution of 3.5 Å (Table 1). The all-heavy-atom r.m.s.d. for the refined structure at every resolution was 3.0 Å from the high-resolution 2dri target, down from the initial 5.46 Å. However, the final r.m.s.d. of the backbone alone at 3.5 Å resolution was 0.53 Å (down from an initial 4.46 Å), demonstrating proper backbone placement relative to the target model. In this case, the all-atom r.m.s.d.s are much higher relative to the backbone r.m.s.d.s because the target model was originally refined against 1.6 Å resolution data, where the side chains are much better resolved. Using the low-resolution synthetic data, the backbone can still be fitted quite well, but it is much harder to refine the side chains to the same extent.
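The distinction between all-atom and backbone r.m.s.d. used above can be made explicit with a small Python sketch. It assumes that the two structures are already superimposed and list the same atoms in the same order; the backbone atom-name set and the function names are illustrative choices, not taken from the xMDFF scripts.

```python
import math
from typing import List, Sequence, Tuple

Atom = Tuple[str, Tuple[float, float, float]]  # (atom name, Cartesian coordinates)

BACKBONE_NAMES = {"N", "CA", "C", "O"}  # assumed definition of protein backbone atoms


def rmsd(model: Sequence[Atom], target: Sequence[Atom]) -> float:
    """Root-mean-square deviation between matched atoms (no superposition is done)."""
    if not model or len(model) != len(target):
        raise ValueError("structures must be non-empty and have matching atoms")
    squared = 0.0
    for (name_a, xyz_a), (name_b, xyz_b) in zip(model, target):
        if name_a != name_b:
            raise ValueError("atom order differs between the structures")
        squared += sum((a - b) ** 2 for a, b in zip(xyz_a, xyz_b))
    return math.sqrt(squared / len(model))


def backbone_only(atoms: Sequence[Atom]) -> List[Atom]:
    """Keep only the atoms whose names belong to the backbone set."""
    return [atom for atom in atoms if atom[0] in BACKBONE_NAMES]
```

Calling `rmsd(backbone_only(model), backbone_only(target))` alongside `rmsd(model, target)` reproduces the kind of comparison reported here, in which a well-placed backbone can coexist with a larger all-atom deviation dominated by side chains.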
To verify that xMDFF was performing as well as it could, a standard MDFF simulation was performed using density created directly from the 2dri target model and Fobs reduced to 3.5 Å resolution, which underwent non-iterative refinement. The final structure obtained from this fitting had an all-atom r.m.s.d. of 3.01 Å and a backbone r.m.s.d. of 0.56 Å, very close to those of the xMDFF refinements and demonstrating that xMDFF performs well against a more appropriate benchmark. Although the synthesized 3.5 Å resolution map manifests the best possible density at this resolution, it has much less side-chain information than the original 1.6 Å resolution map. This lack of side-chain density negatively affects side-chain fitting, and thus increased the side-chain r.m.s.d. relative to the known high-resolution target and accounts for the poor side-chain refinements in xMDFF. Since the initial model was originally obtained through refinement against high-resolution data, the overall structural geometry including the percentage of favored Ramachandran angles started very high at 98.5%. However, xMDFF did manage to improve the overall MolProbity (Chen et al., 2010) scores, primarily through a decrease in rotamer outliers as well as in steric clashes. In practice, xMDFF improves the MolProbity statistics of almost all models discussed here owing to the MD-based nature of the method, which provides excellent structural restraints on the system.
We further tested the refinement capabilities of xMDFF by employing more realistic low-resolution synthetic data. After truncating the high-resolution data to 5 Å, the data were smoothed using a B factor of 35.00 Å2, in a fashion similar to the technique used in Schröder et al. (2010). Such a smoothing procedure further reduces the signal-to-noise ratio of the 5 Å resolution diffraction data, posing perhaps a more realistic scenario. The refinement resulted in very similar but slightly worse results, with a backbone r.m.s.d. of 0.93 Å, an Rfree of 0.34 and an Rwork of 0.28. To test the robustness of xMDFF refinements to the choice of initial structure, the 5 Å resolution diffraction data were refined using another search model that has the same overall r.m.s.d. from the target but with a shift in the positioning of the helices. The new search model, shown in Supplementary Fig. S1, has a backbone r.m.s.d. of 4.45 Å from 2dri, which is comparable to that of 1urp, but it also has an r.m.s.d. of 1.5 Å relative to 1urp. The results are slightly worse but very similar, with an r.m.s.d. of 0.8 Å, a final Rfree of 0.34 (down from an initial 0.62) and a final Rwork of 0.26 (down from 0.56).
Additionally, during refinements of the test model the scaling factor which determines the overall strength of the steering forces being applied was varied in order to determine the optimal parameterization for future work (see §2 for additional information). A lower scaling factor was determined to be useful during early stages of backbone refinement to keep the system flexible, but increasing the scaling factor is considered useful as refinement progresses in order to couple the structure more strongly to the density and improve the fit; however, overfitting needs to be avoided at this stage.
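The staged use of the scaling factor discussed here and in §2.3.1 can be written down as a small Python data structure. This is a sketch only: the values ξ ≈ 0.1 and ξ ≈ 0.3 follow the text, and the final stage cools to 0 K as described, but the 300 K simulation temperature, the final ξ of 1.0, the stabilization window and tolerance, and all names are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Stage:
    name: str
    coupled_atoms: str    # which atoms feel the density-derived potential
    xi: float             # global scaling factor
    temperature_k: float  # target temperature in kelvin


STAGES: List[Stage] = [
    Stage("backbone fitting", "backbone", 0.1, 300.0),         # 300 K is assumed
    Stage("side-chain fitting", "all heavy atoms", 0.3, 300.0),
    Stage("final cooling", "all heavy atoms", 1.0, 0.0),       # xi = 1.0 is assumed
]


def is_stable(rmsd_trace: Sequence[float], window: int = 5, tolerance: float = 0.05) -> bool:
    """Treat the r.m.s.d. as stabilized when its recent spread is below the tolerance."""
    if len(rmsd_trace) < window:
        return False
    recent = rmsd_trace[-window:]
    return max(recent) - min(recent) < tolerance


def next_stage_index(current: int, rmsd_trace: Sequence[float]) -> int:
    """Advance to the next stage once the r.m.s.d. has stabilized; stay on the last stage."""
    if is_stable(rmsd_trace) and current < len(STAGES) - 1:
        return current + 1
    return current
```

Keeping the schedule as data makes it easy to add simulated-annealing cycles or to tune ξ per system while monitoring the geometry and the gap between Rwork and Rfree for signs of overfitting.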
3.2. Improving refinements of reported PDB structures
To test whether xMDFF is capable of improving previously refined structures, it was applied to six structures at 4–4.5 Å resolution deposited previously into the PDB (Fig. 1). The structures served as an initial phasing model and xMDFF was able, without any further knowledge of the reference model, to improve the Rfree by at least 0.01 (PDB entry 1aos) and up to a maximum of 0.08 (PDB entries 1av1 and 1xdv) in the case of all six structures (Table 1). Furthermore, the Rfree and Rwork values of each system were relatively close, indicating that xMDFF is not overfitting the structures. Additionally, every xMDFF-refined structure exhibits an improved structural geometry, as shown by a higher percentage of Ramachandran favored angles and a lower overall MolProbity score over those of the initial structures (Table 1). In one of the most improved cases (PDB entry 1xdv), the main cause for the lowered Rfree was improvement in a highly flexible region with relatively large root-mean-square fluctuations (Supplementary Fig. S2).
This region shifted the most during xMDFF refinement, with an r.m.s.d. of 4.3 Å from the initial to the final model (part of which is shown in Fig. 4). Flexible regions often diffract poorly and can be difficult to properly place using traditional means of X-ray refinement, especially at low resolution. Densities were generated using Fobs and also φcalc from the initial and final models, respectively (Fig. 4). A significant improvement in the quality of the densities can be seen in terms of their respective completeness and how well the structure fits inside the density, with the latter captured by an increase in the local CC from 0.47 (initial, blue) to 0.63 (final, red).
The application of force-field-based MD simulations for model refinement inherently provides geometries consistent with some energy minima. Not only does the procedure improve protein conformations, but it also conserves the structures of cofactors, keeping them compatible with the surrounding protein. For example, xMDFF refinement of human hemoglobin (PDB entry 1ye1) preserves the planar conformation of the heme group. Consequently, distortions in cofactors accompanying multiple conformational states of protein crystals can be accounted for by low-resolution diffraction data (Supplementary Fig. S3a).
xMDFF provides the atomic positions as well as the chemical bonding in the low-resolution X-ray map. In many cases, the assumed connectivity between atoms is reflected in the degree of structural refinement, which in turn can affect predictions of the associated function. For example, in the case of the pentameric cobratoxin (Cbtx)-bound acetylcholine-binding protein (AChBP) complex (PDB entry 1yi5), the complex has a dense core composed primarily of the AChBPs. The cobratoxins are composed of two antiparallel β-sheets forming slightly concave discs, five of which emerge from the dense AChBP core, with each disc sporting ten cysteine residues. However, given only the positions of atoms it is unclear whether the cysteines are in oxidized or reduced states, i.e. whether disulfide bridges are present or not. Using the 4.2 Å resolution diffraction data, xMDFF refinements were performed assuming either the presence or the absence of the disulfide bridges (Supplementary Fig. S3b). A more pronounced refinement was achieved with the assumption of oxidized cysteines (a final Rwork of 0.26 and an Rfree of 0.29 in contrast to the final Rwork of 0.27 and Rfree of 0.31 for the reduced state), implying the existence of disulfide bridges. The presence of the five xMDFF-predicted disulfide bridges per Cbtx disc has been validated in biochemical studies, whereby four bridges are key to the stability and the concave shape of the disc and the fifth is responsible for optimal Cbtx–AChBP binding (Bourne et al., 2005). Thus, xMDFF provided biomolecular structures with energetically favorable geometries that interpret low-resolution density maps as well as help to clarify the structures' relevant biological functions.
The xMDFF-based structures [PDB entries 1av1 (Borhani et al., 1997), 1xdv (Sondermann et al., 2004), 1yi5 (Bourne et al., 2005), 1aos (Turner et al., 1997), 1jl4 (Wang et al., 2001) and 1ye1 (Kavanaugh et al., 2005)] in Fig. 1 showed improvement with regard to R factor and geometry over structures that had resulted from conventional approaches, e.g. REFMAC or CNS/X-PLOR (PDB entries 1yi5, 1jl4, 1ye1, 1av1, 1xdv and 1aos), applied to the same diffraction data, most likely owing to the combination of starting xMDFF with good initial phases from an already refined search model and having MD incorporated in xMDFF. An identical conclusion on the use of good initial phases was drawn from X-PLOR refinements with MD simulations. However, models 1jl4 and 1av1, which were published in 2001 and 1997, respectively, resulted from early-generation methods and accordingly had relatively high R-factor values, posing an easy challenge for our present xMDFF treatment. In other cases (1yi5 and 1ye1) the original refinements involved initial REFMAC or X-PLOR rigid-body fitting of the search model used and the authors had to resort to manual fitting employing FRODO to account for the needed structural flexibility. The automated, force-field-based flexible fitting algorithm in xMDFF should avoid errors arising in manual fitting. Altogether, for the discussed examples, the xMDFF-refined models are found to be better than or at least as good as the published models.
3.3. xMDFF and experimental validation of the Ci-VSP crystal structure
Finally, xMDFF was applied to solve the structure of a voltage-sensing protein, Ci-VSP, using 3.6, 4 and 7 Å resolution diffraction data (Fig. 5). To validate the reliability of the xMDFF refinement, the work was carried out in parallel with an independent experimental investigation of the structure and function of Ci-VSP (Li et al., 2014).
Voltage-sensing protein is a common scaffold present in voltage-gated ion channels, voltage-sensitive enzymes and voltage-gated proton channels, which are related to diverse important physiological functions. As illustrated in Fig. 5, the protein under current investigation, Ci-VSP, is arranged as an antiparallel four-transmembrane-helix bundle S1–S4; the overall structure is in agreement with the basic three-dimensional architecture of all known voltage-sensor proteins (Jiang et al., 2003; Long et al., 2007; Payandeh et al., 2011; Zhang et al., 2012). The positively charged S4 helix within Ci-VSP reorients upon stimulus from a transmembrane electric field, leading to downstream responses. Despite a wealth of structural and functional data, the details of this conformational change remain controversial, in particular the movement of the S4 helix.
According to electrophysiological results, Ci-VSP at 0 mV assumes the resting (Down) state in the wild type (WT) but the activated (Up) state in the R217E mutant. Crystal structures of both states have been determined experimentally: R217E at 2.5 Å resolution and WT at 3.6 Å resolution (Li et al., 2014). Unfortunately, there was crystallographic uncertainty in the S4 position in the Ci-VSP WT 3.6 Å resolution electron-density map. Spectroscopic data had limited the S4 position of Ci-VSP WT to three options in reference to the R217E structure: no conformational change, one click down and two clicks down, where a click refers to the offset of a helix by one turn. Obviously, the confirmation of the S4 position in Ci-VSP WT became the key to the puzzle. xMDFF was applied to predict the WT Ci-VSP structure and, thereby, to resolve the uncertainty in the low-resolution data.
Refinement started from a MUFOLD-predicted (Zhang et al., 2010) medium-confidence homology model developed using information from 13 proteins (Supporting Information). During refinement, the tentative model underwent a remarkable large-scale deformation with an r.m.s.d. of 5.96 Å. Unlike many traditional techniques, xMDFF is able to handle such large-scale structural deformations between the initial and final structures, producing in the present case final Rfree values of 0.28 and 0.29, starting from initial Rfree values of 0.50 and 0.48, at 3.6 and 4 Å resolution, respectively (Table 1). Positioning of the functionally relevant S4 helices is in excellent agreement with the one-click-down model of WT Ci-VSP, with an r.m.s.d. ranging from 0.4 to 1 Å.
To further confirm the S4 position, potential structural models were generated by gradual shift and rotation of the S4 helix from the Up-conformation model to the two-click-down model in 2000 even steps (in all a ∼10 Å vertical displacement and ∼110° rotation). Two independent parameters for model evaluation were calculated from each of these 2000 structures: (i) the crystallographic Rfree value and (ii) the correlation coefficient (CC) between the experimental density map of the S4 region and the calculated electron density from the refined model (Supplementary Fig. S4). Fig. 5(b) shows that the structure corresponding to the Rfree minimum and CC maximum from Supplementary Fig. S4 resides in a region that unambiguously places the position of the S4 helix in the one-click-down position. Assuming that movement of the S4 helix in Ci-VSP follows the classic helix-screw or sliding-helix mode (Li et al., 2014), other degrees of freedom that are not involved in any helix-screw type of motion are considered to be redundant. Consequently, the Rfree minimum identified by xMDFF along the chosen line of helix-screw offsets is likely to be global for the resting state of WT Ci-VSP. xMDFF clearly differentiates the pattern of side chains associated with a specific S4 helix position even though the individual side chains are not fully visible at 3.6 Å resolution. The resulting S4 position is three residues lower than that of the Up structure, positioning Arg residues into the protein interior and away from and is in excellent agreement with the one-click-down model.
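The scan over S4 positions lends itself to a short Python sketch of the bookkeeping involved. It assumes that the 2000 steps are evenly spaced and include both end points, and that an Rfree and a CC value have already been computed for every step by external crystallographic software; the function names are hypothetical, and the CC shown is the ordinary Pearson correlation between two sampled maps.

```python
import math
from typing import List, Sequence, Tuple


def scan_path(
    n_steps: int = 2000,
    total_shift: float = 10.0,      # approximate vertical displacement in angstroms
    total_rotation: float = 110.0,  # approximate rotation in degrees
) -> List[Tuple[float, float]]:
    """Evenly spaced (shift, rotation) pairs from the Up model to the two-click-down model."""
    if n_steps < 2:
        raise ValueError("at least two steps are required")
    last = n_steps - 1
    return [(total_shift * k / last, total_rotation * k / last) for k in range(n_steps)]


def pearson_cc(map_a: Sequence[float], map_b: Sequence[float]) -> float:
    """Pearson correlation coefficient between two maps sampled on the same grid."""
    if len(map_a) != len(map_b) or len(map_a) < 2:
        raise ValueError("maps must have the same length (at least two points)")
    mean_a = sum(map_a) / len(map_a)
    mean_b = sum(map_b) / len(map_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("maps must not be constant")
    return cov / math.sqrt(var_a * var_b)


def best_steps(r_free: Sequence[float], cc: Sequence[float]) -> Tuple[int, int]:
    """Indices of the Rfree minimum and of the CC maximum along the scan."""
    index_r = min(range(len(r_free)), key=r_free.__getitem__)
    index_cc = max(range(len(cc)), key=cc.__getitem__)
    return index_r, index_cc
```

When the two indices returned by `best_steps` fall in the same region of the path, the two independent criteria agree on the helix position, which is the argument made for the one-click-down placement.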
Refinement with the 7 Å resolution data was not as pronounced as that with the 3.6 and 4 Å resolution data. However, improvements were still observed in the r.m.s.d. and the R factors. The R factors from the resulting structure were considerably higher than those obtained at higher resolutions. Unlike the higher resolution maps, the 7 Å resolution data failed to distinguish between the Up and one-click-down models of Ci-VSP. Furthermore, using the 7 Å resolution data the R factors derived from the high-resolution Up or one-click-down structures were comparable to those of the xMDFF output. Thus, we conclude that the 7 Å resolution map is too coarse to resolve any information relevant to S4 placement.
The present example reaffirms the capability of xMDFF to address large-scale structural deformations, as has been shown for the test case with synthetic data sets, and to produce significant refinements yielding realistic structures, but now from noisier low-resolution experimental data.
4. Discussion
The introduction of MD and simulated-annealing algorithms has facilitated refinement from low-resolution diffraction data sets (Brünger et al., 1987). xMDFF provides a significant step forward among the MD-based algorithms as a crystallographic tool. A brief comparison of xMDFF predictions with those from other available methods is provided in Supplementary Table S2. Broadly speaking, xMDFF refinements provide the lowest R factors, minimal overfitting and improved structural statistics among the methods compared [DEN in both CNS and PHENIX (Brünger et al., 1998; Schröder et al., 2010) in addition to default PHENIX (Adams et al., 2010) refinement]. However, as further discussed in the Supporting Information, although we try to achieve a fair comparison, the results might depend on the user's knowledge of the system, the application of the software and the usage of manual fitting. For a more controlled comparison, xMDFF was used to refine two structures using the same initial models (PDB entries 3kso and 3k0i) and reflection data (PDB entry 3k07) from a previous comparison of Rosetta–PHENIX (DiMaio et al., 2013), CNS DEN (Brünger et al., 1998; Schröder et al., 2010), PHENIX (Adams et al., 2010) and REFMAC5 (Murshudov et al., 2011). The two structures are of sufficiently low resolution, >∼4 Å, and the search models are displaced by >∼5 Å relative to the target. In the case of 3kso, xMDFF achieved a better refined structure than the best one from DiMaio and coworkers, which was obtained using CNS DEN (using Rwork and Rfree as the determinants). This improvement is evident in the lower Rwork (0.2518), Rfree (0.3509) and MolProbity score (4.03) than those obtained for the CNS DEN-refined structure (Rwork 0.319, Rfree 0.387, MolProbity score 4.15). When starting with the 3k0i search model, the xMDFF-refined structure has a higher Rwork (0.3226) and Rfree (0.3944) but a lower MolProbity score (3.00) than those obtained for the best refined structure, again using CNS DEN (Rwork 0.307, Rfree 0.368, MolProbity score 3.94). However, xMDFF still produces better results than the rest of the methods in the study. It should also be noted that while the Rfree and Rwork of Rosetta–PHENIX-refined structures in these two cases are worse than those with CNS DEN, they have much better MolProbity scores of 2.00 for 3kso and 1.91 for 3k0i, a trend that is observed in all but two of the test cases.
Additional considerations must be made to satisfactorily, if not conclusively, compare the multiple refinement protocols. Restricting discussions to only the real-space protocols compared here brings forth several differences. For example, elastic networks, as implemented in DEN-based protocols, robustly predict collective global motions but do not work as well for describing local changes. Therefore, additional care should be taken when dealing with side chains through the use of library-based protocols. Furthermore, elastic network or heuristic force-field-based algorithms often require optimization of the interatomic interactions with each new class of applications. Universal force-field-based protocols such as ours are expected to be more physically correct as the atomic interactions are calibrated against very accurate quantum calculations. However, sometimes the application of different force fields leads to different results owing to differences in the calibration protocol used for force-field development. Thus, any further comparison of the various real-space protocols is beyond the scope of this paper.
xMDFF guides the dynamics of a search model to a refined structure through use of first principles via universal force fields together with restraints from the X-ray maps. Consequently, manual real-space fitting, as is commonly used with reciprocal-space protocols, is often avoided. This benefit is of relevance to protocols in drug discovery, which often require determination of multiple crystal structures distributed over a broad range of conformations or ligand-bound states. If performed manually, refinement in drug discovery requires extensive repetition of the same task for each structure. xMDFF naturally provides a semi-automated computational platform to systematically perform several model-building and repetitive steps amenable to high-throughput crystallography.
The initial xMDFF implementation presented here has several limitations that we hope to address through future development. Firstly, the quality of the results might depend on the nature of the force field used, the effects of which require further investigation. Additionally, the present implementation is unable to handle changes in secondary structure during the simulation, and thus the refinement is strongly biased by the folds present in the search model. Any change in the secondary structure can only be invoked at the homology-modelling stage and not during the refinement. Finally, the current xMDFF implementation is limited to refinement of biomolecules for which force fields are already available. Refinement of nonbiological systems is subject to force-field availability.
In summary, the application of xMDFF to synthetic as well as experimental low-resolution X-ray data demonstrated the capability of the software to refine phasing models 6 Å away from target data even with maps as coarse as 7 Å resolution, a feat thus far achieved, to the best of our knowledge, very rarely in low-resolution X-ray crystallography. MD naturally provides the necessary sampling required to flexibly fit a model into an electron-density map. Through flexible fitting into iteratively updated model-phased maps, xMDFF can bring about a series of large-scale deformations of the initial structure relevant to its refinement. Detailed features characterizing macromolecular function, such as the location of helices in voltage-sensor proteins or disulfide-bridge networks in cobratoxin, have been accurately determined. The application of force fields within xMDFF achieves realistic atomic structures with sterically and conformationally acceptable geometries. Low MolProbity scores together with a consistently small difference between Rwork and Rfree values imply negligible overfitting in xMDFF refinements. The quality of the overall refinement is confirmed via improvements in cross-correlations with simulated density. Finally, xMDFF output structures require little post-processing to initiate MD simulations for subsequent analysis of their dynamics in a host medium. Taken together, xMDFF with sequence information and homology modelling provides a general approach to determining all-atom structures from low-resolution X-ray data.
Supporting information
Supplementary Figures and Tables. DOI: 10.1107/S1399004714013856/rr5069sup1.pdf
Information, configuration files, and scripts for running xMDFF software. DOI: 10.1107/S1399004714013856/rr5069sup2.gz
Acknowledgements
This work was supported by grants NIH 9P41GM104601, NIH 5R01GM098243-02 and NIH U54GM087519 from the National Institutes of Health. The authors also acknowledge the Beckman Postdoctoral Fellowship program for supporting A. Singharoy.
References
Adams, P. D. et al. (2010). Acta Cryst. D66, 213–221.
Agirrezabala, X., Schreiner, E., Trabuco, L. G., Lei, J., Ortiz-Meoz, R. F., Schulten, K., Green, R. & Frank, J. (2011). EMBO J. 30, 1497–1507.
Becker, T., Bhushan, S., Jarasch, A., Armache, J.-P., Funes, S., Jossinet, F., Gumbart, J., Mielke, T., Berninghausen, O., Schulten, K., Westhof, E., Gilmore, R., Mandon, E. C. & Beckmann, R. (2009). Science, 326, 1369–1373.
Borhani, D. W., Rogers, D. P., Engler, J. A. & Brouillette, C. G. (1997). Proc. Natl Acad. Sci. USA, 94, 12291–12296.
Bourne, Y., Talley, T., Hansen, S., Taylor, P. & Marchot, P. (2005). EMBO J. 24, 1512–1522.
Bricogne, G. & Irwin, J. (1996). Proceedings of the CCP4 Study Weekend. Macromolecular Refinement, edited by E. Dodson, M. Moore, A. Ralph & S. Bailey, pp. 85–92. Warrington: Daresbury Laboratory.
Brunger, A. T., Adams, P. D., Fromme, P., Fromme, R., Levitt, M. & Schröder, G. F. (2012). Structure, 20, 957–966.
Brünger, A. T., Kuriyan, J. & Karplus, M. (1987). Science, 235, 458–460.
Brünger, A. T. (1988). Crystallographic Computing 4: Techniques and New Technologies, edited by N. W. Isaacs & M. R. Taylor. Oxford: Clarendon Press.
Brünger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J.-S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T. & Warren, G. L. (1998). Acta Cryst. D54, 905–921.
Chan, K.-Y., Gumbart, J., McGreevy, R., Watermeyer, J. M., Sewell, B. T. & Schulten, K. (2011). Structure, 19, 1211–1218.
Chapman, M. S. (1995). Acta Cryst. A51, 69–80.
Chapman, M. S. & Blanc, E. (1997). Acta Cryst. D53, 203–206.
Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010). Acta Cryst. D66, 12–21.
DeLaBarre, B. & Brunger, A. T. (2003). Nature Struct. Biol. 10, 856–863.
Delarue, M. (2008). Acta Cryst. D64, 40–48.
Depristo, M. A., de Bakker, P. I. W., Johnson, R. J. K. & Blundell, T. L. (2005). Structure, 13, 1311–1319.
Diamond, R. (1971). Acta Cryst. A27, 436–452.
DiMaio, F., Echols, N., Headd, J. J., Terwilliger, T. C., Adams, P. D. & Baker, D. (2013). Nature Methods, 10, 1102–1104.
DiMaio, F., Terwilliger, T. C., Read, R. J., Wlodawer, A., Oberdorfer, G., Wagner, U., Valkov, E., Alon, A., Fass, D., Axelrod, H. L., Das, D., Vorobiev, S. M., Iwai, H., Pokkuluri, P. R. & Baker, D. (2011). Nature (London), 473, 540–543.
Frauenfeld, J., Gumbart, J., van der Sluis, E. O., Funes, S., Gartmann, M., Beatrix, B., Mielke, T., Berninghausen, O., Becker, T., Schulten, K. & Beckmann, R. (2011). Nature Struct. Mol. Biol. 18, 614–621.
Gumbart, J., Schreiner, E., Trabuco, L. G., Chan, K.-Y. & Schulten, K. (2011). Molecular Machines in Biology, edited by J. Frank, pp. 142–157. Cambridge University Press.
Gumbart, J., Trabuco, L. G., Schreiner, E., Villa, E. & Schulten, K. (2009). Structure, 17, 1453–1464.
Haddadian, E. J., Gong, H., Jha, A. K., Yang, X., DeBartolo, J., Hinshaw, J. R., Rice, P. A., Sosnick, T. R. & Freed, K. F. (2011). Biophys. J. 101, 899–909.
Hendrickson, W. A. (1985). Methods Enzymol. 115, 252–270.
Hsin, J., Gumbart, J., Trabuco, L. G., Villa, E., Qian, P., Hunter, C. N. & Schulten, K. (2009). Biophys. J. 97, 321–329.
Humphrey, W., Dalke, A. & Schulten, K. (1996). J. Mol. Graph. 14, 33–38.
Jiang, Y., Lee, A., Chen, J., Cadene, M., Chait, B. T. & MacKinnon, R. (2003). Nature (London), 423, 33–41.
Karmali, A. M., Blundell, T. L. & Furnham, N. (2009). Acta Cryst. D65, 121–127.
Kavanaugh, J. S., Rogers, P. H. & Arnone, A. (2005). Biochemistry, 44, 6101–6121.
Li, Q., Wanderling, S., Paduch, M., Medovoy, D., Singharoy, A., McGreevy, R., Villalba-Galea, C., Hulse, R. E., Roux, B., Schulten, K., Kossiako, A. & Perozo, E. (2014). Nature Struct. Mol. Biol. 21, 244–252.
Li, W., Trabuco, L. G., Schulten, K. & Frank, J. (2011). Proteins, 79, 1478–1486.
Long, S., Tao, X., Campbell, E. & MacKinnon, R. (2007). Nature (London), 15, 376–382.
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Acta Cryst. D53, 240–255.
Pannu, N. S. & Read, R. J. (1996). Acta Cryst. A52, 659–668.
Payandeh, J., Scheuer, T., Zheng, N. & Catterall, W. (2011). Nature (London), 475, 353–358.
Phillips, J. C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., Chipot, C., Skeel, R. D., Kale, L. & Schulten, K. (2005). J. Comput. Chem. 26, 1781–1802.
Pražnikar, J., Afonine, P. V., Gunčar, G., Adams, P. D. & Turk, D. (2009). Acta Cryst. D65, 921–931.
Rice, L. M. & Brünger, A. T. (1994). Proteins, 19, 277–290.
Schreiner, E., Trabuco, L. G., Freddolino, P. L. & Schulten, K. (2011). BMC Bioinformatics, 12, 190.
Schröder, G. F., Levitt, M. & Brunger, A. (2010). Nature (London), 464, 1218–1222.
Seidelt, B., Innis, C. A., Wilson, D. N., Gartmann, M., Armache, J.-P., Villa, E., Trabuco, L. G., Becker, T., Mielke, T., Schulten, K., Steitz, T. A. & Beckmann, R. (2009). Science, 326, 1412–1415.
Sener, M. K., Hsin, J., Trabuco, L. G., Villa, E., Qian, P., Hunter, C. N. & Schulten, K. (2009). Chem. Phys. 357, 188–197.
Sondermann, H., Soisson, S. M., Boykevisch, S., Yang, S.-S., Bar-Sagi, D. & Kuriyan, J. (2004). Cell, 119, 393–405.
Stone, J. E., Hardy, D. J., Umfitsev, I. S. & Schulten, K. (2010). J. Mol. Graph. Model. 29, 116–125.
Stone, J. E., Phillips, J. C., Freddolino, P. L., Hardy, D. J., Trabuco, L. G. & Schulten, K. (2007). J. Comput. Chem. 28, 2618–2640.
Subramaniam, S. & Senes, A. (2012). Proteins, 80, 2218–2234.
Tanner, D. E., Chan, K.-Y., Phillips, J. & Schulten, K. (2011). J. Chem. Theor. Comput. 7, 3635–3642.
Trabuco, L. G., Schreiner, E., Eargle, J., Cornish, P., Ha, T., Luthey-Schulten, Z. & Schulten, K. (2010). J. Mol. Biol. 402, 741–760.
Trabuco, L. G., Villa, E., Mitra, K., Frank, J. & Schulten, K. (2008). Structure, 16, 673–683.
Trabuco, L. G., Villa, E., Schreiner, E., Harrison, C. B. & Schulten, K. (2009). Methods, 49, 174–180.
Turner, M. A., Simpson, A., McInnes, R. R. & Howell, P. L. (1997). Proc. Natl Acad. Sci. USA, 94, 9063–9068.
Villa, E., Sengupta, J., Trabuco, L. G., LeBarron, J., Baxter, W. T., Shaikh, T. R., Grassucci, R. A., Nissen, P., Ehrenberg, M., Schulten, K. & Frank, J. (2009). Proc. Natl Acad. Sci. USA, 106, 1063–1068.
Wang, J., Meijers, R., Xiong, Y., Liu, J., Sakihama, T., Zhang, R., Joachimiak, A. & Reinherz, E. L. (2001). Proc. Natl Acad. Sci. USA, 98, 10799–10804.
Wells, D., Abramkina, V. & Aksimentiev, A. (2007). J. Chem. Phys. 127, 125101.
Zhang, J., Wang, Q., Barz, B., He, Z., Kosztin, I., Shang, Y. & Xu, D. (2010). Proteins, 78, 1137–1152.
Zhang, X., Ren, W., DeCaen, P., Yan, C., Tao, X., Tang, L., Wang, J., Hasegawa, K., Kumasaka, T., He, J., Wang, J., Clapham, D. E. & Yan, N. (2012). Nature (London), 486, 130–134.
Zhao, G., Perilla, J. R., Yufenyuy, E. L., Meng, X., Chen, B., Ning, J., Ahn, J., Gronenborn, A. M., Schulten, K., Aiken, C. & Zhang, P. (2013). Nature (London), 497, 643–646.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.