research papers
High-throughput quantum-mechanics/molecular-mechanics (ONIOM) macromolecular crystallographic PHENIX/DivCon: the impact of mixed Hamiltonian methods on ligand and protein structure
withaQuantumBio Inc., 2790 West College Avenue, State College, PA 16801, USA
*Correspondence e-mail: lance@quantumbioinc.com
Conventional macromolecular crystallographic Python-based Hierarchical ENvironment for Integrated Xtallography (PHENIX) crystallographic platform have been developed to augment conventional methods with the in situ use of quantum mechanics (QM) applied to ligand(s) along with the surrounding active site(s) at each step of [Borbulevych et al. (2014), Acta Cryst D70, 1233–1247]. This method (Region-QM) significantly increases the accuracy of the X-ray process, and this approach is now used, coupled with experimental density, to accurately determine protonation states, binding modes, ring-flip states, water positions and so on. In the present work, this approach is expanded to include a more rigorous treatment of the entire structure, including the ligand(s), the associated active site(s) and the entire protein, using a fully automated, mixed quantum-mechanics/molecular-mechanics (QM/MM) Hamiltonian recently implemented in the DivCon package. This approach was validated through the automatic treatment of a population of 80 protein–ligand structures chosen from the Astex Diverse Set. Across the entire population, this method results in an average 3.5-fold reduction in ligand strain and a 4.5-fold improvement in MolProbity clashscore, as well as improvements in Ramachandran and rotamer outlier analyses. Overall, these results demonstrate that the use of a structure-wide QM/MM Hamiltonian exhibits improvements in the local structural chemistry of the ligand similar to Region-QM but with significant improvements in the overall structure beyond the active site.
relies on often dubious stereochemical restraints, the preparation of which often requires human validation for unusual species, and on rudimentary energy functionals that are devoid of nonbonding effects owing to electrostatics, polarization, charge transfer or even hydrogen bonding. While this approach has served the crystallographic community for decades, as structure-based drug design/discovery (SBDD) has grown in prominence it has become clear that these conventional methods are less rigorous than they need to be in order to produce properly predictive protein–ligand models, and that the human intervention that is required to successfully treat ligands and other unusual chemistries found in SBDD often precludes high-throughput, automated Recently, plugins to theKeywords: X-ray crystallography; quantum-mechanics refinement; PM6 semiempirical method; QM/MM; ONIOM macromolecular refinement; molecular mechanics; stereochemical restraints; ligand strain; MolProbity clashscore; high-throughput crystallography.
1. Introduction
X-ray crystallography is a popular technique that is used to determine the three-dimensional atomic structures of biomolecular systems, which serve as three-dimensional templates for structure-based drug discovery (SBDD) and fragment-based drug discovery (FBDD). The quality of the model is crucial for the overall success of high-throughput screening, docking and scoring (for example rank ordering) of potential drug candidates. In recent years, X-ray crystallography has become routine thanks to advances in data collection and processing, structure solution and et al., 2003, 2007), and these errors negatively impact the very ligand-binding affinity estimations (Davis et al., 2003) that are critical to SBDD/FBDD applications. This has led to the development of structure-validation metrics, including Ramachandran, clashscore and MolProbity score, the latter of which is a composite of the clashscore and Ramachandran plot and rotamer outliers (Ramachandran et al., 2011; Read et al., 2011; Chen et al., 2010; MacCallum et al., 2009). In particular, the median clashscore, which is the number of clashes per 1000 atoms, for all X-ray structures deposited in the Protein Data Bank (PDB) and the worldwide PDB (wwPDB) (Berman et al., 2003, 2007) since 1990, and determined at a resolution of 1.5 Å or better, is 8.8 units. Furthermore, this median score deteriorates as the resolution decreases (Read et al., 2011).
automation. However, protein crystal models are still subject to significant uncertainties in atomic coordinates and other structural errors (DavisThe prevalence of problematic geometries observed in deposited PDB structures suggests that conventional ; Pozharski et al., 2013; Smart et al., 2018). Overall, this problem stems from an intrinsic limitation of macromolecular X-ray crystallographic which is its reliance on an insufficient ratio of observed reflections to refined parameters, as typically observed at moderate and low resolutions (Rupp, 2009). In order to overcome this limitation, conventional methods use a priori information about the structure in the form of stereochemical restraints (for example bond lengths, bond angles and bond torsion angles, as well as and group planarity information) for all components included within the protein–ligand complex. For standard amino acids, these fixed stereochemical restraints are based on the ideal Engh and Huber parameters (Engh & Huber, 1991), and these restraints often lead to significant structural deficiencies (Moriarty et al., 2014). In these situations, the backbone geometry can deviate significantly from these ideal values for high-resolution models (Vlassi et al., 1998), and this problem becomes even more pronounced when small molecules and ions (for example, ligands, inhibitors and/or metallic or nonmetallic cofactors) are bound to the protein in question (Kleywegt, 2007). Surveys of the PDB indicate that the percentage of ligands with questionable geometric parameters in deposited macromolecular structures could be as high as 60% (Gore et al., 2011; Liebeschuetz et al., 2012).
methods are not sufficiently rigorous to represent the chemistry within the protein–ligand complex (Kleywegt, 2007These conventional methods rely on a detailed description of the molecular geometry for each species to be refined, and an accurate library or ), incomplete or inaccurate a priori understanding of in situ bound bond lengths and angles, and a lack of intermolecular interactions in conventional functionals (Read et al., 2011). Efforts have been made in recent years to improve the automatic generation of ligand-restraint libraries for ligands in order to address these problems. The eLBOW tool (Moriarty et al., 2009) found within the Python-based Hierarchical Environment for Integrated Xtallography (PHENIX) package (Adams et al., 2010) is capable of creating restraints based on quantum-mechanics optimization, and the AceDRG tool from CCP4 provides similar capabilities (Nicholls, 2017; Long et al., 2017). Alternatively, the publicly available Grade webserver (https://grade.globalphasing.org) along with the commercial Mogul package (Bruno et al., 2004), which are both based on the Cambridge Structural Database (CSD; Groom et al., 2016), use small-molecule X-ray structural information to determine target values. Finally, the AFITT program (Janowski et al., 2016) produced by OpenEye Inc. works to improve the ligand geometry based on the Merck Molecular Mechanics Force Field (MMFF94). Regardless of the accuracy of the however, conventional methods are unable to accurately account for crucial binding influences on both the ligand and the surrounding active site arising from coordination, bond making/breaking, hydrogen bonding, electrostatics and other nonbonding interactions (Borbulevych et al., 2014; Janowski et al., 2016; Read et al., 2011). This problem is further exacerbated when such species are covalently bound to the macromolecule.
(CIF) is important to the ultimate success of the effort. Unfortunately, the creation and validation of accurate CIFs is a nontrivial task which requires significant human intervention and often leads to bound ligand structures of less than desirable quality. These deficiencies are owing to the great variety of ligand chemistries and structures (Kleywegt, 2007Taking a different route, in 2014 our laboratory introduced (Borbulevych et al., 2014) a plugin to the PHENIX package to treat the active site or the entire protein using our DivCon linear-scaling, semiempirical quantum-mechanics (SE-QM) implementation (Dixon & Merz, 1996, 1997) and the PM6 Hamiltonian (Stewart, 2009; Řezáč et al., 2009). The advantage of this approach is that interactions such as hydrogen bonding, dispersion, electrostatics, polarization and charge transfer between the ligand and the protein are taken into account (Diller et al., 2010; Zhang et al., 2010). While the DivCon implementation can be applied to structures with thousands or even tens of thousands of atoms, the plugin was designed to optionally focus the QM method on one or more user-definable regions (for example active sites, ligands, key residues etc.) during the (Region-QM), leaving the rest of the macromolecule dependent on conventional stereochemical restraints. In the present work, we explore a `complete functional' representation for macromolecular which uses a mixed quantum-mechanics/molecular-mechanics (QM/MM) Hamiltonian based on the ONIOM (Our own N-layered Integrated and molecular Mechanics) method (Vreven et al., 2003) as recently implemented in DivCon Discovery Suite build-7.1.1-b4015.17 (QuantumBio, 2017). We use SE-QM for the high-level theory, `region layer' [including ligands(s) and active site(s)], while the remainder of the biomolecule, called the `system layer', is treated using our implementation of the Assisted Model Building with Energy (AMBER) molecular-mechanics force field (Case et al., 2014). In addition to validating the ONIOM method against our previous Region-QM method, the results of conventional as provided by the PHENIX platform are also discussed.
2. Methods
2.1. PHENIX and the QM/MM methodology
Typical biomacromolecular systems, such as those including protein, DNA and/or RNA, are usually quite large and ab initio or density functional theory (DFT) QM methods are too expensive to treat these structures quickly and efficiently on the timescales demanded by industrial practitioners. The DivCon Discovery Suite (QuantumBio, 2017) employs divide-and-conquer (D&C), linear scaling, semiempirical quantum-mechanics (SE-QM) methods described previously (Dixon & Merz, 1996, 1997; Van der Vaart, Gogonea et al., 2000; Van der Vaart, Suarez et al., 2000; Wang et al., 2007) to characterize all-atom structures of tens or even hundreds of thousands of atoms using the traditional AM1 (Dewar et al., 1985) or PM3 (Stewart, 1989) SE-QM Hamiltonians, as well as the more modern PM6 Hamiltonian (Stewart, 2009; Řezáč et al., 2009). Over the last two decades, this approach has been applied to a number of key SBDD applications including QMScore (Diller et al., 2010; Merz & Raha, 2011; Raha & Merz, 2005; Zhang et al., 2010) and NMRScore (Wang et al., 2004, 2007; Williams et al., 2009), QM-based quantitative structure–activity relationship (QSAR) models (Dixon et al., 2005; Peters & Merz, 2006; Zhang et al., 2010) and X-ray (Borbulevych et al., 2014, 2016; Li et al., 2010; Yu et al., 2005).
While the DivCon D&C implementation is faster than conventional semiempirical implementations, density functional theory (DFT) and ab initio QM methods (Dixon & Merz, 1996, 1997), linear-scaling SE-QM methods can still be time-consuming for large biomacromolecular structures (especially within an industrial environment, where a quicker turnaround time is often required). Therefore, the mixed QM/MM Hamiltonian concept provides a reasonable tradeoff for these structures as it allows one to treat the region of interest, such as an active site, at an SE-QM level of theory, while the remaining residues outside this region are treated at a faster, more approximate molecular-mechanics (MM) level of theory. This approach combines these different levels of theory in a way which significantly improves the speed of the calculation versus treating the entire structure at the higher level, but with a greater accuracy than if the entire structure were treated at the lower level (Chung et al., 2015).
There are generally two QM/MM coupling schemes in common use in the computational chemistry field today: additive (Liu et al., 2014) and subtractive (Vreven et al., 2003). Additive QM/MM represents the energy of the system as the sum of three terms,
The first two terms describe the energies of the QM and MM regions, respectively, and the third term explicitly expresses interactions (coupling) between the QM and MM subsystems in the form of an additional, one-electron QM Hamiltonian describing the electrostatic coupling interactions between the two layers (Brooks et al., 1983; Field et al., 1990). This coupling term leads to greater complexity in the Hamiltonian, and calculating this term accurately can be particularly difficult given the inclusion of link atoms and electrostatic perturbations in the QM Hamiltonian (Plotnikov et al., 2011).
Subtractive QM/MM, on the other hand, represents the energy of a system through the following equation (Vreven et al., 2003),
where the EallMM term is the MM energy calculated for the entire system, the EregionMM term is the MM energy for a region and EregionQM is the energy of the region computed using the QM method. As per Vreven et al. (2003), QM/MM ONIOM gradients in the subtractive scheme are computed using (3), which is similar to (2),
and in which the gradients of the QM region(s) include contributions from both the QM and the MM functionals. While standard ONIOM does not include electrostatic perturbations of the QM density matrix by the atoms within the MM region, the lack of a coupling term representing the interactions between these two regions in subtractive QM/MM leads to generally faster and more convergent calculations, along with the ability to treat multiple QM regions (such as those with multiple active sites or sites of interest or those with multiple copies). This makes the method particularly well suited to fast, routine, high-throughput QM/MM-based crystallographic , which utilize both QM and MM terms, we can approximate the interactions between the QM region and the MM region in a way that does not adversely impact on the convergence of the QM calculation.
With the use of the gradients represented in (3)Traditionally, with the explosion of different approaches and implementations, both general QM/MM varieties are often difficult to use depending upon the application and the desired outcomes of the investigator (Sousa et al., 2016; Cao & Ryde, 2018). They can exhibit problems with convergence and performance which make the routine use of the methods expensive (Hu et al., 2011), they are often limited to a single, compact QM region (Case et al., 2018), they require significant atom-type and charge preparation of any unknown species (for example ligands, cofactors, nonstandard amino acids etc.) and protonation (Chung et al., 2015), and/or they rely on the ability of a user to correctly define the QM atoms/residues along with any link atoms needed to complete broken bonds (Sousa et al., 2016). As depicted in Fig. 1, the QM/MM implementation in DivCon addresses these problems through the inclusion of the following key features.
The DivCon Discovery Suite build-7.1.1-b4015.17 was used for all QM/MM (ONIOM) calculations in this project. This package includes implementations of the SE-QM Hamiltonians AM1 (Dewar et al., 1985), PM3 (Stewart, 1989) and PM6 (Stewart, 2009; Řezáč et al., 2009) along with an implementation of the AMBER MM force field (Case et al., 2014). In the present project, we employed a two-layer ONIOM configuration as depicted in Figs. 2(a) and 2(b) where, for each characterized structure, the ligand(s) along with the surrounding active site was (were) treated using the PM6 SE-QM Hamiltonian and the remainder of the protein was treated using the 2014 parameter set of the AMBER MM force field. Both PM6 and AMBERFF14 were chosen as they are the most advanced methods available in the DivCon Discovery Suite at this time and they include a large coverage of atoms and atom types (for example, PM6 includes support for upwards of 70 elements). Furthermore, while newer SE-QM methods are available in the literature in other packages, such as PM7 (Stewart, 2013), recent benchmarks indicate similar performance characteristics between PM6 and PM7, with PM6 often demonstrating superior results (Hostaš et al., 2013). Given these observations, the impact of the choice of SE-QM Hamiltonian on the results observed in the present study would be negligible.
Building on the QM-based plugin that we described in detail in Borbulevych et al. (2014), the ONIOM QM/MM method was integrated with the PHENIX package v.1.11.1-2575 (Adams et al., 2010). The typical protocol in PHENIX involves fitting bulk-solvent parameters and anisotropic scaling, reciprocal-space atomic coordinate atomic displacement parameter (ADP) and occupancy The overall target Etotal in PHENIX is presented as
where Ωxray and Ωgeom are weights assigned to X-ray data and geometry (QM/MM ONIOM) restraints, respectively, and wcxscale is an additional scale factor implemented in PHENIX (Afonine et al., 2012). Ωgeom is typically set to 1, while Ωxray is a variable weight determined using an automatic procedure in PHENIX (Adams et al., 1997). Mimicking the Region-QM framework detailed in Borbulevych et al. (2014), (4) is extended in order to calculate the ONIOM QM/MM gradients on each atom with coordinates x according to
where corresponds to the ONIOM gradients determined using (3), where any ligands and surrounding binding pockets is are defined as part of the QM region and the remainder of the structure is designated as the MM region. Under this regime, unlike in our prior work, all stereochemical restraint gradients are replaced by QM/MM gradients.
2.2. Structure preparation and refinement
Coordinates and structure factors for all 80 structures from the Astex Diverse Set (Hartshorn et al., 2007; Table 1, Supplementary Table S1) were downloaded from the PDB. Ligands(s), solvent molecules, metals and/or anions (e.g. Cl−) were included in each of the refinements. Since QM/MM is an `all-atom' method (requiring protons as well as heavy atoms), H atoms were added to each structure, including all water molecules, using Protonate3D (Labute, 2009) as implemented in MOE2016 from Chemical Computing Group Inc. Likewise, CIFs for any unsupported species were automatically generated using Scientific Vector Language (SVL) extensions to MOE2016 provided in the DivCon Discovery Suite. For each structure in the set, every copy of each ligand specified in Table 1 was chosen as one or more QM region centers. The QM region(s) of each structure was (were) extended 3.0 Å from each center to include all amino-acid residues, ions and crystal waters within each pocket. The balance of residues and crystal waters were defined as part of the MM region and capping link atoms were automatically added to the QM region edges to satisfy covalent bonds that were cut in the process. In order to compare the new QM/MM with older methods, we also refined the structures using both conventional (i.e. non-QM PHENIX) and the Region-QM approach as described in our previous work (Borbulevych et al., 2014). The same input PDB files were used in all three types of and in order to characterize automated only default parameters and automatically determined X-ray weights (Adams et al., 1997, 2010) were used for phenix.refine. The aforementioned files were used in the conventional and they were provided as input to PHENIX in the Region-QM and ONIOM refinements in order to satisfy the internal `error-trapping' mechanism of the phenix.refine executable. Certainly, we could spend a significant amount of time manually manipulating the input parameters, weights and restraints in order to `tune' the conventional for each of the 80 structures of the Astex Diverse Set; however, this approach could arguably no longer be considered high-throughput. Furthermore, from a scientific perspective, with too much `hand manipulation' one would need to ask how much investigator bias could be introduced into the final model. Therefore, the approach utilized in the present study works to minimize investigator bias so that the final models are based solely on the combination of the experimental data and the initial placement of each structure as published, along with the Hamiltonian used for the refinement.
|
2.3. Validation metrics
In order to validate the performance of ONIOM Z score of the difference density (ZDD; Tickle, 2012) assessed using DivCon (Borbulevych et al., 2014; QuantumBio, 2017), and overall structure quality including MolProbity metrics assessed using the MolProbity program (Chen et al., 2010) as distributed within the PHENIX package.
in comparison to other types (conventional and Region-QM), we employed two groups of metrics: ligand quality, consisting of both the strain energy andSince the histograms depicted in the present study (Figs. 3, 4, 6 and 9) show that the results are skewed and deviate from the normal distribution, instead of standard deviations (SDs) to show the spread of the data in the sections below, we employed the median absolute deviation (MAD; Sachs, 1984) calculated as
where Xi represents data point i and X is the array of data.
2.3.1. Local ligand-strain energy calculations
Local ligand-strain energy is the difference in the conformational energy of the isolated ligand conformation and the protein-bound ligand conformation. This metric serves as a quality indicator of protein–ligand structures as it shows how much strain the ligand must take on or `accept' in order to bind to the protein, and lower strain energy is preferred to higher strain energy (Fu et al., 2011; Janowski et al., 2016; Mobley & Dill, 2009; Perola & Charifson, 2004). Previously, we used ligand strain to validate Region-QM and we validated the method against a repertoire of 50 quasi-randomly chosen PDB structures (Borbulevych et al., 2014); we went on to use this metric as a critical component of our XModeScore method (Borbulevych et al., 2016). As detailed in Fu et al. (2011), the ligand-strain energy Estrain is computed as
where Eligandxray is the single point energy computed for the ligand X-ray geometry and Eligandoptimized is the energy of the optimized ligand that corresponds to the local minimum.
When discussing ligand strain, it should be noted that it can be thought of as a combination of a number of different factors, as represented qualitatively by
In this equation, Estraintarget is the `natural' or `target-induced' strain associated with changes in ligand geometry/conformation owing to binding, Estrainplacement is the strain associated with initial ligand placement (e.g. docking) and Estrainmethod is the strain related to the underlying method or force field (restraints/CIF, functional or Hamiltonian) being used in the Ideally, Estrain would equal Estraintarget. The PHENIX/DivCon plugin is primarily designed to address the Estrainmethod term through the replacement of inaccurate stereochemical restraints and approximate molecular-mechanics parameters with more accurate QM gradients which significantly reduce the method-induced ligand strain, as shown in our previous work (Borbulevych et al., 2014). In order to address the Estrainplacement term in (8), additional side-chain sampling and/or ligand re-docking would need to be performed. These steps are beyond the scope of the present work, and the observed results are attributable to localized changes (for example improvements in bond lengths, torsions, rotations and translations) within the radius of convergence of the input conformation.
2.3.2. Difference density as a measure of the accuracy of density around a ligand
The conventional quality metric used to communicate agreement between the model and the X-ray (or neutron) density is the ). However, in 2012 Tickle demonstrated that the correlates with both the accuracy and the precision of the structure model, and described a more sophisticated quality indicator, the real-space Z score of difference density (ZDD), which measures the accuracy of the model alone (Tickle, 2012; Borbulevych et al., 2016). A detailed mathematical description of ZDD can be found in Borbulevych et al. (2016) and Tickle (2012), but briefly the Z score for a point difference density value is expressed by
(RSCC; Brändén & Jones, 1990where σ[Δρ(r)] is the standard deviation of the difference density (mFo − DFc) maps and corresponds to the random error of the model and is pure precision, while the Z score of the difference density is a measure of the residual, nonrandom error and is pure accuracy. In order to limit the impact of outliers or noise on the final value, while at the same time preserving information, we assume that the difference density Z values should approach a normal distribution of random errors with zero mean and unit standard deviation as the quality of the model, as measured by χ2, improves. The subset of values of x2(i) that maximize the probability pmax over k are summed,
where the function P is the lower normalized gamma function representing the cumulative distribution function (CDF) of χk2. The second function, I, is also computed as the complement in practice and is the normalized incomplete beta function (CDF of a normal-order statistic; Gibbons & Chakraborti, 2010) which accounts for the `multiple comparisons' correction (Yuriev & Ramsland, 2013).
ZDD is evaluated as the two-tailed normal Z score corresponding to the maximal value pmax over k of the cumulative probability of χk2 derived from (10),
where the function Φ is the CDF of the normal distribution, 2Φ(|Z|) − 1 is the CDF of the half-normal distribution of the absolute value of a normal variate Z, and Φ−1 is the inverse function or the value of Z corresponding to a given probability. The set of negative density values, owing to incorrectly positioned atoms, yields ZDD−. Likewise, the set of positive density values, owing to missing atoms, yields ZDD+. The final ZDD is the maximum of the absolute values of ZDD− and ZDD+ as defined using
Thus, ZDD as used below is always positive and lower values correspond to a lower amount of residual difference density. Tickle (2012) provided further guidance to interpreting ZDD, such as a magnitude of over 3 indicates significant difference density peaks.
2.3.3. Overall structure-quality metrics: MolProbity score and clashscore
MolProbity, which is included as a module in PHENIX, is a software tool that includes several macromolecular model-validation metrics using multiple quality criteria (Chen et al., 2010). The MolProbity score (MPScore) represents overall structure quality and is a logarithm-based score combining three key component metrics: clashscore, Ramachadran plot outliers (MacCallum et al., 2009) and rotamer outliers (Hintze et al., 2016; Lovell et al., 2000). The lower the value of the MPScore, the better the quality of the model. In particular, an important component of the MPScore is the clashscore, which is the number of clashes per 1000 atoms; it is determined through nonbonded atom contacts derived using a rolling-probe algorithm employed by the program Probe (Word et al., 1999). A clash occurs when the dot surface around one atom overlaps the dot surface around another by greater than 0.4 Å (Davis et al., 2007). Generally, a chemically incorrect model will yield a high number of clashes (Chen et al., 2010). Since the stereochemical restraint function does not explicitly include electrostatics and other nonbonded interactions for attraction and repulsion, while AMBER and PM6 do include these attractive, and in particular repulsive, effects, one would expect that clashscore should be a particularly indicative metric.
3. Results
3.1. R-factor analysis
As shown in Supplementary Table S2, the ONIOM method yields an average Rwork of 0.177 ± 0.02 and an average Rfree of 0.218 ± 0.02. Similarly, conventional PHENIX produces averages of 0.171 ± 0.02 and 0.217 ± 0.02, respectively, and Region-QM yields averages of 0.174 ± 0.02 and 0.218 ± 0.02, respectively. Together, these results show that the ONIOM methodology does not negatively impact the overall agreement between the experimental data and the atomic structure models.
3.2. Overall structure-quality metrics
3.2.1. MolProbity: Ramachandran and rotamer scores
Fig. 3 depicts a histogram of MPScores for all 80 Astex structures involved in the current study, in which the average MPScore of ONIOM-refined structures is 1.23 ± 0.32 units. This average is lower (better) than the corresponding values for Region-QM (1.81 ± 0.32 units) and conventional (1.75 ± 0.34 units) refinements. Furthermore, unlike the Region-QM and conventional refinements, ONIOM shows a in which the first peak is at 0.75 units and covers about 50% of the population, and the second peak is at 1.6 units and coincides with peaks that are also observed for conventional and Region-QM data. This second peak has a long tail for these less sophisticated methods, with ∼25% of conventional and Region-QM structures distributed in the 2.0+ unit bin.
An analysis of the individual Ramachadran and rotamer components that comprise MPScore indicates that ONIOM versus the models yielded by both conventional and Region-QM refinements. For example, when comparing conventional and ONIOM the average percentage of Ramachadran plot outliers decreases from 0.40% to 0.26%, while the residue population in the favorable regions of the Ramachadran plot slightly increases from 96.46% to 96.90%. In over 91% of the cases studied ONIOM leads to models with a Ramachandran plot and rotamer angles that are as good or better when compared with those from conventional demonstrating that use of the ONIOM plugin does not break or otherwise damage the final model.
leads to models which exhibit improved statistics3.2.2. MolProbity: clashscore
While a portion of the observed improvement in the MPScore is attributable to improvements in the Ramachandran and rotamer components, the largest improvement is seen for the clashscore component. As shown in Table 1, for the 80 Astex models studied the average clashscore is 1.10 ± 0.41 units for the ONIOM models, which is 4.5–5.0-fold lower (better) than the average clashscores for the conventional (4.83 ± 1.2 units) and Region-QM (5.54 ± 1.6 units) models. The clashscore histogram (Fig. 4) shows a clear peak around 0.5 units which comprises 90% of the ONIOM models, while a peak representing both conventional and Region-QM model data is located around 3.5 units. Furthermore, around 50% of the data in the conventional and Region-QM histograms are found in the tails of the respective peaks and are distributed in histogram bins of 4.5+ units and above, while no ONIOM data are found in this range. This observation suggests that the ONIOM QM/MM method utilized in this study exhibits greater consistency over the range of structures studied versus the use of stereochemical restraints alone in an automated (high-throughput) regime with default phenix.refine settings. Furthermore, since the Region-QM and conventional refinements yield similar results for the bulk of the protein structure, this would suggest that much of the improvement in clashscore is attributable to the use of the QM/MM Hamiltonian on the entire structure.
The α ligand-binding domain in complex with the antagonist ligand 4-D determined at 1.9 Å resolution (PDB entry 1sj0; Kim et al., 2004) has been chosen as a representative example in order to demonstrate the sort of improvements that we have observed in treatment of the Astex Diverse Set with the QM/MM method. An initial clashscore for the deposited structure was calculated as 18.64 units. While all three refinements led to a noticeable reduction (improvement) in clashscore, ONIOM exhibited the largest improvement, with a clashscore of 2.27 units compared with 9.06 units for conventional and 13.09 units for Region-QM As shown in Supplementary Table S3, the poorer score of the conventional is owing to the 36 bad clashes that remained after (compared with 74 bad clashes in the originally downloaded file). 28 of those 36 clashes were not observed in the ONIOM model, and no additional clashes were introduced with ONIOM. Interestingly, owing to the addition of six clashes at the boundary of the buffer region, Region-QM yielded a higher (worse) clashscore than both ONIOM and conventional Among the bad clashes observed after conventional ONIOM leads to an average improvement of 0.25 ± 0.12 Å, while some significant short contacts were improved by as much as 0.65 Å. A notable example is depicted in Fig. 5, where the intermolecular distance between Asn413 ND and Wat1098 O in conventional yields a clash distance of 2.41 Å, while ONIOM yields a more reasonable 3.23 Å. Further, structural rearrangement in this region after ONIOM is mostly attributable to the movement of the side chain of Asn413 (Fig. 5). This residue in the conventionally refined structure adopts an m-80° rotamer conformation, with the χ1 and χ2 torsion angles both being −84°. On the other hand, ONIOM yields a χ1 angle in Asn413 which is increased by 15°, making this torsion angle (−69°) very close to the ideal value of −71° for the m-80° rotamer (Lovell et al., 1999). Interestingly, this structural shift leads to the removal of both of the above-noted bad clashes and to an improvement in the Asn41 OD1–Wat1098 O bond distance (which approaches a typical hydrogen-bond distance). Specifically, when accompanied by the rotation of the Wat1098 water molecule depicted in Fig. 5, a hydrogen bond is indeed formed between Asn413 OD1 and Wat1098 O, as shown by the interatomic distance of 2.73 Å and the Wat1098 O–Wat1098 H1⋯Asn413 OD1 bond angle of 161° observed after ONIOM refinement.
of human estrogen receptor3.2.3. MolProbity: Cβ deviations and r.m.s. bond and angle deviations
In addition to the aforementioned Ramachandran, clashscore and rotamer components, for the sake of completeness the Cβ deviations are also reported in Supplementary Table S2. Generally, Cβ deviations are defined as abnormalities in bond-angle distributions around the Cβ atom. Deviations larger than 0.25 Å typically indicate incompatibility between main-chain and side-chain conformations (Davis et al., 2007). As indicated in Supplementary Table S2, the number of Cβ deviations is similar in all three types and over 90% of structures are free of this aberration. Furthermore, the average r.m.s.d. in bond length is the same for ONIOM (0.014 ± 0.002 Å), Region-QM (0.014 ± 0.002 Å) and conventional (0.013 ± 0.002 Å) refinements (Supplementary Table S2). However, the average r.m.s.d. in angles is slightly lower for conventional (1.30 ± 0.20°) compared with QM-driven refinements (1.86 ± 0.20° for ONIOM and 1.53 ± 0.20° for Region-QM), suggesting greater variability in the QM and MM methods. This deviation is likely to be caused by different target bond angles in the AMBER functional together with the greater number of atom types in MM and the captured atom–atom interactions in both methods.
3.3. Ligand-quality metrics
3.3.1. Local ligand-strain energy
Ligand strain is a method to explore refined ligand structural models (Fu et al., 2011; Janowski et al., 2016; Mobley & Dill, 2009; Perola & Charifson, 2004), and ligand strain is a key metric which we have used previously to evaluate the quality of the region (Borbulevych et al., 2012, 2014). For the present study, we find that the average strain energies calculated over 141 ligands from 80 Astex structures are similar in ONIOM (9.95 ± 3.77 kcal mol−1) and Region-QM (10.49 ± 4.52 kcal mol−1) refinements. As shown in the ligand-strain histogram (Fig. 6), we also see similar distributions between both ONIOM and Region-QM in that both methods exhibit peaks around 3.0 kcal mol−1 which account for approximately three quarters of the models in the set. This is compared with conventional using automatically generated CIFs, which yields a population of structures which are more evenly distributed in a broad range from 10 to 40 kcal mol−1 and ∼30% of the data are in the last bin of >50 kcal mol−1. This finding is consistent with our previous work, in which we demonstrated that QM across a diverse population of structures yields a tighter strain energy range versus conventional methods (Borbulevych et al., 2014). In addition to exhibiting a wider strain range, the average ligand-strain energy after conventional of the Astex set is 35.64 ± 9.35 kcal mol−1 or about 3.5-fold higher than in the QM-driven refinements. This average improvement in strain energy is consistent with the 3.4-fold average improvement observed in Region-QM refinements in our previous study (Borbulevych et al., 2014). While beyond the scope of the present work, which is focused on automated, high-throughput methods, arguably one could potentially manipulate these CIFs `by hand' in order to yield ligand structures with lower strain energy or even which mimic the capture of atom–atom interactions (for example slightly elongated/shortened bond lengths, rotations etc.) automatically observed in QM/MM However, with over 80 species considered, these manipulations would come at a significant cost in investigator time with more opportunities for inclusion of investigator bias. Further, the success or failure of each structure would be much more greatly dependent on investigator proficiency.
1s19; Tocchini-Valentini et al., 2004) is chosen as an illustrative example. Conventional of PDB entry 1s19 leads to a strain energy of 28.62 kcal mol−1 for the ligand MC9 (Table 1). However, QM-driven yields a ligand structural model in which ligand strains are 3.8–3.5-fold lower or 7.52 kcal mol−1 for ONIOM and 8.28 kcal mol−1 for Region-QM. Closer examination of the geometry of this ligand after the conventional and QM-driven refinements reveals that the key difference is related to the orientation of the hydroxylpropene fragment at the junction with the cyclopropyl ring described by the torsion angle C22—C23—C24—C25, which is −35° for conventional −119° for ONIOM and −128° for Region-QM (Fig. 7). Further, the conventional model exhibits positive and negative density peaks (Fig. 8c) which are not observed in the two QM-based refinements (Figs. 8a and 8b). These peaks generally indicate that the ligand conformation adopted is likely to be incorrectly placed within the density after conventional refinement.
of the of the vitamin D receptor (VDR) ligand-binding domain bound to calcipotriol (ligand ID MC9) determined at 2.1 Å resolution (PDB entry3.3.2. Ligand ZDD
The histogram for ZDD (Fig. 9) exhibits similar distributions for all three types, with a rather broad peak at 1.4 units. However, the proportion of ONIOM- and Region-QM-refined models in the first three bins, which cover the range of values from 0 to 1.2 ZDD units, is higher than the number of conventional models in the same range. Thus, the average ZDD for the ligands in ONIOM-refined structures (2.3 ± 0.8 units) is slightly lower (better) than that after conventional (2.9 ± 1.1 units). Region-QM yields a set of models which are in the middle (2.6 ± 0.9 units) (Table 2). Overall, the ZDD distribution differs significantly from that observed in the ligand strain, and the square of the Pearson (R2) between ZDD and ligand strain is zero for all three refinements, demonstrating that these two metrics are uncorrelated.
|
4. Discussion
Protein crystallography continues to play a central role in drug discovery as SBDD remains a critical technique for ligand design and optimization, high-throughput screening and often FDA approval (Blundell, 2017). However, the overall lackluster quality of ligands within deposited protein–ligand complexes raises serious concerns. Unfortunately, these errors in the ligand geometry, placement and protonation states often lead to the misperception of protein–ligand interactions and to problems in binding-mode determination, thus diminishing the relevance of such models for SBDD (Borbulevych et al., 2012, 2014, 2016; Cooper et al., 2011; Malde & Mark, 2011; Reynolds, 2014). These issues have been acknowledged (Debreczeni & Emsley, 2017), and the community has made significant methodological improvements in the generation of higher quality restraint (CIF) ligand dictionaries (Nicholls, 2017; Long et al., 2017; Janowski et al., 2016). However, these improvements still lead to a static dictionary file which is created for an isolated ligand without explicit consideration of the in situ impact of the protein and the ligand on one another. In our previous work (Borbulevych et al., 2014), we introduced an approach for macromolecular within the PHENIX package for Region-QM In this approach, the quality of the is immaterial, and the entire user-defined region including both the ligand(s) and the active site(s) are treated as one QM system, thus capturing intermolecular interactions (for example electrostatics, charge transfer, polarization, dispersion and hydrogen bonding) at each step. This work has led to significant improvements in ligand strain and ligand ZDD upon QM In cases where significant strain is still observed, manual building of the model may still be necessary to fix large model errors or rotamer outliers since any gradient-driven cannot make changes beyond its radius of convergence.
The present study takes this improvement to macromolecular DivCon Region-QM versus conventional as measured by ligand-strain energy and ligand ZDD (Table 1, Figs. 6 and 9), while at the same time showing marked improvements in overall structure quality as measured by MPScore (Table 1, Fig. 3). In particular, we observed an improvement in the clashscore component of MPScore by an average factor of 4.5–5.0 upon ONIOM compared with both Region-QM and conventional PHENIX refinements (Table 1), demonstrating that ONIOM is able to correct bad clashes. The cause of these improvements can be explained when one considers how MM works. Specifically, since any residues outside the QM region are described at the MM level (in this case using the AMBER forcefield as implemented in DivCon), any reduction of unfavorable short clashes arises from the 6-12 Lennard–Jones potential for van der Waals interactions. Furthermore, the electrostatic interactions captured by the qiqj/rij term of the AMBER functional also play an essential role, as shown by the example shown in Fig. 5. In this case, the bad clash between Asn413 ND2 and Wat1098 O that was found in the original structure, and that was not corrected by conventional and Region-QM was not only corrected by ONIOM but the interaction was also converted to an electrostatically favorable hydrogen bond.
further through the development and integration of a high-throughput and fully automated two-layer mixed QM/MM ONIOM module applied to the entire structure. With this fully automated approach, any user-chosen ligands, metal ions and cofactors, together with the surrounding residues, comprise a QM layer, while the rest of the atoms in the structure comprise the MM layer and interactions between the two layers are addressed. ONIOM exhibits all benefits of the previously developedWhen considering the distribution of ligand ZDD values (Table 1, Fig. 9), it is worth noting that ONIOM leads to a smaller (better) average ZDD (2.29) when compared with the corresponding average for conventional (2.94). However, this improvement is smaller in magnitude than that observed for ligand-strain energy. ZDD values generally correspond to the amount of difference density around the ligands (Borbulevych et al., 2016; Tickle, 2012), and previously we have shown that ZDD is very sensitive to protomeric/tautomeric states (Borbulevych et al., 2016) or ligand poses (Borbulevych & Westerhoff, 2018). However, in the present study the input PDB files including ligand states/positions were the same for all three types of and therefore we would expect that the ZDD distributions would likewise be similar for those refinements (Fig. 9).
5. Conclusions
Recently, numerous new programs and approaches to create high-quality ligand restraints have been published (Steiner & Tucker, 2017). These methods generally suffer from a critical, fundamental flaw in that they do not explicitly capture the in situ interactions between the protein and the ligand during In the present work, we demonstrate an entirely new methodology to perform X-ray using the two-layer ONIOM method as implemented in the QuantumBio DivCon package. Using this concept, ligands and corresponding active-site residues are treated at the QM level, while the rest of molecule is represented using the MM functional. Both functionals are then combined to derive the ONIOM energy, and associated gradients, of the system. In the present work, the ONIOM approach for the X-ray has been validated against 80 protein–ligand structures from the Astex Diverse Set using both MolProbity metrics and ligand-quality metrics. We established that ONIOM excels in both sets of metrics, resulting in a superior overall quality of the protein–ligand model compared with conventional Combined with a fully automatic structure-preparation protocol and fast, convergent QM/MM calculations, we believe that the ONIOM devised in this paper sets a new paradigm for fast, accurate and user-friendly macromolecular X-ray refinement.
Supporting information
Supplementary Tables. DOI: https://doi.org/10.1107/S2059798318012913/rr5160sup1.pdf
Final ONIOM files: PDB, MTZ, https://doi.org/10.1107/S2059798318012913/rr5160sup2.gz
DOI:Final Region-QM files: PDB, MTZ, https://doi.org/10.1107/S2059798318012913/rr5160sup3.gz
DOI:Final PHENIX files: PDB, MTZ, https://doi.org/10.1107/S2059798318012913/rr5160sup4.gz
DOI:Acknowledgements
The authors wish to acknowledge the continued support of the PHENIX Consortium, in particular Drs Nigel Moriarty, Pavel Afonine and Paul Adams, for maintaining the application programming interface (API) hooks to our software within the PHENIX distribution and for helpful discussion and feedback. The authors would also like to thank Gregory Warren for helpful discussion along with the editor and reviewers for their suggestions and feedback which have lead to a much improved paper. The DivCon plugin to PHENIX is provided by QuantumBio Inc. and it is available at no cost to academic users at https://www.quantumbioinc.com/products/software_licensing.
Funding information
The research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Nos. R44GM112406 and R44GM121162. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
References
Adams, P. D., Afonine, P. V., Bunkóczi, G., Chen, V. B., Davis, I. W., Echols, N., Headd, J. J., Hung, L.-W., Kapral, G. J., Grosse-Kunstleve, R. W., McCoy, A. J., Moriarty, N. W., Oeffner, R., Read, R. J., Richardson, D. C., Richardson, J. S., Terwilliger, T. C. & Zwart, P. H. (2010). Acta Cryst. D66, 213–221. Web of Science CrossRef CAS IUCr Journals Google Scholar
Adams, P. D., Pannu, N. S., Read, R. J. & Brünger, A. T. (1997). Proc. Natl Acad. Sci. USA, 94, 5018–5023. CrossRef CAS PubMed Web of Science Google Scholar
Afonine, P. V., Grosse-Kunstleve, R. W., Echols, N., Headd, J. J., Moriarty, N. W., Mustyakimov, M., Terwilliger, T. C., Urzhumtsev, A., Zwart, P. H. & Adams, P. D. (2012). Acta Cryst. D68, 352–367. Web of Science CrossRef CAS IUCr Journals Google Scholar
Berman, H., Henrick, K. & Nakamura, H. (2003). Nature Struct. Biol. 10, 980. Web of Science CrossRef PubMed Google Scholar
Berman, H., Henrick, K., Nakamura, H. & Markley, J. L. (2007). Nucleic Acids Res. 35, D301–D303. Web of Science CrossRef PubMed CAS Google Scholar
Blundell, T. L. (2017). IUCrJ, 4, 308–321. Web of Science CrossRef CAS PubMed IUCr Journals Google Scholar
Borbulevych, O., Martin, R. I., Tickle, I. J. & Westerhoff, L. M. (2016). Acta Cryst. D72, 586–598. Web of Science CrossRef IUCr Journals Google Scholar
Borbulevych, O. Y., Plumley, J. A., Martin, R. I., Merz, K. M. & Westerhoff, L. M. (2014). Acta Cryst. D70, 1233–1247. Web of Science CrossRef IUCr Journals Google Scholar
Borbulevych, O. Y., Plumley, J. A. & Westerhoff, L. M. (2012). Abstr. Pap. Am. Chem. Soc. 244, 478. Google Scholar
Borbulevych, O. Y. & Westerhoff, L. M. (2018). In preparation. Google Scholar
Brändén, C.-I. & Jones, T. A. (1990). Nature (London), 343, 687. Google Scholar
Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S. & Karplus, M. (1983). J. Comput. Chem. 4, 187–217. CrossRef CAS Web of Science Google Scholar
Bruno, I. J., Cole, J. C., Kessler, M., Luo, J., Motherwell, W. D. S., Purkis, L. H., Smith, B. R., Taylor, R., Cooper, R. I., Harris, S. E. & Orpen, A. G. (2004). J. Chem. Inf. Comput. Sci. 44, 2133–2144. Web of Science CSD CrossRef PubMed CAS Google Scholar
Cao, L. & Ryde, U. (2018). Front. Chem. 6, 89. Web of Science CrossRef PubMed Google Scholar
Case, D. A., Babin, V., Berryman, J. T., Betz, R. M., Cai, Q., Cerutti, D. S., Cheatham, I. T. E., Darden, T. A., Duke, R. E., Gohlke, H., Goetz, A. W., Gusarov, S., Homeyer, N., Janowski, P., Kaus, J., Kolossvary, I., Kovalenko, A., Lee, T. S., LeGrand, S., Luchko, T., Luo, R., Madej, B., Merz, K. M., Paesani, F., Roe, D. R., Roitberg, A., Sagui, C., Salomon-Ferrer, R., Seabra, G., Simmerling, C. L., Smith, W., Swails, J., Walker, R. C., Wang, J., Wolf, R. M., Wu, X. & Kollman, P. A. (2014). AMBER 14. University of California, San Francisco, USA. Google Scholar
Case, D. A., Ben-Shalom, I. Y., Brozell, S. R., Cerutti, D. S., Cheatham, I. T. E., Cruzeiro, V. W. D., Darden, T. A., Duke, R. E., Ghoreishi, D., Gilson, M. K., Gohlke, H., Goetz, A. W., Greene, D., Harris, R., Homeyer, N., Izadi, S., Kovalenko, A., Kurtzman, T., Lee, T. S., LeGrand, S., Li, P., Lin, C., Liu, J., Luchko, T., Luo, R., Mermelstein, D. J., Merz, K. M., Miao, Y., Monard, G., Nguyen, C., Nguyen, H., Omelyan, I., Onufriev, A., Pan, F., Qi, R., Roe, D. R., Roitberg, A., Sagui, C., Schott-Verdugo, S., Shen, J., Simmerling, C. L., Smith, J., Salomon-Ferrer, R., Swails, J., Walker, R. C., Wang, J., Wei, H., Wolf, R. M., Wu, X., Xiao, L., York, D. M. & Kollman, P. A. (2018). AMBER 2018. University of California, San Francisco, USA. Google Scholar
Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010). Acta Cryst. D66, 12–21. Web of Science CrossRef CAS IUCr Journals Google Scholar
Chung, L. W., Sameera, W. M. C., Ramozzi, R., Page, A. J., Hatanaka, M., Petrova, G. P., Harris, T. V., Li, X., Ke, Z., Liu, F., Li, H.-B., Ding, L. & Morokuma, K. (2015). Chem. Rev. 115, 5678–5796. Web of Science CrossRef PubMed Google Scholar
Cooper, D. R., Porebski, P. J., Chruszcz, M. & Minor, W. (2011). Exp. Opin. Drug. Discov. 6, 771–782. Web of Science CrossRef CAS Google Scholar
Davis, A. M., Teague, S. J. & Kleywegt, G. J. (2003). Angew. Chem. Int. Ed. 42, 2718–2736. Web of Science CrossRef CAS Google Scholar
Davis, I. W., Leaver-Fay, A., Chen, V. B., Block, J. N., Kapral, G. J., Wang, X., Murray, L. W., Arendall, W. B., Snoeyink, J., Richardson, J. S. & Richardson, D. C. (2007). Nucleic Acids Res. 35, W375–W383. Web of Science CrossRef PubMed Google Scholar
Debreczeni, J. É. & Emsley, P. (2017). Acta Cryst. D73, 77–78. CrossRef IUCr Journals Google Scholar
Dewar, M. J. S., Zoebisch, E. G., Healy, E. F. & Stewart, J. J. P. (1985). J. Am. Chem. Soc. 107, 3902–3909. CrossRef CAS Web of Science Google Scholar
Diller, D. J., Humblet, C., Zhang, X. & Westerhoff, L. M. (2010). Proteins, 78, 2329–2337. Web of Science CrossRef CAS PubMed Google Scholar
Dixon, S. L. & Merz, K. M. (1996). J. Chem. Phys. 104, 6643–6649. CrossRef CAS Web of Science Google Scholar
Dixon, S. L. & Merz, K. M. (1997). J. Chem. Phys. 107, 879–893. CrossRef CAS Web of Science Google Scholar
Dixon, S., Merz, K. M., Lauri, G. & Ianni, J. C. (2005). J. Comput. Chem. 26, 23–34. Web of Science CrossRef PubMed Google Scholar
Engh, R. A. & Huber, R. (1991). Acta Cryst. A47, 392–400. CrossRef CAS Web of Science IUCr Journals Google Scholar
Field, M. J., Bash, P. A. & Karplus, M. (1990). J. Comput. Chem. 11, 700–733. CrossRef Web of Science Google Scholar
Fu, Z., Li, X. & Merz, K. M. (2011). J. Comput. Chem. 32, 2587–2597. Web of Science CrossRef CAS PubMed Google Scholar
Gibbons, J. D. & Chakraborti, S. (2010). Nonparametric Statistical Inference, 5th ed. Boca Raton: CRC Press. Google Scholar
Gore, S., Olsson, T. S. G. & Zhuravleva, M. (2011). Acta Cryst. A67, C104. CrossRef IUCr Journals Google Scholar
Groom, C. R., Bruno, I. J., Lightfoot, M. P. & Ward, S. C. (2016). Acta Cryst. B72, 171–179. Web of Science CSD CrossRef IUCr Journals Google Scholar
Hartshorn, M. J., Verdonk, M. L., Chessari, G., Brewerton, S. C., Mooij, W. T. M., Mortenson, P. N. & Murray, C. W. (2007). J. Med. Chem. 50, 726–741. Web of Science CrossRef PubMed Google Scholar
Hintze, B. J., Lewis, S. M., Richardson, J. S. & Richardson, D. C. (2016). Proteins, 84, 1177–1189. Web of Science CrossRef CAS PubMed Google Scholar
Hostaš, J., Řezáč, J. & Hobza, P. (2013). Chem. Phys. Lett. 568–569, 161–166. Google Scholar
Hu, L., Söderhjelm, P. & Ryde, U. (2011). J. Chem. Theory Comput. 7, 761–777. Web of Science CrossRef PubMed Google Scholar
Janowski, P. A., Moriarty, N. W., Kelley, B. P., Case, D. A., York, D. M., Adams, P. D. & Warren, G. L. (2016). Acta Cryst. D72, 1062–1072. Web of Science CrossRef IUCr Journals Google Scholar
Kim, S., Wu, J. Y., Birzin, E. T., Frisch, K., Chan, W., Pai, L.-Y., Yang, Y. T., Mosley, R. T., Fitzgerald, P. M. D., Sharma, N., Dahllund, J., Thorsell, A.-G., DiNinno, F., Rohrer, S. P., Schaeffer, J. M. & Hammond, M. L. (2004). J. Med. Chem. 47, 2171–2175. Web of Science CrossRef PubMed Google Scholar
Kleywegt, G. J. (2007). Acta Cryst. D63, 94–100. Web of Science CrossRef CAS IUCr Journals Google Scholar
Labute, P. (2009). Proteins, 75, 187–205. Web of Science CrossRef PubMed CAS Google Scholar
Li, X., Hayik, S. A. & Merz, K. M. (2010). J. Inorg. Biochem. 104, 512–522. Web of Science CrossRef PubMed Google Scholar
Liebeschuetz, J., Hennemann, J., Olsson, T. S. G. & Groom, C. R. (2012). J. Comput. Aided Mol. Des. 26, 169–183. Web of Science CrossRef PubMed Google Scholar
Liu, M., Wang, Y., Chen, Y., Field, M. J. & Gao, J. (2014). Isr. J. Chem. 54, 1250–1263. Web of Science CrossRef PubMed Google Scholar
Long, F., Nicholls, R. A., Emsley, P., Gražulis, S., Merkys, A., Vaitkus, A. & Murshudov, G. N. (2017). Acta Cryst. D73, 112–122. Web of Science CrossRef IUCr Journals Google Scholar
Lovell, S. C., Word, J. M., Richardson, J. S. & Richardson, D. C. (1999). Proc. Natl Acad. Sci. USA, 96, 400–405. Web of Science CrossRef CAS PubMed Google Scholar
Lovell, S. C., Word, J. M., Richardson, J. S. & Richardson, D. C. (2000). Proteins, 40, 389–408. CrossRef PubMed CAS Google Scholar
MacCallum, J. L., Hua, L., Schnieders, M. J., Pande, V. S., Jacobson, M. P. & Dill, K. A. (2009). Proteins, 77, 66–80. Web of Science CrossRef PubMed Google Scholar
Malde, A. K. & Mark, A. E. (2011). J. Comput. Aided Mol. Des. 25, 1–12. Web of Science CrossRef CAS PubMed Google Scholar
Merz, K. M. & Raha, K. (2011). US Patent 7904283. Google Scholar
Mobley, D. L. & Dill, K. A. (2009). Structure, 17, 489–498. Web of Science CrossRef PubMed Google Scholar
Moriarty, N. W., Grosse-Kunstleve, R. W. & Adams, P. D. (2009). Acta Cryst. D65, 1074–1080. Web of Science CrossRef CAS IUCr Journals Google Scholar
Moriarty, N. W., Tronrud, D. E., Adams, P. D. & Karplus, P. A. (2014). FEBS J. 281, 4061–4071. Web of Science CrossRef CAS PubMed Google Scholar
Nicholls, R. A. (2017). Acta Cryst. D73, 158–170. Web of Science CrossRef IUCr Journals Google Scholar
Perola, E. & Charifson, P. S. (2004). J. Med. Chem. 47, 2499–2510. Web of Science CrossRef PubMed CAS Google Scholar
Peters, M. B. & Merz, K. M. (2006). J. Chem. Theory Comput. 2, 383–399. Web of Science CrossRef PubMed Google Scholar
Plotnikov, N. V., Kamerlin, S. C. L. & Warshel, A. (2011). J. Phys. Chem. B, 115, 7950–7962. Web of Science CrossRef PubMed Google Scholar
Pozharski, E., Weichenberger, C. X. & Rupp, B. (2013). Acta Cryst. D69, 150–167. Web of Science CrossRef CAS IUCr Journals Google Scholar
QuantumBio (2017). LibQB v.7.0. https://www.quantumbioinc.com. Google Scholar
Raha, K. & Merz, K. M. (2005). J. Med. Chem. 48, 4558–4575. Web of Science CrossRef PubMed Google Scholar
Ramachandran, S., Kota, P., Ding, F. & Dokholyan, N. V. (2011). Proteins, 79, 261–270. Web of Science CrossRef CAS PubMed Google Scholar
Read, R. J., Adams, P. D., Arendall, W. B. III, Brunger, A. T., Emsley, P., Joosten, R. P., Kleywegt, G. J., Krissinel, E. B., Lütteke, T., Otwinowski, Z., Perrakis, A., Richardson, J. S., Sheffler, W. H., Smith, J. L., Tickle, I. J., Vriend, G. & Zwart, P. H. (2011). Structure, 19, 1395–1412. Web of Science CrossRef PubMed Google Scholar
Reynolds, C. H. (2014). ACS Med. Chem. Lett. 5, 727–729. Web of Science CrossRef CAS PubMed Google Scholar
Řezáč, J., Fanfrlík, J., Salahub, D. & Hobza, P. (2009). J. Chem. Theory Comput. 5, 1749–1760. Web of Science PubMed Google Scholar
Rupp, B. (2009). Biomolecular Crystallography: Principles, Practice, and Application to Structural Biology. New York: Garland Science. Google Scholar
Sachs, L. (1984). Applied Statistics: A Handbook of Techniques. New York: Springer-Verlag. Google Scholar
Smart, O. S., Horský, V., Gore, S., Svobodová Vařeková, R., Bendová, V., Kleywegt, G. J. & Velankar, S. (2018). Acta Cryst. D74, 228–236. Web of Science CrossRef IUCr Journals Google Scholar
Sousa, S. F., Ribeiro, A. J. M., Neves, R. P. P., Brás, N. F., Cerqueira, N. M. F. S. A., Fernandes, P. A. & Ramos, M. J. (2016). Wiley Interdiscip. Rev. Comput. Mol. Sci. 7, e1281. Web of Science CrossRef Google Scholar
Steiner, R. A. & Tucker, J. A. (2017). Acta Cryst. D73, 93–102. Web of Science CrossRef IUCr Journals Google Scholar
Stewart, J. J. P. (1989). J. Comput. Chem. 10, 209–220. CrossRef CAS Web of Science Google Scholar
Stewart, J. J. P. (2009). J. Mol. Model. 15, 765–805. Web of Science CrossRef PubMed CAS Google Scholar
Stewart, J. J. P. (2013). J. Mol. Model. 19, 1–32. Web of Science CrossRef CAS PubMed Google Scholar
Tickle, I. J. (2012). Acta Cryst. D68, 454–467. Web of Science CrossRef CAS IUCr Journals Google Scholar
Tocchini-Valentini, G., Rochel, N., Wurtz, J.-M. & Moras, D. (2004). J. Med. Chem. 47, 1956–1961. PubMed Google Scholar
Vaart, A. van der, Gogonea, V., Dixon, S. L. & Merz, K. M. (2000). J. Comput. Chem. 21, 1494–1504. Google Scholar
Vaart, A. van der, Suárez, D. & Merz, K. M. (2000). J. Chem. Phys. 113, 10512–10523. Google Scholar
Vlassi, M., Dauter, Z., Wilson, K. S. & Kokkinidis, M. (1998). Acta Cryst. D54, 1245–1260. Web of Science CrossRef CAS IUCr Journals Google Scholar
Vreven, T., Morokuma, K., Farkas, Ö., Schlegel, H. B. & Frisch, M. J. (2003). J. Comput. Chem. 24, 760–769. Web of Science CrossRef PubMed CAS Google Scholar
Wang, B., Raha, K. & Merz, K. M. (2004). J. Am. Chem. Soc. 126, 11430–11431. Web of Science CrossRef PubMed Google Scholar
Wang, B., Westerhoff, L. M. & Merz, K. M. (2007). J. Med. Chem. 50, 5128–5134. Web of Science CrossRef PubMed Google Scholar
Williams, D. E., Peters, M. B., Wang, B., Roitberg, A. E. & Merz, K. M. (2009). J. Phys. Chem. A, 113, 11550–11559. Web of Science CrossRef PubMed Google Scholar
Word, J. M., Lovell, S. C., LaBean, T. H., Taylor, H. C., Zalis, M. E., Presley, B. K., Richardson, J. S. & Richardson, D. C. (1999). J. Mol. Biol. 285, 1711–1733. Web of Science CrossRef CAS PubMed Google Scholar
Yu, N., Yennawar, H. P. & Merz, K. M. (2005). Acta Cryst. D61, 322–332. Web of Science CrossRef CAS IUCr Journals Google Scholar
Yuriev, E. & Ramsland, P. A. (2013). J. Mol. Recognit. 26, 215–239. Web of Science CrossRef CAS PubMed Google Scholar
Zhang, X., Gibbs, A. C., Reynolds, C. H., Peters, M. B. & Westerhoff, L. M. (2010). J. Chem. Inf. Model. 50, 651–661. Web of Science CrossRef CAS PubMed Google Scholar
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.