research papers
Accurate macromolecular crystallographic DivCon into the PHENIX package
incorporation of the linear scaling, semiempirical quantum-mechanics programaQuantumBio Inc., 2790 West College Avenue, State College, PA 16801, USA, and bQuantum Theory Project, University of Florida, Gainesville, Florida USA
*Correspondence e-mail: lance@quantumbioinc.com
Macromolecular crystallographic a priori understanding of the ligand geometry within the active site, and creation of the is often an error-prone process owing to the great variety of potential ligand chemistry and structure. Stereochemical restraints have been replaced with more robust functionals through the integration of the linear-scaling, semiempirical quantum-mechanics (SE-QM) program DivCon with the PHENIX X-ray engine. The PHENIX/DivCon package has been thoroughly validated on a population of 50 protein–ligand Protein Data Bank (PDB) structures with a range of resolutions and chemistry. The PDB structures used for the validation were originally refined utilizing various packages and were published within the past five years. PHENIX/DivCon does not utilize CIF(s), link restraints and other parameters for and hence it does not make as many a priori assumptions about the model. Across the entire population, the method results in reasonable ligand geometries and low ligand strains, even when the original exhibited difficulties, indicating that PHENIX/DivCon is applicable to both single-structure and high-throughput crystallography.
relies on sometimes dubious stereochemical restraints and rudimentary energy functionals to ensure the correct geometry of the model of the macromolecule and any covalently bound ligand(s). The ligand stereochemical restraint file (CIF) requiresKeywords: X-ray crystallography; quantum-mechanics refinement; regional refinement; stereochemical restraints; ligand strain; high-throughput crystallography.
1. Introduction
Structure-based drug discovery (SBDD) utilizes the three-dimensional structure of a drug target, such as an enzyme, complexed with one or more ligands or lead compounds in order to inform the drug-discovery process. The structure serves as the three-dimensional template for screening, docking and scoring (e.g. rank ordering) potential drug candidates, and hence the quality of the model is crucial to overall success. X-ray crystallography is the primary technique used to determine the three-dimensional structure of molecular biochemical systems, and owing to recent advances in data collection, processing, structure solution and automation, X-ray crystallography has become fairly routine. The quality and correctness of the final model is dependent both on the quality of the experimental data (e.g. resolution, mosaicity etc.), as well as human factors such as the expertise and abilities of the investigator and the robustness of the methodology, including the force field and the initial ligand placement (Pozharski et al., 2013). During the process, the parameters of the macromolecular model, including atomic coordinates and temperature factors, are optimized to achieve the best possible agreement between the observed experimental structure factors and the chosen target function (e.g. or least squares; Murshudov et al., 1997, 2011; Tronrud, 2004). Ideally, the protocol should provide satisfactory structure geometry regardless of data quality.
A significant challenge in obtaining a reasonable fit during macromolecular a priori information about the structure such as its stereochemistry is usually required. Stereochemical restraints represent a harmonic oscillator function (1) that implies a penalty (Evans, 2007) QP for a deviation of the geometric parameter P from its ideal value. In typical X-ray geometry parameters include bond lengths, bond angles and torsion angles as well as and group planarity requirements.
is that the model parameters are typically underdetermined in the macromolecular diffraction experiment owing to resolution limitations. At moderate and low resolutions, the observation-to-parameter ratio is low and ranges from 4 to 1, and henceConventional ), and these restraints often promote fairly satisfactory macromolecular geometry. However, the situation becomes less certain when a small molecule (a ligand, an inhibitor and/or a metallic or non-metallic cofactor) is bound to a protein (Kleywegt, 2007). The conventional approach requires a detailed description of the molecular geometry for each small molecule to be refined, and thus creating a good library or Crystallographic Information File (CIF) is extremely important to the ultimate success of the effort. Unfortunately, the creation of such a dictionary is not a trivial task because of the great variety of potential ligand chemistry and structure. When such a species is involved in the model, a correct structure requires an a priori understanding of the bond lengths and bond angles of the species within the active site, and these methods are unable to account for influences including coordination, bond making/breaking and protein-induced or environment-induced structural perturbations. Furthermore, compared with modern molecular-mechanics and quantum-mechanics methods, the functionals used in conventional are rudimentary in nature and do not account for interactions such as hydrogen bonds, dispersion, electrostatics, polarization and charge transfer. In order to address these deficiencies, and since it is impossible to understand all of these influences prior to conventional often requires a repeated, highly interactive, investigator-driven cycle leading to greater expense, especially in industrial laboratories. According to a recent PDB survey, the percentage of ligands having questionable geometric parameters in deposited macromolecular structures could be as high as 60% (Liebeschuetz et al., 2012; Gore et al., 2011). Overall, this survey has suggested that conventional macromolecular protocols based on fixed, pre-refinement stereochemical restraints have often failed to provide a reasonable structure of small-molecule ligands. Enhancements in PDB validation tools would certainly help in detecting problematic ligands, but such tools only serve to point out problems and they do not directly improve structure. The PDB_REDO project (Joosten et al., 2012) uses conventional methods to clean up these structures, but this project is still limited to the use of CIFs, other empirical parameters and a rudimentary energy functional.
methods use fixed stereochemical restraints for any components included within the protein–ligand complex. For standard amino acids, these fixed stereochemical restraints are based on the ideal Engh and Huber parameters (Engh & Huber, 1991To overcome the limitations associated with conventional et al., 2005) that a quantum-mechanical (QM) method can be substituted for the stereochemical restraint function. This proof-of-concept study clearly demonstrated that QM macromolecular using QM gradients generated in real time can be successfully carried out using a combination of the QM and molecular-mechanics (MM) functionals from AMBER (Case et al., 2010) with the Crystallography & NMR System (CNS; Brünger et al., 1998) platform. While this work was groundbreaking, a number of key limitations impact the accessibility of the method. In particular, in the years since this implementation was put into place, CNS has been largely replaced in the industrial field. Further, in this previous approach, solvent molecules were excluded from the model, and the bulk-solvent correction and temperature factors of protein atoms were not refined. The method also did not allow multiple QM regions, alternative atom positions and so on. In the present work, we present a new, streamlined QM X-ray implementation based on the popular, modern crystallographic package PHENIX (Adams et al., 2010; Afonine et al., 2012). QuantumBio's linear scaling, semiempirical QM C++/Python toolkit libQB, upon which DivCon (Dixon & Merz, 1996, 1997) is built, has been integrated with PHENIX, resulting in a user-friendly tool referred to as PHENIX/DivCon. Unlike conventional this method does not require generation using a priori knowledge of the expected outcome, and since the QM functional is by definition more rigorous then the conventional functional, PHENIX/DivCon can be applied to complex protein–ligand structures, including metal-containing systems and structures with covalently bound ligands, without specifying coordination-sphere parameters, link restraints and the like.
it has been proposed (Yu2. Methods
2.1. DivCon discovery suite
DivCon employs divide-and-conquer (D&C), linear scaling, semiempirical quantum-mechanics (SE-QM) methods as described previously (van der Vaart, Gogonea et al., 2000; van der Vaart, Suarez et al., 2000; Wang et al., 2007; Dixon & Merz, 1996, 1997). The D&C methodology makes routine QM characterization of protein–ligand systems with thousands or even tens of thousands of atoms feasible, and it has been applied to a number of key SBDD applications including a patented QMScore (Merz & Raha, 2011; Diller et al., 2010; Raha & Merz, 2005; Zhang et al., 2010) along with an NMRScore (Williams et al., 2009; Wang et al., 2004, 2007), QM-based QSAR (Peters & Merz, 2006; Dixon et al., 2005; Zhang et al., 2010) and X-ray (Li et al., 2010; Yu et al., 2005). While DivCon includes support for modern Hamiltonians such as PM6 (Řezáč et al., 2009; Stewart, 2009) and soon PM7 (Stewart, 2013; Hostaš et al., 2013), in order to better compare with the state of the art in generation such as the program eLBOW (Moriarty et al., 2009), for the present project the traditional AM1 (Dewar et al., 1985) Hamiltonian was chosen for all SE-QM calculations (except when otherwise indicated).
2.2. QM X-ray refinement
The typical PHENIX involves fitting bulk-solvent parameters and anisotropic scaling, atomic coordinate temperature-factor and occupancy These stages are executed in sequential order as depicted in the flow chart in Fig. 1, and are repeated for each macro-cycle. The overall target Etotal in PHENIX is usually presented as follows:
protocol inwhere Ωxray and Ωgeom are weights assigned to X-ray data and geometry (stereochemical) restraints, respectively, and wcxscale is the additional scale factor implemented in PHENIX (Afonine et al., 2012).
The macro-cycle is broken into a number of minimization-driven micro-cycles, each of which, in PHENIX, is built on top of a minimization engine from the quasi-Newton family of minimizers called limited memory–Broyden–Fletcher–Goldfarb–Shanno (L-BFGS; Liu & Nocedal, 1989). This engine takes the function to be minimized (e.g. the coordinates) and the corresponding gradients (e.g. the atomic forces) as input and determines the approximate inverse Hessian matrix. The Hessian matrix is a matrix of second derivatives of energy with respect to nuclear motion, and it determines the direction of the minimum search. The matrix is iteratively updated at each step based upon updated gradients. This reliance on gradients underscores the importance of high-quality gradients in the minimization. During the coordinate-refinement stage in PHENIX, gradients on each atom in the structure are evaluated both from X-ray data (e.g. X-ray gradients) and from the stereochemical restraint function (e.g. geometry gradients). X-ray and geometry gradients are then added together, taking into account the corresponding weights, and the sum of the gradients is provided to L-BFGS. For each macro-cycle there are generally 15–30 L-BFGS iterations, and there are three (default) macro-cycles in the entire refinement.
In SBDD, where an accurate representation of ligand(s), cofactor(s) and surrounding active-site residues is of paramount importance, the stereochemical restraint function is often not rigorous enough to represent the chemistry within the complex. Therefore, in order to improve the description of the atomic forces experienced by each atom within the complex, the SE-QM-based function available in DivCon is employed in the present method at each step and the updated coordinates and gradients are generated in `real time' during the entire course of the This DivCon (libQB) toolkit is a C++-based library on which input/output processing functions are layered in order to facilitate communication between itself and the Python-based PHENIX package. During the coordinate-refinement stage of each macro-cycle, the atomic coordinates of the structure are passed to DivCon through a lightweight Python class each time geometry gradients are required to complete an L-BFGS minimization step. The QM engine within DivCon then generates the single-point SE-QM energy and atomic gradients, and the same lightweight Python class processes the results and replaces the native PHENIX stereochemistry gradients with the QM gradients. The sequential organization of the flow in PHENIX allows us to fully employ the other steps of the procedure such as bulk-solvent correction and B-factor without any modifications or corrections owing to inclusion of the SE-QM data without an external run script and with all communication performed underneath the phenix.refine executable.
2.3. Region-specific QM X-ray refinement
DivCon can be used to treat the entire protein–ligand complex; however, owing to the repeated iterations required for each PHENIX macro-cycle, this treatment becomes computationally expensive for large protein structures even when the linear scaling methodology is employed. For example, L-BFGS in PHENIX requires a new set of computed gradients 20–30 times (micro-cycles) for each macro-cycle. Instead of treating the entire protein, one can reduce the size of the problem by only treating the region of interest (for example, a ligand/cofactor along with its surrounding active site). In the previous CNS-based implementation (Li et al., 2010), the third-party package AMBER (Case et al., 2010) and its QM/MM implementation was used. The active site was treated at an ab initio or semiempirical level of theory, and the rest of the protein utilized the AMBER force field for the MM portion of the computations.
In this work, in order to limit dependence on third-party software and to improve the general usability of the software, we implemented an automatic region-specific methodology to carry out QM-based ). As depicted in Fig. 2, firstly we define all residues within a certain distance of any atom of the ligand as a part of the main or core region. This core region is the region for which the atomic SE-QM gradients are determined and returned for use in the Instead of layering an MM region on top of this core region, we employ a second SE-QM region referred to as a buffer region, which includes any residues surrounding the core region. By using a buffer region to chemically insulate the core region, we gain a significant speed-up versus a full QM treatment and, at the same time, we limit any errors that may occur in the gradients owing to capping or to some artificial chemical environment surrounding the core region. Granted, we lose longer-range interactions of a true QM/MM Hamiltonian; however, we gain greater usability, improved QM convergence characteristics and software independence by not being dependent upon the AMBER package in our implementation. Finally, while not explored in the present paper, this regional QM method can also be applied concurrently to more than one region within the complex. Since the atoms within the buffer region are only provided in order to chemically separate the QM core from the rest of the biomolecular structure, the QM gradients generated from the atoms within the buffer region are not used in the and are thrown out, and the standard stereochemical restraints are used instead.
on the area(s) of interest within the protein–ligand complex (Fig. 2To explore the performance of regional QM PHENIX/DivCon are extended in a residue-inclusive manner, meaning that if a single residue atom falls within a particular distance of the ligand/core then the entire residue is included in the calculation. The complete core + buffer region is treated with SE-QM, and where the buffer region is separated or cut from the protein, any `dangling bonds' are addressed through the addition of protons. For the remainder of the protein, standard PHENIX stereochemistry restraints are used, as schematically shown in Fig. 2. While the stereochemistry restraints are not as sophisticated as an MM force field, they are still well parameterized for standard residues such as amino acids and hence produce reasonable results for the protein as assessed by several studies (Kleywegt & Jones, 1996; Jaskolski et al., 2007; Dodson et al., 1996). During QM regional QM and stereochemical gradients on each atom with coordinates x are combined according to
we set the residue-selection radius of the core region to 5 Å from the ligand and the residue-selection radius for the buffer region to 3 Å from the core region. Both regions withinwhere the weight ϖ is set to 1 for the core QM region(s) and 0 for the rest of the atoms, including the buffer region. In order to gauge the overall performance of this regional QM full QM was also performed, in which the weight ϖ is 1 for all atoms in the whole system.
2.4. Local ligand strain energy calculations
In addition to crystallographic metrics to measure the agreement between model and experiment, local ligand strain has been used and provided throughout this project. The local ligand strain energy, defined as the difference between the energy of the isolated ligand conformation and the protein-bound ligand conformation, serves as an important quality indicator of protein–ligand structures as it shows how much strain the ligand must take on in order to bind to the protein. Some strain is expected as it captures ligand deformation upon binding; however, large ligand strains can imply problems in the ligand. As suggested by Fu et al. (2011), ligand strain energy Estrain is computed as
where Eligandxray is the single-point energy computed for the ligand X-ray geometry and Eligandoptimized is the energy of the optimized ligand that corresponds to the local minimum.
Upon the completion of each PHENIX/DivCon (e.g. AM1 in this case) is provided as output, and these strains are reported. In order to compare these AM1 strains with a higher level of theory, the strains calculated using the ab initio HF/6-311+G** method as implemented within GAMESS v. October 1, 2010 (Schmidt et al., 1993) are also reported. Protons were added as per the ReadySet! workflow. Since the added protons are not optimized using the AM1 Hamiltonian in ReadySet!, prior to calculating the strain energy for deposited structures the positions of any added H atoms were optimized. AM1 as implemented in GAMESS v. October 1, 2010 was chosen for this step because it provides a mechanism to refine protons separately from heavy atoms. Therefore, the error associated with proton addition has been canceled when reported. Solvation (implicit or explicit) was not included in any strain calculations owing to methodological and implementation limitations. Finally, both the raw strain and the normalized strain values are reported, where the normalized strain is the raw strain divided by the number of atoms.
the starting and ending ligand strain using the Hamiltonian chosen in2.5. Structure preparation
Validation of the QM-augmented 1lri (Lascombe et al., 2002; Table 1) with a well resolved ligand. When this structure was originally deposited in the PDB, it was determined at a resolution of 1.45 Å with an Rfree below 20% and with sound Ramachandran plot statistics (99% of residues in the most favorable region). It therefore serves as a positive test to reveal flaws (if any) in our QM protocol.
method was conducted in two stages. The first stage involved a detailed description of the QM-based of the high-quality protein structure PDB entry
|
The second stage of the validation involved high-throughput PHENIX/DivCon of 50 protein structures containing various ligands (Table 2). These structures were chosen through an internal strain energy survey for all ligand poses found in the PDB. Using these ligand strains, the structures were divided into three bins: 0–50 kcal mol−1 strain, 50–100 kcal mol−1 strain and >100 kcal mol−1 strain. The bins were further divided into those structures deposited in the years prior to 2007 and those deposited in the years since. A total of 30 structures for high-throughput validation were then randomly selected from the latter set of bins at a rate of ten examples from each bin. The remaining 20 structures were made up of two additional equal sized bins: ten metal-containing structures and ten models containing covalently bound ligands. This selection led to a total population of 50 structures with a large variety of chemistry and ligand strain that were deposited in approximately the last five years.
‡Ligand strain calculated at HF/6-311+G**. The first number is the raw strain, while the second, italicized number is the strain normalized by the number of atoms. §Structural (three-dimensional) r.m.s.d. calculated between initial and final ligand structures for all atoms. ¶Normalized strain energy values (strain energy/Natoms). |
Coordinates and structure factors for all 50 structures were downloaded from the Protein Data Bank (PDB). Ligands(s), solvent molecules, metals or anions (e.g. Cl−) were included in our and H atoms were added to protein residues, ligands and water molecules using the PHENIX utility ReadySet! as implemented in the PHENIX package v.1.7.3-928. No optimization of hydrogen positions was carried out prior to the re-refinement. The correctness of the ligand protonation was verified using the molecular-perception algorithm (Labute, 2005) implemented in DivCon. While not required for QM the is required for PHENIX `error traps'. Therefore, the geometry restraints library or for each ligand in the structures under investigation was generated using eLBOW (Moriarty et al., 2009) from PHENIX utilizing the AM1 optimization (the -opt) option. To limit investigator bias, we used the files from eLBOW and ReadySet! as provided, and we did not hand-edit the files to correct, change or otherwise `improve' the ligand CIFs to address any missed atom type or other problems associated with automated methods.
2.6. The protocol
For the first validation, the structure 1lri was chosen for detailed validation in this study and was refined in two ways: (i) full (non-regional) SE-QM PHENIX/DivCon and (ii) regional SE-QM PHENIX/DivCon refinement.
In each case, the et al., 1985) as implemented in DivCon. For the regional the core of the QM region was defined as the ligand/cofactor along with residues that had any atoms within 5.0 Å of any of the ligand/cofactor atoms at the start of the The buffer region was configured to include any residues with atoms 2.0 Å from any core QM atoms. In both refinements, in order to further limit investigator bias and to demonstrate the high-throughput nature of the workflow, the default phenix.refine v.1.7.3-928 (Afonine et al., 2012) options were chosen. Finally, the quality of the refined structures was analyzed using PROCHECK (Laskowski et al., 1993) and MolProbity (Chen, Arendall et al., 2010; Davis et al., 2007).
began with the identical PDB-deposited coordinates and structure factors, and the QM energy and gradient calculations were performed using the AM1 Hamiltonian (DewarThe high-throughput validation focuses on the treatment of 50 additional recently deposited PDB structures. For this validation, only the 5.0 Å core, 3.0 Å buffer regional SE-QM PHENIX/DivCon described above was performed. In each the center of the region was defined as the ligand listed in Table 2. Also as with the 1lri validation, the AM1 Hamiltonian was chosen, and all default PHENIX/DivCon settings were used in order to validate the automated high-throughput nature of the new package. The real-space correlation coefficients (CCs) for each refined ligand are reported (Supplementary Table S11) as calculated using the program phenix.get_cc_mtz_pdb from the PHENIX package (Adams et al., 2010).
3. Results
3.1. Detailed QM method validation: of 1lri
The structure of β-cryptogein, a sterol includes 98 residues in the complex with cholesterol (Lascombe et al., 2002), and has been determined to the relatively high resolution of 1.45 Å. Originally, this structure was refined using the anisotropic approximation of atomic displacements for non-H atoms (Lascombe et al., 2002). However, a recent comprehensive study (Merritt, 2012) indicates that the anisotropic at this resolution may produce questionable temperature factors. Therefore, we adopted the isotropic protocol for this structure in this study. Conventional PHENIX produced good statistics overall, with Rfree and Rwork values of 17.8 and 15.8%, respectively (Table 1). Non-regional SE-QM PHENIX/DivCon yielded a slightly better Rfree of 17.4%, and with an Rwork of 16.5% the structure resulting from SE-QM was noticeably less overfitted relative to conventional PHENIX As noted in Table 3, the average protein backbone C—C and C—O bond lengths resulting from the SE-QM are within 1σ of the ideal values accepted in macromolecular crystallography literature. The protein backbone C—N bonds do deviate by 1.5–2σ from the target length of 1.33 (2) Å reported by Jaskolski et al. (2007). This deviation is associated with the use of the AM1 Hamiltonian, which is known to slightly overestimate the C—N bond length, as noted previously (Yu et al., 2005).
‡Statistics derived from ultrahigh-resolution protein structures (Jaskolski et al., 2007). §The first value corresponds to Ile, Thr and Val residues; the second value corresponds to the remaining amino acids excluding Ala. |
Regional SE-QM PHENIX/DivCon was centered on the cholesterol ligand. The resulting 5.0 Å core QM region comprises 564 atoms and the 2.0 Å buffer region adds another 828 atoms to complete the (core + buffer) QM region. This protocol yields overall (Rwork of 16.7% and Rfree of 17.8%) and model stereochemistry similar to those produced by the non-regional QM method (Table 1). As expected, since the protein outside the core QM region was treated with the same stereochemical restraints as the conventional PHENIX the overall quality of the protein, including the Ramachadran plot statistics, is very similar in both QM and PHENIX Importantly, MolProbity and PROCHECK analyses revealed no geometry distortion at the border between the QM core and the part of the molecule treated with PHENIX stereochemical restraints.
Since e.g. the heat of formation) of the system will decrease over the course of QM As depicted in Fig. 3, there is indeed a dramatic drop in the energy that occurs during the first L-BFGS macro-cycle, followed by much smaller decreases in subsequent cycles. This observation suggests that the most important macro-cycle is the first one. However, we cannot exclude the possibility that protein structures with bad starting geometry might experience more significant energy changes during subsequent macro-cycles than we have seen here.
involves the minimization of the geometry of the protein structure, it is expected that the QM energy (Detailed analysis of the ligand (cholesterol) molecule structure in 1lri allows us to evaluate the overall robustness of our QM procedure. Regardless of the quality of the molecular description provided in the (Borbulevych et al., 2011; Borbulevych & Westerhoff, 2011), upon the completion of each QM the geometry of the cholesterol is found to be chemically correct, with all bond lengths and bond angles close to their standard values (Allen et al., 1987; Table 4). This CIF-independent result is consistent with the fact that the PHENIX/DivCon calculations do not rely on the ligand library for the ligand structure, and this observation is of pivotal importance to the application of this method in high-throughput X-ray Furthermore, the ligand also exhibits an excellent fit to the electron-density map (Fig. 4). Finally, when comparing the regional with the non-regional or `full' QM we have shown that the ligand geometry is virtually identical between the two methods (Table 4, Fig. 4), which demonstrates that the lower computational cost regional method will suffice.
|
3.2. QM validation on a set of 50 PDB structures
Once the method had been successfully validated against a single example, regional QM . To measure the impact that the QM method brings to the structure, we adopted strain energy (Fu et al., 2012) as a metric since traditional crystallographic metrics such as R-factor values or the stereochemical quality of the polypeptide chains (e.g. MolProbity) are typically not effective in assessing the quality of the ligand (Cooper et al., 2011). For example, we found that the average Rwork and Rfree factor values over 50 structures remains virtually unchanged after regional QM while the ligand strain energy dramatically improved. The insensitivity of the R-factor metrics can be explained by the fact that we apply the QM functional only to a relatively small region around the chosen ligand, while the majority of the structure is still treated using the conventional stereochemistry restraints within PHENIX. Furthermore, the method presented is focused on the actual of the deposited structure, and does not involve steps that are beyond the scope of the present work such as novel ligand placement, docking or other tools for model building and ensemble generation.
for a set of 50 PDB structures was also performed. In each case, the center of the selection was the ligand found in the original PDB structure as specified in Table 2The average ligand strain energy for the set of 50 structures calculated from the deposited coordinates is found to be 83.50 ± 9.03 kcal mol−1, where the minimum and maximum values are 6.88 and 283.35 kcal mol−1, respectively, or a range of 276.47 kcal mol−1. After regional PHENIX/DivCon is performed, there is a dramatic improvement of the strain throughout the set (Table 2). The average strain energy of the refined set of structures is 24.60 ± 3.67 kcal mol−1, or 3.4 times smaller than that of the originally deposited structures. The normalized strain energy, which is the strain energy divided by the number of ligand atoms, shows a similar improvement of 3.3-fold, where the average of 50 structures decreased from 1.80 ± 0.20 kcal mol−1 (initial set) to 0.55 ± 0.09 kcal mol−1 (refined set).
Moreover, the energy distribution range also becomes significantly tighter (110.52 kcal mol−1) after the QM In recent work (Borbulevych et al., 2012), we studied the strain energy distributions over all structures deposited in the PDB using the semiempirical method PM6. In particular, we demonstrated that ∼55% of all ligand poses fall into the 0–40 kcal mol−1 strain bin, while ∼25% of all ligand poses exhibit a strain energy of over 100 kcal mol−1. Generally speaking, higher values of the strain increase the likelihood of an incorrect geometry (Perola & Charifson, 2004). Hence, the overall improvement of the ligand strain energy from an average of 83.5 kcal mol−1 to an average of 24.6 kcal mol−1 brings the average value into the most populated bin, likely owing to the elimination of gross errors in the ligand geometry.
Furthermore, the QM ), the overall decrease in the strain energy after the regional QM was over twofold and was accompanied by significant corrections in the geometry of the linkage bonds or the metal coordination sphere, as discussed below in more detail.
protocol exhibits superior performance in cases that usually require tedious work to create library files in conventional such as ligands covalently bound to protein residues or ligands coordinated to metals. In both categories of structures studied in this work (Table 2We found that the average ligand strain improvement is similar in all three initial strain bins (e.g. low, 0–50 kcal mol−1; medium, 50–100 kcal mol−1; high, >100 kcal mol−1), ranging from 4.9-fold to 5.3-fold. Despite this similarity, the results for particular ligands vary significantly. For example, five ligands from the high-strain bin exhibit a significant decrease in strain of between 9.5-fold and 15.5-fold. The other five structures in this bin show a more modest improvement measured by twofold to 3.7-fold decreases. This observation underscores the fact that the strain energy consists of several components (both artificial and natural). Artificial strain is itself made up of both initial positional or docking-induced strain (e.g. DIS), which is strain associated with poor or questionable ligand starting geometry, and method-induced strain (MIS), which is from errors in the underlying force-field method, parameters or file(s) used in the Natural ligand strain or target-induced strain (TIS), on the other hand, includes legitimate changes in the ligand conformation during the process of binding to the macromolecular receptor. In the present project, the initial binding conformation of the ligand has not been manipulated prior to QM therefore, it can be concluded that the sharp decrease in strain after QM represents cases in which the main cause of strain in the deposited structure can be attributed to MIS (e.g. PDB entry 2wue, as detailed below). Interestingly, even a modest improvement in the strain energy corresponds to cases in which a ligand undergoes noticeable structure/conformation changes upon binding (e.g. PDB entry 3fe9). In cases where high strain is reported after QM it is probable that this strain is owing to poor initial conditions or DIS. Subsequent research will be performed in order to explore this hypothesis.
Table 5 shows the average strain energy for structures sorted by the original software used. While this table should not be interpreted as a robust, statistical comparison of the various conventional tools with one another, these results do illustrate that QM gives rise to a significant improvement in ligand strain (e.g. threefold to fourfold) regardless of the software used by the authors of the published PDB entries. This observation underscores the fact that problematic ligand geometry is an inherent weakness of conventional For the set of 50 structures studied, REFMAC seemed to have performed marginally better than other programs. However, even in the case of REFMAC, PHENIX/DivCon is still significantly superior.
|
In order to validate these SE-QM energies, they have been compared with those calculated at the HF/6-311+G** ab initio level of theory, and these values are also provided in Table 2. Ab initio calculations can be quite expensive depending upon the size of the ligand and the level of theory. As one would expect, since the ab initio functional was not used in the the change in strain is not as pronounced with the ab initio method as with the SE-QM method. However, in all cases, especially in those with higher initial ligand strain, the SE-QM yielded significantly improved ab initio-calculated ligand strains, suggesting that the SE-QM energies are quite robust. As PM6 replaces the much older AM1 Hamiltonian as the standard SE-QM method for we expect that the improvement in ligand strain will be even more robust.
A complete treatment of all individual examples would be beyond the space allowed for this paper. Instead, several specific examples from Table 2 are discussed below. For the individual analysis, we have chosen examples from different bins that also exhibit significant changes in strain energy and geometry of the ligand molecules after the QM All 50 structures are available for download as Supporting Information.
3.2.1. of 2wue
The et al., 2010). The strain energy of this ligand computed using the deposited coordinates at the AM1 level was 176.56 kcal mol−1. Closer examination of the ligand molecule revealed that key geometry parameters are severely distorted (Fig. 5). In particular, all of the bond lengths in the phenyl ring were 1.53–1.54 Å, compared with the typical CAr—CAr bond length of 1.38 Å (Allen et al., 1987). Furthermore, the C—O bonds of the COO− group (1.41 and 1.18 Å) deviate significantly from the length of the delocalized double C=O bond of 1.254 Å (Allen et al., 1987) in carboxylate anions. After QM region-specific using PHENIX/DivCon, it is found that the geometry parameters of the abovementioned moieties of the ligand approach the standard values for these bond lengths. This improvement in bond lengths is accompanied by a dramatic tenfold decrease of the ligand strain energy to 18.63 kcal mol−1.
of the hydrolase inhibited with the synthetic analog of the cholesterol cleavage product (the ligand KEK) originally refined at 1.8 Å resolution was published in 2010 (Lack3.2.2. of 2zl9
The structure 2zl9 was originally refined using CNS 1.1 against 1.9 Å resolution data (Shimizu et al., 2008). The vitamin D analog (ligand VDA) in the structure is found to have a strain energy of 283.35 kcal mol−1 that is likely to indicate significant errors in the ligand. Upon further inspection, almost all of the bond lengths in the ligand VDA significantly deviate from the corresponding standard values as listed in Table 6, and this deviation contributes to the high strain energy observed. For example, two C—S bonds (1.52 and 1.68 Å) are considerably shorter than the typical Csp3—S bond length of 1.819 Å (Allen et al., 1987). QM within PHENIX gives rise to a significant improvement of the bond lengths throughout the ligand structure (Table 6, Fig. 6), and overall the strain energy after the QM decreases by almost 12-fold to a value of 23.76 kcal mol−1.
|
3.2.3. of 2x7t
The structure 2x7t determined at 2.8 Å resolution (Cozier et al., 2010) features the enzyme carbonic anhydrase that has a tetrahedral zinc in the active site coordinated by three histidine residues (Fig. 7). The inhibitor (ligand ID WZB) is bound to zinc via an amino group to complete the coordination sphere of the metal. In the deposited structure, all Zn—N and Zn—O distances are in the range 2.14–2.25 Å, which are longer than the average length of 2.00 (2) Å for this bond type (Harding, 1999). This discrepancy leads to a distortion of the coordination sphere of the metal (Fig. 7, Table 7). Concerning the ligand geometry, the phenyl ring is distorted. In particular, one of the bond angles in the ring is 147°, and two bonds are shorter than the normal Car—Car bond length of 1.398 Å (Allen et al., 1987), while the other bonds are longer than the standard value. It is likely that these abnormities contribute to the high strain energy of 96.71 kcal mol−1 computed for this ligand using the deposited coordinates.
|
We carried out the QM PHENIX/DivCon the bond lengths in the coordination sphere of zinc are in the range 1.98–2.02 Å (Table 7), which are all remarkably close to the average values reported in the literature (Harding, 1999). Furthermore, the decrease in the ligand strain energy to 14.5 kcal mol−1 indicates significant improvements in the geometry of the bound molecule WZB, as is also seen from Table 7.
with no assumptions about the geometry of the coordination sphere expressed in the form of restraints. Upon completion of the3.2.4. of 3nck
Suicide inhibitors represent a traditional challenge for conventional ). Such ligands are covalently bound to certain amino acids (e.g. Ser or Lys), thus becoming part of the polypeptide chain. This bond makes it difficult to choose the correct sets of restraints for the chemically modified amino acid and the ligand. The structure 3nck determined at 2.8 Å resolution revealed the signal protein BlaC with a to nafcilin, which belongs to the family of β-lactam antibiotics. The residue Ser84 of BlaC interacts with the four-membered β-lactam ring of the antibiotic and results in ring opening and in the formation of a covalent ester bond between the OG atom of the amino acid and the carbonyl C atom of the ligand. The deposited BlaC–nafcilin (Nff) structure exhibits severe geometry distortions in the region of the ester bond between the ligand and Ser84. In particular, the bond angle ONff1—CNff1—OGSer84 is 102°, CANff1—CNff1—OGSer84 is 123° and CBSer84—OGSer84—CNff1 is 144° (Table 8, Fig. 8), two of which differ significantly from the ideal sp2 configuration arrangement of 120°.
(Kleywegt, 2007
|
As in all cases considered, no geometry restraints for the ligand and the surrounding active site, including the linkage bond CNff1—OGSer84, were utilized. Despite this, QM leads to correct geometry of the ligand and the ester bond and the structure is completely fixed relative to the original PDB model. Notably, the sum of the bond angles around the CNff1 atom (360°) indicates the planarity of the C(O)—C moiety, with all bonds being close to standard values (Allen et al., 1987). In particular, the length of the linkage bond CNff1—OGSer84 (1.39 Å) corresponds to the ideal single C—O bond in of 1.40 Å (Allen et al., 1987). Overall, the strain energy of nafcilin after the SE-QM is significantly improved in comparison to the PDB structure (Table 2). It should be noted that while in several of the cases presented the PDB-REDO project was able to partially recover or correct some of the erroneous structural details, in this case automatic re-refinement of the 3nck structure within the PDB-REDO project (Table 8) failed in the region of the linkage bond. PDB-REDO reports the bond angle CANff1—CNff1—OGSer84 as 127° and the bond angle CBSer84—OGSer84—CNff1 as 156°, which are worse than in the originally deposited structure.
3.2.5. of 3lxs
The structure 3lxs with the cysteine protease inhibitor 4MC was reported at 1.5 Å resolution (Chen, Brinen et al., 2010). In the deposited structure, one of the amide bonds in the inhibitor 4MC (Fig. 9) is notably distorted such that the length C17—O23 is 1.46 Å and the bond angles around the sp2 C atom C17 (Table 9) are significantly different from the ideal bond angle of 120°. Furthermore, in the difference electron-density map drawn using deposited structure factors and phases (Fig. 9) we found both positive and negative electron-density difference peaks in this region associated with the fragment in question, underscoring its poor geometry. Superimposition of the original and of the QM re-refined structures indicates that the improvement in the geometry is also associated with the movement of the amide moiety towards the positive peak of the difference density and away from the negative peak. As a result, no difference density peaks at or above the 3σ level around this amide bond are observed in the re-refined structure. Overall, the strain energy for this ligand 4MC is improved threefold after the QM (Table 2).
|
4. Discussion
X-ray macromolecular structures deposited in the PDB have almost exclusively been refined using conventional geometry restraints-based protocols and very rudimentary energy functionals. However, the high percentage of ligands in the PDB with unusual or questionable geometry (Gore et al., 2011) raises concerns about the extent that this database can be considered to be a reliable source of information on the structures of small-molecule compounds (Cooper et al., 2011) and therefore the protein–ligand complexes so important to structure-based drug discovery. It follows that protein using pre-calculated small-molecule restraint libraries is often not sufficient to determine the final ligand structure, and this observation is at least partially attributed to the fact that ligand dictionaries are very error-prone and frequently require manual editing. Theoretically, manual inspection should reduce a number of errors in the description; however, making advantageous corrections in the is time-consuming and requires significant a priori knowledge of the chemistry and of the ligand(s) of interest in their bound state. This task might be especially challenging for unseasoned crystallographers, who are responsible for about 50% of structure depositions in the PDB (Cooper et al., 2011), some biochemically oriented crystallographers who may be more comfortable with biochemical structures than with small-molecule organic structures, and for industrial crystallographers pressed for time. The bottom line is that an incorrect ligand description inevitably gives rise to problematic geometry and the principle `garbage in/garbage out' (Kleywegt, 2007) is well illustrated in these sorts of studies.
Efforts to improve the ligand representation for X-ray ab initio QM methods (eLBOW; Moriarty et al., 2009) to create the or deriving geometric values from small-molecule crystallographic data (Smart et al., 2011). The target ligand geometry is only part of the equation, and the shape of the harmonic function described by the weight Wk in (1) is also important. Those parameters determine how permissive restraints are and how well they allow a ligand to accommodate the bound geometry in the protein molecule. Unfortunately, there is not an easy recipe to tailor those parameters to a specific ligand–protein complex in situ. In essence, geometry restraints are biased towards isolated ligand geometry, while the X-ray diffraction method reports on the bound receptor–ligand structure. This contradiction might be a reason for the high ligand strain energy observed in conventional (Fu et al., 2011) and some of the strain reported in the present project. Since the macromolecule to which the ligand is bound also influences the structure of the ligand, it is hypothesized that in cases where bond lengths or angles diverge from expected values that these divergences may in fact be owing to the influences of the active site on the ligand as captured by the improved functional. A future study will focus on this very important question.
have focused either on using semiempirical orSE-QM et al., 2011), and in the present study we re-refined 50 protein structures as denoted in Table 2 and found that PHENIX/DivCon consistently produces ligands of significantly lower strain energy values compared with those derived from the original PDB structures. Therefore, even though large differences in pure crystallographic metrics (e.g. Rfree, Rwork etc.) are rarely observed between QM and conventional refinements, differences in ligand strain are often quite large.
demonstrated exceptional robustness and produced high-quality ligand stereochemistry independent of the ligand library. This result is owing to the fact that the SE-QM procedure is rooted in solving the Schrödinger equation for the system and hence does not require the ligand dictionary. This robustness also makes the QM X-ray methodology ideally suited for high-throughput crystallography, especially in the industrial setting. Specifically, the geometric quality of the ligand structure can be quantitatively evaluated using the strain energy metrics (FuThe intrinsic weaknesses of the geometry restraints can be observed in the cases of covalently bound ligand and/or metal-containing examples. From the QM perspective, since the concept of bound versus unbound is immaterial, the ligand and the active site are treated as a single unit and the QM Hamiltonian captures effects owing to electrostatics, polarization, dispersion, charge transfer and so on that are omitted in conventional leading to greater methodological robustness. In order to approximate the qualities that QM provides, the investigator would need a complete set of very accurate in situ restraints, and as observed with the PDB-REDO results, more accurate restraints can certainly aid in the process. However, as demonstrated by decades of molecular-mechanics research, the goal of developing a robust, truly generalizable force field is exceedingly difficult to reach. For instance, inclusion of the PDB-REDO results also shows that a few missing or incorrect restraints, such as those involving the protein–ligand can lead to very deleterious results when the conventional functional is unable to compensate.
Overall, QM ; Řezáč et al., 2009), PM6-DH2 (Korth et al., 2010) and PM7 (Hostaš et al., 2013; Stewart, 2013). These advances in the development of semiempirical methods make the QM methodology even more attractive and promising. In addition to the underlying Hamiltonian, there are a number of key determinants of success that are beyond the scope of the present work, including the protonation state(s) and chemical connectivity chosen by the investigator as well as the starting position(s) of the ligand(s), which is a key contributor to DIS.
addresses method-induced strain or MIS, and therefore the quality of the final model is limited by the parameterization of the semiempirical QM Hamiltonian. In recent years there has been significant Hamiltonian development including PM6 (Stewart, 20095. Conclusions
In this work, we present a new methodology to perform X-ray PHENIX. The PHENIX/DivCon protocol has been validated against over 50 protein–ligand structures and the method clearly results in chemically reasonable ligand geometry even where conventional exhibited difficulties. The problematic ligand geometries found in X-ray structures often originate with the use of small-molecule libraries that require a priori knowledge of the refined protein–ligand complex structure. The QM method derives the system characteristics such as energy and gradients by solving the Schrödinger equation in each cycle, and therefore does not depend upon the ligand dictionary file, fixed atom types, link restraints, coordination-sphere parameters or other `user-supplied' characteristics. It is therefore more robust then conventional methods when addressing ligand strain and artifacts induced by the rudimentary energy function (e.g. MIS) found in conventional tools. A case in point is the comparison of several QM re-refined structures with those treated within the PDB-REDO project (Joosten et al., 2012; Tables 6, 7, 8 and 9). While the geometry has improved for some ligands in the PDB-REDO structures (e.g. 2zl9), it also failed to correct more challenging cases such as covalently bound ligands (e.g. 3nck) or a zinc-bound ligand (e.g. 2x7t). These results suggest that PDB-REDO structures also suffer from the same problems associated with reliance on explicit ligand descriptions and likely the rudimentary functional used.
using the semiempirical quantum-mechanics functional implemented as a plug-in to the crystallographic packageWhen using the method as presented, the successful investigator must still take responsibility for providing a reasonable initial atom placement along with a correct ligand chemistry including connectivity and protonation. For example, while some bond breaking/making can occur with QM methods, a functional alone will not automatically or reliably transmute benzene (i.e. C6H6) into cyclohexane (i.e. C6H12) during Likewise, since the underlying method is gradient-based, it has the same limited radius of convergence as conventional methods, which often cannot overcome large barriers (Brünger & Rice, 1997). Therefore, simulated annealing followed by the QM could be the right protocol when grossly wrong conformations are given as the input.
With this in mind, a logical next stage of the project is to pair PHENIX/DivCon with tools such as LigandFit (Terwilliger et al., 2006), AFIT (Wlodek et al., 2006) and/or simulated annealing in order to account for the major components of artificial strain (e.g. MIS and DIS) and to gain a much better understanding of the strain induced naturally through protein–ligand binding (e.g. TIS). With such a study, one could more easily decompose and measure the influence of the surrounding active site on structure and strain. Further, since strain is really only a measure of the chemical structure of the ligand versus the in vacuo structure, an additional metric to be considered in subsequent work is the change in predicted binding affinity and interaction energies upon QM One could imagine a final workflow in which the investigator uses a density-aware docking function to generate various starting geometries (perhaps with different connectivity or protonation), refines each structure using PHENIX/DivCon, and uses ligand strain, standard crystallographic metrics and QM interaction descriptors (Diller et al., 2010; Zhang et al., 2010; Raha & Merz, 2004, 2005) to inform his or her decision about which model is most correct.
Supporting information
50 QM refined PDB structures. DOI: https://doi.org/10.1107/S1399004714002260/rr5046sup1.zip
Supporting Information. DOI: https://doi.org/10.1107/S1399004714002260/rr5046sup2.pdf
Acknowledgements
We thank the entire PHENIX development team, especially Drs Paul Adams, Nigel Moriarty, Ralf Grosse-Kunstleve and Nat Echols, for technical support and for instructional discussions early in the integration effort; Drs Nestor Concha and Patricia Elkins (GlaxoSmithKline) for discussion and perspective during the validation process; Dr Greg Warren (OpenEye Inc.) for suggested structures on which to focus our validation efforts; and NIH SBIR 1R42GM079899 for funding the research and development effort.
References
Adams, P. D. et al. (2010). Acta Cryst. D66, 213–221. Web of Science CrossRef CAS IUCr Journals Google Scholar
Afonine, P. V., Grosse-Kunstleve, R. W., Echols, N., Headd, J. J., Moriarty, N. W., Mustyakimov, M., Terwilliger, T. C., Urzhumtsev, A., Zwart, P. H. & Adams, P. D. (2012). Acta Cryst. D68, 352–367. Web of Science CrossRef CAS IUCr Journals Google Scholar
Allen, F. H., Kennard, O., Watson, D. G., Brammer, L., Orpen, A. G. & Taylor, R. (1987). J. Chem. Soc. Perkin Trans. 2, pp. S1–S19. CSD CrossRef Web of Science Google Scholar
Borbulevych, O., Merz Jr, K. M. & Westerhoff, L. M. (2011). Acta Cryst. A67, C593. CrossRef IUCr Journals Google Scholar
Borbulevych, O. Y., Plumley, J. A. & Westerhoff, L. M. (2012). Abstr. Pap. Am. Chem. Soc., abstract 478. Google Scholar
Borbulevych, O. & Westerhoff, L. M. (2011). Abstr. Pap. Am. Chem. Soc., abstract 242. Google Scholar
Brünger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J.-S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T. & Warren, G. L. (1998). Acta Cryst. D54, 905–921. Web of Science CrossRef IUCr Journals Google Scholar
Brünger, A. T. & Rice, L. M. (1997). Methods Enzymol. 277, 243–269. PubMed CAS Web of Science Google Scholar
Case, D. A. et al. (2010). AMBER 11. University of California, San Francisco, USA. Google Scholar
Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010). Acta Cryst. D66, 12–21. Web of Science CrossRef CAS IUCr Journals Google Scholar
Chen, Y. T., Brinen, L. S., Kerr, I. D., Hansell, E., Doyle, P. S., McKerrow, J. H. & Roush, W. R. (2010). PLoS Negl. Trop. Dis. 4, e825. Web of Science CrossRef PubMed Google Scholar
Cooper, D. R., Porebski, P. J., Chruszcz, M. & Minor, W. (2011). Expert Opin. Drug Discov. 6, 771–782. Web of Science CrossRef CAS PubMed Google Scholar
Cozier, G. E., Leese, M. P., Lloyd, M. D., Baker, M. D., Thiyagarajan, N., Acharya, K. R. & Potter, B. V. L. (2010). Biochemistry, 49, 3464–3476. Web of Science CrossRef CAS PubMed Google Scholar
Davis, I. W., Leaver-Fay, A., Chen, V. B., Block, J. N., Kapral, G. J., Wang, X., Murray, L. W., Arendall, W. B., Snoeyink, J., Richardson, J. S. & Richardson, D. C. (2007). Nucleic Acids Res. 35, W375–W383. Web of Science CrossRef PubMed Google Scholar
Dewar, M. J. S., Zoebisch, E. G., Healy, E. F. & Stewart, J. J. P. (1985). J. Am. Chem. Soc. 107, 3902–3909. CrossRef CAS Web of Science Google Scholar
Diller, D. J., Humblet, C., Zhang, X. & Westerhoff, L. M. (2010). Proteins, 78, 2329–2337. Web of Science CrossRef CAS PubMed Google Scholar
Dixon, S. L. & Merz, K. M. Jr (1996). J. Chem. Phys. 104, 6643. CrossRef Web of Science Google Scholar
Dixon, S. L. & Merz, K. M. Jr (1997). J. Chem. Phys. 107, 879. CrossRef Web of Science Google Scholar
Dixon, S., Merz, K. M. Jr, Lauri, G. & Ianni, J. C. (2005). J. Comput. Chem. 26, 23–34. Web of Science CrossRef PubMed CAS Google Scholar
Dodson, E., Kleywegt, G. J. & Wilson, K. (1996). Acta Cryst. D52, 228–234. CrossRef CAS Web of Science IUCr Journals Google Scholar
Engh, R. A. & Huber, R. (1991). Acta Cryst. A47, 392–400. CrossRef CAS Web of Science IUCr Journals Google Scholar
Evans, P. R. (2007). Acta Cryst. D63, 58–61. Web of Science CrossRef CAS IUCr Journals Google Scholar
Fu, Z., Li, X. & Merz, K. M. Jr (2011). J. Comput. Chem. 32, 2587–2597. Web of Science CrossRef CAS PubMed Google Scholar
Fu, Z., Li, X. & Merz, K. M. Jr (2012). J. Chem. Theory Comput. 8, 1436–1448. Web of Science CrossRef CAS PubMed Google Scholar
Gore, S., Olsson, T. S. G. & Zhuravleva, M. (2011). Acta Cryst. A67, C104. CrossRef IUCr Journals Google Scholar
Harding, M. M. (1999). Acta Cryst. D55, 1432–1443. Web of Science CrossRef CAS IUCr Journals Google Scholar
Hostaš, J., Řezáč, J. & Hobza, P. (2013). Chem. Phys. Lett. 568–569, 161–166. Google Scholar
Jaskolski, M., Gilski, M., Dauter, Z. & Wlodawer, A. (2007). Acta Cryst. D63, 611–620. Web of Science CrossRef CAS IUCr Journals Google Scholar
Joosten, R. P., Joosten, K., Murshudov, G. N. & Perrakis, A. (2012). Acta Cryst. D68, 484–496. Web of Science CrossRef CAS IUCr Journals Google Scholar
Kleywegt, G. J. (2007). Acta Cryst. D63, 94–100. Web of Science CrossRef CAS IUCr Journals Google Scholar
Kleywegt, G. J. & Jones, T. A. (1996). Structure, 4, 1395–1400. CrossRef CAS PubMed Web of Science Google Scholar
Korth, M., Pitoňák, M., Řezáč, J. & Hobza, P. (2010). J. Chem. Theory Comput. 6, 344–352. Web of Science CrossRef CAS Google Scholar
Labute, P. (2005). J. Chem. Inf. Model. 45, 215–221. Web of Science CrossRef PubMed CAS Google Scholar
Lack, N. A., Yam, K. C., Lowe, E. D., Horsman, G. P., Owen, R. L., Sim, E. & Eltis, L. D. (2010). J. Biol. Chem. 285, 434–443. Web of Science CrossRef PubMed CAS Google Scholar
Lascombe, M.-B., Ponchet, M., Venard, P., Milat, M.-L., Blein, J.-P. & Prangé, T. (2002). Acta Cryst. D58, 1442–1447. Web of Science CrossRef CAS IUCr Journals Google Scholar
Laskowski, R. A., MacArthur, M. W., Moss, D. S. & Thornton, J. M. (1993). J. Appl. Cryst. 26, 283–291. CrossRef CAS Web of Science IUCr Journals Google Scholar
Li, X., Hayik, S. A. & Merz, K. M. Jr (2010). J. Inorg. Biochem. 104, 512–522. Web of Science CrossRef CAS PubMed Google Scholar
Liebeschuetz, J., Hennemann, J., Olsson, T. & Groom, C. R. (2012). J. Comput. Aided Mol. Des. 26, 169–183. Web of Science CSD CrossRef CAS PubMed Google Scholar
Liu, D. C. & Nocedal, J. (1989). Math. Program. 45, 503–528. CrossRef Web of Science Google Scholar
Merritt, E. A. (2012). Acta Cryst. D68, 468–477. Web of Science CrossRef CAS IUCr Journals Google Scholar
Merz, K. M. Jr & Raha, K. (2011). US Patent 7904283. Google Scholar
Moriarty, N. W., Grosse-Kunstleve, R. W. & Adams, P. D. (2009). Acta Cryst. D65, 1074–1080. Web of Science CrossRef CAS IUCr Journals Google Scholar
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367. Web of Science CrossRef CAS IUCr Journals Google Scholar
Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Acta Cryst. D53, 240–255. CrossRef CAS Web of Science IUCr Journals Google Scholar
Perola, E. & Charifson, P. S. (2004). J. Med. Chem. 47, 2499–2510. Web of Science CrossRef PubMed CAS Google Scholar
Peters, M. B. & Merz, K. M. Jr (2006). J. Chem. Theory Comput. 2, 383–399. Web of Science CrossRef CAS Google Scholar
Pozharski, E., Weichenberger, C. X. & Rupp, B. (2013). Acta Cryst. D69, 150–167. Web of Science CrossRef CAS IUCr Journals Google Scholar
Raha, K. & Merz, K. M. Jr (2004). J. Am. Chem. Soc. 126, 1020–1021. Web of Science CrossRef PubMed CAS Google Scholar
Raha, K. & Merz, K. M. Jr (2005). J. Med. Chem. 48, 4558–4575. Web of Science CrossRef PubMed CAS Google Scholar
Řezáč, J., Fanfrlík, J., Salahub, D. & Hobza, P. (2009). J. Chem. Theory Comput. 5, 1749–1760. PubMed Google Scholar
Schmidt, M. W., Baldridge, K. K., Boatz, J. A., Elbert, S. T., Gordon, M. S., Jensen, J. H., Koseki, S., Matsunaga, N., Nguyen, K. A., Su, S., Windus, T. L., Dupuis, M. & Montgomery, J. A. (1993). J. Comput. Chem. 14, 1347–1363. CrossRef CAS Web of Science Google Scholar
Shimizu, M., Miyamoto, Y., Takaku, H., Matsuo, M., Nakabayashi, M., Masuno, H., Udagawa, N., DeLuca, H. F., Ikura, T. & Ito, N. (2008). Bioorg. Med. Chem. 16, 6949–6964. Web of Science CrossRef PubMed CAS Google Scholar
Smart, O. S., Womack, T. O., Flensburg, C., Keller, P., Paciorek, W., Sharff, A., Vonrhein, C. & Bricogne, G. (2011). Acta Cryst. A67, C134. CrossRef IUCr Journals Google Scholar
Stewart, J. J. P. (2009). J. Mol. Model. 15, 765–805. Web of Science CrossRef PubMed CAS Google Scholar
Stewart, J. J. P. (2013). J. Mol. Model. 19, 1–32. Web of Science CrossRef CAS PubMed Google Scholar
Terwilliger, T. C., Klei, H., Adams, P. D., Moriarty, N. W. & Cohn, J. D. (2006). Acta Cryst. D62, 915–922. Web of Science CrossRef CAS IUCr Journals Google Scholar
Tronrud, D. E. (2004). Acta Cryst. D60, 2156–2168. Web of Science CrossRef CAS IUCr Journals Google Scholar
Vaart, A. van der, Gogonea, V., Dixon, S. L. & Merz, K. M. Jr (2000). J. Comput. Chem. 21, 1494–1504. Google Scholar
Vaart, A. van der., Suarez, D. & Merz, K. M. Jr (2000). J. Chem. Phys. 113, 10512–10523. Google Scholar
Wang, B., Raha, K. & Merz, K. M. Jr (2004). J. Am. Chem. Soc. 126, 11430–11431. Web of Science CrossRef PubMed CAS Google Scholar
Wang, B., Westerhoff, L. M. & Merz, K. M. Jr (2007). J. Med. Chem. 50, 5128–5134. Web of Science CrossRef PubMed CAS Google Scholar
Williams, D. E., Peters, M. B., Wang, B., Roitberg, A. E. & Merz, K. M. Jr (2009). J. Phys. Chem. A, 113, 11550–11559. Web of Science CrossRef PubMed CAS Google Scholar
Wlodek, S., Skillman, A. G. & Nicholls, A. (2006). Acta Cryst. D62, 741–749. Web of Science CrossRef CAS IUCr Journals Google Scholar
Yu, N., Yennawar, H. P. & Merz, K. M. Jr (2005). Acta Cryst. D61, 322–332. Web of Science CrossRef CAS IUCr Journals Google Scholar
Zhang, X., Gibbs, A. C., Reynolds, C. H., Peters, M. B. & Westerhoff, L. M. (2010). J. Chem. Inf. Model. 50, 651–661. Web of Science CrossRef CAS PubMed Google Scholar
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.