computer programs
MatchMaps: non-isomorphous difference maps for X-ray crystallography
aDepartment of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA, and bSchool of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, USA
*Correspondence e-mail: dennisbrookner@fas.harvard.edu, doeke_hekstra@harvard.edu
Conformational change mediates the biological functions of macromolecules. Crystallographic measurements can map these changes with extraordinary sensitivity as a function of mutations, ligands and time. A popular method for detecting structural differences between crystallographic data sets is the isomorphous difference map. These maps combine the phases of a chosen reference state with the observed changes in structure-factor amplitudes to yield a map of changes in electron density. Such maps are much more sensitive to conformational change than structure refinement is, and are unbiased in the sense that observed differences do not depend on refinement of the perturbed state. However, even modest changes in unit-cell properties can render isomorphous difference maps useless. This is unnecessary. Described here is a generalized procedure for calculating observed difference maps that retains the high sensitivity to conformational change and avoids structure refinement of the perturbed state. This procedure is implemented in an open-source Python package, MatchMaps, that can be run in any software environment supporting PHENIX [Liebschner et al. (2019). Acta Cryst. D75, 861–877] and CCP4 [Agirre et al. (2023). Acta Cryst. D79, 449–461]. Worked examples show that MatchMaps `rescues' observed difference electron-density maps for poorly isomorphous crystals, corrects artifacts in nominally isomorphous difference maps, and extends to detecting differences across copies within the asymmetric unit or across altogether different crystal forms.
Keywords: X-ray crystallography; open-source software; difference maps.
1. Introduction
X-ray crystallography provides a powerful method for characterizing the changes in protein structure caused by a perturbation (Hekstra et al., 2016; Keedy et al., 2018; Bhabha et al., 2015; Brändén & Neutze, 2021). For significant structural changes, it is usually sufficient to refine separate structural models for each data set and draw comparisons between the refined structures. However, for many conformational changes, coordinate-based comparisons are inaccurate and insensitive.
In crystallography, electron density is not observed directly. Rather, one observes a diffraction pattern consisting of reflections with intensities proportional to the squared amplitudes of the structure factors – the Fourier components of the electron density. Unfortunately, the phases of these structure factors are not observable. These phases correspond in real space to shifts of the sinusoidal waves that add up to an electron-density pattern. Accordingly, phases are usually calculated from a refined model. Since phases have a strong effect on the map appearance (Read, 1986), naïve electron-density maps calculated using observed amplitudes and model-based phases will tend to resemble the model, a phenomenon known as model bias.
Conformational changes in crystallography, and especially in room-temperature or time-resolved crystallography, are often detected via an isomorphous difference map (Rould & Carter, 2003). Such a map is computed by combining differences in observed amplitudes with a single set of phases. The phases are usually derived from a model for one of the two states, chosen as a reference. Thus, the difference density Δρ(x) is approximated as
$$\Delta\rho(\mathbf{x}) \approx \sum_{\mathbf{h}} \left( |F^{\mathrm{ON}}_{\mathbf{h}}| - |F^{\mathrm{OFF}}_{\mathbf{h}}| \right) \exp\left(i\varphi^{\mathrm{calc,OFF}}_{\mathbf{h}}\right) \exp\left(-2\pi i\,\mathbf{h}\cdot\mathbf{x}\right) \qquad (1)$$
where $|F^{\mathrm{ON}}_{\mathbf{h}}|$ and $|F^{\mathrm{OFF}}_{\mathbf{h}}|$ are sets of observed structure-factor amplitudes from the ON (perturbed) and OFF (reference) data sets, respectively, $\varphi^{\mathrm{calc,OFF}}_{\mathbf{h}}$ is a set of calculated phases derived from a structural model of the OFF data, h is shorthand for the triplet of Miller indices (h, k, l) and x is shorthand for the real-space fractional coordinates (x, y, z). Crucially, therefore, isomorphous difference maps do not include any information derived from modeling of the ON structure. Any difference electron density relating to the ON data relative to the OFF data (e.g. positive difference density for a bound ligand) is thus guaranteed not to be biased by previous modeling of the ON state.
Unfortunately, interesting conformational changes often slightly alter the packing of molecules in the crystal, which can manifest as changes in unit-cell dimensions. Unit-cell constants are also sensitive to temperature (Fraser et al., 2011), radiation damage (Ravelli & McSweeney, 2000), pressure (Barstow et al., 2008) and humidity (Farley et al., 2014), meaning that even data collected on the same crystal may not be quite isomorphous.
In this contribution, we illustrate the consequences of deviations from perfect isomorphism, introduce an approach to the calculation of difference maps without perfect isomorphism, and describe examples of the application of the software implementing this approach (MatchMaps) to a number of typical use cases. We find the MatchMaps approach to be applicable also to molecules related by non-crystallographic symmetry and to molecules crystallized in altogether different crystal forms.
1.1. Implications of isomorphism
We begin by demonstrating the consequences of small deviations from perfect isomorphism. Our example makes use of three data sets (Sawaya & Kraut, 1997), all of Escherichia coli dihydrofolate reductase (DHFR) crystallized in space group P212121. These data sets vary by which ligands are bound to DHFR; we will discuss these ligands further below. Data sets 1rx2 and 1rx1 have unit-cell dimensions identical to within 0.4%, whereas data sets 1rx2 and 1rx4 differ by 2% along the c axis. Reflections in diffraction experiments report on different 3D frequency components of the electron density of molecules in the unit cell. As such, the shape of the molecular arrangement may look essentially the same (that is, isomorphous) at low spatial resolution yet entirely different at high resolution [recall that the contributions of different atoms j to structure factors add up by terms $\exp(2\pi i\,\mathbf{s}\cdot\mathbf{r}_j)$ for scattering vector s and atomic position rj]. Measuring this quantitatively, we see much higher correlations between the structure-factor amplitudes for our highly isomorphous pair of data sets (1rx2 and 1rx1) than for our `poorly isomorphous' pair (1rx2 and 1rx4) [solid lines in Fig. 1(a)]. We find a similarly stark difference in correlations for the phases of refined models, whether measured by a figure of merit, $\langle\cos(\varphi_1 - \varphi_2)\rangle$, or by a correlation coefficient (liable to small phase-wrapping artifacts). The loss in similarity of phases is visually striking [Figs. 1(b) and 1(c)].
We expect the consequences of such a loss of isomorphism to be severe: the computation of an isomorphous difference map requires that (i) amplitude differences are large only when phase differences are small, and conversely that (ii) phase differences are large only when amplitude differences are small. These requirements follow from equation (1) above and are depicted visually in the work of Rould & Carter (2003). The isomorphous data meet these requirements [Fig. 1(d)]. In contrast, the poorly isomorphous data sets display consistently large amplitude differences, regardless of the corresponding phase difference [Fig. 1(e)].
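The resolution dependence described above can be reproduced with a toy calculation. The following is a minimal sketch, not part of MatchMaps: it assumes a hypothetical 1D `molecule' of seven unit-weight atoms at made-up positions, and compares the phases computed for the same atoms in a 100 Å cell and in a cell stretched by 2%. The names structure_factor, phase_agreement and figure_of_merit are illustrative only.

```python
import cmath
import math

# Hypothetical 1D "molecule": atom positions in angstroms (made-up values).
ATOMS = [12.0, 15.5, 19.0, 23.7, 28.1, 33.3, 37.9]


def structure_factor(h, cell_length, atoms=ATOMS):
    """Sum of unit-weight atomic terms exp(2*pi*i*h*x/L) for a 1D cell of length L."""
    return sum(cmath.exp(2j * math.pi * h * x / cell_length) for x in atoms)


def phase_agreement(h, cell_a, cell_b):
    """cos(phi_a - phi_b) for reflection h computed in two different cells."""
    fa = structure_factor(h, cell_a)
    fb = structure_factor(h, cell_b)
    return math.cos(cmath.phase(fa) - cmath.phase(fb))


def figure_of_merit(h_values, cell_a, cell_b):
    """Mean phase agreement over a band of reflections."""
    scores = [phase_agreement(h, cell_a, cell_b) for h in h_values]
    return sum(scores) / len(scores)


cell_off, cell_on = 100.0, 102.0  # assumed cells: a 2% change in length
print("h = 1 (100 A resolution):", round(phase_agreement(1, cell_off, cell_on), 4))
print("h = 30-50 (3.3-2.0 A):   ",
      round(figure_of_merit(range(30, 51), cell_off, cell_on), 4))
```

In this sketch the lowest-order reflection keeps nearly the same phase in both cells, whereas the printed high-resolution figure of merit shows how far the phases drift even though no atom has moved relative to its neighbors.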
1.2. Rethinking isomorphous difference maps via the linearity of the Fourier transform
An isomorphous difference map is typically computed by first subtracting the structure-factor amplitudes (i.e. subtracting in reciprocal space) and then applying the Fourier transform to convert the differences into a real-space difference map. However, because the Fourier transform and subtraction are both linear operations, their order can be switched without changing the result; one might just as well calculate two electron-density maps first and then subtract those maps voxel by voxel to yield an isomorphous difference map.
This reordering suggests how difference-map computation can be generalized beyond the isomorphous case. In particular, we see that the step in the algorithm most specific to the assumption of isomorphism is the construction of `hybrid' structure factors, which combine the observed structure-factor amplitudes for the ON data ($|F^{\mathrm{ON}}_{\mathbf{h}}|$) with the calculated phases for the OFF data ($\varphi^{\mathrm{calc,OFF}}_{\mathbf{h}}$). The resulting structure factors thus have the form
$$|F^{\mathrm{ON}}_{\mathbf{h}}| \exp\left(i\varphi^{\mathrm{calc,OFF}}_{\mathbf{h}}\right).$$
Critically, if the ON and OFF data differ in unit-cell volume and/or molecular orientation, these OFF phases may be incompatible with the ON amplitudes.
The method presented below improves these hybrid structure factors by computing phases that account for the (generally uninteresting) shifts in molecular position and orientation without removing any signal associated with `interesting' changes.
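The interchangeability of subtraction and Fourier transformation can be checked numerically. The sketch below is a toy under stated assumptions rather than MatchMaps code: it uses a hypothetical 1D cell sampled on 32 grid points, made-up fractional atom positions, and a plain-Python discrete Fourier transform (to_map) with the same sign convention as equation (1). Both sets of amplitudes are paired with the OFF phases, as in an isomorphous difference map.

```python
import cmath
import math

N = 32  # grid points along a hypothetical 1D unit cell


def structure_factors(atoms):
    """F_h = sum_j exp(2*pi*i*h*x_j) for fractional coordinates x_j, h = 0..N-1."""
    return [sum(cmath.exp(2j * math.pi * h * x) for x in atoms) for h in range(N)]


def to_map(factors):
    """Inverse DFT: rho(x_n) = (1/N) * sum_h F_h * exp(-2*pi*i*h*n/N)."""
    return [
        sum(f * cmath.exp(-2j * math.pi * h * n / N) for h, f in enumerate(factors)) / N
        for n in range(N)
    ]


# Hypothetical OFF and ON states: one atom moves from x = 0.50 to x = 0.5625.
f_off = structure_factors([0.125, 0.25, 0.50])
f_on = structure_factors([0.125, 0.25, 0.5625])

# Hybrid coefficients: each set of amplitudes is paired with the OFF phases.
off_phase_factors = [cmath.exp(1j * cmath.phase(f)) for f in f_off]
hybrid_on = [abs(f) * p for f, p in zip(f_on, off_phase_factors)]
hybrid_off = [abs(f) * p for f, p in zip(f_off, off_phase_factors)]

# Route 1: subtract in reciprocal space, then transform.
route_1 = to_map([a - b for a, b in zip(hybrid_on, hybrid_off)])
# Route 2: transform each set, then subtract voxel by voxel.
route_2 = [a - b for a, b in zip(to_map(hybrid_on), to_map(hybrid_off))]

max_discrepancy = max(abs(a - b) for a, b in zip(route_1, route_2))
print(f"largest voxel-wise discrepancy: {max_discrepancy:.2e}")
```

The two routes agree to floating-point precision. The practical consequence is that route 2 leaves room to treat each map separately (for example, with its own phases and its own placement in space) before the subtraction.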
2. The MatchMaps algorithm
The goal of MatchMaps is to achieve the best possible real-space difference density map without utilizing a prior model of any structural changes of interest. To compute a real-space difference density map, one first needs to approximate phases for each data set. As discussed above, the isomorphous difference map makes the simplifying assumption that the same set of phases can be used for both structures.
The key to MatchMaps is to improve phases for the ON data via rigid-body refinement of the OFF starting model against the ON structure-factor amplitudes. This rigid-body refinement step improves phases by optimally placing the protein model in space. Critically, the restriction of this refinement to only whole-model rigid-body motion protects these new phases from bias towards modeled structural changes. The result is two sets of complex structure factors which make use of the information encoded in the structure-factor amplitudes without relying on a second input model.
Next, each set of complex structure factors is Fourier-transformed into a real-space electron-density map. These two real-space maps will not necessarily overlay in space. However, the rotation and translation necessary to overlay the maps can be obtained from the results of the rigid-body refinement. Following real-space alignment, the maps can be subtracted voxel-wise to compute a difference map.
In the idealized case – similar structures, oriented identically in space, with identical unit cells – MatchMaps will perform essentially identically to an isomorphous difference map. However, as we show in the examples below, MatchMaps is more capable than a traditional isomorphous difference map of handling data sets that diverge from this ideal. Furthermore, even in seemingly simple cases where isomorphous difference maps perform well, the real-space MatchMaps approach can show distinct improvements.
2.1. Details of algorithmic implementation
The full MatchMaps algorithm is as follows. As inputs, the algorithm requires two sets of amplitudes (referred to as ON and OFF data sets, for simplicity) and a single starting model (corresponding to the OFF data).
(i) If necessary, place both sets of structure-factor amplitudes on a common scale using the SCALEIT (Henderson & Moffat, 1971) utility in CCP4 (Agirre et al., 2023).
(ii) Truncate both data sets to the same resolution range. This prevents the final difference map from preferentially displaying high-resolution features from the higher-resolution data set.
(iii) Generate phases for each data set via the phenix.refine program (Liebschner et al., 2019). For each data set, the OFF starting model is used and only rigid-body refinement is permitted, to prevent the introduction of model bias. Bulk-solvent scaling may be either included (by default) or omitted from refinement. Including bulk-solvent scaling leads to better refinement statistics and higher map quality overall. However, bulk-solvent scaling may `flatten' desired signal in the solvent region, e.g. for a large bound ligand. This trade-off is left to the user.
(iv) Create complex structure factors by combining observed structure-factor amplitudes with computed phases obtained from refinement. Fourier transform each set of complex structure factors into a real-space electron-density map; this is performed using the Python packages reciprocalspaceship (Greisman et al., 2021) and gemmi (Wojdyr, 2022).
(v) Compute the translation and rotation necessary to overlay the two rigid-body refined models. Apply this translation–rotation to the ON real-space map such that it overlays with the OFF map. These computations are carried out using gemmi. Note that the two rigid-body refined models are identical aside from translation and rotation, rendering trivial the atom selection for alignment.
(vi) Subtract real-space maps voxel-wise.
(vii) Apply a solvent mask to the final difference map.
We note that MatchMaps is structured such that Step (iii) can be generalized to not only rigid-body refinement but refinement of any `uninteresting features' if the user provides a custom PHENIX (Liebschner et al., 2019) parameter file as specified in the online documentation. For example, if the starting model contains multiple protein chains, each chain can be rigid-body-refined separately.
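The effect of Steps (v) and (vi) can be illustrated with a deliberately simplified 1D toy. This sketch is not the MatchMaps implementation: it assumes perfect phases (the maps are built directly from made-up atom positions), a rigid displacement equal to a whole number of grid points, and a hypothetical extra `ligand' atom present only in the ON state. The helpers gaussian_map and shift_map are illustrative names.

```python
import math

N = 120          # grid points along a hypothetical 1D cell
GRID_STEP = 0.5  # angstroms per grid point (assumed)


def gaussian_map(centers, width=1.0):
    """Periodic 1D 'electron density': Gaussians centred at `centers` (angstroms)."""
    length = N * GRID_STEP
    density = []
    for n in range(N):
        x = n * GRID_STEP
        total = 0.0
        for c in centers:
            d = (x - c + length / 2) % length - length / 2  # minimum-image distance
            total += math.exp(-0.5 * (d / width) ** 2)
        density.append(total)
    return density


def shift_map(density, n_points):
    """Translate a periodic map by an integer number of grid points."""
    n_points %= len(density)
    return density[-n_points:] + density[:-n_points]


protein = [10.0, 14.0, 19.0, 25.0]  # OFF-state atom positions (made-up)
ligand = 40.0                       # extra atom present only in the ON state
pose_shift = 3                      # ON molecule displaced by 3 grid points

off_map = gaussian_map(protein)
on_map = gaussian_map([c + pose_shift * GRID_STEP for c in protein + [ligand]])

# Subtraction without alignment: the pose change contaminates the map.
naive_diff = [a - b for a, b in zip(on_map, off_map)]

# Step (v): undo the rigid displacement; Step (vi): subtract voxel-wise.
aligned_on = shift_map(on_map, -pose_shift)
matched_diff = [a - b for a, b in zip(aligned_on, off_map)]

ligand_index = round(ligand / GRID_STEP)
print("largest |naive| value near the protein:",
      round(max(abs(v) for v in naive_diff[:60]), 3))
print("matched difference peaks at grid point:",
      matched_diff.index(max(matched_diff)))
```

After alignment the only remaining feature is the ligand peak, whereas the unaligned subtraction shows strong paired positive and negative features at every protein atom. In the real algorithm the alignment is a 3D rotation and translation taken from the rigid-body refined models rather than an integer grid shift.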
2.2. Installation
MatchMaps can be installed using the pip Python package manager (pip install matchmaps). The various pure-Python dependencies of MatchMaps are handled by pip. Additionally, MatchMaps requires installation of the popular CCP4 and PHENIX software suites for crystallography. Once installed, the above protocol can be run in a single step from the command line.
In addition to the base MatchMaps command-line utility, the utilities matchmaps.ncs and matchmaps.mr provide additional functionalities explored in the examples below and the online documentation. MatchMaps is fully open source and readily extensible for novel use cases.
For more information, read the MatchMaps documentation at https://rs-station.github.io/matchmaps.
3. MatchMaps in the context of refinement and alternative approaches
MatchMaps is not a replacement for automatic and manual structural refinement of crystallographic data. Rather, we argue that MatchMaps provides a valuable supplement to structural refinement when the crystallographer seeks to characterize a structural change. MatchMaps can be implemented near the beginning of the analysis process to visualize the ON–OFF signal before an ON model has been refined. MatchMaps can also be used during or following refinement to validate or justify structural differences modeled during refinement of the ON and OFF models.
Below, we discuss two alternative methods which supplement structure refinement and which contrast interestingly with MatchMaps.
3.1. Fo–Fc difference maps across data sets
A common element of structure refinement is the Fo–Fc map (or, more precisely, mFo–DFc), which is used to describe how the modeled structure differs from the data. Details of the construction of such a map can be found elsewhere (Lamb et al., 2015). In practice, Fo–Fc maps are often the output of a procedure including refinement of atomic coordinates. In principle, however, an Fo–Fc map can derive from a rigid-body-only refinement of a known structure to a new data set. In this latter scenario, the Fo–Fc map is similar to a MatchMaps difference map (or, in an isomorphous case, to an isomorphous difference map).
The difference between an Fo–Fc map and a MatchMaps difference map is that, whereas MatchMaps only ever uses observed structure-factor amplitudes, the Fo–Fc map describes the OFF/reference data set using calculated structure-factor amplitudes. In the limiting case where the OFF model describes the OFF data perfectly, the Fo–Fc map should look like a MatchMaps difference map. In fact, an Fo–Fc map may look better, because the map coefficients include only one set of measurement errors. Unfortunately, however, any modeling errors of the OFF/reference state will be included in the final Fo–Fc map. Accordingly, in an Fo–Fc map, it is impossible to distinguish `real signal' (differences between the ON and OFF data) from modeling errors. We illustrate this undesired behavior below [Figs. 2(j)–2(k) and 3(i)–3(j)].
Note that the map coefficients for an Fo–Fc map are created and saved by MatchMaps (if the --keep-temp-files flag is used), facilitating easy comparison between these two map types if desired.
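The distinction drawn in this section can be reduced to a few lines of arithmetic. The sketch below is a toy only: it assumes perfect phases and already-aligned maps, and all densities are made-up numbers on a ten-voxel grid. The OFF model deliberately omits an atom (voxel 6) that is present in both crystals, while the ON crystal contains a genuine new feature (voxel 8).

```python
def subtract(map_a, map_b):
    """Voxel-wise difference of two maps sampled on the same grid."""
    return [a - b for a, b in zip(map_a, map_b)]


# Made-up 1D densities in arbitrary units.
off_data = [0, 1, 4, 1, 0, 0, 2, 0, 0, 0]   # what the OFF crystal contains
on_data = [0, 1, 4, 1, 0, 0, 2, 0, 3, 0]    # ON crystal: same, plus a ligand at voxel 8
off_model = [0, 1, 4, 1, 0, 0, 0, 0, 0, 0]  # OFF model with the atom at voxel 6 left out

fo_minus_fc = subtract(on_data, off_model)   # observed ON minus calculated OFF
observed_diff = subtract(on_data, off_data)  # observed ON minus observed OFF

print("Fo-Fc style map:     ", fo_minus_fc)
print("observed difference: ", observed_diff)
```

Both maps report the ligand, but only the Fo–Fc style map also reports the omitted atom, and nothing in that map distinguishes the two peaks. Because the observed difference never consults the model's density, the omission cannot appear in it.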
3.2. PanDDA
A popular recent method for extracting subtle ligand-binding signal from crystallographic data is the pan-data-set density analysis (PanDDA) approach (Pearce et al., 2017). A key practical difference between PanDDA and MatchMaps is that, while PanDDA expects several (typically of the order of dozens of) data sets, MatchMaps supports only two data sets at once. Additionally, whereas MatchMaps never changes the internal atomic coordinates of the input model, PanDDA aligns all input structures and maps via a local warping procedure. Thus, PanDDA reduces its ability to describe protein conformational changes in order to maximize its ability to detect weak ligand-binding events.
4. Examples
The following examples explore the benefits and functionalities offered by MatchMaps. All examples make use of published crystallographic data available from the Protein Data Bank (https://www.rcsb.org/). Scripts and data files for reproducing the figures can be found on Zenodo (Brookner & Hekstra, 2024).
4.1. MatchMaps for poorly isomorphous DHFR data sets
The enzyme dihydrofolate reductase is a central model system for understanding the role of conformational change in productive catalytic turnover (Sawaya & Kraut, 1997; Boehr et al., 2006; Bhabha et al., 2011). Specifically, the active-site Met20 loop of E. coli DHFR can adopt several different conformations, each stabilized by particular bound ligands and crystal contacts (Sawaya & Kraut, 1997). DHFR bound to NADP+ and substrate analog folate adopts a `closed' Met20 loop (PDB ID 1rx2), whereas DHFR bound to NADP+ and product analog (dideazatetrahydrofolate) adopts an `occluded' Met20 loop (PDB ID 1rx4). These structures are highly similar, other than the relevant changes at the active site [Fig. 2(a), structural changes shown in boxes; r.m.s.d. 0.37 Å for protein Cα atoms excluding the Met20 loop].
Importantly, the presence of the occluded-loop conformation leads to altered crystal packing wherein the crystallographic c axis increases by 2%, from 98.91 to 100.88 Å [Fig. 2(c)]. Thus, 1rx2 and 1rx4 are `poorly isomorphous', meaning that these structures, though extremely similar, cannot be effectively compared by an isomorphous difference map [Figs. 2(d) and 2(g)]. We illustrate the striking change in phase between these structures in Fig. 1. MatchMaps is able to account for this poor isomorphism and recover the expected difference signal.
First, we focus on ligand rearrangement in the active site. In the occluded-loop structure, the cofactor [Figs. 2(d)–2(f), left] leaves the active site while the substrate [Figs. 2(d)–2(f), right] slides laterally within the active site. MatchMaps shows this expected signal, with negative (red) difference density for the cofactor and paired positive (blue) and negative (red) difference density for the substrate [Figs. 2(e)–2(f)]. There is even faint positive signal for the `swung-out' cofactor [Figs. 2(e)–2(f), far left]. By contrast, an isomorphous difference map [Fig. 2(d)] is unable to recover this signal. A model of the occluded-loop structure is shown for clarity in Fig. 2(f) as blue sticks and clearly matches the positive difference density. Importantly, this ON model is never used in the computation of the MatchMaps map.
We find a similar result around residues 21–25 of the Met20 loop [Figs. 2(g)–2(i)]. Again, MatchMaps shows readily interpretable difference signal for the change in loop conformation between the closed-loop (red) and occluded-loop (blue) structures [Figs. 2(h)–2(i)]. The isomorphous difference map, on the other hand, contains no interpretable signal in this region of strong structural change [Fig. 2(g)]. The occluded-loop model is shown for visual comparison in Fig. 2(i) but was not used for computation of the MatchMaps map.
4.1.1. MatchMaps is not susceptible to modeling errors
As discussed above, Fo–Fc maps can often display similar information to MatchMaps difference maps. However, Fo–Fc maps will also contain signal that is not a difference between ON and OFF data sets, but rather results from modeling errors of the OFF model to the OFF data. We demonstrate this behavior by introducing a spurious conformer of phenylalanine 103 to the OFF starting model used above. Phe103 lies in a region distal to the ligands and active site [Fig. 2(b)]. An Fo–Fc map, which inherently includes modeling errors, shows strong positive and negative difference density suggesting the correct Phe103 conformer [Fig. 2(j)]. From the Fo–Fc map alone, it would be impossible to determine if this signal represented a difference between the ON and OFF data or a modeling error. In contrast, the MatchMaps difference map shows no difference density for this side chain [Fig. 2(k)]. This is the desired and expected result; neither data set's Fo contains any information about this spurious conformer.
4.2. MatchMaps for poorly isomorphous HEWL data sets with a translation artifact
Hen egg-white lysozyme (HEWL) is among the best characterized model enzymes and has been the subject of many crystallographic analyses. One such analysis is high-pressure protein crystallography (HPPX), wherein crystal structures are collected at pressures ranging from ambient to hundreds of megapascals. Notably, HPPX is frequently associated with unit-cell changes. Here, we use MatchMaps to compare an ambient-pressure apo structure of HEWL (PDB ID 4wld) with a (GlcNAc)4-bound structure collected at 920 MPa (PDB ID 4xen) (Yamada et al., 2015). The a and b axes of the unit cell shrink from 79.197 to 76.152 Å as a result of pressure, a change of nearly 4%.
First, we examine the positive difference density (blue mesh) for the bound (GlcNAc)4 (gray sticks) in both MatchMaps and an isomorphous difference map. While signal for the ligand is present in both maps, the density from MatchMaps is more clearly contoured to the high-resolution features of the ligand [Fig. 3(d)], whereas the isomorphous signal is weaker and less precisely located [Fig. 3(c)]. When viewing the density in the surrounding region at the same contour level (±2.5σ, positive as blue mesh, negative as red mesh), it is clear that the isomorphous map [Fig. 3(e)] is noisier than MatchMaps [Fig. 3(f)].
Additionally, these data illustrate how poor isomorphism can manifest as a strong translation artifact [Fig. 3(a)]. In this case, the main `interesting' difference between the high- and low-pressure structures is a slight overall constriction of the protein. This change can be visualized by examining the structural models following alignment [Fig. 3(b)]. Relative to the low-pressure model (red cartoon), the high-pressure model (blue cartoon) moves downward in the upper half of the protein and upward in the lower half of the protein. This total constriction is 0.77 Å, measured as the change in distance between the Cα of residues 25 and 69. However, this subtle change is obscured when viewing the original unaligned coordinates from each structure [Fig. 3(a)]. The high-pressure model (gray cartoon) differs from the low-pressure model (red cartoon), not only by a slight constriction but also by a larger (1.48 Å) lateral translation.
By construction, isomorphous difference maps are susceptible to the translation artifact described here, whereas MatchMaps is not. This effect is visible throughout the isomorphous difference map, which is dominated by this artifact. As an example, we show the difference densities around the disulfide bond between Cys64 and Cys80. The positive (blue) and negative (red) signal in the isomorphous difference map [Fig. 3(g)] corresponds to the original unaligned coordinates from the low-pressure (red) and high-pressure (gray) models. In contrast, the positive and negative signal from MatchMaps [Fig. 3(h)] corresponds to the slight shift between the low-pressure model (red) and the high-pressure model (blue) following alignment to the low-pressure model.
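Why a translation of this size dominates an isomorphous difference map follows from the Fourier shift theorem: a rigid translation t leaves every amplitude unchanged but rotates the phase of reflection h by 2πht/L. The sketch below checks this for a hypothetical 1D arrangement of atoms. It assumes, for simplicity, that the 1.48 Å translation lies along a single cell axis and that both states share the 79.197 Å ambient-pressure cell; the atom positions are made up.

```python
import cmath
import math

CELL = 79.197       # cell edge in angstroms (ambient-pressure value from the text)
TRANSLATION = 1.48  # rigid translation in angstroms (value from the text)

# Hypothetical atom positions along the axis, in angstroms.
ATOMS = [5.0, 11.3, 17.8, 26.4, 31.9]


def structure_factor(h, atoms, cell=CELL):
    """Sum of unit-weight atomic terms exp(2*pi*i*h*x/L)."""
    return sum(cmath.exp(2j * math.pi * h * x / cell) for x in atoms)


def expected_phase_shift(h, translation=TRANSLATION, cell=CELL):
    """Phase change (radians) imposed on reflection h by a rigid translation."""
    return 2 * math.pi * h * translation / cell


shifted = [x + TRANSLATION for x in ATOMS]
for h in (1, 5, 10, 20):
    f_before = structure_factor(h, ATOMS)
    f_after = structure_factor(h, shifted)
    shift_deg = math.degrees(expected_phase_shift(h))
    print(f"h={h:2d}  d={CELL / h:6.2f} A  "
          f"|F| {abs(f_before):.3f} -> {abs(f_after):.3f}  "
          f"phase shift {shift_deg:6.1f} deg")
```

Under these assumptions the amplitudes are identical before and after the translation, so the entire effect sits in the phases, and the phase error grows linearly with h. Reusing the OFF phases for the ON amplitudes therefore becomes progressively less appropriate at higher resolution, which is the error that rigid-body refinement against the ON data removes.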
4.2.1. MatchMaps is not susceptible to modeling errors
The high-pressure data set again illustrates how modeling errors (differences between the OFF model and OFF data) will appear in an Fo–Fc map derived from rigid-body refinement of the OFF model against the ON data. To illustrate this, we erroneously omitted a bound sodium ion. As expected, the Fo–Fc map [Fig. 3(i)] shows strong positive (blue mesh) signal around the omitted sodium ion (purple sphere). Importantly, although this signal corresponds to a modeling error, it is indistinguishable from `real' signal, i.e. a situation wherein the ion were present in the high-pressure structure but not the low-pressure structure. MatchMaps [Fig. 3(j)] does not display any signal for this ion, which is the desired behavior. Omitting the sodium ion from the OFF model has no significant effect on the MatchMaps signals described above.
4.3. MatchMaps for isomorphous PTP1B data with a rotation artifact
The enzyme protein tyrosine phosphatase 1B (PTP1B) plays a key role in insulin signaling (Elchebly et al., 1999), making it a long-standing target for the treatment of diabetes using ortho- and allosteric drugs (Wiesmann et al., 2004; Keedy et al., 2018; Choy et al., 2017). For illustration, we compare recent high-quality room-temperature structures of the apo protein (PDB ID 7rin) with the protein bound to the competitive inhibitor TCS401 (PDB ID 7mm1) (Greisman et al., 2022). In addition to the presence/absence of signal for the ligand itself, the apo structure exhibits an equilibrium between `open' and `closed' active-site loops (Whittier et al., 2013), whereas the bound structure shows only the closed loop.
The data sets 7rin and 7mm1 are sufficiently isomorphous that an isomorphous difference map reveals the main structural changes. MatchMaps performs similarly. Strong positive difference density (blue mesh) is seen for the TCS401 ligand (gray sticks) in both the isomorphous difference map [Fig. 4(c)] and the MatchMaps difference map [Fig. 4(d)]. Around residues 180–182 of the active-site loop (known as the WPD loop), both the isomorphous difference map [Fig. 4(e)] and the MatchMaps difference map [Fig. 4(f)] show strong signal for a decrease in occupancy (red mesh) of the open-loop conformation (red sticks) and an increase in occupancy (blue mesh) of the closed-loop conformation (blue sticks).
However, even in this seemingly straightforward case, we find that the isomorphous difference map is susceptible to an artifact resulting from a slight (1.37°) rotation of the protein. The displacement between the original refined structural coordinates of each structure is especially strong around residues 22–25 [Fig. 4(a), boxed region; Fig. 4(g), apo model in gray, bound model in blue]. In this region, an isomorphous difference map picks up on this artifactual difference between the data sets and displays strong difference signal (blue and red mesh). Remarkably, this signal is similar in magnitude to the `true' signal seen in Figs. 4(c) and 4(e). In contrast, MatchMaps internally aligns the data before subtraction of electron density. Fig. 4(b) (boxed region) and Fig. 4(h) (apo model in red, bound model in blue) show residues 22–25 following whole-molecule alignment of the protein models. Following global alignment of the refined models, it is clear that this region does not contain significant `interesting' signal. Sure enough, the MatchMaps difference map contains no strong signal in this region. In fact, the faint signal that persists in the MatchMaps map for this region seems to suggest a slight remaining coordinate displacement in this region following whole-molecule alignment.
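A rotation of 1.37° sounds negligible, but the displacement it produces grows with distance from the rotation axis. The short calculation below makes this concrete. The rotation angle is taken from the text; the radii are hypothetical, since the position of the rotation axis relative to residues 22–25 is not stated here.

```python
import math

ROTATION_DEG = 1.37  # rigid rotation between the two models (from the text)


def displacement(radius, rotation_deg=ROTATION_DEG):
    """Chord length (angstroms) moved by a point at `radius` from the rotation axis."""
    return 2.0 * radius * math.sin(math.radians(rotation_deg) / 2.0)


for radius in (5.0, 15.0, 25.0):  # hypothetical distances from the axis, in angstroms
    print(f"r = {radius:4.1f} A -> displacement {displacement(radius):.2f} A")
```

For atoms a few tens of ångströms from the axis, the displacement reaches several tenths of an ångström, comparable to the subtle conformational shifts that difference maps are meant to detect. This is consistent with the artifact being strongest in one peripheral region of the protein rather than uniform throughout.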
4.4. matchmaps.mr for DHFR data from different space groups
For many protein systems, careful analysis of electron-density change is stymied for pairs of similar structures which crystallize in different crystal forms. The MatchMaps algorithm can be further generalized to allow comparison of data sets in entirely different crystal packings or space groups. Specifically, the OFF model can serve as a search model for molecular replacement for the ON data. Following this extra step, the algorithm proceeds identically. We implement this modified algorithm in the command-line utility matchmaps.mr.
One such example is the enzyme DHFR, which has been crystallized in many space groups (Sawaya & Kraut, 1997). Here, we examine two structures of the enzyme bound to NADP+, in space groups P212121 (PDB ID 1rx1) and C2 (PDB ID 1ra1), visualized in Fig. 5(a). These structures are similar overall but differ in the active site [Figs. 5(b)–5(d)]. Here, we visualize these structural changes directly in electron density without introducing model bias.
Specifically, in the P212121 structure, the active-site Met20 loop adopts a closed conformation. In the C2 structure, the Met20 loop adopts an `open' conformation, which is stabilized by a crystal contact in this crystal form (Sawaya & Kraut, 1997). The difference between the open and closed loops is exemplified by residues 17–24 [Fig. 5(c)]. The open loop is stabilized by the formation of a key hydrogen bond between the Asn23 backbone and the Ser148 side chain. In the closed conformation, Asn23 is too far from Ser148 to form a hydrogen bond [Fig. 5(d)].
Remarkably, the positive difference density (blue) for the open loop is strong and readily interpretable in Figs. 5(c)–5(d). The MatchMaps map was computed using only the P212121 (red) closed-loop model. This means that the signal for the open-loop conformation is derived only from the observed amplitudes for the open-loop state in an unrelated crystal form.
4.5. matchmaps.ncs for NCS-related molecules of PDZ
The real-space portion of the MatchMaps algorithm can be repurposed to create `internal' difference maps across non-crystallographic symmetry (NCS) operations. We implement this modified algorithm in the command-line utility matchmaps.ncs. As an example, we examined the crystal structure of the fifth PDZ domain (PDZ5) from the Drosophila protein Inactivation, no after-potential D (INAD). This domain plays an essential role in terminating the response of photoreceptors to absorbed photons by modulation of its ability to bind ligands (Mishra et al., 2007). In particular, the binding cleft of PDZ5 can be locked by formation of a disulfide bond between residues Cys606 and Cys645. PDZ5 was found to crystallize in a form with three molecules in the asymmetric unit [Fig. 6(a)] where each molecule adopts a different state. Specifically, chain C contains a disulfide bond between residues Cys606 and Cys645, whereas chain B does not. Chains B and C overlay well other than the disulfide bond region [Fig. 6(b)]. Chain A adopts a bound state by binding the C terminus of chain C (not shown). MatchMaps enables calculation of an internal difference map, yielding a clearly interpretable difference map for the formation of the disulfide bond [Fig. 6(c)].
5. Discussion
The isomorphous difference map has been a popular method for detecting conformational change for many years (Henderson & Moffat, 1971; Rould & Carter, 2003). However, we have shown above that the same inputs – one structural model and two sets of structure-factor amplitudes – can be combined to compute a difference map that shares the strengths of an isomorphous difference map while ameliorating a key weakness. Specifically, phases are highly sensitive not only to structural changes (`interesting' signal) but also to changes in unit-cell dimensions and model pose (`uninteresting' signal). The introduction of rigid-body refinement minimizes the contribution of this uninteresting signal to the final difference map. In Fig. 2, we illustrate a case where a loss of isomorphism significantly degrades the signal of an isomorphous difference map. In this case, MatchMaps is still able to recover the expected difference signal.
Changes in unit-cell volume frequently involve a disproportionate contribution from changes in solvent volume (Atakisi et al., 2018; Yamada et al., 2015), whereas the protein volume changes less. In such a situation, the protein location relative to the unit cell must change in some systematic way. This systematic change is an inherent part of the signal detected by an isomorphous difference map. We demonstrate in Figs. 3 and 4 that isomorphous difference maps are highly susceptible to translation and rotation artifacts, whereas MatchMaps, by virtue of construction, does not contain these artifacts. We emphasize that this problem with isomorphous difference maps is inherent and thus likely to be widespread.
In our experience, crystallographic perturbation experiments are often shelved due to changes in unit-cell constants. MatchMaps removes, in principle, the requirement for isomorphism and allows for the analysis of more crystallographic differences.
The computation of an isomorphous difference map is entirely incompatible with data from different crystal forms. The matchmaps.mr extension of MatchMaps allows for model-bias-free comparisons of electron densities regardless of crystal form, opening up a new world of structural comparisons. For instance, an isomorphous difference map cannot characterize the impacts of crystal packing. As shown above, MatchMaps can create such a map and thus allows enhanced understanding of the often subtle role of crystal packing on protein structure.
MatchMaps depends only on the common CCP4 and PHENIX crystallographic suites, along with various automatically installed pure-Python dependencies. MatchMaps runs in minutes on a modern laptop computer. The only required input files are a PDB or mmCIF file containing the protein model, two MTZ files containing amplitudes and uncertainties, and any ligand restraint files necessary for refinement. These are the same inputs as required for many common purposes (such as running phenix.refine) and would probably already be on hand. As outputs, MatchMaps produces real-space maps in the common MAP/CCP4/MRC format, which can be readily opened in molecular visualization software such as PyMOL (https://pymol.org/) or Coot (Emsley et al., 2010). For these reasons, MatchMaps should slot naturally into the crystallographer's workflow for analysis of related data sets. Additionally, MatchMaps is open source and can be easily modified for a new use case by an interested developer. The authors welcome issues and pull requests on GitHub for the continued improvement of the software.
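Because the output maps use the standard CCP4/MRC layout, they can also be inspected without any crystallographic software. The sketch below is not part of MatchMaps: it is a minimal header reader using only the Python standard library, and it assumes little-endian byte order without checking the machine stamp. The function name and the synthetic demonstration values are hypothetical.

```python
import struct

def read_ccp4_header(raw):
    """Parse a few fields from the 1024-byte header of a CCP4/MRC map.

    Assumes little-endian byte order; the machine stamp is not checked.
    """
    if len(raw) < 1024:
        raise ValueError("a CCP4/MRC header is 1024 bytes long")
    # Words 1-4: number of columns, rows, sections, and the data mode.
    nc, nr, ns, mode = struct.unpack_from("<4i", raw, 0)
    # Words 11-16: cell lengths (Å) and cell angles (degrees).
    cell = struct.unpack_from("<6f", raw, 40)
    # Word 53: the format identifier.
    if raw[208:212] != b"MAP ":
        raise ValueError("missing 'MAP ' identifier")
    return {"grid": (nc, nr, ns), "mode": mode, "cell": cell}

# Demonstration on a synthetic header, so no real map file is needed.
demo = bytearray(1024)
struct.pack_into("<4i", demo, 0, 48, 56, 64, 2)  # mode 2 is 32-bit float data
struct.pack_into("<6f", demo, 40, 34.0, 45.0, 98.0, 90.0, 90.0, 90.0)
demo[208:212] = b"MAP "
header = read_ccp4_header(bytes(demo))
print(header)
```

For a real file, the first 1024 bytes read in binary mode can be passed to the same function. For anything beyond a quick sanity check, a dedicated library such as gemmi is the better choice.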
Supporting information
Data used and scripts for analysis and for creating figures: https://doi.org/10.5281/zenodo.10452581
Acknowledgements
We thank Harrison Wang (Harvard University) for testing of the MatchMaps code. We thank Professor James Fraser (UCSF) and his laboratory for testing MatchMaps, suggesting improvements and commenting on the pre-print of this manuscript. We thank Marcin Wojdyr (Global Phasing Ltd) for assistance with the gemmi library.
Funding information
This work was supported by the NIH Director's New Innovator Award (award No. DP2-GM141000 to Doeke R. Hekstra).
References
Agirre, J., Atanasova, M., Bagdonas, H., Ballard, C. B., Baslé, A., Beilsten-Edmands, J., Borges, R. J., Brown, D. G., Burgos-Mármol, J. J., Berrisford, J. M., Bond, P. S., Caballero, I., Catapano, L., Chojnowski, G., Cook, A. G., Cowtan, K. D., Croll, T. I., Debreczeni, J. É., Devenish, N. E., Dodson, E. J., Drevon, T. R., Emsley, P., Evans, G., Evans, P. R., Fando, M., Foadi, J., Fuentes-Montero, L., Garman, E. F., Gerstel, M., Gildea, R. J., Hatti, K., Hekkelman, M. L., Heuser, P., Hoh, S. W., Hough, M. A., Jenkins, H. T., Jiménez, E., Joosten, R. P., Keegan, R. M., Keep, N., Krissinel, E. B., Kolenko, P., Kovalevskiy, O., Lamzin, V. S., Lawson, D. M., Lebedev, A. A., Leslie, A. G. W., Lohkamp, B., Long, F., Malý, M., McCoy, A. J., McNicholas, S. J., Medina, A., Millán, C., Murray, J. W., Murshudov, G. N., Nicholls, R. A., Noble, M. E. M., Oeffner, R., Pannu, N. S., Parkhurst, J. M., Pearce, N., Pereira, J., Perrakis, A., Powell, H. R., Read, R. J., Rigden, D. J., Rochira, W., Sammito, M., Sánchez Rodríguez, F., Sheldrick, G. M., Shelley, K. L., Simkovic, F., Simpkin, A. J., Skubak, P., Sobolev, E., Steiner, R. A., Stevenson, K., Tews, I., Thomas, J. M. H., Thorn, A., Valls, J. T., Uski, V., Usón, I., Vagin, A., Velankar, S., Vollmar, M., Walden, H., Waterman, D., Wilson, K. S., Winn, M. D., Winter, G., Wojdyr, M. & Yamashita, K. (2023). Acta Cryst. D79, 449–461.
Atakisi, H., Moreau, D. W. & Thorne, R. E. (2018). Acta Cryst. D74, 264–278.
Barstow, B., Ando, N., Kim, C. U. & Gruner, S. M. (2008). Proc. Natl Acad. Sci. USA, 105, 13362–13366.
Bhabha, G., Biel, J. T. & Fraser, J. S. (2015). Acc. Chem. Res. 48, 423–430.
Bhabha, G., Lee, J., Ekiert, D. C., Gam, J., Wilson, I. A., Dyson, H. J., Benkovic, S. J. & Wright, P. E. (2011). Science, 332, 234–238.
Boehr, D. D., McElheny, D., Dyson, H. J. & Wright, P. E. (2006). Science, 313, 1638–1642.
Brändén, G. & Neutze, R. (2021). Science, 373, eaba0954.
Brookner, D. E. & Hekstra, D. R. (2024). MatchMaps: Non-isomorphous Difference Maps for X-ray Crystallography (Supporting Data), https://zenodo.org/records/10452581.
Choy, M. S., Li, Y., Machado, L. E., Kunze, M. B., Connors, C. R., Wei, X., Lindorff-Larsen, K., Page, R. & Peti, W. (2017). Mol. Cell, 65, 644–658.e5.
Elchebly, M., Payette, P., Michaliszyn, E., Cromlish, W., Collins, S., Loy, A. L., Normandin, D., Cheng, A., Himms-Hagen, J., Chan, C. C., Ramachandran, C., Gresser, M. J., Tremblay, M. L. & Kennedy, B. P. (1999). Science, 283, 1544–1548.
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486–501.
Farley, C., Burks, G., Siegert, T. & Juers, D. H. (2014). Acta Cryst. D70, 2111–2124.
Fraser, J. S., van den Bedem, H., Samelson, A. J., Lang, P. T., Holton, J. M., Echols, N. & Alber, T. (2011). Proc. Natl Acad. Sci. USA, 108, 16247–16252.
Greisman, J. B., Dalton, K. M. & Hekstra, D. R. (2021). J. Appl. Cryst. 54, 1521–1529.
Greisman, J. B., Dalton, K. M., Sheehan, C. J., Klureza, M. A., Kurinov, I. & Hekstra, D. R. (2022). Acta Cryst. D78, 986–996.
Hekstra, D. R., White, K. I., Socolich, M. A., Henning, R. W., Šrajer, V. & Ranganathan, R. (2016). Nature, 540, 400–405.
Henderson, R. & Moffat, J. K. (1971). Acta Cryst. B27, 1414–1420.
Keedy, D. A., Hill, Z. B., Biel, J. T., Kang, E., Rettenmaier, T. J., Brandão-Neto, J., Pearce, N. M., von Delft, F., Wells, J. A. & Fraser, J. S. (2018). eLife, 7, e36307.
Lamb, A. L., Kappock, T. J. & Silvaggi, N. R. (2015). Biochim. Biophys. Acta, 1854, 258–268.
Liebschner, D., Afonine, P. V., Baker, M. L., Bunkóczi, G., Chen, V. B., Croll, T. I., Hintze, B., Hung, L.-W., Jain, S., McCoy, A. J., Moriarty, N. W., Oeffner, R. D., Poon, B. K., Prisant, M. G., Read, R. J., Richardson, J. S., Richardson, D. C., Sammito, M. D., Sobolev, O. V., Stockwell, D. H., Terwilliger, T. C., Urzhumtsev, A. G., Videau, L. L., Williams, C. J. & Adams, P. D. (2019). Acta Cryst. D75, 861–877.
Mishra, P., Socolich, M., Wall, M. A., Graves, J., Wang, Z. F. & Ranganathan, R. (2007). Cell, 131, 80–92.
Pearce, N. M., Krojer, T., Bradley, A. R., Collins, P., Nowak, R. P., Talon, R., Marsden, B. D., Kelm, S., Shi, J., Deane, C. M. & von Delft, F. (2017). Nat. Commun. 8, 15123.
Ravelli, R. B. & McSweeney, S. M. (2000). Structure, 8, 315–328.
Read, R. J. (1986). Acta Cryst. A42, 140–149.
Rould, M. A. & Carter, C. W. (2003). Methods Enzymol. 374, 145–163.
Sawaya, M. R. & Kraut, J. (1997). Biochemistry, 36, 586–603.
Whittier, S. K., Hengge, A. C. & Loria, J. P. (2013). Science, 341, 899–903.
Wiesmann, C., Barr, K. J., Kung, J., Zhu, J., Erlanson, D. A., Shen, W., Fahr, B. J., Zhong, M., Taylor, L., Randal, M., McDowell, R. S. & Hansen, S. K. (2004). Nat. Struct. Mol. Biol. 11, 730–737.
Wojdyr, M. (2022). J. Open Source Softw. 7, 4200.
Yamada, H., Nagae, T. & Watanabe, N. (2015). Acta Cryst. D71, 742–753.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.