Protein Energy Landscapes Determined by 5-Dimensional Crystallography

Free energy landscapes decisively determine the progress of enzymatically catalyzed reactions [1]. Time-resolved macromolecular crystallography unifies transient-state kinetics with structure determination [2-4] because both can be determined from the same set of X-ray data. We demonstrate here how barriers of activation can be determined solely from five-dimensional crystallography [5]. Directly linking molecular structures with the barriers of activation between them provides insight into the structural nature of the barriers. We analyze comprehensive time series of crystallographic data at 14 different temperature settings and determine the entropy and enthalpy contributions to the barriers of activation. One hundred years after the discovery of X-ray scattering, we advance X-ray structure determination to a new frontier, the determination of energy landscapes.


Supplementary Material
The supplementary material reports details that the authors consider not vital for the main text but worth reporting. It is not meant to replace the main text but to extend it, and it is only meaningful in combination with the main text.

Crystals and Data Collection:
Typical crystal sizes used were 100 × 100 × 700 µm³. The laser light was focused at the crystal to a round focal spot of 200 µm, with a typical pulse energy density of about 4 mJ/mm². The reaction was followed by a series of Laue diffraction snapshots at various time delays between the ~5 ns pump (laser) and 100 ps probe (X-ray) pulses (see Tab. S1). Depending on the crystal size, three to seven pump-probe pulse sequences were accumulated prior to detector readout to obtain high-quality diffraction patterns. The waiting time between the pulse sequences, necessary for the dark state recovery, varied between 1 s at higher temperatures and around 20 s at the lowest temperatures.
Single 100 ps X-ray pulses were isolated as described (Graber et al., 2011). The X-ray beam was focused to a size of 60 µm vertically (v) and 90 µm horizontally (h), and each 100 ps pulse contained about 4 × 10¹⁰ photons in the hybrid mode of the APS storage ring or 10¹⁰ photons in the 24-bunch mode. At a particular crystal orientation, the X-rays probed only the crystal surface layer facing the laser (Graber et al., 2011). To position this layer precisely in the X-ray beam, a crystal edge scan was performed, in which a series of weak diffraction images was collected while the crystal edge was translated through the X-ray beam. This optimizes the overlap of the laser-excited volume with the X-ray probed volume.

Table S1. Statistics for selected data sets at all respective temperatures. Time delays shown are delays between the peak of the laser pulse and the rising edge of the X-ray pulse. Completeness of Laue data is calculated including single and deconvoluted harmonic reflections; R_merge is calculated from singlet intensities using multiple measurements and symmetry equivalents; both completeness and R_merge are given as examples for the dark data set; the last shell is from 1.9 to 1.8 Å; R_scale is calculated from amplitudes (F) after scaling the time-resolved structure factor amplitudes F_Δt to calculated dark F_D amplitudes (on the absolute scale); Δρ_min/σ_Δρ and Δρ_max/σ_Δρ are the most negative and most positive difference electron density features, in units of the sigma level, found in the difference map at a selected time point Δt. The largest features can be found at and near the sulfur atom of Cys69.

Preparation of Data Matrix A
The time series of the difference maps were analyzed at each temperature by Singular Value Decomposition (SVD) as described in the main text. A more extensive description can be found in the literature (Schmidt et al., 2003, Schmidt, 2008, Tripathi et al., 2012). To prepare data matrix A, a volume that covers an extended region of the chromophore pocket (21 amino acid residues) was masked out. The mask included Cys69, Tyr42, Ala44-Asp53, Asp65-Asp71, Phe96, Met100 and three water molecules close to the entrance of the chromophore pocket. The mask was further restricted to grid points whose difference density lies above +3σ or below -3σ in at least one of the difference maps of each time series. The difference electron density values within this mask were arranged in temporal order and subjected to SVD. This procedure ensures that only regions with strong signal enter the SVD. Regions with low signal contribute mainly noise to the analysis and were excluded in this way. After the SVD analysis (Eq. 2 in the main text), the spatial components, which are difference electron densities, are found in the significant left singular vectors (lSVs). The kinetics can be found in the corresponding significant right singular vectors (rSVs).
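The preparation of data matrix A described above can be sketched in a few lines of Python. This is only an illustration on synthetic data, not the authors' software: the function name `build_data_matrix`, the per-map definition of σ as the standard deviation of each map, and all numbers in the demo are assumptions made for the example.

```python
import numpy as np

def build_data_matrix(diff_maps, sigma_cut=3.0):
    """Arrange masked difference maps into data matrix A.

    diff_maps: array (n_times, n_grid), one flattened difference map per
    time point, in temporal order, already restricted to the pocket mask.
    Keeps only grid points at or beyond +/- sigma_cut * sigma in at least
    one map. Assumption: sigma is the standard deviation of each map.
    """
    diff_maps = np.asarray(diff_maps, dtype=float)
    sigma = diff_maps.std(axis=1, keepdims=True)
    strong = (np.abs(diff_maps) >= sigma_cut * sigma).any(axis=0)
    # Rows of A are grid points, columns are time points.
    return diff_maps[:, strong].T, strong

# Synthetic demo (hypothetical numbers): one feature decaying with 10 us.
rng = np.random.default_rng(0)
delays = np.logspace(-9, -2, 30)          # time delays in seconds
n_grid = 500
feature = np.zeros(n_grid)
feature[:20] = 10.0                       # 20 grid points carry signal
decay = np.exp(-delays / 1e-5)
diff_maps = np.outer(decay, feature) + rng.normal(0.0, 1.0, (delays.size, n_grid))

A, strong = build_data_matrix(diff_maps)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
# Columns of U: left singular vectors (spatial components).
# Rows of Vt: right singular vectors (time courses).
```

In this sketch a single singular value dominates because the synthetic data contain only one kinetic process; real time series would show several significant singular values.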

Details for Fitting a Chemical Kinetic Mechanism
Kinetic modeling is required to explain the time-dependent variations of the difference electron density values in terms of concentrations of the intermediates. All structures of the PYP photocycle intermediates in the time range analyzed are known (Borgstahl et al., 1995, Schmidt et al., 2004, Jung et al., 2013). Each structure represents a transient (=short kinetic mechanism (Steinfeld et al., 1985, Cornish-Bowden, 2012). These relaxation times are observed globally as kinetic processes in the rSVs. However, if relaxation times are similar, they appear as only one process. This is why only 4 processes are observed in our rSVs, although 5 intermediates contribute. The amplitudes are equivalent to (fractional) concentrations, and the relaxation times determine the variations of the concentrations with time. On the absolute scale, difference electron densities are directly proportional to concentrations (Eq. 4, main text). In certain instances (Schmidt, Nienhaus, et al., 2005) one can account for the electrons in a particular density feature and infer from this the occupancy, the fractional concentration, or even the concentration proper, of a molecule in the unit cell. In other instances the entire difference electron density of the whole unit cell can be fit by calculated difference electron density maps from chemically plausible structural models. This is in contrast to, for example, absorption spectroscopy, where there are linear factors between concentration and absorption, namely the absorption coefficients, which are all a priori unknown for the intermediate states (van Stokkum et al., 2004, Yeremenko et al., 2006, Khoroshyy et al., 2013). Initially determined absorption coefficients will yield self-consistent results, and may persist in the literature. This problem is considerably smaller in crystallography, because crystallographic data can always be represented on the absolute scale.
A chemical kinetic mechanism compatible with the X-ray data must generate time-dependent concentrations that are directly commensurable with the observed difference electron density values. The mechanism becomes testable and, in certain instances, can be excluded by posterior analysis (Schmidt et al., 2003, Schmidt, 2008). Still, a number of mechanisms can fit the data reasonably well, and consequently may be indistinguishable or degenerate (Schmidt et al., 2010). Here, however, we are not concerned with lifting this degeneracy but with extracting meaningful thermodynamic parameters within the constraints of a plausible candidate mechanism used previously with TRX data to extract intermediates (Jung et al., 2013).
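How a mechanism with first-order steps generates time-dependent concentrations can be illustrated with a small sketch. The three-state mechanism and its rate coefficients below are hypothetical and much simpler than the PYP mechanism; the helper names `rate_matrix` and `concentrations` are likewise made up for this example. The sketch solves dc/dt = K c by eigen-decomposition, so each concentration is a sum of exponentials whose relaxation times are the negative inverse eigenvalues of K.

```python
import numpy as np

def rate_matrix(n_states, rates):
    """Build K for dc/dt = K c from {(i, j): k}, meaning step i -> j."""
    K = np.zeros((n_states, n_states))
    for (i, j), k in rates.items():
        K[j, i] += k   # state j gains from state i
        K[i, i] -= k   # state i loses
    return K

def concentrations(K, c0, times):
    """c(t) = V exp(lambda t) V^-1 c0 (assumes K is diagonalizable)."""
    lam, V = np.linalg.eig(K)
    coeff = np.linalg.solve(V, c0)
    c = (V * coeff) @ np.exp(np.outer(lam, times))
    return c.real, lam.real

# Hypothetical mechanism: state 0 -> 1 -> 2, plus a slow bypass 0 -> 2.
# State 2 plays the role of the recovered dark state.
K = rate_matrix(3, {(0, 1): 1e6, (0, 2): 1e4, (1, 2): 1e3})
times = np.logspace(-8, -1, 50)
c, lam = concentrations(K, np.array([1.0, 0.0, 0.0]), times)

# Relaxation times from the non-zero eigenvalues.
tau = np.sort(-1.0 / lam[lam < -1e-9 * np.abs(lam).max()])
```

Here the bypass carries about 1% of the flux out of state 0 (1e4 versus 1e6 per second), which mirrors the kind of argument used in the text to decide whether a pathway can be ignored.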
Posterior analysis allows the refinement (Eq. 5, main text) of the microscopic rate coefficients of a mechanism (Schmidt et al., 2004, Jung et al., 2013). This approach has the potential to exclude reaction pathways. This is the case when the refinement yields rate coefficients that are so small that they may be ignored. An example is shown in Fig. 3 (main text), where the rate coefficient of the dashed pathway is less than 0.1% of k3 throughout and, therefore, less than 0.1% of the molecules react through this pathway. Within the scope of chemical kinetics, one can estimate the number of observables present in time-resolved X-ray data and compare these to the number of fit parameters in a mechanism such as the one used here. The time-dependent concentrations of each individual intermediate follow sums of exponentials (Steinfeld et al., 1985), which is reflected by the time-dependent difference The minimum number of 2N observables is faced with a number of free fit parameters in Eq. 5 (main text). The fit parameters are the microscopic rate coefficients of the mechanism, whose magnitudes determine both the relaxation times and the amplitudes of the kinetic phases. Up to -10 °C, rate coefficients k1 and k2 were free fit parameters. From 0 °C, k1 was fixed to 2 × 10⁹ 1/s, and changed to 3 × 10⁹ 1/s from 25 °C (Tab. S2). k2 was varied freely, because for a given k1, the magnitude of k2 accounts for the concentration of pR1 relative to ICT. Since the concentration of pR1 is directly observable in the difference maps, k2 can be determined. At lower temperatures pB2 is not observed, and 8 microscopic rate coefficients (k1 ... k8) are fit parameters in the mechanism that connects 5 intermediate states and the dark state. One additional fit parameter is the extent of reaction initiation, which is the scale factor sf in Eq. 5 (main text). Hence, there are 9 free parameters.
Since the time course contains 5 intermediate states, the (lowest) number of observables is 10 (see the argument above). A least-squares fit of the mechanism to the data is possible at all temperatures ≤ 40 °C. A ninth rate coefficient (a 10th fit parameter) spanning from ICT directly to pG can be included and its magnitude determined. This rate coefficient is smaller than 0.1% of k3, so the pathway is irrelevant within the constraints of the mechanism. Also, k6 is much smaller than k4 throughout (see Tab. S2). As a consequence, the fitted values of k6 fluctuate strongly and contribute little to the mechanism. This pathway can also be ignored.
Above 40 °C a truncated mechanism is used that starts from the pR states, since the earlier processes become successively inaccessible at these temperatures. This mechanism additionally includes pB2, because a second pB phase is identified in the rSVs. Eight observables are faced by 9 free parameters, because this time two scale factors that determine the initial extent of pR1 and pR2 have to be included in addition to the 7 rate coefficients. In its full extent, this mechanism is underdetermined and cannot be fit without additional constraints. Accordingly, we constrained the initial concentration of pR1 to 30% of pR1. In addition, we fixed k6 to 1% of k4. Both conditions are roughly observed at lower temperatures. With these constraints the fit becomes stable.
The temperature dependences of the individual rate coefficients were then fit up to 50 °C by the transition state equation (TST, Eq. 1, main text) to determine the entropy and enthalpy differences to the transition state. Above 50 °C, PYP starts to deviate from simple thermal activation by occupying more states, and rate coefficients extracted at these elevated temperatures were not used for the TST fit.
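A fit of this kind can be sketched as follows. Eq. 1 of the main text is not reproduced in this supplement, so the sketch assumes the standard Eyring form k = (k_B T / h) exp(ΔS/R) exp(-ΔH/(RT)) with a transmission coefficient of 1. The temperature grid, the synthetic barrier values and the function names are hypothetical. Because ln(k/T) is linear in 1/T in this form, a straight-line fit yields ΔH from the slope and ΔS from the intercept.

```python
import numpy as np

K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J s
R = 8.314462618         # gas constant, J/(mol K)

def eyring_rate(T, dH, dS):
    """Rate coefficient from the assumed Eyring form (dH in J/mol, dS in J/(mol K))."""
    return (K_B * T / H) * np.exp(dS / R) * np.exp(-dH / (R * T))

def fit_eyring(T, k):
    """Linear fit of ln(k/T) against 1/T; returns (dH, dS)."""
    slope, intercept = np.polyfit(1.0 / T, np.log(k / T), 1)
    dH = -slope * R
    dS = (intercept - np.log(K_B / H)) * R
    return dH, dS

# Synthetic check with hypothetical barrier parameters.
T = np.arange(-40.0, 51.0, 10.0) + 273.15     # kelvin
k_obs = eyring_rate(T, dH=50e3, dS=20.0)
dH_fit, dS_fit = fit_eyring(T, k_obs)
```

With measured rate coefficients, a weighted or non-linear fit may be preferable, since taking the logarithm changes the error distribution.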

Microspectrophotometry
Small PYP crystals were crushed between two cover slides, which were subsequently sealed with epoxy. The crystalline slurry was probed by a micro-spectrophotometer with a time resolution of 20 µs. The design of the micro-spectrophotometer will be reported elsewhere (Purwar et al., The temperature was controlled by an Oxford Cryojet HTII (Agilent Technologies) gas stream and determined by a calibrated diode. Difference spectra were generated by subtracting the absorption spectrum collected in the dark from those collected at the time delays. The time series of difference spectra (Fig. 6, main text) were analyzed by singular value decomposition in the wavelength range from 410 nm to 530 nm. Absorption values at longer wavelengths contribute only noise and nothing to the kinetics. Although the blue-shifted part of the spectra is not included, kinetic phases corresponding to pR and pB can be faithfully distinguished, since the absorption at each wavelength contains information about the kinetics, which is extracted by the SVD. Relaxation times were determined by fitting exponential functions to the right singular vectors, in the same way as for the time-resolved crystallographic experiments.

Laser pulses
The absorption coefficient of crystals is anisotropic (Ng et al., 1995). With the unpolarized laser light used in our experiments, we can assume that the absorption coefficient at the absorption maximum equals that in solution (45,500 cm² mmol⁻¹), although the maximum is slightly shifted (3 nm) in the crystal (Fig. S2). The wavelength of the laser used to initiate the reaction is 485 nm. At that wavelength the absorption coefficient is a factor of 10 smaller (see also Fig. S2). The PYP concentration is 96 mmol/L in the crystal. These crystals are exposed to laser light in a geometry shown in Fig. 4A (main text). The penetration depth can be defined as the thickness at which the absorbance reaches 1 a.u., which is about 23 µm for the values given above. The illuminated area is the laser focal spot size (200 µm) × the crystal diameter, which is about 120 µm. The fraction of the laser energy that strikes the crystal is therefore 0.2 mm × 0.12 mm × 4.0 mJ/mm² = 0.096 mJ. 90% of this is distributed over a volume of 0.2 × 0.12 × 0.023 mm³ = 5.5 × 10⁻¹⁰ L, which contains 5.3 × 10⁻⁸ mmol PYP. With a molecular weight of 14700 g/mol, this amount of PYP has a mass of 8 × 10⁻¹⁰ kg. If we assume that half of the laser light is dissipated as heat into the vibrational modes of motion and half of it is stored in an energy-rich chromophore configuration (Martin et al., 1983, van Brederode et al., 1995), we can estimate the adiabatic temperature rise from the fraction of the heat dissipated into the vibrational modes. With a heat capacity of a typical protein of 5 kJ kg⁻¹ K⁻¹ (Miyazaki et al., 2000), we would expect an increase of ΔT = (0.5 × 0.9 × 0.096 mJ) / (8 × 10⁻¹⁰ kg × 5 kJ kg⁻¹ K⁻¹) ≈ 11 K, which is about the typical temperature step of ~10 K we used. The energy stored in the chromophore configuration is gradually released. The total heat diffuses out of the illuminated and excited volume.
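The arithmetic of this estimate can be retraced step by step. The short script below uses only the numbers quoted in the text; the variable names are chosen for this illustration, and the penetration depth is computed from the Beer-Lambert relation under the assumption that "1 a.u." refers to a decadic absorbance of 1 (90% of the light absorbed).

```python
# Values quoted in the text
eps_max = 45500.0        # absorption coefficient at the maximum, cm^2/mmol
eps_485 = eps_max / 10   # at 485 nm, a factor of 10 smaller
conc = 96.0              # PYP concentration in the crystal, mmol/L
fluence = 4.0            # laser pulse energy density, mJ/mm^2
spot = 0.2               # laser focal spot size, mm
diameter = 0.12          # crystal diameter, mm
mol_weight = 14700.0     # g/mol
heat_capacity = 5000.0   # J/(kg K), typical protein

# Depth at which the absorbance A = eps * c * d reaches 1
conc_cm3 = conc * 1e-3                        # mmol/cm^3
depth_mm = 10.0 / (eps_485 * conc_cm3)        # cm -> mm

# Energy striking the crystal and the volume that absorbs 90% of it
energy_mJ = spot * diameter * fluence
volume_L = spot * diameter * depth_mm * 1e-6  # 1 mm^3 = 1e-6 L
amount_mmol = volume_L * conc
mass_kg = amount_mmol * 1e-3 * mol_weight * 1e-3

# Adiabatic temperature rise if half of the absorbed energy becomes heat
heat_J = 0.5 * 0.9 * energy_mJ * 1e-3
delta_T = heat_J / (mass_kg * heat_capacity)

print(f"penetration depth: {depth_mm * 1e3:.0f} um")
print(f"temperature rise:  {delta_T:.0f} K")
```

Note that, as in the text, only the mass of the protein is counted; including the solvent in the crystal would lower the estimated temperature rise.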
Since the penetration depth of the laser is much smaller than the crystal diameter, and the face of the crystal is in contact with the temperature-controlled capillary surface, we have a two-dimensional heat diffusion problem, with the heat escaping to the left and right into the crystal volume, into the capillary wall, and into the bulk of the crystal below the illuminated volume. The characteristic time for the heat to diffuse out of that volume follows from Carslaw & Jaeger (1959), with κ the thermal diffusivity and a and b the sides of the two-dimensional box shown in Fig. 2 (main text). With a = 0.2 mm, b = 0.023 mm, and assuming that κ of the protein crystal equals that of water, 0.143 mm²/s, we estimate a characteristic time of roughly 0.5 ms for the heat to diffuse out of the laser-illuminated volume. Moffat et al. (Moffat et al., 1992) estimate heat diffusion times that are much longer, based on calculations with larger illuminated volumes.

Figure S1. Temperature dependence of rate coefficients k5 and k7. Decay of pR1 to pB1 (k5) and pG (k7). Black spheres: rate coefficients, red line: fit by the transition state equation. Entropy and enthalpy of the barriers are listed. Inserts: corresponding Arrhenius plots.
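The order of magnitude of the heat diffusion time quoted in the Laser pulses section can be checked with a short calculation. The expression used in the original is not available here, so the sketch below substitutes an assumed one: the decay time of the slowest Fourier mode in a rectangle of sides a and b whose boundaries are held at constant temperature, τ = 1 / (κ π² (1/a² + 1/b²)). This is a textbook result for the heat equation and may differ from the authors' formula by a numerical factor.

```python
import math

kappa = 0.143   # thermal diffusivity of water, mm^2/s (value from the text)
a = 0.2         # box size along the laser spot, mm
b = 0.023       # box size along the penetration depth, mm

# Assumed form: slowest-mode decay time for fixed-temperature boundaries.
tau = 1.0 / (kappa * math.pi ** 2 * (1.0 / a ** 2 + 1.0 / b ** 2))

print(f"characteristic time: {tau * 1e3:.2f} ms")
```

Under this assumption the result is a few tenths of a millisecond, the same order as the roughly 0.5 ms stated in the text, and it is dominated by the short dimension b.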