Improving sampling of crystallographic disorder in ensemble refinement

Improvements to the ensemble refinement method are described and demonstrated. These improvements lead to more physically meaningful and interpretable macromolecular ensembles.


S1.1. ECHT Model Parameterisation
ECHT model characterisations were performed on the re-refined structures using the default parameters. ECHT outputs are included as part of the supplementary data (see availability).

S1.2.1. 1UOY
The model for ensemble refinement was prepared by refining the model from PDB_REDO with phenix.refine for 8 cycles using standard refinement options, with all atoms except water using anisotropic B-factors.

S1.2.2. 1YTT
The model for ensemble refinement was prepared by refining the model from PDB_REDO with phenix.refine for 8 cycles using the standard refinement options, with all atoms except water using anisotropic B-factors. Stable refinement required restraints on the Yb atoms to prevent their dissociation from the protein. Initial simulations were performed with a model in which the Yb atoms had full occupancy (1.0), and different restraints were tested. The best results were obtained using weight = 0.007 and slack = 0.6. The initial model was then re-refined with the occupancies of the Yb atoms also refined, and the refined values were written to the input file for ER. Refined occupancies for the Yb atoms were 1.00 (A1), 0.81 (A2), 0.9 (B1) and 0.8 (B2).
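The role of the weight and slack values can be illustrated with a minimal sketch. It assumes a flat-bottomed harmonic penalty, which is a common way to implement a positional restraint with slack; the exact functional form used by the ER program is not stated here, and the function name and coordinates below are hypothetical.

```python
import math

def flat_bottom_harmonic(position, reference, weight, slack):
    """Assumed restraint form: zero penalty within `slack` of the
    reference position, quadratic penalty beyond it."""
    distance = math.dist(position, reference)
    excess = max(0.0, distance - slack)
    return weight * excess ** 2

# Hypothetical coordinates, with the weight and slack quoted for the Yb atoms.
reference = (0.0, 0.0, 0.0)
inside = flat_bottom_harmonic((0.5, 0.0, 0.0), reference, weight=0.007, slack=0.6)
outside = flat_bottom_harmonic((1.6, 0.0, 0.0), reference, weight=0.007, slack=0.6)
print(inside, outside)
```

Under this assumed form, an atom that stays within the slack distance of its starting position is not penalised, so the restraint only acts once the atom begins to drift away from its site.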

S1.2.3. 3K0N
The model for ensemble refinement was prepared by refining the model from PDB_REDO with phenix.refine for 8 cycles using the standard refinement options, with all atoms except water using anisotropic B-factors.
For ECHT parameterisation, secondary structure assignment using the DSSP algorithm failed on this structure, so secondary structure definitions were taken from the annotated PDB structure and supplied manually to ECHT.
After initial ensemble refinement simulations, high positive peaks were identified in the difference map in the neighbourhood of the S atoms of some Met and Cys residues. These arose because the ER algorithm inserted water molecules into the space temporarily vacated by the protein residues. These residues are embedded in the core of the structure, so the waters cannot be ejected, leading to the observed difference density. To prevent the erroneous water placement, the S atoms in the identified cysteine and methionine residues were restrained to their initial positions using weight = 0.005 and slack = 1.0.
Acta Cryst. (2021). D77, https://doi.org/10.1107/S2059798321010044 Supporting information, sup-2

S1.2.4. 7K3T
The model for ensemble refinement was prepared by refining the model from PDB_REDO with phenix.refine using group occupancies for the bound DMSO molecules. This model was then resubmitted to the PDB_REDO server and the resulting model was used as input to ER.

S1.3. Ensemble Refinement Parameterisation
When DEN restraints are used, they are applied with a weight of 30 as per the recommended PHENIX defaults. When an ECHT disorder model is not supplied as input to the program, a pTLS value is provided instead. Parameter sweeps for the Ensemble Refinement parameters are shown in Table S1. For each ECHT disorder model, a grid search was performed over the set of TX and WX parameters; for pTLS models, a grid search was performed over the set of pTLS, TX and WX parameters. In both cases, the statistics and parameters presented for each data set are those of the parameter combination with the lowest R-free. Exact details of the parameter files and commands used can be found in the supplementary data (see availability).
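The selection procedure described above can be sketched as follows. This is a minimal illustration, not the scripts used for this work: the grid values are placeholders rather than the values in Table S1, and `toy_refinement` is a hypothetical stand-in that returns a synthetic score instead of running ensemble refinement and reading its R-free.

```python
from itertools import product

def grid_search(param_grid, run_refinement):
    """Evaluate every parameter combination and return
    (lowest R-free, parameters that produced it)."""
    names = list(param_grid)
    best_rfree, best_params = None, None
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        rfree = run_refinement(params)
        if best_rfree is None or rfree < best_rfree:
            best_rfree, best_params = rfree, params
    return best_rfree, best_params

def toy_refinement(params):
    # Hypothetical stand-in for one refinement run; the score is synthetic.
    return 0.2 + 0.01 * sum(abs(value - 1.0) for value in params.values())

# Placeholder grids: TX and WX only for ECHT input, plus pTLS otherwise.
echt_grid = {"tx": [0.5, 1.0, 2.0], "wx": [1.0, 2.0, 4.0]}
ptls_grid = {"ptls": [0.6, 0.8, 1.0], **echt_grid}

best_rfree_echt, best_params_echt = grid_search(echt_grid, toy_refinement)
best_rfree_ptls, best_params_ptls = grid_search(ptls_grid, toy_refinement)
print(best_params_echt, best_params_ptls)
```

Keeping the two grids separate mirrors the distinction in the text: runs with an ECHT disorder model sweep two parameters, while pTLS runs sweep three.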

Table S1
Ensemble Refinement Sweep Parameters. A grid search was performed using all combinations of the parameters below (including pTLS for pTLS runs, otherwise using the stated ECHT model as input). Parameters marked with an asterisk (*) were not run for 7K3T.