research papers
Pi sampling: a methodical and flexible approach to initial macromolecular crystallization screening
^{a}MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, England
^{*}Correspondence email: fgorrec@mrclmb.cam.ac.uk
The Pi sampling method is derived from the incomplete factorial approach to macromolecular crystallization screen design. The resulting 'Pi screens' have a modular distribution of a given set of up to 36 stock solutions. Maximally diverse conditions can be produced by taking into account the properties of the chemicals used in the formulation and the concentrations of the corresponding solutions. The Pi sampling method has been implemented in a web-based application that generates screen formulations and recipes. It is particularly adapted to screens consisting of 96 different conditions. The flexibility and efficiency of Pi sampling are demonstrated by the crystallization of soluble proteins and of an integral membrane-protein sample.
Keywords: macromolecular crystallization; initial screen formulation; incomplete factorial approach; modular distribution; membrane-protein crystallization; GPCR.
1. Introduction
A crucial aspect of macromolecular crystallographic studies is finding suitable conditions for the crystallization of a sample. This can be difficult because many factors alter the crystallization behaviour of macromolecules, including the type and the concentration of the chemicals employed to formulate the conditions (McPherson, 1990). A condition includes at least a precipitant and most conditions also include a buffer and an additive. During the initial crystallization experiments, the structure of the macromolecule is not known and hence the most efficient formulation cannot be predicted. As a consequence, one should be cautious when making initial assumptions and limiting choices in subsequent optimizations (Rupp, 2003). Nonetheless, the number of initial crystallization conditions cannot be unreasonably large since purified protein is often difficult and expensive to produce in large quantities.
There are essentially two approaches to restrict an initial screen to a limited number of crystallization conditions. Firstly, a sparse-matrix formulation can be used, which consists of an empirically derived combination of components based on known or published crystallization conditions (Jancarik & Kim, 1991). Secondly, an incomplete factorial formulation can be generated in which selected components are combined to prepare new conditions in accordance with principles of randomization and balance (Carter & Carter, 1979). Numerous commercial screens based on these two main approaches are available. Automated systems have been implemented at the Medical Research Council (MRC) Laboratory of Molecular Biology (LMB) to test these as routine initial screens using the 96-well crystallization plate format (Stock et al., 2005). However, for various reasons, many laboratories opt for a minimal screen (Kimber et al., 2003) and still perform at least some aspects of the work manually (Bergfors, 2007).
Here, we present a development based on the incomplete factorial formulation: the Pi sampling method. The name of the method was inspired by the story of Archimedes, who used the 'method of exhaustion' (i.e. an empirical approach) with a 96-sided polygon in order to reach the first good numerical approximation of π (Smith, 1958). Pi sampling uses modular arithmetic to form combinations of three stock solutions across a 96-condition grid. Maximally diverse conditions can be produced by taking into account the properties of the chemicals used in the formulation and the concentrations of the corresponding stock solutions. We have implemented this approach in a web-based application called Pi Sampler: user input consists of the details of up to 36 stock solutions, from which the application generates the formulations for a 96-condition screen. The Pi sampling method is intended to help laboratories to test new crystallization-screen formulations on a day-to-day basis based on the properties of the macromolecules investigated, as has been performed previously with RNA (Doudna et al., 1993).
Firstly, we tested Pi sampling with ten commercially available soluble proteins. For this, the 'Pi minimal screen' was employed, which includes a wide variety of well known chemicals frequently used for macromolecular crystallization.
We then investigated the impact of Pi sampling on the crystallization of a G-protein-coupled receptor (GPCR) that had been difficult to crystallize: the adenosine A_{2A} receptor (construct A_{2A}R-GL31). We formulated another Pi screen, the 'Pi-PEG screen', taking into consideration general observations made about crystallization of integral membrane-protein samples. Previous crystallization experiments on another GPCR (the β_{1}-adrenergic receptor) had indicated that the use of simple proprietary screens formulated with poly(ethylene glycol) (PEG) and buffers gave a greater yield of crystals than all commercially available screens, including those geared towards membrane proteins (Warne et al., 2009), and the 2.7 Å resolution structure was solved using conditions optimized from a proprietary screen essentially based on PEGs (Warne et al., 2008). This has been observed previously with other membrane-protein targets (Lemieux et al., 2003). In addition, mixtures of PEGs have been used successfully to develop a minimal screen (Brzozowski & Walton, 2001) and to study crystal structures of the Kir potassium channel (Clarke et al., 2010). Such mixtures were incorporated into the Pi-PEG screen.
2. Methods
2.1. Pi sampling
Pi sampling begins with up to 36 stock solutions, divided into three sets of 12. The first set of solutions is used in the screen at constant concentration. The second and third sets are added according to a gradient between specified minimum and maximum concentrations. Typically, the first set is composed of buffers and the second and third sets are precipitants/additives.
The combinations of three stock solutions (one from each set) are generated according to Fig. 1, where 1–12 refer to the IDs for solutions of the first set, A–L to those of the second set and M–X to those of the third set. The number in each cell shows which solution of the first set will be combined with the corresponding solutions of the second and third sets. Blank spaces show when no such combinations are generated.
Fig. 2 summarizes the distribution of the stock solutions in a standard 96-condition plate layout (i.e. 12 columns and eight rows).
2.2. Pi Sampler
Pi Sampler can be accessed via the internet at https://pisampler.mrclmb.cam.ac.uk/ . Users can enter the details of up to 36 stock solutions, including stock concentrations, desired screen concentration ranges and Δ values. The application then generates a 96-condition screen formulation following the Pi sampling method described above. Formulations, recipes and total required volumes of stock solutions are presented and may conveniently be downloaded in comma-separated variable format (CSV), allowing the user to import them into other software for automated screen making (Cox & Weber, 1987), formulation analysis (Hedderich et al., 2011) and data mining (Kantardjieff & Rupp, 2004). The parameters used to generate the screen can also be saved and uploaded in the same format. Further details and instructions can be found on the website.
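In its simplest form, turning a formulation into a recipe is a dilution calculation (C1V1 = C2V2). The following Python sketch illustrates that arithmetic only. It is not the Pi Sampler code, and the function name `stock_volume_ul`, the stock and target concentrations and the 1 ml condition volume are all hypothetical values chosen for illustration.

```python
def stock_volume_ul(final_conc: float, stock_conc: float, final_volume_ul: float) -> float:
    """Volume of stock (in microlitres) needed to reach final_conc in final_volume_ul.

    Uses C1*V1 = C2*V2; both concentrations must be in the same unit.
    """
    if final_conc > stock_conc:
        raise ValueError("final concentration exceeds stock concentration")
    return final_conc / stock_conc * final_volume_ul


# Hypothetical condition (made-up numbers): (target concentration, stock concentration).
final_volume_ul = 1000.0
components = {
    "buffer, M": (0.1, 1.0),
    "precipitant, %": (20.0, 50.0),
    "additive, M": (0.2, 2.0),
}

recipe = {
    name: stock_volume_ul(target, stock, final_volume_ul)
    for name, (target, stock) in components.items()
}

# Whatever volume is left is made up with water; a negative value would mean
# that the stocks are too dilute for the requested condition.
water_ul = final_volume_ul - sum(recipe.values())
if water_ul < 0:
    raise ValueError("stock solutions are too dilute for this condition")

for name, volume in recipe.items():
    print(f"{name}: {volume:.1f} ul")
print(f"water: {water_ul:.1f} ul")
```

Summing such volumes per stock solution over all 96 conditions would give the total stock volumes required for a screen.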
2.3. Pi minimal screen preparation and crystallization assays with commercially available soluble proteins
The final formulation of the Pi minimal screen can be found in Table 1. There are 36 starting stock solutions overall. Each solution composing the first set (ID 1–12) is a mixture of an acid with its corresponding base (e.g. HEPES pH 7.5: 1 M HEPES solution mixed with 1 M HEPES sodium salt in order to reach pH 7.5), except for buffer 11 (AMPD mixed with Tris base). Note that this is also true for the precipitant phosphate (phosphate system: sodium dihydrogen phosphate/dipotassium hydrogen phosphate). Values of pH (4.0–9.5) were chosen as the variable Δ for the first set, whilst arbitrary values were chosen for additives of various natures composing the second set (ID A–L). Eventually, a few conditions were made without additive/buffer because of chemical incompatibilities (Table 1).

Highest purity grade chemicals (Molecular Biology grade when available) were purchased from Sigma–Aldrich to prepare 36 stock solutions. The solutions were mixed in 96 Falcon tubes. The screen was dispensed into 'MRC original plates' (96-well, two-drop, Swissci; Stock et al., 2005).
Commercial proteins that had been crystallized before were chosen to prepare test samples. Protein concentrations were chosen randomly between 7 and 150 mg ml^{−1} (Table 2). Vapour-diffusion experiments were set up at 295 K, mixing two different sample:condition ratios (1:3 and 3:1) to give a final volume of 400 nl. The plates were then stored at 291 K. A condition was considered to be a hit when at least one of the two corresponding drops contained crystals with well known morphology after one week. Table 3 shows the 'hits per condition' observed and the corresponding results expected for the binomial distribution (see §4.2).


2.4. Pi-PEG screen preparation and crystallization assays with a GPCR
The final formulation of the Pi-PEG screen can be found in Table 4. The formulation can also be generated using Pi Sampler by loading the Pi-PEG example data. The pH values (4.8–8.8) were chosen as the variable Δ for the buffers composing set 1 (ID 1–12), whilst molecular weight was chosen for set 2 (PEGs A–L, final concentration range 0–22.5%). The same 12 PEGs were used for set 3 (PEGs M–X, final concentration range 0–45%). General details of the preparation are similar to §2.3, but there are 24 stock solutions at the start (instead of 36). Vapour-diffusion experiments were set up at 277 K, mixing sample and condition in a 1:1 ratio to give a final volume of 200 nl. The preparation of A_{2A}R-GL31 will be published elsewhere (Lebon et al., submitted work). Crystal X-ray screening was performed at the Diamond synchrotron light source (microfocus beamline I24 equipped with a Pilatus 6M detector).

3. Results
There were 116 crystallization hits overall for the experiments with the Pi minimal screen (Table 2). Some conditions produced hits for several samples (Table 3).
The Pi-PEG screen yielded crystals that diffracted to 3.0 Å resolution for A_{2A}R-GL31 with bound agonist. Fig. 3 shows the crystals of A_{2A}R-GL31 obtained in well E9 [50 mM Tris–HCl pH 7.6, 9.6%(v/v) PEG 200, 22.9%(v/v) PEG 300] and an example of the corresponding diffraction pattern (no cryoprotectant was required).
4. Discussion
4.1. Pi sampling
In order to understand the rationale behind the modular arithmetic employed for the Pi sampling, it may help to imagine, on a 12 h clock, a series of events occurring every 5 h. The first event is at noon, the second at 5 pm, then 10 pm, then 3 am etc. Eventually, there is a succession of 12 events occurring at different hours, with as much time as possible in between each event. If we now look at combinations of three components, there are originally 12^{3} or 1728 possibilities. Pi Sampler generates 96 of these combinations that correspond to conditions that are distant in properties. The variety between conditions is then accentuated using a number of different concentrations of solutions (Fig. 2). If the first and second sets of solutions are ordered according to physicochemical properties, the generated screen will be an incomplete factorial sampling of interactions between chemicals with these properties. If the chemicals selected have completely different natures, they can be arranged randomly (see §2.3). The ordering of the third set of solutions can be used to avoid obvious chemical incompatibilities (e.g. mixing phosphate and magnesium salts). It is also possible to design simpler screens with only two sets of stock solutions.
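The clock analogy can be checked with a few lines of Python. This sketch reproduces only the arithmetic of the analogy (a step of 5 on a 12-position cycle). It is not the combination scheme of Fig. 1 or the Pi Sampler code, and the function name `clock_sequence` is illustrative.

```python
from math import gcd


def clock_sequence(step: int = 5, modulus: int = 12) -> list[int]:
    """Positions visited when advancing `step` hours at a time on a `modulus`-hour clock."""
    return [(k * step) % modulus for k in range(modulus)]


hours = clock_sequence()
print(hours)  # [0, 5, 10, 3, 8, 1, 6, 11, 4, 9, 2, 7]  (0 stands for 12 o'clock)

# Because 5 and 12 share no common factor, each hour is visited exactly once.
assert gcd(5, 12) == 1
assert sorted(hours) == list(range(12))

# Size of the full factorial for three sets of 12 solutions.
full_factorial = 12 ** 3
print(full_factorial)  # 1728
```

A step that shares a factor with 12 (for example 4) would revisit the same few hours, which is why a step coprime to the cycle length is needed to cover all 12 positions.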
4.2. The Pi minimal screen
In order to check the distribution of the hits across the screen with the ten samples, we compared the results obtained with what would be expected if each condition had the same probability of hits overall (Table 3). This can be approximated by a binomial distribution. The probability of success for the binomial distribution is the observed probability for ten attempts: 116/(10 × 96) = 0.12083. The χ^{2} statistic for the data is 3.48. This can be compared with the quantiles of a χ^{2} distribution with two degrees of freedom, which gives a p value of 0.18 (calculations not shown). This χ^{2} test indicates that no conditions are obvious outliers with regard to success or failure. There are, however, a multitude of possible biases implied when proceeding with crystallization experiments (which would be even more accentuated with the use of novel samples); hence, any statistical analysis should be treated with caution. Nonetheless, it is interesting to see that the analysis of the distribution is in accordance with the original approach based on balanced randomization (Carter & Carter, 1979; Rupp, 2003).
In addition, the conditions of the Pi minimal screen show no identities to the extensive list of conditions (7230) from commercial screens stored in the 'PICKScreens' database (Hedderich et al., 2011).
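The figures quoted in the statistical comparison of this section can be reproduced with a short Python sketch. It assumes only the numbers given in the text (116 hits, ten samples, 96 conditions, χ^{2} = 3.48 with two degrees of freedom). The grouping of hit counts used to obtain the χ^{2} statistic is not given here, so the statistic itself is taken as reported rather than recomputed. The function names are illustrative.

```python
from math import comb, exp

N_SAMPLES = 10      # proteins tested
N_CONDITIONS = 96   # conditions in the screen
TOTAL_HITS = 116    # hits reported in the Results

# Observed probability of a hit for one sample in one condition.
p_hit = TOTAL_HITS / (N_SAMPLES * N_CONDITIONS)


def expected_conditions(k: int) -> float:
    """Expected number of conditions with exactly k hits under a binomial model."""
    prob = comb(N_SAMPLES, k) * p_hit**k * (1 - p_hit) ** (N_SAMPLES - k)
    return N_CONDITIONS * prob


def chi2_upper_tail_2dof(x: float) -> float:
    """Upper-tail probability of a chi-squared variable with two degrees of freedom.

    For two degrees of freedom this has the closed form exp(-x / 2).
    """
    return exp(-x / 2)


print(round(p_hit, 5))                       # 0.12083
print(round(chi2_upper_tail_2dof(3.48), 2))  # 0.18

# Expected number of conditions giving 0, 1, 2, ... hits.
for k in range(N_SAMPLES + 1):
    print(k, round(expected_conditions(k), 2))
```

The closed form is specific to two degrees of freedom; other values would require a general χ^{2} distribution function from a statistics library.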
4.3. The Pi-PEG screen
The extent of effects on crystallization for precipitants such as PEGs is correlated with their concentrations (McPherson, 1976) and molecular weights (Forsythe et al., 2002). The Pi-PEG screen covers a wide range of parameters (kinetics of equilibrium, protein stabilization etc.). In addition, the concentrations of the two different PEGs in a condition can be adjusted for condition optimization (Stock et al., 2005) and for crystal cryoprotection (Berejnov et al., 2006). Furthermore, the PICKScreens database shows that the Pi-PEG screen is unique (as for the Pi minimal screen; see §4.2).
Samples of A_{2A}R-GL31 purified in a number of different detergents rarely crystallized in commercially available screens used at the LMB (Stock et al., 2005), and when they did the crystal quality was insufficient. The first quality crystals were recently obtained using the Pi-PEG screen.
5. Conclusions
We have demonstrated that Pi sampling is a methodical and flexible approach to initial screening for macromolecular crystallization. Two unique screens produced de novo have resulted from this strategy. The Pi minimal screen potentially has an ideal formulation for crystallization of novel soluble protein samples. The Pi-PEG screen is a tailor-made screen for GPCRs and potentially other membrane proteins generated by biasing the formulation towards components known to be essential.
Further screens can be formulated with the Pi Sampler on a day-to-day basis in order to test chemicals and techniques, with the aim of increasing the yield of quality crystals. Also, new crystallization techniques are constantly emerging for macromolecular targets such as membrane proteins and hence formulations with special considerations are required: one may want to formulate screens compatible with the lipidic cubic phase (LCP) concept (Landau & Rosenbusch, 1996) or make extensive use of detergents (Koszelak-Rosenblum et al., 2009).
In order for laboratories to be able to handle many Pi screen formulations and the flow of resulting data, we are working on the integration of Pi Sampler into the 'xtalPiMS' Laboratory Information Management System (LIMS; Morris et al., 2011; see https://www.pimslims.org ).
Acknowledgements
Thanks to Simon Byrne (Cambridge University Statistics Clinic; https://www.statslab.cam.ac.uk/clinic/ ) for discussions. Thanks to the LMB members Jan Löwe, John Kendrick-Jones, Christopher Aylett, Chris Tate, Jake Grimmett and Graham Lingley for various contributions. Finally, thanks to Karen Law (MRC Technology), Chris Morris (STFC, funded by CCP4) and Tanja Hedderich (Max Planck Institute). Conflicting commercial interest: we hereby state that we have a conflicting commercial interest in that MRC Technology (https://www.mrctechnology.org/ ) will commercialize Pi screens under an exclusive licence to Jena Bioscience (https://www.jenabioscience.com/ ).
References
Berejnov, V., Husseini, N. S., Alsaied, O. A. & Thorne, R. E. (2006). J. Appl. Cryst. 39, 244–251.
Bergfors, T. (2007). Methods Mol. Biol. 363, 131–151.
Brzozowski, A. M. & Walton, J. (2001). J. Appl. Cryst. 34, 97–101.
Carter, C. W. & Carter, C. W. (1979). J. Biol. Chem. 254, 12219–12223.
Clarke, O. B., Caputo, A. T., Hill, A. P., Vandenberg, J. I., Smith, B. J. & Gulbis, J. M. (2010). Cell, 141, 1018–1029.
Cox, M. J. & Weber, P. C. (1987). J. Appl. Cryst. 20, 366–373.
Doudna, J. A., Grosshans, C., Gooding, A. & Kundrot, C. E. (1993). Proc. Natl Acad. Sci. USA, 90, 7829–7833.
Forsythe, E. L., Maxwell, D. L. & Pusey, M. (2002). Acta Cryst. D58, 1601–1605.
Hedderich, T., Marcia, M., Köpke, J. & Michel, H. (2011). Cryst. Growth Des. 11, 488–491.
Jancarik, J. & Kim, S.-H. (1991). J. Appl. Cryst. 24, 409–411.
Kantardjieff, K. A. & Rupp, B. (2004). Bioinformatics, 20, 2162–2168.
Kimber, M. S., Vallee, F., Houston, S., Necakov, A., Skarina, T., Evdokimova, E., Beasley, S., Christendat, D., Savchenko, A., Arrowsmith, C. H., Vedadi, M., Gerstein, M. & Edwards, A. M. (2003). Proteins, 51, 562–568.
Koszelak-Rosenblum, M., Krol, A., Mozumdar, N., Wunsch, K., Ferin, A., Cook, E., Veatch, C. K., Nagel, R., Luft, J. R., Detitta, G. T. & Malkowski, M. G. (2009). Protein Sci. 18, 1828–1839.
Landau, E. M. & Rosenbusch, J. P. (1996). Proc. Natl Acad. Sci. USA, 93, 14532–14535.
Lemieux, M. J., Song, J., Kim, M. J., Huang, Y., Villa, A., Auer, M., Li, X.-D. & Wang, D.-N. (2003). Protein Sci. 12, 2748–2756.
McPherson, A. (1976). J. Biol. Chem. 251, 6300–6303.
McPherson, A. (1990). Eur. J. Biochem. 189, 1–23.
Morris, C. et al. (2011). Acta Cryst. D67, 249–260.
Rupp, B. (2003). J. Struct. Biol. 142, 162–169.
Smith, D. E. (1958). History of Mathematics. New York: Dover Publications.
Stock, D., Perisic, O. & Löwe, J. (2005). Prog. Biophys. Mol. Biol. 88, 311–327.
Warne, T., Serrano-Vega, M. J., Baker, J. G., Moukhametzianov, R., Edwards, P. C., Henderson, R., Leslie, A. G., Tate, C. G. & Schertler, G. F. (2008). Nature (London), 454, 486–491.
Warne, T., Serrano-Vega, M. J., Tate, C. G. & Schertler, G. F. (2009). Protein Expr. Purif. 65, 204–213.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.