Structural Biology and Crystallization Communications
Structure of hyperthermophilic β-glucosidase from Pyrococcus furiosus

Three categories of cellulases, namely endoglucanases, cellobiohydrolases and β-glucosidases, are commonly used in the process of cellulose saccharification. In particular, the activity and characteristics of hyperthermophilic β-glucosidase make it promising for industrial applications of biomass. In this paper, the crystal structure of the hyperthermophilic β-glucosidase from Pyrococcus furiosus (BGLPf) was determined at 2.35 Å resolution in a new crystal form. The structure showed that there is one tetramer in the asymmetric unit and that the dimeric molecule exhibits a structure that is stable towards sodium dodecyl sulfate (SDS). The protein still migrated as a dimer in reducing SDS polyacrylamide gel electrophoresis (SDS-PAGE) even after heating at 368 K in loading buffer. Energy calculations showed that one of the two dimer interfaces has the larger solvation free-energy gain. Structural comparison and sequence alignment with mesophilic β-glucosidase A from Clostridium cellulovorans (BGLACc) revealed that the elongation at the C-terminal end forms a hydrophobic patch at the dimer interface that might contribute to hyperthermostability.


Introduction
Cellulosic materials constitute most of the biomass on Earth and are capable of being converted into bioethanol, a next-generation biofuel (Bayer & Lamed, 1992; Farrell et al., 2006; Joshi & Mansfield, 2007; Ragauskas et al., 2006). The process of bioethanol production from biomass requires the saccharification of cellulose in order to obtain fermentable sugars. In nature, cellulolytic microbes typically produce three categories of cellulases which convert cellulose into glucose: endoglucanases (EGs), cellobiohydrolases (CBHs) and β-glucosidases (BGLs) (Baldrian & Valášková, 2008; Stricker et al., 2008; Tomme et al., 1995). Cellulase systems using these three types of enzymes show potential for complete industrial-scale enzymatic saccharification of cellulose. In this setting, Trichoderma reesei has been considered to be a strongly cellulolytic and xylanolytic candidate microorganism. However, complete saccharification of cellulose is not accomplished by the cellulases isolated from T. reesei because its BGL exhibits low activity. To overcome this problem, BGL from Aspergillus aculeatus (BGLAa) has been used to increase the cellulase activity of T. reesei (Kawaguchi et al., 1996).
The hyperthermophilic β-glucosidase from Pyrococcus furiosus (BGLPf) belongs to the glycoside hydrolase 1 (GH1) family. The enzymes of this family form (β/α)₈ barrels and hydrolyze their substrate while retaining the configuration at the anomeric C atom. Two glutamate residues serve as the general acid/base and the nucleophile in the reaction. A site-directed mutagenesis approach revealed that the catalytic dyad of BGLPf, composed of Glu207 (acid/base) and Glu372 (nucleophile), hydrolyzes the β-1,4 bonds of its substrates (Voorhorst et al., 1995).
To date, a thermophilic cellulase system for the industrial conversion of biomass has not been developed. Nevertheless, enzymatic degradation of biomass at high temperature would provide obvious advantages, such as limiting bacterial contamination and increasing substrate solubility. Recently, an endocellulase (EGPh, family 5) from the hyperthermophilic archaeon P. horikoshii was identified and recombinant EGPh was successfully expressed using Escherichia coli (Ando et al., 2002; Kashima et al., 2005; Kim et al., 2007, 2008). EGPh exhibits progressive hydrolytic activity, releasing cellobiose after an initial endo-type attack on cellulose. Hyperthermophilic archaeal BGLs have also been isolated from P. horikoshii and P. furiosus (Lebbink et al., 2001; Matsui et al., 2000). BGL from P. horikoshii (BGLPh) exhibits specific activity towards cellobiose, but not towards other cellooligosaccharides (Matsui et al., 2000). Furthermore, the activity of BGLPh was only observed in the presence of detergents (Matsui et al., 2000). In contrast, BGL from P. furiosus (BGLPf) exhibits specific activity towards a wide range of substrates, but its highest hydrolytic activity is towards cellooligosaccharides at high temperature (Kaper et al., 2000; Bauer et al., 1996). The activity and substrate specificity of BGLPf (Kim & Ishikawa, 2010) make it a candidate enzyme for the saccharification of biomass.

To date, 31 structures in the GH1 family have been reported. The crystal structure of BGLPf has also been determined to a resolution of 3.3 Å (Kaper et al., 2000). However, a structural model has not been built and detailed information about the structure of this enzyme is not available from the low-resolution data set. Moreover, structural data regarding BGLPf have not been deposited in the Protein Data Bank (PDB). Here, the structure of a new crystal form of BGLPf was determined to a resolution of 2.35 Å. The crystal structure was examined to reveal information on the hyperthermostability and the substrate-recognition mechanism of BGLPf.

Acta Cryst. (2011). F67, 1473-1479

Protein preparation
BGLPf (Gene ID PF0073; Bauer et al., 1996) was purified as follows. The recombinant protein was expressed in Escherichia coli BL21 (DE3) cells (Novagen) under control of the T7 promoter in pET11a (Novagen). Cell cultures were grown at 310 K in Luria broth (3.2 l) containing 100 µg ml⁻¹ ampicillin until the optical density at 600 nm (OD600) reached 0.8. Isopropyl β-D-1-thiogalactopyranoside was added to a final concentration of 1.0 mM for protein induction. The harvested cells were lysed by sonication in 50 mM Tris-HCl pH 8.0 at 277 K. The cell lysate was heat-treated at 358 K for 30 min and then centrifuged at 15 000g for 20 min at 277 K. Streptomycin (2 g) was added to the supernatant (100 ml) at 277 K with stirring and the mixture was centrifuged at 15 000g for 30 min. The supernatant was fractionated with ammonium sulfate up to 80% saturation. After centrifugation, the pellet was resuspended in 50 mM Tris-HCl pH 8.0 and then dialyzed against Tris-HCl pH 8.0. The dialyzed sample was loaded onto a HiTrap Q anion-exchange column (GE Healthcare Biosciences) equilibrated with 50 mM Tris-HCl pH 8.0 and eluted with a linear gradient of 0-0.5 M NaCl. The composition of the buffer solution containing the target sample was adjusted to 50 mM Tris-HCl pH 8.0 containing 20%(v/v) ammonium sulfate. This solution was loaded onto a hydrophobic HiTrap Phenyl column (GE Healthcare Biosciences) equilibrated with 20 mM Tris-HCl buffer pH 8.0 containing 20%(v/v) ammonium sulfate and was eluted with a linear gradient of 20-0% ammonium sulfate. The purity and the size of the protein were analyzed by reducing SDS-PAGE. The size of the enzyme oligomer was examined by gel filtration using HiLoad 26/60 Superdex 200 pg (GE Healthcare Biosciences). The concentration of BGLPf was determined from the UV absorbance at 280 nm using a molar extinction coefficient of 128 160 M⁻¹ cm⁻¹ as calculated from its protein sequence using a standard method (Gill & von Hippel, 1989).
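The concentration determination in the last step is a direct application of the Beer-Lambert law. The following is a minimal sketch, not the authors' procedure: the extinction coefficient and the monomer mass (54.7 kDa) are the values quoted in the text, whereas the absorbance reading, the 1 cm path length and the dilution factor are hypothetical inputs chosen only for illustration.

```python
# Beer-Lambert estimate of protein concentration from the absorbance at 280 nm.
EXTINCTION_M1_CM1 = 128_160.0  # molar extinction coefficient quoted in the text
MONOMER_MASS_DA = 54_700.0     # monomer mass quoted in the text (54.7 kDa)


def concentration_from_a280(a280, path_cm=1.0, dilution=1.0):
    """Return (molar concentration in M, mass concentration in mg/ml).

    a280     -- absorbance measured on the (possibly diluted) sample
    path_cm  -- cuvette path length in cm (assumed 1 cm by default)
    dilution -- fold dilution applied before measurement (assumed input)
    """
    if a280 < 0 or path_cm <= 0 or dilution <= 0:
        raise ValueError("absorbance must be >= 0; path and dilution must be > 0")
    molar = a280 * dilution / (EXTINCTION_M1_CM1 * path_cm)
    mg_per_ml = molar * MONOMER_MASS_DA  # mol/l * g/mol = g/l = mg/ml
    return molar, mg_per_ml


# Hypothetical reading: A280 = 0.50 measured on a 50-fold dilution.
molar, mg_ml = concentration_from_a280(0.50, dilution=50.0)
print(f"{molar * 1e6:.1f} uM, {mg_ml:.2f} mg/ml")
```

With these illustrative inputs the estimate comes out close to the 10 mg ml⁻¹ stock used for crystallization, which is why that reading was chosen.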

Crystallization
Some crystals were obtained using the conditions described by Kaper et al. (2000), but their quality was too poor to allow X-ray analysis. Thus, initial screening for optimal crystallization conditions was performed using Crystal Screen, Crystal Screen 2 (Hampton Research) and Wizard 1 and 2 (Emerald BioSystems) with the hanging-drop vapour-diffusion method at 293 K. Typically, drops consisting of 1 µl protein solution (10 mg ml⁻¹ in 20 mM Tris-HCl pH 8.0) and 1 µl reservoir solution (0.1 M Na HEPES pH 7.5 containing 0.8 M sodium phosphate monobasic monohydrate and 0.8 M potassium phosphate monobasic) were equilibrated against 0.4 ml reservoir solution. A crystal was obtained within one week at 293 K.

Data collection and processing
The selected crystal was immersed in a cryoprotectant consisting of 25%(v/v) glycerol solution, picked up in a loop and then flash-cooled in a stream of nitrogen gas at 100 K. X-ray diffraction data were collected using a Rayonix MX225HE detector at a wavelength of 0.9 Å on the BL41XU beamline at SPring-8 (Hyogo, Japan). The crystal-to-detector distance was 300 mm. The crystal was rotated through 180° with an oscillation angle of 0.5° per frame. The diffraction data were indexed, integrated and scaled with programs from the HKL-2000 software package (Otwinowski & Minor, 1997). Diffraction data were collected to a resolution of 2.35 Å. Data-collection and processing parameters are presented in Table 1.
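The collection geometry quoted above fixes two simple quantities: the number of frames and, through Bragg's law, the relation between the position of a spot on the detector and its resolution. The sketch below assumes a flat detector perpendicular to the beam with the direct beam at its centre; these are illustrative assumptions, since the text does not describe the detector setup beyond the distance and wavelength.

```python
import math


def frame_count(total_rotation_deg, oscillation_deg):
    """Number of frames needed to cover the total rotation range."""
    return round(total_rotation_deg / oscillation_deg)


def resolution_at_radius(wavelength_A, distance_mm, radius_mm):
    """Bragg spacing d (in Å) of a spot at radius_mm from the beam centre.

    Uses tan(2*theta) = radius / distance and d = lambda / (2 sin(theta)),
    assuming a flat detector normal to the beam.
    """
    two_theta = math.atan2(radius_mm, distance_mm)
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))


def radius_for_resolution(wavelength_A, distance_mm, d_A):
    """Inverse relation: detector radius (mm) at which spacing d_A is recorded."""
    theta = math.asin(wavelength_A / (2.0 * d_A))
    return distance_mm * math.tan(2.0 * theta)


# Values quoted in the text: 180 deg in 0.5 deg steps, 0.9 Å, 300 mm, 2.35 Å.
print(frame_count(180.0, 0.5))  # 360 frames
print(round(radius_for_resolution(0.9, 300.0, 2.35), 1), "mm")
```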

Structure solution and refinement
The structure was solved by molecular replacement with MOLREP (Vagin & Teplyakov, 2010) using the structural data for the BGL monomer from Thermosphaera aggregans (BGLTa; 61% sequence identity to BGLPf; PDB entry 1qvb; Chi et al., 1999) as the search model. Further iterations of refinement and model building were performed with REFMAC5 (Murshudov et al., 2011), CNS (Brünger et al., 1998) and Coot (Emsley & Cowtan, 2004). Noncrystallographic symmetry (NCS) restraints were not applied during the refinement. The presence of four enzyme molecules per asymmetric unit gave a crystal volume per protein mass (V_M) of 3.96 Å³ Da⁻¹ and a solvent content of 69%(v/v) (Matthews, 1968). The quality of the refined structure was checked with MolProbity (Chen et al., 2010). A Ramachandran plot showed that 96.7% of residues were in the favoured regions and 99.6% were in allowed regions. The structural data have been deposited in the PDB under accession code 3apg.
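The link between the Matthews coefficient and the solvent content quoted above can be checked with a few lines of arithmetic. This is a minimal sketch using the commonly applied approximation that the protein occupies about 1.23 Å³ per dalton; that constant and the 54.7 kDa monomer mass are inputs assumed here, and the function names are hypothetical.

```python
PROTEIN_VOLUME_A3_PER_DA = 1.23  # commonly used approximation, assumed here


def matthews_coefficient(asu_volume_A3, n_molecules, mass_Da):
    """V_M = volume of the asymmetric unit / total protein mass within it."""
    return asu_volume_A3 / (n_molecules * mass_Da)


def solvent_fraction(v_m):
    """Estimated solvent fraction of the crystal: 1 - 1.23 / V_M."""
    return 1.0 - PROTEIN_VOLUME_A3_PER_DA / v_m


v_m = 3.96  # Å^3 per Da, value quoted in the text
print(f"solvent content ~ {solvent_fraction(v_m) * 100:.0f}%")

# Asymmetric-unit volume implied by V_M for four 54.7 kDa monomers.
asu_volume = v_m * 4 * 54_700.0
print(f"implied asymmetric-unit volume ~ {asu_volume:.0f} Å^3")
```

The estimate reproduces the 69% solvent content stated in the text.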

Results and discussion
Crystal structure of the tetrameric form of BGLPf
A tetrameric structure was identified in the crystallographic asymmetric unit of BGLPf and was determined to a resolution of 2.35 Å (Table 1). The tetramer shows 222 point-group symmetry and each monomer contacts all three symmetry-related partners. The monomer model contains 471 amino-acid residues and a (β/α)₈ barrel fold (Figs. 1a and 1b). The active site is located at the centre of the monomer and is reached from the outside by a tunnel with a length of 20 Å. A molecule of glycerol, which was used as cryoprotectant, was observed in the active site of each of the four monomers. Based on a comparison between the structures, a root-mean-square deviation (r.m.s.d.) value of 0.57 Å for 417 Cα atoms was calculated between BGLPf and BGLTa.
The individual monomers in the tetramer structure were named A, B, C and D (Fig. 1c). The structure of monomer A in BGLPf was compared with those of B, C and D, with r.m.s.d. values ranging from 0.15 to 0.20 Å over 469-470 Cα atoms. A similar structure consisting of homotetramers has previously been reported in another crystal form determined at 3.3 Å resolution (Kaper et al., 2000).
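The r.m.s.d. values quoted above measure the average displacement between equivalent Cα atoms after the two structures have been superposed. As a minimal sketch, the function below computes the r.m.s.d. for coordinate lists that are assumed to be already superposed and paired atom by atom; the superposition step itself is not shown, and the coordinates are made-up toy values rather than data from the structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input) between two
    equally long lists of (x, y, z) tuples that are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: three atoms, each displaced by 0.2 Å along x.
reference = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.0), (3.0, 1.0, 1.0)]
shifted = [(x + 0.2, y, z) for x, y, z in reference]
print(f"r.m.s.d. = {rmsd(reference, shifted):.2f} Å")
```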
Gel filtration of BGLPf gave a single peak from which the molecular weight of the protein was estimated to be 238.8 kDa, which is similar to that of the BGLPf tetramer (220 kDa), suggesting that BGLPf predominantly forms tetramers (Supplementary Fig. 1). Oligomeric structures appear to be a common characteristic of BGLs, with the exception of that from P. horikoshii. The BGL from the hyperthermophilic bacterium Thermotoga maritima forms a dimer (Zechel et al., 2003), while the BGLs from the hyperthermophiles Sulfolobus solfataricus (Aguilar et al., 1997) and Thermosphaera aggregans (Chi et al., 1999) and BGL A from the mesophile Clostridium cellulovorans (Jeng et al., 2011) form tetramers. BGLPf seems to form a tetrameric structure under physiological conditions in P. furiosus cells.
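The assignment of an oligomeric state from an apparent molecular mass amounts to dividing by the monomer mass and taking the nearest whole number. The sketch below applies this to the gel-filtration estimate and to the roughly 110 kDa SDS-PAGE band, using the 54.7 kDa monomer mass given in the paper; the helper name is hypothetical, and the simple rounding ignores shape effects on elution and migration.

```python
MONOMER_KDA = 54.7  # monomer mass quoted in the paper


def oligomer_estimate(apparent_kda, monomer_kda=MONOMER_KDA):
    """Return (mass ratio, nearest whole number of subunits)."""
    ratio = apparent_kda / monomer_kda
    return ratio, max(1, round(ratio))


for label, mass in (("gel filtration", 238.8), ("SDS-PAGE major band", 110.0)):
    ratio, n = oligomer_estimate(mass)
    print(f"{label}: {mass} kDa / {MONOMER_KDA} kDa = {ratio:.2f} -> {n} subunits")

# Calculated tetramer mass for comparison with the ~220 kDa quoted in the text.
print(f"4 x {MONOMER_KDA} kDa = {4 * MONOMER_KDA:.1f} kDa")
```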

Analysis of the dimer interface of BGLPf
The results of reducing SDS-PAGE experiments with BGLPf are presented in Fig. 2. BGLPf prepared without heating in SDS-PAGE loading buffer (consisting of 2% SDS and 710 mM β-mercaptoethanol) migrated with an apparent relative molecular mass of about 110 kDa, which corresponds to the molecular size of a dimer. Interestingly, BGLPf that had previously been heated to 368 K in reducing SDS-PAGE loading buffer also migrated mainly as a dimer, with apparent relative molecular masses of about 110 kDa (major band) and 55 kDa (minor band). These results indicate that most of the dimers in the BGLPf tetramer are hyperthermostable and stable towards SDS. Similar results have previously been obtained with tetramers of the BGL from S. solfataricus (BGLSs; Gentile et al., 2002).

Footnotes to Table 1: … $\langle I(hkl)\rangle$ is the average intensity of reflection hkl, with summation over all data. ‡ R factor = $\sum_{hkl} \bigl| |F_{\mathrm{obs}}| - |F_{\mathrm{calc}}| \bigr| / \sum_{hkl} |F_{\mathrm{obs}}|$, where $F_{\mathrm{obs}}$ and $F_{\mathrm{calc}}$ are the observed and calculated structure factors, respectively. § $R_{\mathrm{free}}$ is equivalent to the R factor but is calculated for 5% of the reflections chosen at random and omitted from the refinement process.
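The R factor and R free defined in the footnotes to Table 1 can be illustrated with a short calculation. This is a sketch on made-up structure-factor amplitudes, not the refinement programs' implementation: the function names, the random seed and the toy values are all assumptions introduced for illustration.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over paired reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def pick_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices as the free set;
    these would be omitted from refinement and used only for R free."""
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * fraction))
    return set(rng.sample(range(n_reflections), n_free))


# Toy amplitudes (arbitrary units), not data from the paper.
f_obs = [100.0, 50.0, 25.0, 10.0]
f_calc = [90.0, 55.0, 25.0, 12.0]
print(f"R = {r_factor(f_obs, f_calc):.3f}")

free = pick_free_set(1000)
print(f"{len(free)} of 1000 reflections flagged as free")
```

R free is then the same expression evaluated only over the flagged reflections, which is what makes it an independent check on the refinement.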
To analyze the dimer interface, the interactions between each monomer of BGLPf, BGLSs and BGLACc were examined using the Protein Interfaces, Surfaces and Assemblies (PISA) web server (Krissinel & Henrick, 2007; Table 2).
In BGLPf, the interfacial contacts within the tetramers were mainly hydrophobic, with some specific polar interactions. The number of salt bridges and hydrogen bonds found in the A-C interface was lower than that in the A-B interface. However, the solvent-inaccessible area of the A-C interface was found to be larger than that of the A-B interface (Table 2). The averaged solvation free energy (ΔⁱG) for the A-C interface was also more negative than that for the A-B interface. Remarkably, the hyperthermophilic A-C dimer of BGLSs is stable even at 358 K (Moracci et al., 1995). The A-C dimer of BGLSs presents similar solvent-inaccessible area and averaged solvation free-energy values to those of BGLPf and these values are larger than those of the A-B dimer of BGLSs, which is in good agreement with our results (Gentile et al., 2002). When comparing BGLPf and BGLACc, both the solvent-inaccessible area and the averaged solvation free energy were much larger in BGLPf than in BGLACc. In conclusion, thermostable BGLPf has a comparatively large solvent-inaccessible area at the A-C interface; the averaged solvation free energy is reduced, thereby providing the A-C dimer with hyperthermostability. These results suggest that in BGLs the A-C dimer is more stable than the A-B dimer. Furthermore, the hyperthermostability of the tetramer structure of BGLPf seems to be controlled mainly by entropy-driven interactions.

A comparison of the hyperthermophilic BGLPf with the mesophilic BGLACc revealed three major differences. Firstly, the insertion from Thr90 to Leu118 (blue line in Fig. 3 and blue circles in Figs. 4a and 4b) exists in BGLPf, BGLTa and BGLSs, but not in BGLACc. This insertion was also not found in BGLPh, the specific hyperthermophilic BGL, which exists as a monomer. Conversely, the insertion in BGLACc from Gly297 to Lys298 (green line in Fig. 3 and green circles in Figs. 4a and 4b) was not found in BGLPf.
The structures of these insertions are located outside of the A-C dimer.

Figure 2. SDS-PAGE of BGLPf. Lane St, relative molecular-weight standards; lane 1, BGLPf prepared in reducing SDS-PAGE loading buffer (2% SDS and 710 mM β-mercaptoethanol) without heating; lanes 2, 3 and 4, BGLPf prepared in reducing SDS-PAGE loading buffer (2% SDS and 710 mM β-mercaptoethanol) and heated at 368 K for 5, 10 and 20 min, respectively. The molecular weight of monomeric BGLPf is 54.7 kDa.

Table 2 (fragment). A-C: 34 (7.2), 34 (6.9), 30 (6.8). A-B: 24 (5.1), 24 (4.9), 24 (5.3). Averaged solvation free-energy (ΔG) gain on formation of the interface (kcal mol⁻¹).
Thirdly, in BGLACc the hydrophobic interaction found in BGLPf was not formed owing to the insertion from Asn368 to Lys377 (black line in Fig. 3). In other words, the insertion of the C-terminal end (red line in Fig. 3) in archaeal hyperthermophilic BGLs generated the hydrophobic patches stabilizing the A-C dimer. In particular, the hydrophobic interaction may contribute to the entropic term of the Gibbs free energy, reducing ΔⁱG at high temperature. Introduction of the C-terminal part of BGLPf into any of the other BGLs may explain the hyperthermostability of their tetramer structure.
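The entropic argument can be written out explicitly. As a simplified sketch, assuming that the enthalpy and entropy of association are roughly independent of temperature over the range of interest (an assumption introduced here, and a simplification for hydrophobic association, where heat-capacity changes also matter), the free energy of forming the interface is

```latex
\begin{align}
\Delta G_{\mathrm{assoc}}(T) &= \Delta H_{\mathrm{assoc}} - T\,\Delta S_{\mathrm{assoc}}, \\
\frac{\partial\,\Delta G_{\mathrm{assoc}}}{\partial T} &= -\Delta S_{\mathrm{assoc}} < 0
\quad \text{if } \Delta S_{\mathrm{assoc}} > 0 .
\end{align}
```

Under this assumption, an interface whose formation is entropically favourable, as expected when hydrophobic surface is buried and ordered water is released, becomes more stabilizing as the temperature rises, which is consistent with the proposed role of the C-terminal hydrophobic patch.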