Structural Biology and Crystallization Communications
Open and Closed Conformations of Two SpoIIAA-like Proteins (YP_749275.1 and YP_001095227.1) Provide Insights into Membrane Association and Ligand Binding

The crystal structures of the proteins encoded by the YP_749275.1 and YP_001095227.1 genes from Shewanella frigidimarina and S. loihica, respectively, have been determined at 1.8 and 2.25 Å resolution, respectively. These proteins are members of a novel family of bacterial proteins that adopt the α/β SpoIIAA-like fold found in STAS and CRAL-TRIO domains. Despite sharing 54% sequence identity, these two proteins adopt distinct conformations arising from different dispositions of their α2 and α3 helices. In the 'open' conformation (YP_001095227.1), these helices are 15 Å apart, leading to the creation of a deep nonpolar cavity. In the 'closed' structure (YP_749275.1), the helices partially unfold and rearrange, occluding the cavity and decreasing the solvent-exposed hydrophobic surface. These two complementary structures are reminiscent of the conformational switch in CRAL-TRIO carriers of hydrophobic compounds. It is suggested that both proteins may associate with the lipid bilayer in their 'open' monomeric state by inserting their amphiphilic helices, α2 and α3, into the lipid bilayer. These bacterial proteins may function as carriers of nonpolar substances or as interfacially activated enzymes.


Introduction
The YP_749275.1 gene from Shewanella frigidimarina encodes a protein of unknown function with a molecular weight of 14 502 Da (residues 1-126) and a calculated isoelectric point of 4.9. An ortholog with 54% sequence identity from S. loihica (YP_001095227.1) is also of unknown function, with a molecular weight of 14 105 Da (residues 1-125) and a calculated isoelectric point of 4.9.
Both sequences have been assigned to a family of 119 bacterial and archaeal proteins (PB000640) in the automatically generated Pfam-B entries (Finn et al., 2008). The proteins in the PB000640 family are composed of a single domain, with one exception which is fused to a universal stress protein (UspA) domain. Profile-profile sequence-comparison methods (Jaroszewski et al., 2005) detected distant homology to proteins which adopt the SpoIIAA-like fold (Kovacs et al., 1998). These NTP-binding proteins are involved in regulating the sporulation sigma factor F in Bacillus subtilis. However, since this relationship is relatively distant (<15% sequence identity), it does not allow a direct functional inference.
Here, we report the structures of these two orthologs determined using the semi-automated high-throughput pipeline of the Joint Center for Structural Genomics (JCSG; Lesley et al., 2002) as part of the Protein Structure Initiative of the National Institute of General Medical Sciences (http://www.nigms.nih.gov/Initiatives/PSI/). Both proteins have now been classified by SCOP (Hubbard et al., 1999) as being members of a novel Sfri0576-like family which belongs to the SpoIIAA superfamily. However, despite sharing the same fold, their structures differ significantly in the relative disposition of two surface α-helices and in their mode of dimerization. The arrangement of the α-helices suggests that the proteins may associate with the membrane and possibly function as carriers of nonpolar compounds similar to CRAL-TRIO domains.

Protein production and crystallization
The clones for YP_749275.1 and YP_001095227.1 were generated using the Polymerase Incomplete Primer Extension (PIPE) cloning method (Klock et al., 2008). The gene encoding YP_749275.1 (GenBank YP_749275; gi:114561762; Swiss-Prot Q087X8) was amplified from S. frigidimarina NCIMB 400 genomic DNA using PfuTurbo DNA polymerase (Stratagene) and I-PIPE (Insert) primers (forward primer, 5′-ctgtacttccagggcATGGATATGAAGAAACATGGTTTATCG-3′; reverse primer, 5′-aattaagtcgcgttaATATCGAAGCCATTTCAAGGCGTCATC-3′; target sequence in upper case) that included sequences for the predicted 5′ and 3′ ends. The expression vector pSpeedET, which encodes an amino-terminal tobacco etch virus (TEV) protease-cleavable expression and purification tag (MGSDKIHHHHHHENLYFQ/G), was PCR-amplified with V-PIPE (Vector) primers (forward primer, 5′-taacgcgacttaattaactcgtttaaacggtctccagc-3′; reverse primer, 5′-gccctggaagtacaggttttcgtgatgatgatgatgatg-3′). V-PIPE and I-PIPE PCR products were mixed to anneal the amplified DNA fragments together. Escherichia coli GeneHogs (Invitrogen) competent cells were transformed with the V-PIPE/I-PIPE mixture and dispensed onto selective LB-agar plates. The cloning junctions were confirmed by DNA sequencing. Expression was performed in selenomethionine-containing medium at 310 K with suppression of normal methionine synthesis (Van Duyne et al., 1993). At the end of fermentation, lysozyme was added to the culture to a final concentration of 250 µg ml−1 and the cells were harvested and frozen. After one freeze-thaw cycle, the cells were homogenized in lysis buffer [50 mM HEPES pH 8.0, 50 mM NaCl, 10 mM imidazole, 1 mM tris(2-carboxyethyl)phosphine-HCl (TCEP)] and passed through a Microfluidizer (Microfluidics).
The lysate was clarified by centrifugation at 32 500g for 30 min and loaded onto nickel-chelating resin (GE Healthcare) pre-equilibrated with lysis buffer; the resin was washed with wash buffer [50 mM HEPES pH 8.0, 300 mM NaCl, 40 mM imidazole, 10%(v/v) glycerol, 1 mM TCEP] and the protein was eluted with elution buffer [20 mM HEPES pH 8.0, 300 mM imidazole, 10%(v/v) glycerol, 1 mM TCEP]. The eluate was buffer-exchanged with TEV buffer (20 mM HEPES pH 8.0, 200 mM NaCl, 40 mM imidazole, 1 mM TCEP) using a PD-10 column (GE Healthcare) and incubated with 1 mg TEV protease per 15 mg of eluted protein. The protease-treated eluate was run over nickel-chelating resin (GE Healthcare) pre-equilibrated with HEPES crystallization buffer (20 mM HEPES pH 8.0, 200 mM NaCl, 40 mM imidazole, 1 mM TCEP) and the resin was washed with the same buffer. The flowthrough and wash fractions were combined and concentrated to 17.7 mg ml−1 by centrifugal ultrafiltration (Millipore) for crystallization trials. The protein was crystallized by mixing 200 nl protein with 200 nl crystallization solution in sitting drops above a 50 µl reservoir volume using the nanodroplet vapor-diffusion method (Santarsiero et al., 2002) with standard JCSG crystallization protocols (Lesley et al., 2002). The crystallization reagent for YP_749275.1 consisted of 0.2 M calcium acetate and 20.0% PEG 3350. A diamond-shaped crystal of approximate dimensions 50 × 50 × 50 µm was harvested after 10 d at 277 K. Ethylene glycol was added to the crystal as a cryoprotectant to a final concentration of 8%(v/v). Initial screening for diffraction was carried out using the Stanford Automated Mounting system (SAM; Cohen et al., 2002) at the Stanford Synchrotron Radiation Lightsource (SSRL, Menlo Park, California, USA). The diffraction data were indexed in monoclinic space group C2.
The oligomeric state of YP_749275.1 in solution was determined using a 1 × 30 cm Superdex 200 column (GE Healthcare) coupled with miniDAWN static light-scattering and Optilab differential refractive-index detectors (SEC/SLS; Wyatt Technology). The mobile phase consisted of 20 mM Tris pH 8.0, 150 mM sodium chloride and 0.02%(w/v) sodium azide. The molecular weight was calculated using the ASTRA v.5.1.5 software (Wyatt Technology).
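The PIPE primer design quoted above can be checked with a short script. The sketch below is an illustration only, not part of the JCSG pipeline: it assumes that the lower-case 5′ tail of each I-PIPE primer is meant to be complementary to the 5′ end of the opposing V-PIPE primer, which is what allows the insert and vector PCR products to anneal. The helper names `reverse_complement` and `split_primer` are hypothetical, and the primer strings are the sequences given in the text with line-break hyphens removed.

```python
from typing import Tuple


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving case."""
    pairs = str.maketrans("acgtACGT", "tgcaTGCA")
    return seq.translate(pairs)[::-1]


def split_primer(primer: str) -> Tuple[str, str]:
    """Split a primer into its lower-case 5' tail and upper-case target part."""
    tail = "".join(ch for ch in primer if ch.islower())
    target = "".join(ch for ch in primer if ch.isupper())
    return tail, target


# Primer sequences as quoted in the text (line-break hyphens removed).
I_PIPE_FWD = "ctgtacttccagggcATGGATATGAAGAAACATGGTTTATCG"
I_PIPE_REV = "aattaagtcgcgttaATATCGAAGCCATTTCAAGGCGTCATC"
V_PIPE_FWD = "taacgcgacttaattaactcgtttaaacggtctccagc"
V_PIPE_REV = "gccctggaagtacaggttttcgtgatgatgatgatgatg"

fwd_tail, fwd_target = split_primer(I_PIPE_FWD)
rev_tail, rev_target = split_primer(I_PIPE_REV)

# Each insert-primer tail should pair with the start of the opposing vector primer.
fwd_tail_pairs = reverse_complement(fwd_tail) == V_PIPE_REV[: len(fwd_tail)]
rev_tail_pairs = reverse_complement(rev_tail) == V_PIPE_FWD[: len(rev_tail)]

print("forward tail pairs with V-PIPE reverse primer:", fwd_tail_pairs)
print("reverse tail pairs with V-PIPE forward primer:", rev_tail_pairs)
```

Both checks hold for the primers as printed: each 15-nucleotide tail is the exact reverse complement of the first 15 nucleotides of the opposing vector primer.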

Data collection, structure solution and refinement
For YP_749275.1 and YP_001095227.1, selenium multiwavelength anomalous diffraction (MAD) data were collected on beamline 11-1 at SSRL at wavelengths corresponding to the inflection, high-energy remote and peak. The data sets were collected at 100 K using a MAR 325 CCD detector and the BLU-ICE data-collection environment (McPhillips et al., 2002). The MAD data were integrated using MOSFLM (Leslie, 1992) and scaled with the program SCALA from the CCP4 suite (Collaborative Computational Project, Number 4, 1994). Phasing was performed with SHELXD (Schneider & Sheldrick, 2002) and autoSHARP (Bricogne et al., 2003), which resulted in a mean figure of merit of 0.35 to 1.8 Å resolution for YP_749275.1 with five selenium sites and of 0.43 to 2.25 Å resolution for YP_001095227.1 with six selenium sites. Automatic model building was performed with ARP/wARP (Cohen et al., 2004). Model completion and refinement were performed with Coot (Emsley & Cowtan, 2004) and REFMAC5.2 (Winn et al., 2003) using the inflection-wavelength data. The refinement of YP_749275.1 included experimental phase restraints in the form of Hendrickson-Lattman coefficients from SHARP and TLS refinement with one TLS group per chain. The refinement of YP_001095227.1 included experimental phase restraints, NCS restraints (positional weight 0.5 and thermal weight 2.0) and TLS refinement with one TLS group per chain. Data-collection and refinement statistics are summarized in Table 1.
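The refinement statistics in Table 1 rest on the crystallographic R factor, defined in the table footnote as the sum over reflections of ||F_obs| − |F_calc|| divided by the sum of |F_obs|, with R_free computed in the same way over a randomly chosen test set omitted from refinement. The following minimal sketch shows the arithmetic. The function names and the toy amplitudes are assumptions made for illustration; they are not data from either structure and this is not the REFMAC implementation.

```python
from typing import Sequence, Tuple


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """R = sum(| |F_obs| - |F_calc| |) / sum(|F_obs|) over the given reflections."""
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_cryst_and_r_free(
    f_obs: Sequence[float],
    f_calc: Sequence[float],
    free_flags: Sequence[bool],
) -> Tuple[float, float]:
    """Split reflections into working and test sets and return (R_cryst, R_free)."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_test = r_factor([o for o, _ in test], [c for _, c in test])
    return r_work, r_test


# Toy amplitudes (hypothetical): the last reflection is flagged as test set.
toy_f_obs = [10.0, 20.0, 30.0, 40.0]
toy_f_calc = [9.0, 22.0, 30.0, 36.0]
toy_flags = [False, False, False, True]

toy_r_cryst, toy_r_free = r_cryst_and_r_free(toy_f_obs, toy_f_calc, toy_flags)
print(f"R_cryst = {toy_r_cryst:.3f}, R_free = {toy_r_free:.3f}")
```

In a real refinement the test set would contain about 5% of the reflections, as stated for 2ook and 2q3l, rather than a single one.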

Validation and deposition
The quality of the crystal structure was analyzed using the JCSG quality-control server. This server verifies the stereochemical quality of the model using AutoDepInputTool (Yang et al., 2004), MolProbity (Davis et al., 2007), WHATIF (Vriend, 1990) and RESOLVE (Terwilliger, 2003), as well as several in-house scripts, and summarizes the outputs. Protein quaternary-structure analysis was carried out using the PISA server (Krissinel & Henrick, 2007). Fig. 1(c) was adapted from ESPript (Gouet et al., 1999) and all other figures were prepared with PyMOL (DeLano Scientific). The atomic coordinates and experimental structure factors for YP_749275.1 and YP_001095227.1 have been deposited in the PDB under codes 2ook and 2q3l, respectively.

Orientation of proteins in membranes
The spatial orientations of membrane-associated proteins with respect to the lipid bilayer, including the maximal penetration depths (D) of protein residues in the hydrocarbon core and the free energy of transfer of the protein from water to the membrane (ΔG_transf), were calculated using the PPM program, as previously described by Lomize et al. (2006). Two major contributions are considered in the current version of PPM: (i) the water-lipid transfer energy calculated with atomic solvation parameters (favorable for nonpolar C and S atoms and unfavorable for polar N and O atoms) and (ii) the deionization penalty for charged residues. The smoothing function with a decay parameter of 1 Å was used to describe the gradual polarity changes at the lipid head group-hydrocarbon core boundary.
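The idea behind the first energy term can be sketched in a few lines. The example below is a simplified illustration under stated assumptions and is not the PPM code: it assumes a sigmoidal smoothing function of the form 1/(1 + exp((|z| − z0)/λ)) with λ = 1 Å, a hypothetical hydrocarbon-core half-thickness z0 of 15 Å, and invented atomic solvation parameters; the deionization penalty for charged residues is omitted entirely.

```python
import math
from typing import Iterable, Tuple


def interface_profile(z: float, z0: float = 15.0, decay: float = 1.0) -> float:
    """Sigmoid that is ~1 inside the hydrocarbon core and ~0 in water.

    z     -- distance of the atom from the membrane centre (angstroms)
    z0    -- assumed half-thickness of the hydrocarbon core (hypothetical value)
    decay -- decay parameter of the smoothing function (1 angstrom in the text)
    """
    return 1.0 / (1.0 + math.exp((abs(z) - z0) / decay))


def transfer_energy(
    atoms: Iterable[Tuple[float, float, float]],
    z0: float = 15.0,
    decay: float = 1.0,
) -> float:
    """Sum sigma * ASA * profile(z) over atoms given as (sigma, asa, z) tuples.

    sigma is an atomic solvation parameter (negative = favorable in lipid),
    asa is the solvent-accessible surface area of the atom.
    """
    return sum(sigma * asa * interface_profile(z, z0, decay) for sigma, asa, z in atoms)


# Hypothetical atoms: a buried nonpolar carbon and a polar oxygen left in water.
toy_atoms = [
    (-0.1, 20.0, 0.0),   # nonpolar atom at the membrane centre
    (0.2, 10.0, 40.0),   # polar atom far outside the hydrocarbon core
]
toy_energy = transfer_energy(toy_atoms)
print(f"toy transfer energy: {toy_energy:.3f} (arbitrary units)")
```

With this functional form, an atom sitting exactly at the assumed boundary contributes half of its full transfer energy, and the contribution falls off within a few ångströms on either side.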

Overall structure
The crystal structures of YP_749275.1 and YP_001095227.1 (Fig. 1) were determined independently by the MAD method to 1.80 and 2.25 Å resolution, respectively. Data-collection and refinement statistics are summarized in Table 1. The final model of YP_749275.1 consists of a protein dimer (residues 2-126 for chain A and residues 3-79 and 82-126 for chain B), six ethylene glycols and 224 water molecules in the asymmetric unit. Similarly, the structure of YP_001095227.1 consists of a protein dimer (residues 0-125 for chain A and residues 0-48 and 53-125 for chain B), two sodium ions, two chloride ions, nine 2-methyl-2,4-pentanediol molecules and 95 water molecules in the asymmetric unit.
YP_749275.1 and YP_001095227.1 both adopt the SpoIIAA-like fold according to SCOP and CATH (Cuff et al., 2009). This fold consists of four turns of β/α superhelix with an additional N-terminal β-strand. The five strands and four helices are arranged in the order β1-β2-α1-β3-α2-β4-α3-β5-α4. The two proteins share the same topology (Fig. 1), although some noticeable differences are apparent in the lengths of the β-strands and α-helices.

Comparison of the YP_749275.1 and YP_001095227.1 structures
Despite sharing high sequence identity (54%), the two structures align with an overall r.m.s.d. of 4.2 Å (123 aligned Cα atoms). However, most of the deviations occur around the α2 and α3 helices (Fig. 2a). The r.m.s.d. decreases to 1.6 Å over 92 Cα atoms when these two helices are excluded from the alignment.

Table 1 footnotes: [...] (Diederichs & Karplus, 1997). § R_cryst = Σ_hkl ||F_obs| − |F_calc|| / Σ_hkl |F_obs|, where F_calc and F_obs are the calculated and observed structure-factor amplitudes, respectively. ¶ R_free is the same as R_cryst, but for 5.1% (2ook) and 5.2% (2q3l) of the total number of reflections that were chosen at random and omitted from refinement. †† Estimated overall coordinate error (Cruickshank, 1999; Collaborative Computational Project, Number 4, 1994).
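The drop in r.m.s.d. when the two divergent helices are excluded follows directly from how the quantity is defined. The sketch below is a simplified illustration: it computes the r.m.s.d. between two sets of equivalent Cα positions that are assumed to be already superposed and uses invented coordinates, whereas a real comparison would also re-optimize the superposition for each residue subset. The function name `rmsd` and the toy coordinates are assumptions.

```python
import math
from typing import Iterable, List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(
    coords_a: Sequence[Point],
    coords_b: Sequence[Point],
    keep: Optional[Iterable[int]] = None,
) -> float:
    """Root-mean-square deviation between equivalent, pre-superposed atoms.

    keep -- optional indices of atom pairs to include (default: all pairs).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    indices: List[int] = list(range(len(coords_a))) if keep is None else list(keep)
    if not indices:
        raise ValueError("no atom pairs selected")
    total = 0.0
    for i in indices:
        total += sum((a - b) ** 2 for a, b in zip(coords_a[i], coords_b[i]))
    return math.sqrt(total / len(indices))


# Hypothetical coordinates: pairs 0-1 mimic a conserved core, 2-3 a displaced helix.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 1.0, 0.0), (2.0, 0.0, 7.0), (3.0, 7.0, 0.0)]

overall = rmsd(toy_a, toy_b)
core_only = rmsd(toy_a, toy_b, keep=[0, 1])
print(f"all pairs: {overall:.2f} A, core only: {core_only:.2f} A")
```

Because the deviations enter as squares, a small number of strongly displaced atoms dominates the overall value, which is why removing a localized rearrangement lowers the r.m.s.d. so sharply.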
YP_001095227.1 displays an 'open' conformation in which the α2 and α3 helices are 15 Å apart and form either side of a large channel that runs across one face of the protein (Figs. 2a and 2b). An analysis using the CastP server (Binkowski et al., 2003) reveals a deep cavity (1743 Å³) lined by over 20 residues that are all hydrophobic, except for Asp73, which is hydrogen bonded to Tyr34. The floor of the cavity is formed by the β-sheet and helix α1. This large hydrophobic cavity represents a potential ligand-binding pocket that is occupied in the crystal structure by three 2-methyl-2,4-pentanediol (MPD) molecules, which are likely to stabilize the 'open' conformation by partially filling the cavity.

Dimerization mode
Size-exclusion chromatography supports the assignment of a dimer as the main oligomeric state in solution for both YP_749275.1 and YP_001095227.1. In the 'closed' structure of YP_749275.1, the two monomers are arranged side by side and create an extended intermolecular β-sheet (Fig. 3a). This interface includes additional contacts between the adjacent α1 helices and buries a surface area of 1439 Å² with a free energy of dissociation (ΔG_diss) of 67.4 kJ mol−1 as calculated by the PISA server (Krissinel & Henrick, 2007).
The dimerization mode of the 'open' structure, YP_001095227.1, is different and comparatively weaker. The PISA server predicts two dimerization modes with similar buried surface areas. In one mode (Fig. 3b), the two protein monomers associate through their N-terminal β-strands, 2, 1 and the loop between β3 and α2, burying a surface area of only 787 Å² (ΔG_diss = 3.8 kJ mol−1). In the other mode (Fig. 3c), dimer association would be mediated through 2 and the loop between α2 and β4, burying a surface area of 743 Å² (ΔG_diss = 0.8 kJ mol−1). The low values of ΔG_diss and buried surface area suggest that these dimers may not be stable (Krissinel & Henrick, 2007) and may represent a crystal-packing artifact.
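The difference between the three interfaces becomes more tangible when the dissociation free energies are converted into equilibrium constants. The sketch below applies the textbook relation K = exp(−ΔG/RT) to the three ΔG_diss values quoted above. It is an illustration only: the temperature of 298.15 K and the 1 M standard state are assumptions, and this is not how PISA itself assesses assembly stability.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # kJ mol^-1 K^-1


def dissociation_constant(dg_diss_kj: float, temperature_k: float = 298.15) -> float:
    """Equilibrium constant for dissociation, K = exp(-dG_diss / RT).

    dg_diss_kj    -- free energy of dissociation in kJ/mol (positive = stable)
    temperature_k -- assumed temperature in kelvin (hypothetical default)
    """
    return math.exp(-dg_diss_kj / (GAS_CONSTANT_KJ * temperature_k))


# dG_diss values quoted in the text for the three PISA interfaces (kJ/mol).
interfaces = {
    "YP_749275.1 'closed' dimer (Fig. 3a)": 67.4,
    "YP_001095227.1 mode 1 (Fig. 3b)": 3.8,
    "YP_001095227.1 mode 2 (Fig. 3c)": 0.8,
}

constants = {name: dissociation_constant(dg) for name, dg in interfaces.items()}
for name, k in constants.items():
    print(f"{name}: K_diss ~ {k:.2e} (relative to a 1 M standard state)")
```

Under these assumptions the 'closed' dimer interface corresponds to a dissociation constant many orders of magnitude smaller than either interface of the 'open' structure, whose constants are of order one, in line with the interpretation that the latter dimers are marginal.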

Distribution of conserved residues
YP_749275.1 and YP_001095227.1 are assigned to family PB000640 in Pfam-B. This family includes 119 proteins from bacteria and archaea consisting of a single domain and an additional bacterial protein that includes a fusion to a universal stress protein (UspA) domain at its C-terminus. A set of conserved residues in the family was identified by aligning 20 of the most closely related bacterial proteins (>25% sequence identity in pairwise comparisons over the full length of the proteins; Fig. 4a).
These conserved residues form two clusters in the protein structure. The larger cluster (Figs. 4b and 4c) is located in the 'switch region' where the two orthologs adopt distinct conformations. This cluster includes residues from β1 (His6, Gly7), α1 (Gly27, Leu29, Thr30, His31 and Tyr34) and α2 (Ala69, Ala70, Trp71, Asp72, Asp73 and Gly77). It is noteworthy that some conserved residues appear to stabilize the 'closed' conformation (Fig. 4c), whereas others stabilize the 'open' conformation. For example, in the 'closed' state, Trp71 is buried from the solvent and participates in multiple van der Waals interactions with surrounding aromatic and aliphatic residues (Leu8, Leu29, Tyr34, Leu74 and Trp65), while His6 and His31 face the solvent and are not involved in any stabilizing hydrogen bonds. Conversely, in the 'open' state, Trp71 is solvent-exposed and may participate in protein-membrane interactions, whereas His6 and His31 form stabilizing hydrogen bonds with Asp33 and Asp72, respectively. In the open conformation, two other conserved residues, Tyr34 and Asp73, are located inside the hydrophobic cavity and are hydrogen bonded to each other (Fig. 4b). The conservation of these residues indicates their functional importance and suggests that they may be involved in hydrogen bonding and/or ionic interactions with a bound ligand.

Figure 1. Crystal structures of (a) YP_749275.1 (PDB code 2ook) and (b) YP_001095227.1 (PDB code 2q3l) shown as ribbon diagrams of protein monomers color-coded from the N-terminus (blue) to the C-terminus (red). Helices α1-α4 and strands β1-β5 are indicated for YP_749275.1, while every tenth residue is numbered for YP_001095227.1. (c) Diagram showing the secondary-structure elements of YP_749275.1 and YP_001095227.1 superimposed on their primary sequences. Strands and helices are indicated by arrows and coils, respectively, and labeled sequentially as β1, β2 etc. and α1, α2 etc. Identical residues in these proteins are shown in white on a red background, while similar residues are shown as the reverse (red on a white background).
This 'switch-region' cluster is supplemented by two residues from the adjacent subunit in the dimer (Ile12 and Arg14 in the β1 strand). Arg14 of one subunit hydrogen bonds to Asp33 of the other subunit, which is likely to provide some stability and specificity to the dimer formation (YP_749275.1). Ile12 engages in hydrophobic interactions with Met40 (Val40 in YP_001095227.1). However, Arg14 does not form any contacts in the 'open' monomeric structure (YP_001095227.1). Thus, the conserved Ile12 and Arg14 may contribute to stabilization of the dimeric state of the protein in solution.

Comparison with other structures
Based on the SCOP classification, the SpoIIAA-like fold consists of two structural superfamilies: (i) the bacterial sporulation antisigma factor antagonist SpoIIAA superfamily that contains the STAS domain (PF01740 in Pfam) and (ii) the CRAL-TRIO superfamily of eukaryotic carriers of nonpolar substances (PF03765 and PF00650 in Pfam).
YP_749275.1 and YP_001095227.1 have both been assigned to a novel Sfri0576-like family in the SpoIIAA-like superfamily in SCOP. However, their structures can be aligned, without significant insertions or deletions, with structures from both superfamilies, although a higher DALI Z score was obtained for SpoIIAA proteins (Holm & Sander, 1995). Superposition of the 'open' structures (YP_001095227.1 and PDB entry 1oiz; Meier et al., 2003) yielded an r.m.s.d. of 3.0 Å over 104 Cα atoms, while superposition of the 'closed' structures (YP_749275.1 and PDB entry 1r5l; Min et al., 2003) resulted in an r.m.s.d. of 2.9 Å over 95 Cα atoms. The close superposition of these entire domains indicates a possible common evolutionary origin for all these proteins.
Despite their structural similarity, the sequence identity between either YP_749275.1 or YP_001095227.1 and proteins in the SpoIIAA-like fold families is <15%. Comparison of the sequences of different STAS domains (PDB codes 1h4z, 1til, 1tid, 1sbo, 1auz, 2vy9 and 3f43; Seavers et al., 2001; Masuda et al., 2004; Etezady-Esfarjani et al., 2006; Kovacs et al., 1998; Marles-Wright et al., 2008) identified a GxLxH motif in some of these proteins (Seavers et al., 2001). A similar motif is conserved in YP_749275.1 and YP_001095227.1 (GKLTH, residues 27-31). Although the Gly and Leu residues play a structural role in providing the tight turn between the β1-strand and α2-helix, the conservation of His in the STAS-domain proteins cannot easily be explained. On the other hand, the phosphorylatable serine that is conserved in all SpoIIAA (Ser58 in 1auz, Ser57 in 1h4x; Seavers et al., 2001) is substituted by negatively charged Glu/Asp residues in the majority of other members of the PB000640 Pfam-B family, including YP_749275.1 (Asp66), although Ser is present in YP_001095227.1 (Ser66). This key serine is located at the start of helix α2 and participates in the interaction of SpoIIAA with SpoIIAB and a nucleotide ligand. In the presence of ADP, the SpoIIAA-SpoIIAB complex is stable, while in the presence of ATP, SpoIIAA becomes phosphorylated and then dissociates (Aravind & Koonin, 2000; Najafi et al., 1996). The lack of conservation of this serine in YP_749275.1 indicates a possible loss of functional similarity to SpoIIAA.
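Scanning a sequence for the GxLxH motif discussed above is a simple pattern search. The sketch below is illustrative: the function name `find_motif` is hypothetical, and because the full protein sequences are not reproduced in the text, the test sequence is a placeholder in which only the GKLTH segment at residue 27 is taken from the paper and every other position is filled with alanine.

```python
import re
from typing import List


def find_motif(sequence: str, pattern: str = "G.L.H") -> List[int]:
    """Return 1-based start positions of every (possibly overlapping) motif match.

    The default pattern encodes GxLxH, where '.' stands for any residue.
    """
    # A zero-width lookahead lets finditer report overlapping occurrences.
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", sequence.upper())]


# Placeholder sequence: 26 alanines, the GKLTH segment from the text, then padding.
placeholder_sequence = "A" * 26 + "GKLTH" + "A" * 5

positions = find_motif(placeholder_sequence)
print("GxLxH motif starts at residue(s):", positions)
```

Applied to real family members, the same search would need the actual sequences, for example from the Pfam-B alignment mentioned earlier.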
The YP_749275.1 and YP_001095227.1 structures suggest that these proteins can adopt open and closed conformations (Fig. 2). This situation differs from bacterial STAS proteins, the structures of which are essentially 'closed' with α2 and α3 tightly packed, occluding any possible cavity formation; an open state has not yet been observed in STAS-domain proteins.
The presence of a deep cavity in YP_001095227.1 is similar to the eukaryotic Sec14-like proteins, which also adopt the same SpoIIAA-like fold. Sec14-like proteins have a lipid-binding CRAL-TRIO domain that participates in the transport of hydrophobic substances such as lipids or α-tocopherol. Like YP_749275.1 and YP_001095227.1, yeast and human Sec14 proteins (PDB codes 1aua, 1oiz, 1r5l, 1o6u and 3b7n; Sha et al., 1998; Meier et al., 2003; Min et al., 2003; Stocker et al., 2002; Schaaf et al., 2008) have been crystallized in two alternative conformations which differ in the relative disposition of two helices located at the entrance to a hydrophobic ligand-binding cavity. In particular, human α-TTP (PDB code 1oiz) was obtained in a detergent-bound 'open' structure (Meier et al., 2003) with the 'lid' helices α9 and α11 moved apart (Fig. 5c) as well as in a ligand-bound 'closed' structure (PDB code 1r5l; Min et al., 2003) with α11 shifted towards α9 and blocking the entrance to the cavity (Fig. 5d).

Figure 4. Conserved residues in YP_749275.1 and YP_001095227.1. (a) A sequence alignment with other members of the Pfam PB000640 family (not shown) reveals several conserved residues (marked in grey boxes). Residues from the binding cavity are colored blue and residues that are predicted to penetrate to the lipid bilayer are colored red. These residues are indicated on the structure in (b) for YP_001095227.1 and (c) for YP_749275.1. The main cluster of conserved residues from strand β1 and helices α1 and α2 is shown in purple. The protein backbone is shown in a cartoon representation. The calculated membrane boundary is shown by grey dots. A few additional nonconserved residues involved in hydrophobic interactions (Leu8, Leu74 and Trp65) in YP_749275.1 are shown in orange.

Predicted protein-membrane association
Calculations using the PPM method (Lomize et al., 2006) show that the 'open' conformation of YP_001095227.1 can associate with the lipid bilayer by immersing its exposed nonpolar residues from α2 and α3 into the lipid acyl-chain region. The predicted depth of residue penetration into the hydrophobic core of the membrane is 6.5 ± 0.4 Å and the calculated water-membrane transfer energy (ΔG_transf) is −47.3 kJ mol−1. However, the corresponding protein membrane binding energy is expected to be smaller than the transfer energy, since part of the transfer energy must be spent on protein conformational change. The lipid-interaction residues include Leu67, Trp71, Leu74 and Leu78 from α2, and Leu95, Trp98, Val102, Trp105 and Phe106 from α3 (Fig. 5a).
In contrast, the 'closed' conformation of YP_749275.1 was predicted to form a stable dimer by the PISA server, as shown in Fig. 5(b). However, this dimer is visibly asymmetric as α3 is longer by one helical turn in subunit 2 compared with subunit 1. Furthermore, two hydrophobic residues (Leu67 in α2 and Trp95 in α3) are solvent-exposed in one subunit (indicated in purple) but buried from solvent in another subunit. These two solvent-exposed nonpolar residues may anchor the protein at the hydrophobic boundary of the lipid bilayer (Fig. 5b). However, the calculated depth of penetration is only 1.2 Å and ΔG_transf is only −12.6 kJ mol−1, indicating a weak association.

Predicted protein-protein interactions
A genomic neighborhood search performed using STRING (http://string.embl.de) relates both YP_749275.1 and YP_001095227.1 to a TonB-dependent receptor precursor (a co-occurrence in the same species) and a universal stress protein (UspA) domain (localization in the close genetic neighborhood). One of the homologous proteins (UniProt ID Q083D4_SHEFN) from Pfam-B family PB000640 is fused to a UspA domain. A PSI-BLAST (Altschul et al., 1997) search returns about three dozen bacterial proteins (mostly proteobacterial) using a sequence-identity cutoff of 25%. Most of these proteins have no functional annotation. A few contain a UspA domain, including Q083D4_SHEFN, albeit with low e-value scores. Although the neighborhood-matching and the BLAST search scores are not significant enough to confer a definitive link between these proteins, they may suggest a role of these proteins in the stress-response pathway.

Discussion
YP_749275.1 and YP_001095227.1 can now be assigned to a new bacterial protein family which adopts the SpoIIAA-like structural fold. The structures suggest that these proteins are metamorphic, adopting two distinct conformations (open and closed) which are stabilized under different environmental conditions (Murzin, 2008). We suggest that the predicted anchoring of YP_749275.1 and YP_001095227.1 to the lipid bilayer via helices α2 and α3 would induce a switch from the 'closed' to the 'open' conformation. Both proteins presumably exist as stable water-soluble dimers in their 'closed' conformation which can weakly associate with the lipid bilayer via nonpolar residues from the 'lid' helices (Fig. 5b). Membrane binding would promote protein activation owing to the rearrangement of the 'lid' helices and subsequent dimer dissociation. The membrane-associated protein in the 'open' conformation (Fig. 5a) may then bind amphiphilic ligands which have accumulated at the membrane interface.

Figure 5. [...] Meier et al., 2003). The N-CRAL-TRIO domain is colored yellow and the lipid-binding CRAL-TRIO domain is colored green. Molecules of detergents or bound ligands are colored in dark green here and in (d). (d) α-TTP in the 'closed' conformation (PDB code 1r5l; Min et al., 2003). In all figures, residues that penetrate or are proposed to penetrate the lipid bilayer are colored purple. Calculated boundaries between lipid head groups and the acyl-chain region are shown by grey dots.
The cellular localization of YP_749275.1 and YP_001095227.1 is currently unknown. We suggest that these proteins are cytoplasmic based on the following observations. Firstly, proteins from the same PfamB family are found in Gram-positive bacterial and archaeal species that lack the outer membrane and periplasmic space. Secondly, YP_001095227.1 has a single Cys125 that is likely to remain reduced in the bacterial cytoplasm (Ritz & Beckwith, 2001). Finally, one protein from the family is fused with the UspA domain, which is a cytoplasmic protein involved in the stress-response pathway.
We further suggest that YP_749275.1 and YP_001095227.1 may function either as water-soluble carriers of hydrophobic compounds, similar to the related CRAL-TRIO domains, or as interfacially activated enzymes. If these proteins are ligand carriers, then they can dissociate from the membrane in their ligand-loaded state. If these proteins are interfacially activated enzymes, they would remain membrane-associated while performing their chemical reactions. The presence of conserved Tyr34 and Asp73 residues at the entrance to the hydrophobic cavity may indicate possible enzymatic activity rather than simply the formation of a hydrogen bond to a ligand. For example, similar pairs of hydrogen-bonded residues (usually a Tyr-Glu pair) are found in a glycoside hydrolase (PDB code 2fhr; Watts et al., 2006) and in glycosyltransferases (PDB code 1s2g; Anand et al., 2004). In these enzymes, the Tyr and Glu residues form a nucleophile that interacts with the OH group of the substrate.
The natural ligands for these bacterial proteins remain unknown. The shape and the hydrophobic character of the cavity indicate a binding site for relatively large and poorly soluble compounds such as flavins, naphthoquinones or other substituted heterocycles. It is noteworthy that riboflavin and menaquinone are important for the growth of Shewanella cells on poorly soluble minerals, as they participate in the electron transfer to low-potential electron acceptors (Newman & Kolter, 2000; Marsili et al., 2008). Shewanella produces significant amounts of flavins (riboflavin and riboflavin-5′-phosphate) that mediate extracellular electron transfer, leading to reduction, chelation and uptake of ferric iron by the cells (Marsili et al., 2008). The uptake of iron complexes (with riboflavin or hydroxamate) can be facilitated by the TonB-dependent transport system (Schauer et al., 2008). Therefore, the genomic neighborhood link between YP_749275.1/YP_001095227.1 and the TonB receptor (which also has a high co-occurrence with the nicotinamide mononucleotide transporter PnuC) may be of functional significance. In addition, the connection with UspA domains suggests a possible role of these proteins in the stress-response pathway.