Crystal structure of the putative cell-wall lipoglycan biosynthesis protein LmcA from Mycobacterium smegmatis

The first crystal structure of the putative cell-wall biosynthesis protein LmcA from Mycobacterium smegmatis is reported at 1.8 Å resolution. The structure revealed an elongated β-barrel fold enclosing two distinct cavities, indicating a possible lipid-binding function in lipomannan/lipoarabinomannan biosynthesis.


Introduction
Bacteria of the suborder Corynebacterineae include important human pathogens such as Mycobacterium tuberculosis, M. leprae and Corynebacterium diphtheriae, and nonpathogenic species such as M. smegmatis and C. glutamicum, which serve as useful experimental models. M. tuberculosis infects around one-quarter of the entire human population and causes approximately 1.4 million deaths annually, making it one of the top ten causes of death worldwide (World Health Organization, 2020). A key virulence factor and validated drug target is the unusually hydrophobic, multilayered cell wall of these bacteria, which comprises a diverse variety of lipids with structural roles as well as important functions in interactions with the human host (Brennan & Nikaido, 1995; Jankute et al., 2015).
One group of abundant glycolipids synthesized by all mycobacteria and corynebacteria are the phosphatidyl-myo-inositol mannosides (PIMs). The PIMs also serve as membrane anchors for hyperglycosylated species: lipomannan (LM) and lipoarabinomannan (LAM). These complex surface lipoglycans are essential for the viability and in vivo survival of pathogenic mycobacterial species due to their capacity to modulate host immune responses during infection (Chatterjee & Khoo, 1998; Maeda et al., 2003; Mishra, Driessen et al., 2011; Nigou et al., 2002; Schlesinger et al., 1994; Strohmeier & Fenton, 1999; Vercellone et al., 1998). While many enzymatic steps of the PIM/LM/LAM biosynthetic pathway have been defined, generally through studies using C. glutamicum or M. smegmatis as a model, the mechanisms by which the pathway is regulated, how the various proteins cooperate to synthesize these lipoglycans and how the intermediates are transported through the cell-wall layers remain poorly understood.
Previously, we identified a new membrane protein, conserved in Corynebacterineae, that is required for synthesis of full-length LM and LAM (Cashmore et al., 2017). Deletion of the NCgl2760 gene in C. glutamicum, a useful model organism for the study of cell-wall synthesis in Corynebacterineae, resulted in a complete loss of mature LM/LAM and the appearance of a novel truncated LM (t-LM). Lipid structural studies indicated that the ΔNCgl2760 t-LM comprised a series of short LM species containing a truncated α(1→6)-linked mannose backbone with greatly reduced α(1→2) mannose side chains. These t-LM species were structurally similar to those of a C. glutamicum mutant lacking the MptA mannosyltransferase that extends the α(1→6) mannan backbone of LM intermediates, indicating that both proteins may act at a similar point in the pathway for LM (Cashmore et al., 2017; Fig. 1a). C. glutamicum NCgl2760 has putative orthologs in M. smegmatis (MSMEG_0317) and M. tuberculosis (Rv0227c), both of which are encoded by essential genes (Cashmore et al., 2017; Griffin et al., 2011; Sassetti et al., 2003). Rv0227c has been localized to the bacterial surface and implicated in host cell entry by M. tuberculosis (Rodríguez et al., 2012), but is otherwise unstudied.
NCgl2760, MSMEG_0317 and Rv0227c, which we collectively term LmcA, lack significant amino-acid sequence similarity to other proteins, making their function difficult to predict. To gain structural insight into the LmcA family, here we report the first crystal structure of the major domain of M. smegmatis LmcA at 1.8 Å resolution. Our crystal structure reveals an elongated β-barrel fold enclosing two distinct cavities. Xenon derivatization of the crystals further identified structural elements within the β-barrel whose conformational flexibility may allow access to the cavities. The AlphaFold2-derived M. tuberculosis Rv0227c model revealed the same elongated β-barrel fold, consistent with our experimentally derived crystal structures, highlighting the accuracy of AlphaFold2-based predictions. While the AlphaFold2-modelled structure of the C. glutamicum ortholog NCgl2760 predicts a much smaller β-barrel fold, the most striking feature common to all three LmcA proteins is an enclosed central cavity, suggesting a common mechanism of ligand binding.
Figure 1. (a) [...] (Guerin et al., 2009; Lea-Smith et al., 2008) and PatA (Korduláková et al., 2003) to produce AcPIM2 from phosphatidylinositol (PI). Further mannosylation yields AcPIM4, which is transported to the periplasm and can be processed by the mannosyltransferase PimE to form AcPIM6, an end product, or channelled into a parallel pathway for LM and LAM synthesis by the lipoprotein LpqW (Kovacevic et al., 2006; Marland et al., 2006). LM/LAM synthesis is catalysed by the PPM-dependent mannosyltransferases MptB, MptA and MptC (Kaur et al., 2006, 2007; Mishra et al., 2007, 2008; Mishra, Krumbach et al., 2011). A phospholipid-binding protein, LmeA (Rahlwes et al., 2017), is involved in maintaining MptA under stress conditions (Rahlwes et al., 2020). The focus of the current study, LmcA (underlined), also functions at the MptA step in C. glutamicum (Cashmore et al., 2017). (b) The MSMEG_0317 genetic locus. The MSMEG_0317 gene is encoded within a locus that is highly conserved in Corynebacterineae. Likely orthologous genes in the three species are shown using the same colour. Previously studied genes are tmaT (Yamaryo-Botte et al., 2015) and mtrP (Rainczuk et al., 2020), both with roles in cell-wall mycolic acid transport, and the LM/LAM biosynthesis gene NCgl2760 (Cashmore et al., 2017), while the remaining genes are uncharacterized. The focus of the current study is boxed. (c) Predicted membrane topology of MSMEG_0317. Following cleavage of the putative signal peptide (red), the mature protein is proposed to comprise a large periplasmic N-terminal domain, a single transmembrane domain and a small cytoplasmic tail. (d) The elution profile of MSMEG_0317Δ on a HiLoad 16/60 Superdex 75 gel-filtration column, suggesting a monomeric protein (top), and SDS-PAGE analysis of the eluted MSMEG_0317Δ (~34 kDa) (bottom). The molecular-weight markers used for calibration are bovine γ-globulin (158 kDa), chicken ovalbumin (44 kDa) and equine myoglobin (17 kDa). See also Supplementary Fig. S1.
The cells were resuspended in lysis buffer consisting of 20 mM Tris pH 7.5, 500 mM NaCl, 10%(v/v) glycerol, 5 mM imidazole, 0.1% Thesit supplemented with cOmplete EDTA-free Protease Inhibitor Cocktail (Roche) and lysed by sonication. The lysate was clarified by centrifugation at 45 000g and 4 °C for 30 min, and the supernatant was filtered and loaded onto 1 ml Ni-NTA resin (Roche). After extensive washes with wash buffer [20 mM Tris pH 7.5, 500 mM NaCl, 10%(v/v) glycerol, 5 mM imidazole], the protein was eluted in wash buffer supplemented with 150 mM imidazole. MSMEG_0317Δ-containing fractions were subjected to size-exclusion chromatography (SEC; Superdex 75 16/600, Cytiva) in SEC buffer [20 mM Tris pH 7.5, 200 mM NaCl, 5%(v/v) glycerol]. Fractions containing MSMEG_0317Δ protein were pooled and further purified by anion-exchange chromatography. MSMEG_0317Δ was diluted with buffer A [20 mM Tris pH 7.5, 5%(v/v) glycerol] and loaded onto a MonoQ 1/10 GL column (Cytiva) pre-equilibrated in buffer A. MSMEG_0317Δ was eluted with a gradient of buffer B [20 mM Tris pH 7.5, 1 M NaCl, 5%(v/v) glycerol] over 15 column volumes. MSMEG_0317Δ-containing fractions were pooled, concentrated to 5 mg ml⁻¹ and flash-frozen for storage at −80 °C.
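As a rough illustration of the anion-exchange step, the sketch below computes the NaCl concentration reaching the column at a given point in a linear gradient between buffer A (no NaCl) and buffer B (1 M NaCl) over 15 column volumes. The start and end percentages of buffer B are not stated in the text, so a 0-100% gradient is assumed here; the function name and its parameters are hypothetical and for illustration only.

```python
def nacl_during_gradient(cv_elapsed, total_cv=15.0,
                         nacl_a_mm=0.0, nacl_b_mm=1000.0,
                         start_pct_b=0.0, end_pct_b=100.0):
    """Return the NaCl concentration (mM) after `cv_elapsed` column volumes.

    Assumption (not from the paper): the gradient is linear and runs from
    `start_pct_b` to `end_pct_b` percent buffer B over `total_cv` volumes.
    """
    if not 0.0 <= cv_elapsed <= total_cv:
        raise ValueError("cv_elapsed must lie within the gradient")
    fraction = cv_elapsed / total_cv
    pct_b = start_pct_b + fraction * (end_pct_b - start_pct_b)
    # Linear mixing of the two buffers at the current percentage of B.
    return nacl_a_mm + (nacl_b_mm - nacl_a_mm) * pct_b / 100.0


if __name__ == "__main__":
    for cv in (0.0, 3.0, 7.5, 15.0):
        print(f"{cv:5.1f} CV -> {nacl_during_gradient(cv):7.1f} mM NaCl")
```

Under these assumptions the salt concentration rises by about 67 mM per column volume, which gives a feel for how shallow the gradient is.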

Crystallization, data collection and structural determination
Poor-quality crystals of MSMEG_0317Δ were initially obtained through a random screen conducted at the C3 CSIRO facility in 2.7 M ammonium sulfate, 50 mM Tris pH 8.5. The crystals were then optimized by repeated rounds of microseeding and buffer optimization using the vapour-diffusion method. The best crystals were obtained by mixing 0.5 µl protein solution at 10 mg ml⁻¹ with 0.5 µl reservoir solution consisting of 2.2 M ammonium sulfate, 50 mM Tris pH 7.0 in the presence of microseeds. Crystals were flash-cooled in liquid nitrogen in reservoir solution supplemented with 10%(v/v) glycerol. For iodide phasing, crystals were soaked in 0.25-0.5 M potassium iodide solution prior to cooling.
X-ray diffraction data for native MSMEG_0317Δ and iodide-derived MSMEG_0317Δ crystals were collected on the MX2 and MX1 beamlines at the Australian Synchrotron (Aragão et al., 2018; Cowieson et al., 2015), respectively. The native data were collected to 2.0 Å resolution at a wavelength of 0.9537 Å (referred to as MSMEG_0317Δ old native data in Table 1). SAD data were collected at a wavelength of 1.4586 Å from iodide-derived MSMEG_0317Δ crystals (referred to as MSMEG_0317Δ-KI in Table 1). Two data sets were collected from the iodide-soaked crystal at the same position of the crystal, but with an offset of 0.25° in oscillation range for the second data set. All diffraction data were processed using XDS (Kabsch, 2010) in space group P1 and the iodide data sets were merged in AIMLESS within the CCP4 suite (Evans & Murshudov, 2013; Winn et al., 2011). Automated experimental phasing was carried out using the single-wavelength anomalous diffraction (SAD) phasing protocol of Auto-Rickshaw (Panjikar et al., 2005, 2009). The input diffraction data were prepared and converted for use in Auto-Rickshaw using programs from the CCP4 suite. 35 iodine sites with partial occupancy were bound to the protein.
The native data were then run through the MR protocol of Auto-Rickshaw and the model was refined with REFMAC. The resultant model contained 95% of the total residues. Subsequently, we collected a new native data set at 1.8 Å resolution, which replaced the original 2.0 Å resolution data set (referred to as MSMEG_0317Δ new native data in Table 1). The R free set was copied from the original 2.0 Å resolution data and the model was further improved using manual model building in Coot (Emsley et al., 2010) and refinement in BUSTER (Bricogne et al., 2017). The final refined model has 98% of residues in the favoured region and 2% in allowed regions of the Ramachandran plot.
A xenon pressure cell (Hampton Research) available at the Australian Synchrotron was used to pressurize MSMEG_0317Δ crystals with xenon before cryocooling following previously described protocols (Panjikar & Tucker, 2002). An MSMEG_0317Δ crystal in the loop was lowered into the xenon chamber and kept moist by placing ~500 µl of the crystallization well solution (2.2 M ammonium sulfate, 50 mM Tris pH 7.0) at the bottom of the chamber. The chamber was held at 20 bar of xenon gas for 1 min and the gas was then released slowly. Soon afterwards, the looped crystal was plunge-cooled in liquid nitrogen. Data from the xenon-pressurized MSMEG_0317Δ crystal were collected at a wavelength of 0.9537 Å on the MX2 beamline of the Australian Synchrotron (referred to as MSMEG_0317Δ-Xe in Table 1). Two data sets were collected from different positions of the same crystal and processed using XDS in space group P1, followed by merging and scaling using AIMLESS. The crystal diffracted to 1.8 Å resolution, allowing xenon binding sites to be located unambiguously. The MSMEG_0317Δ-Xe structure was solved by the molecular-replacement method using Phaser (McCoy et al., 2007) in the CCP4 suite, using the native MSMEG_0317Δ structure as the search model, followed by model building and refinement in Coot and BUSTER, respectively.
All structures were validated using MolProbity (Williams et al., 2018). All molecular-graphics representations were created using PyMOL (version 2.3.4; Schrödinger). The topology diagram was generated using Pro-origami (Stivala et al., 2011).
CASTp was used for cavity analysis (Tian et al., 2018). X-ray diffraction data-collection and refinement statistics are reported in Table 1. Coordinates and structure factors for native MSMEG_0317Á and MSMEG_0317Á-Xe have been deposited in the Protein Data Bank (PDB) with accession codes 7n3v and 7shw, respectively.
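The two data-collection wavelengths quoted in this section can be related to photon energy through E = hc/λ. The short sketch below performs this conversion; the function name is hypothetical, and the constant is the standard value of hc expressed in keV Å. It is a convenience calculation only and is not part of the processing pipeline described above.

```python
# h*c expressed in keV * Angstrom (from the CODATA values of h and c).
HC_KEV_ANGSTROM = 12.398419843


def wavelength_to_energy_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in Angstrom to photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


if __name__ == "__main__":
    # Wavelengths taken from the text: native/xenon data and iodide SAD data.
    for label, wavelength in (("native / Xe", 0.9537), ("iodide SAD", 1.4586)):
        energy = wavelength_to_energy_kev(wavelength)
        print(f"{label}: {wavelength} A = {energy:.2f} keV")
```

With these inputs the native and xenon data correspond to roughly 13.0 keV and the iodide SAD data to roughly 8.5 keV.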

Results and discussion
LmcA is a putative membrane protein that is well conserved among mycobacteria and corynebacteria
Our previous studies on C. glutamicum LmcA (NCgl2760) provided strong evidence for a role in the formation of full-length LM and LAM (Cashmore et al., 2017). NCgl2760 is encoded by a genetic locus that is well conserved in the Corynebacterineae suborder (Fig. 1b) and is dedicated to cell-wall synthesis (Rainczuk et al., 2020; Yamaryo-Botte et al., 2015). In sequence-similarity searches, MSMEG_0317 was the best match for NCgl2760 in the M. smegmatis genome, with the proteins sharing 24% amino-acid sequence identity (Supplementary Fig. S1). This, combined with the genome synteny across multiple species, suggests that the proteins are orthologs. MSMEG_0317 and M. tuberculosis Rv0227c display higher identity (65%; Supplementary Fig. S1), as expected for proteins from species belonging to the same genus.
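For readers who wish to reproduce figures such as the 24% and 65% identities from an alignment, a minimal sketch of a percent-identity calculation is given below. The denominator convention is an assumption: here identity is counted over alignment columns in which neither sequence has a gap, whereas other tools divide by the full alignment length or by the shorter sequence, so the values may differ slightly from those reported. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns containing a gap in either sequence are ignored,
    so the denominator is the number of ungapped aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a.upper() == res_b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment, not taken from the LmcA sequences.
    print(percent_identity("MKT-LV", "MRTALV"))
```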

Expression, purification and structural determination
MSMEG_0317 is predicted to contain a signal peptide, a large periplasmic domain and a single transmembrane domain (residues 324-349) located towards the C-terminal end (Fig. 1c). To gain insight into the structure of MSMEG_0317, we focused on the periplasmic domain and produced a truncated form that lacks the predicted signal peptide and transmembrane domain, referred to as MSMEG_0317Δ (residues 30-323), using a M. smegmatis expression system (Triccas et al., 1998). Deletion of both the signal peptide and the transmembrane domain yielded a stable soluble form (Fig. 1d). MSMEG_0317Δ eluted as a monomer on size-exclusion chromatography and yielded diffraction-quality crystals after several rounds of seeding (MSMEG_0317Δ old and new native data; Table 1). As structural homologs of MSMEG_0317 have not previously been characterized, we next soaked the native crystals with varying concentrations of halide ions, such as bromide and iodide. Crystals were able to tolerate 0.25 M potassium iodide without losing their crystalline order, allowing the collection of a SAD data set (MSMEG_0317Δ-KI; Table 1). The crystal structure of MSMEG_0317Δ was solved to 2.4 Å resolution and refined against the native data set using Auto-Rickshaw followed by refinement in REFMAC5, resulting in an almost complete model in space group P1 with two molecules in the asymmetric unit (Table 1, Supplementary Fig. S2).

MSMEG_0317Δ adopts an elongated β-barrel fold
MSMEG_0317Δ adopts an elongated β-barrel core composed of 11 antiparallel β-strands with two β-turns and one α-helix extending away from the core (Fig. 2, Supplementary Fig. S3). The N-terminal region folds back and interacts with the C-terminal α-helix located just before the transmembrane domain to form a closed structure that resembles the shape of a 'cone with a flake', with the cone being the β-barrel core and the extended α-helix being the flake. All the loops connecting the β-strands and β-turns are ordered except for residues 129-154 within loop 6, which connects β5 and β6 (Fig. 2b). The wall of the β-barrel core is formed by two sets of antiparallel flat or twisted β-strands (Fig. 2b). The first set of antiparallel β-strands comprises β1, β3, β4 and β5, which form one side of the β-barrel wall, and the second comprises β6, β7, β8, β11, β12 and β13, which form the opposite wall. Of these, β-strands β1, β3, β4, β6, β7 and β12 adopt twisted conformations to various degrees due to the presence of a glycine or a proline (Fig. 2b). Each β-strand is connected to the subsequent β-strand through hydrogen-bond interactions, except for β8 and β9, which do not interact with each other directly but instead interact with β11 (Fig. 2, Supplementary Figs. S3 [...]). As the first strand β1 interacts with the last strand β13, the MSMEG_0317Δ fold resembles a closed toroidal β-barrel. The narrow base of the β-barrel core is occupied by β2, β10, β9 and the end of β11. As expected, the overall electrostatic potential of the MSMEG_0317Δ surface reveals a net positive charge near to the transmembrane domain attributed to the presence of Lys38 of loop 1, Arg122 of loop 6 and Arg317 of α14 (Supplementary Fig. S4b).
Interestingly, the surface electrostatics of residues on the surface of α14, β1, β3, β4 and β5 show an overall negative charge compared with surface residues in β8, β11, β12 and β13, suggesting that these may represent distinct surfaces for the binding of interacting partners within the LM/LAM pathway.

The structure of MSMEG_0317Δ reveals two enclosed cavities
A structure-comparison search of the Protein Data Bank using the DALI server (Holm, 2020) suggested structural similarity (Z-score 11-12, an indicator of structural similarity) to members of the CD36 superfamily of scavenger receptor proteins, including the human lysosomal integral membrane protein 2 (LIMP-2) and CD36, a fatty-acid transporter. The overall shape of MSMEG_0317Δ has similarity to LIMP-2 and CD36, which also adopt an asymmetric β-barrel core (Fig. 3a). Interestingly, the three-helix bundle atop the extended β-strands in LIMP-2 and CD36 is absent in MSMEG_0317Δ; instead, a single α-helix (α14) protrudes out from the β-barrel core. Like LIMP-2 and CD36, MSMEG_0317Δ encloses central cavities that span the entire length of the molecule (Fig. 3b). However, unlike CD36 (PDB entry 5lgd), no additional electron density within the cavity that corresponds to a hydrocarbon chain was detected in MSMEG_0317, despite it being expressed in its native host M. smegmatis.
The central cavity in MSMEG_0317Δ (cavity 1, volume 340 Å³) adopts an uneven shape and is lined by several hydrophobic as well as charged residues (Table 2, Fig. 3b, Supplementary Fig. S5). Cavity 1 has two openings: entrance 1 and entrance 2 (Figs. 3a and 3b). Entrance 1, which is predicted to be located close to the membrane in the native protein, has an opening of ~8 Å (distance measured between the side chains of Glu314 and Thr45 and between Ala310 and Ile43) and is lined by Gln307, Ala310 and Glu314 of α14, Arg163 of loop 6, Ile43 of loop 1 and Thr45 of β1 (Fig. 3c). Interestingly, Glu314 of α14 forms a salt-bridge interaction with Arg163 of loop 6; this interaction is likely to contribute to the narrow opening of this cavity and to hold the α14 helix in its conformation protruding out of the β-barrel core. Entrance 2 of cavity 1 has a wider opening of ~10 Å (distance measured between the side chains of Gln181 and Leu114 and between Leu155 and Asp227) and is surrounded by Leu114 and Asp116 of β5, His157 and Leu155 of loop 6, Asp226 and Tyr224 of loop 9 and Gln181 of β7 (Fig. 3d). This entrance is in the vicinity of the disordered region of loop 6 (129-154), which is likely to affect the opening and closing of this entrance. In addition to the central cavity, there is a smaller cavity (cavity 2, volume 41 Å³) at the base of the barrel surrounded by the two β-turns β2 and β10, strands β9 and β11 and the tips of β1, β3, β12 and β13 (Fig. 3e, Table 2). Together, these two cavities span the entire length of the MSMEG_0317Δ molecule.

Xenon derivatization of the MSMEG_0317Δ crystal reveals conformational flexibility
To gain further insight into the potential roles of the enclosed cavities in MSMEG_0317Δ and investigate their hydrophobicity and potential to bind lipids, we pressurized MSMEG_0317Δ crystals in a xenon pressure cell (Australian Synchrotron) before cryocooling, following established protocols (Panjikar & Tucker, 2002). Xenon is known to rapidly diffuse into hydrophobic pockets of proteins with high occupancy, which permits structure determination and the identification of hydrophobic channels (Schiltz et al., 2003). We solved the xenon-pressurized MSMEG_0317Δ crystal structure to 1.8 Å resolution in space group P1 (MSMEG_0317Δ-Xe; Table 1). Overall, the conformation of MSMEG_0317Δ-Xe is very similar to the original crystal structure, with two molecules in the asymmetric unit (monomer 1, root-mean-square deviation of 0.399 Å over 199 Cα atoms; monomer 2, root-mean-square deviation of 0.215 Å over 204 Cα atoms). However, changes in β3, β4 and β5 were noted: strand β3 was shorter and more flexible in the MSMEG_0317Δ-Xe structures, while the β4 and β5 strands were longer (Fig. 4a). Five xenon sites [anomalous peaks Xe 1 (6.3σ), Xe 2 (9.6σ), Xe 3 (10.9σ), Xe 4 (8.4σ) and Xe 5 (8.4σ)] were identified within the two monomers of the asymmetric unit and were refined (Fig. 4a, Supplementary Fig. S6a). Of these, Xe 1 and Xe 3 occupied an identical position in the central cavity within the two monomers. However, while the three remaining xenon sites identified were all located within cavity 2 at the base of the β-barrel core, their exact positions within the cavity differ (Xe 2, Xe 4 and Xe 5; Supplementary Fig. S6b). Importantly, the binding of Xe atoms to cavity 2 (Xe 2, Xe 4 and Xe 5) [...] notable conformational change in loop 9 (residues 222-229). Consequently, a distinctly charged motif within loop 9 (E²²⁵DDAD²²⁹) is disordered in both monomers (Fig. 4b). In our MSMEG_0317Δ crystal structure, the electron density of loop 9 is well resolved except for the side chains of Asp226 and Asp227.
Loop 9 in this conformation is stabilized by a number of van der Waals interactions, including those of Tyr224 in loop 9 with Gln181 in β7 and of Ala228 in loop 9 with Tyr292 in α14, and a hydrogen-bond interaction between the main chain of Asp229 in loop 9 and the hydroxyl group of Tyr281 in β12. Xenon pressurization led to the opening of cavity 2 and destabilization of these interactions, resulting in a disordered loop 9 (Fig. 4b). Additionally, the different positions of the Xe atoms in this region between the two monomers result in slightly different conformations of loop 3, loop 11 and the β2 turn, especially residues Phe56, Leu61 and Val62 (Fig. 4b).
Overall, xenon binding revealed plasticity of loop 9 and its surrounding region and indicated that loop 9 may adopt alternate conformations depending on ligand binding. In contrast to cavity 2, the binding of Xe atoms within cavity 1 did not result in changes in the side-chain conformation of the residues surrounding the xenon, with the exception of Leu114, and did not significantly increase the volume of cavity 1 (Supplementary Fig. S6c). A longer incubation time in the pressure chamber did not result in additional xenon sites, suggesting that most of the conformational flexibility due to xenon binding occurs near the base of the β-barrel core and especially in the conformation of loop 9.
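The root-mean-square deviations quoted in this section are computed over matched Cα atoms. The sketch below shows only the final step of such a calculation, on the assumption that the two coordinate sets have already been paired atom by atom and superposed; it does not perform the superposition itself, and it is not the program used to obtain the published values. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. Angstrom).

    Assumes `coords_a` and `coords_b` are equal-length sequences of
    (x, y, z) tuples for matched atoms that are already superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total_sq = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total_sq += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total_sq / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates, not taken from the deposited structures.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(reference, shifted))
```

Because the value depends on which atoms are paired, the number of Cα atoms used (for example 199 or 204 above) should always be reported alongside the deviation.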
AlphaFold2-predicted structures of MSMEG_0317 and M. tuberculosis Rv0227c support conformational flexibility
While this manuscript was in preparation, the AlphaFold Protein Structure Database became available, providing predicted three-dimensional protein structures for the human proteome and for 20 other organisms, including M. tuberculosis. We therefore used AlphaFold2 to predict the three-dimensional structure of M. tuberculosis Rv0227c (UniProt P96409), the closest MSMEG_0317 homolog (Supplementary Fig. S1). Rv0227c is annotated as a 'probable conserved membrane protein', and most of the predicted structure has a very high (>90) per-residue confidence score (pLDDT; Fig. 5a; Supplementary Fig. S7a). The predicted AlphaFold2 structure of Rv0227c (referred to as AF Rv0227c) is very similar to the MSMEG_0317Δ crystal structure (root-mean-square deviation of 0.618 Å over 210 Cα atoms), suggesting a high 3D structural similarity (Fig. 5a, Supplementary Fig. S7), despite the two proteins sharing only 65% sequence identity (Supplementary Fig. S1). Importantly, the predicted AlphaFold2 structure of MSMEG_0317 (referred to as AF MSMEG_0317) is very similar to our MSMEG_0317Δ crystal structure (root-mean-square deviation of 0.546 Å over 210 Cα atoms), further validating our experimental structural data (Fig. 5b, Supplementary Fig. S7b).
Both the AF Rv0227c and AF MSMEG_0317 models adopt an elongated β-barrel core (Figs. 5a and 5b; Supplementary Fig. S7). However, in addition to the 11 β-strands seen in the MSMEG_0317Δ crystal structure, the AlphaFold2-predicted models have two additional short β-strands within loop 6 (Figs. 5c and 5d); loop 6 is fully resolved in the models, but with a lower pLDDT score (70-90) in this region. In our MSMEG_0317Δ structure, loop 6 (residues 129-154) is disordered and would clash with the symmetry-related molecule if it were to adopt the conformation seen in the AlphaFold2 models (Figs. 2, 5c and 5d). This suggests that loop 6 is likely to adopt alternate conformations, as suggested by the lower pLDDT score. Moreover, the AlphaFold2 models reveal a third β-turn in loop 5, in addition to the two β-turns seen in our experimental MSMEG_0317Δ structure, although with a lower pLDDT score (70-90), again indicative of flexibility (Figs. 5c and 5d). Among the other loops connecting the secondary structures, the conformations of loops 2, 3, 4, 7, 8, 10, 12, 13 and 14 in the AlphaFold2 models are almost identical to those in the MSMEG_0317Δ crystal structure, while the conformations of loops 5, 9 and 11 vary (Figs. 5c and 5d). Of these loops, the conformation of loop 9, which is located at the base of the β-barrel core (cavity 2), deviates the most from our experimental crystal structure and adopts a more 'open' or 'out' conformation compared with the 'closed' or 'in' conformation seen in the MSMEG_0317Δ crystal structure (Figs. 5c and 5d). In the AF MSMEG_0317 model, loop 9 in the 'open' conformation is stabilized by van der Waals interactions between Val138 and Pro141 in loop 6 and Leu223 and Tyr224 in loop 9 (also conserved in AF Rv0227c) and a salt-bridge interaction of Lys140 in loop 6 with Glu225 in loop 9 (not conserved in AF Rv0227c, where lysine is replaced by an alanine) (Supplementary Fig. S7d).
While loop 6 is disordered in the MSMEG_0317Δ and MSMEG_0317Δ-Xe crystal structures, direct comparison of the loop 9 conformation in MSMEG_0317Δ (closed conformation), MSMEG_0317Δ-Xe (disordered) and the AlphaFold2-predicted models (open conformation) suggests that loops 9 and 6 may have an interdependent role in opening or closing of the cavity. It is thus likely that when loop 6 adopts the conformation seen in the AlphaFold2-predicted models, loop 9 is in an 'open' conformation. Together, our experimentally derived data and the AlphaFold2 models support the hypothesis that the solved crystal structure of MSMEG_0317Δ represents a 'closed' conformation.
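The pLDDT ranges used in this section (>90 for the core, 70-90 for loops 5 and 6) correspond to the confidence bands commonly used for AlphaFold models. The following sketch bins per-residue scores into those bands; the handling of the exact boundary values, the function names and the example scores are assumptions made for illustration and are not taken from the models discussed here.

```python
from collections import Counter


def plddt_band(score):
    """Assign a pLDDT score (0-100) to a confidence band.

    Assumption: boundary values 70 and 50 are placed in the higher band,
    and exactly 90 falls in the 70-90 band, matching the '>90' wording.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError("pLDDT must lie between 0 and 100")
    if score > 90.0:
        return "very high"
    if score >= 70.0:
        return "confident"
    if score >= 50.0:
        return "low"
    return "very low"


def band_counts(scores):
    """Count how many residues fall into each confidence band."""
    return dict(Counter(plddt_band(score) for score in scores))


if __name__ == "__main__":
    # Hypothetical per-residue scores, for illustration only.
    example_scores = [96.2, 93.5, 88.0, 74.1, 61.0, 42.3]
    print(band_counts(example_scores))
```

Regions that fall into the 70-90 band, such as loop 6 in the models above, are modelled with less certainty and are reasonable candidates for flexibility.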

Loop conformational flexibility and cavity size
We next analysed the impact of the conformations of loops 6 and 9 on the size of the enclosed cavity. A CASTp analysis of our crystal structure highlighted two separate cavities: cavity 1 (340 Å³) and cavity 2 (41 Å³) (Fig. 3b). In the AF MSMEG_0317 model, the ordered conformation of loop 6 combined with the 'out' conformation of loop 9 results in an increase in the size of these two cavities (cavity 1, 538 Å³; cavity 2, 165 Å³; Fig. 5e). Interestingly, CASTp analysis of the Rv0227c AlphaFold model predicted a single, large cavity occupying the entire length of the molecule (720 Å³; Fig. 5e). Despite the relatively high conservation of the residues surrounding the cavities between MSMEG_0317Δ and Rv0227c (Table 2, Supplementary Fig. S1), the sequence differences between Rv0227c and MSMEG_0317, the differences in the conformations of the loops, especially loops 6 and 9, and the conformation of the α14 helix are likely to influence the shape and the volume of these cavities (Fig. 5). While xenon derivatization of crystals did not clearly identify a hydrophobic channel and further work will be required to identify the native ligand that directly binds to MSMEG_0317Δ, xenon derivatization and analysis of the AlphaFold2 models have enabled the identification of elements that may allow 'open' or 'closed' conformations in MSMEG_0317Δ.
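To put the cavity volumes reported in this section on a common scale, the short calculation below expresses the AF MSMEG_0317 volumes as fold changes relative to the crystal structure. The volumes are the CASTp values quoted in the text; the dictionary layout and function name are hypothetical.

```python
# CASTp cavity volumes (cubic Angstrom) as quoted in the text.
CRYSTAL = {"cavity 1": 340.0, "cavity 2": 41.0}
AF_MSMEG_0317 = {"cavity 1": 538.0, "cavity 2": 165.0}
AF_RV0227C_SINGLE_CAVITY = 720.0


def fold_changes(reference, other):
    """Return other/reference for every cavity named in `reference`."""
    return {name: other[name] / reference[name] for name in reference}


if __name__ == "__main__":
    for name, ratio in fold_changes(CRYSTAL, AF_MSMEG_0317).items():
        print(f"{name}: {ratio:.2f}-fold larger in AF MSMEG_0317")
    combined = sum(AF_MSMEG_0317.values())
    print(f"combined AF MSMEG_0317 volume: {combined:.0f} cubic Angstrom "
          f"(AF Rv0227c single cavity: {AF_RV0227C_SINGLE_CAVITY:.0f})")
```

On these numbers cavity 1 is about 1.6-fold larger and cavity 2 about fourfold larger in the model, and the combined volume of the two cavities (703 Å³) is close to that of the single Rv0227c cavity.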
The β-barrel fold is predicted to be conserved in C. glutamicum NCgl2760
C. glutamicum NCgl2760 is the best match for MSMEG_0317 in the C. glutamicum proteome, with the proteins sharing 24% amino-acid sequence identity (Supplementary Fig. S1). Despite the modest sequence identity, both proteins are encoded by the same well conserved cell-wall biosynthesis locus (Fig. 1b), providing further evidence that they are orthologs. To understand the structural basis of this conservation, we next used AlphaFold2 to generate a model of NCgl2760 (Fig. 6). The root-mean-square deviation of NCgl2760 with MSMEG_0317 is 1.401 Å over 98 Cα atoms. The NCgl2760 model adopts a much smaller β-barrel core, with 12 β-strands, compared with the extended β-barrel core seen in MSMEG_0317 and Rv0227c. Despite this, the positions of strands β1, β3, β4, β5, β7, β8, β11, β12 and β13 in MSMEG_0317Δ align with β-strands in NCgl2760, with the exception of the β9 strand (Supplementary Fig. S8). The β9 strand, which is connected to the β8 strand through loop 9, is located at the base of the β-barrel core in MSMEG_0317Δ and Rv0227c. In contrast, in NCgl2760 loop 9 is much shorter and the β9 strand is part of the main β-barrel core. Interestingly, the two additional β-strands seen in loop 6 in the AlphaFold2 models of Rv0227c and MSMEG_0317 are also present in NCgl2760 (Fig. 6). Like MSMEG_0317Δ and Rv0227c, NCgl2760 encloses a central cavity, albeit with a different shape and volume (119 Å³ [...]
Figure 6. AlphaFold2-derived prediction of NCgl2760 (AF NCgl2760). AF NCgl2760 adopts a smaller β-barrel core compared with MSMEG_0317 and Rv0227c; however, the central cavity is still a conserved feature. Note that the N-terminal helix may represent a signal peptide. See also Supplementary Figs. S8 and S9.

LmcA structures suggest potential functions in cell-wall lipoglycan synthesis in Corynebacterineae
A role for LmcA in cell-wall synthesis was initially proposed based on the phenotypic characterization of an NCgl2760 null mutant of C. glutamicum. This strain lacks all full-length LM and LAM lipoglycans and accumulates a truncated LM species, a phenotype that is mirrored by an mptA mutant lacking a key mannosyltransferase responsible for synthesizing the mannan backbone (Cashmore et al., 2017). The putative orthologs of NCgl2760 in mycobacteria (Rv0227c and MSMEG_0317) are essential for bacterial growth (Cashmore et al., 2017; Sassetti et al., 2003), hampering their characterization. While several theoretical functions of LmcA could explain the NCgl2760 mutant phenotype, our structural characterization of the LmcA family points to a possible lipid-binding function for these proteins. Specifically, the structural similarity between MSMEG_0317Δ and CD36 with palmitate bound in a central cavity raises the possibility that the LmcA family may also bind palmitate or a lipid of similar chain length. Despite significant heterogeneity (Klatt et al., 2018), the lipid core of all PIM/LM/LAM species contains at least one palmitate, and the most abundant species contain two palmitate chains (for example AcPIM2). Unlike mycobacteria, C. glutamicum synthesizes a second class of lipoglycans termed Cg-LM-B, which are structurally related to PIM/LM/LAM but are instead based on an α-D-glucopyranosyluronic acid-(1→3)-glycerol anchor (Lea-Smith et al., 2008; Tatituri, Illarionov et al., 2007; Tatituri, Alderwick et al., 2007). These anchors comprise two palmitate chains, and synthesis of Cg-LM-B lipoglycans is also compromised in an NCgl2760 null mutant (Cashmore et al., 2017). A requirement to accommodate two structurally different lipid anchors could explain the structural differences between NCgl2760 and the more closely related MSMEG_0317/Rv0227c proteins.
To test whether palmitate can bind to MSMEG_0317, we attempted to crystallize MSMEG_0317Á in the presence of lipids such as palmitic acid (C16 carbon chain composition), phosphatidylglycerol (C8 carbon chain composition) and phosphatidylinositol (C8 carbon chain composition). While these crystals diffracted to high resolution, no additional electron density corresponding to the lipids was observed, consistent with the notion that the structure obtained may represent a 'closed' conformation and structural change may be required to allow lipid binding. An alternative hypothesis is that the LmcA family binds the mannose donor for LM/LAM biosynthesis, polyprenylphosphomannose; however, its lipid component is structurally distinct from palmitate. Overall, we speculate that lipoglycan-bound LmcA may interact with the MptA mannosyltransferase to catalyse the synthesis of the mannan backbone of LM, but further experiments are required to identify the true ligand of LmcA and investigate its interactions with other proteins of the LM/LAM pathway.

Concluding remarks
Here, we report the first crystal structure of the M. smegmatis ortholog of LmcA, MSMEG_0317, at 1.8 Å resolution. The crystal structure of the periplasmic domain of MSMEG_0317 revealed an elongated β-barrel fold which encloses two distinct cavities. The availability of AlphaFold2 has allowed us to directly compare our experimental MSMEG_0317Δ crystal structure with AlphaFold2-derived models of putative LmcA orthologs from M. tuberculosis (Rv0227c) and C. glutamicum (NCgl2760). Our study revealed three key structural features. Firstly, we identified that all three LmcA proteins adopt a β-barrel fold. In MSMEG_0317 and Rv0227c, which share 65% sequence identity, the β-barrel core adopted an elongated fold, while in NCgl2760, which shares 24% sequence identity with MSMEG_0317, the β-barrel core was significantly smaller. Secondly, by comparing our crystal structure with AlphaFold2-derived models of Rv0227c and NCgl2760 we have shown that the central cavity enclosed by the β-barrel fold is a common feature of the LmcA family. Thirdly, through xenon derivatization of the MSMEG_0317 crystal structure we have identified structural elements within the β-barrel that show conformational flexibility, allowing 'open' or 'closed' conformations that may drive access to the enclosed cavities. Further work will be required to identify the authentic ligand that binds to the LmcA family; however, the observed structural features suggest a lipid-binding function for LmcA and provide clues to the flexible regions where conformational changes may occur upon ligand binding.