Crystal structure of the TLDc domain of human NCOA7-AS

The crystal structure of the TLDc domain of the human NCOA7-AS protein, which acts as an interferon-induced antiviral inhibitor, is reported.


Introduction
Nuclear receptor coactivator 7 (NCOA7) belongs to the TLDc [Tre2/Bub2/Cdc16 (TBC), lysin motif (LysM), domain catalytic] domain-containing family of proteins. In human and mouse, seven TLDc domain-containing proteins have been reported, among which are oxidation resistance (OXR) proteins 1 and 2 and the NCOA7 short and long [alternative start (AS) and full-length (FL), respectively] isoforms (Volkert et al., 2000; Durand et al., 2007; Finelli & Oliver, 2017). NCOA7-FL associates with the estrogen receptor and has been reported to translocate to the nucleus upon estradiol treatment, where it was suggested to act as a transcriptional coregulator (Shao et al., 2002). In addition, TLDc domain-containing proteins have been shown to play a protective role against oxidative stress, notably in the brain, through an unknown mechanism (Finelli & Oliver, 2017). The short isoform of NCOA7, NCOA7-AS, does not seem to share this property and is uniquely regulated by type 1 interferon (IFN) via an internal promoter (Yu et al., 2015).
Additional functions have recently been attributed to NCOA7-AS, which plays a significant role in the IFN-induced control of influenza A virus (IAV). NCOA7-AS notably impairs IAV replication in pulmonary cells through the regulation of V-ATPase activity (Doyle et al., 2018). As previously shown for NCOA7-FL (Merkulova et al., 2015), NCOA7-AS interacts with several subunits of the vacuolar V-ATPase, the proton pump responsible for endosomal acidification (Doyle et al., 2018). NCOA7-AS and NCOA7-FL mostly share the TLDc domain, pinpointing a probable role for this domain in interaction with the V-ATPase. Through an as yet unelucidated mechanism, this interaction leads to greater acidification of the endolysosomal system, which increases antigen degradation and is detrimental to IAV, possibly by irreversibly affecting the ability of haemagglutinin to allow fusion with the host cells (Doyle et al., 2018). In line with this finding, the NCOA7 locus has been shown to be important for regulation of V-ATPase function. Interestingly, NCOA7-FL has been shown to interact with the vacuolar V-ATPase in the brain, which enables the correct assembly and transport activity of the proton pump (Castroflorio et al., 2021). Similar to the NCOA7 isoforms, the OXR1 TLDc protein has also been identified as a V-ATPase partner, suggesting that regulation of the V-ATPase could be a common feature of TLDc family members (Merkulova et al., 2015). Despite these findings, the function of the TLDc domain is not yet clearly established and it is not known whether this domain is solely responsible for direct interaction with the V-ATPase.
Mutations in genes encoding TLDc domain-containing proteins have been reported in several human diseases (Finelli & Oliver, 2017). Missense mutations in the TLDc domain found at the C-terminus of the TBC1D24 protein trigger multiple phenotypes, but seem to be particularly linked to epilepsy (Falace et al., 2010; Lüthy et al., 2019). Via its TBC domain in the N-terminus, TBC1D24 regulates synaptic vesicle trafficking, which is exerted through the capacity of this domain to interact with small Rab GTPases (Frasa et al., 2012). However, the function of the TLDc domain of TBC1D24 remains elusive. Little structural information has been gathered on proteins harboring a TLDc domain. The crystal structure of the Skywalker/TBC domain of TBC1D24 has been solved (Fischer et al., 2016) and two studies have reported crystal structures of the OXR2 and TBC1D24 TLDc domains from Danio rerio (Blaise et al., 2012) and Drosophila melanogaster (Lüthy et al., 2019). The determination of the three-dimensional structure of the TLDc domain from a fly orthologue of human TBC1D24 enabled the pathological mutations linked to human epilepsy to be mapped, and it was proposed that some of these mutations could impair the stability of the protein (Lüthy et al., 2019), illustrating that structural data on the TLDc domain could be of help in deciphering the impact of genetic mutations. However, to date, no structural information is available for any human TLDc domain.
In this context, given the important roles of TLDc domain-containing proteins in various cellular processes and the occurrence of pathological mutations in this domain in human diseases, we engaged in structural studies of a human TLDc domain. Here, we report the crystal structure of the TLDc domain of the human NCOA7-AS protein at 1.8 Å resolution.

Protein expression and purification
The recombinant plasmid pET-30 Ek/LIC::TLDc was transformed into the Escherichia coli BL21 (DE3) strain resistant to phage T1 (New England Biolabs, Evry, France) carrying pRARE2. One colony was used to inoculate an overnight culture of 500 ml LB medium supplemented with kanamycin (50 µg ml⁻¹) and chloramphenicol (34 µg ml⁻¹). This culture was diluted in 10 l LB medium supplemented with the two antibiotics. The cells were grown at 289 K to an optical density at 600 nm of 0.8, and protein expression was then induced with 1 mM isopropyl β-D-thiogalactopyranoside (IPTG). The culture was grown overnight at 289 K. The cells were harvested by centrifugation at 8200g for 20 min and were resuspended in 100 ml buffer A (50 mM Tris-HCl pH 8, 400 mM NaCl, 5 mM β-mercaptoethanol, 40 mM imidazole, 1 mM benzamidine). The cells were disrupted by sonication and cell debris was removed by centrifugation at 28 000g for 60 min. The supernatant was loaded at 277 K onto Ni-NTA agarose beads previously equilibrated with buffer A. The beads were washed twice with buffer B (50 mM Tris-HCl pH 8, 1 M NaCl, 5 mM β-mercaptoethanol, 40 mM imidazole, 1 mM benzamidine) and elution was performed with buffer E (50 mM Tris-HCl pH 8, 400 mM NaCl, 5 mM β-mercaptoethanol, 500 mM imidazole). The eluted protein was incubated with His-tagged TEV protease purified in our laboratory in a 1:100 (w:w) ratio; the cleavage reaction was performed during dialysis (dialysis-bag cutoff 12-15 kDa) against 1 l dialysis buffer D (50 mM Tris-HCl pH 8, 100 mM NaCl, 5 mM β-mercaptoethanol) overnight at 277 K. After dialysis, the proteins were centrifuged for 20 min at 28 000g and the supernatant was again loaded at 277 K onto Ni-NTA agarose beads equilibrated with buffer D. The TLDc domain without the tag was collected in the flowthrough, concentrated to 5 mg ml⁻¹ using a Vivaspin column (10 kDa cutoff), loaded onto a size-exclusion chromatography column (Superdex 75 10/300 GL, GE Healthcare) and eluted with buffer D.
Following this protocol, 1 mg highly pure protein as judged by a Coomassie Blue-stained denaturing gel was obtained from 1 l culture. Macromolecule-production information is summarized in Table 1.

Crystallization
Initial crystallization screening was performed at two protein concentrations: 4.2 and 10.9 mg ml⁻¹. The vapor-diffusion method was performed in sitting drops by mixing 0.6 µl protein solution with 0.6 µl reservoir solution using 96-well Swissci MRC plates (Molecular Dimensions, Suffolk, UK) at 291 K. The commercial Structure 1 + 2 (Molecular Dimensions), Index, SaltRx and PEGRx screens (Hampton Research) were assessed. Several hits were obtained and crystal optimization was performed in Swissci 48-Well MRC Maxi Optimization Plates. The best-diffracting crystals derived from these optimizations were obtained by mixing 1.5 µl protein solution at 10.9 mg ml⁻¹ with 1.5 µl reservoir solution consisting of 0.1 M sodium acetate pH 4.5, 32% PEG 300 (Table 2). The crystals were cryocooled in liquid nitrogen without any cryoprotection prior to data collection.

Data collection and processing
X-ray data were collected on the ID30A-1/MASSIF-1 beamline at the European Synchrotron Radiation Facility (ESRF), Grenoble, France. The data set was recorded on a PILATUS3 2M detector (Dectris) at a wavelength of 0.965 Å (12.842 keV) and a crystal-to-detector distance of 190.5 mm. A total of 934 images were collected with an exposure time of 0.097 s per image, a rotation range of 0.15° per image and full beam transmission. Data were processed, scaled and merged with XDS (Kabsch, 2010) and the data-collection statistics are given in Table 3.
Coot (Emsley et al., 2010) was used for manual rebuilding, while structure refinement and validation were performed with the Phenix package. Because of the high resolution of the data, NCS restraints were not applied during refinement. The statistics for structure refinement are displayed in Table 4. Figures were prepared with PyMOL (http://www.pymol.org).
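The data-collection parameters quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the standard relation E (keV) = hc/λ with hc ≈ 12.3984 keV Å; the function names are illustrative and are not taken from any of the software used in this work.

```python
# Cross-check of the quoted data-collection parameters.
# Assumption: hc = 12.398419843 keV*Angstrom (standard physical constant).
HC_KEV_ANGSTROM = 12.398419843


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Convert an X-ray wavelength in Angstrom to a photon energy in keV."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def total_rotation_deg(n_images: int, oscillation_deg: float) -> float:
    """Total angular sweep covered by a rotation data set."""
    return n_images * oscillation_deg


def total_exposure_s(n_images: int, exposure_per_image_s: float) -> float:
    """Total exposure time accumulated over all images."""
    return n_images * exposure_per_image_s


if __name__ == "__main__":
    print(f"Energy at 0.965 A: {wavelength_to_energy_kev(0.965):.3f} keV")
    print(f"Total rotation:    {total_rotation_deg(934, 0.15):.1f} deg")
    print(f"Total exposure:    {total_exposure_s(934, 0.097):.1f} s")
```

With the values reported here, the sweep amounts to about 140° and roughly 91 s of exposure. The energy computed from the rounded wavelength (about 12.848 keV) differs slightly from the quoted 12.842 keV, which is consistent with the wavelength having been rounded to three decimal places.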

Results and discussion
As NCOA7-AS plays an important function, notably as an interferon-induced antiviral inhibitor, we aimed to solve its crystal structure. The 219-amino-acid-long NCOA7-AS can be divided into two parts: an N-terminal domain (residues 1-53) followed by the TLDc domain (residues 54-219). Our bioinformatic analyses predicted the first 53 amino acids to be mainly unfolded, with only two β-strands formed by residues 10-15 and 24-28. Despite numerous efforts, we have so far been unable to purify full-length NCOA7-AS expressed in E. coli to homogeneity. NCOA7-AS has a tendency to aggregate and to form large oligomers as judged by size-exclusion chromatography (SEC). We therefore expressed and purified a truncated form of NCOA7-AS corresponding to the TLDc domain, hereafter referred to as TLDc Hs, using E. coli as an expression host (Table 2). TLDc Hs could be purified using a three-step chromatography procedure as a very pure and homogeneous material, as attested by SEC (Figs. 1a and 1b). The domain crystallized under several conditions from commercial screens. Optimization of the crystallization conditions led to rod-shaped crystals with a length of about 100-150 µm (Fig. 1c), which were obtained in 0.1 M sodium acetate pH 4.5, 32% PEG 300 (Table 2). A full X-ray data set was collected and processed to a resolution of 1.8 Å. The crystals belonged to the orthorhombic space group P2₁2₁2₁, with unit-cell parameters as indicated in Table 3. The Matthews coefficient (V_M) of 2.2 Å³ Da⁻¹ corresponds to a solvent content of 44.5% with six molecules of TLDc Hs in the asymmetric unit (Fig. 1d).
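The relationship between the Matthews coefficient and the solvent content can be illustrated with a short calculation. This is a minimal sketch, assuming the commonly used approximation of Matthews (1968), in which the solvent fraction is 1 − 1.23/V_M for a typical protein partial specific volume of about 0.74 cm³ g⁻¹; the function names are illustrative, and the unit-cell volume and molecular mass needed for the first function are not given in this section.

```python
# Matthews coefficient and solvent content (approximation of Matthews, 1968).
# Assumption: typical protein partial specific volume (~0.74 cm^3/g), which
# yields the conventional factor of 1.23 A^3/Da.
PROTEIN_FACTOR_A3_PER_DA = 1.23


def matthews_coefficient(cell_volume_a3: float,
                         asym_units_per_cell: int,
                         molecules_per_asym_unit: int,
                         molecular_mass_da: float) -> float:
    """V_M in A^3/Da: unit-cell volume divided by the total protein mass per cell."""
    total_mass = asym_units_per_cell * molecules_per_asym_unit * molecular_mass_da
    return cell_volume_a3 / total_mass


def solvent_fraction(v_m: float) -> float:
    """Estimated solvent fraction of the crystal for a given V_M."""
    return 1.0 - PROTEIN_FACTOR_A3_PER_DA / v_m


if __name__ == "__main__":
    # V_M as reported in the text (rounded to one decimal place).
    print(f"Solvent content for V_M = 2.2: {100 * solvent_fraction(2.2):.1f}%")
```

Using the rounded value of 2.2 Å³ Da⁻¹ gives about 44%, in line with the 44.5% quoted above, which was presumably derived from the unrounded V_M. For space group P2₁2₁2₁ there are four asymmetric units per unit cell.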
The TLDc Hs structure was solved by molecular replacement using the TLDc domain (PDB entry 4acj) from D. rerio (TLDc Dr) as a search model, which shares 62% sequence identity with its human homologue. Six molecules of TLDc Hs were found in the asymmetric unit, as expected from the V_M. The structure was manually rebuilt and refined to R_work and R_free values of 0.176 and 0.221, respectively, with rather good geometry, as indicated in Table 4. Most residues could be built for the six monomers, except for the first two amino acids of chain A and the first three N-terminal residues of chains B, C, D, E and F. Gly175 was not modeled in chain A, nor was the glycine stretch from residues 174 to 176 in chain F. Residues 146-148 were also disordered in chain F but were well ordered in the other chains. Analysis of the crystal packing with the PISA server (Krissinel & Henrick, 2007) revealed the potential existence of a stable complex made of three TLDc Hs monomers within the asymmetric unit (Fig. 1d). Nonetheless, our SEC analysis demonstrated that the protein behaves only as a monomer in solution (Fig. 1a), as also described for the TLDc domain from zebrafish (Blaise et al., 2012). The overall structure of TLDc Hs is globular and consists of two α-helices in the N-terminus and ten β-strands forming two antiparallel β-sheets. The two sheets, organized as a central pseudo-orthogonal β-sandwich, are made by strands β1, β2, β3, β4, β5 and β10 and strands β6, β7, β8 and β9, respectively. The N-terminal part of β10 interacts with β9 and to this extent contributes to forming the second β-sandwich. The fold is similar to those of the two previously solved TLDc structures from D. rerio OXR2 (TLDc Dr) and D. melanogaster TBC1D24 (TLDc Dm). TLDc Hs shares 62% sequence identity with TLDc Dr (Fig. 2a) and the two structures display an r.m.s.d. of 0.7 Å when superposed over 164 Cα atoms. Few structural differences could be observed.
TLDc Hs does not possess an α-helix after strand β1 as seen in TLDc Dr (Fig. 2b). TLDc Hs and TLDc Dm are more distant as they present only 38% sequence identity (Fig. 2a), and superimposition of the two structures leads to an r.m.s.d. of 1.1 Å over 144 residues. The N-terminal sequences are not well conserved in the three proteins compared with the rest of the sequence, which is reflected by a few differences at the three-dimensional level. TLDc Dm possesses one extra helix (α2) in the N-terminus that is not seen in TLDc Hs or TLDc Dr. Finally, noticeable differences are found as two extended loops between β3 and β4 and between β5 and β6 in TLDc Dm.
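The r.m.s.d. values quoted for the structural comparisons measure the average deviation between paired atoms. The following is a minimal sketch of that calculation, assuming the two coordinate sets have already been superposed and paired atom by atom (for example, equivalent Cα atoms); the superposition step itself is not shown, the software used for it is not stated in this section, and the coordinates in the example are toy values rather than data from the structures.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation (same units as the input) between paired atoms.

    Assumes both lists are in the same order and already superposed.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates (hypothetical): every atom is displaced by 1 A along z.
    model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    model_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
    print(f"r.m.s.d. = {rmsd(model_a, model_b):.2f} A")
```

Because the value depends on which atoms are paired, an r.m.s.d. is only meaningful together with the number of atoms compared, as reported above (164 Cα atoms and 144 residues, respectively).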
Several missense mutations have been reported in the TLDc domain from TBC1D24 (Table 5) that are linked to human diseases (Falace et al., 2010; Balestrini et al., 2016; Wang et al., 2019; Lüthy et al., 2019; Muona et al., 2015; Uzunhan & Uyanik, 2020; Atli et al., 2018). The three-dimensional structure of TLDc Hs therefore offers the possibility to map these mutations onto the domain. We performed mapping of these mutations onto TLDc Hs (Figs. 2a and 3) and assessed the conservation of the residues with missense mutations mapped onto TLDc Dm from fly TBC1D24 as described previously by Lüthy et al. (2019).
Figure 3 Structural mapping of the pathological mutations found in the human TLDc domain from TBC1D24. The mutations are reported in the crystal structures of TLDc Hs from NCOA7-AS (a) and TLDc Dm from fly TBC1D24 (b). The human pathological missense mutations reported in TBC1D24 are indicated in black, while the corresponding residues are in blue for the TLDc Hs structure and in red for the TLDc Dm structure. Residues corresponding to the mutations are shown as yellow spheres.
Table 5 Comparison of the pathological mutations and associated syndromes found in the TLDc domain from human TBC1D24 with the TLDc domain from D. melanogaster TBC1D24 (red) and the TLDc domain from human NCOA7-AS (blue). The residues that differ between the TLDc domains are highlighted in bold.
The reported Gly428Arg mutation corresponds to Gly138 in TLDc Hs and is situated just before strand β4. The Ala500 residue mutated to Val in human TBC1D24 is not conserved in TLDc Hs from NCOA7-AS, where Asn163 is found instead. Despite this lack of residue conservation, both amino acids are part of strand β6. Gly501 in TBC1D24 and the equivalent Gly164 in TLDc Hs are also part of strand β6. Mutations in a structured region such as strand β6 will probably affect the folding of this strand. The Gly511 residue mutated to Arg matches Gly174 in TLDc Hs and is situated in a loop formed by a five-glycine stretch between strands β7 and β8. Finally, Ala515, which has been reported to be mutated to Val in several studies, is not conserved in TLDc Hs, where Gly179 is found instead. Nonetheless, both residues are at the beginning of strand β8, and these mutations could destabilize the integrity of this strand. The missense mutations found in the TLDc domain of human TBC1D24 which are linked to neurological disorders are all found at structurally conserved positions. It is indeed clear that the Gly511Arg and Gly501Arg mutations as well as the Arg360His/Leu, Ala500Val and Ala515Val mutations could disturb the folding of the TLDc domain and potentially destabilize the protein because of the nature of the amino-acid substitution and/or its position in the structurally conserved regions. This analysis is therefore in strong agreement with the previously proposed effect on protein stability of the TLDc domain induced by pathological mutations found in TBC1D24 (Lüthy et al., 2019). To conclude, this structural work attests to the high structural similarity between TLDc domains from vertebrates as well as, albeit slightly more distantly, that from an arthropod.
Although this was expected from the high sequence identity that is shared between these proteins, reporting the first crystal structure of a human TLDc domain is notably of interest because of the versatile roles of TLDc proteins and also because pathological mutations affect this domain.