Transport and Golgi organization 2 homolog (TANGO2), also known as chromosome 22 open reading frame 25 (C22orf25), is a protein that in humans is encoded by the TANGO2 gene.
The function of C22orf25 is not currently known. It is characterized by the NRDE superfamily domain (DUF883), which is named for its conserved amino acid sequence of asparagine (N), arginine (R), aspartic acid (D), and glutamic acid (E). This domain is found among distantly related species from the six kingdoms [4] (Eubacteria, Archaebacteria, Protista, Fungi, Plantae, and Animalia) and is known to be involved in Golgi organization and protein secretion. [5] The protein likely localizes to the cytoplasm but is anchored to the cell membrane by its second amino acid. [6] [7] C22orf25 is also xenologous to T10-like proteins in the Fowlpox virus and Canarypox virus. The gene coding for C22orf25 is located on chromosome 22 at q11.21, so it is often associated with 22q11.2 deletion syndrome. [8]
Gene Size | Protein Size | # of exons | Promoter Sequence | Signal Peptide | Molecular Weight | Domain Length |
---|---|---|---|---|---|---|
2271 bp | 276 aa | 9 [9] | 687 bp | No [10] | 30.9 kDa [11] | 270 aa |
The C22orf25 gene is located on the long arm (q) of chromosome 22 in region 1, band 1, and sub-band 2 (22q11.21), starting at 20,008,631 base pairs and ending at 20,053,447 base pairs. [8] A 1.5-3.0 Mb deletion spanning this region and containing around 30-40 genes causes 22q11.2 deletion syndrome, the most survivable genetic deletion disorder, which is most commonly known as DiGeorge syndrome or velocardiofacial syndrome. [12] [13] 22q11.2 deletion syndrome has a vast array of phenotypes and is not attributed to the loss of a single gene. These phenotypes arise from deletions not only of DiGeorge Syndrome Critical Region (DGCR) genes and disease genes but of other unidentified genes as well. [14]
C22orf25 is in close proximity to DGCR8 as well as other genes known to play a part in DiGeorge syndrome, such as armadillo repeat gene deleted in velocardiofacial syndrome (ARVCF), catechol-O-methyltransferase (COMT) and T-box 1 (TBX1). [15] [16]
The promoter for the C22orf25 gene spans 687 base pairs from 20,008,092 to 20,008,878, with a predicted transcriptional start site that is 104 base pairs long and spans 20,008,591 to 20,008,694. [17] The promoter region and the beginning of the C22orf25 gene (20,008,263 to 20,009,250) are not conserved past primates. This region was used to determine transcription factor interactions.
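The region lengths quoted above can be checked directly from the coordinates. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (an assumption, although it does reproduce the stated 104 bp for the transcriptional start site); the helper name `inclusive_length` is illustrative and not from the source.

```python
def inclusive_length(start: int, end: int) -> int:
    """Number of positions in the closed interval [start, end]."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1

# Coordinates as quoted in the text (assumed 1-based, inclusive).
regions = {
    "gene": (20_008_631, 20_053_447),
    "promoter": (20_008_092, 20_008_878),
    "predicted TSS": (20_008_591, 20_008_694),
}

lengths = {name: inclusive_length(s, e) for name, (s, e) in regions.items()}
for name, length in lengths.items():
    print(f"{name}: {length} bp")
```

Under this assumption the quoted promoter coordinates correspond to 787 bp rather than 687 bp, so either one coordinate or the stated length may contain a typo; the source does not indicate which.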
Some of the main transcription factors that bind to the promoter are listed below. [18]
Reference | Detailed Family Information | Start (bp) | End (bp) | Strand |
---|---|---|---|---|
XBBF | X-box binding factors | 227 | 245 | - |
GCMF | Chorion-specific transcription factors (with a GCM DNA binding domain) | 151 | 165 | - |
YBXF | Y-box binding transcription factors | 158 | 170 | - |
RUSH | SWI/SNF related nucleophosphoproteins (with a RING finger binding motif) | 222 | 232 | - |
NEUR | NeuroD, Beta2, HLH domain | 214 | 226 | - |
PCBE | PREB core-binding element | 148 | 162 | - |
NR2F | Nuclear receptor subfamily 2 factors | 169 | 193 | - |
AP1R | MAF and AP1 related factors | 201 | 221 | - |
ZF02 | C2H2 zinc finger transcription factors 2 | 108 | 130 | - |
TALE | TALE homeodomain class recognizing TG motifs | 216 | 232 | - |
WHNF | Winged helix transcription factors | 271 | 281 | - |
FKHD | Forkhead domain factors | 119 | 135 | + |
MYOD | Myoblast determining factors | 218 | 234 | + |
AP1F | AP1, activating protein 1 | 118 | 130 | + |
BCL6 | POZ domain zinc finger expressed in B cells | 190 | 206 | + |
CARE | Calcium response elements | 196 | 206 | + |
EVI1 | EVI1 nuclear transcription factor | 90 | 106 | + |
ETSF | ETS transcription factor | 162 | 182 | + |
TEAF | TEA/ATTS DNA binding domain factors | 176 | 188 | + |
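Several of the predicted sites in the table share positions, which is easier to see programmatically. The sketch below copies a subset of rows from the table and lists the families whose sites overlap a chosen site, treating start and end as a closed interval and ignoring strand (both are assumptions made for illustration). The `Site` type and `overlapping` helper are hypothetical names.

```python
from collections import Counter
from typing import NamedTuple

class Site(NamedTuple):
    family: str
    start: int
    end: int
    strand: str

# A subset of rows copied from the table above.
sites = [
    Site("XBBF", 227, 245, "-"),
    Site("RUSH", 222, 232, "-"),
    Site("NEUR", 214, 226, "-"),
    Site("TALE", 216, 232, "-"),
    Site("MYOD", 218, 234, "+"),
    Site("EVI1", 90, 106, "+"),
    Site("ZF02", 108, 130, "-"),
]

def overlapping(query: Site, others: list[Site]) -> list[str]:
    """Families whose closed interval overlaps the query site (strand ignored)."""
    return [
        s.family
        for s in others
        if s is not query and s.start <= query.end and query.start <= s.end
    ]

myod = next(s for s in sites if s.family == "MYOD")
print(overlapping(myod, sites))         # families sharing positions with MYOD
print(Counter(s.strand for s in sites))  # number of sites per strand in this subset
```

In this subset, the MYOD site shares positions with the XBBF, RUSH, NEUR and TALE sites, all of which fall between positions 214 and 245.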
Expression data from Expressed Sequence Tag mapping, microarray and in situ hybridization show high expression in Homo sapiens in the blood, bone marrow and nerves. [19] [20] [21] Expression is not restricted to these areas, and low expression is seen elsewhere in the body. In Caenorhabditis elegans, the snt-1 gene (C22orf25 homologue) was expressed in the nerve ring, ventral and dorsal cord processes, sites of neuromuscular junctions, and in neurons. [22]
The NRDE (DUF883) domain is a domain of unknown function that spans the majority of the C22orf25 protein and is found among distantly related species, including viruses.
Genus and Species | Common Name | Accession Number | Seq. Length | Seq. Identity | Seq. Similarity | Kingdom | Time of Divergence |
---|---|---|---|---|---|---|---|
Homo sapiens | humans | NP_690870.3 | 276aa | - | - | Animalia | - |
Pan troglodytes | common chimpanzee | BAK62258.1 | 276aa | 99% | 100% | Animalia | 6.4 mya |
Ailuropoda melanoleuca | giant panda | XP_002920626 | 276aa | 91% | 94% | Animalia | 94.4 mya |
Mus musculus | house mouse | NP_613049.2 | 276aa | 88% | 95% | Animalia | 92.4 mya |
Meleagris gallopavo | turkey | XP_003210928 | 276aa | 74% | 88% | Animalia | 301.7 mya |
Gallus gallus | Red Junglefowl | NP_001007837 | 276aa | 73% | 88% | Animalia | 301.7 mya |
Xenopus laevis | African clawed frog | NP_001083694 | 275aa | 69% | 86% | Animalia | 371.2 mya |
Xenopus (Silurana) tropicalis | Western clawed frog | NP_001004885.1 | 276aa | 68% | 85% | Animalia | 371.2 mya |
Salmo salar | Atlantic salmon | NP_001167100 | 274aa | 66% | 79% | Animalia | 400.1 mya |
Danio rerio | zebrafish | NP_001003781 | 273aa | 64% | 78% | Animalia | 400.1 mya |
Canarypox | virus | NP_955117 | 275aa | 50% | 69% | - | - |
Fowlpox | virus | NP_039033 | 273aa | 44% | 63% | - | - |
Cupriavidus | proteobacteria | YP_002005507.1 | 275aa | 38% | 52% | Eubacteria | 2313.2 mya |
Burkholderia | proteobacteria | YP_004977059 | 273aa | 37% | 53% | Eubacteria | 2313.2 mya |
Physcomitrella patens | moss | XP_001781807 | 275aa | 37% | 54% | Plantae | 1369 mya |
Zea mays | maize/corn | ACG35095 | 266aa | 33% | 53% | Plantae | 1369 mya |
Trichophyton rubrum | fungus | XP_003236126 | 306aa | 32% | 47% | Fungi | 1215.8 mya |
Sporisorium reilianum | Plant pathogen | CBQ69093 | 321aa | 32% | 43% | Fungi | 1215.8 mya |
Perkinsus marinus | pathogen of oysters | XP_002787624 | 219aa | 31% | 48% | Protista | 1381.2 mya |
Tetrahymena thermophila | Ciliate protozoa | XP_001010229 | 277aa | 26% | 44% | Protista | 1381.2 mya |
Natrialba magadii | extremophile | YP_003481665 | 300aa | 25% | 39% | Archaebacteria | 3556.3 mya |
Halopiger xanaduensis | halophilic archaeon | YP_004597780.1 | 264aa | 24% | 39% | Archaebacteria | 3556.3 mya |
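The "Seq. Identity" column reports the share of aligned positions at which two sequences carry the same residue. The source does not state which tool or denominator convention produced the values in the table, so the following is only a sketch of one common convention: identical residues divided by the number of aligned columns without gaps. The two fragments are made-up toy strings, not real TANGO2 sequences. Similarity additionally requires a substitution matrix and is not computed here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned columns that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap character.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

# Toy, made-up aligned fragments (NOT real TANGO2 sequences).
human_like = "MCIIFFKFDP-RPVS"
other_like = "MCIVFFKYDPARPLS"
print(f"{percent_identity(human_like, other_like):.1f}%")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives slightly different percentages for the same alignment.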
Post-translational modifications of the C22orf25 protein that are evolutionarily conserved in the Animalia and Plantae kingdoms as well as the Canarypox virus include glycosylation (C-mannosylation), [23] glycation, [24] phosphorylation (kinase specific), [25] and palmitoylation. [26]
C22orf25 localizes to the cytoplasm and is anchored to the cell membrane by the second amino acid. As mentioned previously, the second amino acid is modified by palmitoylation. Palmitoylation is known to contribute to membrane association [27] because it enhances hydrophobicity. [6] Palmitoylation is known to play a role in modulating protein trafficking, [28] stability [29] and sorting. [30] Palmitoylation is also involved in cellular signaling [31] and neuronal transmission. [32]
C22orf25 has been shown to interact with NFKB1, [33] RELA, [33] RELB, [33] BTRC, [33] RPS27A, [33] BCL3, [33] MAP3K8, [33] NFKBIA, [33] SIN3A, [33] SUMO1, [33] and Tat. [34]
Mutations in the TANGO2 gene may cause defects in mitochondrial β-oxidation, [35] as well as increased endoplasmic reticulum stress and a reduction in Golgi volume density. [36] These mutations result in early-onset hypoglycemia, hyperammonemia, rhabdomyolysis, cardiac arrhythmias, and encephalopathy that later develops into cognitive impairment. [35] [36] Abnormal autophagy and mitophagy have been associated with TANGO2-related disease and may explain the varying presentation in muscle biopsies, including secondary abnormal fatty acid and mitochondrial metabolism. [37]