C2orf80 | |
---|---|
Identifiers | |
Aliases | C2orf80, GONDA1, chromosome 2 open reading frame 80 |
External IDs | OMIM: 615536 MGI: 2138198 HomoloGene: 129674 GeneCards: C2orf80 |
C2orf80 (chromosome 2 open reading frame 80) is a protein that, in humans, is encoded by the c2orf80 gene. The gene is also known by the alias GONDA1 (gonad development associated 1). In humans, c2orf80 is expressed exclusively in the brain. While relatively little is known about its function, medical studies have shown a strong association between variants in c2orf80 and both IDH-mutant gliomas and 46,XY gonadal dysgenesis, as well as a possible association with blood pressure.
The c2orf80 gene is located on the negative sense strand of chromosome 2 at locus 2q33.3, and has nine exons. [5] It spans from 208165347-2081902581. [5] There are six isoforms of c2orf80. [5]
Isoform | Accession Number | Base Pairs | Amino Acids |
---|---|---|---|
X1 | NM_001099334.3 | 1195 | 193 |
X2 | XM_017004075.1 | 1098 | 200 |
X3 | XM_017004076.1 | 1183 | 193 |
X4 | XM_017004077.1 | 1043 | 183 |
X5 | XM_024452880.1 | 1118 | 176 |
X6 | XM_017004078.1 | 578 | 131 |
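As a rough consistency check on the isoform table, the sketch below compares each protein length with the number of nucleotides needed to encode it. It assumes the "Base Pairs" column gives the mRNA (transcript) length and that each coding sequence ends in a single stop codon; the function names are illustrative only.

```python
# (isoform, transcript length in bp, protein length in aa), copied from the table.
# Assumption: "Base Pairs" is the transcript (mRNA) length.
ISOFORMS = [
    ("X1", 1195, 193),
    ("X2", 1098, 200),
    ("X3", 1183, 193),
    ("X4", 1043, 183),
    ("X5", 1118, 176),
    ("X6", 578, 131),
]


def cds_length_nt(protein_length_aa: int) -> int:
    """Nucleotides needed to encode the protein plus one stop codon."""
    return 3 * (protein_length_aa + 1)


def summarize(isoforms):
    """Return (name, coding length, remaining non-coding length) per isoform."""
    rows = []
    for name, transcript_bp, protein_aa in isoforms:
        cds = cds_length_nt(protein_aa)
        rows.append((name, cds, transcript_bp - cds))
    return rows


for name, cds, noncoding in summarize(ISOFORMS):
    print(f"{name}: coding {cds} nt, non-coding remainder {noncoding} nt")
```

Under these assumptions every listed transcript is long enough to encode its protein, with the remainder attributable to untranslated regions.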
The loci neighboring c2orf80 are protein-coding and include the genes IDH1 (isocitrate dehydrogenase 1) and CRYGB (crystallin gamma B). [5]
The IDH1 protein is the NADP(+)-dependent isocitrate dehydrogenase found in the cytoplasm and peroxisomes. It catalyzes the oxidative decarboxylation of isocitrate to 2-oxoglutarate. [6] Mutations in IDH1 have been strongly associated with gliomas, and the gene is a strong candidate for therapeutic targeting. [7]
The gene CRYGB, or crystallin gamma B, encodes a protein found in the lens and cornea of the eye and has been implicated in cataract formation. [6]
Within the cell, c2orf80 is predicted to localize to the nucleus or the mitochondria. [8] An analysis with PSORT II gave a 47.8% probability of nuclear localization and a 39.1% probability of mitochondrial localization. [8] The c2orf80 protein has two nuclear localization signals that may help direct it to the nucleus. [8]
C2orf80 is tissue-specific and expressed only in the brain. [5] A tumor-specific RNA assessment via The Cancer Genome Atlas showed significant expression of c2orf80 in gliomas. [9] Because gliomas arise from glial cells rather than neurons, this suggests that c2orf80 may function in glia rather than in neurons. GenePaint data on c2orf80 expression in infant mice show concentrated expression in the frontal lobe. [10]
Many transcription factors are predicted to regulate the expression of c2orf80. Given its tissue-specific expression, transcription factors with brain tissue specificity are the ones expected to increase c2orf80 transcription. The promoter of c2orf80 X1 is GXP_9792289. [11]
Transcription Factor [11] | Function [11] |
---|---|
NKX Homeodomain transcription factor | Plays a role in organ development |
CLOX and CLOX homology (CDP) factor | Plays a role in nervous system growth |
Paralog hox genes 1-8 from the four hox clusters A, B, C, D | Play a role in central nervous system development |
SOX/SRY-sex/testis determining and related HMG box factor | Plays a role in sex determination |
The protein product of c2orf80 is 193 amino acids in length, with a predicted molecular weight of 22 kDa and an isoelectric point of 9.5. [12] [13]
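The predicted weight can be sanity-checked from the length alone. The sketch below uses the common rule of thumb of roughly 110 Da per amino acid residue; this average is an assumption, not the method used by the cited prediction tools, and a sequence-specific calculation would differ slightly.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein, in daltons.
# This is a generic approximation, not a sequence-specific value.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_molecular_weight_kda(n_residues: int) -> float:
    """Estimate protein molecular weight in kDa from its length in residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


print(round(approx_molecular_weight_kda(193), 1))  # 21.2
```

The estimate of about 21 kDa is in line with the predicted 22 kDa.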
Several post-translational modifications are predicted for the c2orf80 protein. A large disordered region spanning amino acids 155-193 is a possible site of phosphorylation by many kinases. [12] The phosphorylation sites with the highest confidence scores are all predicted targets of the same kinases (AGC, PKC, PKCh, and PRKCE), which are involved in cell growth and death and in the modification of protein function. [14] Other post-translational modification motifs posited by ELM are shown below. [15]
Affinity chromatography verified an interaction between protein kinase D2 (PRKD2) and c2orf80. [16] PRKD2 can bind to diacylglycerol (DAG) in the trans-Golgi network (TGN), and can be activated by phorbol esters as well as by gastrin via the cholecystokinin B receptor (CCKBR) in gastric cancer cells. [17]
The gene encoding ZNF804A (zinc finger protein 804A) has been associated with schizophrenia and bipolar disorder, having shown the first genome-wide significant association with the broad phenotype of psychosis. When ZNF804A was knocked out, c2orf80 expression was downregulated, indicating a regulatory relationship between the two. [18]
Motif [15] | Function [15] |
---|---|
MAPK | MAPK cascades are central signaling pathways that regulate a wide variety of stimulated cellular processes. Deregulation can lead to cancer. |
SUMO-1 | Causes dramatic rearrangements of subcellular location. Three possible functions of sumoylation: (i) protein targeting, (ii) enhanced protein stability, (iii) transcriptional control |
PP2B | A Ca2+- and calmodulin-regulated serine/threonine protein phosphatase that regulates Ca2+-dependent migration of neutrophils, synaptic plasticity, and apoptosis |
Canonical Arg-Containing Phosphorylation | 14-3-3 proteins are involved in cellular processes such as signal transduction, cell-cycle control, apoptosis, stress response, and malignant transformation. 14-3-3 proteins possess no catalytic activity, so they act only by binding to partner proteins and modulating their activity |
The secondary structure is predicted to consist mostly of alpha helices. [16] The 3D structure predicted by AlphaFold shows several alpha helices and a disordered region. [19]
The function of the c2orf80 protein is unknown. There are indications that it could be involved in sex determination, or cell division regulation. [20] [21]
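Orthologs of C2orf80 are found throughout the vertebrates. The most distantly related orthologs are found in sharks, whose lineage diverged from that of humans about 465 million years ago. [22] There are no observed paralogs of c2orf80 within the human genome. [23]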
The similarity to the human sequence generally decreases as the divergence date from humans increases, but there are some fluctuations in this trend possibly due to a functionally unimportant, poorly conserved region. [22]
Genus, Species | Common Name | Accession Number [23] | Sequence Length (amino acids) [23] | Sequence Identity to Human c2orf80 (%) [22] | Median Divergence Date (Million Years Ago) [24] |
---|---|---|---|---|---|
Homo sapiens | Human | NP_001092804.2 | 193 | 100 | 0 |
Ictidomys tridecemlineatus | Thirteen lined ground squirrel | XP_040150742.1 | 210 | 71 | 89 |
Mus musculus | Mouse | NP_001272811.1 | 196 | 62.9 | 89 |
Orycteropus afer afer | Aardvark | XP_007934115.1 | 207 | 75.4 | 102 |
Anolis carolinensis | Anole lizard | XP_003220055.1 | 178 | 53 | 318 |
Chelonia mydas | Green sea turtle | XP_037769043.1 | 218 | 52.7 | 318 |
Egretta garzetta | Little egret | XP_009635806.1 | 198 | 51.2 | 318 |
Alligator sinensis | Chinese alligator | XP_006029205.1 | 233 | 49.8 | 318 |
Nothoprocta perdicaria | Chilean tinamou | XP_025904208.1 | 215 | 47.9 | 318 |
Aptenodytes forsteri | Emperor penguin | XP_009287939.1 | 142 | 40.8 | 318 |
Bufo bufo | Common toad | XP_040297412.1 | 188 | 45.5 | 351.7 |
Xenopus tropicalis | Western clawed frog | XP_017953126.2 | 210 | 43.5 | 351.7 |
Rhinatrema bivittatum | Two-lined caecilian | XP_029462232.1 | 235 | 43.6 | 351.7 |
Carcharodon carcharias | Great white shark | XP_041056463.1 | 204 | 42.7 | 465 |
Oreochromis aureus | Blue tilapia | XP_039455604.1 | 210 | 38.3 | 433 |
Rhincodon typus | Whale shark | XP_020371397.1 | 213 | 37.1 | 465 |
Astyanax mexicanus | Mexican tetra | XP_022536111.1 | 201 | 36.4 | 433 |
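The trend described above, identity falling as divergence time grows, can be summarized with a Pearson correlation over the table's values. This is only an illustrative sketch: the species are not statistically independent (they share phylogenetic history), so the coefficient is a descriptive summary rather than a formal test, and the variable names are hypothetical.

```python
import math

# Median divergence date (million years ago) and sequence identity to human (%),
# copied row by row from the ortholog table.
DIVERGENCE_MYA = [0, 89, 89, 102, 318, 318, 318, 318, 318, 318,
                  351.7, 351.7, 351.7, 465, 433, 465, 433]
IDENTITY_PCT = [100, 71, 62.9, 75.4, 53, 52.7, 51.2, 49.8, 47.9, 40.8,
                45.5, 43.5, 43.6, 42.7, 38.3, 37.1, 36.4]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson(DIVERGENCE_MYA, IDENTITY_PCT)
print(f"Pearson r = {r:.2f}")
```

A strongly negative coefficient is what the table suggests by eye. The local exceptions, such as the aardvark scoring higher than the mouse despite an earlier divergence date, are the fluctuations noted in the text.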
Published research studies that mention c2orf80 indicate it could be involved in a number of maladies. Variants in c2orf80 are associated with IDH-mutant gliomas, possibly due to the proximity of c2orf80 to IDH1 on chromosome 2. [20]
Siblings with 46,XY gonadal dysgenesis, a disorder of sex development, had a deletion of 8 of the 9 c2orf80 exons and an interstitial duplication of the SUPT3H gene. [21] Both mutations were inherited from the healthy mother. [21] No interaction between these two genes is currently known. [21]