C2orf81

C2orf81 is a human gene encoding the protein c2orf81, which is predicted to localize to the nucleus.

Gene

Location of C2orf81 on chromosome 2, via Ensembl.

C2orf81's aliases are LOC388963 and hCG40743. [2] The gene spans bases 74,414,176 to 74,421,591 on the minus (-) strand of chromosome 2 and contains 4 exons. [1] The transcript is 2086 base pairs long, and the encoded protein contains 615 amino acids. [3]
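
As a quick arithmetic check on these figures, a protein of 615 amino acids needs 615 × 3 = 1845 coding nucleotides, or 1848 with a stop codon, so the rest of a 2086-base transcript would be untranslated. The minimal Python sketch below makes that explicit. The function name is hypothetical, and treating 2086 as the full transcript length is an assumption.

```python
def cds_length_nt(protein_length_aa: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode a protein: 3 per residue, plus 3 for a stop codon."""
    return 3 * protein_length_aa + (3 if include_stop else 0)


# Figures quoted in the article; 2086 is assumed here to be the whole transcript.
TRANSCRIPT_LENGTH_NT = 2086
PROTEIN_LENGTH_AA = 615

cds = cds_length_nt(PROTEIN_LENGTH_AA)
untranslated = TRANSCRIPT_LENGTH_NT - cds

print(f"coding sequence: {cds} nt, untranslated: {untranslated} nt")
# prints "coding sequence: 1848 nt, untranslated: 238 nt"
```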

Expression

The protein encoded by c2orf81 is highly expressed in testis, kidneys, and about 18 other tissues in humans. [4] Disease states in which it is expressed include gliomas, neoplasms, and lymphoma. [5]

Tissues in which c2orf81 is expressed.

Transcription Variants

Only a few mutations have been documented in c2orf81. Three common mutations have been reported, in the 3′ UTR and in the coding sequence; the missense mutations in the coding sequence change serine to leucine in the protein. Nonsense mutations have also been documented, occurring exclusively at proline codons.

mRNA

The mRNA sequence is 2086 base pairs long and has 4 isoforms.

Protein

Properties and Composition

C2orf81 has a molecular weight of 66.6 kDa and an isoelectric point of 5.32. [7] The human protein and most mammalian homologs are rich in proline, whereas non-mammalian vertebrate homologs are richer in glutamic acid. [8] C2orf81 has 4 isoforms, the most common of which contains 615 amino acids. Isoforms 2 through 4 have 566, 520, and 588 amino acids, respectively. [3] C2orf81 is the only member of superfamily cl25621. [9]
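
The reported 66.6 kDa is in line with a common rule of thumb of roughly 110 Da per amino acid residue. The sketch below applies that approximation to the four isoform lengths given above. The 110 Da constant and all names are assumptions for illustration, and this is not the method used by the cited composition tool, which works from the actual sequence.

```python
# Rule-of-thumb average mass of one amino acid residue (an assumption, not a
# sequence-based calculation).
AVERAGE_RESIDUE_MASS_DA = 110.0

# Isoform lengths in amino acids, as quoted in the article.
ISOFORM_LENGTHS_AA = {
    "most common isoform": 615,
    "isoform 2": 566,
    "isoform 3": 520,
    "isoform 4": 588,
}


def approx_mass_kda(length_aa: int) -> float:
    """Rough molecular weight in kDa from the residue count alone."""
    return length_aa * AVERAGE_RESIDUE_MASS_DA / 1000.0


approx_masses = {name: approx_mass_kda(n) for name, n in ISOFORM_LENGTHS_AA.items()}

for name, mass in approx_masses.items():
    print(f"{name}: ~{mass:.1f} kDa")
```

For the 615-residue isoform this gives about 68 kDa, within a few percent of the reported value.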

Domains

Domain of unknown function (DUF) 4639 is unique to the c2orf81 protein and is conserved in eukaryotes. [10] DUF 4639 spans from amino acid 17 to the end of the protein in human c2orf81.

Subcellular Localization

C2orf81 is primarily predicted to be nuclear, but potentially also cytoplasmic and mitochondrial. [11]

Interacting proteins

The c2orf81 protein is predicted to interact strongly with enoyl-CoA hydratase and hydroxyacyl-CoA dehydrogenase, based on text mining and database searches. [12] Other predicted interacting proteins are acetyl-CoA carboxylases A and B, glycine dehydrogenase, and 3-oxoacid CoA transferase 2.

Structure

The c2orf81 protein is predicted to be composed mainly of alpha helices, with fewer beta-pleated sheets, turns, and coils. [13]

A prediction of secondary structure of a portion of c2orf81 generated by Phyre2.

Function

Although the c2orf81 protein consists almost entirely of a domain of unknown function, the c2orf81 gene has been analyzed in a study of sites prone to DNA methylation. [4] Another study found that c2orf81 overlaps with other genes. [15] Genes at its locus have been related to Alström syndrome, cleft palate, neurodevelopmental delays, macrocephaly, and Perry syndrome.

Post-translational modifications

In human c2orf81, phosphorylation is predicted to occur only at serine residues, not at any threonine or tyrosine residues. [16] O-linked glycosylation is predicted to occur at 3 sites toward the C-terminus. [17] These sites are well conserved in all homologs. C2orf81 contains one potential SUMOylation site toward the end of the protein, with the sequence GKAE. [18]

A diagram depicting predicted post-translational modifications of c2orf81.

Homology

Paralogs

C2orf81 was found to have one paralog, Homo sapiens BAC clone RP11-523H20. [19]

Homologs

The c2orf81 protein is highly conserved in primates and other mammals, but less so in non-mammalian vertebrates. Its most distant homolog is in the Asian swamp eel. [20] Below is a table showing homologs of c2orf81, their dates of divergence from humans in millions of years ago (mya), and their percent identity to the human c2orf81 protein sequence.

Species                     Date of divergence (mya)    Protein identity
Bonobo                      6.4                         99%
Gorilla                     8.61                        94%
Orangutan                   15.2                        95%
Macaque                     28.1                        92%
Lemur                       82                          72%
Mouse                       88                          52%
Minke whale                 94                          69%
Cow                         94                          66%
Pig                         94                          64%
Chinese softshell turtle    320                         69%
Ostrich                     320                         62%
American golden eagle       320                         42%
Asian swamp eel             432                         35%
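
Percent identity figures like those in the table are typically derived from a pairwise alignment by counting identical columns. The following Python sketch shows one such calculation under stated assumptions: the function name is hypothetical, the two aligned strings are made-up toy sequences (not c2orf81 data), and the denominator is the number of aligned columns. Alignment tools can differ in how they choose the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same residue.

    Inputs must already be aligned (equal length, '-' marking gaps).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # empty column: carries no information
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns in alignment")
    return 100.0 * matches / compared


# Toy example only: 6 identical columns out of 8 aligned columns.
print(percent_identity("MKT-LLPA", "MKTALLPG"))  # prints 75.0
```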

Evolution

C2orf81 has evolved quickly over time. [21] The N-terminus of the protein has evolved less quickly than the rest of the protein.

References

  1. "Chromosome 2: 74,414,176-74,421,591 - Region in detail - Homo sapiens - Ensembl genome browser 92". useast.ensembl.org. Retrieved 2018-05-06.
  2. "Gene Cards".
  3. "NCBI Protein c2orf81".
  4. Seow, W. J., Kile, M. L., Baccarelli, A. A., Pan, W.-C., Byun, H.-M., Mostofa, G., Quamruzzaman, Q., Rahman, M., Lin, X. and Christiani, D. C. (2014), Epigenome-wide DNA methylation changes with development of arsenic-induced skin lesions in Bangladesh: A case–control follow-up study. Environ. Mol. Mutagen., 55: 449–456. doi:10.1002/em.21860
  5. Group, Schuler. "EST Profile - Hs.445377". www.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
  6. "C2orf81 chromosome 2 open reading frame 81 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
  7. Kozlowski, Lukasz P. "CALCULATION OF PROTEIN ISOELECTRIC POINT". isoelectric.org. Retrieved 2018-05-06.
  8. "Composition/Molecular Weight Calculation [PIR - Protein Information Resource]". pir.georgetown.edu. Retrieved 2018-05-06.
  9. group, NIH/NLM/NCBI/IEB/CDD. "NCBI CDD Conserved Protein Domain DUF4639". www.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
  10. "DUF4639". pfam.xfam.org. Retrieved 2018-05-06.
  11. "PSORTII".
  12. "STRING".
  13. Kumar, Prof. T. Ashok. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2018-05-06.
  14. "Phyre 2 Results for Undefined". www.sbg.bio.ic.ac.uk. Archived from the original on 2018-05-07. Retrieved 2018-05-06.
  15. "Figure 5: Genic alleles in the DAnc(YRI, Europe, UI) tail and overlapping genes". www.nature.com. Retrieved 2018-05-11.
  16. "DISPHOS 1.3". www.dabi.temple.edu. Retrieved 2018-05-06.
  17. "DictyOGlyc 1.1". www.cbs.dtu.dk. Retrieved 2018-05-06.
  18. "SUMOplot™ Analysis Program | Abgent". www.abgent.com. Retrieved 2018-05-06.
  19. Database, GeneCards Human Gene. "C2orf81 Gene - GeneCards | CB081 Protein | CB081 Antibody". www.genecards.org. Retrieved 2018-05-06.
  20. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
  21. "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2018-05-06.