C8orf82

Chromosome 8 open reading frame 82 is a protein encoded in humans by the C8orf82 gene.

Gene

C8orf82 is a gene located on the minus strand of chromosome 8 in the species Homo sapiens at locus 8q24. The gene is 3,400 bases long and contains 3 exons. [1]

mRNA

The mRNA produced by the C8orf82 gene is 1,979 bases long, with a coding sequence that starts at base 159 and ends at base 809. [2] According to NCBI Gene, there are no alternative transcripts or isoforms. [1] AceView lists several alternative transcripts and isoforms, but their existence is uncertain. [3]
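The coding-sequence coordinates can be checked against the 216-amino-acid protein length given in the Protein section. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the final codon is a stop codon; neither assumption is stated in the source, and the function name `cds_summary` is illustrative only.

```python
# Values taken from the mRNA description; coordinates are ASSUMED to be
# 1-based and inclusive.
MRNA_LENGTH = 1979
CDS_START = 159
CDS_END = 809


def cds_summary(start, end, mrna_length):
    """Return CDS length, codon count, protein length, and UTR lengths."""
    cds_length = end - start + 1
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = cds_length // 3
    return {
        "cds_length": cds_length,
        "codons": codons,
        # ASSUMPTION: the last codon is a stop codon and is not translated.
        "protein_length": codons - 1,
        "utr5_length": start - 1,
        "utr3_length": mrna_length - end,
    }


summary = cds_summary(CDS_START, CDS_END, MRNA_LENGTH)
print(summary)
```

Under these assumptions the coding sequence spans 651 bases, or 217 codons, which corresponds to 216 amino acids plus a stop codon.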

Homology

C8orf82 is conserved among invertebrates, fish, reptiles, mammals, and protists. It was not found to be conserved among bacteria, archaea, plants, fungi, or Trichoplax. The domain of unknown function DUF4505 appears to be well conserved among most orthologs. [1]

Orthologs

Below is a table of orthologs obtained using BLAST.

| Common name | Genus & species | Date of divergence from humans (MYA) | Accession number | Sequence length (AA) | Sequence identity to humans | Sequence similarity to humans |
|---|---|---|---|---|---|---|
| Chimpanzee | Pan troglodytes | 6.6 | NP_001233407 | 216 | 98% | 98% |
| Little brown bat | Myotis lucifugus | 97.5 | XP_006107626 | 157 | 91% | 92% |
| Wild boar | Sus scrofa | 97.5 | XP_013851838 | 269 | 53% | 56% |
| Peregrine falcon | Falco peregrinus | 320.5 | XP_013149891 | 205 | 70% | 76% |
| Burmese python | Python bivittatus | 320.5 | XP_007436434 | 201 | 64% | 75% |
| Schlegel's Japanese gecko | Gekko japonicus | 320.5 | XP_015273778 | 214 | 65% | 74% |
| Crested ibis | Nipponia nippon | 320.5 | XP_009466919 | 168 | 66% | 72% |
| Rock dove | Columba livia | 320.5 | XP_013226589 | 159 | 59% | 67% |
| Kiwi bird | Apteryx australis mantelli | 320.5 | XP_013800585 | 318 | 55% | 64% |
| Green turtle | Chelonia mydas | 320.5 | XP_007063392 | 188 | 52% | 64% |
| Blind cave fish | Astyanax mexicanus | 429.6 | XP_007255700 | 217 | 63% | 75% |
| Black cod | Notothenia coriiceps | 429.6 | XP_010788219 | 237 | 61% | 71% |
| Sheepshead minnow | Cyprinodon variegatus | 429.6 | XP_015227424 | 166 | 57% | 65% |
| Sea urchin | Strongylocentrotus purpuratus | 747.8 | XP_784244 | 265 | 54% | 72% |
| Swallowtail butterfly | Papilio polytes | 847 | XP_013133745 | 166 | 52% | 72% |
| Honey bee | Apis mellifera | 847 | XP_393829.2 | 220 | 50% | 64% |
| Mite | Metaseiulus occidentalis | 847 | XP_003744318 | 211 | 42% | 60% |
| Sponge | Amphimedon queenslandica | 936 | XP_003384611 | 215 | 47% | 62% |
| Protozoa | Tetrahymena thermophila | 1724.7 | XP_001015777.2 | 208 | 30% | 46% |
| Protist | Thecamonas trahens | ??? | XP_013760738 | 237 | 32% | 44% |

Evolutionary history

C8orf82 does not belong to a specific gene family and has no known alternative splice isoforms. This holds true for its most distant ancestor as well. [1] [4]

Comparison of C8orf82's % of amino acid changes over time to that of cytochrome c and fibrinogen. Generated using data from ortholog table.

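The above graph shows that C8orf82's percentage of amino acid changes over time behaves much like that of fibrinogen and very differently from that of cytochrome c. Because fibrinogen is a comparatively fast-evolving protein and cytochrome c a slow-evolving one, this suggests a relatively rapid rate of change over time rather than a slow one.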

Below is a graph that depicts how the percent identity of the orthologs listed in the ortholog table changes with each ortholog's date of divergence from humans.

C8orf82's orthologs percent identity change over time based on date of divergence.
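The source does not state exactly how the plotted values were derived. As a rough sketch, the identity values in the ortholog table can be averaged per divergence date, and an uncorrected percent change can be taken as 100 minus percent identity. Both choices are assumptions made here for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# (divergence from humans in MYA, percent identity to human), copied from
# the ortholog table. Thecamonas trahens is omitted because its divergence
# date is listed as unknown.
ORTHOLOGS = [
    (6.6, 98), (97.5, 91), (97.5, 53),
    (320.5, 70), (320.5, 64), (320.5, 65), (320.5, 66),
    (320.5, 59), (320.5, 55), (320.5, 52),
    (429.6, 63), (429.6, 61), (429.6, 57),
    (747.8, 54),
    (847.0, 52), (847.0, 50), (847.0, 42),
    (936.0, 47),
    (1724.7, 30),
]


def mean_identity_by_divergence(rows):
    """Average percent identity for each distinct divergence date."""
    groups = defaultdict(list)
    for mya, identity in rows:
        groups[mya].append(identity)
    return {mya: sum(vals) / len(vals) for mya, vals in sorted(groups.items())}


def percent_change(identity):
    """Uncorrected percent amino acid change (ASSUMED: 100 - identity)."""
    return 100 - identity


means = mean_identity_by_divergence(ORTHOLOGS)
for mya, identity in means.items():
    print(f"{mya:7.1f} MYA  identity {identity:5.1f}%  "
          f"change {percent_change(identity):5.1f}%")
```

Averaging per date is only one way to summarize the table; plotting every ortholog individually would preserve the spread seen at 97.5 and 320.5 MYA.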

Protein

The C8orf82 gene encodes the protein termed UPF0598 protein C8orf82, or simply C8orf82. [4]

General properties

The protein is 216 amino acids long and contains a domain of unknown function DUF4505. This domain of unknown function is also known as pfam14956. [5] The protein has a molecular weight of 23.8 kDa and an isoelectric point of 9.36. [6]

Compositional features

Isoleucine (0.5%) and lysine (1.4%) occur at lower frequencies in C8orf82 than in typical human proteins, while proline (9.7%) and arginine (10.6%) occur at slightly higher frequencies. [6]
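To make these percentages concrete, they can be converted into approximate residue counts using the 216-residue protein length given under General properties. This is a sketch only: the counts are back-calculated by rounding and are not taken from the source.

```python
PROTEIN_LENGTH = 216  # amino acids, from the General properties section

# Reported composition, in percent of all residues.
COMPOSITION_PERCENT = {"Ile": 0.5, "Lys": 1.4, "Pro": 9.7, "Arg": 10.6}


def approximate_counts(percentages, length):
    """Estimate residue counts by rounding percentage * length."""
    return {res: round(pct / 100 * length) for res, pct in percentages.items()}


def matches_reported(counts, percentages, length):
    """Check that each estimated count rounds back to the reported percent."""
    return all(
        round(100 * counts[res] / length, 1) == pct
        for res, pct in percentages.items()
    )


counts = approximate_counts(COMPOSITION_PERCENT, PROTEIN_LENGTH)
print(counts)
print(matches_reported(counts, COMPOSITION_PERCENT, PROTEIN_LENGTH))
```

On this estimate, the protein would contain roughly one isoleucine and three lysines, against about 21 prolines and 23 arginines.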

Post-translational modifications

Post-translational modifications predicted with ExPASy tools include phosphorylation sites at several serines, threonines, and tyrosines. In addition, tyrosine sulfation, sumoylation, O-linked glycosylation, and O-β-GlcNAc attachment sites were predicted, along with an NES motif. [7]

A conceptual translation of UPF0598 protein C8orf82 showing post-translational modifications and secondary structure.

Subcellular localization

PSORT II analysis predicts that C8orf82 localizes to the mitochondria. The k-NN prediction assigned 60.9% to mitochondrial localization and 26.1% to nuclear localization. [8]
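In a k-nearest-neighbors (k-NN) classifier, percentages of this kind are the fraction of the k most similar reference proteins that carry each label. The sketch below illustrates that calculation in general terms. The neighbor set is hypothetical: k = 23 and the split of 14, 6, and 3 neighbors are assumptions chosen only because they reproduce the reported figures, and the "other" label is a placeholder for classes the source does not name.

```python
from collections import Counter


def knn_vote_percentages(neighbor_labels):
    """Percentage of the k nearest neighbors carrying each label."""
    counts = Counter(neighbor_labels)
    k = len(neighbor_labels)
    return {label: round(100 * n / k, 1) for label, n in counts.items()}


# HYPOTHETICAL neighbor labels (assumed k = 23); not taken from the source.
neighbors = ["mitochondrial"] * 14 + ["nuclear"] * 6 + ["other"] * 3

votes = knn_vote_percentages(neighbors)
predicted = max(votes, key=votes.get)  # label with the largest share
print(votes, predicted)
```

The label with the largest share of neighbors becomes the predicted localization, which is why the mitochondrial call is reported even though about a quarter of the votes point to the nucleus.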

Interacting proteins

ETFA has been shown to interact with C8orf82 using high-throughput affinity-purification mass spectrometry. [9]

Expression

Antibody staining showed moderate cytoplasmic and nuclear immunoreactivity in most normal cells. Strong staining was observed in placental trophoblasts, whereas lymphoid tissues, glial cells, and myocytes were weakly stained or negative. In cancer tissues, a majority of malignant cells displayed weak to moderate positivity, while many cases of malignant carcinoids and lymphomas were negative. Only a single antibody (CAB034331) was used in all of these tests. [10]

Microarray data show that C8orf82 is ubiquitously expressed. [11]

Microarray data for C8orf82 throughout normal human tissues.

An EST profile for C8orf82 also shows ubiquitous expression. [12]

EST profile for C8orf82 taken from Unigene.

References

  1. NCBI (National Center for Biotechnology Information) Gene entry on C8orf82
  2. NCBI Nucleotide entry on C8orf82
  3. AceView entry on C8orf82
  4. GeneCards entry on C8orf82
  5. NCBI Protein entry on C8orf82
  6. SDSC Biology WorkBench 3.2 - Statistical Analysis of Primary Structure tool
  7. ExPASy Proteomics Tools
  8. "PSORT II Prediction Analysis". Archived from the original on 2021-07-09. Retrieved 2016-05-09.
  9. Huttlin EL, Ting L, Bruckner RJ, Gebreab F, Gygi MP, Szpyt J, Tam S, Zarraga G, Colby G, Baltier K, Dong R, Guarani V, Vaites LP, Ordureau A, Rad R, Erickson BK, Wühr M, Chick J, Zhai B, Kolippakkam D, Mintseris J, Obar RA, Harris T, Artavanis-Tsakonas S, Sowa ME, De Camilli P, Paulo JA, Harper JW, Gygi SP (2015). "The BioPlex Network: A Systematic Exploration of the Human Interactome". Cell. 162 (2): 425–40. doi:10.1016/j.cell.2015.06.043. PMC 4617211. PMID 26186194.
  10. The Human Protein Atlas entry on C8orf82
  11. NCBI GEO Profile entry on C8orf82 GDS3113
  12. Unigene entry for C8orf82