C8orf34 | |
---|---|
Identifiers | |
Aliases | C8orf34, VEST-1, VEST1, chromosome 8 open reading frame 34 |
External IDs | MGI: 2444149; HomoloGene: 14194; GeneCards: C8orf34; OMA: C8orf34 - orthologs |
C8orf34 is a protein that, in Homo sapiens, is encoded by the C8orf34 gene. [4] Its aliases include vestibule-1 and VEST-1. Within the cell, C8orf34 is localized to the nucleus and nucleoli, where it may play a role in regulating gene expression and the cell cycle.
The C8orf34 gene is located on the positive-sense strand of chromosome 8 at locus 8q13.2. On the NCBI genome assembly GRCh38.p12, it spans from 68330373 to 68819023. [5] It is 635 kbp in length and contains 14 exons. Among the seven possible transcripts for C8orf34, the longest is 2452 base pairs and encodes 538 amino acids. [6]
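As a quick arithmetic check on the coordinates quoted above, the following Python sketch computes the span they cover. It assumes the coordinates are 1-based and inclusive; the constant and function names are illustrative only.

```python
# Coordinates as quoted in the text (GRCh38.p12, chromosome 8).
GENE_START = 68_330_373
GENE_END = 68_819_023


def inclusive_span(start: int, end: int) -> int:
    """Number of bases covered by a 1-based, inclusive interval (assumed convention)."""
    return end - start + 1


span_bp = inclusive_span(GENE_START, GENE_END)
print(f"{span_bp} bp (~{span_bp / 1000:.0f} kbp)")
```

With the coordinates as quoted, this gives roughly 489 kbp, which is smaller than the 635 kbp length stated in the text. The difference may reflect a different annotation or assembly release; the sketch does not settle which figure is correct.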
Several gene loci lie near the C8orf34 gene along chromosome 8. Many of these are non-functional pseudogenes, but a few neighbors are functional and protein-coding. The nearest protein-encoding gene to C8orf34 is PREX2, a guanine-nucleotide exchange factor for the Rac family of G proteins. [7] This protein is involved in insulin signalling pathways. Mutations in and overexpression of the PREX2 gene have been observed in some cancers. [8]
Gene | Location | Function | NCBI Gene ID |
---|---|---|---|
PREX2 | 67951918...68237033 | facilitates the exchange of GDP for GTP on Rac1 (a GTPase) | 80243 [7] |
LOC105375888 | 68082051...68095535 | uncharacterized | 105375888 [9] |
LOC107986951 | 68849606...68858076 | uncharacterized | 107986951 [10] |
LOC108004543 | 68973432...68976574 | non-coding, known to undergo non-allelic homologous recombination (NAHR) with another region | 108004543 [11] |
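The distances between C8orf34 and the neighbors listed in the table can be worked out directly from the quoted coordinates. The sketch below is a minimal illustration that assumes 1-based inclusive coordinates on the same assembly; the helper name `gap_to_gene` is hypothetical.

```python
# C8orf34 interval as quoted in the text (start, end).
C8ORF34 = (68_330_373, 68_819_023)

# Neighboring loci, taken from the table above.
NEIGHBORS = {
    "PREX2": (67_951_918, 68_237_033),
    "LOC105375888": (68_082_051, 68_095_535),
    "LOC107986951": (68_849_606, 68_858_076),
    "LOC108004543": (68_973_432, 68_976_574),
}


def gap_to_gene(neighbor, gene=C8ORF34):
    """Bases strictly between two intervals; 0 if they overlap."""
    n_start, n_end = neighbor
    g_start, g_end = gene
    if n_end < g_start:      # neighbor lies upstream of the gene
        return g_start - n_end - 1
    if n_start > g_end:      # neighbor lies downstream of the gene
        return n_start - g_end - 1
    return 0


gaps = {name: gap_to_gene(interval) for name, interval in NEIGHBORS.items()}
for name, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{name}: {gap} bp")
```

By this measure the uncharacterized locus LOC107986951 lies closest to C8orf34, while PREX2 is the closest entry in the table with a described protein function.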
Within the cell, C8orf34 is expressed primarily in the nucleus. The protein lacks a signal peptide that would direct it out of the nucleus or to other organelles. An analysis with PSORT II predicted nuclear localization with 94.1% reliability. [12] This nuclear localization suggests that C8orf34 may function in the expression and regulation of genes. Alternatively, it may be involved in the maintenance and protection of the cell's genetic material.
C8orf34 is expressed in a wide array of tissues, including the kidney, stomach, thymus, pituitary gland, ear, and brain. [6] [14] In the brain, C8orf34 is expressed in the dentate gyrus, epithalamus, and medulla. [15] In the mouse brain, an orthologous C8orf34 is expressed highly in the granule layer of the dentate gyrus, the somatosensory areas of the cerebral cortex and in the amygdala. [16]
Several different transcription factors regulate the expression of the C8orf34 gene. Many of these transcription factors are related to regulation of the cell's progression through the cell cycle and longevity, suggesting that C8orf34 performs a function related to these processes. [17]
Transcription factor | Function |
---|---|
OCT1 | Involved in the cell cycle regulation of histone H2B gene transcription and in the transcription of other cellular housekeeping genes. [18] |
STAT3 | Involved in the expression of genes that advance the cell cycle from G1 to S phase. Acts as a regulator of the inflammatory response by regulating differentiation of naive CD4+ T-cells into T-helper Th17 or regulatory T-cells (Treg). [19] |
HSF1 | Rapidly induced after temperature stress and binds heat shock promoter elements (HSE). This protein plays a role in the regulation of lifespan. [20] |
MZF1 | Expressed in hematopoietic progenitor cells that are committed to myeloid lineage differentiation. It contains 13 C2H2 zinc fingers arranged in two domains that are separated by a short glycine- and proline-rich sequence. [21] |
The protein product of the C8orf34 gene is 538 amino acids in length, with a predicted molecular weight of 59 kDa and an isoelectric point of 5.9. [22] At the cellular level, several pieces of evidence support the conclusion that C8orf34 plays a role in regulating gene expression and the cell cycle.
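The predicted molecular weight can be sanity-checked against the sequence length. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb rather than a value taken from the cited prediction tool; a real calculation would sum the masses of the actual residues.

```python
# Assumed rule-of-thumb average mass per amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int, avg_residue_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Rough protein mass in kDa from residue count alone."""
    return n_residues * avg_residue_da / 1000.0


# 538 residues is the length given in the text.
print(f"{estimate_mass_kda(538):.1f} kDa")
```

For 538 residues this gives roughly 59 kDa, in line with the predicted value quoted above.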
C8orf34 has a domain entitled "Dimerization-anchoring domain of cAMP-dependent protein kinase regulatory subunit" that spans residues 94 to 133. [23] Proteins with this domain are subunits of a multimeric protein kinase. [24] The negatively charged region in the middle of the protein may indicate a site of coordination with a metal ion, a common structure in proteins that interact with DNA, including zinc-finger proteins. [25]
C8orf34 protein undergoes few modifications following translation. C8orf34 protein is not cleaved after translation. There are eight sites along the protein that are likely candidates for glycosylation and 27 probable sites for phosphorylation. There are four predicted SUMOylation sites in C8orf34. [26] Each of these post-translational modifications is expected to have some effect on the protein. O-glycosylation may influence the sorting of a protein and the protein's conformation. [27] In some cases, glycosylation may play a role in adhesion and immunological processes. [28] Phosphorylation of amino acid residues may serve to activate or deactivate the functional domain of C8orf34. [29] SUMOylation sites are residues that SUMO (small ubiquitin-like modifier) proteins can bind to modify the protein's function. [30] SUMO proteins may modify proteins to perform many functions, including nuclear-cytosolic transport, transcriptional regulation, progressing through the cell cycle, and even apoptosis. [31]
The secondary structure of C8orf34 is predicted to consist mostly of random coils, with alpha helices being the dominant organized structure. [33] Alpha helices are a common motif in proteins that regulate gene expression and may support this function in C8orf34. [34] The structure prediction and analysis application Phyre2 reported that a portion of C8orf34 has close structural similarity to a yeast histone H3 lysine 4 (H3K4) methyltransferase, an enzyme that influences gene expression by catalyzing methylation of histone H3. [35] [36]
Software-based predictions and experimental results suggest several possible functions for C8orf34. The high frequency of alpha helices is one clue: alpha helices are commonly found in the DNA-binding motifs of proteins, including helix-turn-helix and zinc finger motifs. As C8orf34 is localized to the nucleus, the presence of alpha helices further supports the possibility that it is involved in gene regulation and expression. [37] The protein kinase dimerization domain within C8orf34, in combination with its presence in the nucleus, may indicate that it is a type of histone kinase. [38]
C8orf34 has been conserved through evolution, and orthologous proteins are found in several animal clades. No paralogs of C8orf34 arising from gene duplication have been observed in the human genome. [39]
Orthologs of C8orf34 exist in many species. C8orf34 seems to have appeared first in cnidarians, with sea anemones holding its most distant ortholog. An ortholog most similar in structure and function to human C8orf34 likely arose in aquatic chordates, as there appears to be a higher level of identity beginning with sharks. There is no similar homolog of C8orf34 present in arthropods. [39] This clade may have evolved to no longer need C8orf34 for whatever function it served. Alternatively, arthropod species may have a substitute for C8orf34 that performs a similar function.
Organism | Scientific Name | NCBI Accession [39] | Identity % | Seq Length | Est Time of Divergence (MYA) [40] |
---|---|---|---|---|---|
Human | Homo sapiens | NP_443190.2 | 100.00% | 538 | 0.00 |
Gorilla | Gorilla gorilla gorilla | XP_004047177.2 | 99.44% | 538 | 9.06 |
Chimpanzee | Pan troglodytes | NP_001186058.1 | 99.26% | 538 | 6.65 |
Dog | Canis lupus familiaris | NP_001182595.1 | 91.59% | 451 | 96.00 |
Mouse | Mus musculus | NP_001153841.1 | 90.71% | 462 | 90.00 |
Chinchilla | Chinchilla lanigera | XP_013373625.1 | 90.48% | 456 | 90.00 |
Cat | Felis catus | XP_019678323.2 | 88.13% | 537 | 96.00 |
Horse | Equus caballus | XP_023504264.1 | 86.43% | 534 | 96.00 |
Thirteen-lined ground squirrel | Ictidomys tridecemlineatus | XP_021580557.1 | 85.53% | 538 | 90.00 |
Chicken | Gallus gallus | XP_025003758.1 | 83.73% | 620 | 312.00 |
American alligator | Alligator mississippiensis | XP_019354134.1 | 82.20% | 678 | 312.00 |
White-throated sparrow | Zonotrichia albicollis | XP_026647522.1 | 79.78% | 657 | 312.00 |
Western clawed frog | Xenopus tropicalis | XP_002935369.2 | 77.23% | 621 | 352.00 |
Common box turtle | Terrapene mexicana triunguis | XP_026503128.1 | 77.21% | 414 | 312.00 |
Australian ghostshark | Callorhinchus milii | XP_007885522.1 | 70.80% | 709 | 473.00 |
Zebrafish | Danio rerio | XP_005162763.1 | 70.65% | 626 | 435.00 |
Lamp shell | Lingula anatina | XP_013381780.1 | 30.73% | 517 | 797.00 |
C. teleta | Capitella teleta | ELU06153.1 | 29.00% | 516 | 797.00 |
Eastern oyster | Crassostrea virginica | XP_022341487.1 | 26.91% | 500 | 797.00 |
Exaiptasia (sea anemone) | Exaiptasia pallida | XP_020895362.1 | 26.65% | 548 | 824.00 |
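The pattern in the ortholog table, with high identity among vertebrates and much lower identity in invertebrates, can be made explicit by finding the largest drop in identity between consecutive orthologs. The sketch below uses the identity values from the table, keyed by scientific name; the function name is illustrative, not part of any published analysis.

```python
# (scientific name, percent identity to human C8orf34), from the table above.
ORTHOLOGS = [
    ("Homo sapiens", 100.00), ("Gorilla gorilla gorilla", 99.44),
    ("Pan troglodytes", 99.26), ("Canis lupus familiaris", 91.59),
    ("Mus musculus", 90.71), ("Chinchilla lanigera", 90.48),
    ("Felis catus", 88.13), ("Equus caballus", 86.43),
    ("Ictidomys tridecemlineatus", 85.53), ("Gallus gallus", 83.73),
    ("Alligator mississippiensis", 82.20), ("Zonotrichia albicollis", 79.78),
    ("Xenopus tropicalis", 77.23), ("Terrapene mexicana triunguis", 77.21),
    ("Callorhinchus milii", 70.80), ("Danio rerio", 70.65),
    ("Lingula anatina", 30.73), ("Capitella teleta", 29.00),
    ("Crassostrea virginica", 26.91), ("Exaiptasia pallida", 26.65),
]


def largest_identity_drop(rows):
    """Return (higher_species, lower_species, drop) for the biggest step down in identity."""
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    best = None
    for (name_a, id_a), (name_b, id_b) in zip(ranked, ranked[1:]):
        drop = id_a - id_b
        if best is None or drop > best[2]:
            best = (name_a, name_b, drop)
    return best


above, below, drop = largest_identity_drop(ORTHOLOGS)
print(f"Largest drop: {drop:.2f} points, between {above} and {below}")
```

With these values the largest step falls between the zebrafish and the lamp shell, that is, between the most divergent vertebrate ortholog listed and the closest invertebrate one.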
Yeast two-hybrid experiments have revealed that C8orf34 interacts with a number of proteins found in the nucleus. [41] The protein has been shown to interact with ubiquitin C, a precursor of polyubiquitin, which has various effects on the cell cycle depending on the residues it conjugates to. C8orf34 has also demonstrated interactions with MTUS2 (microtubule associated tumor suppressor candidate 2). Little information is available about this candidate protein, but it is likely to be involved in tumor suppression and cell cycle regulation. [42] C8orf34 also interacts with MCM7 (mini chromosome maintenance complex component 7), part of a protein complex that functions in the initiation of eukaryotic genome replication during the cell cycle. [43] C8orf34's interactions with these proteins support the conclusion that it is involved in transcription regulation and cell cycle progression.
Studies have associated C8orf34 with several diseases. Mutations within C8orf34 are associated with risk for diarrhea and neutropenia in patients receiving chemotherapy. [44] A translocation causing a fusion of the C8orf34 gene with the MET proto-oncogene has been found in tissue samples from patients with papillary renal carcinoma. [45] A Japanese patent application describes a procedure claimed to detect mutations in C8orf34 as a method for diagnosing a congenital disease that causes hearing loss. [46]