C10orf95 | |
---|---|
Identifiers | |
Aliases | C10orf95, chromosome 10 open reading frame 95 |
External IDs | GeneCards: C10orf95; OMA: C10orf95 - orthologs |
Chromosome 10 open reading frame 95 (C10orf95) is a protein that in humans is encoded by the C10orf95 gene. [3] The protein is thought to be involved in pre-mRNA splicing and is predicted to localize to the nucleus.
C10orf95 is located at 10q24.32. [3] It has two exons and spans 1907 base pairs. [4] No splice isoforms or variants are known.
The gene neighborhood of C10orf95 consists of C10orf95 antisense 1 (C10orf95-AS1), CUE domain containing 2 (CUEDC2), major facilitator superfamily domain containing 13A (MFSD13A), and actin related protein 1A (ACTR1A). [3] The CUEDC2 protein has ubiquitin binding activity and is involved in cytokine production in inflammatory responses. [5] The MFSD13A protein is located in the plasma membrane but does not have a defined function. [6] The ACTR1A gene encodes a subunit of dynactin that binds to both microtubules and cytoplasmic dynein. [7]
The C10orf95 protein structure consists of one alpha helix and five beta sheets. [9] The alpha helix lies in a region of the amino acid sequence that is conserved from mammals to invertebrates, and it is exposed on the protein surface, where it is available for binding. The protein has no transmembrane domains. [10] Compared to other human proteins, C10orf95 is arginine-rich. Arginine-rich regions allow interactions with negatively charged molecules such as DNA or RNA, and arginine-rich proteins have a significant role in pre-mRNA splicing. [11] [12]
C10orf95 is ubiquitously expressed at low levels in most tissues. [3] Expression is higher in lung tissue than in other tissues. In the fetal heart, expression is moderately high at 10 weeks and falls quickly to almost none by 20 weeks of gestation. [16]
The c10orf95 protein is likely to be localized to the nucleus due to the presence of multiple nuclear localization signals within the amino acid sequence. [17]
A signal peptide spans amino acids 1 to 37, with a cleavage site between amino acids 37 and 38. Five phosphorylation sites are considered important because they are conserved among orthologs. [18] Serine and threonine are the most commonly phosphorylated amino acids.
The table below lists ortholog sequences sorted first by increasing median date of divergence from humans, in millions of years ago (MYA), and then by percent sequence identity to the human protein. The most distantly related species are invertebrates (excluding fungi, bacteria, and plants), with the earliest median date of divergence at 686 MYA and an average sequence identity of 18.5%. The most closely related species are other mammals, with the most recent date of divergence at 87 MYA and an average sequence identity of 57.4%. In between are reptiles, birds, amphibians, and fish (bony and cartilaginous), which are moderately related, with average sequence identities of 34.25%, 31%, 29.67%, and 31.3% respectively. There are no known paralogs.
Genus/Species | Common Name | Taxonomic Order | Date of Divergence (MYA) | Accession Number | Sequence Length (aa) | Sequence Identity | Sequence Similarity |
---|---|---|---|---|---|---|---|
Homo sapiens | Human | Primates | 0 | NP_001350509.1 | 213 | 100% | 100% |
Oryctolagus cuniculus | European Rabbit | Lagomorpha | 87 | XP_051679518.1 | 217 | 65% | 70% |
Tursiops truncatus | Common Bottlenose Dolphin | Artiodactyla | 94 | XP_033698270.1 | 219 | 61% | 65% |
Prionailurus bengalensis | Leopard Cat | Carnivora | 94 | XP_043452052.1 | 219 | 62% | 66% |
Diceros bicornis minor | South-central Black Rhinoceros | Perissodactyla | 94 | XP_058399863.1 | 211 | 66% | 71% |
Phascolarctos cinereus | Koala | Diprotodontia | 160 | XP_020823116.1 | 212 | 33% | 47% |
Tyto alba | Barn Owl | Strigiformes | 319 | XP_032844583.1 | 218 | 28% | 46% |
Rhea pennata | Darwin's Rhea | Rheiformes | 319 | XP_062436384.1 | 224 | 31% | 48% |
Melanerpes formicivorus | Acorn Woodpecker | Piciformes | 319 | XP_067995321.1 | 224 | 34% | 52% |
Ahaetulla prasina | Asian Vine Snake | Squamata | 319 | XP_058044567 | 216 | 32% | 49% |
Alligator mississippiensis | American Alligator | Crocodilia | 319 | XP_014451110.1 | 223 | 34% | 47% |
Caretta caretta | Loggerhead Sea Turtle | Testudines | 319 | XP_048714142.1 | 223 | 35% | 49% |
Eublepharis macularius | Leopard Gecko | Squamata | 319 | XP_054839650.1 | 214 | 36% | 48% |
Rhinatrema bivittatum | Two-lined Caecilian | Gymnophiona | 352 | XP_029466125.1 | 215 | 29% | 46% |
Pleurodeles waltl | Iberian Ribbed Newt | Urodela | 352 | KAJ1140798.1 | 187 | 29% | 44% |
Hyperolius riggenbachi | Riggenbach's Reed Frog | Anura | 352 | XP_068115078.1 | 216 | 31% | 44% |
Erpetoichthys calabaricus | Reedfish | Polypteriformes | 429 | XP_028651575.1 | 215 | 29% | 44% |
Salvelinus fontinalis | Brook Trout | Salmoniformes | 429 | XP_055757593.1 | 221 | 30% | 45% |
Mobula hypostoma | Lesser Devil Ray | Myliobatiformes | 462 | XP_062927344.1 | 201 | 35% | 49% |
Branchiostoma lanceolatum | European Lancelet | Amphioxiformes | 581 | CAH1258412.1 | 287 | 16% | 26% |
Physella acuta | Bladder Snail | Basommatophora | 686 | XP_059175428.1 | 289 | 21% | 29% |
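The group averages quoted for the ortholog table can be recomputed directly from its rows. The sketch below is illustrative only: the group labels (`mammal`, `bird`, and so on) and the function name `mean_identity_by_group` are assumptions of this example and do not appear in the table, which lists taxonomic orders. The lesser devil ray, a cartilaginous fish, is grouped with the two bony fish, because that grouping is what reproduces the 31.3% figure.

```python
from collections import defaultdict

# (common name, assumed group label, percent identity to the human protein),
# copied from the ortholog table. The human reference row is left out.
ORTHOLOGS = [
    ("European Rabbit", "mammal", 65),
    ("Common Bottlenose Dolphin", "mammal", 61),
    ("Leopard Cat", "mammal", 62),
    ("South-central Black Rhinoceros", "mammal", 66),
    ("Koala", "mammal", 33),
    ("Barn Owl", "bird", 28),
    ("Darwin's Rhea", "bird", 31),
    ("Acorn Woodpecker", "bird", 34),
    ("Asian Vine Snake", "reptile", 32),
    ("American Alligator", "reptile", 34),
    ("Loggerhead Sea Turtle", "reptile", 35),
    ("Leopard Gecko", "reptile", 36),
    ("Two-lined Caecilian", "amphibian", 29),
    ("Iberian Ribbed Newt", "amphibian", 29),
    ("Riggenbach's Reed Frog", "amphibian", 31),
    ("Reedfish", "fish", 29),
    ("Brook Trout", "fish", 30),
    ("Lesser Devil Ray", "fish", 35),
    ("European Lancelet", "invertebrate", 16),
    ("Bladder Snail", "invertebrate", 21),
]


def mean_identity_by_group(rows):
    """Return {group: mean percent identity} for the given table rows."""
    identities = defaultdict(list)
    for _name, group, identity in rows:
        identities[group].append(identity)
    return {group: sum(vals) / len(vals) for group, vals in identities.items()}


if __name__ == "__main__":
    for group, mean in mean_identity_by_group(ORTHOLOGS).items():
        print(f"{group}: {mean:.2f}%")
```

Changing a row's group label (for example, moving the koala into a separate marsupial group) shows how sensitive each average is to the grouping chosen.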
C10orf95 is estimated to have first appeared in invertebrates about 686 million years ago. Few invertebrates have the protein; it has been found only in lancelets and a variety of snails. The species most distantly related to humans that has C10orf95 is the bladder snail, in which no isoforms are known. The C10orf95 gene appears to evolve fairly quickly, based on the similarity of its rate of divergence to that of fibrinogen alpha.
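The text does not say how the rate of evolution was estimated. One common approach, assumed here rather than taken from the source, is to convert each ortholog's percent identity into a Poisson-corrected divergence, d = -ln(identity fraction), and examine how d grows with divergence date. The sketch below does this for the ortholog table only. It includes no fibrinogen alpha data, so it does not reproduce the comparison itself, and the function names are hypothetical.

```python
import math

# (median divergence date in MYA, percent identity to human), from the ortholog table.
ORTHOLOG_POINTS = [
    (87, 65), (94, 61), (94, 62), (94, 66), (160, 33),
    (319, 28), (319, 31), (319, 34), (319, 32), (319, 34), (319, 35), (319, 36),
    (352, 29), (352, 29), (352, 31),
    (429, 29), (429, 30), (462, 35),
    (581, 16), (686, 21),
]


def corrected_divergence(identity_percent):
    """Poisson-corrected divergence: -ln of the fraction of identical residues."""
    return -math.log(identity_percent / 100.0)


def slope_through_origin(points):
    """Least-squares slope of corrected divergence vs. time, forced through (0, 0)."""
    num = sum(t * corrected_divergence(i) for t, i in points)
    den = sum(t * t for t, _ in points)
    return num / den


if __name__ == "__main__":
    rate = slope_through_origin(ORTHOLOG_POINTS)
    print(f"approximate divergence per million years: {rate:.5f}")
```

A steeper slope corresponds to faster sequence change. Running the same calculation on a reference protein's orthologs would give a slope to compare against.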
Protein [14] | Interaction Type | Detection Method | Interacting Protein Function | Score |
---|---|---|---|---|
DDX39A (DExD-box helicase 39A) | Physical Association | Anti-tag coimmunoprecipitation | ATP-dependent RNA helicase DDX39A; Involved in pre-mRNA splicing. Required for the export of mRNA out of the nucleus; Belongs to the DEAD box helicase family. DECD subfamily. | 0.292 |
NUS1 (nuclear undecaprenyl pyrophosphate synthase 1) | Physical Association | Anti-tag coimmunoprecipitation | This gene encodes a type I single transmembrane domain receptor, which is a subunit of cis-prenyltransferase, and serves as a specific receptor for the neural and cardiovascular regulator Nogo-B. The encoded protein is essential for dolichol synthesis and protein glycosylation. | 0.292 |
One asthma study found that C10orf95 was downregulated in the peripheral blood of asthmatics. [19] Additionally, C10orf95 was listed among the genes downregulated in both the severe-versus-normal and severe-versus-mild asthma comparisons. [20] Another study identified a SNP at position 39 as a variant associated with an increased risk of late-onset Alzheimer's disease. [21]