CXorf65 identifiers | Value |
---|---|
Aliases | CXorf65, chromosome X open reading frame 65 |
External IDs | MGI: 2685460; HomoloGene: 52739; GeneCards: CXorf65; OMA: CXorf65 - orthologs |
Human uncharacterized protein CXorf65 is encoded by the gene CXorf65, which is located on the minus strand of chromosome X. [5] Its transcript is 834 nucleotides long and consists of 6 exons. [6] The translated protein is 183 amino acids in length [7] and has a molecular weight of 21.3 kDa. [8] [9]
Human chromosome X open reading frame 65 (CXorf65), also known as LOC158830 [10] or A6NEN9, [11] spans 2,852 base pairs on the minus strand of chromosome X at Xq13.1. [12] It belongs to the gene family pfam 15874, [13] a two-member family of conserved putative interleukin 2 receptor gamma chain (Il2rg) domains. Additionally, human CXorf65 is one of 413 genes that belong to gene cluster 31: Spermatids – Spermatogenesis. [14]
Human CXorf65's mRNA transcript contains 6 exons, which together form an 834-nucleotide strand. [6]
Human CXorf65 has three transcript variants: uncharacterized protein CXorf65, uncharacterized protein CXorf65 isoform X1, and a non-coding RNA sequence. [5] Only uncharacterized protein CXorf65 yields a functional product.
Human uncharacterized protein CXorf65 isoform X1 arises from alternative splicing that excludes exon 4, which shortens the transcript by 69 base pairs and ultimately leads to a nonfunctional protein. [15] The non-coding RNA sequence carries a frameshift caused by a 19 bp deletion in exon 2, [16] which results in a premature stop codon 126 bp from the start of translation.
Transcript | mRNA Accession | Nucleotides | Exons | Protein Accession | Amino Acids |
---|---|---|---|---|---|
Uncharacterized protein CXorf65 | NM_001025265.3 | 834 | 6 | NP_001020436.1 | 183 |
Uncharacterized protein CXorf65 isoform X1 | XM_005262244.5 | 765 | 5 | XP_005262301.1 | 160 |
Non-coding RNA Sequence | NR_033212.2 | 815 | 6 | Non-coding | Non-coding |
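The length differences between the variants can be checked with simple codon arithmetic. The sketch below is only an illustration built from the figures in the table and the text above; the function names are hypothetical, and it assumes that a deletion whose length is a multiple of three leaves the reading frame intact.

```python
CODON_LENGTH = 3  # nucleotides per codon

# Lengths copied from the transcript table above.
TRANSCRIPTS = {
    "NM_001025265.3": {"nucleotides": 834, "amino_acids": 183},
    "XM_005262244.5": {"nucleotides": 765, "amino_acids": 160},
}


def causes_frameshift(bp_changed: int) -> bool:
    """True if removing this many bases would shift the reading frame."""
    return bp_changed % CODON_LENGTH != 0


def residues_lost(bp_removed: int) -> int:
    """Residues lost when an in-frame segment of this length is removed."""
    if causes_frameshift(bp_removed):
        raise ValueError("segment is not a whole number of codons")
    return bp_removed // CODON_LENGTH


full = TRANSCRIPTS["NM_001025265.3"]
x1 = TRANSCRIPTS["XM_005262244.5"]

# Isoform X1: the transcript-length difference attributed to skipping exon 4.
exon4_bp = full["nucleotides"] - x1["nucleotides"]
exon4_residues = residues_lost(exon4_bp)
print(exon4_bp, exon4_residues)

# Non-coding variant: a 19 bp deletion is not a multiple of three.
print(causes_frameshift(19))

# A stop codon 126 bp into the reading frame leaves room for at most
# 126 // 3 codons before translation terminates.
print(126 // CODON_LENGTH)
```

The 69 bp difference corresponds to 23 codons, which matches the 183 versus 160 amino acid lengths listed for the two protein-coding variants, whereas the 19 bp deletion cannot be removed without shifting the frame.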
Human CXorf65 is ubiquitously expressed throughout the body at low levels, typically 1-7 RPKM in most tissues. [5] Expression is higher in testis, adrenal, thymus, and bone marrow tissues, where it can reach 6-14 RPKM. [17] Additionally, expression can spike to as high as 27 RPKM within adrenal tissues during week 20 of fetal development. [5]
Human uncharacterized protein CXorf65 consists of 183 amino acids, [7] has a molecular weight of 21.3 kDa, [8] [9] and has a predicted isoelectric point of 10.33. [8] CXorf65's protein product is predicted to primarily localize within the nucleus. [18]
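For readers unfamiliar with the unit, RPKM (reads per kilobase of transcript per million mapped reads) normalizes a raw read count by transcript length and sequencing depth. The following is a minimal sketch of the standard formula; the read count and library size in the example are invented for illustration and do not come from the cited expression datasets.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    kilobases = transcript_length_bp / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions_of_reads)


# Hypothetical example: 500 reads on the 834-nucleotide transcript
# in a library of 40 million mapped reads.
example = rpkm(read_count=500, transcript_length_bp=834, total_mapped_reads=40_000_000)
print(round(example, 2))
```

Because the value is divided by both length and depth, a short transcript such as this one needs relatively few reads to register in the low double-digit RPKM range.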
Human CXorf65 maintains a low whole organism protein abundance at 0.062 ppm2. [19] It has also been identified as a member of the spermatozoa proteome. [20]
Human uncharacterized protein CXorf65 has a bipartite nuclear localization signal, [21] which regulates its transport into the nucleus.
The secondary structure of human uncharacterized protein CXorf65 is predicted to contain five β-sheet sections and four α-helices. [22] The corresponding tertiary structure is thus predicted to be a β-grasp fold [23] accompanied by a long basic (positively charged) tail. [22]
Human uncharacterized protein CXorf65 has one predicted casein kinase II phosphorylation site [24] [25] and two predicted acetylation sites. [25] Additionally, uncharacterized protein CXorf65 has predicted motifs for N-terminal degradation via type II destabilizing residues and a non-covalent binding site for SUMO (small ubiquitin-like modifier) proteins. [24]
Human CXorf65 orthologs exist in mammals, reptiles, birds, amphibians, bony fish, cartilaginous fish, and the following invertebrates: Cnidaria, Platyhelminthes, Annelida, Arthropoda, Mollusca, Rotifera, Lophophorata, Echinodermata, Hemichordate Amphioxiformes, and Tunicata. [26]
There are no human paralogs of CXorf65; [27] however, there is a paralogous Il2rg domain within C22orf15. [13]
CXorf65 is a moderately evolving gene in reference to fibrinogen alpha and cytochrome c. [28] [29]
CXorf65 has been documented to co-express in Mus musculus (mouse) with IL2RG, [30] an interleukin subunit coding gene located within the same gene neighborhood in humans at Xq13.1. [31] Fusions between these two genes have been observed within the following organisms: Sarcophilus harrisii (Tasmanian devil), Felis catus (cat), Cavia porcellus (guinea pig), Ictidomys tridecemlineatus (thirteen-lined ground squirrel), Rattus norvegicus (brown rat), and Mus musculus (mouse). [30]
Differential expression of CXorf65 in humans is correlated with azoospermia and impaired spermatogenesis, [32] while general expression of the gene has been linked to an improved prognosis in urothelial [33] and ovarian cancer. [10] In WG4 temporal lobe epilepsy, human CXorf65 undergoes hypermethylation. [34] Human CXorf65 is downregulated in cases of disc herniation [35] and acute coronary syndrome, [36] and in eosinophils in the presence of TGF-β. [37]