TASOR2 | |
---|---|
Identifiers | |
Aliases | TASOR2, C10orf18, bA318E3.2, FAM208B, family with sequence similarity 208 member B, transcription activation suppressor family member 2 |
External IDs | MGI: 2145274; HomoloGene: 26435; GeneCards: TASOR2 |
Protein FAM208B (family with sequence similarity 208 member B) is a protein that in humans is encoded by the FAM208B gene. The gene is also known as "chromosome 10 open reading frame 18" (C10orf18). FAM208B is expressed throughout the body; however, its function has not been established. FAM208B has been observed to be differentially regulated in various cancers and throughout development. While the exact role of the protein is yet to be established, its widespread expression in humans and its conservation across vertebrates suggest that the gene is important for normal function.
The gene is located on chromosome 10 at position 10p15.1. [5] FAM208B is upstream of ankyrin repeat and SOCS box containing 13 (ASB13), and downstream of GDP dissociation inhibitor 2 (GDI2) and nuclear receptor binding factor 2 pseudogene 5 (NRBF2P5). [5] ASB13 and GDI2 are both found on the opposite strand from FAM208B, while NRBF2P5 is on the same strand.
FAM208B has a single paralog, FAM208A. FAM208A is also known as "retinoblastoma-associated protein 140", "Transgene Activation Suppressor Protein" (TASOR), "CTCL Tumor Antigen", and "chromosome 3 open reading frame 63" (C3orf63). [6]
FAM208b is conserved only in vertebrates. [7] Orthologs can be found in mammals, reptiles, and amphibians. Distant homologs, including orthologs of the paralog, FAM208a, are observed in bony fish and sharks.
FAM208B has highly conserved N- and C-termini and a less conserved central region. Three domains of unknown function (DUFs) are found within the protein: one DUF3699 and two DUF3715. All three DUFs are conserved between species. DUF3715 is also found in the paralog of FAM208B. [8]
The rate of amino acid change in FAM208B over time indicates that it is a rapidly evolving gene. The presence of FAM208A, but not FAM208B, in bony fish and sharks indicates that the paralogs split about 325 million years ago.
Two promoter regions for FAM208b can be observed. The earlier promoter region is regulated by numerous transcription factors. [9] The promoter contains binding sites for Ikaros2, Nuclear Factor Y, and at least three binding sites for Pleomorphic adenoma gene 1.
The second promoter region is found within the first intron and gives rise to a slightly shorter mRNA. [5] This promoter contains multiple binding sites for the FOXP1 transcription factor.
The mRNA encoding the most common isoform (variant X2) is 8699 nucleotides long and includes 22 exons. [10] [11] [12] [13] [14]
The 5' UTR is bound by the RNA binding proteins RBMX1, FUS, SFRS1, ACO1, and NONO. The 3' UTR is bound by EIF4B, A2BP1, and ZFP36. [15] A single non-coding variant of FAM208b is transcribed. This sequence is partially complementary to the human gene PCNX1.
A total of 20 transcript variants of FAM208B, including one non-coding RNA, have been observed. [5] While multiple splice variants are present, 18 exons, comprising 7089 base pairs that code for 2331 amino acids, are present in all coding variants. This constitutes approximately 82.1% of the most common transcript variant (X2) and 95.6% of its polypeptide product. The most commonly skipped exon is exon 12 (position chr10:5735304-5735546). Multiple variants have alternative transcription start sites, indicative of an internal promoter sequence.
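The length of the most commonly skipped exon follows directly from its coordinates. The sketch below is a minimal illustration, assuming the coordinates are 1-based and inclusive (a common genome-browser convention that the text does not state); `exon_length` is a hypothetical helper name.

```python
def exon_length(start: int, end: int) -> int:
    """Length in base pairs of a 1-based, inclusive interval (assumed convention)."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1

# Coordinates of exon 12 on chromosome 10, as given in the text.
length_bp = exon_length(5735304, 5735546)

# True when the exon length is a whole number of codons.
in_frame = length_bp % 3 == 0

print(length_bp, in_frame)  # 243 True
```

Under that assumption the exon spans 243 bp, a whole number of codons, so if the exon is entirely coding, skipping it would remove 81 residues without shifting the reading frame.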
The primary isoform of FAM208B consists of 2430 amino acids. The total molecular weight is 268.86 kDa. [16] FAM208B has an isoelectric point of 5.72. [17] FAM208B has an instability index of 53.64, [18] which classifies it as a relatively unstable protein in the unphosphorylated form.
FAM208B has an unusual amino acid composition. An above-average proportion of serine residues (11.1%) is observed. This suggests a potential role in intracellular signaling. [19]
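These figures can be sanity-checked with two rules of thumb. The sketch below assumes an average residue mass of about 110 Da and the conventional instability-index cutoff of 40, above which a protein is classed as unstable. Neither value is taken from the cited sources, and the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass (assumption)
INSTABILITY_THRESHOLD = 40.0     # conventional cutoff; higher values are classed as unstable

def approximate_mass_kda(n_residues: int) -> float:
    """Rough molecular weight in kDa from the residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0

def is_predicted_unstable(instability_index: float) -> bool:
    """Apply the conventional instability-index cutoff."""
    return instability_index > INSTABILITY_THRESHOLD

print(approximate_mass_kda(2430))    # rough estimate, in kDa
print(is_predicted_unstable(53.64))  # classification for the reported index
```

The rough estimate of about 267 kDa is close to the reported 268.86 kDa, and the reported index of 53.64 lies above the cutoff.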
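A residue proportion such as the serine content is a simple count over the sequence. The sketch below shows one way to compute it. The peptide is a made-up toy sequence, not the FAM208B sequence, and `residue_fraction` is a hypothetical helper.

```python
from collections import Counter

def residue_fraction(sequence: str, residue: str) -> float:
    """Fraction of positions in `sequence` occupied by `residue` (one-letter code)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return Counter(seq)[residue.upper()] / len(seq)

# Made-up ten-residue peptide used only for illustration.
toy_sequence = "MSSTPSKLSA"
print(f"{residue_fraction(toy_sequence, 'S'):.1%}")  # 40.0%
```

Applied to the full 2430-residue isoform, the same calculation would be expected to yield the 11.1% reported above.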
FAM208b is predicted to have multiple alpha-helical domains. [20] It is predicted that 25% of the protein forms alpha-helices, 15% forms beta-strands, and 60% is random coil. The various DUF domains are predicted to have variable structure. DUF3699 consists of two helices and four beta-strands. The N-terminal DUF3715 appears to form a stretch of random coil, while the C-terminal DUF3715 has two helices and four beta-strands.
A tertiary structure has not yet been confirmed by X-ray crystallography. Predictions of tertiary structure indicate a modular protein, composed of three modules connected by random coil.
FAM208b has 13 experimentally confirmed phosphorylation sites on serine residues. [21] [22] [23] [24] The high serine content of FAM208b suggests a role in intracellular signaling.
FAM208B has potential for SUMOylation. [25] SUMOylation has been observed to play a role in nuclear transport, which is consistent with the predicted nuclear localization of FAM208B.
FAM208B is predicted to be an intracellular protein, suggesting that it is unlikely to be glycosylated.
FAM208B is predicted to be localized to the cytosol or nucleus. The peptide sequence lacks a signal sequence either at the N-terminus or internally. [26] No transmembrane domains have been observed or predicted, [27] indicating that FAM208B is not secreted or embedded in the cell membrane and is very likely to be intracellular. A nuclear localization signal (NLS) is observed at amino acids 393-403. [28] The NLS is highly conserved in mammals, birds, and reptiles.
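Residue ranges such as 393-403 are conventionally 1-based and inclusive, which differs from zero-based slicing in most programming languages. The sketch below shows the conversion. It uses a placeholder string in place of the real protein sequence, and `subsequence` is a hypothetical helper.

```python
def subsequence(sequence: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive coordinates."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]

# Placeholder of the same length as the primary isoform (not the real sequence).
placeholder = "A" * 2430
nls = subsequence(placeholder, 393, 403)
print(len(nls))  # 11
```

With these coordinates the signal spans 11 residues.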
FAM208b expression is observed to decrease over the course of development. [29] Peak expression is observed in the blastocyst. A sharp decline in expression is observed at the fetal stage, after which expression is maintained at constant levels through adulthood.
FAM208B has been associated with a variety of cancers. The locus of FAM208B (10p15.1) was identified as an aberration site present in translocation-positive follicular lymphoma but not nodal marginal zone lymphoma. [30] FAM208B has also been identified as significantly upregulated in non-Hodgkin lymphoma cells. [31] FAM208B has been identified as a hub gene of stage IV colorectal cancer. [32] A fusion of FAM208B and PLEKHB1 has been validated as a candidate for a fusion of chromosomes 10 and 11 in donor cell leukemia. [33] FAM208B has also been separately observed to be differentially expressed in a variety of cancers. A decrease in transcription of FAM208B has been observed in adrenal cancer, bladder cancer, breast cancer, gastrointestinal cancer, glial cancer, kidney cancer, lymph cancer, skin cancer, muscle cancer, and uterine cancer. An increase in transcription of FAM208B has been observed in cervical cancer, leukemia, liver cancer, lung cancer, and prostate cancer. [34]
FAM208b has also been found to be expressed at higher levels in Acute Macular Degeneration. [35] [36]
FAM208b has been observed to be downregulated in bronchial epithelial cells infected by respiratory syncytial virus and has been postulated as a biosignature of the infection. [37]