TNRC18 identifiers | Value |
---|---|
Aliases | TNRC18, CAGL79, TNRC18A, trinucleotide repeat containing 18 |
External IDs | MGI: 3648294; HomoloGene: 45603; GeneCards: TNRC18 |
Trinucleotide repeat containing 18 is a protein that in humans is encoded by the TNRC18 gene. [5]
The exact function of TNRC18 is not yet well understood by the scientific community. The protein sequence provided by the National Center for Biotechnology Information (NCBI) database includes a Bromo Adjacent Homology (BAH) Domain within TNRC18. [6] BAH domains are often found in chromatin-associated proteins and assist in the silencing of genes. [7]
According to the UCSC Genome Browser, TNRC18 is located on chromosome 7 in humans (chr7: 5,306,800-5,423,546). The browser lists 30 exons and 29 introns. Directly preceding TNRC18 is SLC29A4, and immediately following it is AC092171.4. [8] SLC29A4 encodes the plasma membrane monoamine transporter in humans.
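The genomic span implied by these coordinates can be worked out directly. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive as displayed (an assumption, since the source does not state the convention); the helper names `parse_region`, `span_length`, and `intron_count` are hypothetical and chosen only for illustration.

```python
def parse_region(region: str) -> tuple[str, int, int]:
    """Split a 'chrN: start-end' string into chromosome, start, and end."""
    chrom, coords = region.split(":")
    start_text, end_text = coords.split("-")
    start = int(start_text.replace(",", ""))
    end = int(end_text.replace(",", ""))
    return chrom.strip(), start, end


def span_length(start: int, end: int) -> int:
    """Length of a region, assuming 1-based inclusive coordinates."""
    return end - start + 1


def intron_count(exons: int) -> int:
    """A linear transcript with n exons has n - 1 introns between them."""
    return exons - 1


chrom, start, end = parse_region("chr7: 5,306,800-5,423,546")
length = span_length(start, end)
introns = intron_count(30)
print(chrom, length, introns)
```

Under that assumption the locus spans roughly 117 kb, and the 30 exons reported are consistent with the 29 introns listed.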
GeneCards lists five aliases for TNRC18: Long CAG Trinucleotide Repeat-Containing Gene 79 Protein, Trinucleotide Repeat-Containing Gene 18 Protein, CAGL79, KIAA1856, and TNRC18A. TNRC18 also has two paralogs, BAH Domain And Coiled-Coil Containing 1 (BAHCC1) and Bromo Adjacent Homology Domain Containing 1 (BAHD1). [9]
The NCBI gene page for TNRC18 lists 9 different protein isoforms across 12 transcript variant mRNA sequences. [10] TNRC18 isoform X7 is encoded by mRNA transcript variants X7-X10. Additionally, isoforms X8 and X9 are produced by variants X11 and X12 respectively.
Isoform | mRNA | Protein | mRNA length (bp) | Protein length (aa) |
---|---|---|---|---|
TNRC18 | NM_001080495 | NP_001073964 | 10586 | 2968 |
TNRC18 isoform X1 | XM_017012728 | XP_016868217 | 10902 | 2979 |
TNRC18 isoform X2 | XM_017012730 | XP_016868219 | 10424 | 2978 |
TNRC18 isoform X3 | XM_017012731 | XP_016868220 | 10299 | 2936 |
TNRC18 isoform X4 | XM_017012732 | XP_016868221 | 10296 | 2935 |
TNRC18 isoform X5 | XM_017012733 | XP_016868222 | 10284 | 2931 |
TNRC18 isoform X6 | XM_017012734 | XP_016868223 | 11319 | 2929 |
TNRC18 isoform X7 | XM_017012735 | XP_016868224 | 10312 | 2905 |
TNRC18 isoform X7 | XM_017012736 | XP_016868225 | 10899 | 2905 |
TNRC18 isoform X7 | XM_017012737 | XP_016868226 | 10472 | 2905 |
TNRC18 isoform X7 | XM_017012738 | XP_016868227 | 10194 | 2905 |
TNRC18 isoform X8 | XM_017012739 | XP_016868228 | 6816 | 2189 |
TNRC18 isoform X9 | XM_017012740 | XP_016868229 | 6839 | 2094 |
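The isoform and transcript counts can be tallied from the table. This sketch assumes that the "9 isoforms across 12 transcript variants" refers to the predicted (XM_/XP_) entries only, excluding the reference sequence NM_001080495; that reading is an interpretation, not something the source states explicitly.

```python
from collections import Counter

# (isoform, mRNA accession) pairs copied from the predicted XM_ rows of the table.
PREDICTED_TRANSCRIPTS = [
    ("X1", "XM_017012728"),
    ("X2", "XM_017012730"),
    ("X3", "XM_017012731"),
    ("X4", "XM_017012732"),
    ("X5", "XM_017012733"),
    ("X6", "XM_017012734"),
    ("X7", "XM_017012735"),
    ("X7", "XM_017012736"),
    ("X7", "XM_017012737"),
    ("X7", "XM_017012738"),
    ("X8", "XM_017012739"),
    ("X9", "XM_017012740"),
]

# Count how many transcript variants encode each protein isoform.
transcripts_per_isoform = Counter(isoform for isoform, _ in PREDICTED_TRANSCRIPTS)

print(len(PREDICTED_TRANSCRIPTS), "transcripts")
print(len(transcripts_per_isoform), "isoforms")
print(transcripts_per_isoform["X7"], "transcripts encode isoform X7")
```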
The protein sequence provided by NCBI gives human TNRC18 a length of 2968 amino acids. [6] The ExPASy Compute pI/Mw tool [11] predicts an isoelectric point of 8.88 and a molecular weight of 315 kDa for TNRC18. Additionally, the NCBI protein sequence for TNRC18 is annotated with nine phosphorylation sites: eight phosphoserines and one phosphothreonine. A large serine repeat lies upstream of the BAH domain, at amino acid positions 2604–2670. The BAH domain spans positions 2816–2960.
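The positions quoted above translate into region lengths with simple arithmetic. The sketch below assumes the position ranges are inclusive and treats 315 kDa as 315,000 Da; the variable and function names are hypothetical.

```python
PROTEIN_LENGTH_AA = 2968
PREDICTED_MASS_DA = 315_000   # 315 kDa, as reported by Compute pI/Mw
SERINE_REPEAT = (2604, 2670)  # inclusive positions (assumed)
BAH_DOMAIN = (2816, 2960)     # inclusive positions (assumed)


def region_length(region: tuple[int, int]) -> int:
    """Number of residues in an inclusive (start, end) range."""
    start, end = region
    return end - start + 1


serine_len = region_length(SERINE_REPEAT)
bah_len = region_length(BAH_DOMAIN)
# Residues lying strictly between the serine repeat and the BAH domain.
linker_len = BAH_DOMAIN[0] - SERINE_REPEAT[1] - 1
# Residues remaining after the BAH domain up to the C-terminus.
c_terminal_tail = PROTEIN_LENGTH_AA - BAH_DOMAIN[1]
# Average mass per residue implied by the predicted molecular weight.
mean_residue_mass = PREDICTED_MASS_DA / PROTEIN_LENGTH_AA

print(serine_len, bah_len, linker_len, c_terminal_tail)
print(round(mean_residue_mass, 1))
```

On these assumptions the BAH domain sits near the C-terminus, and the implied average of about 106 Da per residue is in the range expected for a protein.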
The predicted secondary structure for TNRC18 consists of 32.61% alpha helix, 6.74% extended strand, and 60.55% random coil. This was found using the GOR4 program available at PRABI-Lyon-Gerland with the NCBI protein sequence for TNRC18. [12] [13]
RNA sequencing of human tissue samples found ubiquitous TNRC18 expression. The most prominent expression was observed in the colon, kidney, and prostate tissue samples. In fetal human tissue samples, notable expression was found in the stomach, lung, and brain. RNA sequencing data were acquired through the TNRC18 gene expression page on NCBI. [14]
The Human Protein Atlas shows the highest RNA expression of TNRC18 in the brain, endocrine tissue, and muscle tissue. The highest protein expression is observed in the brain, endocrine tissue, lung, gastrointestinal tract, and male- and female-specific tissues. Conversely, no protein expression is detected in eye or blood tissue, even though TNRC18 RNA is expressed ubiquitously. [15]
TNRC18 expression in the mouse brain is shown below. Noteworthy expression, shown in color, is observed in the olfactory bulb, isocortex, and cerebellar cortex. This image and the brain atlas information are provided by the Allen Institute Brain Atlas. [16] [17]
An NCBI Protein BLAST search for reference proteins lists the following orthologs for human TNRC18. The table is ordered first by increasing estimated date of divergence from humans in millions of years (MYA) and then by highest-to-lowest sequence identity with humans. Date of divergence information was acquired from TimeTree, [18] and sequence identity and similarity percentages were found by pairwise sequence alignment using the European Bioinformatics Institute (EMBL-EBI) EMBOSS Needle program. [19]
Species | Common name | NCBI Protein Accession | Date of Divergence (MYA) | Sequence Identity with Humans (%) | Sequence Similarity with Humans (%) | Length (aa) |
---|---|---|---|---|---|---|
Homo sapiens | Human | NP_001073964 | 0 | 100 | 100 | 2968 |
Pongo abelii | Sumatran orangutan | XP_024097037 | 15.76 | 98.6 | 98.9 | 2964 |
Nomascus leucogenys | Northern white-cheeked gibbon | XP_030652734 | 19.8 | 98.4 | 98.8 | 2965 |
Ictidomys tridecemlineatus | Thirteen-lined ground squirrel | XP_021575654 | 90 | 85.3 | 88.2 | 2924 |
Rattus norvegicus | Brown rat | NP_001100593 | 90 | 81.4 | 85.6 | 2900 |
Acinonyx jubatus | Cheetah | XP_026898111 | 96 | 87.5 | 89.9 | 2972 |
Orcinus orca | Killer whale | XP_012388176 | 96 | 87.3 | 89.8 | 2967 |
Lynx canadensis | Canada lynx | XP_030156810 | 96 | 87.3 | 89.8 | 2966 |
Enhydra lutris kenyoni | Sea otter | XP_022356606 | 96 | 86.8 | 89.6 | 2999 |
Odocoileus virginianus texanus | White-tailed deer | XP_020730376 | 96 | 86.2 | 89.1 | 2939 |
Ursus arctos horribilis | Grizzly bear | XP_026371323 | 96 | 82.1 | 85 | 3111 |
Haliaeetus leucocephalus | Bald eagle | XP_010568620 | 312 | 58.7 | 69.4 | 2928 |
Apteryx rowi | Okarito kiwi | XP_025939070 | 312 | 58.7 | 69.3 | 2932 |
Pogona vitticeps | Central bearded dragon | XP_020658091 | 312 | 54.7 | 65.7 | 2943 |
Python bivittatus | Burmese python | XP_015744874 | 312 | 53.6 | 65 | 2872 |
Perca flavescens | Yellow perch | XP_028454308 | 435 | 39.1 | 50.5 | 3044 |
Amphiprion ocellaris | Ocellaris clownfish | XP_023150301 | 435 | 39 | 50 | 3071 |
Branchiostoma belcheri | Lancelet | XP_019646059 | 684 | 25.6 | 36.5 | 2799 |
Saccoglossus kowalevskii | Acorn worm | XP_002738509 | 684 | 23.6 | 34.8 | 3174 |
Crassostrea virginica | Eastern oyster | XP_022319473 | 797 | 21 | 31.6 | 2200 |
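The two-level ordering used for the ortholog table (ascending divergence date, then descending sequence identity) can be expressed as a single sort key. This is a minimal sketch using a subset of rows copied from the table; the function name `table_order` is hypothetical.

```python
# (common name, divergence in MYA, identity %) copied from a subset of the
# table rows, deliberately listed out of order.
ROWS = [
    ("Eastern oyster", 797, 21.0),
    ("Brown rat", 90, 81.4),
    ("Human", 0, 100.0),
    ("Burmese python", 312, 53.6),
    ("Cheetah", 96, 87.5),
    ("Thirteen-lined ground squirrel", 90, 85.3),
    ("Bald eagle", 312, 58.7),
    ("Sumatran orangutan", 15.76, 98.6),
]


def table_order(row):
    """Sort key: divergence ascending, then identity descending."""
    _, divergence_mya, identity_pct = row
    return (divergence_mya, -identity_pct)


ordered = sorted(ROWS, key=table_order)
for name, mya, identity in ordered:
    print(f"{name:32s} {mya:>7} {identity:>6}")
```

Negating the identity value lets one ascending sort handle both criteria; Python's sort is stable, so rows that tie on both values keep their input order.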
The following post-translational modifications and motifs are predicted for TNRC18 using tools found on the ExPASy Proteomics page, [20] with the exception of the GPS-MSP methylation program, which is found on the Cuckoo Workgroup site. [21] This list is not exhaustive of the post-translational modifications or motifs associated with TNRC18 and is based solely on software predictions.
Of the predicted post-translational modifications, there are 92 O-linked β-N-acetylglucosamine (O-β-GlcNAc) sites with a high scoring threshold (>=0.5), 23 sumoylation sites, two palmitoylation sites, one methylation site, and 52 glycation sites. Additionally, GPS 5.0 predicted 22,317 phosphorylation sites on TNRC18. The program was also used to confirm the nine phosphorylation sites annotated on the NCBI protein page for TNRC18.
Program | Predicted post-translational modification | Amino Acid location on protein |
---|---|---|
YinOYang | O-β-GlcNAc | 80, 185, 199, 352, 416, 626, 627, 640, 788, 991, 995, 1033, 1038, 1533, 1753, 1956, 2023, 2368, 2404, 2510, 2557, 2559–2573, 2611, 2614–2667, 2721, 2892 |
GPS-SUMO (SUMOsp) | Sumoylation | 238–242, 467, 620, 652, 858–862, 1159, 1258, 1461, 1544, 1629, 1638, 1704, 1743, 1885–1889, 1893, 1898, 1899, 2098, 2213–2217, 2259–2263, 2463, 2542, 2964–2968 |
CSS-Palm | Palmitoylation | 284, 1196 |
GPS-MSP | Methylation | 2332 |
GPS 5.0 Phosphorylation | Phosphorylation | 263, 611, 1127, 1136, 1540, 1857, 1863, 2146, 2771 |
NetGlycate | Glycation | 156, 197, 270, 272, 429, 492, 548, 609, 652, 692, 749, 755, 938, 988, 1058, 1059, 1131, 1370, 1461, 1470, 1503, 1554, 1558, 1577, 1615, 1618, 1791, 1797, 1893, 1895, 1898, 1899, 1933, 1967, 1978, 2028, 2091, 2301, 2315, 2328, 2388, 2438, 2475, 2519, 2702, 2709, 2720, 2750, 2801, 2816, 2857, 2869 |
Eukaryotic Linear Motif (ELM) | Coiled-Coil region | 916–949, 1481–1516 |
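Position lists like those in the table can be parsed to compare the number of listed entries with the site counts quoted in the text. The sketch below is illustrative only: `parse_positions` is a hypothetical helper, it treats each comma-separated entry (single position or range) as one site, and it accepts either a hyphen or an en dash as the range separator.

```python
import re

# Position strings copied from the sumoylation and palmitoylation rows.
SUMOYLATION = (
    "238-242, 467, 620, 652, 858-862, 1159, 1258, 1461, 1544, 1629, 1638, "
    "1704, 1743, 1885-1889, 1893, 1898, 1899, 2098, 2213-2217, 2259-2263, "
    "2463, 2542, 2964-2968"
)
PALMITOYLATION = "284, 1196"


def parse_positions(text: str) -> list[tuple[int, int]]:
    """Return one inclusive (start, end) pair per comma-separated entry."""
    entries = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue
        # Accept an ASCII hyphen or an en dash between range endpoints.
        parts = re.split(r"[-\u2013]", token)
        entries.append((int(parts[0]), int(parts[-1])))
    return entries


sumo_entries = parse_positions(SUMOYLATION)
palm_entries = parse_positions(PALMITOYLATION)
# Total residues covered when each range is expanded.
sumo_residues = sum(end - start + 1 for start, end in sumo_entries)

print(len(sumo_entries), "sumoylation entries covering", sumo_residues, "residues")
print(len(palm_entries), "palmitoylation entries")
```

Counting entries rather than residues matches the 23 sumoylation sites and two palmitoylation sites quoted in the text.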
Shen et al. observed that circTNRC18 inhibits miR-762 activity in pre-eclampsia (PE) placenta tissue samples. [22] The inhibition of miR-762 by circTNRC18 resulted in elevated Grhl2 protein levels. PE placenta samples had lower miR-762 levels and higher Grhl2 levels, which was attributed to overexpression of circTNRC18. Shen et al. conclude that circTNRC18 was upregulated in PE placentas compared with placentas from normal pregnancies.
Chu et al. found that, of 19 CpG sites linked with estimated glomerular filtration rate (eGFR), 5 were also linked with renal fibrosis and with DNA methylation changes in the kidney cortex of chronic kidney disease (CKD) patients. [23] Chu et al. note that reduced eGFR is a defining feature of CKD. These 5 CpG sites were found in the genes TNRC18, PTPN6/PHB2, ANKRD11, PQLC2, and PRPF8. Chu et al. conclude that epigenetic variation may be associated with CKD.