RTL6 | | |
---|---|---|
Identifiers | | |
Aliases | RTL6, Mar6, Mart6, dJ1033E15.2, LDOC1L, leucine zipper, down-regulated in cancer 1-like, leucine zipper, down-regulated in cancer 1 like, LDOC1 like, SIRH3, retrotransposon Gag like 6 | |
External IDs | MGI: 2675858; HomoloGene: 18594; GeneCards: RTL6 | |
Orthologs | | |
Species | Human | Mouse |
Location (UCSC) | Chr 22: 44.49 – 44.5 Mb | Chr 15: 84.44 – 84.44 Mb |
PubMed search | [3] | [4] |

Retrotransposon Gag Like 6 is a protein encoded by the RTL6 gene in humans. [5] RTL6 is a member of the Mart family of genes, which are related to the Sushi-like retrotransposons of fish and amphibians. [6] The RTL6 protein is localized to the nucleus and has a predicted leucine zipper motif; in similar proteins, such as LDOC1, this motif is known to bind nucleic acids.
The gene lies on the minus strand of human chromosome 22 at 22q13.31, from nucleotide 44492570 to 44498125 in the GRCh38.p7 assembly of the human genome. Aliases for the gene include LDOC1L, MAR6, MART6, and SIRH3. RTL6 consists of two exons and spans 5556 base pairs of DNA. [7]
RTL6 is a retrotransposon Gag-related gene. It is one of eleven MART (Mammalian Retrotransposon Derived) genes in humans related to Sushi-like long terminal repeat retrotransposons from fish and amphibians. [6] Between 170 and 310 MYA, MART genes lost their ability to retrotranspose and concomitantly gained new, beneficial functions for their host organisms. [8]
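The stated gene length follows directly from the two coordinates if both endpoints are counted. The short sketch below illustrates that arithmetic; the helper name `span_length` is hypothetical, and it assumes the coordinates are 1-based and inclusive, which the source does not state explicitly.

```python
def span_length(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinates quoted in the text for RTL6 on chromosome 22 (GRCh38.p7).
rtl6_length = span_length(44492570, 44498125)
print(rtl6_length)  # 5556
```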
RTL6 has an alternate start of transcription 140 base pairs upstream of the normal transcribed region. The lengths of the primary mRNA and that with the upstream start of transcription are 5355 and 5495 base pairs respectively. [7]
The primary amino acid sequence of RTL6 is made up of 239 residues. [5] There are no known alternative splice variants of the protein. The molecular weight of the protein is 26.2 kDa and the isoelectric point is 11.58. [9] RTL6 is a proline- and arginine-rich protein. [9]
RTL6 contains a predicted leucine zipper motif known to participate in nucleic acid binding in other proteins. [9] RTL6 also contains a domain of unknown function spanning amino acid residues 98–177. RTL6 is one of a number of genes belonging to the DUF4939 (domain of unknown function) superfamily. [14]
The secondary structure of RTL6 is largely made up of alpha helices. [15] One region of RTL6, amino acid residues 29–63, is also predicted to participate in a coiled-coil structure. [14]
There are two predicted phosphorylation sites for Protein Kinase C with high confidence scores, at amino acid residues 6 and 45. [16] [17] There is also a predicted ubiquitination site with medium confidence at amino acid residue 8. [18]
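The residue ranges and predicted modification sites above can be collected into a small feature map to see how much of the 239-residue protein each region covers and which sites fall inside a region. This is only an illustrative sketch: the `Region` class and the helper `regions_containing` are hypothetical names, and the residue numbers are the ones quoted in the text.

```python
from dataclasses import dataclass

PROTEIN_LENGTH = 239  # residues, as stated in the text


@dataclass(frozen=True)
class Region:
    name: str
    start: int  # first residue, 1-based inclusive
    end: int    # last residue, 1-based inclusive

    def length(self) -> int:
        return self.end - self.start + 1

    def contains(self, residue: int) -> bool:
        return self.start <= residue <= self.end


REGIONS = [
    Region("coiled coil", 29, 63),
    Region("domain of unknown function", 98, 177),
]
SITES = {"PKC phosphorylation": [6, 45], "ubiquitination": [8]}


def regions_containing(residue: int) -> list[str]:
    """Names of the annotated regions that include the given residue."""
    return [region.name for region in REGIONS if region.contains(residue)]


for region in REGIONS:
    share = 100 * region.length() / PROTEIN_LENGTH
    print(f"{region.name}: {region.length()} residues ({share:.1f}% of the protein)")

for kind, residues in SITES.items():
    for residue in residues:
        location = regions_containing(residue) or ["outside the annotated regions"]
        print(f"{kind} site at residue {residue}: {', '.join(location)}")
```

Under these numbers, the predicted phosphorylation site at residue 45 lies within the predicted coiled-coil region, while the sites at residues 6 and 8 fall outside both annotated regions.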
RTL6 is expected to be localized to the nucleus and cytosol based on the presence of a leucine zipper domain, the absence of signals indicating secretion or transmembrane domains, and immunohistochemical staining. [19] [20] [21]
RTL6 has been shown to be expressed at high levels during all stages of development and in a wide variety of tissues. [22] [23] [13]
RTL6 expression has been shown to fall in HeLa cervical cancer cells upon treatment with chemotherapeutic Casiopeinas and in A549 lung cancer cells upon treatment with Actinomycin D. [24] [25]
RTL6 has been shown to interact with the following proteins:
Protein | Description |
---|---|
DDIT3 | DNA damage-inducible transcript 3 protein [26] |
NXF1 | Nuclear RNA export factor 1 [27] |
STX18 | Syntaxin 18 [28] |
MAFF | MAF bZIP transcription factor F [28] |
GOPC | Golgi-associated PDZ and coiled-coil motif-containing protein [28] |
BATF3 | Basic leucine zipper transcriptional factor ATF-like 3 [28] |
TERF2 | Telomeric repeat-binding factor 2 [29] |
UXAC | Uronate isomerase (Yersinia pestis) [30] |
The last of these, UXAC, is a protein from Yersinia pestis, the Gram-negative bacterium responsible for bubonic plague. [30]
Eleven paralogs were identified for RTL6 in humans. The paralogs have diverse functions and expression patterns, although many are known to have zinc finger domains and bind nucleic acids:
MART Family Name | Accession Number | Sequence Length | Query Cover | Percent Similarity |
---|---|---|---|---|
RTL1 [31] | NP_001128360.1 | 1358 | 100 | 7.6 |
PEG10 (RTL2) [32] | NP_055883.2 | 359 | 35 | 35 |
RTL3 [33] | NP_689907.1 | 2648 | 100 | 2.2 |
RTL4 [34] | NP_001004308.2 | 310 | 100 | 19 |
RTL5 [35] | NP_001019626.1 | 569 | 37 | 53 |
RTL6 [5] | NP_115663.2 | 239 | NA | NA |
LDOC1 (RTL7) [36] | NP_036449.1 | 146 | 33 | 41 |
RTL8A [37] | NP_001071640.1 | 113 | 33 | 44 |
RTL8B [38] | NP_001071641.1 | 113 | 33 | 43 |
RTL8C [39] | NP_001071639.1 | 113 | 33 | 44 |
RTL9 [40] | NP_065820.1 | 1388 | 22 | 39 |
RTL10 [41] | NP_078903.3 | 364 | 34 | 35 |
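The paralog comparison is easier to read when the rows are sorted by similarity to RTL6. The sketch below does this with the values from the table above; the function name `rank_by_similarity` is a hypothetical helper, and the RTL6 row itself is omitted because its values are listed as NA.

```python
# (name, sequence length, query cover %, percent similarity), copied from the table.
PARALOGS = [
    ("RTL1", 1358, 100, 7.6),
    ("PEG10 (RTL2)", 359, 35, 35),
    ("RTL3", 2648, 100, 2.2),
    ("RTL4", 310, 100, 19),
    ("RTL5", 569, 37, 53),
    ("LDOC1 (RTL7)", 146, 33, 41),
    ("RTL8A", 113, 33, 44),
    ("RTL8B", 113, 33, 43),
    ("RTL8C", 113, 33, 44),
    ("RTL9", 1388, 22, 39),
    ("RTL10", 364, 34, 35),
]


def rank_by_similarity(rows):
    """Return the rows ordered from most to least similar to RTL6."""
    return sorted(rows, key=lambda row: row[3], reverse=True)


ranked = rank_by_similarity(PARALOGS)
for name, length, cover, similarity in ranked:
    print(f"{name:<14}{length:>6}{cover:>6}{similarity:>7}")
```

Sorted this way, RTL5 has the highest listed similarity and RTL3 the lowest. Query cover differs widely between rows, so the similarity values are not directly comparable without taking it into account.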
RTL6 is highly conserved across mammals, including its leucine zipper motif and DUF4939. The gene is also conserved in marsupials such as the opossum but not in birds such as the chicken, suggesting that it arose after the divergence of mammals and birds but before the divergence of marsupials and placental mammals (170–310 MYA). [6]
Organism | Common Name | Classification | Accession Number | Percent Identity | Query Cover | Percent Similarity |
---|---|---|---|---|---|---|
Homo sapiens | Humans | Primate | NP_115663.2 [5] | NA | NA | NA |
Macaca mulatta | Rhesus Monkey | Primate | NP_001181372.1 [42] | 98 | 100 | 99 |
Felis catus | House Cat | Carnivore | XP_003989415.1 [43] | 97 | 100 | 98 |
Mus musculus | Common Mouse | Rodent | NP_808298.2 [44] | 92 | 100 | 96 |
Pteropus alecto | Black Flying Fox | Bat | XP_006917396.1 [45] | 96 | 100 | 98 |
Equus caballus | Horse | Odd-Toed Ungulates | XP_005606827.1 [46] | 95 | 100 | 98 |
Bos taurus | Cattle | Even-Toed Ungulates | XP_015326927.1 [47] | 94 | 100 | 97 |
Orcinus orca | Killer Whale | Whales/Dolphins | XP_004279624.1 [48] | 95 | 100 | 98 |
Trichechus manatus latirostris | Florida Manatee | Placentals | XP_004380056.1 [49] | 95 | 100 | 96 |
Erinaceus europaeus | European Hedgehog | Insectivores | XP_016043235.1 [50] | 93 | 100 | 96 |
Ochotona princeps | American Pika | Rabbits/Hares | XP_004589491.1 [51] | 92 | 100 | 97 |
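The percent identity column reports how many aligned positions carry the same residue in the human protein and its ortholog. As a rough sketch of that calculation, the function below scores a pairwise alignment that has already been computed. The name `percent_identity` and the two short sequences are hypothetical, and the convention of ignoring gapped columns is an assumption; alignment tools may instead divide by the full alignment length.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap character.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100 * matches / len(columns)


# Toy aligned fragments, invented for illustration (not RTL6 sequence).
query = "MKRP-LLEAR"
subject = "MKRPALLDAR"
print(round(percent_identity(query, subject), 1))  # 88.9
```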
The most distantly related organisms with detectable homology to the gene are bony fishes, including salmon and the common carp, but their similarity to the human protein sequence is markedly lower than that of mammals. No traces of the gene can be seen in groups that lie between bony fishes and mammals, such as reptiles and amphibians:
Organism | Common Name | Classification | Accession Number | Percent Identity | Query Cover | Percent Similarity |
---|---|---|---|---|---|---|
Homo sapiens | Humans | Primate | NP_115663.2 [5] | 100 | 100 | 100 |
Cyprinus carpio | Common Carp | Bony Fishes | XP_018946777 [52] | 30 | 41 | 34 |
Esox lucius | Northern Pike | Bony Fishes | XP_019899574.1 [53] | 31 | 55 | 31 |
Notothenia coriiceps | Black Rockcod | Bony Fishes | XP_010767110 [54] | 37 | 35 | 39 |