RTL6 identifiers | |
---|---|
Aliases | RTL6, Mar6, Mart6, dJ1033E15.2, LDOC1L, leucine zipper, down-regulated in cancer 1-like, LDOC1 like, SIRH3, retrotransposon Gag like 6 |
External IDs | MGI: 2675858; HomoloGene: 18594; GeneCards: RTL6 |
Retrotransposon Gag Like 6 is a protein encoded by the RTL6 gene in humans. [5] RTL6 is a member of the MART family of genes, which are related to Sushi-like retrotransposons found in fish and amphibians. [6] The RTL6 protein is localized to the nucleus and has a predicted leucine zipper motif, a motif known to bind nucleic acids in similar proteins such as LDOC1.
The gene is located on the minus strand of chromosome 22 at 22q13.31, spanning nucleotides 44492570 to 44498125 in the GRCh38.p7 assembly of the human genome. Aliases for the gene include LDOC1L, MAR6, MART6, and SIRH3. RTL6 consists of 2 exons and spans 5556 base pairs of DNA. [7]
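The 5556 base pair figure is consistent with the quoted coordinates if they are read as a 1-based, inclusive interval, which is an assumption here rather than something the cited source states. A minimal sketch of that arithmetic, where the function name `span_length` is purely illustrative:

```python
def span_length(start: int, end: int) -> int:
    """Length in base pairs of a 1-based, inclusive genomic interval.

    Assumption: both coordinates are themselves part of the interval,
    so the length is the difference plus one.
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# GRCh38.p7 coordinates quoted in the text for RTL6 on chromosome 22
rtl6_length = span_length(44492570, 44498125)
print(rtl6_length)  # 5556
```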
RTL6 is a retrotransposon Gag-related gene. It is one of eleven MART (Mammalian Retrotransposon Derived) genes in humans, which are related to Sushi-like retrotransposons with long terminal repeats from fish and amphibians. [6] Between 170 and 310 million years ago (MYA), MART genes lost their ability to retrotranspose and concomitantly gained new, beneficial functions for their host organisms. [8]
RTL6 has an alternate transcription start site 140 base pairs upstream of the normal transcribed region. The primary mRNA is 5355 base pairs long, and the transcript with the upstream start site is 5495 base pairs long. [7]
The primary amino acid sequence of RTL6 is made up of 239 residues. [5] There are no known alternative splice variants of the protein. The molecular weight of the protein is 26.2 kDa and the isoelectric point is 11.58. [9] RTL6 is a proline- and arginine-rich protein. [9]
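As a rough sanity check on the reported 26.2 kDa, the mass of a protein can be approximated from its length alone using the common rule of thumb of about 110 Da per residue. The sketch below applies that approximation to the 239-residue length; the constant and the function name `estimate_mass_kda` are illustrative assumptions, and this is not how the cited value was derived.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein chain.
# This is an approximation, not a measured property of RTL6.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from the residue count alone."""
    if n_residues < 0:
        raise ValueError("residue count must be non-negative")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# 239 residues, as stated for RTL6
print(round(estimate_mass_kda(239), 1))  # 26.3
```

The estimate lands within about 0.1 kDa of the reported value. An exact figure depends on the actual residue composition, which this sketch deliberately ignores.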
RTL6 contains a predicted leucine zipper motif known to participate in nucleic acid binding in other proteins. [9] RTL6 also contains a domain of unknown function spanning amino acid residues 98-177. RTL6 is one of a number of genes belonging to the DUF4939 (domain of unknown function) superfamily. [14]
The secondary structure of RTL6 is made up largely of alpha helices. [15] One region of RTL6, spanning amino acid residues 29–63, is also predicted to participate in a coiled-coil structure. [14]
There are two predicted phosphorylation sites for Protein Kinase C with high confidence scores, at amino acid residues 6 and 45. [16] [17] There is also a predicted ubiquitination site with medium confidence at amino acid residue 8. [18]
RTL6 is expected to be localized to the nucleus and cytosol based on the presence of a leucine zipper domain, the absence of signals indicating secretion or transmembrane domains, and immunohistochemical staining. [19] [20] [21]
RTL6 has been shown to be expressed at high levels during all stages of development and in a wide variety of tissues. [22] [23] [13]
RTL6 expression has been shown to fall in HeLa cervical cancer cells upon treatment with chemotherapeutic Casiopeinas and in A549 lung cancer cells upon treatment with Actinomycin D. [24] [25]
RTL6 has been shown to interact with the following proteins:
Interacting Protein | Full Name |
---|---|
DDIT3 | DNA damage-inducible transcript 3 protein [26] |
NXF1 | Nuclear RNA export factor 1 [27] |
STX18 | Syntaxin 18 [28] |
MAFF | MAF bZIP transcription factor F [28] |
GOPC | Golgi-associated PDZ and coiled-coil motif-containing protein [28] |
BATF3 | Basic leucine zipper transcriptional factor ATF-like 3 [28] |
TERF2 | Telomeric repeat-binding factor 2 [29] |
UXAC | Uronate isomerase (Yersinia pestis) [30] |
The RTL6 protein has been shown to interact with the UXAC protein from Yersinia pestis, the gram-negative bacterium responsible for bubonic plague. [30]
Eleven paralogs were identified for RTL6 in humans. The paralogs have diverse functions and expression patterns, although many are known to have zinc finger domains and bind nucleic acids:
MART Family Name | Accession Number | Sequence Length | Query Cover | Percent Similarity |
---|---|---|---|---|
RTL1 [31] | NP_001128360.1 | 1358 | 100 | 7.6 |
PEG10 (RTL2) [32] | NP_055883.2 | 359 | 35 | 35 |
RTL3 [33] | NP_689907.1 | 2648 | 100 | 2.2 |
RTL4 [34] | NP_001004308.2 | 310 | 100 | 19 |
RTL5 [35] | NP_001019626.1 | 569 | 37 | 53 |
RTL6 [5] | NP_115663.2 | 239 | NA | NA |
LDOC1 (RTL7) [36] | NP_036449.1 | 146 | 33 | 41 |
RTL8A [37] | NP_001071640.1 | 113 | 33 | 44 |
RTL8B [38] | NP_001071641.1 | 113 | 33 | 43 |
RTL8C [39] | NP_001071639.1 | 113 | 33 | 44 |
RTL9 [40] | NP_065820.1 | 1388 | 22 | 39 |
RTL10 [41] | NP_078903.3 | 364 | 34 | 35 |
RTL6 is highly conserved across mammals, including its leucine zipper motif and DUF4939 domain. The gene is also conserved in marsupials such as the opossum but not in birds such as the chicken, suggesting that it arose after the divergence of mammals and birds but before the divergence of marsupials and placental mammals (170-310 MYA). [6]
Organism | Common Name | Classification | Accession Number | Percent Identity | Query Cover | Percent Similarity |
---|---|---|---|---|---|---|
Homo sapiens | Humans | Primate | NP_115663.2 [5] | NA | NA | NA |
Macaca mulatta | Rhesus Monkey | Primate | NP_001181372.1 [42] | 98 | 100 | 99 |
Felis catus | House Cat | Carnivore | XP_003989415.1 [43] | 97 | 100 | 98 |
Mus musculus | House Mouse | Rodent | NP_808298.2 [44] | 92 | 100 | 96 |
Pteropus alecto | Black Flying Fox | Bat | XP_006917396.1 [45] | 96 | 100 | 98 |
Equus caballus | Horse | Odd-Toed Ungulates | XP_005606827.1 [46] | 95 | 100 | 98 |
Bos taurus | Cattle | Even-Toed Ungulates | XP_015326927.1 [47] | 94 | 100 | 97 |
Orcinus orca | Killer Whale | Whales/Dolphins | XP_004279624.1 [48] | 95 | 100 | 98 |
Trichechus manatus latirostris | Florida Manatee | Placentals | XP_004380056.1 [49] | 95 | 100 | 96 |
Erinaceus europaeus | European Hedgehog | Insectivores | XP_016043235.1 [50] | 93 | 100 | 96 |
Ochotona princeps | American Pika | Rabbits/Hares/Pikas | XP_004589491.1 [51] | 92 | 100 | 97 |
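The Percent Identity and Query Cover columns are typical outputs of a pairwise protein alignment. The sketch below shows one common way such figures can be computed from a single gapped alignment. It assumes identity is counted over all alignment columns and query cover over the full query length. The function name `alignment_stats` and the toy sequences are hypothetical and are not RTL6 data.

```python
def alignment_stats(query_aln: str, subject_aln: str, query_len: int):
    """Return (percent identity, query cover) for one gapped pairwise alignment.

    query_aln and subject_aln are the two aligned rows, equal in length,
    with '-' marking gaps. query_len is the length of the full, unaligned
    query sequence.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned rows must have equal length")
    if not query_aln or query_len <= 0:
        raise ValueError("alignment and query must be non-empty")

    columns = len(query_aln)
    # Identical columns: same residue in both rows, gaps never count.
    identical = sum(
        1 for q, s in zip(query_aln, subject_aln) if q == s and q != "-"
    )
    # Query residues that take part in the alignment.
    query_residues = sum(1 for q in query_aln if q != "-")

    percent_identity = 100.0 * identical / columns
    query_cover = 100.0 * query_residues / query_len
    return percent_identity, query_cover


# Toy example (made-up sequences): 8 alignment columns, 10-residue query
print(alignment_stats("MKT-LLQR", "MKSALL-R", query_len=10))  # (62.5, 70.0)
```

Percent Similarity would additionally count conservative substitutions as matches, which requires a substitution matrix and is left out of this sketch.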
The most distantly related organisms with detectable homology to the gene are bony fishes, including salmon and the common carp, but their similarity to the human protein sequence is markedly lower than that of mammals. No traces of the gene are found in groups intermediate between mammals and bony fishes, such as reptiles or amphibians:
Organism | Common Name | Classification | Accession Number | Percent Identity | Query Cover | Percent Similarity |
---|---|---|---|---|---|---|
Homo sapiens | Humans | Primate | NP_115663.2 [5] | 100 | 100 | 100 |
Cyprinus carpio | Common Carp | Bony Fishes | XP_018946777 [52] | 30 | 41 | 34 |
Esox lucius | Northern Pike | Bony Fishes | XP_019899574.1 [53] | 31 | 55 | 31 |
Notothenia coriiceps | Black Rockcod | Bony Fishes | XP_010767110 [54] | 37 | 35 | 39 |