TMTC4 | |
---|---|
Aliases | TMTC4, transmembrane and tetratricopeptide repeat containing 4, transmembrane O-mannosyltransferase targeting cadherins 4 |
External IDs | OMIM: 618203; MGI: 1921050; HomoloGene: 32796; GeneCards: TMTC4; OMA: TMTC4 - orthologs |
Transmembrane and tetratricopeptide repeat containing 4 (TMTC4) is a protein that in humans is encoded by the TMTC4 gene. [5] The protein crosses the plasma membrane ten times, with portions residing in the ER lumen and the cytosol. Its predicted structure is largely a series of alpha helices.
TMTC4 is located on chromosome 13 at 13q32.3. The gene is flanked by ADP ribosylation factor 4 pseudogene 3 (ARF4P3) on the left, and ribosomal protein S26 pseudogene 47 (RPS26P47) on the right. TMTC4 spans 4043 bp and has a total of 23 exons. [5]
TMTC4 has seven isoform variants, the most common being isoform 1 at 4043 bp. [5]
Isoform | Length (bp) |
---|---|
1 | 4043 |
2 | 3833 |
3 | 3500 |
4 | 4217 |
5 | 4120 |
6 | 4037 |
7 | 3827 |
The 5’ UTR of TMTC4 is short, and in many of the shorter isoforms portions of this untranslated region are truncated. In comparison, the 3’ UTR is long and is often complete across the seven isoforms.
The molecular weight of TMTC4 is 85.0 kDa, and the protein contains no positive, negative, or neutral amino acid clusters and no charge runs exceeding normal lengths. In a distant ortholog (purple sea urchin), the molecular weight of TMTC4 is 85.5 kDa, and again there are no charge runs, no positive, negative, or neutral clusters, and no unusual spacings. Protein composition is strongly similar across species. The isoelectric point of the domain of unknown function (DUF 1736) is lower than that of the protein overall.
Domain | Amino Acids | Molecular Weight (kDa) | Isoelectric Point |
---|---|---|---|
Human TMTC4 | 760 | 85.0 | 9.135 |
DUF 1736 | 75 | 8.6 | 4.123 |
TPR repeats | 234 | 26.7 | 9.509 |
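As a rough sanity check on the table above, the reported masses can be compared with a back-of-the-envelope estimate. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb and not the method used to produce the table; the function name `estimate_mass_kda` is illustrative only.

```python
# Rough mass estimate from residue count. The 110 Da average residue mass is
# a rule-of-thumb assumption, not the method used to produce the table.
AVG_RESIDUE_MASS_DA = 110.0

# (region, amino acids, reported molecular weight in kDa), copied from the table
regions = [
    ("Human TMTC4", 760, 85.0),
    ("DUF 1736", 75, 8.6),
    ("TPR repeats", 234, 26.7),
]


def estimate_mass_kda(n_residues: int) -> float:
    """Approximate mass in kDa as residue count times the average residue mass."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


for name, n_residues, reported in regions:
    estimate = estimate_mass_kda(n_residues)
    print(f"{name}: estimated {estimate:.1f} kDa, reported {reported:.1f} kDa")
```

The estimates land within a few percent of the reported values, which is about what a composition-independent approximation can be expected to give.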
TMTC4 has ten transmembrane regions, all of them spaced within the first half of the protein. [6]
TMTC4 contains multiple tetratricopeptide repeat (TPR) sequences, which place it in the TPR superfamily of proteins. DUF1736 lies upstream of the TPR region. A seven-residue repeat (SRR) is located toward the end of the protein and is thought to form a coiled-coil structure. [7] Another member of the TPR family, PFTA (protein prenyltransferases alpha subunit repeat), is located within the protein's TPR region and is believed to be involved in signal transduction and vesicular traffic regulation. [8] LSPR coagulation factor V, also a repeat motif, is located within the TPR region and is thought to be a central regulator of hemostasis. [9]
TMTC4 is predicted to consist mainly of alpha helices, especially within the TPR region, with a small number of beta strands spaced throughout the first half of the protein. [10]
There are four predicted nuclear localization signals, each of which would tag the protein for nuclear import. [6] At the very end of the protein, however, there is a predicted ER retention signal, which would prevent the protein from leaving the ER. The protein has three predicted N-glycosylation sites, which could alter its structure and function, and ten predicted phosphorylation sites, each a possible activation site for a regulatory mechanism. [6]
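Predictions of N-glycosylation sites usually start from the consensus sequon N-X-S/T, where X is any residue except proline. The sketch below shows that first step only, on a made-up toy peptide. It is not the TMTC4 sequence and not the tool used for the predictions cited here, and `find_sequons` is a hypothetical helper name.

```python
import re

# N-linked glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported individually.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical toy peptide for illustration, NOT the TMTC4 sequence.
toy_peptide = "MANGTLLNPSAANRSK"
print(find_sequons(toy_peptide))  # N8 is skipped because it is followed by Pro
```

A sequon match is only a candidate site; whether it is actually glycosylated depends on context such as membrane topology, which this scan does not model.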
TMTC4 is expressed in all human tissues, with the highest expression in the brain and spinal cord. [11]
TMTC4 protein abundance appears to be lower than normal.
There is one possible promoter for the TMTC4 gene, located in the 5’ UTR, upstream of the start of the coding sequence.
Currently the function of TMTC4 has not been characterized.
Possible interacting proteins are NRG1, PEX19, HERC3, TXNDC15, and COL1A1. All of these were detected through affinity chromatography. [12]
Protein Name | Known function | Location |
---|---|---|
Neuregulin 1 [NRG1] | mediates cell to cell signaling [13] | membrane glycoprotein [13] |
Peroxisomal Biogenesis Factor 19 [PEX19] | cytosolic chaperone [14] | membrane receptor protein [14] |
ECT And RLD Domain Containing E3 Ubiquitin Protein Ligase 3 [HERC3] | member of the ubiquitin ligase family [15] | cytosol [15] |
Thioredoxin Domain Containing 15 [TXNDC15] | Not known | Not known |
Collagen Type I Alpha 1 Chain [COL1A1] | triple helix collagen protein [16] | extracellular [16] |
Ortholog space for TMTC4 spans a large portion of evolutionary time. TMTC4 is present in mammals, reptiles, amphibians, birds, fish, and invertebrates. It is not present in plants, bacteria, archaea, or fungi. [17]
Sequence Number | Genus and Species | Common Name | Accession # (protein) | Identity | Date of Divergence (MYA) |
---|---|---|---|---|---|
1 | Heterocephalus glaber | Naked mole rat | EHB03258.1 | 88% | 94 |
2 | Rattus norvegicus | Brown rat | NP_001127886.1 | 90% | 94 |
3 | Myotis brandtii | Brandt's bat | EPQ01527.1 | 90% | 94 |
4 | Pteropus alecto | Black flying fox | XP_006909447.1 | 93% | 88 |
5 | Erinaceus europaeus | European hedgehog | XP_016040457.1 | 85% | 94 |
6 | Sorex araneus | Common shrew | XP_004614101.1 | 86% | 94 |
7 | Sus scrofa | Wild boar | NP_001239134.1 | 91% | 94 |
8 | Lipotes vexillifer | Baiji | XP_007461591.1 | 90% | 88 |
9 | Ailuropoda melanoleuca | Giant panda | XP_019650336.1 | 90% | 94 |
10 | Acinonyx jubatus | Cheetah | XP_014931490.1 | 93% | 94 |
11 | Tyto alba | Barn owl | KFV56414.1 | 85% | 320 |
12 | Charadrius vociferus | Killdeer | KGL87053.1 | 84% | 320 |
13 | Python bivittatus | Burmese python | XP_007425712.1 | 81% | 320 |
14 | Anolis carolinensis | Carolina anole | XP_008105174.1 | 82% | 320 |
15 | Xenopus tropicalis | Western clawed frog | NP_001121486.1 | 38% | 353 |
16 | Nanorana parkeri | Nanorana parkeri | XP_018432106.1 | 73% | 353 |
17 | Callorhinchus milii | Australian ghostshark | XP_007885231.1 | 68% | 465 |
18 | Crassostrea gigas | Pacific oyster | XP_011422949.1 | 50% | 758 |
19 | Strongylocentrotus purpuratus | Purple sea urchin | XP_011670776.1 | 49% | 627 |
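One way to read the table above is to average the percent identity within each divergence date. The following sketch does that with values copied from the table; the grouping approach and the function name `mean_identity_by_divergence` are illustrative choices, not part of the cited analysis.

```python
from collections import defaultdict
from statistics import mean

# (common name, percent identity, divergence in MYA), copied from the table
orthologs = [
    ("Naked mole rat", 88, 94), ("Brown rat", 90, 94),
    ("Brandt's bat", 90, 94), ("Black flying fox", 93, 88),
    ("European hedgehog", 85, 94), ("Common shrew", 86, 94),
    ("Wild boar", 91, 94), ("Baiji", 90, 88),
    ("Giant panda", 90, 94), ("Cheetah", 93, 94),
    ("Barn owl", 85, 320), ("Killdeer", 84, 320),
    ("Burmese python", 81, 320), ("Carolina anole", 82, 320),
    ("Western clawed frog", 38, 353), ("Nanorana parkeri", 73, 353),
    ("Australian ghostshark", 68, 465), ("Pacific oyster", 50, 758),
    ("Purple sea urchin", 49, 627),
]


def mean_identity_by_divergence(rows):
    """Map each divergence date (MYA) to the mean percent identity of its rows."""
    groups = defaultdict(list)
    for _name, identity, mya in rows:
        groups[mya].append(identity)
    return {mya: mean(values) for mya, values in sorted(groups.items())}


for mya, avg in mean_identity_by_divergence(orthologs).items():
    print(f"{mya:>4} MYA: mean identity {avg:.1f}%")
```

Grouping this way makes outliers easier to spot, such as the two amphibian entries at 353 MYA, whose identities differ widely.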
The paralog space for TMTC4 consists of the TMTC gene family, which has four members: TMTC1, TMTC2, TMTC3, and TMTC4. TMTC1 and TMTC3 split from TMTC4 about 1200 million years ago, while TMTC2 split from TMTC4 about 1400 million years ago. Both events fall somewhere between the divergence of invertebrates and the divergence of plants.