TMEM128 | |
---|---|
Aliases | TMEM128, transmembrane protein 128 |
External IDs | MGI: 1913559; HomoloGene: 11944; GeneCards: TMEM128; OMA: TMEM128 - orthologs |
TMEM128, also known as transmembrane protein 128, is a protein that in humans is encoded by the TMEM128 gene. TMEM128 has three transcript variants, which differ in their 5' UTRs and start codon locations. [5] The protein contains four transmembrane domains and is localized to the endoplasmic reticulum membrane. [6] [7] [8] TMEM128 is regulated at the gene, transcript, and protein levels. While the function of TMEM128 is poorly understood, it interacts with several proteins associated with the cell cycle, signal transduction, and memory.
The human TMEM128 gene is located on the minus strand at 4p16.3. [9] It contains 5 exons and spans 12,701 base pairs including introns. [5] [9] [10]
There are two isoforms of TMEM128. [11] Isoform 1, the longer of the two, is encoded by two transcript variants that differ in the 3' UTR. [11] Variant 1 mRNA is 1,243 base pairs long, while variant 2 mRNA is 1,241 base pairs long. [5] [12] The transcript encoding isoform 2 differs in the 5' UTR and uses a different start codon than variant 1. [11] This transcript is longer, at 1,785 base pairs, and the protein it encodes has a different N-terminus. [13]
TMEM128 is neighbored upstream by LYAR (Ly1 antibody reactive) and downstream by OTOP1 (otopetrin 1). [14]
TMEM128 isoform 1 is a protein 165 amino acids long that contains four transmembrane domains. [6] These domains span amino acids 49-69, 81-101, 119-139, and 144-164. [6] Isoform 1 has a molecular weight of 18,882 Da and a pI of 6.27. [15] Compositional analysis shows that its amino acid composition is similar to that of the average protein and that it contains no significant repeats. [15]
Isoform 2 is a protein 141 amino acids long that also contains four transmembrane domains. [17] [18] Its molecular weight and isoelectric point differ from those of isoform 1: 16,093 Da and a pI of 6.8. [15]
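The domain boundaries reported for isoform 1 can be turned into segment lengths with a few lines of arithmetic. The following is a minimal sketch, assuming the boundaries are 1-based and inclusive; the function and variable names are illustrative and do not come from any cited tool.

```python
# Transmembrane (TM) domain boundaries of isoform 1, taken from the text
# above. Assumption: residue numbers are 1-based and inclusive.
PROTEIN_LENGTH = 165
TM_DOMAINS = [(49, 69), (81, 101), (119, 139), (144, 164)]


def segment_lengths(length, domains):
    """Return (n_tail, tm_lengths, loop_lengths, c_tail) in residues."""
    tm_lengths = [end - start + 1 for start, end in domains]
    # Loops are the residues strictly between consecutive TM domains.
    loop_lengths = [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(domains, domains[1:])
    ]
    n_tail = domains[0][0] - 1        # residues before the first TM domain
    c_tail = length - domains[-1][1]  # residues after the last TM domain
    return n_tail, tm_lengths, loop_lengths, c_tail


n_tail, tm_lengths, loop_lengths, c_tail = segment_lengths(
    PROTEIN_LENGTH, TM_DOMAINS
)
print("N-terminal tail:", n_tail)
print("TM domain lengths:", tm_lengths)
print("Loop lengths:", loop_lengths)
print("C-terminal tail:", c_tail)
```

Under these assumptions, each transmembrane domain is 21 residues long, and the N-terminal tail (48 residues) is much longer than the C-terminal tail (1 residue).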
Type of secondary structure | Number of amino acids | Percent composition |
---|---|---|
Alpha helix | 34 | 20.61% |
Extended strand | 59 | 35.76% |
Random coil | 72 | 43.64% |
Predicted secondary structure composition shows that random coil is the largest single category, at just under half of the residues. [19] No disulfide bonds are predicted to be present. [20]
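The percentages in the table follow directly from the residue counts and the 165-residue length of isoform 1. A short sketch of that arithmetic, using only the counts from the table (the variable names are illustrative):

```python
# Residue counts per predicted secondary-structure class (from the table).
counts = {"Alpha helix": 34, "Extended strand": 59, "Random coil": 72}

total = sum(counts.values())  # should equal the isoform 1 length, 165

# Percent composition, rounded to two decimals as in the table.
percentages = {name: round(100 * n / total, 2) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {counts[name]} residues, {pct}%")
```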
The membrane topology of TMEM128 shows the four transmembrane domains, a longer N-terminus, and a shorter C-terminus.
The tertiary structure of TMEM128 is predicted to contain four helical domains, which correspond to the transmembrane sections of the protein. In the predicted models, the protein is colored in a rainbow gradient from N-terminus to C-terminus.
Several promoters and enhancers of TMEM128 exist: the GH04J00427 promoter is located near the start of transcription, the GH04J004540 enhancer is located downstream of the gene, and the GH04J004264 enhancer is located upstream. [9] [14] The TMEM128 sequence also contains binding sites for various transcription factors, including a TATA box and sites for CCAAT-binding protein and cAMP-responsive element-binding protein. [23]
Expression of TMEM128 also varies by tissue. In the accompanying expression profile, red bars represent absolute expression and blue dots represent relative expression. TMEM128 is highly expressed in tissues such as the adrenal gland and spinal cord and is expressed at lower levels in tissues such as the liver and bone marrow. [11]
Several miRNAs have binding sites in the 3' UTR of TMEM128. [26]
These miRNAs can participate in RNA silencing to prevent the expression of the mRNA.
Analyses of mouse brains show no region-specific expression. [25]
At the protein level, TMEM128 contains many predicted post-translational modification sites, including sites for glycation, [27] phosphorylation, [28] SUMOylation, [29] and O-GlcNAc modification, [30] as seen below:
Modification | Amino acid number |
---|---|
Phosphorylation | 3, 4, 52, 124, 135, 162 |
Glycation | 70, 73, 115 |
Nuclear export signal [31] | 88-95 |
SUMOylation | 39-42, 115-118, 161-165 |
O-GlcNAc | 3, 4, 34, 35, 123 |
Acetylation [32] | 40, 41, 43, 73 |
Post-translational modification alters protein structure and can thus alter protein function and viability.
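Some positions in the table appear under more than one modification type. The sketch below simply cross-references the listed positions to find them. It assumes the ranges are inclusive and leaves out the nuclear export signal, which is a sequence motif, not a modification. Whether these modifications actually compete at a shared residue is not stated in the source.

```python
from collections import defaultdict

# Predicted sites from the table. Single positions are ints; ranges are
# (start, end) tuples, assumed inclusive.
SITES = {
    "Phosphorylation": [3, 4, 52, 124, 135, 162],
    "Glycation": [70, 73, 115],
    "SUMOylation": [(39, 42), (115, 118), (161, 165)],
    "O-GlcNAc": [3, 4, 34, 35, 123],
    "Acetylation": [40, 41, 43, 73],
}


def expand(entries):
    """Yield every residue position covered by a list of ints and ranges."""
    for entry in entries:
        if isinstance(entry, tuple):
            start, end = entry
            yield from range(start, end + 1)
        else:
            yield entry


# Map each residue position to the set of modifications listed for it.
by_position = defaultdict(set)
for modification, entries in SITES.items():
    for position in expand(entries):
        by_position[position].add(modification)

# Keep only positions listed under two or more modification types.
shared = {pos: sorted(mods) for pos, mods in by_position.items() if len(mods) > 1}

for pos in sorted(shared):
    print(pos, ", ".join(shared[pos]))
```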
TMEM128 is localized to the endoplasmic reticulum membrane, with both the N-terminus and the C-terminus facing the cytoplasm. [7] [8]
Orthologs of TMEM128 have not been found outside of eukaryotes. [33] Within eukaryotes, TMEM128 orthologs have been found in mammals, birds, and several fungi. Mammalian orthologs are the most conserved, with no less than 71% sequence identity to the human protein. The most distant ortholog detected was in Diversispora epigaea, a fungus. The transmembrane domains remain the most conserved regions across species, with the key amino acids Trp51, Trp139, and Trp142 conserved in all species with orthologous proteins. All information below was obtained through NCBI BLAST. [33]
Genus and Species | Common Name | Date of Divergence (MYA) [34] | Accession number | Sequence length | Sequence identity |
---|---|---|---|---|---|
Homo sapiens | Human | 0 | NP_001284480.1 | 165 | 100% |
Rhinopithecus roxellana | Golden snub-nosed monkey | 28.81 | XP_010355887.2 | 165 | 97% |
Mus musculus | House mouse | 89 | NP_001343889.1 | 163 | 81% |
Microtus ochrogaster | Prairie vole | 89 | XP_005366021 | 164 | 80% |
Ovis aries | Sheep | 94 | XP_014952114.2 | 165 | 83% |
Vulpes vulpes | Red fox | 94 | XP_025854088.1 | 165 | 82% |
Pteropus vampyrus | Large flying fox | 94 | XP_011372965.1 | 165 | 81% |
Orcinus orca | Killer whale | 94 | XP_004269680.1 | 165 | 81% |
Monodelphis domestica | Gray short-tailed opossum | 160 | XP_001371407.3 | 170 | 71% |
Taeniopygia guttata | Zebra finch | 318 | XP_002193492.3 | 173 | 68% |
Alligator sinensis | Chinese alligator | 318 | XP_006016834.1 | 172 | 67% |
Pogona vitticeps | Central bearded dragon | 318 | XP_020633929.1 | 163 | 62% |
Xenopus laevis | African clawed frog | 351.7 | NP_001084889.1 | 166 | 52% |
Orbicella faveolata | Mountainous star coral | 687 | XP_020610022.1 | 171 | 38% |
Exaiptasia pallida | Sea anemone | 687 | XP_028518835.1 | 169 | 36% |
Octopus vulgaris | Common octopus | 736 | XP_029645279.1 | 184 | 33% |
Brachionus plicatilis | N/A | 736 | RNA25638.1 | 170 | 28% |
Crassostrea virginica | Eastern oyster | 736 | XP_022343076.1 | 200 | 28% |
Diversispora epigaea | N/A | 1017 | RHZ70611.1 | 176 | 24% |
TMEM128 evolves at a moderate rate, slower than the fibrinogen alpha chain but faster than cytochrome c, suggesting neither positive nor negative selection at this locus.
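The source does not state how the evolution rate was estimated. One common approach is to convert sequence identity into a Poisson-corrected divergence, m = -100 · ln(identity fraction), which accounts for multiple substitutions at the same site, and to compare it with divergence dates. The sketch below applies that correction to a few rows of the ortholog table. The choice of correction and the function name are assumptions, and the fibrinogen and cytochrome c comparison is not reproduced because the source gives no data for those proteins.

```python
import math

# (species, divergence date in MYA, percent identity to human) from the
# ortholog table above; a subset chosen for illustration.
ORTHOLOGS = [
    ("Homo sapiens", 0, 100),
    ("Mus musculus", 89, 81),
    ("Monodelphis domestica", 160, 71),
    ("Xenopus laevis", 351.7, 52),
    ("Diversispora epigaea", 1017, 24),
]


def corrected_divergence(identity_percent):
    """Poisson-corrected amino acid changes per 100 residues (assumed model)."""
    return -100 * math.log(identity_percent / 100)


for species, mya, identity in ORTHOLOGS:
    m = corrected_divergence(identity)
    print(f"{species}: {mya} MYA, corrected divergence {m:.1f}")
```

Plotting the corrected divergence against the divergence date for TMEM128 and for reference proteins would give the kind of comparison the paragraph above describes.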
Yeast two-hybrid assays have found that TMEM128 interacts with several proteins.
The biological function of TMEM128 is still poorly understood. Because it is a transmembrane protein, it may act as a receptor, a channel, or a membrane anchor. [40] Because TMEM128 has post-translational modification sites, it may exist in alternative states with different properties. For example, phosphorylation of TMEM128 may cause a conformational change that lets it bind different substrates. [41] TMEM128 also interacts with a variety of other proteins, as discussed above, suggesting it may have a broad range of action.
TMEM128 has been found to show moderate to strong positivity in some patients with carcinoma, while other types of cancer, such as melanoma, glioma, breast, ovarian, renal, and pancreatic cancer, show weak to moderate positivity. [42] TMEM128 has also been found to show low cancer specificity. [42]
TMEM128 expression is experimentally associated with presence of the ROR alpha1 protein, as TMEM128 was found in lower quantities when ROR alpha1 was deleted. [43] [44]
TMEM128 expression was lowered following a null mutation of TAp63 in skin cells. [45] [46]
TMEM128 expression was increased following a Trypanosoma cruzi infection. [47] [48]
Although TMEM128 has been associated with several diseases, such as Wolf-Hirschhorn syndrome, no evidence establishes a causal role, and the association may only be a correlation due to the gene's location on chromosome 4. [9] [49]
Several SNPs have been found in association with TMEM128: [50]
mRNA position | Amino acid position | dbSNP rs# | Reference allele | SNP allele | Function |
---|---|---|---|---|---|
169 | 43 | rs771177507 | A | C | Missense |
186 | 49 | rs146625911 | A | C | Missense |
204 | 55 | rs1434953873 | G | T | Missense |
270 | 77 | rs13135886 | A | G | Missense |
463 | 139 | rs757745482 | T | C | Missense |
466 | 142 | rs1213450146 | G | A | Nonsense |
512 | 158 | rs202215273 | G | A, T | Missense |