TMEM8B | |
---|---|
Identifiers | |
Aliases | TMEM8B, C9orf127, NAG-5, NGX6, NAG5, transmembrane protein 8B |
External IDs | OMIM: 616888; MGI: 2441680; HomoloGene: 72894; GeneCards: TMEM8B |
Transmembrane protein 8B is a protein that in humans is encoded by the TMEM8B gene. The gene is located on human chromosome 9 and encodes a transmembrane protein that is 338 amino acids long. [5] Aliases associated with this gene include C9orf127, NAG-5, and NGX6. [6]
Cytogenetic location: 9p13.3. [7] The gene is located on chromosome 9 in the human genome; it starts at base pair 35,814,451, ends at base pair 35,865,518, and contains 19 exons. There are 13 protein-encoding transcript variants, and the longest encodes a protein that is 790 amino acids long.
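The size of the gene follows directly from the two coordinates quoted above. The snippet below is a minimal sketch of that arithmetic; it assumes that both coordinates are 1-based and inclusive, which the text does not state, and the function name `span_length` is purely illustrative.

```python
# Coordinates quoted in the text for TMEM8B on chromosome 9.
GENE_START = 35_814_451
GENE_END = 35_865_518


def span_length(start: int, end: int, inclusive: bool = True) -> int:
    """Return the number of base pairs covered by a genomic interval.

    Assumption: with inclusive=True both endpoints are counted
    (1-based, closed interval), so one is added to the difference.
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + (1 if inclusive else 0)


length_bp = span_length(GENE_START, GENE_END)
print(f"{length_bp:,} bp (~{length_bp / 1000:.1f} kb)")
```

Under the inclusive-coordinate assumption, the gene spans roughly 51 kb.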
According to NCBI's EST Abundance Profile for TMEM8B, expression levels vary across 32 different human tissues. The highest levels of expression are found in the brain, ovaries, prostate, placenta, and pancreas. [8] Expression is downregulated in some cancerous tissues, specifically nasopharyngeal and colorectal carcinomas. TMEM8B is expressed at all stages of development, including fetal stages: low levels of expression are present in the fetal liver, brain, and thymus. [8]
TMEM8B has 13 known mRNA splice variants in humans, listed in the table below. All 13 variants are protein encoding, and all contain 19 exons.
Name | Protein Accession Number | Amino Acid Length | mRNA Accession Number |
---|---|---|---|
Isoform A | NP_001036055.1 | 472 | NM_001042589.2 |
Isoform B | NP_057530.2 | 338 | NM_016446.3 |
Isoform X1 | XP_011516213.1 | 508 | XM_011517911.2 |
Isoform X2 | XP_011516204.1 | 498 | XM_011517902.2 |
Isoform X3 | XP_024303339.1 | 482 | XM_024447571.1 |
Isoform X4 | XP_011516205.1 | 399 | XM_011517903.2 |
Isoform X5 | XP_024303338.1 | 373 | XM_024447570.1 |
Isoform X6 | XP_011516206.1 | 790 | XM_011517904.3 |
Isoform X7 | XP_011516207.1 | 334 | XM_011517905.1 |
Isoform X8 | XP_016870294.1 | 675 | XM_017014805.1 |
Isoform X9 | XP_011516218.1 | 450 | XM_011517916.2 |
Isoform X10 | XP_016870296.1 | 406 | XM_017014807.1 |
Isoform X11 | XP_011516220.1 | 398 | XM_011517918.3 |
The figure below from NCBI Gene depicts the chromosomal location of each isoform in comparison to TMEM8B.
Protein analysis was completed on isoform A, which is 472 amino acids long. The molecular weight is 36.8 kDa, [9] and the isoelectric point is 6.773. [10] There are 7 transmembrane domains, so that 52% of the protein lies within the plasma membrane. [11] The C-terminal charge is greater than the N-terminal charge, and therefore the C-terminal end is on the inside. The transmembrane domains are conserved in most orthologs, including all mammals. Relative to other proteins, TMEM8B has higher than normal levels of lysine (K) and leucine (L). [9] There are three repeating leucine-rich regions within conserved domains of TMEM8B, each 4 amino acids long. Leucine-rich regions can form hydrophobic interactions among themselves. [12]
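The composition and membrane figures above can be illustrated with a short calculation. This is a minimal sketch, not the analysis used by the cited tools: the sequence `TOY_SEQUENCE` is a hypothetical stand-in (the real isoform A sequence is not given in the text), and the helper name `composition` is an assumption made for illustration.

```python
from collections import Counter


def composition(seq: str) -> dict[str, float]:
    """Return the fraction of each residue (one-letter code) in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {residue: n / len(seq) for residue, n in counts.items()}


# Hypothetical toy peptide, NOT the TMEM8B sequence.
TOY_SEQUENCE = "MKLLKALLKG"
comp = composition(TOY_SEQUENCE)
print(f"Leucine (L): {comp['L']:.0%}, lysine (K): {comp['K']:.0%}")

# Residues implied by the figures quoted in the text:
# 52% of a 472-residue protein lying within the membrane.
ISOFORM_A_LENGTH = 472
MEMBRANE_FRACTION = 0.52
membrane_residues = round(MEMBRANE_FRACTION * ISOFORM_A_LENGTH)
print(f"About {membrane_residues} of {ISOFORM_A_LENGTH} residues in the membrane")
```

Run on a real protein sequence, the same function would give the lysine and leucine fractions, which could then be compared against an average proteome composition.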
Identifying the secondary structure is helpful in further analyzing the function of this protein. Alpha helices are the strongest indicators of transmembrane regions, because the helical structure can satisfy all backbone hydrogen bonds internally. The predicted secondary structure is consistent with this: many of the alpha helices lie in the predicted transmembrane regions. Other key structures identified in this protein include extended strands, which are hypothesized to be important folding regions, and random coils, a class of conformations that lack regular secondary structure.
I-TASSER [13] was used to predict the 3D tertiary structure of TMEM8B, including the folding of the alpha helices and beta sheets. Although TMEM8B has no high-scoring hydrophobic segments, which would usually be hidden within the interior of the 3D structure, the high leucine (L) content of the protein creates hydrophobic interactions within the protein itself, and these areas are predicted to be buried on the inside of the structure. [12] Refer to the figure below to see a predicted tertiary structure.
The predicted tertiary structure of TMEM8B resembles that of the Reelin protein, with 42% coverage and 14.79% identity.[ citation needed ] Reelin has no transmembrane domains and is mostly found in the cerebral cortex and the hippocampus, where it plays important roles in the control of neuronal migration and the formation of cellular layers during brain development.
Orthologs of TMEM8B were identified using BLAST, [14] and 20 orthologs were selected. All are from multicellular organisms and include mammals, birds, fish, amphibians, echinoderms, chordates, insects, and cnidarians. Refer to the table below. TimeTree was used to find the dates of evolutionary branching, shown in millions of years ago (MYA), [15] and conserved domains were found and analyzed using ClustalW. [16]
Genus Species | Common Name | Divergence from Humans (MYA) | Accession Number | Amino Acid Length | Sequence Identity | Sequence Similarity |
---|---|---|---|---|---|---|
Homo sapiens | Humans | -- | EAW58325.1 | 338 | -- | -- |
Carlito syrichta | Philippine tarsier | 67.1 | XP_008061336.2 | 273 | 96% | 97% |
Trichechus manatus latirostris | Florida manatee | 105 | XP_004372337.1 | 273 | 96% | 97% |
Neomonachus schauinslandi | Hawaiian monk seal | 96 | XP_021546789.1 | 280 | 96% | 96% |
Pelecanus crispus | Dalmatian pelican | 312 | XP_009481450.1 | 219 | 75% | 86% |
Salmo salar | Atlantic salmon | 435 | XP_013999021.1 | 494 | 68% | 86% |
Struthio camelus australis | Southern ostrich | 312 | XP_009675834.1 | 283 | 70% | 81% |
Cariama cristata | Red-legged seriema | 312 | XP_009701221.1 | 280 | 68% | 80% |
Egretta garzetta | Little egret | 312 | XP_009645653.1 | 282 | 68% | 79% |
Sinocyclocheilus grahami | Golden line fish | 435 | XP_016091386.1 | 295 | 62% | 76% |
Charadrius vociferus | Killdeer | 312 | XP_009889203.1 | 420 | 63% | 75% |
Chrysochloris asiatica | Cape golden mole | 105 | XP_006863153.1 | 392 | 93% | 75% |
Branchiostoma belcheri | Belcher's lancelet | 684 | XP_019646192.1 | 209 | 37% | 54% |
Xenopus laevis | African clawed frog | 352 | XP_018123357.1 | 480 | 65% | 50% |
Diachasma alloeum | Parasitoid wasp | 797 | XP_015126938.1 | 252 | 29% | 47% |
Megachile rotundata | Alfalfa leafcutting bee | 797 | XP_003700975.2 | 242 | 29% | 46% |
Strongylocentrotus purpuratus | Purple sea urchin | 684 | XP_011666469.1 | 240 | 23% | 38% |
Cryptotermes brevis | Termite | 794 | XP_023705434.1 | 361 | 31% | 29% |
Exaiptasia pallida | Sea anemone | 824 | XP_020898578.1 | 361 | 29% | 28% |
Ciona intestinalis | Vase tunicate | 676 | XP_009857467.1 | 384 | 33% | 18% |
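The identity and similarity columns describe how closely each ortholog matches the human protein over the aligned region. The following is a minimal sketch of how such percentages can be derived from a pairwise alignment. It is not how BLAST computes them: BLAST scores "positives" with a substitution matrix, whereas this sketch uses a small set of hand-picked residue groups (`SIMILAR_GROUPS`), which is an assumption, as are the function name and the toy alignment.

```python
# Assumed groups of chemically similar residues (a simplification;
# BLAST uses a substitution matrix such as BLOSUM62 instead).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ"]


def identity_and_similarity(a: str, b: str) -> tuple[float, float]:
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both inputs must be the same length, with '-' marking gaps.
    Columns containing a gap are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = similar = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        aligned += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if aligned == 0:
        raise ValueError("no aligned columns to compare")
    return 100 * identical / aligned, 100 * similar / aligned


# Hypothetical toy alignment, not taken from TMEM8B.
identity, similarity = identity_and_similarity("MKLLA-GE", "MRLIAYGD")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Because identical residues are counted as similar, similarity under this definition is never lower than identity for the same alignment.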
One human paralog was found in the BLAST search for this protein. It is 416 amino acids long, with 40% sequence identity and 45% sequence similarity. The accession number for this protein is NP_067082.2.
For an evolutionary comparison of TMEM8B, one species from each group (e.g., mammals, birds, fish) was plotted to avoid crowding a single graph. The slowly diverging cytochrome c and the quickly diverging fibrinogen were plotted for comparison. TMEM8B diverges at a rate in between those of these two proteins.
TMEM8B shows lower expression levels in nasopharyngeal carcinomas, and its expression is also downregulated in colorectal cancers. The gene also plays a negative regulatory role in an epidermal growth factor receptor (EGFR) pathway. [5] It can delay cell cycle G0-G1 progression and thus inhibit cell proliferation in nasopharyngeal carcinoma cells. [5]
Mutations in this gene can be pathogenic and cause chronic pain disorders, specifically erythromelalgia symptoms. [5] [17] [18] Erythromelalgia is a rare condition that affects the extremities (hands and feet) and is characterized by intense, burning pain, severe redness, and increased skin temperature. [19] Medications are available to reduce symptoms; however, there is no cure for this rare condition. [19]
Two interacting proteins were found: EGF protein, and ATXN1L protein.
EGF plays a role in cell adhesion in nasopharyngeal carcinoma (NPC), a cancer in which TMEM8B also plays a role. This protein is expressed on the cell surface as a glycoprotein, and ectopic induction of EGF can impair NPC cell migration and improve cell adhesion and gap junctional intercellular communication. [20]
ATXN1L is associated with neurodegenerative disorders. Many inherited neurodegenerative disorders are characterized by a loss of balance due to degeneration of cerebellar Purkinje cells. Ataxia-causing proteins share interacting partners, a subset of which has been found to modify neurodegeneration in animal models. The interactome provides a tool for understanding pathogenic mechanisms common to neurodegenerative disorders. [21]