TMEM247

Identifiers
Aliases: TMEM247, transmembrane protein 247
External IDs: MGI: 1925719; HomoloGene: 54379; GeneCards: TMEM247

Orthologs

| Species | Human | Mouse |
| --- | --- | --- |
| Ensembl | ENSG00000284701 [1] | ENSMUSG00000037689 [2] |
| RefSeq (mRNA) | NM_001145051 | NM_001277980, NM_030104 |
| RefSeq (protein) | NP_001138523 | NP_001264909, NP_084380 |
| Location (UCSC) | Chr 2: 46.48 – 46.48 Mb | Chr 17: 87.22 – 87.23 Mb |
| PubMed search | [3] | [4] |

Transmembrane protein 247 (also known as TMEM247 or transmembrane protein ENSP00000343375) is a multi-pass transmembrane protein of unknown function that is encoded in Homo sapiens by the TMEM247 gene. The protein has two transmembrane regions near the C-terminus of the translated polypeptide, and it is expressed almost exclusively in the testes. [5]

Gene attributes

General information

The TMEM247 gene is located on chromosome 2 at 2p21, spanning nucleotides 46,479,565 to 46,484,425. It has three exons and two introns. The gene is 4,861 nucleotides (nt) long before mRNA processing and 661 nt long after processing, and its protein product is 219 amino acids (aa) long. [6] Unlike most genes, TMEM247 does not encode a stop codon in its genomic sequence; instead, the stop codon is created by polyadenylation during mRNA processing. As a result, TMEM247 has no 3' UTR (untranslated region). TMEM247 codes for only one variant.
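The figures in this paragraph can be checked with a little arithmetic, and the idea of a stop codon completed by polyadenylation can be illustrated with a toy sequence. The following is a minimal sketch, not the method used by the cited database: the function names are hypothetical, and the example transcript is invented for illustration and is not the TMEM247 sequence.

```python
# Stop codons written in the DNA alphabet (T instead of U).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def span_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate range."""
    return end - start + 1


def stop_from_poly_a(transcript: str):
    """Return the stop codon completed by a poly(A) tail, or None.

    `transcript` is read in frame from its first base and ends at the
    point where the poly(A) tail would be attached.
    """
    # Bases left over after the last complete codon (empty if none).
    leftover = transcript[len(transcript) - len(transcript) % 3:]
    if not leftover:
        return None  # the reading frame ends on a complete codon
    # The first adenines of the tail fill the remaining codon positions.
    codon = (leftover + "AA")[:3].upper()
    return codon if codon in STOP_CODONS else None


# Gene span from the coordinates given in the text.
print(span_length(46_479_565, 46_484_425))

# Hypothetical toy transcript ending in a partial codon "TA".
print(stop_from_poly_a("ATGGCTTA"))
```

With the coordinates above, the inclusive span comes out to 4,861 nt, matching the stated gene length. In the toy example, the trailing "TA" becomes "TAA" once the first adenine of the tail is added.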

Promoter region

The promoter region of TMEM247 has a wide variety of predicted transcription factor binding sites. The 24 predicted sites of interest are listed in the table below, though many more exist. Anchor base positions are measured from the start of the gene's promoter region, which is 1302 base pairs long.

The TMEM247 promoter has a number of notable predicted binding sites, as well as a notable omission. The promoter lacks a traditional TATA box, the typical binding site for proteins that recruit RNA polymerase and begin the process of transcription. Instead, the TMEM247 promoter contains several predicted binding sites for core promoter elements of TATA-less promoters.

The promoter region also contains a significant number of predicted development-related binding sites, such as sites for pluripotent stem cell related factors (Oct4, Sox2, Nanog), sex-determining HMG box factors, and various homeobox/homeodomain factors. [7]

Tail end of the promoter region of the TMEM247 gene with notable predicted binding sites highlighted. The start of transcription is marked by an arrow.
| Matrix | Detailed matrix information | Anchor base | Strand | Matrix similarity | Sequence |
| --- | --- | --- | --- | --- | --- |
| V$TBX5.01 | Brachyury gene, mesoderm developmental factor | 1040 | (+) | 1 | ctacctcaaaGGTGtcacaccctccacca |
| V$EOMES.03 | Brachyury gene, mesoderm developmental factor | 1042 | (-) | 0.987 | tttggtggagggTGTGacacctttgaggt |
| V$PDEF.01 | Human and murine ETS1 factors (Prostate-derived Ets Factor) | 998 | (-) | 0.974 | gaactgcaGGATgggcctttg |
| V$RFX3.01 | X-box binding factors | 1064 | (+) | 0.974 | aaggggccctagCAACttg |
| V$SPZ1.01 | Testis-specific bHLH-Zip transcription factors (Spermatogenic Zip 1 transcription factor) | 1046 | (-) | 0.966 | tGGAGggtgtg |
| V$TBX20.02 | Brachyury gene, mesoderm developmental factor | 1149 | (-) | 0.939 | catcatttgaggtgctGACAtttggcctc |
| V$HSF1.05 | Heat shock factors | 1198 | (-) | 0.938 | ctgctgccatCCAGaaaaccagaac |
| V$MYOD.01 | Myogenic regulatory factor MyoD (myf3) | 1178 | (-) | 0.919 | cgctGCCAggtggggtc |
| V$MTBF.01 | Human muscle-specific Mt binding site | 1128 | (+) | 0.906 | tggaATCTg |
| V$RFX3.02 | Regulatory factor X, 3 (secondary DNA binding preference) | 1278 | (+) | 0.889 | gatggtgcctgGTGActcc |
| V$OCT3_4.02 | Motif composed of binding sites for pluripotency or stem cell factors | 892 | (+) | 0.882 | acaatctTCATttaaaaaa |
| V$HSF1.01 | Heat shock factors | 1190 | (-) | 0.845 | atccagaaaaccAGAAcgctgccag |
| V$EN1.01 | Homeobox transcription factors | 897 | (-) | 0.832 | gttcctttTTTAaatgaag |
| O$XCPE1.01 | Activator-, mediator- and TBP-dependent core promoter element for RNA polymerase II transcription from TATA-less promoters | 1243 | (+) | 0.831 | gtGCGGgagaa |
| V$DICE.01 | Downstream Immunoglobulin Control Element, critical for B cell activity and specificity | 1091 | (-) | 0.827 | tgtcGTCAtcatagc |
| V$ISL1.01 | Lim homeodomain factors | 1012 | (+) | 0.827 | tgcagttctTAATgttagcatgt |
| V$RFX4.03 | X-box binding factors | 1064 | (-) | 0.814 | caaGTTGctagggcccctt |
| V$EN1.01 | Homeobox transcription factors | 922 | (+) | 0.788 | aaatggatTTCAaatggtg |
| V$SOX9.03 | SOX/SRY-sex/testis determining and related HMG box factors | 1061 | (+) | 0.786 | caCCAAaggggccctagcaactt |
| V$OSNT.01 | Composed binding site for Oct4, Sox2, Nanog, Tcf3 (Tcf7l1) and Sall4b in pluripotent cells | 1151 | (+) | 0.784 | aatgtcaGCACctcaaatg |
| V$PROX1.01 | Prospero-related homeobox | 1163 | (+) | 0.783 | aatGATGtcttgt |
| V$SOX9.03 | SOX/SRY-sex/testis determining and related HMG box factors | 975 | (+) | 0.781 | ttTCAAagccatccttatgggca |
| V$HSF2.03 | Heat shock factors | 1075 | (+) | 0.777 | ctagcaacttgtAGAAtgtaggcta |
| V$HSF5.01 | Heat shock factors | 1074 | (-) | 0.764 | agcctacatTCTAcaagttgctagg |
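Sequences in the table are reported on either the (+) or the (-) strand, so two rows can describe the same stretch of DNA read in opposite directions. The sketch below illustrates this with the two X-box entries at anchor base 1064 (V$RFX3.01 and V$RFX4.03). The helper name `reverse_complement` is an illustrative choice and is not part of the prediction tool cited in the text.

```python
# Translation table for complementing DNA while preserving letter case
# (the table uses upper case to mark the core of each matrix match).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from the table rows with anchor base 1064.
rfx3_plus = "aaggggccctagCAACttg"   # V$RFX3.01, (+) strand
rfx4_minus = "caaGTTGctagggcccctt"  # V$RFX4.03, (-) strand

print(reverse_complement(rfx3_plus))
print(reverse_complement(rfx3_plus) == rfx4_minus)
```

The two sequences are exact reverse complements, which suggests that both matrices match the same site on opposite strands.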

Protein attributes

The TMEM247 gene codes for a single protein, transmembrane protein 247 (also referred to as TMEM247). As a multi-pass transmembrane protein, TMEM247 has two transmembrane domains near its C-terminus. Each domain is 21 amino acids long, and the two are separated by a span of six amino acids. [6] TMEM247 has a predicted molecular weight of 25 kilodaltons and a predicted isoelectric point of 5. [8]
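As a rough plausibility check on these figures, a protein's mass can be approximated from its length. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb; it is not the calculation performed by the cited pI/Mw tool, which works from the actual sequence. The constant and function names are illustrative.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# This is an approximation, not a sequence-based calculation.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kilodaltons from residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Values taken from the text.
protein_length = 219   # amino acids
tm_domain_length = 21  # amino acids per transmembrane domain
tm_spacer_length = 6   # amino acids between the two domains

# Stretch covered by both transmembrane domains plus the spacer.
tm_span = 2 * tm_domain_length + tm_spacer_length

print(f"approximate mass: {approx_mass_kda(protein_length):.1f} kDa")
print(f"transmembrane span: {tm_span} of {protein_length} residues")
```

The length-based estimate is of the same order as the predicted 25 kDa, and the two transmembrane domains with their spacer cover 48 of the 219 residues.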

In composition, TMEM247 contains significantly more methionine than the set of all human proteins, and slightly elevated levels of glutamic acid. The charge distribution along the protein is relatively uniform. Two predicted hydrophobic segments correspond to the two transmembrane regions.

Protein domains

Transmembrane protein 247 has two transmembrane domains. Of the three remaining regions, the N-terminal and C-terminal segments are predicted to lie on the outer side of the membrane, while the short segment between the two transmembrane regions is predicted to lie on the inner side. [9] [10]

Localization analysis predicts that TMEM247 resides in the membrane of the endoplasmic reticulum (ER). In this case, the inner predicted domain would be inside the ER and the outer predicted domains would reside in the cytoplasm.

A domain-level view of TMEM247 with points of interest at predicted post-translational modification sites

Predicted post-translational modifications

Transmembrane protein 247 has a variety of predicted post-translational modifications that may affect protein function. Predicted modifications include O-beta-GlcNAc attachment, glycation, and O-glycosylation. [11] [12] [13]

A conceptual translation of TMEM247 and predicted modifications to its protein product

Predicted kinase interactions

Protein kinases may modify transmembrane protein 247, and a variety of sites along the translated protein have been predicted to be kinase binding sites. These are represented by red squares surrounding the potentially bound amino acids in the conceptual translation and are listed in the table below. For each position, the predicted kinases are listed in descending order of prediction score. [14]

| Amino acid position | Kinases |
| --- | --- |
| 17 | CKI |
| 20 | PKC |
| 29 | unspecified |
| 31 | unspecified |
| 43 | unspecified, DNAPK, ATM |
| 48 | unspecified |
| 49 | CKII, unspecified, DNAPK |
| 50 | unspecified |
| 72 | unspecified, cdk5, p38MAPK |
| 75 | unspecified, PKC |
| 79 | PKC, unspecified |
| 95 | cdk5, p38MAPK, GSK3 |
| 98 | unspecified |
| 161 | PKA |
| 219 | PKA |
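The table is indexed by amino acid position, but it is often more useful to ask which positions a given kinase is predicted to target. The following sketch inverts the table into a kinase-to-positions lookup. The data are copied from the table above; the variable and function names are hypothetical and do not come from the prediction server.

```python
from collections import defaultdict

# Predicted kinase sites, copied from the table: position -> kinases,
# each list kept in the table's order (highest score first).
PREDICTED_SITES = {
    17: ["CKI"],
    20: ["PKC"],
    29: ["unspecified"],
    31: ["unspecified"],
    43: ["unspecified", "DNAPK", "ATM"],
    48: ["unspecified"],
    49: ["CKII", "unspecified", "DNAPK"],
    50: ["unspecified"],
    72: ["unspecified", "cdk5", "p38MAPK"],
    75: ["unspecified", "PKC"],
    79: ["PKC", "unspecified"],
    95: ["cdk5", "p38MAPK", "GSK3"],
    98: ["unspecified"],
    161: ["PKA"],
    219: ["PKA"],
}


def sites_by_kinase(sites):
    """Invert position -> kinases into kinase -> sorted positions."""
    index = defaultdict(list)
    for position in sorted(sites):
        for kinase in sites[position]:
            index[kinase].append(position)
    return dict(index)


by_kinase = sites_by_kinase(PREDICTED_SITES)
for kinase, positions in sorted(by_kinase.items()):
    print(f"{kinase}: {positions}")
```

Grouped this way, PKC is predicted at three positions (20, 75, 79) and PKA at two (161, 219), the latter including the final residue of the 219 aa protein.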

Protein structure

Transmembrane protein 247 has a predicted secondary structure which includes two major features in the form of beta sheets that reside near its determined transmembrane regions. This is slightly unusual for transmembrane proteins, whose transmembrane regions are often alpha helices. [15]

A Chou–Fasman method prediction of TMEM247's secondary protein structure
A 3D prediction of the TMEM247 secondary protein structure.

Evolutionary history

Orthologs

TMEM247 has several hundred orthologs, with its most distant fully sequenced available ortholog found in the green anole, Anolis carolinensis. [16] [17] These orthologs are restricted to amniotes, as clades with an evolutionary origin before reptiles are not represented. The absence of orthologs in lineages more distant than the green anole makes it likely that the gene first appeared in an ancestor of that species and did not exist before the evolution of reptiles. Classes represented in the orthologs include Mammalia, Aves, and Reptilia.

Most orthologs within mammalia are strongly conserved across the entire gene, including a very highly conserved region near the center of the translated protein. The highest evolutionary conservation is centered around the transmembrane regions of the protein, which are highly conserved in all orthologous species. [18]

| Genus and species | Common name | Taxonomic group | MYA | Accession # | Sequence length (aa) | Sequence identity to humans | Sequence similarity to humans |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Homo sapiens | Human | Primates | 0 | NP_001138523.1 | 219 | 100% | 100% |
| Tupaia chinensis | Treeshrew | Scandentia | 82 | XP_006159980.1 | 266 | 74% | 81% |
| Urocitellus parryii | Arctic ground squirrel | Rodentia | 90 | XP_026241536.1 | 224 | 71% | 77% |
| Cavia porcellus | Guinea pig | Rodentia | 90 | XP_003472978.1 | 262 | 69% | 77% |
| Vulpes vulpes | Red fox | Carnivora | 96 | XP_025848559.1 | 231 | 76% | 80% |
| Sus scrofa | Wild boar | Artiodactyla | 96 | XP_003125218.3 | 257 | 74% | 78% |
| Pteropus alecto | Black flying fox | Chiroptera | 96 | XP_015442982.1 | 280 | 69% | 78% |
| Myotis lucifugus | Little brown bat | Chiroptera | 96 | XP_006083536.1 | 212 | 73% | 78% |
| Lynx canadensis | Canadian lynx | Carnivora | 96 | XP_030167645.1 | 214 | 74% | 78% |
| Leptonychotes weddellii | Weddell seal | Carnivora | 96 | XP_006740668.1 | 214 | 76% | 81% |
| Equus caballus | Horse | Perissodactyla | 96 | XP_023474197.1 | 286 | 74% | 78% |
| Enhydra lutris kenyoni | Sea otter | Carnivora | 96 | XP_022371955.1 | 214 | 76% | 80% |
| Canis lupus familiaris | Dog | Carnivora | 96 | XP_005626294.1 | 231 | 76% | 80% |
| Camelus ferus | Wild Bactrian camel | Artiodactyla | 96 | XP_032353339.1 | 276 | 73% | 78% |
| Bos taurus | Cattle | Artiodactyla | 96 | NP_001070537.2 | 217 | 73% | 78% |
| Bos indicus × Bos taurus | Hybrid cattle | Artiodactyla | 96 | XP_027410252.1 | 258 | 73% | 78% |
| Loxodonta africana | African bush elephant | Proboscideans | 105 | XP_023413034.1 | 265 | 73% | 78% |
| Echinops telfairi | Lesser hedgehog tenrec | Afrosoricida | 105 | XP_004700102.1 | 217 | 70% | 77% |
| Pelodiscus sinensis | Softshell turtle | Testudines | 312 | XP_006125563.2 | 184 | 46% | 60% |
| Columba livia | Pigeon | Columbiformes | 312 | XP_021154517.1 | 195 | 44% | 62% |
| Chelonia mydas | Green sea turtle | Testudines | 312 | XP_027681026.1 | 213 | 38% | 55% |
| Antrostomus carolinensis | Chuck-will's-widow | Caprimulgiformes | 312 | XP_028940116.1 | 154 | 38% | 52% |
| Anolis carolinensis | Green anole | Squamata | 312 | XP_008115619.1 | 223 | 33% | 50% |
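One way to summarize the table is to average sequence identity within each divergence-time group. The sketch below does this with the MYA and identity columns copied from the table. It is an illustrative summary only, not an analysis from the cited sources, and the names used are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (species, divergence from humans in MYA, % identity to human),
# copied from the ortholog table.
ORTHOLOGS = [
    ("Homo sapiens", 0, 100),
    ("Tupaia chinensis", 82, 74),
    ("Urocitellus parryii", 90, 71),
    ("Cavia porcellus", 90, 69),
    ("Vulpes vulpes", 96, 76),
    ("Sus scrofa", 96, 74),
    ("Pteropus alecto", 96, 69),
    ("Myotis lucifugus", 96, 73),
    ("Lynx canadensis", 96, 74),
    ("Leptonychotes weddellii", 96, 76),
    ("Equus caballus", 96, 74),
    ("Enhydra lutris kenyoni", 96, 76),
    ("Canis lupus familiaris", 96, 76),
    ("Camelus ferus", 96, 73),
    ("Bos taurus", 96, 73),
    ("Bos indicus x Bos taurus", 96, 73),
    ("Loxodonta africana", 105, 73),
    ("Echinops telfairi", 105, 70),
    ("Pelodiscus sinensis", 312, 46),
    ("Columba livia", 312, 44),
    ("Chelonia mydas", 312, 38),
    ("Antrostomus carolinensis", 312, 38),
    ("Anolis carolinensis", 312, 33),
]


def mean_identity_by_divergence(rows):
    """Average % identity for each divergence time (MYA)."""
    groups = defaultdict(list)
    for _species, mya, identity in rows:
        groups[mya].append(identity)
    return {mya: mean(values) for mya, values in sorted(groups.items())}


summary = mean_identity_by_divergence(ORTHOLOGS)
for mya, avg in summary.items():
    print(f"{mya:>4} MYA: {avg:.1f}% mean identity")
```

The mammalian groups (82 to 105 MYA) all average roughly 70 to 74% identity, while the bird and reptile group (312 MYA) averages about 40%, consistent with the strong conservation within Mammalia described above.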

Paralogs

In humans, TMEM247 has a single paralog (hCG17037). Its sequence would theoretically translate into a protein identical to that produced by TMEM247 except at seven positions, giving 96.8% similarity. These differences include two deletions that reduce the total amino acid count from 219 to 217. [19] The extreme similarity between TMEM247 and its paralog makes it likely that the paralog arose through gene duplication.
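The similarity figure follows directly from the counts in this paragraph. The short sketch below reproduces it, assuming the percentage is computed as matching positions over the full 219 aa length of TMEM247. That denominator is an assumption, since the text does not state how the value was derived.

```python
# Counts taken from the text.
tmem247_length = 219      # amino acids in TMEM247
differing_positions = 7   # positions where the paralog differs
deletions = 2             # differences that are deletions

# Assumed definition: matching positions divided by TMEM247 length.
similarity_pct = 100 * (tmem247_length - differing_positions) / tmem247_length
paralog_length = tmem247_length - deletions

print(f"similarity: {similarity_pct:.1f}%")
print(f"paralog length: {paralog_length} aa")
```

Under that assumption the result rounds to 96.8%, and removing the two deleted residues gives the stated paralog length of 217 aa.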

Paralog alignment

CLUSTAL O(1.2.4) multiple sequence alignment of TMEM247 and its paralog, hCG17037

Significance/function

TMEM247 has no major known effects or uses in a clinical setting. Studies indicate that TMEM247, despite being found almost exclusively in the testes, does not play a significant role in reproduction. [20] Further studies have found an association between variants in TMEM247 and coronary artery disease, though not one of major significance. [21]

A mutation in TMEM247 has been noted to be unusually common in populations of Tibetan highlanders. The mutation, rs116983452, is a change from cytosine to thymine at nucleotide position 248 of the gene, which causes a missense substitution of alanine with valine in the protein product. [22]
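The link between the nucleotide position and the amino acid change can be worked through with codon arithmetic. The sketch below assumes that position 248 is counted from the first base of the coding sequence and that the change is a cytosine-to-thymine substitution; both are assumptions made for illustration, and the helper names are hypothetical.

```python
# Standard genetic code entries for the two amino acids involved.
ALANINE_CODONS = ["GCT", "GCC", "GCA", "GCG"]
VALINE_CODONS = {"GTT", "GTC", "GTA", "GTG"}


def codon_position(nt_pos: int):
    """Map a 1-based coding position to (codon number, base in codon)."""
    codon = (nt_pos - 1) // 3 + 1
    offset = (nt_pos - 1) % 3 + 1
    return codon, offset


def substitute(codon: str, offset: int, base: str) -> str:
    """Replace the base at 1-based `offset` within a codon."""
    i = offset - 1
    return codon[:i] + base + codon[i + 1:]


codon, offset = codon_position(248)
print(f"nucleotide 248 -> codon {codon}, base {offset}")

# A C-to-T change at that base turns every alanine codon into valine.
mutated = [substitute(c, offset, "T") for c in ALANINE_CODONS]
print(mutated, all(m in VALINE_CODONS for m in mutated))
```

Under these assumptions, nucleotide 248 is the second base of codon 83, and a C-to-T change at the second base converts any alanine codon (GCN) into a valine codon (GTN), which agrees with the alanine-to-valine change described in the text.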

While the function of TMEM247 is unknown, it is notable for its polyadenylation-synthesized stop codon. Some research has shown that genes which rely on polyadenylation for the creation of stop codons are relatively common in a human parasite, Blastocystis . [23]

References

  1. GRCh38: Ensembl release 89: ENSG00000284701 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000037689 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "TMEM247 transmembrane protein 247, Homo sapiens (human)". Gene—NCBI. Retrieved 28 April 2020.
  6. "Homo sapiens transmembrane protein 247 (TMEM247), mRNA (345842501)". NCBI Nucleotide Database. 2019.
  7. "Genomatix". Retrieved 29 March 2020.
  8. ExPASy—Compute pI/Mw tool. (n.d.). Retrieved April 20, 2020, from https://web.expasy.org/compute_pi/
  9. TMHMM result. (n.d.). Retrieved April 20, 2020, from http://www.cbs.dtu.dk/cgibin/webface2.fcgi?jobid=5E9CC91C00001F03029DB033&wait=20
  10. Phobius. (n.d.). Retrieved April 20, 2020, from http://phobius.sbc.su.se/
  11. NetGlycate 1.0 Server—Prediction results. (n.d.). Retrieved April 20, 2020, from http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi?jobid=5E9CCC4300001F0306A57D84&wait=20
  12. NetOGlyc 4.0 Server—Prediction results. (n.d.). Retrieved April 20, 2020, from http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi?jobid=5E9CCD2200001F033FFFF880&wait=20
  13. YinOYang 1.2 Server. (n.d.). Retrieved April 20, 2020, from http://www.cbs.dtu.dk/services/YinOYang/
  14. NetPhos 3.1 Server—Prediction results. (n.d.). Retrieved April 20, 2020, from http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi?jobid=5E9CCE08000067A5DE7F60BB&wait=20
  15. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". Retrieved 20 April 2020.
  16. "BLAST: Basic Local Alignment Search Tool". Retrieved 1 May 2020.
  17. "UCSC Genome Browser Gateway". Retrieved 1 May 2020.
  18. EMBOSS Needle—Alignment. (n.d.). Retrieved February 9, 2020, from https://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=emboss_needle-I20200210030452-0663-36912718-p1m
  19. "HCG17037, partial Homo sapiens". Protein—NCBI. Retrieved 1 May 2020.
  20. Miyata H, Castaneda JM, Fujihara Y, Yu Z, Archambeault DR, Isotani A, et al. (July 2016). "Genome engineering uncovers 54 evolutionarily conserved and testis-enriched genes that are not required for male fertility in mice". Proceedings of the National Academy of Sciences of the United States of America. 113 (28): 7704–7710. Bibcode:2016PNAS..113.7704M. doi:10.1073/pnas.1608458113. PMC 4948324. PMID 27357688.
  21. van der Harst P, Verweij N (February 2018). "Identification of 64 Novel Genetic Loci Provides an Expanded View on the Genetic Architecture of Coronary Artery Disease". Circulation Research. 122 (3): 433–443. doi:10.1161/CIRCRESAHA.117.312086. PMC 5805277. PMID 29212778.
  22. Deng L, Zhang C, Yuan K, Gao Y, Pan Y, Ge X, et al. (November 2019). "Prioritizing natural-selection signals from the deep-sequencing genomic data suggests multi-variant adaptation in Tibetan highlanders". National Science Review. 6 (6): 1201–1222. doi:10.1093/nsr/nwz108. PMC 8291452. PMID 34691999.
  23. Venton D (August 2014). "Highlight: not like a textbook-nuclear genes in blastocystis use mRNA polyadenylation for stop codons". Genome Biology and Evolution. 6 (8): 1962–1963. doi:10.1093/gbe/evu167. PMC 4159010. PMID 25104295.