Small integral membrane protein 14

SMIM14 encoded in chromosome 4, band 4p14.
3-D rendering of SMIM14 protein via Phyre2

Small integral membrane protein 14, also known as SMIM14 or C4orf34, is a protein encoded on chromosome 4 of the human genome by the SMIM14 gene. [2] SMIM14 has no known paralogs and at least 298 orthologs, found mainly in jawed vertebrates. [3] SMIM14 is classified as a type I transmembrane protein. Although the protein is not well understood, its transmembrane domain may be involved in ER retention. [4]

Gene

The SMIM14 gene is located on the minus strand at cytogenetic band 4p14 and is 92,567 base pairs in length. [5] The gene has five exons, four of which constitute the open reading frame for SMIM14. [6]

The Kozak sequence, which marks the translation initiation site in most eukaryotic mRNA transcripts, is considered a strong motif in SMIM14. [7] SMIM14 has no signal peptide; instead, the encoded transmembrane domain acts as the signal sequence. One disulfide bridge is predicted in SMIM14; disulfide bridges stabilize the tertiary (and sometimes quaternary) structure of proteins. There are at least ten polyadenylation sequences in the 3′ UTR of the SMIM14 gene, which signal cleavage and polyadenylation of the transcript.

SMIM14 is expressed at four times the level of an average gene. [8]

Gene regulation

Promoter

SMIM14 has seven predicted promoter regions. The promoter with the greatest number of transcripts and CAGE tags is 1,420 base pairs in length. It is found on the minus strand, starting at position 39,638,806 and ending at position 39,640,225. This promoter has five coding transcripts, one of which carries the maximum of 105,458 CAGE tags. [9]

| Promoter ID | Start Position | End Position | Length (bp) | Coding Transcripts |
|---|---|---|---|---|
| GXP_150112 | 39,549,547 | 39,550,812 | 1,266 | 0 |
| GXP_3198013 | 39,583,919 | 39,584,958 | 1,040 | 0 |
| GXP_9520406 | 39,605,105 | 39,606,144 | 1,040 | N/A |
| GXP_9520407 | 39,626,490 | 39,627,529 | 1,040 | N/A |
| GXP_6750876 | 39,627,082 | 39,628,121 | 1,040 | 1 |
| GXP_3198015 | 39,638,191 | 39,639,230 | 1,040 | 0 |
| GXP_6750877 | 39,638,806 | 39,640,225 | 1,420 | 5 |
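
The lengths in the promoter table can be checked from the start and end positions. The following is a minimal sketch, assuming the coordinates are inclusive and 1-based (an assumption, though it reproduces every tabulated length); the function and variable names are illustrative only.

```python
# Promoter coordinates copied from the promoter table (chromosome 4, minus strand).
PROMOTERS = {
    "GXP_150112": (39_549_547, 39_550_812),
    "GXP_3198013": (39_583_919, 39_584_958),
    "GXP_9520406": (39_605_105, 39_606_144),
    "GXP_9520407": (39_626_490, 39_627_529),
    "GXP_6750876": (39_627_082, 39_628_121),
    "GXP_3198015": (39_638_191, 39_639_230),
    "GXP_6750877": (39_638_806, 39_640_225),
}


def promoter_length(start: int, end: int) -> int:
    """Length in base pairs, assuming inclusive 1-based coordinates."""
    return end - start + 1


if __name__ == "__main__":
    for promoter_id, (start, end) in PROMOTERS.items():
        print(promoter_id, promoter_length(start, end))
```

Under the inclusive-coordinate assumption, the length is end minus start plus one, not the plain difference.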

The CpG sites associated with the SMIM14 gene are found in CpG island 76; additional transcription factors can bind to the promoter to drive SMIM14 expression. [10]

Literature-curated Transcription Factors

(via ORegAnno)

SMARCA4
STAT1
RBL2
TRIM28
EGR1
TFAP2C

RNA and expression

SMIM14 has three mRNA transcript variants. Transcript variant 1 is the longest variant, with 6,397 base pairs. [2]

| Transcript | Length (bp) | Accession Number |
|---|---|---|
| Transcript variant 1 | 6,397 | NM_001317896.2 |
| Transcript variant 2 | 6,252 | NM_174921 |
| Transcript variant 3 | 6,263 | NM_001317897 |

SMIM14 is highly expressed in the liver, adrenal gland, colon, and prostate. It is expressed at low levels in peripheral blood lymphocytes, skeletal muscle, and the heart. [11]

Protein

Visualization of SMIM14 protein with helical transmembrane domain (TMD) via the Protter web server.

Transcript variant 1 of SMIM14 encodes a protein of 99 amino acids. [13]

Primary structure

The predicted molecular weight (Mw) of the SMIM14 protein is 10,710.34 Da. The protein carries no net electrical charge at pH 5.10, its isoelectric point (pI). [14] The abundance of every amino acid is within the normal range for human proteins. [14]
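
As a rough illustration of how a molecular weight is predicted from a primary sequence, the sketch below sums standard average residue masses and adds one water molecule for the free termini. It is not the method of the cited tool, and the example peptide is hypothetical rather than the SMIM14 sequence, which is not reproduced here.

```python
# Standard average masses (Da) of amino acid residues, i.e. the amino acid
# minus one water lost when the peptide bond forms.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free N- and C-termini


def molecular_weight(sequence: str) -> float:
    """Approximate average molecular weight (Da) of an unmodified peptide."""
    sequence = sequence.upper()
    unknown = set(sequence) - RESIDUE_MASS.keys()
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the function.
    print(round(molecular_weight("MGGAP"), 2))
    # Average mass per residue implied by the figures quoted in the text.
    print(round(10710.34 / 99, 1))
```

This simple sum ignores post-translational modifications and disulfide bonds, so it can only approximate a reported value.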

Transmembrane domain and motifs

The Kozak sequence is considered a strong motif. [7]

SMIM14 has one transmembrane domain, so it is classified as a single-pass membrane protein. [15] The transmembrane domain spans residues 51–70. [16] A dileucine motif is predicted within the domain; such motifs play a role in sorting transmembrane proteins to endosomes and lysosomes. [17] The N-terminus is positioned in the extracellular space, while the C-terminus is located inside the cell, further classifying SMIM14 as a type I transmembrane protein.

Secondary structure

It is predicted that there is an α-helix within the transmembrane domain. [18] SMIM14 is also predicted to form a random coil near the C-terminus. [18] [19] A random coil lacks regular secondary structure and adopts a relaxed conformation without stabilizing interactions. Extended strands (E-strands) are also predicted throughout the protein. [18] [19] E-strands are another common secondary structure and are often characterized by hydrogen bonding between the backbones of adjacent strands.

Within the N-terminus, SMIM14 is predicted to have three palmitoylation sites, [20] which facilitate the clustering of proteins, and one disulfide bridge, which stabilizes the structure of the protein. There is also a predicted glycosaminoglycan site spanning residues 45–48, proximal to the transmembrane domain. [21] The C-terminus is predicted to have two unidentified phosphorylation sites and one PKA-phosphorylation site. [22]

Subcellular location

SMIM14, a transmembrane protein, usually localizes to the ER membrane. [4] Although there is no conventional ER retention signal in the SMIM14 coding sequence, it has been suggested that the transmembrane domain mediates ER retention.

Homology

SMIM14 has no known paralogs and at least 298 orthologs.

Paralogs

Through BLAST, it has been established that there are no paralogs of the SMIM14 gene in Homo sapiens. [23]

Orthologs

SMIM14 is conserved in most vertebrates, excluding hagfish, lampreys, lobe-finned fish, and lungfish. [23] Among invertebrates, it is conserved in flatworms, roundworms, mollusks, and arthropods. It is also relatively conserved in more distant relatives, such as sea anemones and corals.

| Species | Common Name | Taxon | DoD (mya) | % Identity | % Similarity | Corrected % Divergence (m) | Accession Number |
|---|---|---|---|---|---|---|---|
| Mastomys coucha | Southern multimammate mouse | Rodentia | 90 | 87.9 | 98.0 | 12.9 | XP_031198284.1 |
| Phyllostomus discolor | pale spear-nosed bat | Mammalia | 96 | 93.4 | 99.0 | 6.70 | XP_028361411.1 |
| Manacus vitellinus | golden-collared manakin | Aves | 312 | 85.1 | 91.1 | 16.1 | XP_017923893.1 |
| Python bivittatus | Burmese python | Reptilia | 312 | 80.2 | 89.1 | 22.1 | XP_007426519 |
| Nanorana parkeri | high Himalaya frog | Amphibia | 352 | 69.2 | 79.8 | 36.8 | XP_018420132.1 |
| Danio rerio | zebrafish | Actinopterygii | 435 | 68.0 | 82.5 | 38.6 | NP_991165.1 |
| Rhincodon typus | whale shark | Chondrichthyes | 473 | 71.8 | 84.5 | 33.1 | XP_020383770.1 |
| Ciona intestinalis | sea vase | Ascidiacea | 676 | 42.7 | 55.3 | 85.1 | XP_026690156.1 |
| Strongylocentrotus purpuratus | Pacific purple sea urchin | Echinodermata | 684 | 50.5 | 68.0 | 68.3 | XP_787363.2 |
| Lingula anatina | lamp shell | Brachiopoda | 797 | 59.0 | 74.3 | 52.8 | XP_013382479.1 |
| Limulus polyphemus | Atlantic horseshoe crab | Arthropoda | 797 | 49.5 | 65.0 | 70.3 | XP_013782563.1 |
| Agrilus planipennis | emerald ash borer | Insecta | 797 | 39.8 | 57.3 | 92.1 | XP_018319678.1 |
| Octopus vulgaris | octopus | Mollusca | 797 | 51.0 | 64.4 | 67.3 | XP_029637526.1 |
| Strongyloides ratti | threadworm | Nematoda | 797 | 33.3 | 48.1 | 110 | XP_024504825.1 |
| Exaiptasia pallida | sea anemone | Anthozoa | 824 | 58.2 | 65.5 | 54.1 | XP_020902189.1 |
| Schistosoma haematobium | urinary blood fluke | Platyhelminthes | 824 | 37.4 | 53.3 | 98.3 | XP_012793134.1 |
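
The table does not state how the corrected divergence (m) was calculated. Most of the tabulated values are, however, consistent with the widely used correction m = −100 · ln(1 − n), where n is the fraction of non-identical residues. The sketch below treats that formula as an assumption and applies it to identities taken from the table; the function name is illustrative.

```python
import math


def corrected_divergence(percent_identity: float) -> float:
    """Corrected % divergence, assuming m = -100 * ln(1 - n).

    n is the fraction of non-identical residues, so 1 - n is the fractional
    identity. The formula is an assumption, not stated in the source table.
    """
    if not 0 < percent_identity <= 100:
        raise ValueError("percent identity must be in the interval (0, 100]")
    return -100.0 * math.log(percent_identity / 100.0)


if __name__ == "__main__":
    # Percent identities copied from the ortholog table.
    for species, identity in [
        ("Mastomys coucha", 87.9),
        ("Danio rerio", 68.0),
        ("Ciona intestinalis", 42.7),
    ]:
        print(species, round(corrected_divergence(identity), 1))
```

The correction grows faster than the raw percent difference because it accounts for multiple substitutions at the same site, which is why highly diverged orthologs can exceed 100.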

The region of SMIM14 proximal to the N-terminus is highly conserved across orthologs. In stark contrast, the C-terminus is more varied. Sequence analysis of human SMIM14 suggests that the C-terminus contains a disproportionate number of proline residues (9 out of 29; 31%) with several proline-rich sequences (PXXP). [4] Proline-rich domains are usually associated with protein-protein interactions; thus, the C-terminus has a high probability of interacting with other proteins.
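
A proline fraction and a list of PXXP motifs of the kind described above can be obtained with a short scan of the sequence. The following is a minimal sketch; the example peptide is hypothetical and is not the SMIM14 C-terminus, and the function names are illustrative.

```python
import re


def proline_fraction(sequence: str) -> float:
    """Fraction of residues in the sequence that are proline (P)."""
    return sequence.count("P") / len(sequence)


def find_pxxp(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, motif) for every PXXP match.

    A lookahead is used so that overlapping motifs are all reported.
    """
    return [
        (match.start() + 1, match.group(1))
        for match in re.finditer(r"(?=(P..P))", sequence)
    ]


if __name__ == "__main__":
    example = "APPLPAAP"  # hypothetical peptide, for illustration only
    print(proline_fraction(example))
    print(find_pxxp(example))
    # The proportion quoted in the text: 9 prolines out of 29 residues.
    print(round(100 * 9 / 29))
```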

Protein interactions

SMIM14 has been predicted to interact with the FATE1 protein, which is involved in Ca2+ transfer from the ER to mitochondria, a regulatory mechanism for apoptosis. [24] [25] SMIM14 has also been predicted to interact with LSM4, a glycine-rich protein that plays a role in pre-mRNA splicing. [26] [27]

References

  1. Hunt, Sarah E; McLaren, William; Gil, Laurent; Thormann, Anja; Schuilenburg, Helen; Sheppard, Dan; Parton, Andrew; Armean, Irina M; Trevanion, Stephen J; Flicek, Paul; Cunningham, Fiona (1 January 2018). "Ensembl variation resources". Database. 2018. doi:10.1093/database/bay119. PMC 6310513. PMID 30576484.
  2. "Homo sapiens small integral membrane protein 14 (SMIM14), transcript variant 1, mRNA". 2019-07-07.
  3. "SMIM14 orthologs". NCBI. Retrieved 2020-02-07.
  4. Jun, Mi-Hee; Jun, Young-Wu; Kim, Kun-Hyung; Lee, Jin-A; Jang, Deok-Jin (31 October 2014). "Characterization of the cellular localization of C4orf34 as a novel endoplasmic reticulum resident protein". BMB Reports. 47 (10): 563–568. doi:10.5483/bmbrep.2014.47.10.252. PMC 4261514. PMID 24499674.
  5. Chalifa-Caspi, V.; Shmueli, O; Benjamin-Rodrig, H; Rosen, N; Shmoish, M; Yanai, I; Ophir, R; Kats, P; Safran, M; Lancet, D (1 January 2003). "GeneAnnot: Interfacing GeneCards with high-throughput gene expression compendia". Briefings in Bioinformatics. 4 (4): 349–360. doi:10.1093/bib/4.4.349. PMID 14725348.
  6. "SMIM14 Gene - GeneCards | SIM14 Protein | SIM14 Antibody". www.genecards.org. Retrieved 2020-02-25.
  7. Hernández, Greco; Osnaya, Vincent G.; Pérez-Martínez, Xochitl (1 December 2019). "Conservation and Variability of the AUG Initiation Codon Context in Eukaryotes". Trends in Biochemical Sciences. 44 (12): 1009–1021. doi:10.1016/j.tibs.2019.07.001. PMID 31353284. S2CID 198966937.
  8. "AceView: Gene:C4orf34, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTsAceView". www.ncbi.nlm.nih.gov. Retrieved 2020-04-30.
  9. Cartharius, K.; Frech, K.; Grote, K.; Klocke, B.; Haltmeier, M.; Klingenhoff, A.; Frisch, M.; Bayerlein, M.; Werner, T. (1 July 2005). "MatInspector and beyond: promoter analysis based on transcription factor binding sites". Bioinformatics. 21 (13): 2933–2942. doi:10.1093/bioinformatics/bti473. PMID 15860560.
  10. Kent, W. J.; Sugnet, C. W.; Furey, T. S.; Roskin, K. M.; Pringle, T. H.; Zahler, A. M.; Haussler, D. (16 May 2002). "The Human Genome Browser at UCSC". Genome Research. 12 (6): 996–1006. doi:10.1101/gr.229102. PMC 186604. PMID 12045153.
  11. "49002542 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-04-30.
  12. "Protter - interactive protein feature visualization". wlab.ethz.ch. Retrieved 2020-05-01.
  13. "small integral membrane protein 14 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-04-30.
  14. Brendel, V.; Bucher, P.; Nourbakhsh, I. R.; Blaisdell, B. E.; Karlin, S. (15 March 1992). "Methods and algorithms for statistical analysis of protein sequences". Proceedings of the National Academy of Sciences. 89 (6): 2002–2006. Bibcode:1992PNAS...89.2002B. doi:10.1073/pnas.89.6.2002. PMC 48584. PMID 1549558.
  15. Kall, L.; Krogh, A.; Sonnhammer, E. L.L. (8 May 2007). "Advantages of combined transmembrane topology and signal peptide prediction--the Phobius web server". Nucleic Acids Research. 35 (Web Server): W429–W432. doi:10.1093/nar/gkm256. PMC 1933244. PMID 17483518.
  16. Gouw, Marc; Michael, Sushama; Sámano-Sánchez, Hugo; Kumar, Manjeet; Zeke, András; Lang, Benjamin; Bely, Benoit; Chemes, Lucía B; Davey, Norman E; Deng, Ziqi; Diella, Francesca; Gürth, Clara-Marie; Huber, Ann-Kathrin; Kleinsorg, Stefan; Schlegel, Lara S; Palopoli, Nicolás; Roey, Kim V; Altenberg, Brigitte; Reményi, Attila; Dinkel, Holger; Gibson, Toby J (4 January 2018). "The eukaryotic linear motif resource – 2018 update". Nucleic Acids Research. 46 (D1): D428–D434. doi:10.1093/nar/gkx1077. PMC 5753338. PMID 29136216.
  17. Bonifacino, Juan S.; Traub, Linton M. (June 2003). "Signals for Sorting of Transmembrane Proteins to Endosomes and Lysosomes". Annual Review of Biochemistry. 72 (1): 395–447. doi:10.1146/annurev.biochem.72.121801.161800. PMID 12651740.
  18. Combet, C; Blanchet, C; Geourjon, C; Deléage, G (March 2000). "NPS@: Network Protein Sequence Analysis". Trends in Biochemical Sciences. 25 (3): 147–150. doi:10.1016/s0968-0004(99)01540-6. PMID 10694887.
  19. Ashok Kumar, T (1 April 2013). "CFSSP: Chou and Fasman Secondary Structure Prediction server". Wide Spectrum. 1 (9): 15–19. doi:10.5281/ZENODO.50733.
  20. Ren, J.; Wen, L.; Gao, X.; Jin, C.; Xue, Y.; Yao, X. (27 August 2008). "CSS-Palm 2.0: an updated software for palmitoylation sites prediction". Protein Engineering Design and Selection. 21 (11): 639–644. doi:10.1093/protein/gzn039. PMC 2569006. PMID 18753194.
  21. Gouw, Marc; Michael, Sushama; Sámano-Sánchez, Hugo; Kumar, Manjeet; Zeke, András; Lang, Benjamin; Bely, Benoit; Chemes, Lucía B; Davey, Norman E; Deng, Ziqi; Diella, Francesca; Gürth, Clara-Marie; Huber, Ann-Kathrin; Kleinsorg, Stefan; Schlegel, Lara S; Palopoli, Nicolás; Roey, Kim V; Altenberg, Brigitte; Reményi, Attila; Dinkel, Holger; Gibson, Toby J (4 January 2018). "The eukaryotic linear motif resource – 2018 update". Nucleic Acids Research. 46 (D1): D428–D434. doi:10.1093/nar/gkx1077. PMC 5753338. PMID 29136216.
  22. Blom, Nikolaj; Sicheritz-Pontén, Thomas; Gupta, Ramneek; Gammeltoft, Steen; Brunak, Søren (June 2004). "Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence". Proteomics. 4 (6): 1633–1649. doi:10.1002/pmic.200300771. PMID 15174133. S2CID 18810164.
  23. Altschul, Stephen F.; Gish, Warren; Miller, Webb; Myers, Eugene W.; Lipman, David J. (October 1990). "Basic local alignment search tool". Journal of Molecular Biology. 215 (3): 403–410. doi:10.1016/S0022-2836(05)80360-2. PMID 2231712. S2CID 14441902.
  24. "FATE1 - Fetal and adult testis-expressed transcript protein - Homo sapiens (Human) - FATE1 gene & protein". www.uniprot.org. Retrieved 2020-04-30.
  25. Doghman‐Bouguerra, Mabrouka; Granatiero, Veronica; Sbiera, Silviu; Sbiera, Iuliu; Lacas‐Gervais, Sandra; Brau, Frédéric; Fassnacht, Martin; Rizzuto, Rosario; Lalli, Enzo (September 2016). "FATE 1 antagonizes calcium‐ and drug‐induced apoptosis by uncoupling ER and mitochondria". EMBO Reports. 17 (9): 1264–1280. doi:10.15252/embr.201541504. PMC 5007562. PMID 27402544.
  26. "LSM4 - U6 snRNA-associated Sm-like protein LSm4 - Homo sapiens (Human) - LSM4 gene & protein". www.uniprot.org. Retrieved 2020-04-30.
  27. Bertram, Karl; Agafonov, Dmitry E.; Dybkov, Olexandr; Haselbach, David; Leelaram, Majety N.; Will, Cindy L.; Urlaub, Henning; Kastner, Berthold; Lührmann, Reinhard; Stark, Holger (August 2017). "Cryo-EM Structure of a Pre-catalytic Human Spliceosome Primed for Activation". Cell. 170 (4): 701–713.e11. doi:10.1016/j.cell.2017.07.011. PMID 28781166. S2CID 12185819.