SMIM15 | |
---|---|
Identifiers | |
Aliases | SMIM15, C5orf43, small integral membrane protein 15 |
External IDs | MGI: 1922866, HomoloGene: 90075, GeneCards: SMIM15 |
SMIM15 (small integral membrane protein 15) is a protein in humans that is encoded by the SMIM15 gene. [5] It is a transmembrane protein that interacts with PBX4. [6] Deletions of the chromosomal region where SMIM15 is located have been associated with mental defects and physical deformities. [7] [8] The gene has ubiquitous but variable expression in many tissues throughout the body. [5]
SMIM15 has also been known under the aliases C5orf43 [5] and GC05M060454. [5] The gene is located at 5q12.1, [5] spans 4741 base pairs, and contains three exons. [5] [9] The encoded protein is made up of 74 amino acids.
SMIM15 has no known isoforms. [5] The 5' UTR spans 420 bases and the 3' UTR spans 2243 bases. [9]
Exon | Number of base pairs | Start and end locations |
---|---|---|
1 | 252 | 61162217 – 61162468 |
2 | 140 | 61161088 – 61161227 |
3 | 2496 | 61157704 – 61160199 |
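The exon sizes in the table follow from the coordinates, because both endpoints are inclusive. The short sketch below recomputes them and cross-checks the summed exon length against the UTR sizes quoted above. It assumes that the coding sequence consists of the 74 codons of the protein plus one stop codon, which is not stated explicitly in the text; all variable names are illustrative.

```python
# Exon coordinates (start, end) as listed in the table; both ends inclusive.
exons = {
    1: (61162217, 61162468),
    2: (61161088, 61161227),
    3: (61157704, 61160199),
}


def exon_length(start, end):
    """Length of an interval whose start and end are both included."""
    return end - start + 1


lengths = {number: exon_length(s, e) for number, (s, e) in exons.items()}
transcript_length = sum(lengths.values())

# UTR sizes as quoted in the text.
utr_5 = 420
utr_3 = 2243

# Assumption (not from the source): CDS = 74 codons + 1 stop codon.
cds_length = (74 + 1) * 3

print(lengths)
print("exons total:", transcript_length)
print("5' UTR + CDS + 3' UTR:", utr_5 + cds_length + utr_3)
```

Under that assumption the two totals agree, which suggests the exon table and the UTR lengths describe the same transcript.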
The primary sequence of SMIM15 is: [11] MFDIKAWAEY VVEWAAKDPY GFLTTVILAL TPLFLASAVL SWKLAKMIEA REKEQKKKQK RQENIAKAKR LKKD
The molecular weight of SMIM15 is 8.6 kDa and its pI is 9.82. [12] There are no significant compositional features such as charge clusters, hydrophobic segments, charge runs, patterns, multiplets or periodicities. [13]
There is one transmembrane domain, located at amino acids 20–42. [14] [15]
The other domains are a luminal domain at amino acids 1–19 and a cytosolic domain at amino acids 43–74. [14] [15]
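To make the domain boundaries concrete, the following sketch slices the primary sequence given above into the three stated regions. It is only an illustration of the 1-based, inclusive numbering used in the text; the function and variable names are hypothetical.

```python
# Primary sequence as listed in the article, in blocks of ten residues.
BLOCKS = (
    "MFDIKAWAEY VVEWAAKDPY GFLTTVILAL TPLFLASAVL "
    "SWKLAKMIEA REKEQKKKQK RQENIAKAKR LKKD"
)
SEQ = "".join(BLOCKS.split())

# Domain boundaries as stated in the text (1-based, inclusive).
DOMAINS = {
    "luminal": (1, 19),
    "transmembrane": (20, 42),
    "cytosolic": (43, 74),
}


def segment(seq, start, end):
    """Return residues start..end using 1-based, inclusive positions."""
    return seq[start - 1:end]


segments = {name: segment(SEQ, s, e) for name, (s, e) in DOMAINS.items()}

for name, sub in segments.items():
    print(f"{name:14s}{len(sub):3d}  {sub}")
```

The three segments are 19, 23 and 32 residues long and together cover all 74 residues without gaps or overlaps.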
The secondary structure for SMIM15 is largely alpha-helical with alpha helices making up 62.16% (46 amino acids) of the protein. [16] Random coil makes up 25.68% (19 amino acids) and extended strands make up 12.16% (9 amino acids) of the SMIM15 protein. [16]
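The percentages above are simply the residue counts divided by the protein length of 74. A minimal check, with illustrative names:

```python
TOTAL_RESIDUES = 74

# Residue counts per predicted secondary-structure class, from the text.
counts = {"alpha helix": 46, "random coil": 19, "extended strand": 9}

percentages = {
    name: round(100 * n / TOTAL_RESIDUES, 2) for name, n in counts.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```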
There are a number of predicted post-translational modifications of the SMIM15 protein, which are shown in the Conceptual Translation of Human SMIM15 in figure 1.
The predicted sites for sumoylation are at positions 5, 67, 69, 72 and 73. [17] Sumoylation is known to affect protein stability, protection from degradation, cellular localization, protein-protein interactions and DNA binding.
The predicted sites for glycation are at positions 5, 43, 58, 72 and 73. [18] Glycation is a process in which proteins react with reducing sugar molecules, which impairs the function of the protein and changes its characteristics. [20] [21] It can lead to the creation of advanced glycation end products (AGEs). [19]
Finally, there are four predicted phosphorylation sites: tyrosine at position 20, threonine at positions 25 and 31, and serine at position 41. [22] Phosphorylation affects different cellular processes and thereby regulates protein function. [23]
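The listed positions can be checked against the primary sequence. The sketch below assumes that the predicted sumoylation and glycation sites are lysine residues (K), which is the usual target of both modifications but is not stated explicitly in the text; the phosphorylation residues are taken from the text. All names are illustrative.

```python
# Primary sequence as listed in the article, in blocks of ten residues.
BLOCKS = (
    "MFDIKAWAEY VVEWAAKDPY GFLTTVILAL TPLFLASAVL "
    "SWKLAKMIEA REKEQKKKQK RQENIAKAKR LKKD"
)
SEQ = "".join(BLOCKS.split())

# Position (1-based) -> expected one-letter residue.
# Assumption: sumoylation and glycation sites are lysines.
PREDICTED_SITES = {
    "sumoylation": {5: "K", 67: "K", 69: "K", 72: "K", 73: "K"},
    "glycation": {5: "K", 43: "K", 58: "K", 72: "K", 73: "K"},
    "phosphorylation": {20: "Y", 25: "T", 31: "T", 41: "S"},
}


def mismatches(seq, sites):
    """Return the positions whose residue differs from the expected one."""
    return [pos for pos, aa in sites.items() if seq[pos - 1] != aa]


for modification, sites in PREDICTED_SITES.items():
    bad = mismatches(SEQ, sites)
    print(modification, "OK" if not bad else f"mismatch at {bad}")
```

With the sequence as given, every listed position carries the expected residue. Note that all four phosphorylation sites fall within the stated transmembrane segment (amino acids 20–42).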
SMIM15 has a transmembrane domain spanning amino acids 20–42. There are cleavage sites at the C-terminus and nuclear localization signals. [24]
SMIM15 has ubiquitous but variable expression in many different tissues throughout the body. [5] It has the highest level of expression within the prostate. [25] Expression is lower in skeletal muscle than in other tissues. [26]
SMIM15 has one CpG island within the promoter. It shows lower levels of H3K4Me1 but higher levels of H3K4Me3 and H3K27Ac across all of the cell lines examined. [27]
The promoter region for SMIM15 is 1049 base pairs long and is known as GXP_922465. [10] It contains 431 different transcription factor binding sites; [10] the binding factors include GATA1, TGIF, LMX1A, and NKX61. [10]
There are no known microRNA targets in the 3' UTR. [10] The predicted mRNA secondary structure contains a high number of stem-loops. This could indicate high stability of the mRNA transcript and the presence of binding sites for regulatory mechanisms.
The function of SMIM15 is currently not well understood.
Only one interacting protein has been identified so far. [28] [29] This protein is PBX4, a member of the pre-B-cell leukemia transcription factor (PBX) family. [6] [30] PBX proteins play critical roles in embryonic development and cellular differentiation, both as Hox cofactors and through Hox-independent pathways. [6]
Deletion of 5q12.1 can lead to the development of mental retardation and ocular defects. [7] Another deletion, in the 5q12.1–5q12.3 region, led to mental-motor retardation and dysmorphia. [8] Caries is a multifactorial disease, and little is known about the host genetic factors influencing susceptibility. The interval 5q12.1–5q13.3 was linked to low caries susceptibility in Filipino families. [31]
SMIM15 is conserved in both vertebrates and invertebrates. It is not found in insects or fungi. SMIM15 does not have any paralogs, [5] and the most distant known relative of Homo sapiens SMIM15 is found in Trichoplax sp. H2, with a date of divergence of 747 MYA. [32]