SERTM2, also known as serine rich and transmembrane domain containing 2, is a protein that in humans is encoded by the SERTM2 gene. SERTM2 is a transmembrane protein located in intracellular membranes and active in membrane-bound organelles. [1] [2] SERTM2 expression has been linked to metastatic prostate tumors, prostate carcinomas, and renal cell carcinomas. [3] [4]
The SERTM2 gene in humans is located on the positive strand of the X chromosome (Xq23) and spans 10,755 base pairs. [5] It has three exons and one known transcript (isoform), which is 4,612 base pairs long. [6]
SERTM2 is also known as:
The SERTM2 protein is 90 amino acids long, with a predicted molecular weight of 10 kDa and an isoelectric point of 6. [11] [12] The human SERTM2 protein structure contains two topological domains, one extracellular and one cytoplasmic. [13] These domains are connected by a transmembrane domain that lies within a confirmed alpha helix. [8] [9] [10] [12] The human protein contains a disordered region at the tail of the protein. [12] Despite the "serine rich" in its name, the protein was not found to be enriched in serine or any other amino acid when compared to other human proteins. [11]
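As a rough sanity check on the predicted molecular weight, the arithmetic can be sketched as below. The figure of about 110 Da per residue is a general rule of thumb for average proteins, not a value taken from the cited sources, and the function name is made up for this illustration. An exact mass would require the actual amino acid sequence.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein chain.
# This is an assumption for illustration, not a value from the cited sources.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues, avg_residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Estimate protein mass in kDa from its length alone."""
    return n_residues * avg_residue_mass_da / 1000.0


# A 90-residue protein such as SERTM2 comes out close to 10 kDa.
print(estimate_mass_kda(90))  # 9.9
```

The estimate of 9.9 kDa is consistent with the predicted weight of 10 kDa quoted above.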
The human SERTM2 protein has one confirmed post-translational modification at the 11th position. [6] The asparagine at that position undergoes N-linked glycosylation, or the attachment of an oligosaccharide to a nitrogen atom on the asparagine side chain. [15]
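N-linked glycosylation generally occurs at the consensus motif (sequon) Asn-X-Ser/Thr, where X is any residue except proline. A minimal sketch of scanning a sequence for that motif follows. The sequence used here is invented for illustration and is not the real SERTM2 sequence, and the function name is hypothetical. A motif match only marks a candidate site; it does not show that the site is actually modified.

```python
import re

# N-X-[S/T] sequon, where X is any residue except proline.
# The lookahead lets overlapping candidate sites be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Made-up 20-residue peptide: N at position 11 is followed by V and S (a sequon),
# while N at position 17 is followed by proline and is therefore skipped.
toy_sequence = "MAGSLPTTDLNVSAAPNPSK"
print(find_sequons(toy_sequence))  # [11]
```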
RNA sequencing and human tissue profiling have found that SERTM2 is expressed primarily in the endometrium, prostate, and liver of humans, at moderate levels. [6] RNA-sequencing data also show that SERTM2 is upregulated in cardiac progenitor cells compared to mesoderm cells, and in fetal heart cells compared to adult heart tissue. [7] In knockout and overexpression experiments, both the knockout and the overexpression of SERTM2 resulted in low cardiomyocyte yield, suggesting that its expression must be carefully regulated during cellular differentiation for normal cardiac development to occur. These findings led to the nickname CARDEL (Cardiac Development Long non-coding RNA). [7]
Human SERTM2 has no paralogs. SERTM2 orthologs are found in mammals, birds, reptiles, amphibians, and some fish. [13] The most distant known ortholog is found in the catshark, a cartilaginous fish whose lineage diverged from that of humans about 462 million years ago. The gene is hard to find in fish, with only two other known appearances, in the tiger barb and the Chinese sucker fish, both bony fish. SERTM2 became more established in amphibians, which diverged about 352 million years ago, and its orthologs are found throughout modern reptiles, birds, and mammals, including primates. [12]
Table 1: Human serine-rich and transmembrane-domain containing 2 (SERTM2) gene orthologs. Orthologs are sorted first by date of divergence from the human gene, then by similarity to the human sequence. [12]
Group | Common Name | Scientific Name | Accession Number | Taxonomical Group | Sequence Length (amino acids) | Date of Divergence (MYA) | % Identity to Human
Mammalia | Human | Homo sapiens | NP_001341402.1 | Primates | 90 | - | 100
Mammalia | Ring-tailed lemur | Lemur catta | XP_045393689.1 | Primates | 90 | 74 | 93
Mammalia | Beluga whale | Delphinapterus leucas | XP_030615360.1 | Cetacea | 90 | 94 | 92
Mammalia | Mouse | Mus musculus | NP_001341422.1 | Rodentia | 89 | 87 | 91
Mammalia | Big brown bat | Eptesicus fuscus | XP_054573025.1 | Chiroptera | 90 | 94 | 81
Mammalia | Common wombat | Vombatus ursinus | XP_027691215.1 | Marsupial | 90 | 160 | 81
Aves | Blue tit | Cyanistes caeruleus | XP_023773484.1 | Aves | 91 | 319 | 76
Aves | Chicken | Gallus gallus | XP_046795767.1 | Aves | 92 | 319 | 73
Reptilia | Alligator | Alligator mississippiensis | XP_059588794.1 | Crocodilia | 92 | 319 | 79
Reptilia | Burmese python | Python bivittatus | XP_025020345.1 | Squamata | 92 | 319 | 75
Reptilia | Softshell turtle | Pelodiscus sinensis | XP_025033828.1 | Testudines | 92 | 319 | 60
Amphibians | Microcaecilia unicolor | Microcaecilia unicolor | XP_030065343.1 | Gymnophiona | 91 | 352 | 68
Amphibians | Two-lined caecilian | Rhinatrema bivittatum | XP_029463498.1 | Gymnophiona | 93 | 352 | 67
Amphibians | Common frog | Rana temporaria | XP_040179805.1 | Anura | 92 | 352 | 70
Fish/Sharks | Tiger barb | Puntigrus tetrazona | XP_043094501.1 | Osteichthyes | 103 | 429 | 24
Fish/Sharks | Chinese sucker fish | Myxocyprinus asiaticus | XP_051542736.1 | Osteichthyes | 108 | 429 | 21
Fish/Sharks | Catshark | Scyliorhinus canicula | XP_038632174.1 | Chondrichthyes | 89 | 462 | 42
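Percent identity values like those in Table 1 are typically derived from a pairwise alignment of each ortholog against the human sequence. The sketch below shows one common way to compute it: identical columns divided by all aligned columns, with gaps counted in the denominator. The source does not state which definition or alignment tool was used, so this convention is an assumption, and the two aligned strings are invented toy sequences rather than real SERTM2 data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy alignment: 7 columns, 5 identical (one substitution, one gap).
print(f"{percent_identity('MKT-LLV', 'MRTALLV'):.1f}")  # 71.4
```

Other conventions divide by the length of the shorter sequence or exclude gapped columns entirely, which can shift the result by a few percentage points for sequences of different lengths, such as the fish orthologs in the table.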
Metastatic tumors in the prostate have been shown to have three-fold higher expression of SERTM2 than primary tumors, suggesting that overexpression of SERTM2 may be linked to the metastatic nature of prostate tumors. [3] SERTM2 overexpression has been observed in the tumor microenvironment of androgen receptor pathway-positive adenocarcinoma of the prostate (ARPC). [4] In comparison to ARPC, SERTM2 expression is lower in the tumor microenvironment of neuroendocrine prostate carcinoma (NEPC), a more severe type of prostate cancer. [4]