SCRN3 | |
---|---|
Identifiers | |
Aliases | SCRN3, SES3, secernin 3 |
External IDs | OMIM: 614967 MGI: 1921866 HomoloGene: 11601 GeneCards: SCRN3 |
Secernin-3 (SCRN3) is a protein that is encoded by the human SCRN3 gene. SCRN3 belongs to the peptidase C69 family and the secernin subfamily. [5] As a member of this family, the protein is predicted to have cysteine-type exopeptidase and dipeptidase activity and to be involved in proteolysis. It is ubiquitously expressed in the brain, thyroid, and 25 other tissues. [6] Additionally, SCRN3 is conserved in a variety of species, including mammals, birds, fish, amphibians, and invertebrates. [6] SCRN3 is predicted to be an integral component of the cytoplasm.
SCRN3 is also commonly known as FLJ23142 and SES3.
Homo sapiens secernin-3 (SCRN3) is a protein-coding gene. It can be found on chromosome 2, with its specific location being 2q31.1, on the '+' strand. [7] [8] The gene is 33,846 base pairs long and contains 8 exons. [7] [5]
The most common transcript of the SCRN3 gene is transcript variant 1, which is 3052 base pairs long. [7] SCRN3 is expressed at a high level, about 2.4 times that of the average gene.
Human SCRN3 has 8 different isoforms. [6]
The mRNA expression of SCRN3 was found to be moderate in humans. SCRN3 is expressed in most major tissues. The mRNA is expressed at slightly elevated levels in the brain, thyroid, heart, and prostate relative to other tissues, though the underlying trend is relatively consistent, ubiquitous expression across tissues. [9] [10]
In an analysis of SCRN3 in situ hybridization of both mouse brain and embryo, no specific areas of strong expression were located, instead showing a moderate expression throughout, confirming that SCRN3 likely has ubiquitous expression within most tissues. Immunohistochemistry data also indicated that human SCRN3 has low tissue, single cell, immune cell, and brain region specificity, once again adding to the evidence of ubiquitous expression. [11]
Transcript variant 1 of the SCRN3 gene encodes the most common protein isoform, secernin-3 isoform 1, which is 424 amino acids long. The molecular weight of the unmodified SCRN3 protein is approximately 48.413 kDa [12] and the theoretical isoelectric point (pI) of SCRN3 is 5.38. [13] The theoretical isoelectric point, coupled with a predominance of acidic amino acids in the protein's composition, suggests that SCRN3 is a relatively acidic protein.
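As a rough consistency check, the reported mass and length imply an average residue mass, and the length implies an approximate coding-sequence size. The sketch below is only arithmetic on figures quoted in this article; the variable names are illustrative, and the single-stop-codon assumption is not taken from the source.

```python
# Values quoted in the article text.
protein_length_aa = 424        # secernin-3 isoform 1
molecular_weight_da = 48_413   # ~48.413 kDa, unmodified protein
transcript_length_nt = 3052    # transcript variant 1

# Average mass per residue implied by the reported weight and length.
mean_residue_mass_da = molecular_weight_da / protein_length_aa

# Coding sequence length, ASSUMING one codon per residue plus one stop codon.
cds_length_nt = (protein_length_aa + 1) * 3

# Fraction of the transcript that would be coding under that assumption.
coding_fraction = cds_length_nt / transcript_length_nt

print(f"{mean_residue_mass_da:.1f} Da per residue")
print(f"{cds_length_nt} nt coding sequence")
print(f"{coding_fraction:.1%} of the transcript")
```

Under these assumptions, well under half of transcript variant 1 would be coding, with the remainder presumably untranslated sequence.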
Additionally, the relative protein abundance of SCRN3 in humans was found to be moderately high compared to other human proteins, at 6.13 ppm. [14]
SCRN3 has a single notable domain, identified as the Peptidase_C69 Domain, or PepD domain for short. This domain spans from amino acid position 5 to 226 of the protein. The sequences found within this domain are characteristic of the Peptidase C69 family, and more specifically the Secernin subfamily, known to be mainly dipeptidases. Within this family, comparative sequence and structural analysis revealed a cysteine as the catalytic nucleophile, a feature that can be found on Secernin-3. [5] [15]
Within the predicted tertiary structure of SCRN3, the most highly conserved amino acids were found predominantly within the internal portion of the protein. This suggests that the most conserved amino acids, being on the inside, are important to providing the structure of the protein, as well as providing internal functionality.
Within the cell, SCRN3 is predicted to be primarily localized to the cytoplasm. [16] [17] The cytoplasmic localization prediction was consistent among 5 additional orthologs (Mauremys reevesii, Gallus gallus, Microcaecilia unicolor, Danio rerio, and Anopheles gambiae), supporting the predicted cytoplasmic subcellular localization of human SCRN3.
SCRN3 is subject to several predicted post-translational modifications, including phosphorylation, ubiquitylation, sumoylation, lysine acetylation, and O-beta-GlcNAc attachment, among others.
Additionally, Secernin-3 provided the first example of a predicted naturally occurring N-terminal glyoxylyl (Glox) electrophile, found through the use of reverse-polarity activity-based protein profiling (RP-ABPP). Using hydrazine probes, it was confirmed that the cysteine (Cys) residue was post-translationally converted to Glox. This was the first time an electrophilic N-terminal glyoxylyl group had been identified, though the functions of both the protein and Glox as a cofactor have not yet been experimentally validated. [18] [19] [20]
SCRN3 has two known paralogs, SCRN2 and SCRN1, which share a 67.4% and 63.8% similarity to the SCRN3 protein sequence, respectively. Both paralogs are moderately related to SCRN3. SCRN2 was found within the same species groups as SCRN3. SCRN1 was conserved in fewer species groups, including mammals, birds, reptiles, amphibians, and cartilaginous fish, but not in other fish or invertebrates. [21] [22]
Over 100 orthologs exist for the human gene SCRN3. [23] The known orthologs were found in vertebrates and invertebrates, but not in plants, bacteria, or fungi. The divergence dates of 20 orthologs were compared relative to Homo sapiens. Invertebrates are the most distantly related orthologs to human SCRN3, with the furthest median date of divergence in this set of orthologs being 694 million years ago.
Group | Genus and Species | Common Name | Taxonomic Group | Median Date of Divergence (MYA) | Accession # | Sequence Length (aa) | Sequence Identity to Human Protein (%) | Sequence Similarity to Human Protein (%) |
---|---|---|---|---|---|---|---|---|
Mammal | Homo sapiens | Human | Primates | 0 | NP_078859.2 | 424 | 100 | 100 |
Mammal | Mus musculus | House mouse | Rodentia | 87 | NP_083298.1 | 418 | 90.1 | 93.7 |
Mammal | Canis lupus familiaris | Dog | Carnivora | 94 | XP_038303032.1 | 422 | 80.9 | 89.9 |
Mammal | Gracilinanus agilis | Agile Gracile Opossum | Didelphimorphia | 160 | XP_044522430.1 | 421 | 74.2 | 84.5 |
Mammal | Tachyglossus aculeatus | Australian Echidna | Monotremata | 180 | XP_038607834.1 | 427 | 70.5 | 81.2 |
Reptilia | Mauremys reevesii | Reeves' Turtle | Testudines | 319 | XP_039351479.1 | 424 | 73.1 | 82.6 |
Reptilia | Crocodylus porosus | Australian Saltwater Crocodile | Crocodylia | 319 | XP_019409221 | 423 | 72.5 | 82 |
Reptilia | Varanus komodoensis | Komodo Dragon | Squamata | 319 | XP_044273731.1 | 421 | 70.3 | 81.9 |
Aves | Gallus gallus | Red Junglefowl (Chicken) | Galliformes | 319 | NP_001244270.2 | 420 | 70.7 | 79.5 |
Aves | Anas platyrhynchos | Mallard | Anseriformes | 319 | XP_027317143.2 | 422 | 70.1 | 81 |
Aves | Corvus hawaiiensis | Hawaiian Crow | Passeriformes | 319 | XP_048165925.1 | 420 | 70 | 80 |
Amphibian | Microcaecilia unicolor | N/A | Gymnophiona | 353 | XP_030064980.1 | 415 | 67.8 | 81.2 |
Amphibian | Xenopus tropicalis | Tropical Clawed Frog | Anura | 353 | XP_002934649.3 | 410 | 63.4 | 74.4 |
Fish | Protopterus annectens | West African lungfish | Dipnoi | 408 | XP_043931491.1 | 420 | 63.3 | 75.5 |
Fish | Latimeria chalumnae | Coelacanth | Coelacanthiformes | 414 | XP_006003581 | 431 | 63.5 | 75.7 |
Fish | Danio rerio | Zebrafish | Actinopterygii | 431 | NP_956032.1 | 417 | 61.7 | 74.2 |
Fish | Callorhinchus milii | Elephant Shark | Chondrichthyes | 464 | XP_007888203.1 | 426 | 63.2 | 77.5 |
Invertebrate | Branchiostoma lanceolatum | Common Lancelet | Cephalochordata | 556 | CAH1238234.1 | 434 | 48 | 64.4 |
Invertebrate | Trichinella sp. T9 | Trichinella Roundworm | Nematoda | 694 | KRX60400.1 | 418 | 47.2 | 62 |
Invertebrate | Trichonephila inaurata madagascariensis | Red-Legged Golden Orb-Web Spider | Arthropoda | 694 | GFY76389.1 | 412 | 45.4 | 59.1 |
Invertebrate | Anopheles gambiae | African malaria mosquito | Arthropoda | 694 | XP_321103.4 | 371 | 29.1 | 44.1 |
The relative rate of molecular evolution for SCRN3 was moderately high, being slightly lower than the evolution rate of Fibrinogen Alpha, and more rapid than the evolution rate of Cytochrome C. SCRN3 is estimated to have first appeared in invertebrates approximately 694 million years ago, evolving to eventually being found in humans.
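One common way to make this kind of rate comparison is to convert percent identity into a Poisson-corrected distance, d = -ln(identity fraction), and set it against the median divergence date. The source does not state which method was actually used, so the sketch below is an assumption; it applies the correction to a few rows of the ortholog table, and the function name and row selection are illustrative.

```python
import math

# (species, median divergence in MYA, % identity to human), from the ortholog table.
ORTHOLOGS = [
    ("Mus musculus", 87, 90.1),
    ("Gallus gallus", 319, 70.7),
    ("Xenopus tropicalis", 353, 63.4),
    ("Danio rerio", 431, 61.7),
    ("Anopheles gambiae", 694, 29.1),
]

def poisson_distance(identity_percent: float) -> float:
    """Estimated substitutions per site, corrected for multiple hits.

    ASSUMPTION: a simple Poisson model; not necessarily the method
    behind the comparison described in the article.
    """
    p = 1.0 - identity_percent / 100.0   # observed fraction of differing sites
    return -math.log(1.0 - p)

for species, mya, identity in ORTHOLOGS:
    d = poisson_distance(identity)
    # Divide by 2T because both lineages accumulate changes after the split.
    rate = d / (2 * mya)
    print(f"{species:20s} d = {d:.3f}  rate = {rate:.2e} per site per MY")
```

Plotting d against divergence date for SCRN3 alongside the same values for reference proteins such as fibrinogen alpha and cytochrome c would give a comparison of the kind described above.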
A search of PSICQUIC [25] identified 5 proteins that interact with the human SCRN3 protein.
Interacting Protein | Protein Full Name | Interaction Type | Interaction Detection Method | Experimental Role | Cellular Compartment | Function |
---|---|---|---|---|---|---|
RCVRN | Recoverin | association, physical association | anti tag coimmunoprecipitation, affinity chromatography technology | bait | cytosol, nucleus, mitochondrion, cytoskeleton, extracellular, plasma membrane | Encodes a member of the recoverin family of neuronal calcium sensors. May prolong the termination of the phototransduction cascade in the retina by blocking the phosphorylation of photo-activated rhodopsin. |
EPS8 | Epidermal Growth Factor Receptor Pathway Substrate 8 | colocalization, physical association (x2) | confocal microscopy, two hybrid, affinity chromatography technology | neutral component, unspecified role | cytosol, extracellular, plasma membrane | Functions as part of the epidermal growth factor receptor (EGFR) pathway. Signaling adapter that controls various cellular protrusions by regulating actin cytoskeleton dynamics and architecture. |
MAGOH | Mago Homolog, Exon Junction Complex Subunit | association, physical association, direct interaction | anti tag coimmunoprecipitation, affinity chromatography technology, two hybrid | bait | nucleus, cytosol | Required for pre-mRNA splicing as component of the spliceosome. Core component of the exon junction complex (EJC). The EJC is a dynamic structure consisting of core proteins and several peripheral nuclear and cytoplasmic associated factors that join the complex only transiently either during EJC assembly or during subsequent mRNA metabolism. Expressed ubiquitously in adult tissues. |
DAPK1 | Death Associated Protein Kinase 1 | direct interaction | protein array | prey | cytoskeleton, plasma membrane, cytosol, nucleus | Positive mediator of gamma-interferon induced programmed cell death. Involved in multiple cellular signaling pathways that trigger cell survival, apoptosis, and autophagy. |
SMYD1 | SET and MYND Domain Containing 1 | association, physical association, direct interaction | anti tag coimmunoprecipitation, affinity chromatography technology | bait | cytoplasm, nucleus | Predicted to enable histone-lysine N-methyltransferase activity. Involved in positive regulation of myoblast differentiation. Predicted to be located in cytoplasm. Acts as a transcriptional repressor. |