SPATS1 | | |
---|---|---|
Aliases | SPATS1, DDIP, SPATA8, SRSP1, spermatogenesis associated serine rich 1 | |
External IDs | MGI: 1918270; HomoloGene: 12376; GeneCards: SPATS1 | |
Orthologs | | |
Species | Human | Mouse |
Location (UCSC) | Chr 6: 44.34 – 44.38 Mb | Chr 17: 45.75 – 45.79 Mb |
PubMed search | [3] | [4] |
Spermatogenesis associated serine rich 1 (SPATS1) is a protein that in humans is encoded by the SPATS1 gene. It is also known by the aliases Dishevelled-DEP domain interacting protein (DDIP), spermatogenesis associated 8 (SPATA8), and serine-rich spermatogenic protein 1 (SRSP1). [5] Its chemical structure, subcellular localization, expression, and conservation have been characterized in broad terms. Research suggests SPATS1 may play a role in the canonical Wnt signaling pathway and in the first spermatogenic wave.
The human SPATS1 gene contains 1150 nucleotides, coding for 300 amino acids. It is located on the positive strand of chromosome 6 at 6p21.1. [5] No single nucleotide polymorphisms (SNPs) in the gene have so far been shown to be clinically significant. [6]
The longest transcript has 8 exons. Another isoform may exist, but experimental confirmation is lacking, possibly because a premature stop codon causes it to be produced at low levels. [7] Bioinformatic analysis suggests that the protein has no transmembrane domain and is composed of both alpha helices and beta sheets. Reported isoelectric points for SPATS1 conflict: several sources give 6.68, while two others give higher values of 7.04 and 7.47. [8] [9] [10]
Studies suggest that the protein is found mostly in the cytoplasm, but there is also evidence of its presence in the nucleus. [11] Nuclear localization is supported by the finding that the rat homolog of SPATS1 has a probable bipartite nuclear localization signal. [12] In addition, bioinformatic tools have identified, with high probability, a bipartite nuclear localization signal in the human protein at amino acids 174–191. [13]
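As a rough illustration of what scanning for a bipartite nuclear localization signal involves, the sketch below searches a sequence for the classical consensus: two adjacent basic residues (K or R), a spacer of 10 to 12 residues, and a five-residue window containing at least three basic residues. This is a simplified assumption for illustration only; it is not the tool cited above, which may use scoring matrices or other criteria. The function name `find_bipartite_nls` and the toy sequence are hypothetical and are not the SPATS1 sequence.

```python
BASIC = set("KR")  # basic residues: lysine and arginine


def find_bipartite_nls(seq, min_spacer=10, max_spacer=12):
    """Return 1-based (start, end) positions of candidate bipartite NLS motifs.

    Simplified consensus (an assumption for illustration):
    two basic residues, a 10-12 residue spacer, then a 5-residue
    window with at least 3 basic residues.
    """
    hits = []
    for i in range(len(seq) - 1):
        # First cluster: two adjacent basic residues
        if seq[i] in BASIC and seq[i + 1] in BASIC:
            for spacer in range(min_spacer, max_spacer + 1):
                j = i + 2 + spacer          # start of the second cluster
                window = seq[j:j + 5]
                if len(window) == 5 and sum(c in BASIC for c in window) >= 3:
                    hits.append((i + 1, j + 5))
    return hits


# Toy sequence, invented for this example (not a real protein)
toy = "AA" + "KR" + "A" * 10 + "KKAKA" + "GG"
hits = find_bipartite_nls(toy)
print(hits)
```

A hit from a pattern scan like this is only a candidate; experimental evidence, such as that reported for the rat homolog, is needed to confirm that a motif is functional.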
Bioinformatic analysis suggests that the protein undergoes several post-translational modifications. The more plausible predictions include a GPI-modification site at amino acid 280, N-glycosylation sites at amino acids 49 and 229, and a phosphorylation site at amino acid 113. In total, 85 phosphorylation sites are predicted, 23 of them with a likelihood of 80% or higher. [14] Only the site at amino acid 113 has been experimentally confirmed. [5] There is also a high probability of a SASRP1 motif spanning amino acids 51–288. [15]
Possible interacting proteins are listed in the table below. None of these proteins has been experimentally confirmed to interact with SPATS1; their interaction potential was inferred from co-occurrence patterns and text mining. [16]
Abbreviation | Protein Name | Function | Score |
---|---|---|---|
ZNF683 | zinc finger protein 683 | may be involved in transcriptional regulation | 0.633 |
TMC5 | transmembrane channel like 5 | probable ion channel | 0.624 |
GTSF1L | gametocyte specific factor 1 like | unknown | 0.567 |
TMEM225 | transmembrane protein 225 | most likely inhibits protein phosphatase 1 (PP1) in sperm by binding to the catalytic subunit PPP1CC | 0.566 |
SPATA3 | spermatogenesis associated 3 | unknown | 0.537 |
FAM71F1 | family with sequence similarity 71 member F1 | unknown | 0.535 |
C9orf139 | chromosome 9 open reading frame 139 | unknown | 0.477 |
SPACA4 | sperm acrosome associated 4 | sperm surface membrane protein that may be involved in sperm - egg plasma membrane adhesion and fusion during fertilization | 0.472 |
SCML4 | sex comb on midleg-like protein 4 | Polycomb group (PcG) protein; PcG proteins form multi-protein complexes that are required to maintain the transcriptionally repressive state of homeotic genes throughout development | 0.457 |
Expression of this protein declines greatly in adulthood compared with the levels measured in fetuses. [11] Studies show some fluctuation during gestation, but levels remain relatively high overall. There is also evidence of high expression up to day 28 postpartum. [17]
Expression of this protein has been found in peritubular myoid cells, gonocytes, pachytene spermatocytes, spermatogonia, and Sertoli cells. [11]
In the mouse brain, expression has been detected in several regions, including the pituitary gland, the prefrontal cortex, the frontal lobe, the cerebellum, and the parietal lobe. [18] The highest expression levels are found in the testes, followed by the trachea. A protein abundance histogram, which compares the abundance of a protein of interest with that of other proteins, places SPATS1 toward the low end of the range. [5]
The specific function of SPATS1 is still being studied. Research indicates that it may play a role in initiating the first spermatogenic wave as well as the first male meiotic division. [11] Another study suggests that it acts as a negative regulator of the canonical Wnt signaling pathway. [12] Several microarray studies have examined how knocking out different proteins and enzymes affects SPATS1 expression. Epigenetic factors, specifically histone methylation, have also been investigated, and several studies have examined the phenotypic effects of knockout. [5]
The SPATS1 protein is conserved in species as distantly related as the ciliate Oxytricha trifallax. No orthologs have been found in archaea or bacteria, nor in birds. [19] The coding region is highly conserved among mammals and other close orthologs. Among distant orthologs there is conservation in non-coding regions, including the promoter, 5' UTR, and 3' UTR. Conserved positions retain either the same nucleotide or a chemically similar one. [20] The table below lists orthologs with their sequence similarity and identity to the human protein and their estimated date of divergence. [19] [21]
Ortholog | Sequence Similarity to Homo sapiens | Sequence Identity to Homo sapiens | Date of Divergence (MYA) |
---|---|---|---|
Pongo abelii | 95.70% | 95.00% | 15.2 |
Heterocephalus glaber | 58.30% | 52.00% | 88 |
Pteropus alecto | 71.30% | 66.90% | 94 |
Bos taurus | 50.70% | 47.70% | 94 |
Bos mutus | 64.10% | 58.80% | 94 |
Balaenoptera acutorostrata scammoni | 80.30% | 74.00% | 94 |
Loxodonta africana | 67.20% | 61.00% | 102 |
Sarcophilus harrisii | 48.20% | 37.50% | 160 |
Ornithorhynchus anatinus | 49.20% | 39.90% | 169 |
Gavialis gangeticus | 45.40% | 36.70% | 320 |
Anolis carolinensis | 48.30% | 34.10% | 320 |
Pelodiscus sinensis | 45.90% | 33.40% | 320 |
Nanorana parkeri | 43.10% | 30.30% | 353 |
Strongylocentrotus purpuratus | 33.60% | 25.60% | 627 |
Nematostella vectensis | 28.30% | 25.20% | 685 |
Branchiostoma belcheri | 36.50% | 29.20% | 699 |
Crassostrea gigas | 35.00% | 27.00% | 758 |
Lottia gigantea | 32.70% | 26.20% | 758 |
Oxytricha trifallax | 31.80% | 20.40% | 1781 |
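The distinction between the similarity and identity columns above can be sketched in a few lines of Python. The example assumes two sequences that are already aligned (gaps written as `-`), counts identical columns for identity, and counts identical or chemically similar columns for similarity. The residue groupings, the choice of alignment length as the denominator, and the toy sequences are all assumptions made for illustration; alignment tools differ on these conventions, and the figures in the table were not produced by this code.

```python
# Hypothetical grouping of chemically similar amino acids (an assumption;
# real tools usually derive similarity from a substitution matrix).
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("ST"),    # hydroxyl
    set("NQ"),    # amide
    set("AG"),    # small
]


def is_similar(x, y):
    """True if two residues are identical or share a similarity group."""
    if x == y:
        return True
    return any(x in group and y in group for group in SIMILAR_GROUPS)


def percent_identity_similarity(a, b):
    """Return (identity %, similarity %) for two pre-aligned sequences.

    The denominator is the full alignment length, gap columns included
    (one common convention among several).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # gap columns count toward length but never match
        if x == y:
            identical += 1
        if is_similar(x, y):
            similar += 1
    length = len(a)
    return 100.0 * identical / length, 100.0 * similar / length


# Toy aligned sequences, invented for this example
ident, sim = percent_identity_similarity("MKTLLSD-E", "MRTIVSDAE")
print(f"identity {ident:.1f}%, similarity {sim:.1f}%")
```

Because every identical column also counts as similar, similarity can never be lower than identity, which is the pattern seen in each row of the table.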