SPMAP1

Identifiers
Aliases SPMAP1, chromosome 17 open reading frame 98, sperm microtubule associated protein 1
External IDs MGI: 1919465; HomoloGene: 19140; GeneCards: SPMAP1; OMA:SPMAP1 - orthologs
Orthologs
Species Human, Mouse
RefSeq (mRNA) NM_001080465 (human), NM_028156 (mouse)
RefSeq (protein) NP_001073934 (human), NP_082432 (mouse)
Location (UCSC) Chr 17: 38.84 – 38.84 Mb (human), Chr 11: 97.66 – 97.67 Mb (mouse)
PubMed search [3] (human), [4] (mouse)

Sperm microtubule associated protein 1 is a protein that in humans is encoded by the SPMAP1 gene, located on chromosome 17. [5] The SPMAP1 gene consists of a 6,302 base sequence. Its mRNA has three exons and no alternative splice sites. The protein has 154 amino acids, none of which is present at an unusually high or low level. [6] SPMAP1 contains a domain of unknown function (DUF4542) and has a molecular weight of 17.6 kDa. [7] [8] SPMAP1 does not belong to any other protein family and has no known isoforms. [9] The protein has orthologs with high percent similarity in mammals and reptiles, and more distantly related orthologs across the Metazoa, the most distant being in sponges. [10]


Like most proteins, SPMAP1 is highly expressed in the testes. [11] The protein also shows elevated levels in cancer. [11] It has been detected in or near intermediate filaments and the nucleolus. [11] Additionally, the transcription factors predicted to bind SPMAP1 are also active in hematopoietic stem cells, the immune system, and the cardiovascular system, among others. [12] The gene is over-expressed in many cancer types, including kidney renal clear cell carcinoma and lung squamous cell carcinoma. [13] Motif and transcription factor analysis points towards a role for SPMAP1 in proliferation, especially immune cell proliferation.

Gene

Background

The SPMAP1 gene consists of 6,303 bases. It has three exons, two large introns, and no alternative splice sites. [14] The 5' UTR sequence of SPMAP1 is highly conserved in primates, but no non-mammalian 5' UTR matches could be identified. [15] [16] SPMAP1 has 11 Alu repeats. [17]

Enhancers

According to GeneCards, SPMAP1 has five enhancer sequences, whose activity may provide insight into the function of SPMAP1. Four of the five enhancers are active in the thymus. All five are active in H1 human embryonic stem cells (H1 hESC) and in the iPS DF 19.11 cell line derived from foreskin fibroblasts. [18]

Transcription factors

The SPMAP1 promoter has many transcription factor binding sites. [19] The transcription factors that bind it are commonly found in hematopoietic cells, connective tissue, cardiovascular tissue, and the immune system. The presence of Krüppel-like transcription factors suggests a role for SPMAP1 in proliferation or apoptosis. The presence of SMAD sites indicates an involvement in the TGF-β pathway, while the presence of Myc-related transcription factors points to a potential proliferation function of the protein. Additionally, other SPMAP1 transcription factors, such as RBPJ-kappa, are involved in proliferation and signalling.

Variants

Numerous SNPs were found in the 5' UTR, 3' UTR, and coding region of SPMAP1. [20] Few of them fall in highly conserved regions: four SNPs were found in highly conserved amino acids and one in the start codon. Of these five, three affect the third position of their codon. Because the genetic code is degenerate at the third (wobble) position, these three SNPs are likely to be synonymous and to have no effect on the overall protein structure.
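The wobble argument can be illustrated with the standard genetic code. The following is a minimal sketch, not the analysis performed for SPMAP1: the article does not list the actual SNP alleles, so the codons and substitutions below are hypothetical examples, and `is_synonymous` is a helper name introduced here for illustration.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(codon: str, position: int, new_base: str) -> bool:
    """Return True if replacing the base at the 0-based position keeps the amino acid."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    return CODON_TABLE[codon] == CODON_TABLE[mutated]


# Hypothetical third-position substitutions (not actual SPMAP1 SNPs).
print(is_synonymous("GCT", 2, "C"))  # alanine codon GCT -> GCC
print(is_synonymous("ATG", 2, "A"))  # start codon ATG -> ATA
```

Third-position changes are often, but not always, silent. Methionine is encoded only by ATG, so any substitution in a start codon changes the encoded residue, including one at the third position.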

mRNA

SPMAP1 does not have any miRNA binding sites. [21] Its mRNA has low abundance (0.44%). [22] The mRNA sequence has three hexaloops, none of which are significant. [23]

Protein

Primary structure

SPMAP1 is a 17.6 kDa protein. [8] Distant orthologs are 5 to 6 kDa larger, partly because they carry an added nuclear localization signal (NLS) that the Homo sapiens protein lacks. There are no positive or negative charge clusters and no transmembrane components. The predicted isoelectric point is 9.80 and the predicted molecular weight is 17,564.67 Da. [24] SPMAP1 is hydrophobic and soluble.
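A basic isoelectric point is what one would expect from a protein with more basic than acidic residues. The sketch below counts both classes in the human sequence exactly as printed in the Amino acid sequence section. It is only a rough sanity check and not the ExPASy pI calculation, and the variable names are illustrative.

```python
# Human SPMAP1 sequence as printed in this article (spaces removed).
SEQ = (
    "MAYLSECRLRLEKGFILDGVAVSTAARAYGRSRPKLWSAIPPYNAQQDYHARSYFQ"
    "SHVVPPLLRVVPPLLRKTDQDHGGTGRDGWIVDYIHIFGQGQRYLNRRNWAGTGHS"
    "LQQVTGHDHYNADLKPIDGFNGRFGYRRNTPALRQSTSVFGEVTHFPLF"
)

# Histidine is left out because it is mostly uncharged near neutral pH.
basic = sum(SEQ.count(aa) for aa in "KR")   # lysine, arginine
acidic = sum(SEQ.count(aa) for aa in "DE")  # aspartate, glutamate

print(f"basic (K+R): {basic}, acidic (D+E): {acidic}, excess: {basic - acidic}")
```

An excess of lysine and arginine over aspartate and glutamate is consistent with a pI well above 7, although the exact value depends on the pKa set used by the prediction tool.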

[Figure: Secondary structure and phosphorylation sites of SPMAP1]

Secondary and tertiary structure

The predicted secondary structure of SPMAP1 consists of both beta sheets and alpha helices (see the secondary structure figure). The tertiary structure model is consistent with this prediction, although the numbers of alpha helices and beta sheets differ slightly (see the SWISS-MODEL figure).

Motifs and binding sites

There are no N-terminal signal peptides, and no cleavage motifs were found. There are no ER membrane retention signals or peroxisomal targeting signals. SKL2 is not present, so there is no secondary peroxisomal signal. There are no vacuolar targeting signals. There are no RNA binding motifs or actinin-type actin binding motifs. There are no N-myristoylation or prenylation patterns. [25]

[Figure: SWISS-MODEL 3D structure of SPMAP1]

The GPS kinase predictor (BioCuckoo) identified candidate kinase sites in SPMAP1. There are many serine/threonine and tyrosine kinase phosphorylation sites, [26] with serine and threonine kinase sites the most prevalent above the statistical significance threshold. There are no SUMOylation sites. [27] The SPMAP1 protein has six possible O-GlcNAc sites. [28] The highly conserved O-GlcNAc sites are amino acids 24, 32, 117, and 142. O-GlcNAc post-translational modification occurs on Ser/Thr residues, including on the products of oncogenes, tumor suppressors, and proteins involved in growth factor signaling. [29]
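Because O-GlcNAc modification occurs on Ser/Thr residues, the quoted positions can be checked against the human sequence. The following is a minimal sketch that uses the sequence exactly as printed in the Amino acid sequence section and assumes the positions are 1-based; the variable names are illustrative.

```python
# Human SPMAP1 sequence as printed in this article (spaces removed).
SEQ = (
    "MAYLSECRLRLEKGFILDGVAVSTAARAYGRSRPKLWSAIPPYNAQQDYHARSYFQ"
    "SHVVPPLLRVVPPLLRKTDQDHGGTGRDGWIVDYIHIFGQGQRYLNRRNWAGTGHS"
    "LQQVTGHDHYNADLKPIDGFNGRFGYRRNTPALRQSTSVFGEVTHFPLF"
)

# Highly conserved O-GlcNAc positions quoted in the text (assumed 1-based).
CONSERVED_SITES = [24, 32, 117, 142]

for pos in CONSERVED_SITES:
    residue = SEQ[pos - 1]  # convert the 1-based position to a 0-based index
    print(pos, residue, "Ser/Thr" if residue in "ST" else "not Ser/Thr")
```

On the sequence as printed, all four positions fall on a serine or threonine.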

SPMAP1 has a caspase-3/7 motif, where either caspase 3 or caspase 7 would cleave. [30] This is consistent with a role for SPMAP1 in proliferation, since pro-apoptotic caspases may target proteins that drive proliferation. The protein also has a motif bound by peptidyl-prolyl cis-trans isomerase NIMA-interacting 1 (Pin1). [30] Pin1 upregulation is involved in cancer and immune disorders, [31] which is consistent with SPMAP1 being involved in cancer, immune cells, and perhaps cancers of the immune system. Additionally, SPMAP1 has an IBM site, where inhibitor of apoptosis proteins (IAPs) bind. [30] This again suggests that SPMAP1 may be involved in inhibiting apoptosis and, by extension, in driving cancer. Furthermore, SPMAP1 has motifs bound by the SH2 domain of GRB2, an adapter protein in the RAS signaling pathway, which drives uncontrolled proliferation when deregulated.

Amino acid sequence

A duplication may have occurred at positions 59–71.

Homo sapiens

MAYLSECRLRLEKGFILDGVAVSTAARAYGRSRPKLWSAIPPYNAQQDYHARSYFQ SHVVPPLLRVVPPLLRKTDQDHGGTGRDGWIVDYIHIFGQGQRYLNRRNWAGTGHS LQQVTGHDHYNADLKPIDGFNGRFGYRRNTPALRQSTSVFGEVTHFPLF 
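The possible duplication can be located by scanning the printed sequence for a segment that is immediately followed by a copy of itself. The following is a minimal sketch; `tandem_repeats` and its `min_unit` parameter are names introduced here for illustration, and exact matching is a simplifying assumption (diverged repeats would be missed).

```python
from typing import Iterator, Tuple

# Human SPMAP1 sequence as printed in this article (spaces removed).
SEQ = (
    "MAYLSECRLRLEKGFILDGVAVSTAARAYGRSRPKLWSAIPPYNAQQDYHARSYFQ"
    "SHVVPPLLRVVPPLLRKTDQDHGGTGRDGWIVDYIHIFGQGQRYLNRRNWAGTGHS"
    "LQQVTGHDHYNADLKPIDGFNGRFGYRRNTPALRQSTSVFGEVTHFPLF"
)


def tandem_repeats(seq: str, min_unit: int = 4) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start, unit) for each unit directly followed by an exact copy."""
    n = len(seq)
    for unit_len in range(min_unit, n // 2 + 1):
        for i in range(n - 2 * unit_len + 1):
            unit = seq[i:i + unit_len]
            if seq[i + unit_len:i + 2 * unit_len] == unit:
                yield i + 1, unit


for start, unit in tandem_repeats(SEQ):
    end = start + 2 * len(unit) - 1
    print(f"{unit} occurs twice in tandem at positions {start}-{end}")
```

On the printed sequence this reports the unit VVPPLLR starting at position 59, directly followed by a second copy, which matches the region noted above.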

Associated proteins

There are no known associated proteins. [32] [33] [34] [35]

Expression

Protein abundance in the Homo sapiens whole organism is quite low. No data are available for other species. [36] The Allen Brain Atlas has no brain expression data for SPMAP1. [37]

Subcellular localization

SPMAP1 protein has been detected in intermediate filaments and nucleoli. [38] An SPMAP1 antibody is available from Sigma-Aldrich. [39] Additionally, SPMAP1 localizes to the cytoplasm. Distantly related SPMAP1 orthologs in organisms such as Macrostomum lignano and Amphimedon queenslandica exhibit nuclear localization. [40] Nuclear localization signals are present in these distantly related organisms at non-conserved sites. The k-NN prediction for SPMAP1 is cytoplasmic localization. [41] SPMAP1 does not contain a signal peptide. [42] The protein is soluble. [43]

Tissue

Like most proteins, SPMAP1 protein is highly expressed in the testes. [44] It is expressed in adult as well as fetal tissues and is mildly expressed in connective tissue. [45] Additionally, expression has been seen in sperm, breast epithelial cells, and various cells of the immune system. [46]

Clinical significance

Cancer

Protein expression is elevated in many cancer patients. Specifically, protein expression has been shown to be high in colorectal, breast, prostate, and lung cancers. [47] SPMAP1 is expressed in papillary thyroid cancer as well. [48] Additionally, mutations in SPMAP1 were found in endometrial, stomach, colorectal, and kidney cancer. [49] SPMAP1 expression is elevated in cancer patients with BRCA. In kidney renal clear cell carcinoma patients, SPMAP1 expression was dramatically lower than in the non-cancerous state. [13] In 80% of chromophobe renal cell carcinoma patients, at least one duplication of the SPMAP1 gene was present. [13]

Other conditions

Protein expression is lower in males with teratozoospermia than in those without. [50] Many GEO Profiles experiments include SPMAP1; however, none yield data showing a significant change in expression. [51]

Evolution

SPMAP1 is a slowly evolving protein. Its rate of divergence, as determined by the molecular clock equations, resembles that of cytochrome c. [52]
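A divergence rate of this kind can be estimated from sequence identity and divergence time. The sketch below assumes the commonly used Poisson correction, m = -100 ln(identity), for the number of substitutions per 100 residues; the article does not state which equations were actually used, so this is an illustration only. The input values are taken from the ortholog table, and the function names are introduced here.

```python
import math

# (species, divergence time in MYA, sequence identity) from the ortholog table.
ORTHOLOGS = [
    ("Camelus ferus", 96.0, 0.83),
    ("Alligator sinensis", 312.0, 0.63),
    ("Xenopus laevis", 352.0, 0.51),
    ("Amphimedon queenslandica", 951.8, 0.32),
]


def corrected_changes_per_100(identity: float) -> float:
    """Poisson-corrected substitutions per 100 residues (assumed correction)."""
    return -100.0 * math.log(identity)


def rate_per_lineage(identity: float, divergence_mya: float) -> float:
    """Substitutions per 100 residues per million years along one lineage."""
    # The factor 2 accounts for change accumulating on both lineages.
    return corrected_changes_per_100(identity) / (2.0 * divergence_mya)


for species, mya, identity in ORTHOLOGS:
    m = corrected_changes_per_100(identity)
    rate = rate_per_lineage(identity, mya)
    print(f"{species}: m = {m:.1f} per 100 aa, rate = {rate:.3f} per 100 aa per MY")
```

Plotting m against divergence time for SPMAP1 next to reference proteins such as cytochrome c is the usual way to compare rates of divergence.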

[Figure: Unrooted SPMAP1 phylogenetic tree with 20 orthologs (see table below)]

Paralogs

There are no known Homo sapiens paralogs for SPMAP1. [53]

Orthologs

SPMAP1 has distantly related orthologs across the Metazoa, the most distant being in sponges. There is no known ortholog in ctenophores, nematodes, bacteria, fungi, plants, or zebrafish. [10] Only two fish species are known to have the SPMAP1 gene. Model organisms such as Caenorhabditis elegans and Drosophila melanogaster do not have the gene.

SPMAP1 Orthologs [10]

| Sequence # | Genus and species | Common name | Accession # | Protein length (aa) | Divergence (MYA) | Sequence identity | Confidence (E-value) |
|---|---|---|---|---|---|---|---|
| 1 | Homo sapiens | Human | NP_001073934 | 154 | 0 | 100% | na |
| 2 | Camelus ferus | Wild Bactrian camel | XP_006176436 | 154 | 96 | 83% | 2.00E-94 |
| 3 | Pteropus alecto | Black flying fox | XP_006924784 | 154 | 96 | 81% | 1.00E-92 |
| 4 | Lipotes vexillifer | Yangtze river dolphin | XP_007465208 | 154 | 96 | 81% | 6.00E-89 |
| 5 | Condylura cristata | Star-nosed mole | XP_004684322 | 154 | 96 | 75% | 5.00E-78 |
| 6 | Myotis brandtii | Brandt's bat | EPQ05064 | 171 | 96 | 78% | 6.00E-78 |
| 7 | Marmota marmota marmota | Alpine marmot | XP_015362150.1 | 154 | 90 | 81% | 3.00E-94 |
| 8 | Octodon degus | Chilean rodent | XP_004633931 | 153 | 90 | 73% | 1.00E-76 |
| 9 | Alligator sinensis | Chinese alligator | XP_006022630 | 154 | 312 | 63% | 8.00E-68 |
| 10 | Anolis carolinensis | Lizard | XP_003222553 | 154 | 312 | 62% | 6.00E-67 |
| 11 | Xenopus laevis | African clawed frog | XP_018090228 | 244 | 352 | 51% | 4.00E-38 |
| 12 | Rhincodon typus | Whale shark | XP_020388051.1 | 164 | 476 | 53% | 5.00E-52 |
| 13 | Acanthaster planci | Starfish | XP_022086463 | 209 | 684 | 48% | 1.00E-37 |
| 14 | Mizuhopecten yessoensis | Scallop | XP_021340301 | 275 | 797 | 45% | 5.00E-06 |
| 15 | Lottia gigantea | Sea snail | XP_009063876 | 173 | 797 | 45% | 2.00E-37 |
| 16 | Lingula anatina | Lamp shell | XP_013388744.1 | 211 | 797 | 43% | 2.00E-35 |
| 17 | Biomphalaria glabrata | Freshwater snail | XP_013088317 | 198 | 797 | 41% | 6.00E-15 |
| 18 | Nematostella vectensis | Sea anemone | XP_001629616 | 173 | 824 | 48% | 2.00E-35 |
| 19 | Stylophora pistillata | Coral | XP_022795125 | 226 | 824 | 46% | 3.00E-38 |
| 20 | Macrostomum lignano | Flatworm | PAA73615 | 235 | 824 | 36% | 4.00E-25 |
| 21 | Amphimedon queenslandica | Sponge | XP_003389909 | 275 | 951.8 | 32% | 2.00E-12 |

References

  1. GRCh38: Ensembl release 89: ENSG00000275489, ENSG00000276913. Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000018543. Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. Zody MC, Garber M, Adams DJ, Sharpe T, Harrow J, Lupski JR, et al. (April 2006). "DNA sequence of human chromosome 17 and analysis of rearrangement in the human lineage". Nature. 440 (7087): 1045–9. Bibcode:2006Natur.440.1045Z. doi:10.1038/nature04689. PMC 2610434. PMID 16625196.
  6. PSORT II entry on c17orf98 https://psort.hgc.jp/form2.html
  7. NCBI Conserved Domains entry C17orf98
  8. EMBL-EBI SAPS entry on c17orf98
  9. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2 May 2018.
  10. "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2 May 2018.
  11. Human Protein Atlas entry on c17orf98
  12. Genomatix ElDorado entry on c17orf98
  13. TissGDB entry on c17orf98
  14. AceView entry on c17orf98
  15. ClustalW entry on c17orf98 5' UTR
  16. NCBI BLAST entry on c17orf98 5' UTR https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome
  17. Genomatix ElDorado entry on c17orf98 [permanent dead link]
  18. Database, GeneCards Human Gene. "C17orf98 Gene - GeneCards - CQ098 Protein - CQ098 Antibody". www.genecards.org. Retrieved 2 May 2018.
  19. "Genomatix ElDorado entry on c17orf98". [permanent dead link]
  20. NCBI Genome Data Viewer
  21. TargetScan entry on c17orf98 http://www.targetscan.org/cgibin/targetscan/vert_71/view_gene.cgi?rs=ENST00000398575.4&taxid=9606&showcnc=0&shownc=0&shownc_nc=&showncf1=&showncf2=&subset=1
  22. PaxDb entry on c17orf98
  23. "mFold entry on c17orf98 5' UTR". [permanent dead link]
  24. ExPASy pI/Mw entry on c17orf98 https://web.expasy.org/cgi-bin/compute_pi/pi_tool
  25. PSORT II entry on C17orf98 [permanent dead link]
  26. BioCuckoo GPS entry on C17orf98 http://gps.biocu
  27. GPS-SUMO entry on c17orf98
  28. YinOYang entry on c17orf98 http://www.cbs.dtu.dk/services/YinOYang/
  29. Hanover, John A.; Krause, Michael W.; Love, Dona C. (2010). "The Hexosamine Signaling Pathway: O-GlcNAc cycling in feast or famine". Biochimica et Biophysica Acta (BBA) - General Subjects. 1800 (2): 80–95. doi:10.1016/j.bbagen.2009.07.017. PMC 2815088. PMID 19647043.
  30. Eukaryotic Linear Motif search on c17orf98 amino acid sequence
  31. Esnault S, Braun RK, Shen ZJ, Xiang Z, Heninger E, Love RB, Sandor M, Malter JS (February 2007). "Pin1 modulates the type 1 immune response". PLOS ONE. 2 (2): e226. Bibcode:2007PLoSO...2..226E. doi:10.1371/journal.pone.0000226. PMC 1790862. PMID 17311089.
  32. BioGrid entry on c17orf98
  33. MINT entry on c17orf98
  34. STRING entry on C17orf98
  35. PSICQUIC View entry on c17orf98
  36. pax-db entry on c17orf98 https://pax-db.org/protein/1858623#
  37. "Microarray Data :: Allen Brain Atlas: Human Brain". human.brain-map.org. Retrieved 2018-05-06.
  38. Human Protein Atlas (sigma) entry on c17orf98 https://www.proteinatlas.org/ENSG00000275489-C17orf98/cell
  39. Sigma Aldrich entry on c17orf98 https://www.sigmaaldrich.com/catalog/product/sigma/hpa051696?lang=en&region=US
  40. PSORT II entry on c17orf98 amino acid sequence https://psort.hgc.jp/form2.html
  41. PSORT II entry on C17orf98 https://psort.hgc.jp/cgi-bin/runpsort.pl
  42. DTU Bioinformatics entry on c17orf98
  43. Expasy Sosui entry on C17orf98
  44. Protein Atlas entry on c17orf98
  45. NCBI Unigene entry on c17orf98 www.ncbi.nlm.nih.gov/UniGene/clust.cgi?UGID=169593&TAXID=9606&SEARCH=c17orf98
  46. "Bio GPS entry on c17orf98".
  47. Human Protein Atlas (sigma) entry on c17orf98 https://www.proteinatlas.org/ENSG00000275489-C17orf98/cell
  48. NCBI GeoProfiles entry on c17orf98 https://www.ncbi.nlm.nih.gov/geoprofiles
  49. Phosphosite entry on c17orf98 https://www.phosphosite.org/proteinAction.action?id=5156341&showAllSites=true
  50. "C17orf98 - Teratozoospermia (HG-U133 2.0 )".
  51. "NCBI GeoProfiles entry on c17orf98".
  52. "The Molecular Clock and Estimating Species Divergence - Learn Science at Scitable". www.nature.com. Retrieved 2 May 2018.
  53. Blast entry on c17orf98 https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins