C17orf98

Identifiers
Aliases: C17orf98, chromosome 17 open reading frame 98
External IDs: MGI: 1919465; HomoloGene: 19140; GeneCards: C17orf98
Orthologs (human; mouse)
RefSeq (mRNA): NM_001080465; NM_028156
RefSeq (protein): NP_001073934; NP_082432
Location (UCSC): Chr 17: 38.84 – 38.84 Mb; Chr 11: 97.66 – 97.67 Mb
PubMed search: [3]; [4]

C17orf98 is a protein that in humans is encoded by the C17orf98 gene, located on chromosome 17. [5] The C17orf98 gene consists of a 6,302 base sequence. Its mRNA has three exons and no alternative splice sites. The protein has 154 amino acids, none of which occurs at an unusually high or low frequency. [6] C17orf98 has a domain of unknown function (DUF4542) and a molecular weight of 17.6 kDa. [7] [8] C17orf98 does not belong to any other protein families, nor does it have any isoforms. [9] The protein has orthologs with high percent similarity in mammals and reptiles. It also has more distantly related orthologs across the metazoan kingdom, the most distant being in sponges. [10]

C17orf98 is highly expressed in the testes. [11] Elevated levels of the protein have also been reported in cancer. [11] The protein localizes to or near intermediate filaments and the nucleolus. [11] Additionally, the C17orf98 promoter has binding sites for transcription factors that are also active in hematopoietic stem cells, the immune system, and the cardiovascular system, among others. [12] The gene is over-expressed in many cancer types, including kidney renal clear cell carcinoma and lung squamous cell carcinoma. [13] Motif and transcription factor analyses point towards C17orf98 playing a role in proliferation, especially in immune cell proliferation.

Gene

Background

The C17orf98 gene consists of 6,303 bases. It has three exons and two large introns. The gene has no alternative splice sites. [14] The 5' UTR sequence of C17orf98 is highly conserved in primates. No non-mammalian 5' UTR matches could be identified. [15] [16] C17orf98 has 11 Alu repeats. [17]

Enhancers

GeneCards lists five enhancer sequences for C17orf98. The tissues in which these enhancers are active may provide insight into the function of C17orf98. Four of the five enhancers are active in the thymus. All five enhancers are active in the H1 human embryonic stem cell line (H1 hESC). Additionally, all five enhancers are active in the iPS DF 19.11 cell line, which is derived from foreskin fibroblasts. [18]

Transcription factors

The C17orf98 promoter has many transcription factor binding sites. [19] The transcription factors that bind the C17orf98 promoter are commonly found in hematopoietic cells, connective tissue, cardiovascular tissue, and the immune system. The presence of Krüppel-like transcription factors suggests a role for C17orf98 in proliferation or apoptosis. The presence of SMAD sites indicates an involvement in the TGF-β pathway, while the presence of Myc-related transcription factors indicates a potential proliferation function of the protein. Additionally, other C17orf98 transcription factors, such as RBPJ-Kappa, are involved in proliferation and signalling.

Variants

Numerous SNPs were found in the 5' UTR, 3' UTR, and coding region of C17orf98. [20] Few SNPs were found in highly conserved regions. In all, four SNPs were found in codons for highly conserved amino acids, and one SNP was found in the start codon. Of these five, three fall on the third position of the codon. Because of third-position (wobble) degeneracy in the genetic code, these three SNPs are likely to be synonymous and would not be expected to change the protein sequence.
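The reasoning about third-position SNPs can be illustrated with a short sketch. It builds the standard genetic code table and checks whether a single-base change leaves the encoded amino acid unchanged. The function name `is_synonymous` and the example codons are hypothetical and chosen only for illustration; they are not the actual C17orf98 SNPs, which are not listed in this article.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(codon, position, new_base):
    """Return True if changing `codon` at the 1-based `position`
    to `new_base` leaves the encoded amino acid unchanged."""
    mutated = codon[:position - 1] + new_base + codon[position:]
    return CODON_TABLE[codon] == CODON_TABLE[mutated]


if __name__ == "__main__":
    # Hypothetical examples, not real C17orf98 variants.
    print(is_synonymous("GCT", 3, "C"))  # alanine codon, third-position change
    print(is_synonymous("ATG", 3, "A"))  # start codon, third-position change
```

Third-position changes are often, but not always, silent. For example, any third-position change to the ATG start codon replaces methionine, so each SNP still has to be checked against the codon table individually.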

mRNA

No miRNA binding sites have been predicted for C17orf98. [21] Its mRNA has low abundance (0.44%). [22] The predicted mRNA secondary structure has three hexaloops, none of which are significant. [23]

Protein

Primary structure

C17orf98 is a 17.6 kDa protein. [8] Distant orthologs are 5 to 6 kDa larger, but some of the discrepancy comes from an added NLS sequence, which the Homo sapiens protein does not have. There are no positive or negative charge clusters. There are no transmembrane components. The predicted isoelectric point is 9.80 and the predicted molecular weight is 17,564.67 Da. [24] C17orf98 is hydrophobic and soluble.

Figure: Secondary structure and phosphorylation sites of C17orf98

Secondary and tertiary structure

The predicted secondary structure of C17orf98 consists of both beta sheets and alpha helices (see figure). The tertiary structure model is consistent with this prediction, although the numbers of alpha helices and beta sheets differ slightly.

Motifs and binding sites

There are no N-terminal signal peptides. Cleavage motifs were not found. There are no ER membrane retention signals or peroxisomal targeting signals. SKL2 is not present, so a secondary peroxisomal targeting signal is also absent. There are no vacuolar targeting signals. There are no RNA binding motifs or actinin-type actin binding motifs. There are no N-myristoylation or prenylation patterns. [25]

Figure: SWISS-MODEL 3D structure of C17orf98

The GPS kinase prediction tool from the Cuckoo group predicted kinase-specific phosphorylation sites for C17orf98. There are many serine/threonine and tyrosine kinase phosphorylation sites. [26] Serine and threonine kinase sites are the most prevalent above the statistical significance threshold. There are no SUMOylation sites. [27] The C17orf98 protein has six predicted O-GlcNAc sites. [28] The highly conserved O-GlcNAc sites are at amino acids 24, 32, 117, and 142. O-GlcNAc post-translational modification occurs on Ser/Thr residues, notably on the products of oncogenes, on tumor suppressors, and on proteins involved in growth factor signaling. [29]

C17orf98 has a caspase-3/7 motif, where either caspase 3 or caspase 7 would cleave. [30] This is consistent with the idea that C17orf98 is involved in proliferation, as a proapoptotic caspase would be expected to cleave proteins that drive proliferation. The protein also has a motif bound by peptidyl-prolyl cis-trans isomerase NIMA-interacting 1 (Pin1). [30] Pin1 upregulation is involved in cancer and immune disorders. [31] This supports the suggestion that C17orf98 is involved in cancer, immune cells, and perhaps cancers of the immune system. Additionally, the C17orf98 protein has an IBM site, where inhibitors of apoptosis (IAPs) bind. [30] This again supports the idea that C17orf98 is involved in inhibiting apoptosis and, by extension, may contribute to cancer. Furthermore, C17orf98 has motifs bound by the SH2 domain of GRB2. GRB2 is an adapter protein in the RAS signaling pathway, which drives uncontrolled proliferation when deregulated.

Amino acid sequence

A duplication may have occurred at positions 59–71.

Homo sapiens

MAYLSECRLRLEKGFILDGVAVSTAARAYGRSRPKLWSAIPPYNAQQDYHARSYFQ SHVVPPLLRVVPPLLRKTDQDHGGTGRDGWIVDYIHIFGQGQRYLNRRNWAGTGHS LQQVTGHDHYNADLKPIDGFNGRFGYRRNTPALRQSTSVFGEVTHFPLF 
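The possible duplication can be checked by scanning for a sequence unit that is immediately followed by an identical copy of itself. The following is a minimal sketch, run on a short fragment taken from the sequence above (`SHVVPPLLRVVPPLLRKT`). The function name `find_tandem_repeats` and the minimum unit length of four residues are assumptions made for this example, not part of any cited tool.

```python
def find_tandem_repeats(seq, min_unit=4):
    """Return (start, unit) pairs where `unit` is immediately followed
    by an identical copy of itself. `start` is a 0-based index."""
    hits = []
    n = len(seq)
    # Try longer units first so the largest repeated unit is reported first.
    for unit_len in range(n // 2, min_unit - 1, -1):
        for start in range(0, n - 2 * unit_len + 1):
            unit = seq[start:start + unit_len]
            if seq[start + unit_len:start + 2 * unit_len] == unit:
                hits.append((start, unit))
    return hits


if __name__ == "__main__":
    # Fragment of the human C17orf98 sequence surrounding the repeat.
    fragment = "SHVVPPLLRVVPPLLRKT"
    for start, unit in find_tandem_repeats(fragment):
        print(f"unit {unit} repeated twice, starting at fragment index {start}")
```

An exact-match scan like this only finds perfect repeats. Diverged duplications would need an alignment-based approach instead.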

Associated proteins

There are no known associated proteins. [32] [33] [34] [35]

Expression

Protein abundance across the whole Homo sapiens organism is quite low. No data are available for other species. [36] The Allen Brain Atlas has no expression data for C17orf98. [37]

Subcellular localization

The C17orf98 protein has been found in the intermediate filaments and the nucleoli. [38] A C17orf98 antibody is available from Sigma-Aldrich. [39] Additionally, C17orf98 localizes to the cytoplasm. Distantly related C17orf98 orthologs in organisms such as Macrostomum lignano and Amphimedon queenslandica exhibit nuclear expression. [40] Nuclear localization signals are present in distantly related organisms at non-conserved sites. The k-NN prediction indicates cytoplasmic localization. [41] C17orf98 does not contain a signal peptide. [42] The protein is soluble. [43]

Tissue

The C17orf98 protein is highly expressed in the testes. [44] The protein is expressed in adult tissues as well as fetal tissue. The protein has been found to be mildly expressed in connective tissue. [45] Additionally, expression has been seen in sperm, breast epithelial cells, and various cells of the immune system. [46]

Clinical significance

Cancer

Protein expression is elevated in many cancer patients. Specifically, protein expression has been shown to be high in colorectal, breast, prostate, and lung cancers. [47] C17orf98 is expressed in papillary thyroid cancer as well. [48] Additionally, mutations in C17orf98 were found in endometrial, stomach, colorectal, and kidney cancer. [49] C17orf98 expression is elevated in cancer patients with BRCA. In kidney renal clear cell carcinoma patients, C17orf98 expression was dramatically decreased compared to the non-cancerous state. [13] In 80% of chromophobe renal cell carcinoma patients, at least one duplication of the C17orf98 gene was present. [13]

Other conditions

Protein expression is lower in males with teratozoospermia than in those without. [50] Many GEO Profiles experiments include C17orf98; however, none show a significant change in expression. [51]

Evolution

C17orf98 is a slowly evolving protein. Its rate of divergence, as determined by the molecular clock equations, resembles that of cytochrome c. [52]
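The article does not state which molecular clock equations were used. One common form is the Poisson correction, m = -100 ln(1 - n/100), where n is the observed percent difference between two sequences and m is the estimated number of changes per 100 residues after allowing for multiple substitutions at the same site. The sketch below applies that assumed formula to a few divergence times and identities from the ortholog table. The function name `corrected_divergence` is hypothetical.

```python
import math


def corrected_divergence(percent_identity):
    """Poisson-corrected changes per 100 residues (assumed formula):
    m = -100 * ln(1 - n/100), where n = 100 - percent identity."""
    n = 100.0 - percent_identity
    if not 0.0 <= n < 100.0:
        raise ValueError("percent_identity must be in the range (0, 100]")
    return -100.0 * math.log(1.0 - n / 100.0)


if __name__ == "__main__":
    # (divergence from human in MYA, percent identity) from the ortholog table:
    # wild Bactrian camel, Chinese alligator, sponge.
    examples = [(96, 83), (312, 63), (951.8, 32)]
    for mya, identity in examples:
        m = corrected_divergence(identity)
        print(f"{mya} MYA, {identity}% identity: m = {m:.1f} changes per 100 residues")
```

Plotting m against divergence time for all orthologs, alongside the same curve for a reference protein such as cytochrome c, is one way to compare rates of divergence.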

Figure: Unrooted C17orf98 phylogenetic tree with 20 orthologs (see table below)

Paralogs

There are no known Homo sapiens paralogs for C17orf98. [53]

Orthologs

The C17orf98 protein has distantly related orthologs across the metazoan kingdom. Its most distant relative is in the sponges. There is no known ortholog in ctenophores, nematodes, bacteria, fungi, plants, or zebrafish. [10] Only two fish species are known to have the C17orf98 gene. Model organisms such as Caenorhabditis elegans and Drosophila melanogaster do not have the gene.

C17orf98 Orthologs [10]

Sequence # | Genus and species | Common name | Accession # | Protein length | Divergence (MYA) | Sequence identity | Confidence (E-value)
1 | Homo sapiens | Human | NP_001073934 | 154 | 0 | 100% | na
2 | Camelus ferus | Wild Bactrian camel | XP_006176436 | 154 | 96 | 83% | 2.00E-94
3 | Pteropus alecto | Black flying fox | XP_006924784 | 154 | 96 | 81% | 1.00E-92
4 | Lipotes vexillifer | Yangtze river dolphin | XP_007465208 | 154 | 96 | 81% | 6.00E-89
5 | Condylura cristata | Star-nosed mole | XP_004684322 | 154 | 96 | 75% | 5.00E-78
6 | Myotis brandtii | Brandt's bat | EPQ05064 | 171 | 96 | 78% | 6.00E-78
7 | Marmota marmota marmota | Alpine marmot | XP_015362150.1 | 154 | 90 | 81% | 3.00E-94
8 | Octodon degus | Chilean rodent | XP_004633931 | 153 | 90 | 73% | 1.00E-76
9 | Alligator sinensis | Chinese alligator | XP_006022630 | 154 | 312 | 63% | 8.00E-68
10 | Anolis carolinensis | Lizard | XP_003222553 | 154 | 312 | 62% | 6.00E-67
11 | Xenopus laevis | African clawed frog | XP_018090228 | 244 | 352 | 51% | 4.00E-38
12 | Rhincodon typus | Whale shark | XP_020388051.1 | 164 | 476 | 53% | 5.00E-52
13 | Acanthaster planci | Starfish | XP_022086463 | 209 | 684 | 48% | 1.00E-37
14 | Mizuhopecten yessoensis | Scallop | XP_021340301 | 275 | 797 | 45% | 5.00E-06
15 | Lottia gigantea | Sea snail | XP_009063876 | 173 | 797 | 45% | 2.00E-37
16 | Lingula anatina | Lamp shell | XP_013388744.1 | 211 | 797 | 43% | 2.00E-35
17 | Biomphalaria glabrata | Freshwater snail | XP_013088317 | 198 | 797 | 41% | 6.00E-15
18 | Nematostella vectensis | Sea anemone | XP_001629616 | 173 | 824 | 48% | 2.00E-35
19 | Stylophora pistillata | Coral | XP_022795125 | 226 | 824 | 46% | 3.00E-38
20 | Macrostomum lignano | Flatworm | PAA73615 | 235 | 824 | 36% | 4.00E-25
21 | Amphimedon queenslandica | Sponge | XP_003389909 | 275 | 951.8 | 32% | 2.00E-12

References

  1. GRCh38: Ensembl release 89: ENSG00000275489, ENSG00000276913 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000018543 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. Zody MC, Garber M, Adams DJ, Sharpe T, Harrow J, Lupski JR, et al. (April 2006). "DNA sequence of human chromosome 17 and analysis of rearrangement in the human lineage". Nature. 440 (7087): 1045–9. Bibcode:2006Natur.440.1045Z. doi:10.1038/nature04689. PMC 2610434. PMID 16625196.
  6. PSORT II entry on C17orf98 https://psort.hgc.jp/form2.html
  7. NCBI Conserved Domains entry on C17orf98
  8. EMBL-EBI SAPS entry on C17orf98
  9. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2 May 2018.
  10. "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2 May 2018.
  11. Human Protein Atlas entry on C17orf98
  12. Genomatix ElDorado entry on C17orf98
  13. TissGDB entry on C17orf98
  14. AceView entry on C17orf98
  15. ClustalW entry on C17orf98 5' UTR
  16. NCBI BLAST entry on C17orf98 5' UTR https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome
  17. Genomatix ElDorado entry on C17orf98 [permanent dead link]
  18. Database, GeneCards Human Gene. "C17orf98 Gene - GeneCards - CQ098 Protein - CQ098 Antibody". www.genecards.org. Retrieved 2 May 2018.
  19. "Genomatix ElDorado entry on C17orf98". [permanent dead link]
  20. NCBI Genome Data Viewer
  21. TargetScan entry on C17orf98 http://www.targetscan.org/cgibin/targetscan/vert_71/view_gene.cgi?rs=ENST00000398575.4&taxid=9606&showcnc=0&shownc=0&shownc_nc=&showncf1=&showncf2=&subset=1%5B%5D
  22. PaxDb entry on C17orf98
  23. "mFold entry on C17orf98 5' UTR". [permanent dead link]
  24. ExPASy pI/Mw entry on C17orf98 https://web.expasy.org/cgi-bin/compute_pi/pi_tool
  25. PSORT II entry on C17orf98 [permanent dead link]
  26. BioCuckoo GPS entry on C17orf98 http://gps.biocu%5B%5D
  27. GPS-SUMO entry on C17orf98
  28. YinOYang entry on C17orf98 http://www.cbs.dtu.dk/services/YinOYang/
  29. Hanover, John A.; Krause, Michael W.; Love, Dona C. (2010). "The Hexosamine Signaling Pathway: O-GlcNAc cycling in feast or famine". Biochimica et Biophysica Acta (BBA) - General Subjects. 1800 (2): 80–95. doi:10.1016/j.bbagen.2009.07.017. PMC 2815088. PMID 19647043.
  30. Eukaryotic Linear Motif search on C17orf98 amino acid sequence
  31. Esnault S, Braun RK, Shen ZJ, Xiang Z, Heninger E, Love RB, Sandor M, Malter JS (February 2007). "Pin1 modulates the type 1 immune response". PLOS ONE. 2 (2): e226. Bibcode:2007PLoSO...2..226E. doi:10.1371/journal.pone.0000226. PMC 1790862. PMID 17311089.
  32. BioGRID entry on C17orf98
  33. MINT entry on C17orf98
  34. STRING entry on C17orf98
  35. PSICQUIC View entry on C17orf98
  36. PaxDb entry on C17orf98 https://pax-db.org/protein/1858623#
  37. "Microarray Data :: Allen Brain Atlas: Human Brain". human.brain-map.org. Retrieved 2018-05-06.
  38. Human Protein Atlas (Sigma) entry on C17orf98 https://www.proteinatlas.org/ENSG00000275489-C17orf98/cell
  39. Sigma-Aldrich entry on C17orf98 https://www.sigmaaldrich.com/catalog/product/sigma/hpa051696?lang=en&region=US
  40. PSORT II entry on C17orf98 amino acid sequence https://psort.hgc.jp/form2.html
  41. PSORT II entry on C17orf98 https://psort.hgc.jp/cgi-bin/runpsort.pl
  42. DTU Bioinformatics entry on C17orf98
  43. ExPASy SOSUI entry on C17orf98
  44. Human Protein Atlas entry on C17orf98
  45. NCBI UniGene entry on C17orf98 www.ncbi.nlm.nih.gov/UniGene/clust.cgi?UGID=169593&TAXID=9606&SEARCH=c17orf98
  46. "BioGPS entry on C17orf98".
  47. Human Protein Atlas (Sigma) entry on C17orf98 https://www.proteinatlas.org/ENSG00000275489-C17orf98/cell
  48. NCBI GEO Profiles entry on C17orf98 https://www.ncbi.nlm.nih.gov/geoprofiles
  49. PhosphoSite entry on C17orf98 https://www.phosphosite.org/proteinAction.action?id=5156341&showAllSites=true
  50. "C17orf98 - Teratozoospermia (HG-U133 2.0)".
  51. "NCBI GEO Profiles entry on C17orf98".
  52. "The Molecular Clock and Estimating Species Divergence - Learn Science at Scitable". www.nature.com. Retrieved 2 May 2018.
  53. BLAST entry on C17orf98 https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins