CLIP4

CLIP4 protein predicted tertiary structure. Taken from PhyreRisk.
Identifiers
Aliases: CLIP4, RSNL2, CAP-Gly domain containing linker protein family member 4
External IDs: MGI: 1919100; HomoloGene: 11662; GeneCards: CLIP4; OMA: CLIP4 (orthologs)

Orthologs

| | Human | Mouse |
|---|---|---|
| RefSeq (mRNA) | NM_001287527, NM_001287528, NM_024692 | NM_001271483, NM_001271484, NM_030179, NM_175378 |
| RefSeq (protein) | NP_001274456, NP_001274457, NP_078968 | NP_001258412, NP_001258413, NP_084455, NP_780587 |
| Location (UCSC) | Chr 2: 29.1 – 29.2 Mb | Chr 17: 72.08 – 72.17 Mb |
| PubMed search | [4] | [5] |

CAP-Gly domain containing linker protein family member 4 is a protein that in humans is encoded by the CLIP4 gene. [6] In terms of conserved domains, the CLIP4 protein consists primarily of ankyrin repeats and the eponymous CAP-Gly domains. [6] The predicted secondary structure of CLIP4 is largely random coil, with alpha helices making up most of the remainder. [7] CLIP4 mRNA is expressed most strongly in the adrenal cortex and atrioventricular node. [8] The literature on CLIP4's conserved domains and paralogs points toward microtubule regulation as a possible function of CLIP4.

Gene

The human CLIP4 gene, also known as Restin-Like Protein 2 (RSNL2), [9] is located on the plus strand of the short (p) arm of chromosome 2 at region 2, band 3 (2p23), [9] spanning base pairs 29,096,676 to 29,189,643. CLIP4 is 92,968 base pairs in length and consists of 23 exons. [9]
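
The stated gene length follows from the two coordinates if both endpoints are counted. The short Python sketch below reproduces that arithmetic; it assumes the coordinates are 1-based and inclusive, and the function name is illustrative rather than taken from any cited tool.

```python
def span_length(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1

# Coordinates quoted in the text for human CLIP4 on chromosome 2.
clip4_length = span_length(29_096_676, 29_189_643)
print(clip4_length)  # 92968
```

The result matches the 92,968 base pairs given for the gene.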

Transcript

Transcript variants

| Transcript | mRNA size (nucleotides) |
|---|---|
| CLIP4 transcript variant 1 [10] | 4299 |
| CLIP4 transcript variant 2 [11] | 4295 |
| CLIP4 transcript variant 3 [12] | 2353 |

Protein

The human CLIP4 protein is 705 amino acids in length and is composed of two main types of conserved domains: two CAP-Gly domains and numerous ankyrin repeats. [9] The predicted secondary structure of CLIP4 consists largely of random coil, with alpha helices as the second-most abundant structure and beta sheets as the third-most abundant. [7]

The isoelectric point of the unprocessed CLIP4 protein is slightly basic (pI 8.62), meaning there is a slight excess of basic amino acids over acidic amino acids. [13] The molecular weight is about 65 kDa. [13] The most abundant amino acid in CLIP4 is serine, which makes up 10.7% of the protein. [14] Aligned matching blocks of separated, tandem, and periodic repeats are found between positions 340-345 and 542-547, as well as 447-547 and 564-568. [14] An unusual periodic element with a period of nine residues, a single lysine followed by eight other amino acids, occurs five times within the protein when compared to the swp23s.q dataset. [14] Another unusual periodic element, with a period of seven residues, consists of a negatively charged amino acid followed by six hydrophobic amino acids and occurs six times within the protein when compared to the same dataset. [14] There are two instances of serine spacing and two instances of phenylalanine spacing that span unusually large distances when compared to the swp23s.q dataset. [14]
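
A composition figure such as the serine percentage above is a residue count divided by the sequence length. The Python sketch below shows the calculation on a toy sequence; the sequence is invented for illustration and is not the CLIP4 sequence, and the function name is hypothetical rather than part of the SAPS tool.

```python
from collections import Counter

def composition(seq: str) -> dict[str, float]:
    """Percentage of each residue in a one-letter protein sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100 * n / len(seq) for aa, n in counts.items()}

# Toy sequence for illustration only; it is NOT the CLIP4 sequence.
toy = "MSSKDESLRS"
comp = composition(toy)
most_common = max(comp, key=comp.get)
print(most_common, round(comp[most_common], 1))  # S 40.0
```

Applied in reverse, 10.7% of a 705-residue protein corresponds to roughly 75 serine residues.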

Protein isoforms

| Isoform | Protein size (amino acids) |
|---|---|
| CLIP4 isoform 1 [15] | 705 |
| CLIP4 isoform 2 [16] | 599 |

Expression

CLIP4 RNA expression is consistently high in the thyroid. [6] High levels of transcription also occur in the adrenal cortex and atrioventricular node. [8] The Human Protein Atlas reports high RNA expression in muscle tissues, with some expression in the skin, endocrine tissues, and proximal digestive tract. [17] Protein expression is likewise greatest in muscle tissues, with some expression in the lung, gastrointestinal tract, liver and gallbladder, and bone marrow and lymphoid tissues. [17]

CLIP4 expression appears to be elevated under Ada3 deficiency. [18] There is also a trend toward higher CLIP4 expression in the absence of U28. [18]

Regulation

Gene

Common transcription factor binding sites

These transcription factors were chosen and organized based on proximity to the promoter and matrix similarity. [19]

| Transcription factor | Detailed matrix info | Anchor base | Matrix similarity | Sequence |
|---|---|---|---|---|
| NOLF | Early B-cell factor 1 | 17 | 0.98 | taagagTCCCcagggcagaaaca |
| PAX2 | Zebrafish PAX2 paired domain protein | 18 | 0.8 | aagagtccccagggcagAAACaa |
| AP2F | Transcription factor AP-2, alpha | 16 | 0.98 | ctgcCCTGgggactc |
| AP2F | Transcription factor AP-2, beta | 16 | 0.899 | gagTCCCcagggcag |
| SORY | SRY (sex-determining region Y) box 9, dimeric binding sites | 35 | 0.768 | aAACAaaatccagtgagggagag |
| HNF6 | CUT-homeodomain transcription factor Onecut-2 | 32 | 0.827 | aaacaaAATCcagtgag |
| PAX5 | B-cell-specific activator protein | 40 | 0.815 | acaaaaTCCAgtgagggagagatgcaggg |
| ZF16 | PR/SET domain 15 | 36 | 0.852 | aaatccagtgaGGGA |
| SORY | HMGI(Y) high-mobility-group protein I (Y), architectural transcription factor organizing the framework of a nuclear protein-DNA transcriptional complex | 78 | 0.945 | tggaAATTttctaccttaggagc |
| NFAT | Nuclear factor of activated T-cells 5 | 83 | 0.955 | ttttGGAAattttctacct |
| NFAT | Nuclear factor of activated T-cells 5 | 83 | 0.871 | aggtAGAAaatttccaaaa |
| CEBP | CCAAT/enhancer binding protein (C/EBP), epsilon | 89 | 0.975 | agccttttGGAAatt |
| CAAT | Cellular and viral CCAAT box | 110 | 0.91 | gcagCCATttaatct |
| CAAT | Avian C-type LTR CCAAT box | 165 | 0.875 | cccaCCAAgcagtgg |
| CEBP | CCAAT/enhancer binding protein (C/EBP), gamma | 650 | 0.866 | ctaaTTGCtcaacgt |
| CEBP | CCAAT/enhancer binding protein alpha | 651 | 0.971 | cacgttgaGCAAtta |
| VTBP | Mammalian C-type LTR TATA box | 680 | 0.903 | tgctgTAAAaggcctaa |
| TF2B | Transcription factor II B (TFIIB) recognition element | 983 | 1 | ccgCGCC |
| TF2B | Transcription factor II B (TFIIB) recognition element | 1157 | 1 | ccgCGCC |
| TF2B | Transcription factor II B (TFIIB) recognition element | 1228 | 1 | ccgCGCC |
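
Some rows in the table share an anchor base and list sequences that are reverse complements of each other, for example the two NFAT rows at anchor base 83 and the two AP2F rows at anchor base 16. This suggests that each pair is a single site reported on both strands, although the source does not say so explicitly. The Python sketch below checks the relationship; the helper function is illustrative and not part of the tool cited for the table.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, returned in lower case."""
    return seq.lower().translate(COMPLEMENT)[::-1]

# Paired hits copied from the table (upper case marks the matrix core).
pairs = {
    "NFAT, anchor 83": ("ttttGGAAattttctacct", "aggtAGAAaatttccaaaa"),
    "AP2F, anchor 16": ("ctgcCCTGgggactc", "gagTCCCcagggcag"),
}

results = {}
for label, (first, second) in pairs.items():
    # True when the second sequence is the reverse complement of the first.
    results[label] = reverse_complement(first) == second.lower()
    print(label, results[label])
```

Both comparisons print `True` for the sequences as listed.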


Transcriptional

The human CLIP4 mRNA sequence is predicted to have 12 stem-loop structures in its 5' UTR and 13 stem-loop structures in its 3' UTR. Of these secondary structures, all 12 in the 5' UTR and 1 in the 3' UTR are conserved. [20]

Protein

The human CLIP4 protein is predicted to localize to the nuclear membrane. [21] CLIP4 has no predicted signal peptide, consistent with its intracellular localization. [22] For the same reason, it is not expected to carry N-linked glycosylation. [23] CLIP4 is not predicted to be cleaved. [24] However, numerous O-linked glycosylation sites are predicted. [25] Predicted phosphorylation sites are densest between amino acid positions 400 and 599, although many are also present throughout the rest of the protein. [26]

Function

CAP-Gly domains are often associated with microtubule regulation. [27] In addition, ankyrin repeats are known to mediate protein-protein interactions. [28] Furthermore, CLIP1, a paralog of CLIP4 in humans, is known to bind to microtubules and regulate the microtubule cytoskeleton. [29] The CLIP4 protein is also predicted to interact with various microtubule-associated proteins. [30] As a result, it is likely that the CLIP4 protein, although uncharacterized, is associated with microtubule regulation.

Interacting proteins

The CLIP4 protein is predicted to interact with several microtubule-associated proteins, namely MAPRE1, MAPRE2, and MAPRE3. It is also predicted to interact with CKAP5, a cytoskeleton-associated protein, and DCTN1, a dynactin subunit. [30]

Clinical significance

Importance in various cancers

CLIP4 activity is correlated with the spread of renal cell carcinomas (RCCs) within the host and could therefore be a potential biomarker for RCC metastasis. [31] Additionally, measurement of CLIP4 promoter methylation using a Global Methylation DNA Index shows that higher methylation of CLIP4 is associated with increasing severity of gastric disease, from gastritis toward gastric cancer. [32] This indicates that CLIP4 could be used for early detection of gastric cancer. [33] A similar finding was documented for prostate cancer, in which CLIP4 was found to be hypermethylated in patients. [34]

Importance in other diseases

CLIP4 levels were found to be highly increased in samples with predicted severe fibrosis resulting from chronic hepatitis C virus (HCV) infection. [35] Additionally, the identification of CLIP4 as a novel self-antigen in systemic lupus erythematosus points to a potential role in the disease mechanism. [36]

Homology

CLIP4 orthologs

These orthologs were chosen and organized based on estimated date of divergence from the human protein as well as the global sequence identity. [37]

| Binomial nomenclature | Common name | Taxonomic group | Estimated date of divergence from human (MYA) | Accession number | Sequence length (aa) | Global sequence identity to human protein (%) | Global sequence similarity to human protein (%) |
|---|---|---|---|---|---|---|---|
| Homo sapiens (Hsa) | Human | Primate | 0 | AAP97312 | 601 | 100 | 100 |
| Aotus nancymaae (Ana) | Ma's night monkey | Primate | 43.2 | XP_012330895 | 704 | 83.5 | 83.7 |
| Sorex araneus (Sar) | Common shrew | Eulipotyphla | 96 | XP_004620056 | 707 | 74 | 78.5 |
| Antrostomus carolinensis (Aca) | Chuck-will's-widow | Aves | 312 | XP_028942997 | 702 | 66.5 | 75.4 |
| Gekko japonicus (Gja) | Schlegel's Japanese gecko | Reptilia | 312 | XP_015270366 | 702 | 63.8 | 73.1 |
| Rhinatrema bivittatum (Rbi) | Two-lined caecilian | Amphibians | 351.8 | XP_029448862 | 707 | 59.5 | 70.5 |
| Callorhinchus milii (Cmi) | Elephant shark | Chondrichthyes | 473 | XP_007895016 | 715 | 52.5 | 65.6 |
| Branchiostoma floridae (Bfl) | Florida lancelet | Leptocardii | 684 | XP_002606824 | 481 | 40.4 | 52.8 |
| Saccoglossus kowalevskii (Sko) | Acorn worm | Enteropneusta | 684 | XP_006822686 | 648 | 35.7 | 47.5 |
| Ixodes scapularis (Isc) | Black-legged tick | Arachnid | 797 | XP_029831090 | 527 | 38.9 | 53 |
| Limulus polyphemus (Lpo) | Atlantic horseshoe crab | Arachnid | 797 | XP_013786376 | 462 | 38 | 51.6 |
| Lottia gigantea (Lgi) | Owl limpet | Gastropods | 797 | XP_009046843 | 669 | 36.3 | 49.3 |
| Mizuhopecten yessoensis (Mye) | Yesso scallop | Bivalvia | 797 | XP_021359747 | 633 | 35.4 | 47.2 |
| Parasteatoda tepidariorum (Pte) | Common house spider | Arachnid | 797 | XP_015914966 | 616 | 34.7 | 47.6 |
| Aplysia californica (Aca) | California sea hare | Gastropods | 797 | XP_012945346 | 653 | 33.7 | 45.7 |
| Crassostrea virginica (Cvi) | Eastern oyster | Bivalvia | 797 | XP_022315879 | 646 | 32.7 | 45.1 |
| Tetranychus urticae (Tur) | Two-spotted spider mite | Arachnid | 797 | XP_015790536 | 652 | 31.9 | 43.5 |
| Centruroides sculpturatus (Csc) | Bark scorpion | Arachnid | 797 | XP_023229484 | 605 | 30.6 | 43.4 |
| Penaeus vannamei (Pva) | Pacific white shrimp | Malacostracans | 797 | XP_027206746 | 681 | 22.9 | 34 |
| Monosiga brevicollis (Mbr) | Choanoflagellate | Choanoflagellatea | 1023 | XP_001748580 | 576 | 25.3 | 40.8 |
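
The identity column reports how many aligned positions carry the same residue in the human protein and the ortholog. The Python sketch below shows one common way to compute such a figure from a gapped pairwise alignment. It assumes that the denominator is the full alignment length including gap columns; alignment tools differ on this convention, and the source does not state which one was used. The two aligned strings are invented for illustration and are not CLIP4 sequences.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Identical, non-gap columns as a percentage of all alignment columns."""
    if len(aln_a) != len(aln_b) or not aln_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    matches = sum(1 for x, y in zip(aln_a, aln_b) if x == y and x != "-")
    return 100 * matches / len(aln_a)

# Toy gapped alignment for illustration only; not taken from CLIP4.
human = "MKT-AYIAKQR"
other = "MKTSAYLA-QR"
identity = percent_identity(human, other)
print(round(identity, 1))  # 72.7
```

Here 8 of the 11 alignment columns are identical, giving 72.7%.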


References

  1. "PhyreRisk". phyrerisk.bc.ic.ac.uk. Retrieved 2020-05-03.
  2. GRCh38: Ensembl release 89: ENSG00000115295. Ensembl, May 2017.
  3. GRCm38: Ensembl release 89: ENSMUSG00000024059. Ensembl, May 2017.
  4. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  6. "CLIP4 CAP-Gly domain containing linker protein family member 4 [Homo sapiens (human)] - Gene - NCBI". ncbi.nlm.nih.gov. Retrieved 2020-05-01.
  7. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". biogem.org. Retrieved 2020-05-03.
  8. "BioGPS - your Gene Portal System". biogps.org. Retrieved 2020-05-01.
  9. "CLIP4 Gene - GeneCards | CLIP4 Protein | CLIP4 Antibody". genecards.org. Retrieved 2020-05-01.
  10. "Homo sapiens CAP-Gly domain containing linker protein family member 4 (CLIP4), transcript variant 1, mRNA". 2020-04-25.
  11. "Homo sapiens CAP-Gly domain containing linker protein family member 4 (CLIP4), transcript variant 2, mRNA". 2020-04-25.
  12. "Homo sapiens CAP-Gly domain containing linker protein family member 4 (CLIP4), transcript variant 3, mRNA". 2020-04-25.
  13. "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 2020-05-03.
  14. "SAPS < Sequence Statistics < EMBL-EBI". ebi.ac.uk. Retrieved 2020-05-03.
  15. "CAP-Gly domain-containing linker protein 4 isoform 1 [Homo sapiens] - Protein - NCBI". ncbi.nlm.nih.gov. Retrieved 2020-05-03.
  16. "CAP-Gly domain-containing linker protein 4 isoform 2 [Homo sapiens] - Protein - NCBI". ncbi.nlm.nih.gov. Retrieved 2020-05-03.
  17. "CLIP4 protein expression summary - The Human Protein Atlas". proteinatlas.org. Retrieved 2020-05-01.
  18. "CLIP4 - GEO Profiles - NCBI". ncbi.nlm.nih.gov. Retrieved 2020-05-01.
  19. "Sequence Utilities". bioline.com. Retrieved 2020-05-01.
  20. "RNA secondary structure prediction". genebee.msu.su. Archived from the original on 2021-09-17. Retrieved 2020-05-01.
  21. "PSORT II Prediction". psort.hgc.jp. Retrieved 2020-05-03.
  22. "SignalP-5.0". cbs.dtu.dk. Retrieved 2020-05-03.
  23. "NetNGlyc 1.0 Server". cbs.dtu.dk. Retrieved 2020-05-03.
  24. "ProP 1.0 Server". cbs.dtu.dk. Retrieved 2020-05-03.
  25. "NetOGlyc 4.0 Server". cbs.dtu.dk. Retrieved 2020-05-03.
  26. "NetPhos 3.1 Server". cbs.dtu.dk. Retrieved 2020-05-03.
  27. Weisbrich A, Honnappa S, Jaussi R, Okhrimenko O, Frey D, Jelesarov I, et al. (October 2007). "Structure-function relationship of CAP-Gly domains". Nature Structural & Molecular Biology. 14 (10): 959–67. doi:10.1038/nsmb1291. PMID 17828277. S2CID 37088265.
  28. Li J, Mahajan A, Tsai MD (December 2006). "Ankyrin repeat: a unique motif mediating protein-protein interactions". Biochemistry. 45 (51): 15168–78. doi:10.1021/bi062188q. PMID 17176038.
  29. "UniProtKB - P30622 (CLIP1_HUMAN)". UniProt. Archived from the original on 2010-12-16. Retrieved 2020-05-03.
  30. "CLIP4 protein (human) - STRING interaction network". string-db.org. Retrieved 2020-05-03.
  31. "CLIP4 - CAP-Gly domain-containing linker protein 4 - Homo sapiens (Human) - CLIP4 gene & protein". uniprot.org. Retrieved 2020-05-01.
  32. Pirini F, Noazin S, Jahuira-Arias MH, Rodriguez-Torres S, Friess L, Michailidi C, et al. (June 2017). "Early detection of gastric cancer using global, genome-wide and IRF4, ELMO1, CLIP4 and MSC DNA methylation in endoscopic biopsies". Oncotarget. 8 (24): 38501–38516. doi:10.18632/oncotarget.16258. PMC 5503549. PMID 28418867.
  33. Pirini F, Noazin S, Jahuira-Arias MH, Rodriguez-Torres S, Friess L, Michailidi C, et al. (June 2017). "Early detection of gastric cancer using global, genome-wide and IRF4, ELMO1, CLIP4 and MSC DNA methylation in endoscopic biopsies". Oncotarget. 8 (24): 38501–38516. doi:10.18632/oncotarget.16258. PMC 5503549. PMID 28418867.
  34. Kron K, Pethe V, Briollais L, Sadikovic B, Ozcelik H, Sunderji A, et al. (2009-03-13). "Discovery of novel hypermethylated genes in prostate cancer using genomic CpG island microarrays". PLOS ONE. 4 (3): e4830. Bibcode:2009PLoSO...4.4830K. doi:10.1371/journal.pone.0004830. PMC 2653233. PMID 19283074.
  35. Gehrau R, Mas V, Archer K, Maluf D (2012-06-06). "Biomarkers of disease differentiation: HCV recurrence versus acute cellular rejection". Fibrogenesis & Tissue Repair. 5 (Suppl 1): S11. doi:10.1186/1755-1536-5-S1-S11. PMC 3368799. PMID 23259646.
  36. "Barbara Dema". Discovery Medicine.
  37. "Nucleotide BLAST: Search nucleotide databases using a nucleotide query". blast.ncbi.nlm.nih.gov. Retrieved 2020-05-03.