C1orf198

Identifiers
Aliases: C1orf198, chromosome 1 open reading frame 198
External IDs: MGI: 1916801; HomoloGene: 13120; GeneCards: C1orf198
Orthologs (human; mouse)
RefSeq (mRNA): human NM_032800, NM_001136494, NM_001136495; mouse NM_175149
RefSeq (protein): human NP_001129966, NP_001129967, NP_116189, NP_116189.1; mouse NP_780358
Location (UCSC): human Chr 1: 230.84 – 230.87 Mb; mouse Chr 8: 125.36 – 125.39 Mb
PubMed search: human [3]; mouse [4]

Chromosome 1 open reading frame 198 (C1orf198) is a protein that in humans is encoded by the C1orf198 gene. [5] The gene has no paralogs in Homo sapiens, but many orthologs have been found throughout the domain Eukarya. [6] C1orf198 is highly expressed in all tissues of the human body, with the highest expression in lung, brain, and spinal cord tissues. It is most likely involved in lung development and in hypoxia-associated events in the mitochondria, which are major consumers of oxygen in cells and are severely affected by decreases in available cellular oxygen.


Gene

Location

C1orf198 is a protein-coding gene located on the reverse strand of chromosome 1 at locus 1q42. The longest mRNA transcript comprises 3,778 base pairs, and the gene spans positions 230,837,119 to 230,869,589 on chromosome 1. [7] From the start of transcription to the polyA site, including introns, the gene spans 32,470 bp. In total, C1orf198 has 4 exons. The encoded protein contains a domain of unknown function called DUF4706.
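
The 32,470 bp figure follows from the two coordinates quoted above. A minimal Python sketch of that arithmetic is shown here; the constant and function names are illustrative, and treating the 3,778 bp transcript as purely exonic sequence is a simplifying assumption.

```python
# Coordinates and transcript length as quoted in the text above.
GENE_START = 230_837_119   # first quoted position on chromosome 1
GENE_END = 230_869_589     # last quoted position on chromosome 1
MRNA_LENGTH = 3_778        # longest mRNA transcript, in base pairs


def span_length(start: int, end: int) -> int:
    """Return the distance in base pairs between two coordinates."""
    return end - start


gene_span = span_length(GENE_START, GENE_END)

# Assumption: the mature transcript is treated as exonic sequence only,
# so the remainder of the span is attributed to introns.
exonic_fraction = MRNA_LENGTH / gene_span

print(f"Gene span: {gene_span:,} bp")
print(f"Approximate exonic fraction: {exonic_fraction:.1%}")
```

Under that assumption, a little under 12% of the gene span is represented in the longest transcript, with introns accounting for the rest.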

Figure: The location of C1orf198 on chromosome 1.

Expression

Tissue distribution

RNA-seq tissue data revealed high expression of C1orf198 across all tissues, with especially high expression in lung, heart, spinal cord, and brain tissues. [8] Expression from RNA-seq assays is reported as mean TPM (transcripts per million), that is, the mean of the values from the individual samples of each tissue. Transcription profiling by high-throughput sequencing revealed similar patterns of expression. [9]
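
For readers unfamiliar with the unit, the TPM normalization and the per-tissue mean can be sketched as follows. This is a generic illustration, not the pipeline used by the cited resources; the read counts and the gene names `GENE_B` and `GENE_C` are hypothetical, and only the 3.778 kb transcript length comes from this article.

```python
from statistics import mean


def tpm(counts: dict[str, float], lengths_kb: dict[str, float]) -> dict[str, float]:
    """Convert raw read counts to transcripts per million (TPM)."""
    # Step 1: reads per kilobase, correcting for transcript length.
    rpk = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    # Step 2: rescale so the values of one sample sum to one million.
    scale = sum(rpk.values()) / 1_000_000
    return {gene: value / scale for gene, value in rpk.items()}


# Hypothetical toy data: two samples from one tissue, three genes.
lengths_kb = {"C1orf198": 3.778, "GENE_B": 2.0, "GENE_C": 1.0}
samples = [
    {"C1orf198": 400, "GENE_B": 100, "GENE_C": 50},
    {"C1orf198": 300, "GENE_B": 120, "GENE_C": 80},
]

per_sample = [tpm(sample, lengths_kb) for sample in samples]

# Mean TPM per gene across the individual samples of the tissue.
mean_tpm = {gene: mean(p[gene] for p in per_sample) for gene in lengths_kb}

for gene, value in mean_tpm.items():
    print(f"{gene}: {value:,.0f} mean TPM")
```

Because each sample is rescaled to the same total, TPM values can be averaged across samples of a tissue, which is what a mean TPM summarizes.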

Figure: Gene expression of C1orf198 in human tissues.

Conditional expression

A comparison of far-upstream element binding protein (FBP) knockdowns revealed differential expression of C1orf198. [10] Compared with FBP1 and FBP3 knockdowns, FBP2 knockdown had a significant impact on the expression of C1orf198: it was associated with a decrease in C1orf198 expression relative to cells with normal FBP2 expression.

Regulation

Promoter


Figure: A simple diagram of C1orf198, showing the exons, introns, and promoter.

Genomatix predicted several promoters; the best prediction was a 1,223 bp promoter that overlaps exon 1 of C1orf198 by 82 bp. [11] This promoter, GXP_127773, was conserved in all 15 orthologs found by Genomatix.

Transcription factor binding sites

Many transcription factor (TF) binding sites have been predicted for C1orf198. Among the more notable predicted elements and factors are XCPE1, HIF, LKLF, ETS1, and USF. XCPE1 (X core promoter element 1) is a core promoter element that drives RNA polymerase II transcription from poorly characterized TATA-less genes in the human genome. [12] It is found in the core promoter regions of approximately 1% of human genes. XCPE1 is located between nucleotides -8 and +2 relative to the start of transcription (+1). Because the predicted site has a matrix score of 0.83, contains the correct consensus sequence, and lies at the correct position in the promoter, it is likely to be a functional element of this promoter.

HIF (hypoxia-inducible factor) is a transcription factor that responds to decreases in available oxygen in the cellular environment. [13] It functions as a master regulator of the cellular and systemic homeostatic response to hypoxia by activating transcription of many genes. HIF-1 is known to induce transcription of genes involved in energy metabolism, angiogenesis, and apoptosis, as well as other genes whose protein products increase oxygen delivery or facilitate metabolic adaptation to hypoxia.

LKLF (lung Krüppel-like factor) is a transcription factor that is highly expressed in adult mouse lungs and is thought to play a role in lung development. [14] Overexpression of LKLF in lung epithelial cells increases cytosolic phospholipase A2, which has been shown to be a cause of tumorigenesis in non-small-cell lung cancer. [15]

E26 transformation-specific (ETS) proto-oncogene 1 (ETS1) functions as an oncogene and plays a key role in the progression of certain cancers. [16] Expression of ETS1 was increased in cancer tissues compared with corresponding non-neoplastic tissues.

Finally, USF is an upstream stimulating factor, which is involved in mediating recruitment of chromatin remodelling enzymes and interacting with co-activators and members of the transcription pre-initiation complex. [17]

Protein

C1orf198’s longest isoform has a sequence length of 327 amino acids.  The entire sequence is as follows:

MASMAAAIAASRSAVMSGNRPLDDRERKRFTYFSSLSPMARKIMQDKEKIREKYGPEWARLPPAQQDEII

DRCLVGPRAPAPRDPGDSEELTRFPGLRGPTGQKVVRFGDEDLTWQDEHSAPFSWETKSQMEFSISALSI

QEPSNGTAASEPRPLSKASQGSQALKSSQGSRSSSLDALGPTRKEEEASFWKINAERSRGEGPEAEFQSL

TPSQIKSMEKGEKVLPPCYRQEPAPKDREAKVERPSTLRQEQRPLPNVSTERERPQPVQAFSSALHEAAP

SQLEGKLPSPDVRQDDGEDTLFSEPKFAQVSSSNVVLKTGFDFLDNW

The full-length protein has a theoretical molecular weight of 36.346 kDa and an isoelectric point of 5.6. [18]
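
One common way to estimate a theoretical molecular weight is to sum the average residue masses and add one water molecule for the free termini. The sketch below applies that approach to the sequence given above. It is an illustration under that assumption, not necessarily the method used by the cited tool; the mass table holds standard average residue masses, and the names are illustrative.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water molecule for the N- and C-terminal groups

# Longest C1orf198 isoform, copied from the sequence listed above.
SEQUENCE = (
    "MASMAAAIAASRSAVMSGNRPLDDRERKRFTYFSSLSPMARKIMQDKEKIREKYGPEWARLPPAQQDEII"
    "DRCLVGPRAPAPRDPGDSEELTRFPGLRGPTGQKVVRFGDEDLTWQDEHSAPFSWETKSQMEFSISALSI"
    "QEPSNGTAASEPRPLSKASQGSQALKSSQGSRSSSLDALGPTRKEEEASFWKINAERSRGEGPEAEFQSL"
    "TPSQIKSMEKGEKVLPPCYRQEPAPKDREAKVERPSTLRQEQRPLPNVSTERERPQPVQAFSSALHEAAP"
    "SQLEGKLPSPDVRQDDGEDTLFSEPKFAQVSSSNVVLKTGFDFLDNW"
)


def molecular_weight(seq: str) -> float:
    """Return the theoretical average molecular weight of a peptide in Da."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


mw_kda = molecular_weight(SEQUENCE) / 1000
print(f"Length: {len(SEQUENCE)} aa")
print(f"Theoretical molecular weight: {mw_kda:.1f} kDa")
```

Applied to the 327-residue sequence, this estimate lands close to the 36.3 kDa quoted above. It ignores post-translational modifications, which would shift the real mass.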

Isoforms

Three different isoforms of C1orf198 have been found. The longest isoform contains 327 amino acids and has a molecular mass of 36.3 kDa. The second isoform is 289 amino acids long. The third and last known isoform is 197 amino acids long and also lacks DUF4706.

Figure: The amino acid composition of C1orf198.

Amino acid composition

Serine, glutamic acid, proline, alanine, and arginine are the most abundant amino acids in C1orf198, and histidine is among the least abundant. Relative to the average human protein, C1orf198 is serine-rich, proline-rich, and tyrosine-poor. [19]
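
The raw composition can be tallied directly from the sequence of the longest isoform. The following is a minimal sketch using Python's standard library; it only counts residues and does not reproduce the comparison with the average human protein, which needs a reference data set.

```python
from collections import Counter

# Longest C1orf198 isoform, copied from the sequence listed in the Protein section.
SEQUENCE = (
    "MASMAAAIAASRSAVMSGNRPLDDRERKRFTYFSSLSPMARKIMQDKEKIREKYGPEWARLPPAQQDEII"
    "DRCLVGPRAPAPRDPGDSEELTRFPGLRGPTGQKVVRFGDEDLTWQDEHSAPFSWETKSQMEFSISALSI"
    "QEPSNGTAASEPRPLSKASQGSQALKSSQGSRSSSLDALGPTRKEEEASFWKINAERSRGEGPEAEFQSL"
    "TPSQIKSMEKGEKVLPPCYRQEPAPKDREAKVERPSTLRQEQRPLPNVSTERERPQPVQAFSSALHEAAP"
    "SQLEGKLPSPDVRQDDGEDTLFSEPKFAQVSSSNVVLKTGFDFLDNW"
)

# Count each one-letter residue code.
composition = Counter(SEQUENCE)

# Report residues from most to least abundant, with their percentage share.
for residue, count in composition.most_common():
    share = 100 * count / len(SEQUENCE)
    print(f"{residue}: {count:3d} ({share:4.1f}%)")

top_five = {residue for residue, _ in composition.most_common(5)}
```

On this sequence the five most frequent one-letter codes are S, E, P, A, and R, matching the residues named above.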

Domain

The protein includes a domain of unknown function, DUF4706, which is 101 amino acids long and spans amino acids 31 to 131 of C1orf198. The domain has a predicted molecular weight of 11.6 kDa and an isoelectric point of 5.41. [20]
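
Residue positions such as 31 to 131 are 1-based and inclusive, which is easy to get wrong when slicing a sequence in code. The sketch below extracts the DUF4706 region under that convention; the helper name is illustrative, and only the first 140 residues of the longest isoform are included because the domain lies entirely within them.

```python
# First 140 residues of the longest C1orf198 isoform (N-terminal portion only).
N_TERMINAL_140 = (
    "MASMAAAIAASRSAVMSGNRPLDDRERKRFTYFSSLSPMARKIMQDKEKIREKYGPEWARLPPAQQDEII"
    "DRCLVGPRAPAPRDPGDSEELTRFPGLRGPTGQKVVRFGDEDLTWQDEHSAPFSWETKSQMEFSISALSI"
)


def extract_region(seq: str, start: int, end: int) -> str:
    """Return residues start..end, using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return seq[start - 1:end]


duf4706 = extract_region(N_TERMINAL_140, 31, 131)
print(f"DUF4706 length: {len(duf4706)} aa")
print(duf4706)
```

With inclusive coordinates the region covers 131 - 31 + 1 = 101 residues.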

Post-translational modifications

The post-translational modifications (PTMs) found in C1orf198 include phosphorylation, SUMOylation, and O-linked β-N-acetylglucosamine (O-GlcNAc) modification. Whereas phosphorylation is the most common PTM and is found in all protein types, O-GlcNAc is a regulatory PTM of nuclear and cytosolic proteins. [21]

Subcellular location

C1orf198 is predicted to be targeted to the cytoplasm, mitochondria, and nucleus. [22] The most highly supported subcellular location is the cytoplasm, with many bioinformatics tools reporting it as the sole location. Both immunohistochemistry and immunofluorescent staining of human cells showed strong cytoplasmic positivity. [23] However, a mitochondrial targeting peptide was predicted in C1orf198, suggesting that it is directed to the mitochondria in some situations. [24]

Interactions

Multiple protein interactions with C1orf198 were found using text mining. One interaction involves SART1, which is also known as hypoxia-associated factor. SART1 is known to play a role in mRNA splicing and appears to play a role in hypoxia-induced regulation of EPO gene expression. [25] Another protein that interacts with C1orf198 is TOMM20, a mitochondrial import receptor subunit. TOMM20 is responsible for the recognition and translocation of cytosolically synthesized mitochondrial preproteins. [26]

Evolution

Paralogs

There are no known paralogs of C1orf198. [27]

Homologs

As seen in the table below, homologs of C1orf198 can be traced back to insects, which diverged from humans approximately 797 million years ago. [27]

Species | Estimated date of divergence from humans (MYA) [28] | Identity | Similarity | Amino acid sequence length | Reference sequence
Homo sapiens (Human) | 0 | 100% | 100% | 327 | NP_116189
Delphinapterus leucas (Beluga Whale) | 96 | 81% | 86% | 317 | XP_022408830.1
Hipposideros armiger (Great Roundleaf Bat) | 96 | 79% | 85% | 317 | XP_019521397.1
Erinaceus europaeus (European Hedgehog) | 96 | 76% | 82% | 333 | XP_007538428.1
Phascolarctos cinereus (Koala) | 159 | 65% | 76% | 333 | XP_020856095.1
Parus major (Great Tit) | 312 | 59% | 72% | 335 | XP_015478640.1
Numida meleagris (Helmeted Guineafowl) | 312 | 59% | 71% | 335 | XP_021245723.1
Gallus gallus (Chicken) | 312 | 59% | 70% | 334 | XP_015139870.1
Pogona vitticeps (Bearded Dragon) | 312 | 58% | 69% | 333 | XP_020656857.1
Notechis scutatus (Tiger Snake) | 312 | 57% | 69% | 333 | XP_026525262.1
Gekko japonicus (Japanese Gecko) | 312 | 57% | 69% | 330 | XP_015284731.1
Xenopus tropicalis (Tropical Clawed Frog) | 352 | 47% | 68% | 350 | XP_002942404.1
Monopterus albus (Asian Swamp Eel) | 435 | 42% | 56% | 360 | XP_020471043.1
Anabas testudineus (Climbing Perch) | 435 | 42% | 56% | 352 | XP_026197678.1
Danio rerio (Zebrafish) | 435 | 41% | 54% | 330 | NP_001188382.1
Callorhinchus milii (Elephant Shark) | 473 | 48% | 60% | 349 | XP_007896578.1
Helicoverpa armigera (Cotton Bollworm) | 797 | 28% | 40% | 284 | XP_021198534.1
Copidosoma floridanum (Wasp) | 797 | 25% | 41% | 297 | XP_014207188.1
Chilo suppressalis (Asiatic Rice Borer) | 797 | 24% | 40% | 280 | RVE51599.1
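
The trend in the table can be summarized by averaging percent identity for each divergence time. The sketch below does this with the identity values listed above; the function name is illustrative, and a plain per-group mean is only one simple way to look at the data.

```python
from collections import defaultdict
from statistics import mean

# (species, divergence from humans in MYA, percent identity), from the table above.
HOMOLOGS = [
    ("Homo sapiens", 0, 100),
    ("Delphinapterus leucas", 96, 81),
    ("Hipposideros armiger", 96, 79),
    ("Erinaceus europaeus", 96, 76),
    ("Phascolarctos cinereus", 159, 65),
    ("Parus major", 312, 59),
    ("Numida meleagris", 312, 59),
    ("Gallus gallus", 312, 59),
    ("Pogona vitticeps", 312, 58),
    ("Notechis scutatus", 312, 57),
    ("Gekko japonicus", 312, 57),
    ("Xenopus tropicalis", 352, 47),
    ("Monopterus albus", 435, 42),
    ("Anabas testudineus", 435, 42),
    ("Danio rerio", 435, 41),
    ("Callorhinchus milii", 473, 48),
    ("Helicoverpa armigera", 797, 28),
    ("Copidosoma floridanum", 797, 25),
    ("Chilo suppressalis", 797, 24),
]


def mean_identity_by_divergence(rows):
    """Group rows by divergence time and average their percent identity."""
    groups = defaultdict(list)
    for _species, mya, identity in rows:
        groups[mya].append(identity)
    # Sort by divergence time so the trend reads from recent to ancient.
    return {mya: mean(values) for mya, values in sorted(groups.items())}


summary = mean_identity_by_divergence(HOMOLOGS)
for mya, identity in summary.items():
    print(f"{mya:>4} MYA: {identity:5.1f}% mean identity")
```

Identity generally falls as divergence time increases, although the elephant shark (473 MYA, 48%) is more similar to the human protein than the bony fishes listed at 435 MYA.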

Homologous domains

The domain of unknown function 4706 (DUF4706) was highly conserved in most orthologs. [29]

Function and biochemistry

C1orf198 is most likely involved in lung development and in hypoxia-associated events in the mitochondria, which are major consumers of oxygen in cells and are severely affected by decreases in available cellular oxygen. This is supported by several findings. First, the promoter is predicted to bind the transcription factor LKLF, which is involved in embryonic lung development and can cause lung cancer if overexpressed. Second, the protein product interacts with SART1, also known as hypoxia-associated factor, which appears to play a role in hypoxia-induced regulation of EPO gene expression. Third, the predicted HIF binding site, the predicted mitochondrial targeting peptide, and the interaction with the mitochondrial import receptor subunit TOMM20 are consistent with a role in the mitochondrial response to hypoxia.

Clinical significance

C1orf198 has been associated with several diseases and disorders, even though the function of the gene is not yet well understood. For example, it was identified as a novel gene in colon, gastric, and pancreatic cancer, and specifically as a positive impact factor in gastric cancer. [30] Additionally, microarray analysis identified C1orf198 as a differentially expressed gene (DEG) between lung squamous cell carcinoma (SCC) and normal controls. Down-regulation of C1orf198 was correlated with lung SCC, although C1orf198 was not among the top DEGs found in the study. [31] A third association is the up-regulation of C1orf198 in MCF-7 cells, a human breast cancer cell line, treated with ginsenoside Rh2. In Rh2-treated cells, the C1orf198 gene was hypomethylated, which suggested that its function could be involved in cell-mediated immune responses and cancer-related pathways. The results of this study showed a higher survival rate associated with up-regulation of C1orf198. [32]


References

  1. GRCh38: Ensembl release 89: ENSG00000119280 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000031983 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "C1orf198 chromosome 1 open reading frame 198 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-02-28.
  6. "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2019-02-28.
  7. "C1orf198 Gene - GeneCards | CA198 Protein | CA198 Antibody". www.genecards.org. Retrieved 2019-02-28.
  8. S. Navani, The human protein atlas. J. Obstet. Gynecol. India. 61 (2011), pp. 27–31.
  9. NCBI, NCBI Gene. Gene Cat. (2016), doi:10.1016/B978-0-444-52898-8.00003-3
  10. Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Holko M, Yefanov A, Lee H, Zhang N, Robertson CL, Serova N, Davis S, Soboleva A (January 2013). "NCBI GEO: archive for functional genomics data sets--update". Nucleic Acids Research. 41 (Database issue): D991–5. doi:10.1093/nar/gks1193. PMC 3531084. PMID 23193258.
  11. Cartharius K, Frech K, Grote K, Klocke B, Haltmeier M, Klingenhoff A, Frisch M, Bayerlein M, Werner T (July 2005). "MatInspector and beyond: promoter analysis based on transcription factor binding sites". Bioinformatics. 21 (13): 2933–42. doi:10.1093/bioinformatics/bti473. PMID 15860560.
  12. Tokusumi Y, Ma Y, Song X, Jacobson RH, Takada S (March 2007). "The new core promoter element XCPE1 (X Core Promoter Element 1) directs activator-, mediator-, and TATA-binding protein-dependent but TFIID-independent RNA polymerase II transcription from TATA-less promoters". Molecular and Cellular Biology. 27 (5): 1844–58. doi:10.1128/MCB.01363-06. PMC 1820453. PMID 17210644.
  13. G. L. Semenza, in Cambridge University Press (2007), vol. 9780521853767, pp. 246–255.
  14. M. A. Wani, S. E. Wert, J. B. Lingrel, Lung Kruppel-like factor, a zinc finger transcription factor, is essential for normal lung development. J. Biol. Chem. 274, 21180–21185 (1999).
  15. M. J. Wick, S. Blaine, V. Van Putten, M. Saavedra, R. A. Nemenoff, Lung Krüppel-like factor (LKLF) is a transcriptional activator of the cytosolic phospholipase A2α promoter. Biochem. J. 387, 239–246 (2005).
  16. X. Liu et al., E26 Transformation-Specific Transcription Factor ETS2 as an Oncogene Promotes the Progression of Hypopharyngeal Cancer. Cancer Biother. Radiopharm. 32, 327–334 (2017).
  17. Corre S, Galibert MD (October 2005). "Upstream stimulating factors: highly versatile stress-responsive transcription factors". Pigment Cell Research. 18 (5): 337–48. doi:10.1111/j.1600-0749.2005.00262.x. PMID 16162174.
  18. S. Chojnacki, A. Cowley, J. Lee, A. Foix, R. Lopez, Programmatic access to bioinformatics tools from EMBL-EBI update: 2017. Nucleic Acids Res. 45, W550–W553 (2017).
  19. B. Rost, J. Liu, The PredictProtein server. Nucleic Acids Res. 31, 3300–3304 (2003).
  20. Chojnacki S, Cowley A, Lee J, Foix A, Lopez R (July 2017). "Programmatic access to bioinformatics tools from EMBL-EBI update: 2017". Nucleic Acids Research. 45 (W1): W550–W553. doi:10.1093/nar/gkx273. PMC 5570243. PMID 28431173.
  21. Vaidyanathan K, Durning S, Wells L (2014). "Functional O-GlcNAc modifications: implications in molecular regulation and pathophysiology". Critical Reviews in Biochemistry and Molecular Biology. 49 (2): 140–163. doi:10.3109/10409238.2014.884535. PMC 4912837. PMID 24524620.
  22. K. Nakai, P. Horton, PSORT: A program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem. Sci. 24 (1999), pp. 34–35.
  23. SIGMA-ALDRICH. Anal. Chem. 65, 868A–868A (2012).
  24. Emanuelsson O, Nielsen H, Brunak S, von Heijne G (July 2000). "Predicting subcellular localization of proteins based on their N-terminal amino acid sequence". Journal of Molecular Biology. 300 (4): 1005–16. doi:10.1006/jmbi.2000.3903. PMID 10891285.
  25. Calderone A, Castagnoli L, Cesareni G (August 2013). "mentha: a resource for browsing integrated protein-interaction networks". Nature Methods. 10 (8): 690–1. doi:10.1038/nmeth.2561. PMID 23900247. S2CID 9733108.
  26. Orchard S, Ammari M, Aranda B, Breuza L, Briganti L, Broackes-Carter F, Campbell NH, Chavali G, Chen C, del-Toro N, Duesbury M, Dumousseau M, Galeota E, Hinz U, Iannuccelli M, Jagannathan S, Jimenez R, Khadake J, Lagreid A, Licata L, Lovering RC, Meldal B, Melidoni AN, Milagros M, Peluso D, Perfetto L, Porras P, Raghunath A, Ricard-Blum S, Roechert B, Stutz A, Tognolli M, van Roey K, Cesareni G, Hermjakob H (January 2014). "The MIntAct project--IntAct as a common curation platform for 11 molecular interaction databases". Nucleic Acids Research. 42 (Database issue): D358–63. doi:10.1093/nar/gkt1115. PMC 3965093. PMID 24234451.
  27. BLAST, Nucleotide BLAST: Search nucleotide databases using a nucleotide query. Basic Local Alignment Search Tool (2009), (available at https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome%0Ahttp://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&BLAST_PROGRAMS=megaBlast&PAGE_TYPE=BlastSearch&SHOW_DEFAULTS=on&LINK_LOC=blasthome ).
  28. Morrison DA (August 2009). "The Timetree of Life". Systematic Biology. 58 (4): 461–2. doi:10.1093/sysbio/syp042.
  29. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG (November 2007). "Clustal W and Clustal X version 2.0". Bioinformatics. 23 (21): 2947–8. doi:10.1093/bioinformatics/btm404. PMID 17846036.
  30. Wang Z, Chen G, Wang Q, Lu W, Xu M (September 2017). "Identification and validation of a prognostic 9-genes expression signature for gastric cancer". Oncotarget. 8 (43): 73826–73836. doi:10.18632/oncotarget.17764. PMC 5650304. PMID 29088749.
  31. Zhang F, Chen X, Wei K, Liu D, Xu X, Zhang X, Shi H (January 2017). "Identification of Key Transcription Factors Associated with Lung Squamous Cell Carcinoma". Medical Science Monitor. 23: 172–206. doi:10.12659/MSM.898297. PMC 5248564. PMID 28081052.
  32. Lee H, Lee S, Jeong D, Kim SJ (October 2018). "Ginsenoside Rh2 epigenetically regulates cell-mediated immune pathway to inhibit proliferation of MCF-7 breast cancer cells". Journal of Ginseng Research. 42 (4): 455–462. doi:10.1016/j.jgr.2017.05.003. PMC 6187096. PMID 30337805.