KnetMiner

Content
Description: Knowledge Network Miner
Data types captured: Knowledge Discovery and Visualisation
Contact
Research center: Rothamsted Research
Laboratory: Bioinformatics
Primary citation: doi:10.1111/pbi.13583
Release date: October 2013
Access
Website: knetminer.org
Tools
Web: KnetMiner Gene View, KnetMiner Evidence View, KnetMiner Network View
Ondex-knet-builder data integration platform
Miscellaneous
License: MIT
Versioning: Yes
Data release frequency: Quarterly
Version: 4.0 (June 2020)
Curation policy: Manual Curation

Knowledge Network Miner [1] (KnetMiner) [2] [3] is a system of tools used to integrate, search, and visualize biological knowledge graphs (KGs). It is used to search for information across large biological databases and literature to find links between genes, traits, diseases, and other relevant information.

Current KnetMiners (non-exhaustive)

KnetMiner KGs are built using the data integration platform, KnetBuilder, with output available in OXL, Neo4j, and RDF graph formats. It follows FAIR data [4] principles and supports a variety of biological data formats. The KnetMiner API provides web endpoints that enable users to search for specific genes and keywords, returning results in the form of a knowledge graph.
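
As a rough illustration of what "finding links" in a knowledge graph means, the sketch below stores a toy graph as (source, relation, target) triples and uses a breadth-first search to find the shortest chain of nodes connecting a gene to a trait. Every node ID and relation name here is invented for the example, and breadth-first search is a generic stand-in, not a description of KnetMiner's own search or ranking method.

```python
from collections import deque

# Toy knowledge graph as (source, relation, target) triples.
# All identifiers and relation names are hypothetical, for illustration only.
EDGES = [
    ("gene:G1", "encodes", "protein:P1"),
    ("protein:P1", "participates_in", "pathway:PW1"),
    ("pathway:PW1", "associated_with", "trait:grain_size"),
    ("gene:G2", "mentioned_in", "publication:PUB1"),
    ("publication:PUB1", "mentions", "trait:grain_size"),
]


def build_adjacency(edges):
    """Index the triples so each node maps to its (relation, neighbour) pairs."""
    adjacency = {}
    for source, relation, target in edges:
        # Treat every relation as traversable in both directions.
        adjacency.setdefault(source, []).append((relation, target))
        adjacency.setdefault(target, []).append((relation, source))
    return adjacency


def find_path(adjacency, start, goal):
    """Return the shortest list of nodes linking start to goal, or None."""
    if start not in adjacency or goal not in adjacency:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relation, neighbour in adjacency[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


if __name__ == "__main__":
    graph = build_adjacency(EDGES)
    print(find_path(graph, "gene:G1", "trait:grain_size"))
```

In this toy graph the gene reaches the trait through a protein and a pathway, which is the kind of evidence chain a knowledge graph is meant to expose.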

Originally developed through a collaboration of researchers at Rothamsted Research, KnetMiner has undergone further development and has initiated a spin-out process.

KnetMiner hosts knowledge graphs for a range of species, as well as a knowledge graph dedicated to SARS-CoV-2 [5] [6] that was created in response to the 2020 global pandemic. These are hosted on Rothamsted Research HPC machines.

Species included in KnetMiner's knowledge graphs:

KnetMiner has been used in several studies, including work on wheat, [7] [8] [9] willow, [9] and SARS-CoV-2. [5] It is also being used to explore pathogen-host interactions in collaboration with PHI-base, and in work on soybean loopers and other species.

API access

KnetMiner provides a REST API that returns either JSON output for each view type or network views for certain searches.
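
A minimal sketch of how a client might talk to a REST API of this kind is given below. The base URL, the view name in the path, the `keyword` and `genes` parameter names, and the `nodes`/`edges` response layout are all hypothetical placeholders chosen for illustration; they are not taken from the KnetMiner documentation, which should be consulted for the real endpoints. The code only builds a request URL and parses a JSON body, so it runs without network access.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL; not a real KnetMiner endpoint.
BASE_URL = "https://example.org/knetminer-api"


def build_search_url(base_url, view, keyword, genes=()):
    """Build a GET URL for a (hypothetical) view endpoint.

    `view`, `keyword`, and `genes` are assumed names used for illustration.
    """
    params = {"keyword": keyword}
    if genes:
        params["genes"] = ",".join(genes)
    return f"{base_url.rstrip('/')}/{view}?{urlencode(params)}"


def parse_graph_response(body):
    """Extract node labels and edges from an assumed JSON graph layout."""
    data = json.loads(body)
    nodes = {node["id"]: node["label"] for node in data.get("nodes", [])}
    edges = [(edge["source"], edge["target"]) for edge in data.get("edges", [])]
    return nodes, edges


if __name__ == "__main__":
    print(build_search_url(BASE_URL, "network", "grain size", ["G1", "G2"]))

    # Hand-written sample body standing in for a server response.
    sample = '{"nodes": [{"id": "n1", "label": "G1"}], "edges": []}'
    print(parse_graph_response(sample))
```

Keeping URL construction and response parsing in separate functions lets each be checked without a live server; an actual request could then be issued with any HTTP client.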

Funding

KnetMiner is funded by the Biotechnology and Biological Sciences Research Council, a UK research council. [2]

References

  1. "knetMiner".
  2. Hassani-Pak, Keywan; Singh, Ajit; Brandizi, Marco; Hearnshaw, Joseph; Parsons, Jeremy D.; Amberkar, Sandeep; Phillips, Andrew L.; Doonan, John H.; Rawlings, Chris (2021-03-22). "KnetMiner: a comprehensive approach for supporting evidence-based gene discovery and complex trait analysis across species". Plant Biotechnology Journal. 19 (8): 1670–1678. doi:10.1111/pbi.13583. ISSN 1467-7644. PMC 8384599. PMID 33750020.
  3. Hassani-Pak, Keywan (2017). "KnetMiner - An integrated data platform for gene mining and biological knowledge discovery". Bielefeld University, Bielefeld.
  4. Brandizi, Marco; Singh, Ajit; Rawlings, Christopher; Hassani-Pak, Keywan (2018). "Towards FAIRer Biological Knowledge Networks Using a Hybrid Linked Data and Graph Database Approach". Journal of Integrative Bioinformatics. 15 (3). doi:10.1515/jib-2018-0023. ISSN 1613-4516. PMC 6340125. PMID 30085931.
  5. Hearnshaw, Joseph; Brandizi, Marco; Singh, Ajit; Hassani-Pak, Keywan. "Rothamsted Answers White House Call For Coronavirus Data Help". Rothamsted Research.
  6. "Artificial-intelligence tools aim to tame the coronavirus literature". Nature.
  7. Adamski, Nikolai M; Borrill, Philippa; Brinton, Jemima; Harrington, Sophie; Marchal, Clemence; Bentley, Alison R; Bovill, Wiliam D; Cattivelli, Luigi; Cockram, James; Contreras-Moreira, Bruno; Ford, Brett; Ghosh, Sreya; Harwood, Wendy; Hassani-Pak, Keywan; Hayta, Sadiye; Hickey, Lee T; Kanyuka, Kostya; King, Julie; Maccaferri, Marco; Naamati, Guy; Pozniak, Curtis J; Ramirez-Gonzalez, Ricardo H; Sansaloni, Carolina; Trevaskis, Ben; Wingen, Luzie U; Wulff, Brande BH; Uauy, Cristobal. "A roadmap for gene functional characterisation in wheat" (PDF). doi:10.7287/peerj.preprints.26877v2. S2CID 43925415.
  8. Harrington, Sophie A.; Backhaus, Anna E.; Singh, Ajit; Hassani-Pak, Keywan; Uauy, Cristobal (2019). "Validation and characterisation of a wheat GENIE3 network using an independent RNA-Seq dataset". doi:10.1101/684183. S2CID 198383382.
  9. Alabdullah, Abdul Kader; Borrill, Philippa; Martin, Azahara C.; Ramirez-Gonzalez, Ricardo H.; Hassani-Pak, Keywan; Uauy, Cristobal; Shaw, Peter; Moore, Graham (2019). "A Co-Expression Network in Hexaploid Wheat Reveals Mostly Balanced Expression and Lack of Significant Gene Loss of Homeologous Meiotic Genes Upon Polyploidization". Frontiers in Plant Science. 10: 1325. doi:10.3389/fpls.2019.01325. ISSN 1664-462X. PMC 6813927. PMID 31681395.