The Gene Wiki is a project within Wikipedia that aims to describe the relationships and functions of all human genes. It was established to transfer information from scientific resources to Wikipedia stub articles. [1] [2] [3]
The Gene Wiki project also initiated publication of gene-specific review articles in the journal Gene, together with the editing of the gene-specific pages in Wikipedia. [4]
The Gene Wiki project in collaboration with the journal Gene was terminated in May 2022, ten years after the project's initiation. A report by the project's leaders summarizes the project's achievements. [5]
The human genome contains an estimated 20,000–25,000 protein-coding genes. [6] The goal of the Gene Wiki project is to create seed articles for every notable human gene, that is, every gene whose function has been assigned in the peer-reviewed scientific literature. Approximately half of human genes have an assigned function, so the total number of articles seeded by the Gene Wiki project would be expected to be in the range of 10,000–15,000. To date, approximately 11,000 articles have been created or augmented to include Gene Wiki project content.
Once seed articles have been established, the hope and expectation is that these will be annotated and expanded by editors ranging in experience from the lay audience to students to professionals and academics. [1]
Only a small portion of the human genome actually encodes protein. Understanding the function of a gene that codes for a protein generally requires understanding the function of the corresponding protein. In addition to basic information about the gene, the project therefore also includes information about the protein encoded by the gene. Other portions of the genome, the non-coding DNA that was once called "junk" DNA because it had no apparent function, are now thought to have regulatory functions.
Stubs for the Gene Wiki project are created by a bot and contain links to the following primary gene/protein databases:
A report found that, between 2013 and 2017, the content that Gene Wiki contributed to Wikipedia was further developed over time through crowdsourced editing. [8]
Israel Hanukoglu is a Turkish-born Israeli scientist. He is a full professor of biochemistry and molecular biology at Ariel University and former science and technology adviser to the prime minister of Israel (1996–1999). He is founder of Israel Science and Technology Directory.
The Encyclopedia of DNA Elements (ENCODE) is a public research project which aims "to build a comprehensive parts list of functional elements in the human genome."
Rfam is a database containing information about non-coding RNA (ncRNA) families and other structured RNA elements. It is an annotated, open access database originally developed at the Wellcome Trust Sanger Institute in collaboration with Janelia Farm, and currently hosted at the European Bioinformatics Institute. Rfam is designed to be similar to the Pfam database for annotating protein families.
G protein-coupled receptor 1, also known as GPR1, is a protein that in humans is encoded by the GPR1 gene.
Neuromedin-U receptor 1 is a protein that in humans is encoded by the NMUR1 gene.
C-C chemokine receptor-like 2 is a protein that in humans is encoded by the CCRL2 gene. CCRL2 has also been found to act as a receptor for the chemokine chemerin.
DNA-binding protein RFXANK is a protein that in humans is encoded by the RFXANK gene.
Acid-sensing ion channel 3 (ASIC3), also known as amiloride-sensitive cation channel 3 (ACCN3) or testis sodium channel 1 (TNaC1), is a protein that in humans is encoded by the ASIC3 gene. The ASIC3 gene is one of the five paralogous genes that encode proteins that form trimeric acid-sensing ion channels (ASICs) in mammals. The cDNA of this gene was first cloned in 1998. The ASIC genes have splice variants that encode different proteins, called isoforms.
The SCNN1D gene encodes the δ (delta) subunit of the epithelial sodium channel ENaC in vertebrates. ENaC is assembled as a heterotrimer composed of three homologous subunits: α, β, and γ, or δ, β, and γ. The other ENaC subunits are encoded by SCNN1A, SCNN1B, and SCNN1G.
Exostosin-like 1 is a protein that in humans is encoded by the EXTL1 gene.
Neuronal acetylcholine receptor subunit alpha-9, also known as nAChRα9, is a protein that in humans is encoded by the CHRNA9 gene. The protein encoded by this gene is a subunit of certain nicotinic acetylcholine receptors (nAchR).
Coiled-coil domain-containing protein 113, also known as HSPC065, GC16Pof6842 and GC16P044152, is a protein that in humans is encoded by the CCDC113 gene. The human CCDC113 gene is located on chromosome 16q21; its mRNA is 5,304 base pairs long and encodes a protein of 377 amino acids.
GENCODE is a scientific project in genome research and part of the ENCODE scale-up project.
Armadillo repeat containing X-linked 6 is a protein that in humans is encoded by the ARMCX6 gene located on the X-chromosome.
In molecular biology and genetics, DNA annotation or genome annotation is the process of describing the structure and function of the components of a genome, by analyzing and interpreting them in order to extract their biological significance and understand the biological processes in which they participate. Among other things, it identifies the locations of genes and all the coding regions in a genome and determines what those genes do.
TMEM106A is a gene that encodes the transmembrane protein 106A (TMEM106A) in Homo sapiens. It is located at 17q21.31 on the plus strand, next to the cancer-related genes NBR1 and BRCA1. The TMEM106A protein contains a domain of unknown function, DUF1356.
Single nucleotide polymorphism annotation is the process of predicting the effect or function of an individual SNP using SNP annotation tools. In SNP annotation the biological information is extracted, collected and displayed in a clear form amenable to query. SNP functional annotation is typically performed based on the available information on nucleic acid and protein sequences.
Alexander George Bateman is a computational biologist and Head of Protein Sequence Resources at the European Bioinformatics Institute (EBI), part of the European Molecular Biology Laboratory (EMBL) in Cambridge, UK. He has led the development of the Pfam biological database and introduced the Rfam database of RNA families. He has also been involved in the use of Wikipedia for community-based annotation of biological databases.
Biocuration is the field of life sciences dedicated to organizing biomedical data, information and knowledge into structured formats, such as spreadsheets, tables and knowledge graphs. The biocuration of biomedical knowledge is made possible by the cooperative work of biocurators, software developers and bioinformaticians and is at the base of the work of biological databases.