VxInsight is a knowledge mining tool developed by Sandia National Laboratories with the Institute for Scientific Information. It allows the user to visualize the relationship between groups of objects in large databases as a 3D landscape.
Data mining is the process of discovering patterns in large data sets using methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with the overall goal of extracting information from a data set and transforming it into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD. Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. The difference between data analysis and data mining is that data analysis is used to test models and hypotheses on the dataset, e.g., analyzing the effectiveness of a marketing campaign, regardless of the amount of data; in contrast, data mining uses machine-learning and statistical models to uncover hidden patterns in a large volume of data.
Sandia National Laboratories (SNL), managed and operated by National Technology and Engineering Solutions of Sandia, is one of three National Nuclear Security Administration research and development laboratories. In December 2016, it was announced that National Technology and Engineering Solutions of Sandia, under the direction of Honeywell International, would take over the management of Sandia National Laboratories starting on May 1, 2017.
The Institute for Scientific Information (ISI) was an academic publishing service, founded by Eugene Garfield in Philadelphia in 1960. ISI offered scientometric and bibliographic database services. Its specialty was citation indexing and analysis, a field pioneered by Garfield.
In what Hillier et al. call a "pioneering study" [1], VxInsight has been used to analyze gene expression (i.e., microarray) data across a number of conditions in C. elegans. Using VxInsight, Kim et al. clustered genes into "mounts" with coherent functions and made novel observations, such as the finding that distinct classes of transposons (such as Tc3 and Mariner transposons) appear to be differentially regulated during development [2].
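The idea of grouping genes by similar expression behavior across conditions can be illustrated with a minimal sketch. This is not VxInsight's algorithm, which the text above does not describe; it simply connects genes whose profiles have a Pearson correlation above a threshold and reports the connected groups. The gene names, expression values, threshold, and function names are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def group_by_correlation(profiles, threshold=0.9):
    """Link genes whose correlation is >= threshold; return linked groups."""
    genes = list(profiles)
    parent = {g: g for g in genes}  # union-find forest

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical expression values for four genes across four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 3.9, 2.0],
}
groups = group_by_correlation(profiles)
print(groups)
```

In this toy data, geneA and geneB rise together while geneC and geneD fall together, so the sketch reports two groups. A real analysis would involve thousands of genes and a layout step to place the groups on a landscape.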
Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product. These products are often proteins, but in non-protein coding genes such as transfer RNA (tRNA) or small nuclear RNA (snRNA) genes, the product is a functional RNA.
A microarray is a multiplex lab-on-a-chip. It is a two-dimensional array on a solid substrate that assays (tests) large amounts of biological material using high-throughput, miniaturized, multiplexed, and parallel processing and detection methods. The concept and methodology of microarrays were first introduced and illustrated in antibody microarrays by Tse Wen Chang in 1983 in a scientific publication and a series of patents. The "gene chip" industry started to grow significantly after the 1995 Science paper by the Ron Davis and Pat Brown labs at Stanford University. With the establishment of companies such as Affymetrix, Agilent, Applied Microarrays, Arrayjet, Illumina, and others, the technology of DNA microarrays has become the most sophisticated and the most widely used, while the use of protein, peptide, and carbohydrate microarrays is expanding.
Caenorhabditis elegans is a free-living, transparent nematode, about 1 mm in length, that lives in temperate soil environments. It is the type species of its genus. The name is a blend of the Greek caeno- (recent), rhabditis (rod-like) and Latin elegans (elegant). In 1900, Maupas initially named it Rhabditides elegans, Osche placed it in the subgenus Caenorhabditis in 1952, and in 1955, Dougherty raised Caenorhabditis to the status of genus.
In the fields of molecular biology and genetics, a genome is the genetic material of an organism. It consists of DNA. The genome includes both the genes and the noncoding DNA, as well as mitochondrial DNA and chloroplast DNA. The study of the genome is called genomics.
A transposable element is a DNA sequence that can change its position within a genome, sometimes creating or reversing mutations and altering the cell's genetic identity and genome size. Transposition often results in duplication of the same genetic material. Barbara McClintock's discovery of them earned her a Nobel Prize in 1983.
Barbara McClintock was an American scientist and cytogeneticist who was awarded the 1983 Nobel Prize in Physiology or Medicine. McClintock received her PhD in botany from Cornell University in 1927. There she started her career as the leader in the development of maize cytogenetics, the focus of her research for the rest of her life. From the late 1920s, McClintock studied chromosomes and how they change during reproduction in maize. She developed the technique for visualizing maize chromosomes and used microscopic analysis to demonstrate many fundamental genetic ideas. One of those ideas was the notion of genetic recombination by crossing-over during meiosis—a mechanism by which chromosomes exchange information. She produced the first genetic map for maize, linking regions of the chromosome to physical traits. She demonstrated the role of the telomere and centromere, regions of the chromosome that are important in the conservation of genetic information. She was recognized as among the best in the field, awarded prestigious fellowships, and elected a member of the National Academy of Sciences in 1944.
The Visualization Toolkit (VTK) is an open-source software system for 3D computer graphics, image processing and visualization.
In molecular biology, insertional mutagenesis is the creation of mutations of DNA by the addition of one or more base pairs. Such insertional mutations can occur naturally, mediated by viruses or transposons, or can be artificially created for research purposes in the lab.
Piwi-interacting RNA (piRNA) is the largest class of small non-coding RNA molecules expressed in animal cells. piRNAs form RNA-protein complexes through interactions with piwi proteins. These piRNA complexes are mostly involved in the epigenetic and post-transcriptional silencing of transposons, but can also be involved in the regulation of other genetic elements in germ line cells. piRNAs are mostly created from loci that function as transposon traps and provide an RNA-mediated adaptive immunity against transposon expansions and invasions. They are distinct from microRNA (miRNA) in size, lack of sequence conservation, and increased complexity.
RNA silencing or RNA interference refers to a family of gene silencing effects by which gene expression is negatively regulated by non-coding RNAs such as microRNAs. RNA silencing may also be defined as sequence-specific regulation of gene expression triggered by double-stranded RNA (dsRNA). RNA silencing mechanisms are highly conserved in most eukaryotes. The most common and well-studied example is RNA interference (RNAi), in which endogenously expressed microRNA (miRNA) or exogenously derived small interfering RNA (siRNA) induces the degradation of complementary messenger RNA. Other classes of small RNA have been identified, including piwi-interacting RNA (piRNA) and its subspecies repeat associated small interfering RNA (rasiRNA).
Ubiquitin-conjugating enzyme E2 G1 is a protein that in humans is encoded by the UBE2G1 gene.
Transposon mutagenesis, or transposition mutagenesis, is a biological process that allows genes to be transferred to a host organism's chromosome, interrupting or modifying the function of an extant gene on the chromosome and causing mutation. Transposon mutagenesis is much more effective than chemical mutagenesis, with a higher mutation frequency and a lower chance of killing the organism. Other advantages include being able to induce single hit mutations, being able to incorporate selectable markers in strain construction, and being able to recover genes after mutagenesis. Disadvantages include the low frequency of transposition in living systems, and the inaccuracy of most transposition systems.
A knockout rat is a genetically engineered rat with a single gene turned off through a targeted mutation used for academic and pharmaceutical research. Knockout rats can mimic human diseases and are important tools for studying gene function and for drug discovery and development. The production of knockout rats was not economically or technically feasible until 2008.
Ridges are domains of the genome with a high gene expression; the opposite of ridges are antiridges. The term was first used by Caron et al. in 2001. Characteristics of ridges are:
The nematode worm Caenorhabditis elegans was first studied in the laboratory by Victor Nigon and Ellsworth Dougherty in the 1940s, but came to prominence after being adopted by Sydney Brenner in 1963 as a model organism for the study of developmental biology using genetics. In 1974, Brenner published the results of his first genetic screen, which isolated hundreds of mutants with morphological and functional phenotypes, such as being uncoordinated. In the 1980s, John Sulston and co-workers identified the lineage of all 959 cells in the adult hermaphrodite, the first genes were cloned, and the physical map began to be constructed. In 1998, the worm became the first multi-cellular organism to have its genome sequenced. Notable research using C. elegans includes the discoveries of caspases, RNA interference, and microRNAs. Six scientists have won the Nobel prize for their work on C. elegans.
Helitrons are one of the three groups of eukaryotic class 2 transposable elements (TEs) so far described. They are the eukaryotic rolling-circle transposable elements which are hypothesized to transpose by a rolling circle replication mechanism via a single-stranded DNA intermediate. They were first discovered in plants and in the nematode Caenorhabditis elegans, and now they have been identified in a diverse range of species, from protists to mammals. Helitrons make up a substantial fraction of many genomes where non-autonomous elements frequently outnumber the putative autonomous partner. Helitrons seem to have a major role in the evolution of host genomes. They frequently capture diverse host genes, some of which can evolve into novel host genes or become essential for Helitron transposition.
The Sleeping Beauty transposon system is a synthetic DNA transposon designed to introduce precisely defined DNA sequences into the chromosomes of vertebrate animals for the purposes of introducing new traits and to discover new genes and their functions. It is a Tc1/mariner-type system, with the transposase resurrected from multiple inactive fish sequences.
Essential genes are those genes of an organism that are thought to be critical for its survival. However, being essential is highly dependent on the circumstances in which an organism lives. For instance, a gene required to digest starch is only essential if starch is the only source of energy. Recently, systematic attempts have been made to identify those genes that are absolutely required to maintain life, provided that all nutrients are available. Such experiments have led to the conclusion that the absolutely required number of genes for bacteria is on the order of about 250–300. These essential genes encode proteins to maintain a central metabolism, replicate DNA, translate genes into proteins, maintain a basic cellular structure, and mediate transport processes into and out of the cell. Most genes are not essential but convey selective advantages and increased fitness.
WormBase is an online biological database about the biology and genome of the nematode model organism Caenorhabditis elegans and contains information about other related nematodes. WormBase is used by the C. elegans research community both as an information resource and as a place to publish and distribute their results. The database is regularly updated with new versions being released every two months. WormBase is one of the organizations participating in the Generic Model Organism Database (GMOD) project.
Transposition is the process by which a specific genetic sequence, known as a transposon, is moved from one location of the genome to another. Simple, or conservative, transposition is a non-replicative mode of transposition. That is, in conservative transposition the transposon is completely removed from the genome and reintegrated into a new, non-homologous locus, so the same genetic sequence is conserved throughout the entire process. The site at which the transposon is reintegrated into the genome is called the target site. A target site can be in the same chromosome as the transposon or within a different chromosome. Conservative transposition uses the "cut-and-paste" mechanism driven by the catalytic activity of the enzyme transposase. Transposase acts like DNA scissors; it is an enzyme that cuts through double-stranded DNA to remove the transposon, then transfers and pastes it into a target site.
Tc1/mariner is a class and superfamily of interspersed repeat DNA transposons. The elements of this class are found in all animals, including humans. They can also be found in protists and bacteria.
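The "cut-and-paste" behavior can be shown with a toy string model. This is only a sketch: the genome is a short made-up sequence, the function name and target index are hypothetical, and the model ignores the biochemistry of excision and integration. It shows only that the element is removed from one place and reinserted at another, with no copy left behind.

```python
def conservative_transpose(genome, transposon, target_index):
    """Excise the first occurrence of `transposon` and paste it at
    `target_index`, measured in the genome after excision."""
    start = genome.find(transposon)
    if start == -1:
        raise ValueError("transposon not found in genome")

    # "Cut": remove the element, leaving the flanking sequence joined.
    excised = genome[:start] + genome[start + len(transposon):]
    if not 0 <= target_index <= len(excised):
        raise ValueError("target index outside the genome")

    # "Paste": reinsert the same sequence at the target site.
    return excised[:target_index] + transposon + excised[target_index:]


# Hypothetical genome with a 4-base element "TTGG" near the start.
genome = "AAAATTGGCCCCCC"
element = "TTGG"
moved = conservative_transpose(genome, element, 8)
print(moved)
```

Because the mode is non-replicative, the genome length and the number of copies of the element are the same before and after the move.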