Genetic Algorithm for Rule Set Production

Last updated

Genetic Algorithm for Rule Set Production (GARP) is a computer program based on genetic algorithm that creates ecological niche models for species. The generated models describe environmental conditions (precipitation, temperatures, elevation, etc.) under which the species should be able to maintain populations. As input, local observations of species and related environmental parameters are used which describe potential limits of the species' capabilities to survive. Such environmental parameters are commonly stored in geographical information systems. A GARP model is a random set of mathematical rules which can be read as limiting environmental conditions. Each rule is considered as a gene; the set of genes is combined in random ways to further generate many possible models describing the potential of the species to occur.

Genetic algorithm competitive algorithm for searching a problem space

In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on bio-inspired operators such as mutation, crossover and selection. John Holland introduced genetic algorithms in 1960 based on the concept of Darwin’s theory of evolution; afterwards, his student David E. Goldberg extended GA in 1989.

Contents

See also

Related Research Articles

Ecology Scientific study of the relationships between living organisms and their environment

Ecology is the branch of biology which studies the interactions among organisms and their environment. Objects of study include interactions of organisms that include biotic and abiotic components of their environment. Topics of interest include the biodiversity, distribution, biomass, and populations of organisms, as well as cooperation and competition within and between species. Ecosystems are dynamically interacting systems of organisms, the communities they make up, and the non-living components of their environment. Ecosystem processes, such as primary production, pedogenesis, nutrient cycling, and niche construction, regulate the flux of energy and matter through an environment. These processes are sustained by organisms with specific life history traits. Biodiversity means the varieties of species, genes, and ecosystems, enhances certain ecosystem services.

Theoretical ecology

Theoretical ecology is the scientific discipline devoted to the study of ecological systems using theoretical methods such as simple conceptual models, mathematical models, computational simulations, and advanced data analysis. Effective models improve understanding of the natural world by revealing how the dynamics of species populations are often based on fundamental biological conditions and processes. Further, the field aims to unify a diverse range of empirical observations by assuming that common, mechanistic processes generate observable phenomena across species and ecological environments. Based on biologically realistic assumptions, theoretical ecologists are able to uncover novel, non-intuitive insights about natural processes. Theoretical results are often verified by empirical and observational studies, revealing the power of theoretical methods in both predicting and understanding the noisy, diverse biological world.

Ecological niche The fit of a species living under specific environmental conditions.

In ecology, a niche is the match of a species to a specific environmental condition. It describes how an organism or population responds to the distribution of resources and competitors and how it in turn alters those same factors. "The type and number of variables comprising the dimensions of an environmental niche vary from one species to another [and] the relative importance of particular environmental variables for a species may vary according to the geographic and biotic contexts".

Protein engineering is the process of developing useful or valuable proteins. It is a young discipline, with much research taking place into the understanding of protein folding and recognition for protein design principles. It is also a product and services market, with an estimated value of $168 billion by 2017.

Lifemapper is building a species diversity map of the world. It is similar to the SETI@Home client, in that it uses a distributed computing client running primarily on home user's computers to correlate georeferenced biological samples with environmental models of the Earth. It is an experimental GIS, or Geographic Information System, that uses a special genetic algorithm to see if predicted rules about where a species lives match up with the species' observed natural settings. It is hoped that this technique will be able to both represent a current "map" of all organisms habitats on Earth as well as predict where organisms may possibly thrive or face extinction due to climate change and other ecological transformations.

Sympatric speciation A process through which new species evolve from a single ancestral species while inhabiting the same geographic region

Sympatric speciation is the evolution of a new species from a surviving ancestral species while both continue to inhabit the same geographic region. In evolutionary biology and biogeography, sympatric and sympatry are terms referring to organisms whose ranges overlap so that they occur together at least in some places. If these organisms are closely related, such a distribution may be the result of sympatric speciation. Etymologically, sympatry is derived from the Greek roots συν ("together") and πατρίς ("homeland"). The term was invented by Edward Bagnall Poulton in 1904, who explains the derivation.

Learning classifier system

Learning classifier systems, or LCS, are a paradigm of rule-based machine learning methods that combine a discovery component with a learning component. Learning classifier systems seek to identify a set of context-dependent rules that collectively store and apply knowledge in a piecewise manner in order to make predictions. This approach allows complex solution spaces to be broken up into smaller, simpler parts.

In computer programming, gene expression programming (GEP) is an evolutionary algorithm that creates computer programs or models. These computer programs are complex tree structures that learn and adapt by changing their sizes, shapes, and composition, much like a living organism. And like living organisms, the computer programs of GEP are also encoded in simple linear chromosomes of fixed length. Thus, GEP is a genotype–phenotype system, benefiting from a simple genome to keep and transmit the genetic information and a complex phenotype to explore the environment and adapt to it.

Parapatric speciation Speciation within a population where subpopulations are reproductively isolated

In parapatric speciation, two subpopulations of a species evolve reproductive isolation from one another while continuing to exchange genes. This mode of speciation has three distinguishing characteristics: 1) mating occurs non-randomly, 2) gene flow occurs unequally, and 3) populations exist in either continuous or discontinuous geographic ranges. This distribution pattern may be the result of unequal dispersal, incomplete geographical barriers, or divergent expressions of behavior, among other things. Parapatric speciation predicts that hybrid zones will often exist at the junction between the two populations.

Coalescent theory is a model of how gene variants sampled from a population may have originated from a common ancestor. In the simplest case, coalescent theory assumes no recombination, no natural selection, and no gene flow or population structure, meaning that each variant is equally likely to have been passed from one generation to the next. The model looks backward in time, merging alleles into a single ancestral copy according to a random process in coalescence events. Under this model, the expected time between successive coalescence events increases almost exponentially back in time. Variance in the model comes from both the random passing of alleles from one generation to the next, and the random occurrence of mutations in these alleles.

Dual inheritance theory (DIT), also known as gene–culture coevolution or biocultural evolution, was developed in the 1960s through early 1980s to explain how human behavior is a product of two different and interacting evolutionary processes: genetic evolution and cultural evolution. Genes and culture continually interact in a feedback loop, changes in genes can lead to changes in culture which can then influence genetic selection, and vice versa. One of the theory's central claims is that culture evolves partly through a Darwinian selection process, which dual inheritance theorists often describe by analogy to genetic evolution.

Computational phylogenetics is the application of computational algorithms, methods, and programs to phylogenetic analyses. The goal is to assemble a phylogenetic tree representing a hypothesis about the evolutionary ancestry of a set of genes, species, or other taxa. For example, these techniques have been used to explore the family tree of hominid species and the relationships between specific genes shared by many types of organisms. Traditional phylogenetics relies on morphological data obtained by measuring and quantifying the phenotypic properties of representative organisms, while the more recent field of molecular phylogenetics uses nucleotide sequences encoding genes or amino acid sequences encoding proteins as the basis for classification. Many forms of molecular phylogenetics are closely related to and make extensive use of sequence alignment in constructing and refining phylogenetic trees, which are used to classify the evolutionary relationships between homologous genes represented in the genomes of divergent species. The phylogenetic trees constructed by computational methods are unlikely to perfectly reproduce the evolutionary tree that represents the historical relationships between the species being analyzed. The historical species tree may also differ from the historical tree of an individual homologous gene shared by those species.

Grammatical evolution is a relatively new evolutionary computation technique pioneered by Conor Ryan, JJ Collins and Michael O'Neill in 1998 at the BDS Group in the University of Limerick.

The term or acronym GARP can refer to:

Isolation by distance

Isolation by distance (IBD) is a term used to refer to the accrual of local genetic variation under geographically limited dispersal The IBD model is useful for determining the distribution of gene frequencies over a geographic region. Both dispersal variance and migration probabilities are variables in this model and both contribute to local genetic differentiation. Isolation by distance is usually the simplest model for the cause of genetic isolation between populations. Evolutionary biologists and population geneticists have been exploring varying theories and models for explaining population structure. Yoichi Ishida compares two important theories of isolation by distance and clarifies the relationship between the two. According to Ishida, Sewall Wright's isolation by distance theory is termed ecological isolation by distance while Gustave Malécot's theory is called genetic isolation by distance. Isolation by distance is distantly related to speciation. Multiple types of isolating barriers, namely prezygotic isolating barriers which includes isolation by distance, are considered to be the key factor in keeping populations apart limiting gene flow.

Species distribution modelling

Species distribution modelling (SDM), also known as environmental(or ecological) niche modelling (ENM), habitat modelling, predictive habitat distribution modelling, and range mapping uses computer algorithms to predict the distribution of a species across geographic space and time using environmental data. The environmental data are most often climate data, but can include other variables such as soil type, water depth, and land cover. SDMs are used in several research areas in conservation biology, ecology and evolution. These models can be used to understand how environmental conditions influence the occurrence or abundance of a species, and for predictive purposes. Predictions from an SDM may be of a species’ future distribution under climate change, a species’ past distribution in order to assess evolutionary relationships, or the potential future distribution of an invasive species. Predictions of current and/or future habitat suitability can be useful for management applications.

AquaMaps

AquaMaps is a collaborative project with the aim of producing computer-generated predicted global distribution maps for marine species on a 0.5 x 0.5 degree grid of the oceans based on data available through online species databases such as FishBase and SeaLifeBase and species occurrence records from OBIS or GBIF and using an environmental envelope model in conjunction with expert input. The underlying model represents a modified version of the relative environmental suitability (RES) model developed by Kristin Kaschner to generate global predictions of marine mammal occurrences.

In applied mathematics, multimodal optimization deals with optimization tasks that involve finding all or most of the multiple solutions of a problem, as opposed to a single best solution. Evolutionary multimodal optimization is a branch of evolutionary computation, which is closely related to machine learning. Wong provides a short survey, wherein the chapter of Shir and the book of Preuss cover the topic in more detail.

References

Software