Eureqa

Owner: Nutonian, Inc.
Created by: Michael Schmidt and Hod Lipson
URL: www.nutonian.com
Commercial: Yes
Launched: November 2009
Current status: Active

Eureqa was a proprietary modeling engine created in Cornell's Artificial Intelligence Lab and later commercialized by Nutonian, Inc. The software used genetic algorithms to determine mathematical equations that describe sets of data in their simplest form, a technique referred to as symbolic regression.

Origin and development

Since the 1970s, the primary way companies had performed data science was to hire data scientists and equip them with tools like R, Python, SAS, and SQL to execute predictive and statistical modeling. [1] In 2007 Michael Schmidt, then a PhD student in Computational Biology at Cornell, along with his advisor Hod Lipson, developed Eureqa to help automate the curve-fitting work of data scientists. The tool automatically searched for the "best" mathematical model to fit a given dataset, where "best" is defined as the simplest model that can be found to achieve a given level of fit to the data. [2] [3]
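The selection criterion described above (the simplest model that reaches a given level of fit) can be illustrated with a minimal Python sketch. The `Candidate` type, the complexity measure, the error values, and the `simplest_adequate` function are all hypothetical and chosen for illustration; they are not taken from Eureqa itself.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    expression: str  # human-readable formula
    complexity: int  # assumed measure, e.g. count of terms and operators
    error: float     # assumed measure, e.g. mean squared error on the data


def simplest_adequate(
    candidates: Sequence[Candidate], max_error: float
) -> Optional[Candidate]:
    """Return the least complex candidate whose error is within max_error."""
    adequate = [c for c in candidates if c.error <= max_error]
    if not adequate:
        return None
    # Ties in complexity are broken in favour of the lower error.
    return min(adequate, key=lambda c: (c.complexity, c.error))


# Made-up candidates for illustration only.
candidates = [
    Candidate("y = 2.1", 1, 9.80),
    Candidate("y = 3.0*x", 3, 0.40),
    Candidate("y = 3.0*x + 0.5", 5, 0.05),
    Candidate("y = 3.0*x + 0.5 + 0.01*sin(x)", 10, 0.04),
]

best = simplest_adequate(candidates, max_error=0.1)
print(best.expression)
```

In this sketch the most accurate candidate is not selected, because a simpler one already meets the required level of fit.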

In November 2009 the program was made available to download as freeware. [4] Lipson described the tool's benefit as helping fields that are overwhelmed with data but lack theory to explain it. [5] In the October 2011 edition of "Physical Biology", Lipson described a yeast experiment in which Eureqa predicted seven known equations. [6] The experiment took place after Lipson had asked scientists from different disciplines to share their work to test Eureqa's versatility. [6]

Technology

Eureqa worked by generating random candidate equations and testing them against the data through evolutionary search. [5] Initial guesses might not fit the data well, but some equations fit better than others, and those were used as the basis for the next round of guesses. The process repeated until the fit could not be further improved. [7]
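The general idea of evolutionary search over equations can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Eureqa's actual algorithm: the tree representation, the operator set, the mutation scheme, and all function names (`random_expr`, `evaluate`, `error`, `mutate`, `evolve`) are hypothetical, and the sketch scores candidates by fit only, without any simplicity measure.

```python
import math
import random

# An expression is a float constant, the string "x", or a tuple (op, left, right).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def random_expr(rng, depth=2):
    """Build a random expression tree of at most the given depth."""
    if depth <= 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else round(rng.uniform(-3, 3), 1)
    op = rng.choice(list(OPS))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))


def evaluate(expr, x):
    """Evaluate an expression tree at the point x."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if expr == "x" else expr


def error(expr, data):
    """Mean squared error over (x, y) pairs; non-finite results count as inf."""
    total = 0.0
    for x, y in data:
        diff = evaluate(expr, x) - y
        total += diff * diff
    mse = total / len(data)
    return mse if math.isfinite(mse) else math.inf


def mutate(expr, rng, max_depth=4):
    """Replace one randomly chosen subtree, keeping the tree within max_depth."""
    if max_depth <= 0 or not isinstance(expr, tuple) or rng.random() < 0.3:
        return random_expr(rng, depth=min(2, max_depth))
    op, left, right = expr
    if rng.random() < 0.5:
        return (op, mutate(left, rng, max_depth - 1), right)
    return (op, left, mutate(right, rng, max_depth - 1))


def evolve(data, generations=200, population_size=60, seed=0):
    """Keep the best-fitting quarter each round and refill with mutated copies."""
    rng = random.Random(seed)
    population = [random_expr(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=lambda e: error(e, data))
        survivors = population[: population_size // 4]
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    return min(population, key=lambda e: error(e, data))


# Usage: try to recover y = 2x + 1 from sampled points.
data = [(float(x), 2.0 * x + 1.0) for x in range(-5, 6)]
best = evolve(data)
print(best, error(best, data))
```

Because the best candidates survive each round unchanged, the best error never gets worse from one generation to the next; whether the search actually recovers the underlying formula depends on the seed and the settings.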

Reception and use

Over 80,000 users downloaded the program. [8] They applied it to a range of tasks, including analyzing the herding of cattle and modeling the behavior of the stock market. [4]

In 2017 Nutonian was acquired by DataRobot, and Eureqa was merged into DataRobot's commercial product portfolio. [9]

References

  1. Piatetsky, Gregory. "Four main languages for Analytics, Data Mining, Data Science". www.kdnuggets.com. KDnuggets. Retrieved 19 May 2016.
  2. Keohane, Dennis. "Nutonian - At the Cutting Edge of Technology, Science, and Data Analysis". www.venturefizz.com. VentureFizz.
  3. Regalado, Antonio (August 19, 2014). "35 Innovators Under 35". MIT Technology Review. Retrieved 19 May 2016.
  4. Keim, Brandon (December 3, 2009). "Download Your Own Robot Scientist". Wired Magazine. Retrieved April 22, 2013.
  5. Chang, Kenneth (April 2, 2009). "Hal, Call Your Office: Computers That Act Like Physicists". The New York Times. Retrieved April 22, 2013.
  6. Ehrenberg, Rachel (January 14, 2012). "Software Scientist". Science News Digital. Retrieved April 22, 2013.
  7. Manjoo, Farhad (September 30, 2009). "Will Robots Steal Your Job?". Slate. Retrieved April 20, 2013.
  8. Shtull-Trauring, Asaf (February 3, 2012). "An Israeli professor's 'Eureqa' moment". Haaretz. Retrieved April 20, 2013.
  9. "DataRobot Acquires Nutonian". DataRobot. May 25, 2017. Retrieved December 9, 2023.