| Designed by | David Fournier |
| --- | --- |
| Developer | ADMB Core Team [1] |
| Stable release | 13.1 [2] / 23 December 2022 |
| OS | Cross-platform |
| License | BSD |
| Website | www |
| Dialects | C++ |
ADMB or AD Model Builder is a free and open source software suite for non-linear statistical modeling. [3] [4] It was created by David Fournier and is now developed by the ADMB Project, a creation of the non-profit ADMB Foundation. The "AD" in AD Model Builder refers to the automatic differentiation capabilities that come from the AUTODIF Library, a C++ language extension also created by David Fournier, which implements reverse mode automatic differentiation. [5] A related software package, ADMB-RE, provides additional support for modeling random effects. [6]
Markov chain Monte Carlo methods are integrated into the ADMB software, making it useful for Bayesian modeling. [7] In addition to Bayesian hierarchical models, ADMB provides support for modeling random effects in a frequentist framework using Laplace approximation and importance sampling. [6]
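The role of the Laplace approximation can be sketched as follows (the notation here is illustrative, not ADMB's own). For fixed parameters $\theta$ and random effects $u$ with joint negative log-likelihood $h(u;\theta) = -\log f(y,u;\theta)$, the marginal likelihood is the integral over the random effects, approximated around the inner mode:

$$
L(\theta) = \int e^{-h(u;\theta)}\,du \;\approx\; (2\pi)^{q/2}\,\bigl|H(\hat u)\bigr|^{-1/2}\, e^{-h(\hat u;\theta)},
$$

where $\hat u = \arg\min_u h(u;\theta)$, $q$ is the dimension of $u$, and $H(\hat u)$ is the Hessian of $h$ with respect to $u$ evaluated at $\hat u$. Importance sampling can then be used to correct this approximation when the integrand is far from Gaussian.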
ADMB is widely used by scientists in academic institutions, government agencies, and international commissions, [8] most commonly for ecological modeling. In particular, many fisheries stock assessment models have been built using this software. [9] ADMB is freely available under the New BSD License, [10] with versions available for Windows, Linux, Mac OS X, and OpenSolaris operating systems. [10] Source code for ADMB was made publicly available in March 2009. [11] [12]
Work by David Fournier in the 1970s on development of highly parameterized integrated statistical models in fisheries motivated the development of the AUTODIF Library, and ultimately ADMB. The likelihood equations in these models are typically non-linear and estimates of the parameters are obtained by numerical methods.
Early in Fournier's work, it became clear that general numerical solutions to these likelihood problems could only be reliably achieved using function minimization algorithms that incorporate accurate information about the gradients of the likelihood surface. Computing the gradients (i.e. partial derivatives of the likelihood with respect to all model variables) must also be done with the same accuracy as the likelihood computation itself.
Fournier developed a protocol for writing code to compute the required derivatives based on the chain rule of differential calculus. This protocol is very similar to the suite of methods that came to be known as "reverse mode automatic differentiation". [13]
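The idea behind such a protocol can be sketched with a minimal tape-based reverse-mode example in C++. This is illustrative only: AUTODIF's actual classes and interfaces differ, but the structure is the same — record each elementary operation during the forward pass, then sweep the recording backwards, applying the chain rule to accumulate adjoints.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Minimal tape-based reverse-mode AD sketch (hypothetical; not the AUTODIF API).
// Each recorded operation stores how to propagate adjoints back to its inputs.
struct Tape {
    std::vector<double> val;                      // node values (forward pass)
    std::vector<double> adj;                      // node adjoints (reverse pass)
    std::vector<std::function<void(Tape&)>> ops;  // reverse-sweep callbacks

    int var(double v) { val.push_back(v); adj.push_back(0.0); return (int)val.size() - 1; }

    int add(int a, int b) {
        int r = var(val[a] + val[b]);
        ops.push_back([a, b, r](Tape& t) {        // d(a+b)/da = d(a+b)/db = 1
            t.adj[a] += t.adj[r];
            t.adj[b] += t.adj[r];
        });
        return r;
    }
    int mul(int a, int b) {
        int r = var(val[a] * val[b]);
        ops.push_back([a, b, r](Tape& t) {        // d(a*b)/da = b, d(a*b)/db = a
            t.adj[a] += t.adj[r] * t.val[b];
            t.adj[b] += t.adj[r] * t.val[a];
        });
        return r;
    }
    // Reverse sweep: seed the output adjoint, then replay the tape backwards.
    void backward(int out) {
        adj[out] = 1.0;
        for (auto it = ops.rbegin(); it != ops.rend(); ++it) (*it)(*this);
    }
};
```

For f(x, y) = x*y + x at x = 3, y = 4, a single reverse sweep yields both partial derivatives, df/dx = y + 1 = 5 and df/dy = x = 3, in one pass — the property that makes reverse mode efficient for likelihoods with many parameters and one scalar output.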
The statistical models using these methods [14] [15] [16] [17] typically included eight constituent code segments:
Model developers are usually only interested in the first of these constituents. Any programming tools that can reduce the overhead of developing and maintaining the other seven will greatly increase their productivity.
Bjarne Stroustrup began development of C++ in the 1970s at Bell Labs as an enhancement to the C programming language. C++ spread widely, and by 1989, C++ compilers were available for personal computers. The polymorphism of C++ makes it possible to envisage a programming system in which all mathematical operators and functions can be overloaded to automatically compute the derivative contributions of every differentiable numerical operation in any computer program.
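The overloading idea can be shown in a few lines of C++. The sketch below uses forward-mode "dual numbers" for brevity — AUTODIF instead records operations for a reverse sweep — but it illustrates the same point: once the operators are overloaded, any differentiable expression written in ordinary mathematical notation carries its derivative along automatically.

```cpp
#include <cassert>
#include <cmath>

// A "dual number": a value paired with its derivative with respect to
// one chosen input. Overloaded operators apply the rules of calculus.
struct Dual {
    double v;  // value
    double d;  // derivative
};

Dual operator+(Dual a, Dual b) { return {a.v + b.v, a.d + b.d}; }
Dual operator*(Dual a, Dual b) { return {a.v * b.v, a.d * b.v + a.v * b.d}; }  // product rule
Dual sin(Dual a) { return {std::sin(a.v), std::cos(a.v) * a.d}; }              // chain rule
```

Evaluating f(x) = x*x + sin(x) at x = 0 with the seed derivative d = 1 gives f.v = 0 and f.d = 2·0 + cos(0) = 1, with no derivative code written by the modeler.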
Fournier formed Otter Research Ltd. in 1989, and by 1990 the AUTODIF Library included special classes for derivative computation and the requisite overloaded functions for all C++ operators and all functions in the standard C++ math library. The AUTODIF Library automatically computes the derivatives of the objective function with the same accuracy as the objective function itself and thereby frees the developer from the onerous task of writing and maintaining derivative code for statistical models. Equally important from the standpoint of model development, the AUTODIF Library includes a "gradient stack", a quasi-Newton function minimizer, a derivative checker, and container classes for vectors and matrices. The first application of the AUTODIF Library was published in 1992. [18]
The AUTODIF Library does not, however, completely liberate the developer from writing all of the model constituents listed above. In 1993, Fournier further abstracted the writing of statistical models by creating ADMB, a special "template" language that simplifies model specification by providing tools to transform models written using the templates into AUTODIF Library applications. ADMB produces code to manage the exchange of model parameters between the model and the function minimizer, automatically computes the Hessian matrix, and inverts it to provide an estimate of the covariance of the estimated parameters. ADMB thus completes the liberation of the model developer from all of the tedious overhead of managing non-linear optimization, thereby freeing him or her to focus on the more interesting aspects of the statistical model.
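The template style can be illustrated with a minimal linear-regression model fitting y = a + b·x by least squares. This sketch is modeled on the introductory example in the ADMB documentation; section layout and function names should be checked against the current manual.

```
DATA_SECTION
  init_int N                  // number of observations, read from the data file
  init_vector x(1,N)          // predictor
  init_vector y(1,N)          // response
PARAMETER_SECTION
  init_number a               // intercept, estimated
  init_number b               // slope, estimated
  objective_function_value f  // quantity ADMB minimizes
PROCEDURE_SECTION
  f = norm2(y - (a + b*x));   // sum of squared residuals
```

ADMB translates such a template into C++ built on the AUTODIF Library, attaches the quasi-Newton minimizer, and reports parameter estimates with standard errors derived from the inverse Hessian.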
By the mid-1990s, ADMB had earned acceptance by researchers working on all aspects of resource management. Population models based on ADMB are used to monitor a range of both endangered species and commercially valuable fish populations including whales, dolphins, sea lions, penguins, albatross, abalone, lobsters, tunas, marlins, sharks, rays, anchovy, and pollock. ADMB has also been used to reconstruct movements of many species of animals tracked with electronic tags.
In 2002, Fournier teamed up with Hans Skaug to introduce random effects into ADMB. This development included automatic computation of second and third derivatives and the use of forward mode automatic differentiation followed by two sweeps of reverse mode AD in certain cases.
In 2007, a group of ADMB users that included John Sibert, Mark Maunder and Anders Nielsen became concerned about ADMB's long-term development and maintenance. An agreement was reached with Otter Research to sell the copyright to ADMB for the purpose of making ADMB an open-source project and distributing it without charge. The non-profit ADMB Foundation was created to coordinate development and promote use of ADMB. The ADMB Foundation drafted a proposal to the Gordon and Betty Moore Foundation for the funds to purchase ADMB from Otter Research. The Moore Foundation provided a grant to the National Center for Ecological Analysis and Synthesis at the University of California at Santa Barbara in late 2007 so that the Regents of the University of California could purchase the rights to ADMB. The purchase was completed in mid-2008, and the complete ADMB libraries were posted on the ADMB Project website in December 2008. By May 2009, more than 3000 downloads of the libraries had occurred. The source code was made available in December 2009. In mid-2010, ADMB was supported on all common operating systems (Windows, Linux, MacOS and Sun/SPARC), for all common C++ compilers (GCC, Visual Studio, Borland), and for both 32 and 64 bit architectures.
ADMB Foundation efforts during the first two years of the ADMB Project have focused on automating the building of ADMB for different platforms, streamlining installation, and creating user-friendly working environments. Planned technical developments include parallelization of internal computations, implementation of hybrid MCMC, and improved handling of large sparse matrices in random effects models.
Supervised learning (SL) is a machine learning paradigm for problems where the available data consist of labeled examples, meaning that each data point contains features (covariates) and an associated label. The goal of supervised learning algorithms is to learn a function that maps feature vectors (inputs) to labels (outputs), based on example input-output pairs. It infers a function from labeled training data consisting of a set of training examples. In supervised learning, each example is a pair consisting of an input object and a desired output value. A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples. An optimal scenario will allow the algorithm to correctly determine the class labels for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way. This statistical quality of an algorithm is measured through the so-called generalization error.
Numerical analysis is the study of algorithms that use numerical approximation for the problems of mathematical analysis. It is the study of numerical methods that attempt to find approximate solutions to problems rather than exact ones. Numerical analysis finds application in all fields of engineering and the physical sciences, and in the 21st century also the life and social sciences, medicine, business and even the arts. Current growth in computing power has enabled the use of more complex numerical analysis, providing detailed and realistic mathematical models in science and engineering. Examples of numerical analysis include: ordinary differential equations as found in celestial mechanics, numerical linear algebra in data analysis, and stochastic differential equations and Markov chains for simulating living cells in medicine and biology.
Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.
A computer algebra system (CAS) or symbolic algebra system (SAS) is any mathematical software with the ability to manipulate mathematical expressions in a way similar to the traditional manual computations of mathematicians and scientists. The development of the computer algebra systems in the second half of the 20th century is part of the discipline of "computer algebra" or "symbolic computation", which has spurred work in algorithms over mathematical objects such as polynomials.
Computer science is the study of the theoretical foundations of information and computation and their implementation and application in computer systems. One well known subject classification system for computer science is the ACM Computing Classification System devised by the Association for Computing Machinery.
Molecular modelling encompasses all methods, theoretical and computational, used to model or mimic the behaviour of molecules. The methods are used in the fields of computational chemistry, drug design, computational biology and materials science to study molecular systems ranging from small chemical systems to large biological molecules and material assemblies. The simplest calculations can be performed by hand, but inevitably computers are required to perform molecular modelling of any reasonably sized system. The common feature of molecular modelling methods is the atomistic level description of the molecular systems. This may include treating atoms as the smallest individual units, or explicitly modelling protons and neutrons with their quarks, anti-quarks and gluons, and electrons with their photons.
In mathematics and computer algebra, automatic differentiation, also called algorithmic differentiation or computational differentiation, is a set of techniques to evaluate the partial derivatives of a function specified by a computer program.
Computational science, also known as scientific computing, technical computing or scientific computation (SC), is a division of science that uses advanced computing capabilities to understand and solve complex physical problems.
In numerical analysis, numerical differentiation algorithms estimate the derivative of a mathematical function or function subroutine using values of the function and perhaps other knowledge about the function.
In statistics, a generalized additive model (GAM) is a generalized linear model in which the linear response variable depends linearly on unknown smooth functions of some predictor variables, and interest focuses on inference about these smooth functions.
In statistics, M-estimators are a broad class of extremum estimators for which the objective function is a sample average. Both non-linear least squares and maximum likelihood estimation are special cases of M-estimators. The definition of M-estimators was motivated by robust statistics, which contributed new types of M-estimators. However, M-estimators are not inherently robust, as is clear from the fact that they include maximum likelihood estimators, which are in general not robust. The statistical procedure of evaluating an M-estimator on a data set is called M-estimation. A recent review study catalogs 48 examples of robust M-estimators.
In applied mathematics, Hessian automatic differentiation comprises techniques based on automatic differentiation (AD) that calculate the second derivatives of an n-dimensional function, collected in the Hessian matrix.
PROSE was the mathematical 4GL virtual machine that established the holistic modeling paradigm known as Synthetic Calculus. A successor to the SLANG/CUE simulation and optimization language developed at TRW Systems, it was introduced in 1974 on Control Data supercomputers. It was the first commercial language to employ automatic differentiation (AD), which was optimized to loop in the instruction-stack of the CDC 6600 CPU.
Stan is a probabilistic programming language for statistical inference written in C++. The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function.
Adept is a combined automatic differentiation and array software library for the C++ programming language. The automatic differentiation capability facilitates the development of applications involving mathematical optimization. Adept is notable for having applied the template metaprogramming technique of expression templates to speed-up the differentiation of mathematical statements. Along with the efficient way that it stores the differential information, this makes it significantly faster than most other C++ tools that provide similar functionality, although comparable performance has been reported for Stan and in some cases Sacado. Differentiation may be in forward mode, reverse mode, or the full Jacobian matrix may be computed.
Owl Scientific Computing is a software system for scientific and engineering computing developed in the Department of Computer Science and Technology, University of Cambridge. The System Research Group (SRG) in the department recognises Owl as one of the representative systems developed in SRG in the 2010s. The source code is licensed under the MIT License and can be accessed from the GitHub repository.