Algorithm engineering

Algorithm engineering focuses on the design, analysis, implementation, optimization, profiling and experimental evaluation of computer algorithms, bridging the gap between algorithm theory and practical applications of algorithms in software engineering. [1] It is a general methodology for algorithmic research. [2]

Origins

In 1995, a report from an NSF-sponsored workshop "with the purpose of assessing the current goals and directions of the Theory of Computing (TOC) community" identified the slow speed of adoption of theoretical insights by practitioners as an important issue and suggested measures to reduce practitioners' uncertainty over whether a given algorithmic breakthrough would translate into practical gains in their field of work, and to tackle the lack of ready-to-use algorithm libraries providing stable, well-tested implementations behind easy-to-use interfaces. [3]

Promising algorithmic approaches have also been neglected due to difficulties in mathematical analysis. [2]

The term "algorithm engineering" was first used with specificity in 1997, with the first Workshop on Algorithm Engineering (WAE97), organized by Giuseppe F. Italiano. [4]

Difference from algorithm theory

Algorithm engineering does not intend to replace or compete with algorithm theory, but tries to enrich, refine and reinforce its formal approaches with experimental algorithmics (also called empirical algorithmics). [5]

This way it can provide new insights into the efficiency and performance of algorithms in cases where the algorithm at hand is less amenable to theoretical analysis, where formal analysis pessimistically suggests bounds that are unlikely to appear on inputs of practical interest, or where the algorithm relies on intricacies of modern hardware architectures, such as data locality and branch prediction, that the machine models used in algorithm theory fail to capture in the required detail.

Methodology

Some researchers describe algorithm engineering's methodology as a cycle consisting of algorithm design, analysis, implementation and experimental evaluation, joined by further aspects like machine models or realistic inputs. They argue that equating algorithm engineering with experimental algorithmics is too limited, because viewing design and analysis, implementation and experimentation as separate activities ignores the crucial feedback loop between those elements of algorithm engineering. [2]

Realistic models and real inputs

While specific applications are outside the methodology of algorithm engineering, they play an important role in shaping realistic models of the problem and the underlying machine, and supply real inputs and other design parameters for experiments. [2]

Design

Compared to algorithm theory, which usually focuses on the asymptotic behavior of algorithms, algorithm engineers need to keep further requirements in mind: simplicity of the algorithm, implementability in programming languages on real hardware, and code reuse. Additionally, constant factors have a considerable impact on real-world performance, so an algorithm with worse asymptotic behavior sometimes performs better in practice because of its lower constant factors.
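
For illustration (a sketch, not drawn from the cited sources), the following comparison times a simple O(n²) insertion sort against an O(n log n) merge sort, both in pure Python, on a small input, where the insertion sort's low overhead typically wins:

```python
import random
import timeit

def insertion_sort(a):
    """O(n^2) worst case, but very low constant factors."""
    a = list(a)
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

def merge_sort(a):
    """O(n log n), but pays for recursion and list allocation."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

data = [random.random() for _ in range(30)]  # small input
print("insertion:", timeit.timeit(lambda: insertion_sort(data), number=10_000))
print("merge:    ", timeit.timeit(lambda: merge_sort(data), number=10_000))
```

This effect is why production sorting routines are often hybrids that switch to insertion sort below a size threshold.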

Analysis

Some problems can be solved with heuristics and randomized algorithms in a simpler and more efficient fashion than with deterministic algorithms. Unfortunately, this makes even simple randomized algorithms difficult to analyze because there are subtle dependencies to be taken into account. [2]
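
For illustration (code not drawn from the cited source), randomized quickselect finds the k-th smallest element in expected linear time with far less machinery than the deterministic median-of-medians algorithm, yet its analysis must track how the random pivot correlates with the sizes of the recursive subproblems:

```python
import random

def quickselect(a, k):
    """Return the k-th smallest element (0-indexed) in expected O(n) time.

    A random pivot keeps the code short; the deterministic median-of-
    medians alternative guarantees worst-case O(n) but is considerably
    more involved to implement and to analyze.
    """
    pivot = random.choice(a)
    lo = [x for x in a if x < pivot]
    hi = [x for x in a if x > pivot]
    eq = len(a) - len(lo) - len(hi)     # elements equal to the pivot
    if k < len(lo):
        return quickselect(lo, k)
    if k < len(lo) + eq:
        return pivot
    return quickselect(hi, k - len(lo) - eq)

print(quickselect([7, 2, 9, 4, 1, 8], 2))  # -> 4
```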

Implementation

Huge semantic gaps between theoretical insights, formulated algorithms, programming languages and hardware pose a challenge to efficient implementations of even simple algorithms, because small implementation details can have rippling effects on execution behavior. The only reliable way to compare several implementations of an algorithm is to spend a considerable amount of time on tuning and profiling, running those algorithms on multiple architectures, and looking at the generated machine code. [2]
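
A minimal sketch of such a comparison harness (illustrative only; the two summation variants and function names are invented for the example):

```python
import random
import statistics
import timeit

def sum_builtin(a):
    """Implementation 1: the C-level builtin."""
    return sum(a)

def sum_loop(a):
    """Implementation 2: the same O(n) algorithm as a Python-level loop."""
    total = 0.0
    for x in a:
        total += x
    return total

def compare(impls, data, repeats=7):
    """Report the median of several timings to damp scheduler and cache
    noise; a full study would also vary input classes and machines, and
    inspect the generated machine code."""
    for name, f in impls.items():
        times = [timeit.timeit(lambda: f(data), number=100)
                 for _ in range(repeats)]
        print(f"{name:>12}: {statistics.median(times):.4f}s per 100 runs")

data = [random.random() for _ in range(100_000)]
compare({"sum_builtin": sum_builtin, "sum_loop": sum_loop}, data)
```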

Experiments

See: Experimental algorithmics

Application engineering

Implementations of algorithms used for experiments differ in significant ways from code usable in applications. While the former prioritizes fast prototyping, performance and instrumentation for measurements during experiments, the latter requires thorough testing, maintainability, simplicity, and tuning for particular classes of inputs. [2]
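
As a sketch of this difference (illustrative, not from the cited source), experimental code is often instrumented to count abstract operations, here element comparisons during sorting, which production code would strip out:

```python
import random

class Counted:
    """Experimental instrumentation: a wrapper key that counts element
    comparisons so that measured operation counts can be reported
    alongside wall-clock time.  Production code would sort the raw
    values and omit this overhead entirely."""
    __slots__ = ("value",)
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value

data = [Counted(random.random()) for _ in range(1_000)]
sorted(data)
print("comparisons used:", Counted.comparisons)  # roughly n * log2(n)
```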

Algorithm libraries

Stable, well-tested algorithm libraries like LEDA play an important role in technology transfer by speeding up the adoption of new algorithms in applications. Such libraries reduce the required investment and risk for practitioners, because they remove the burden of understanding and implementing the results of academic research.

Conferences

Two main conferences on algorithm engineering are organized annually: the Symposium on Experimental Algorithms (SEA), formerly known as WEA, and the SIAM Meeting on Algorithm Engineering and Experiments (ALENEX), established in 1999.

The 1997 Workshop on Algorithm Engineering (WAE'97) was held in Venice (Italy) on September 11–13, 1997. The Third International Workshop on Algorithm Engineering (WAE'99) was held in London, UK in July 1999. [6] The first Workshop on Algorithm Engineering and Experimentation (ALENEX99) was held in Baltimore, Maryland on January 15–16, 1999. [7] It was sponsored by DIMACS, the Center for Discrete Mathematics and Theoretical Computer Science (at Rutgers University), with additional support from SIGACT, the ACM Special Interest Group on Algorithms and Computation Theory, and SIAM, the Society for Industrial and Applied Mathematics. [7]

Related Research Articles

Analysis of algorithms: Study of resources used by an algorithm

In computer science, the analysis of algorithms is the process of finding the computational complexity of algorithms—the amount of time, storage, or other resources needed to execute them. Usually, this involves determining a function that relates the size of an algorithm's input to the number of steps it takes or the number of storage locations it uses. An algorithm is said to be efficient when this function's values are small, or grow slowly compared to a growth in the size of the input. Different inputs of the same size may cause the algorithm to have different behavior, so best, worst and average case descriptions might all be of practical interest. When not otherwise specified, the function describing the performance of an algorithm is usually an upper bound, determined from the worst case inputs to the algorithm.
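
For example (illustrative sketch), linear search over an array of one million elements takes one comparison in the best case and one million in the worst, depending only on where, or whether, the target occurs:

```python
def linear_search(a, target):
    """Scan left to right; O(n) comparisons in the worst case."""
    for i, x in enumerate(a):
        if x == target:
            return i
    return -1

a = list(range(1_000_000))
linear_search(a, 0)    # best case for this input size: 1 comparison
linear_search(a, -1)   # worst case for the same size: 1,000,000 comparisons
```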

Computer science: Study of computation

Computer science is the study of computation, information, and automation. Computer science spans theoretical disciplines to applied disciplines. Though more often considered an academic discipline, computer science is closely related to computer programming.

Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle. The name comes from the Monte Carlo Casino in Monaco, where the primary developer of the method, physicist Stanislaw Ulam, was inspired by his uncle's gambling habits.
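
The standard textbook illustration is estimating π by sampling random points in the unit square and counting how many fall inside the quarter circle (illustrative sketch):

```python
import random

def estimate_pi(samples=1_000_000):
    """Estimate pi from the fraction of uniform random points in the
    unit square that land inside the quarter unit circle."""
    inside = sum(
        1 for _ in range(samples)
        if random.random() ** 2 + random.random() ** 2 <= 1.0
    )
    return 4.0 * inside / samples

print(estimate_pi())  # approaches 3.14159... as samples grow
```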

Computational physics: Numerical simulations of physical problems via computers

Computational physics is the study and implementation of numerical analysis to solve problems in physics. Historically, computational physics was the first application of modern computers in science, and is now a subset of computational science. It is sometimes regarded as a subdiscipline of theoretical physics, but others consider it an intermediate branch between theoretical and experimental physics — an area of study which supplements both theory and experiment.

Computer science is the study of the theoretical foundations of information and computation and their implementation and application in computer systems. One well known subject classification system for computer science is the ACM Computing Classification System devised by the Association for Computing Machinery.

Theoretical computer science: Subfield of computer science and mathematics

Theoretical computer science (TCS) is a subset of general computer science and mathematics that focuses on mathematical aspects of computer science such as the theory of computation, formal language theory, the lambda calculus and type theory.

System identification: Statistical methods to build mathematical models of dynamical systems from measured data

The field of system identification uses statistical methods to build mathematical models of dynamical systems from measured data. System identification also includes the optimal design of experiments for efficiently generating informative data for fitting such models as well as model reduction. A common approach is to start from measurements of the behavior of the system and the external influences and try to determine a mathematical relation between them without going into many details of what is actually happening inside the system; this approach is called black box system identification.
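
A minimal black-box identification sketch (illustrative; the data and the postulated linear model y ≈ a·u + b are invented for the example):

```python
import numpy as np

# Hypothetical measured data: input u and noisy observed response y.
rng = np.random.default_rng(0)
u = np.linspace(0.0, 10.0, 50)
y = 2.5 * u + 1.0 + rng.normal(scale=0.3, size=u.size)

# Black-box identification: postulate y ~ a*u + b and fit a, b by least
# squares, without modeling what happens inside the system.
A = np.column_stack([u, np.ones_like(u)])
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"identified model: y ~ {a:.2f}*u + {b:.2f}")
```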

Design for Six Sigma (DFSS) is a collection of best-practices for the development of new products and processes. It is sometimes deployed as an engineering design process or business process management method. DFSS originated at General Electric to build on the success they had with traditional Six Sigma; but instead of process improvement, DFSS was made to target new product development. It is used in many industries, like finance, marketing, basic engineering, process industries, waste management, and electronics. It is based on the use of statistical tools like linear regression and enables empirical research similar to that performed in other fields, such as social science. While the tools and order used in Six Sigma require a process to be in place and functioning, DFSS has the objective of determining the needs of customers and the business, and driving those needs into the product solution so created. It is used for product or process design in contrast with process improvement. Measurement is the most important part of most Six Sigma or DFSS tools, but whereas in Six Sigma measurements are made from an existing process, DFSS focuses on gaining a deep insight into customer needs and using these to inform every design decision and trade-off.

Computational science, also known as scientific computing, technical computing or scientific computation (SC), is a division of science that uses advanced computing capabilities to understand and solve complex physical problems. This includes the development of algorithms and computational models, the computer hardware that runs them, and the broader computing infrastructure needed to solve scientific problems.

In calculus, symbolic integration is the problem of finding a formula for the antiderivative, or indefinite integral, of a given function f(x), i.e. to find a formula for a differentiable function F(x) such that F'(x) = f(x).
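
For example, using the SymPy computer algebra system (illustrative sketch):

```python
import sympy as sp

x = sp.symbols('x')
f = sp.cos(x) * sp.exp(x)
F = sp.integrate(f, x)                 # symbolic antiderivative
print(F)                               # exp(x)*sin(x)/2 + exp(x)*cos(x)/2
print(sp.simplify(sp.diff(F, x) - f))  # 0, confirming F'(x) = f(x)
```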

In computer science, an algorithm is said to be asymptotically optimal if, roughly speaking, for large inputs it performs at worst a constant factor worse than the best possible algorithm. It is a term commonly encountered in computer science research as a result of widespread use of big-O notation.

The Center for Discrete Mathematics and Theoretical Computer Science (DIMACS) is a collaboration between Rutgers University, Princeton University, and the research firms AT&T, Bell Labs, Applied Communication Sciences, and NEC. It was founded in 1989 with money from the National Science Foundation. Its offices are located on the Rutgers campus, and 250 members from the six institutions form its permanent members.

A surrogate model is an engineering method used when an outcome of interest cannot be easily measured or computed, so an approximate mathematical model of the outcome is used instead. Most engineering design problems require experiments and/or simulations to evaluate design objective and constraint functions as a function of design variables. For example, in order to find the optimal airfoil shape for an aircraft wing, an engineer simulates the airflow around the wing for different shape variables. For many real-world problems, however, a single simulation can take many minutes, hours, or even days to complete. As a result, routine tasks such as design optimization, design space exploration, sensitivity analysis and "what-if" analysis become impossible since they require thousands or even millions of simulation evaluations.
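
A minimal sketch of the idea (illustrative; `expensive_simulation` stands in for a solver that might take hours per run):

```python
import numpy as np

def expensive_simulation(x):
    """Stand-in for a costly solver (hypothetical; a real one might
    take hours per evaluation)."""
    return np.sin(3 * x) + 0.5 * x

# Evaluate the expensive model at a few design points only...
x_train = np.linspace(0.0, 2.0, 8)
y_train = expensive_simulation(x_train)

# ...then fit a cheap polynomial surrogate and query it everywhere else.
surrogate = np.polynomial.Polynomial.fit(x_train, y_train, deg=5)
x_query = np.linspace(0.0, 2.0, 1000)
y_approx = surrogate(x_query)          # thousands of cheap evaluations
print(float(np.max(np.abs(y_approx - expensive_simulation(x_query)))))
```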

The William Davidson Faculty of Industrial Engineering & Management

The Faculty of Data and Decision Sciences is an academic faculty of the Technion and the oldest such department in Israel. The department is currently headed by Prof. Rann Smorodinsky and is based in the Cooper and Bloomfield buildings at Technion City. It employs 52 faculty members and, as of 2023, serves a total of 500 graduate and 1,000 undergraduate students.

The European Symposium on Algorithms (ESA) is an international conference covering the field of algorithms. It has been held annually since 1993, typically in early autumn in a different European location each year. Like most theoretical computer science conferences, its contributions are strongly peer-reviewed; the articles appear in proceedings published in the Springer Lecture Notes in Computer Science series. The acceptance rate in 2012 was 24% in both the Design and Analysis track and the Engineering and Applications track.

In computer science, empirical algorithmics is the practice of using empirical methods to study the behavior of algorithms. The practice combines algorithm development and experimentation: algorithms are not just designed, but also implemented and tested in a variety of situations. In this process, an initial design of an algorithm is analyzed so that the algorithm may be developed in a stepwise manner.

Applied mathematics: Application of mathematical methods to other fields

Applied mathematics is the application of mathematical methods by different fields such as physics, engineering, medicine, biology, finance, business, computer science, and industry. Thus, applied mathematics is a combination of mathematical science and specialized knowledge. The term "applied mathematics" also describes the professional specialty in which mathematicians work on practical problems by formulating and studying mathematical models.

JFLAP

JFLAP is interactive educational software written in Java for experimenting with topics in the computer science area of formal languages and automata theory, primarily intended for use at the undergraduate level or as an advanced topic for high school. JFLAP allows one to create and simulate structures, such as programming a finite state machine, and experiment with proofs, such as converting a nondeterministic finite automaton (NFA) to a deterministic finite automaton (DFA).
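
The NFA-to-DFA conversion that JFLAP demonstrates is the subset construction, sketched below (illustrative code; the example automaton accepts strings whose second-to-last symbol is 1):

```python
from itertools import chain

def nfa_to_dfa(alphabet, delta, start, accept):
    """Subset construction: every reachable DFA state is a frozenset of
    NFA states.  `delta` maps (state, symbol) -> set of successor states
    (epsilon moves omitted for brevity).  Returns the DFA transition
    table, its start state and its accepting states."""
    start_set = frozenset([start])
    dfa_delta, seen, todo = {}, {start_set}, [start_set]
    while todo:
        current = todo.pop()
        for sym in alphabet:
            nxt = frozenset(chain.from_iterable(
                delta.get((q, sym), ()) for q in current))
            dfa_delta[(current, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return dfa_delta, start_set, {s for s in seen if s & accept}

# NFA over {0, 1} accepting strings whose second-to-last symbol is 1
delta = {("a", "0"): {"a"}, ("a", "1"): {"a", "b"},
         ("b", "0"): {"c"}, ("b", "1"): {"c"}}
print(nfa_to_dfa(["0", "1"], delta, "a", {"c"})[2])
```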

This glossary of computer science is a list of definitions of terms and concepts used in computer science, its sub-disciplines, and related fields, including terms relevant to software, data science, and computer programming.

Derivative-Based Global Sensitivity Measures (DGSM) is a technique used in global sensitivity analysis to identify the importance of different subsets of input variables to variation in model output. It is gaining popularity among practitioners due to its link with Sobol' sensitivity indices and its ease of implementation. The computational time required for DGSM is usually much lower than that needed for estimating Sobol' sensitivity indices.
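
A common estimator for the DGSM measure ν_i = E[(∂f/∂x_i)²] combines Monte Carlo sampling with finite differences; a minimal sketch (illustrative, with an invented toy model):

```python
import numpy as np

def dgsm(f, dim, samples=10_000, h=1e-6, seed=0):
    """Estimate nu_i = E[(df/dx_i)^2] over the unit hypercube by central
    finite differences at Monte Carlo sample points."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(size=(samples, dim))
    nu = np.zeros(dim)
    for i in range(dim):
        e = np.zeros(dim); e[i] = h
        grad_i = (f(x + e) - f(x - e)) / (2 * h)
        nu[i] = np.mean(grad_i ** 2)
    return nu

# Toy model: x0 matters far more than x1.
f = lambda x: 5.0 * x[:, 0] + 0.5 * x[:, 1] ** 2
print(dgsm(f, dim=2))  # approx [25.0, E[x1^2]] = [25, ~0.33]
```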

References

  1. 1 2 "Algorithm Engineering", Camil Demetrescu, Irene Finocchi, Giuseppe F. Italiano, web: http://www.dis.uniroma1.it/~demetres/docs/ae.pdf
  2. 1 2 3 4 5 6 7 "Algorithm Engineering – An Attempt at a Definition", Peter Sanders, web: http://algo2.iti.kit.edu/documents/definition.pdf
  3. "Emerging Opportunities for Theoretical Computer Science", Aho, Johnson, Karp, Kosaraju, McGeoch, Papadimitriou, web: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.55.9160
  4. "Workshop on Algorithm Engineering". Archived from the original on 2018-09-27. Retrieved 2009-09-14.
  5. "Towards a Discipline of Experimental Algorithmics", Bernard M. E. Moret, web: http://infoscience.epfl.ch/record/97865/files/dimacs_algorithmics.pdf
  6. Algorithm engineering: 3rd International Workshop, Jeffrey Scott Vitter, Christos D. Zaroliagis, 1999, web: BGoogle-sC.
  7. 1 2 "Workshop on Algorithm Engineering and Experiments" (overview), JHU.edu, 1999, web: JHU-ALENEX99.