Greedy algorithm

Last updated
Greedy algorithms determine the minimum number of coins to give while making change. These are the steps most people would take to emulate a greedy algorithm to represent 36 cents using only coins with values {1, 5, 10, 20}. The coin of the highest value, less than the remaining change owed, is the local optimum. (In general, the change-making problem requires dynamic programming to find an optimal solution; however, most currency systems are special cases where the greedy strategy does find an optimal solution.) Greedy algorithm 36 cents.svg
Greedy algorithms determine the minimum number of coins to give while making change. These are the steps most people would take to emulate a greedy algorithm to represent 36 cents using only coins with values {1, 5, 10, 20}. The coin of the highest value, less than the remaining change owed, is the local optimum. (In general, the change-making problem requires dynamic programming to find an optimal solution; however, most currency systems are special cases where the greedy strategy does find an optimal solution.)

A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage. [1] In many problems, a greedy strategy does not produce an optimal solution, but a greedy heuristic can yield locally optimal solutions that approximate a globally optimal solution in a reasonable amount of time.

Contents

For example, a greedy strategy for the travelling salesman problem (which is of high computational complexity) is the following heuristic: "At each step of the journey, visit the nearest unvisited city." This heuristic does not intend to find the best solution, but it terminates in a reasonable number of steps; finding an optimal solution to such a complex problem typically requires unreasonably many steps. In mathematical optimization, greedy algorithms optimally solve combinatorial problems having the properties of matroids and give constant-factor approximations to optimization problems with the submodular structure.

Specifics

Greedy algorithms produce good solutions on some mathematical problems, but not on others. Most problems for which they work will have two properties:

Greedy choice property
We can make whatever choice seems best at the moment and then solve the subproblems that arise later. The choice made by a greedy algorithm may depend on choices made so far, but not on future choices or all the solutions to the subproblem. It iteratively makes one greedy choice after another, reducing each given problem into a smaller one. In other words, a greedy algorithm never reconsiders its choices. This is the main difference from dynamic programming, which is exhaustive and is guaranteed to find the solution. After every stage, dynamic programming makes decisions based on all the decisions made in the previous stage and may reconsider the previous stage's algorithmic path to the solution.
Optimal substructure
"A problem exhibits optimal substructure if an optimal solution to the problem contains optimal solutions to the sub-problems." [2]

Cases of failure

Examples on how a greedy algorithm may fail to achieve the optimal solution.
Greedy Glouton.svg
Starting from A, a greedy algorithm that tries to find the maximum by following the greatest slope will find the local maximum at "m", oblivious to the global maximum at "M".
Greedy-search-path-example.gif
To reach the largest sum, at each step, the greedy algorithm will choose what appears to be the optimal immediate choice, so it will choose 12 instead of 3 at the second step, and will not reach the best solution, which contains 99.

Greedy algorithms fail to produce the optimal solution for many other problems and may even produce the unique worst possible solution. One example is the travelling salesman problem mentioned above: for each number of cities, there is an assignment of distances between the cities for which the nearest-neighbour heuristic produces the unique worst possible tour. [3] For other possible examples, see horizon effect.

Types

Greedy algorithms can be characterized as being 'short sighted', and also as 'non-recoverable'. They are ideal only for problems that have an 'optimal substructure'. Despite this, for many simple problems, the best-suited algorithms are greedy. It is important, however, to note that the greedy algorithm can be used as a selection algorithm to prioritize options within a search, or branch-and-bound algorithm. There are a few variations to the greedy algorithm:

Theory

Greedy algorithms have a long history of study in combinatorial optimization and theoretical computer science. Greedy heuristics are known to produce suboptimal results on many problems, [4] and so natural questions are:

A large body of literature exists answering these questions for general classes of problems, such as matroids, as well as for specific problems, such as set cover.

Matroids

A matroid is a mathematical structure that generalizes the notion of linear independence from vector spaces to arbitrary sets. If an optimization problem has the structure of a matroid, then the appropriate greedy algorithm will solve it optimally. [5]

Submodular functions

A function defined on subsets of a set is called submodular if for every we have that .

Suppose one wants to find a set which maximizes . The greedy algorithm, which builds up a set by incrementally adding the element which increases the most at each step, produces as output a set that is at least . [6] That is, greedy performs within a constant factor of as good as the optimal solution.

Similar guarantees are provable when additional constraints, such as cardinality constraints, [7] are imposed on the output, though often slight variations on the greedy algorithm are required. See [8] for an overview.

Other problems with guarantees

Other problems for which the greedy algorithm gives a strong guarantee, but not an optimal solution, include

Many of these problems have matching lower bounds; i.e., the greedy algorithm does not perform better than the guarantee in the worst case.

Applications

Greedy algorithms typically (but not always) fail to find the globally optimal solution because they usually do not operate exhaustively on all the data. They can make commitments to certain choices too early, preventing them from finding the best overall solution later. For example, all known greedy coloring algorithms for the graph coloring problem and all other NP-complete problems do not consistently find optimum solutions. Nevertheless, they are useful because they are quick to think up and often give good approximations to the optimum.

If a greedy algorithm can be proven to yield the global optimum for a given problem class, it typically becomes the method of choice because it is faster than other optimization methods like dynamic programming. Examples of such greedy algorithms are Kruskal's algorithm and Prim's algorithm for finding minimum spanning trees and the algorithm for finding optimum Huffman trees.

Greedy algorithms appear in the network routing as well. Using greedy routing, a message is forwarded to the neighbouring node which is "closest" to the destination. The notion of a node's location (and hence "closeness") may be determined by its physical location, as in geographic routing used by ad hoc networks. Location may also be an entirely artificial construct as in small world routing and distributed hash table.

Examples

See also

Related Research Articles

<span class="mw-page-title-main">Travelling salesman problem</span> NP-hard problem in combinatorial optimization

The travelling salesman problem, also known as the travelling salesperson problem (TSP), asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?" It is an NP-hard problem in combinatorial optimization, important in theoretical computer science and operations research.

The nearest neighbor algorithm was one of the first algorithms used to solve the travelling salesman problem approximately. In that problem, the salesman starts at a random city and repeatedly visits the nearest city until all have been visited. The algorithm quickly yields a short tour, but usually not the optimal one.

<span class="mw-page-title-main">Mathematical optimization</span> Study of mathematical algorithms for optimization problems

Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.

In computer science, local search is a heuristic method for solving computationally hard optimization problems. Local search can be used on problems that can be formulated as finding a solution maximizing a criterion among a number of candidate solutions. Local search algorithms move from solution to solution in the space of candidate solutions by applying local changes, until a solution deemed optimal is found or a time bound is elapsed.

<span class="mw-page-title-main">Combinatorial optimization</span> Subfield of mathematical optimization

Combinatorial optimization is a subfield of mathematical optimization that consists of finding an optimal object from a finite set of objects, where the set of feasible solutions is discrete or can be reduced to a discrete set. Typical combinatorial optimization problems are the travelling salesman problem ("TSP"), the minimum spanning tree problem ("MST"), and the knapsack problem. In many such problems, such as the ones previously mentioned, exhaustive search is not tractable, and so specialized algorithms that quickly rule out large parts of the search space or approximation algorithms must be resorted to instead.

Branch and bound is a method for solving optimization problems by breaking them down into smaller sub-problems and using a bounding function to eliminate sub-problems that cannot contain the optimal solution. It is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root. The algorithm explores branches of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated bounds on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm.

In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems with provable guarantees on the distance of the returned solution to the optimal one. Approximation algorithms naturally arise in the field of theoretical computer science as a consequence of the widely believed P ≠ NP conjecture. Under this conjecture, a wide class of optimization problems cannot be solved exactly in polynomial time. The field of approximation algorithms, therefore, tries to understand how closely it is possible to approximate optimal solutions to such problems in polynomial time. In an overwhelming majority of the cases, the guarantee of such algorithms is a multiplicative one expressed as an approximation ratio or approximation factor i.e., the optimal solution is always guaranteed to be within a (predetermined) multiplicative factor of the returned solution. However, there are also many approximation algorithms that provide an additive guarantee on the quality of the returned solution. A notable example of an approximation algorithm that provides both is the classic approximation algorithm of Lenstra, Shmoys and Tardos for scheduling on unrelated parallel machines.

In combinatorics, a greedoid is a type of set system. It arises from the notion of the matroid, which was originally introduced by Whitney in 1935 to study planar graphs and was later used by Edmonds to characterize a class of optimization problems that can be solved by greedy algorithms. Around 1980, Korte and Lovász introduced the greedoid to further generalize this characterization of greedy algorithms; hence the name greedoid. Besides mathematical optimization, greedoids have also been connected to graph theory, language theory, order theory, and other areas of mathematics.

In mathematics, engineering, computer science and economics, an optimization problem is the problem of finding the best solution from all feasible solutions.

<span class="mw-page-title-main">Jack Edmonds</span> American/Canadian mathematician and computer scientist

Jack R. Edmonds is an American-born and educated computer scientist and mathematician who lived and worked in Canada for much of his life. He has made fundamental contributions to the fields of combinatorial optimization, polyhedral combinatorics, discrete mathematics and the theory of computing. He was the recipient of the 1985 John von Neumann Theory Prize.

<span class="mw-page-title-main">Linear programming relaxation</span>

In mathematics, the relaxation of a (mixed) integer linear program is the problem that arises by removing the integrality constraint of each variable.

In combinatorics, a branch of mathematics, a weighted matroid is a matroid endowed with a function that assigns a weight to each element. Formally, let be a matroid, where E is the set of elements and I is the family of independent set. A weighted matroid has a weight function for assigns a strictly positive weight to each element of . We extend the function to subsets of by summation; is the sum of over in .

In mathematical optimization and computer science, heuristic is a technique designed for problem solving more quickly when classic methods are too slow for finding an exact or approximate solution, or when classic methods fail to find any exact solution in a search space. This is achieved by trading optimality, completeness, accuracy, or precision for speed. In a way, it can be considered a shortcut.

In combinatorial optimization, the matroid intersection problem is to find a largest common independent set in two matroids over the same ground set. If the elements of the matroid are assigned real weights, the weighted matroid intersection problem is to find a common independent set with the maximum possible weight. These problems generalize many problems in combinatorial optimization including finding maximum matchings and maximum weight matchings in bipartite graphs and finding arborescences in directed graphs.

In mathematics, a submodular set function is a set function that, informally, describes the relationship between a set of inputs and an output, where adding more of one input has a decreasing additional benefit. The natural diminishing returns property which makes them suitable for many applications, including approximation algorithms, game theory and electrical networks. Recently, submodular functions have also found utility in several real world problems in machine learning and artificial intelligence, including automatic summarization, multi-document summarization, feature selection, active learning, sensor placement, image collection summarization and many other domains.

In the mathematical theory of matroids, the rank of a matroid is the maximum size of an independent set in the matroid. The rank of a subset S of elements of the matroid is, similarly, the maximum size of an independent subset of S, and the rank function of the matroid maps sets of elements to their ranks.

<span class="mw-page-title-main">Matroid parity problem</span> Largest independent set of paired elements

In combinatorial optimization, the matroid parity problem is a problem of finding the largest independent set of paired elements in a matroid. The problem was formulated by Lawler (1976) as a common generalization of graph matching and matroid intersection. It is also known as polymatroid matching, or the matchoid problem.

The vertex k-center problem is a classical NP-hard problem in computer science. It has application in facility location and clustering. Basically, the vertex k-center problem models the following real problem: given a city with facilities, find the best facilities where to build fire stations. Since firemen must attend any emergency as quickly as possible, the distance from the farthest facility to its nearest fire station has to be as small as possible. In other words, the position of the fire stations must be such that every possible fire is attended as quickly as possible.

Egalitarian item allocation, also called max-min item allocation is a fair item allocation problem, in which the fairness criterion follows the egalitarian rule. The goal is to maximize the minimum value of an agent. That is, among all possible allocations, the goal is to find an allocation in which the smallest value of an agent is as large as possible. In case there are two or more allocations with the same smallest value, then the goal is to select, from among these allocations, the one in which the second-smallest value is as large as possible, and so on. Therefore, an egalitarian item allocation is sometimes called a leximin item allocation.

The welfare maximization problem is an optimization problem studied in economics and computer science. Its goal is to partition a set of items among agents with different utility functions, such that the welfare – defined as the sum of the agents' utilities – is as high as possible. In other words, the goal is to find an item allocation satisfying the utilitarian rule.

References

  1. Black, Paul E. (2 February 2005). "greedy algorithm". Dictionary of Algorithms and Data Structures. U.S. National Institute of Standards and Technology (NIST). Retrieved 17 August 2012.
  2. Cormen et al. 2001 , Ch. 16
  3. Gutin, Gregory; Yeo, Anders; Zverovich, Alexey (2002). "Traveling salesman should not be greedy: Domination analysis of greedy-type heuristics for the TSP". Discrete Applied Mathematics. 117 (1–3): 81–86. doi: 10.1016/S0166-218X(01)00195-0 .
  4. Feige 1998
  5. Papadimitriou & Steiglitz 1998
  6. Nemhauser, Wolsey & Fisher 1978
  7. Buchbinder et al. 2014
  8. Krause & Golovin 2014
  9. "Lecture 5: Introduction to Approximation Algorithms" (PDF). Advanced Algorithms (2IL45) — Course Notes. TU Eindhoven. Archived (PDF) from the original on 2022-10-09.

Sources