Combinatorial explosion

Last updated

In mathematics, a combinatorial explosion is the rapid growth of the complexity of a problem due to how the combinatorics of the problem is affected by the input, constraints, and bounds of the problem. Combinatorial explosion is sometimes used to justify the intractability of certain problems. [1] [2] Examples of such problems include certain mathematical functions, the analysis of some puzzles and games, and some pathological examples which can be modelled as the Ackermann function.

Contents

Examples

Latin squares

A Latin square of order n is an n×n array with entries from a set of n elements with the property that each element of the set occurs exactly once in each row and each column of the array. An example of a Latin square of order three is given by,

123
231
312

A common example of a Latin square would be a completed Sudoku puzzle. [3] A Latin square is a combinatorial object (as opposed to an algebraic object) since only the arrangement of entries matters and not what the entries actually are. The number of Latin squares as a function of the order (independent of the set from which the entries are drawn) (sequence A002860 in the OEIS ) provides an example of combinatorial explosion as illustrated by the following table.

nThe number of Latin squares of order n
11
22
312
4576
5161,280
6812,851,200
761,479,419,904,000
8108,776,032,459,082,956,800
95,524,751,496,156,892,842,531,225,600
109,982,437,658,213,039,871,725,064,756,920,320,000
11776,966,836,171,770,144,107,444,346,734,230,682,311,065,600,000

Sudoku

A combinatorial explosion can also occur in some puzzles played on a grid, such as Sudoku. [2] A Sudoku is a type of Latin square with the additional property that each element occurs exactly once in sub-sections of size n ×n (called boxes). Combinatorial explosion occurs as n increases, creating limits to the properties of Sudokus that can be constructed, analyzed, and solved, as illustrated in the following table.

nThe number of Sudoku grids of order n
(boxes are sizen×n)
The number of Latin squares of order n
(for comparison)
11 1
4288 [4] 576
96,670,903,752,021,072,936,960 [4] [5] 5,524,751,496,156,892,842,531,225,600
(n = 9 is the commonly played 9×9 Sudoku. The puzzle does not include grids where n is irrational.)

Games

One example in a game where combinatorial complexity leads to a solvability limit is in solving chess (a game with 64 squares and 32 pieces). Chess is not a solved game. In 2005 all chess game endings with six pieces or fewer were solved, showing the result of each position if played perfectly. It took ten more years to complete the tablebase with one more chess piece added, thus completing a 7-piece tablebase. Adding one more piece to a chess ending (thus making an 8-piece tablebase) is considered intractable due to the added combinatorial complexity. [6] [7]

Furthermore, the prospect of solving larger chess-like games becomes more difficult as the board-size is increased, such as in large chess variants, and infinite chess. [8]

Computing

Combinatorial explosion can occur in computing environments in a way analogous to communications and multi-dimensional space. Imagine a simple system with only one variable, a boolean called A. The system has two possible states, A = true or A = false. Adding another boolean variable B will give the system four possible states, A = true and B = true, A = true and B = false, A = false and B = true, A = false and B = false. A system with n booleans has 2n possible states, while a system of n variables each with Z allowed values (rather than just the 2 (true and false) of booleans) will have Zn possible states.

The possible states can be thought of as the leaf nodes of a tree of height n, where each node has Z children. This rapid increase of leaf nodes can be useful in areas like searching, since many results can be accessed without having to descend very far. It can also be a hindrance when manipulating such structures.

A class hierarchy in an object-oriented language can be thought of as a tree, with different types of object inheriting from their parents. If different classes need to be combined, such as in a comparison (like A < B) then the number of possible combinations which may occur explodes. If each type of comparison needs to be programmed then this soon becomes intractable for even small numbers of classes. Multiple inheritance can solve this, by allowing subclasses to have multiple parents, and thus a few parent classes can be considered rather than every child, without disrupting any existing hierarchy.

An example is a taxonomy where different vegetables inherit from their ancestor species. Attempting to compare the tastiness of each vegetable with the others becomes intractable since the hierarchy only contains information about genetics and makes no mention of tastiness. However, instead of having to write comparisons for carrot/carrot, carrot/potato, carrot/sprout, potato/potato, potato/sprout, sprout/sprout, they can all multiply inherit from a separate class of tasty whilst keeping their current ancestor-based hierarchy, then all of the above can be implemented with only a tasty/tasty comparison.

Arithmetic

Suppose we take the factorial of n:

Then 1! = 1, 2! = 2, 3! = 6, and 4! = 24. However, we quickly get to extremely large numbers, even for relatively small n. For example, 100! ≈ 9.33262154×10157, a number so large that it cannot be displayed on most calculators, and vastly larger than the estimated number of fundamental particles in the observable universe. [9]

Communication

Using separate lines of communication, four organizations require six channels 4x2.svg
Using separate lines of communication, four organizations require six channels
Using an intermediary, only one channel per organization is required 4xn.svg
Using an intermediary, only one channel per organization is required

In administration and computing, a combinatorial explosion is the rapidly accelerating increase in communication lines as organizations are added in a process. (This growth is often casually described as "exponential" but is actually polynomial.)

If two organizations need to communicate about a particular topic, it may be easiest to communicate directly in an ad hoc manneronly one channel of communication is required. However, if a third organization is added, three separate channels are required. Adding a fourth organization requires six channels; five, ten; six, fifteen; etc.

In general, it will take communication lines for n organizations, which is just the number of 2-combinations of n elements (see also Binomial coefficient). [10]

The alternative approach is to realize when this communication will not be a one-off requirement, and produce a generic or intermediate way of passing information. The drawback is that this requires more work for the first pair, since each must convert its internal approach to the common one, rather than the superficially easier approach of just understanding the other.

See also

Related Research Articles

In logic and computer science, the Boolean satisfiability problem is the problem of determining if there exists an interpretation that satisfies a given Boolean formula. In other words, it asks whether the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE. If this is the case, the formula is called satisfiable. On the other hand, if no such assignment exists, the function expressed by the formula is FALSE for all possible variable assignments and the formula is unsatisfiable. For example, the formula "a AND NOT b" is satisfiable because one can find the values a = TRUE and b = FALSE, which make = TRUE. In contrast, "a AND NOT a" is unsatisfiable.

In theoretical computer science, computational complexity theory focuses on classifying computational problems according to their resource usage, and relating these classes to each other. A computational problem is a task solved by a computer. A computation problem is solvable by mechanical application of mathematical steps, such as an algorithm.

In computational complexity theory, a decision problem is PSPACE-complete if it can be solved using an amount of memory that is polynomial in the input length and if every other problem that can be solved in polynomial space can be transformed to it in polynomial time. The problems that are PSPACE-complete can be thought of as the hardest problems in PSPACE, the class of decision problems solvable in polynomial space, because a solution to any one such problem could easily be used to solve any other problem in PSPACE.

In computer science, brute-force search or exhaustive search, also known as generate and test, is a very general problem-solving technique and algorithmic paradigm that consists of systematically enumerating all possible candidates for the solution and checking whether each candidate satisfies the problem's statement.

Backtracking is a general algorithm for finding solutions to some computational problems, notably constraint satisfaction problems, that incrementally builds candidates to the solutions, and abandons a candidate ("backtracks") as soon as it determines that the candidate cannot possibly be completed to a valid solution.

Combinatorial game theory Branch of game theory about two-player sequential games with perfect information

Combinatorial game theory is a branch of mathematics and theoretical computer science that typically studies sequential games with perfect information. Study has been largely confined to two-player games that have a position that the players take turns changing in defined ways or moves to achieve a defined winning condition. Combinatorial game theory has not traditionally studied games of chance or those that use imperfect or incomplete information, favoring games that offer perfect information in which the state of the game and the set of available moves is always known by both players. However, as mathematical techniques advance, the types of game that can be mathematically analyzed expands, thus the boundaries of the field are ever changing. Scholars will generally define what they mean by a "game" at the beginning of a paper, and these definitions often vary as they are specific to the game being analyzed and are not meant to represent the entire scope of the field.

In computer science, parameterized complexity is a branch of computational complexity theory that focuses on classifying computational problems according to their inherent difficulty with respect to multiple parameters of the input or output. The complexity of a problem is then measured as a function of those parameters. This allows the classification of NP-hard problems on a finer scale than in the classical setting, where the complexity of a problem is only measured as a function of the number of bits in the input. The first systematic work on parameterized complexity was done by Downey & Fellows (1999).

In computational complexity theory, the Cook–Levin theorem, also known as Cook's theorem, states that the Boolean satisfiability problem is NP-complete. That is, it is in NP, and any problem in NP can be reduced in polynomial time by a deterministic Turing machine to the Boolean satisfiability problem.

Generalized game Game generalized so that it can be played on a board or grid of any size

In computational complexity theory, a generalized game is a game or puzzle that has been generalized so that it can be played on a board or grid of any size. For example, generalized chess is the game of chess played on an board, with pieces on each side. Generalized Sudoku includes Sudokus constructed on an grid.

Kakuro

Kakuro or Kakkuro or Kakoro is a kind of logic puzzle that is often referred to as a mathematical transliteration of the crossword. Kakuro puzzles are regular features in many math-and-logic puzzle publications across the world. In 1966, Canadian Jacob E. Funk, an employee of Dell Magazines, came up with the original English name Cross Sums and other names such as Cross Addition have also been used, but the Japanese name Kakuro, abbreviation of Japanese kasan kurosu, seems to have gained general acceptance and the puzzles appear to be titled this way now in most publications. The popularity of Kakuro in Japan is immense, second only to Sudoku among Nikoli's famed logic-puzzle offerings.

Sudoku Logic-based number-placement puzzle

Sudoku is a logic-based, combinatorial number-placement puzzle. In classic Sudoku, the objective is to fill a 9 × 9 grid with digits so that each column, each row, and each of the nine 3 × 3 subgrids that compose the grid contain all of the digits from 1 to 9. The puzzle setter provides a partially completed grid, which for a well-posed puzzle has a single solution.

In computational complexity theory, Karp's 21 NP-complete problems are a set of computational problems which are NP-complete. In his 1972 paper, "Reducibility Among Combinatorial Problems", Richard Karp used Stephen Cook's 1971 theorem that the boolean satisfiability problem is NP-complete to show that there is a polynomial time many-one reduction from the boolean satisfiability problem to each of 21 combinatorial and graph theoretical computational problems, thereby showing that they are all NP-complete. This was one of the first demonstrations that many natural computational problems occurring throughout computer science are computationally intractable, and it drove interest in the study of NP-completeness and the P versus NP problem.

Endgame tablebase Database of precalculated chess analysis

An endgame tablebase is a computerized database that contains precalculated exhaustive analysis of chess endgame positions. It is typically used by a computer chess engine during play, or by a human or computer that is retrospectively analysing a game that has already been played.

Mathematics of Sudoku Mathematical investigation of Sudoku

Sudoku puzzles can be studied mathematically to answer questions such as "How many filled Sudoku grids are there?", "What is the minimal number of clues in a valid puzzle?" and "In what ways can Sudoku grids be symmetric?" through the use of combinatorics and group theory.

Sudoku solving algorithms Algorithms to complete a sudoku

A standard Sudoku contains 81 cells, in a 9×9 grid, and has 9 boxes, each box being the intersection of the first, middle, or last 3 rows, and the first, middle, or last 3 columns. Each cell may contain a number from one to nine, and each number can only occur once in each row, column, and box. A Sudoku starts with some cells containing numbers (clues), and the goal is to solve the remaining cells. Proper Sudokus have one solution. Players and investigators use a wide range of computer algorithms to solve Sudokus, study their properties, and make new puzzles, including Sudokus with interesting symmetries and other properties.

In computational complexity theory, the language TQBF is a formal language consisting of the true quantified Boolean formulas. A (fully) quantified Boolean formula is a formula in quantified propositional logic where every variable is quantified, using either existential or universal quantifiers, at the beginning of the sentence. Such a formula is equivalent to either true or false. If such a formula evaluates to true, then that formula is in the language TQBF. It is also known as QSAT.

In computer science, the Sharp Satisfiability Problem is the problem of counting the number of interpretations that satisfies a given Boolean formula, introduced by Valiant in 1979. In other words, it asks in how many ways the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE. For example, the formula is satisfiable by three distinct boolean value assignments of the variables, namely, for any of the assignments, ,
, we have = TRUE.

Solving chess means finding an optimal strategy for the game of chess, that is, one by which one of the players can always force a victory, or either can force a draw. It also means more generally solving chess-like games, such as Capablanca chess and infinite chess. According to Zermelo's theorem, a determinable optimal strategy must exist for chess and chess-like games.

Taking Sudoku Seriously: The math behind the world's most popular pencil puzzle is a book on the mathematics of Sudoku. It was written by Jason Rosenhouse and Laura Taalman, and published in 2011 by the Oxford University Press. The Basic Library List Committee of the Mathematical Association of America has suggested its inclusion in undergraduate mathematics libraries. It was the 2012 winner of the PROSE Awards in the popular science and popular mathematics category.

Games, Puzzles, and Computation is a book on game complexity, written by Robert Hearn and Erik Demaine, and published in 2009 by A K Peters. It is revised from Hearn's doctoral dissertation, which was supervised by Demaine. The Basic Library List Committee of the Mathematical Association of America has recommended it for inclusion in undergraduate mathematics libraries.

References

  1. Krippendorff, Klaus. "Combinatorial Explosion". Web Dictionary of Cybernetics and Systems. PRINCIPIA CYBERNETICA WEB. Retrieved 29 November 2010.
  2. 1 2 http://intelligence.worldofcomputing/combinatorial-explosion Combinatorial Explosion.
  3. All completed puzzles are Latin squares, but not all Latin squares can be completed puzzles since there is additional structure in a Sudoku puzzle.
  4. 1 2 Sloane, N. J. A. (ed.). "SequenceA107739(Number of (completed) sudokus (or Sudokus) of size n^2 X n^2)". The On-Line Encyclopedia of Integer Sequences . OEIS Foundation. Retrieved 14 April 2017.
  5. "Sudoku enumeration problems". Afjarvis.staff.shef.ac.uk. Retrieved 20 October 2013.
  6. http://chessok.com/Lomonosov Endgame Tablebases Lomonosov Endgame Tablebases
  7. "7-piece-endgame-tablebase (chess)". Stack Exchange .
  8. Aviezri Fraenkel; D. Lichtenstein (1981), "Computing a perfect strategy for n×n chess requires time exponential in n", J. Combin. Theory Ser. A, 31 (2): 199–214, doi: 10.1016/0097-3165(81)90016-9
  9. "The Universe By Numbers - The Physics of the Universe". www.physicsoftheuniverse.com. Retrieved 2021-08-27.
  10. Benson, Tim. (2010). Principles of health interoperability HL7 and SNOMED. New York: Springer. p. 23. ISBN   9781848828032. OCLC   663097524.