Complexity characterizes the behavior of a system or model whose components interact in multiple ways and follow local rules, leading to non-linearity, randomness, collective dynamics, hierarchy, and emergence. [1] [2]
The term is generally used to characterize something with many parts where those parts interact with each other in multiple ways, culminating in a higher order of emergence greater than the sum of its parts. The study of these complex linkages at various scales is the main goal of complex systems theory.
The intuitive criterion of complexity can be formulated as follows: a system would be more complex if more parts could be distinguished, and if more connections between them existed. [3]
As of 2010 [update] , a number of approaches to characterizing complexity have been used in science; Zayed et al. [4] reflect many of these. Neil Johnson states that "even among scientists, there is no unique definition of complexity – and the scientific notion has traditionally been conveyed using particular examples..." Ultimately Johnson adopts the definition of "complexity science" as "the study of the phenomena which emerge from a collection of interacting objects". [5]
Definitions of complexity often depend on the concept of a "system" – a set of parts or elements that have relationships among them differentiated from relationships with other elements outside the relational regime. Many definitions tend to postulate or assume that complexity expresses a condition of numerous elements in a system and numerous forms of relationships among the elements. However, what one sees as complex and what one sees as simple is relative and changes with time.
Warren Weaver posited in 1948 two forms of complexity: disorganized complexity, and organized complexity. [6] Phenomena of 'disorganized complexity' are treated using probability theory and statistical mechanics, while 'organized complexity' deals with phenomena that escape such approaches and confront "dealing simultaneously with a sizable number of factors which are interrelated into an organic whole". [6] Weaver's 1948 paper has influenced subsequent thinking about complexity. [7]
The approaches that embody concepts of systems, multiple elements, multiple relational regimes, and state spaces might be summarized as implying that complexity arises from the number of distinguishable relational regimes (and their associated state spaces) in a defined system.
Some definitions relate to the algorithmic basis for the expression of a complex phenomenon or model or mathematical expression, as later set out herein.
One of the problems in addressing complexity issues has been formalizing the intuitive conceptual distinction between the large number of variances in relationships extant in random collections, and the sometimes large, but smaller, number of relationships between elements in systems where constraints (related to correlation of otherwise independent elements) simultaneously reduce the variations from element independence and create distinguishable regimes of more-uniform, or correlated, relationships, or interactions.
Weaver perceived and addressed this problem, in at least a preliminary way, in drawing a distinction between "disorganized complexity" and "organized complexity".
In Weaver's view, disorganized complexity results from the particular system having a very large number of parts, say millions of parts, or many more. Though the interactions of the parts in a "disorganized complexity" situation can be seen as largely random, the properties of the system as a whole can be understood by using probability and statistical methods.
A prime example of disorganized complexity is a gas in a container, with the gas molecules as the parts. Some would suggest that a system of disorganized complexity may be compared with the (relative) simplicity of planetary orbits – the latter can be predicted by applying Newton's laws of motion. Of course, most real-world systems, including planetary orbits, eventually become theoretically unpredictable even using Newtonian dynamics; as discovered by modern chaos theory. [8]
Organized complexity, in Weaver's view, resides in nothing else than the non-random, or correlated, interaction between the parts. These correlated relationships create a differentiated structure that can, as a system, interact with other systems. The coordinated system manifests properties not carried or dictated by individual parts. The organized aspect of this form of complexity in regards to other systems, rather than the subject system, can be said to "emerge," without any "guiding hand".
The number of parts does not have to be very large for a particular system to have emergent properties. A system of organized complexity may be understood in its properties (behavior among the properties) through modeling and simulation, particularly modeling and simulation with computers. An example of organized complexity is a city neighborhood as a living mechanism, with the neighborhood people among the system's parts. [9]
There are generally rules which can be invoked to explain the origin of complexity in a given system.
The source of disorganized complexity is the large number of parts in the system of interest, and the lack of correlation between elements in the system.
In the case of self-organizing living systems, usefully organized complexity comes from beneficially mutated organisms being selected to survive by their environment for their differential reproductive ability or at least success over inanimate matter or less organized complex organisms. See e.g. Robert Ulanowicz's treatment of ecosystems. [10]
Complexity of an object or system is a relative property. For instance, for many functions (problems), such a computational complexity as time of computation is smaller when multitape Turing machines are used than when Turing machines with one tape are used. Random Access Machines allow one to even more decrease time complexity (Greenlaw and Hoover 1998: 226), while inductive Turing machines can decrease even the complexity class of a function, language or set (Burgin 2005). This shows that tools of activity can be an important factor of complexity.
In several scientific fields, "complexity" has a precise meaning:
Other fields introduce less precisely defined notions of complexity:
Complexity has always been a part of our environment, and therefore many scientific fields have dealt with complex systems and phenomena. From one perspective, that which is somehow complex – displaying variation without being random – is most worthy of interest given the rewards found in the depths of exploration.
The use of the term complex is often confused with the term complicated. In today's systems, this is the difference between myriad connecting "stovepipes" and effective "integrated" solutions. [17] This means that complex is the opposite of independent, while complicated is the opposite of simple.
While this has led some fields to come up with specific definitions of complexity, there is a more recent movement to regroup observations from different fields to study complexity in itself, whether it appears in anthills, human brains or social systems. [18] One such interdisciplinary group of fields is relational order theories.
The behavior of a complex system is often said to be due to emergence and self-organization. Chaos theory has investigated the sensitivity of systems to variations in initial conditions as one cause of complex behaviour.
Recent developments in artificial life, evolutionary computation and genetic algorithms have led to an increasing emphasis on complexity and complex adaptive systems.
In social science, the study on the emergence of macro-properties from the micro-properties, also known as macro-micro view in sociology. The topic is commonly recognized as social complexity that is often related to the use of computer simulation in social science, i.e. computational sociology.
Systems theory has long been concerned with the study of complex systems (in recent times, complexity theory and complex systems have also been used as names of the field). These systems are present in the research of a variety disciplines, including biology, economics, social studies and technology. Recently, complexity has become a natural domain of interest of real world socio-cognitive systems and emerging systemics research. Complex systems tend to be high-dimensional, non-linear, and difficult to model. In specific circumstances, they may exhibit low-dimensional behaviour.
In information theory, algorithmic information theory is concerned with the complexity of strings of data.
Complex strings are harder to compress. While intuition tells us that this may depend on the codec used to compress a string (a codec could be theoretically created in any arbitrary language, including one in which the very small command "X" could cause the computer to output a very complicated string like "18995316"), any two Turing-complete languages can be implemented in each other, meaning that the length of two encodings in different languages will vary by at most the length of the "translation" language – which will end up being negligible for sufficiently large data strings.
These algorithmic measures of complexity tend to assign high values to random noise. However, under a certain understanding of complexity, arguably the most intuitive one, random noise is meaningless and so not complex at all.
Information entropy is also sometimes used in information theory as indicative of complexity, but entropy is also high for randomness. In the case of complex systems, information fluctuation complexity was designed so as not to measure randomness as complex and has been useful in many applications. More recently, a complexity metric was developed for images that can avoid measuring noise as complex by using the minimum description length principle. [19]
There has also been interest in measuring the complexity of classification problems in supervised machine learning. This can be useful in meta-learning to determine for which data sets filtering (or removing suspected noisy instances from the training set) is the most beneficial [20] and could be expanded to other areas. For binary classification, such measures can consider the overlaps in feature values from differing classes, the separability of the classes, and measures of geometry, topology, and density of manifolds. [21]
For non-binary classification problems, instance hardness [22] is a bottom-up approach that first seeks to identify instances that are likely to be misclassified (assumed to be the most complex). The characteristics of such instances are then measured using supervised measures such as the number of disagreeing neighbors or the likelihood of the assigned class label given the input features.
A recent study based on molecular simulations and compliance constants describes molecular recognition as a phenomenon of organisation. [23] Even for small molecules like carbohydrates, the recognition process can not be predicted or designed even assuming that each individual hydrogen bond's strength is exactly known.
Driving from the law of requisite variety, Boisot and McKelvey formulated the ‘Law of Requisite Complexity’, that holds that, in order to be efficaciously adaptive, the internal complexity of a system must match the external complexity it confronts. [24]
The application in project management of the Law of Requisite Complexity, as proposed by Stefan Morcov, is the analysis of positive, appropriate and negative complexity. [25] [26]
Project complexity is the property of a project which makes it difficult to understand, foresee, and keep under control its overall behavior, even when given reasonably complete information about the project system. [27] [28]
Maik Maurer considers complexity as a reality in engineering. He proposed a methodology for managing complexity in systems engineering [29] :
1. Define the system.
2. Identify the type of complexity.
3. Determine the strategy.
4. Determine the method.
5. Model the system.
6. Implement the method.
Computational complexity theory is the study of the complexity of problems – that is, the difficulty of solving them. Problems can be classified by complexity class according to the time it takes for an algorithm – usually a computer program – to solve them as a function of the problem size. Some problems are difficult to solve, while others are easy. For example, some difficult problems need algorithms that take an exponential amount of time in terms of the size of the problem to solve. Take the travelling salesman problem, for example. It can be solved, as denoted in Big O notation, in time (where n is the size of the network to visit – the number of cities the travelling salesman must visit exactly once). As the size of the network of cities grows, the time needed to find the route grows (more than) exponentially.
Even though a problem may be computationally solvable in principle, in actual practice it may not be that simple. These problems might require large amounts of time or an inordinate amount of space. Computational complexity may be approached from many different aspects. Computational complexity can be investigated on the basis of time, memory or other resources used to solve the problem. Time and space are two of the most important and popular considerations when problems of complexity are analyzed.
There exist a certain class of problems that although they are solvable in principle they require so much time or space that it is not practical to attempt to solve them. These problems are called intractable.
There is another form of complexity called hierarchical complexity. It is orthogonal to the forms of complexity discussed so far, which are called horizontal complexity.
The concept of complexity is being increasingly used in the study of cosmology, big history, and cultural evolution with increasing granularity, as well as increasing quantification.
Eric Chaisson has advanced a cosmoglogical complexity [30] metric which he terms Energy Rate Density. [31] This approach has been expanded in various works, most recently applied to measuring evolving complexity of nation-states and their growing cities. [32]
In algorithmic information theory, the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program that produces the object as output. It is a measure of the computational resources needed to specify the object, and is also known as algorithmic complexity, Solomonoff–Kolmogorov–Chaitin complexity, program-size complexity, descriptive complexity, or algorithmic entropy. It is named after Andrey Kolmogorov, who first published on the subject in 1963 and is a generalization of classical information theory.
In the computer science subfield of algorithmic information theory, a Chaitin constant or halting probability is a real number that, informally speaking, represents the probability that a randomly constructed program will halt. These numbers are formed from a construction due to Gregory Chaitin.
In computer science, the computational complexity or simply complexity of an algorithm is the amount of resources required to run it. Particular focus is given to computation time and memory storage requirements. The complexity of a problem is the complexity of the best algorithms that allow solving the problem.
In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage, and explores the relationships between these classifications. A computational problem is a task solved by a computer. A computation problem is solvable by mechanical application of mathematical steps, such as an algorithm.
Information theory is the mathematical study of the quantification, storage, and communication of information. The field was established and put on a firm footing by Claude Shannon in the 1940s, though early contributions were made in the 1920s through the works of Harry Nyquist and Ralph Hartley. It is at the intersection of electronic engineering, mathematics, statistics, computer science, neurobiology, physics, and electrical engineering.
In information theory, the entropy of a random variable quantifies the average level of uncertainty or information associated with the variable's potential states or possible outcomes. This measures the expected amount of information needed to describe the state of the variable, considering the distribution of probabilities across all potential states. Given a discrete random variable , which takes values in the set and is distributed according to , the entropy is where denotes the sum over the variable's possible values. The choice of base for , the logarithm, varies for different applications. Base 2 gives the unit of bits, while base e gives "natural units" nat, and base 10 gives units of "dits", "bans", or "hartleys". An equivalent definition of entropy is the expected value of the self-information of a variable.
Quantum information is the information of the state of a quantum system. It is the basic entity of study in quantum information theory, and can be manipulated using quantum information processing techniques. Quantum information refers to both the technical definition in terms of Von Neumann entropy and the general computational term.
In philosophy, systems theory, science, and art, emergence occurs when a complex entity has properties or behaviors that its parts do not have on their own, and emerge only when they interact in a wider whole.
In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems via biologically inspired operators such as selection, crossover, and mutation. Some examples of GA applications include optimizing decision trees for better performance, solving sudoku puzzles, hyperparameter optimization, and causal inference.
The concept of a random sequence is essential in probability theory and statistics. The concept generally relies on the notion of a sequence of random variables and many statistical discussions begin with the words "let X1,...,Xn be independent random variables...". Yet as D. H. Lehmer stated in 1951: "A random sequence is a vague notion... in which each term is unpredictable to the uninitiated and whose digits pass a certain number of tests traditional with statisticians".
Reductionism is any of several related philosophical ideas regarding the associations between phenomena which can be described in terms of simpler or more fundamental phenomena. It is also described as an intellectual and philosophical position that interprets a complex system as the sum of its parts.
Theoretical computer science is a subfield of computer science and mathematics that focuses on the abstract and mathematical foundations of computation.
Ray Solomonoff was an American mathematician who invented algorithmic probability, his General Theory of Inductive Inference, and was a founder of algorithmic information theory. He was an originator of the branch of artificial intelligence based on machine learning, prediction and probability. He circulated the first report on non-semantic machine learning in 1956.
In algorithmic information theory, algorithmic probability, also known as Solomonoff probability, is a mathematical method of assigning a prior probability to a given observation. It was invented by Ray Solomonoff in the 1960s. It is used in inductive inference theory and analyses of algorithms. In his general theory of inductive inference, Solomonoff uses the method together with Bayes' rule to obtain probabilities of prediction for an algorithm's future outputs.
Solomonoff's theory of inductive inference proves that, under its common sense assumptions (axioms), the best possible scientific model is the shortest algorithm that generates the empirical data under consideration. In addition to the choice of data, other assumptions are that, to avoid the post-hoc fallacy, the programming language must be chosen prior to the data and that the environment being observed is generated by an unknown algorithm. This is also called a theory of induction. Due to its basis in the dynamical character of Algorithmic Information Theory, it encompasses statistical as well as dynamical information criteria for model selection. It was introduced by Ray Solomonoff, based on probability theory and theoretical computer science. In essence, Solomonoff's induction derives the posterior probability of any computable theory, given a sequence of observed data. This posterior probability is derived from Bayes' rule and some universal prior, that is, a prior that assigns a positive probability to any computable theory.
Specified complexity is a creationist argument introduced by William Dembski, used by advocates to promote the pseudoscience of intelligent design. According to Dembski, the concept can formalize a property that singles out patterns that are both specified and complex, where in Dembski's terminology, a specified pattern is one that admits short descriptions, whereas a complex pattern is one that is unlikely to occur by chance. An example cited by Dembski is a poker hand, where for example the repeated appearance of a royal flush will raise suspicion of cheating. Proponents of intelligent design use specified complexity as one of their two main arguments, along with irreducible complexity.
In computational complexity and optimization the no free lunch theorem is a result that states that for certain types of mathematical problems, the computational cost of finding a solution, averaged over all problems in the class, is the same for any solution method. The name alludes to the saying "no such thing as a free lunch", that is, no method offers a "short cut". This is under the assumption that the search space is a probability density function. It does not apply to the case where the search space has underlying structure that can be exploited more efficiently than random search or even has closed-form solutions that can be determined without search at all. For such probabilistic assumptions, the outputs of all procedures solving a particular type of problem are statistically identical. A colourful way of describing such a circumstance, introduced by David Wolpert and William G. Macready in connection with the problems of search and optimization, is to say that there is no free lunch. Wolpert had previously derived no free lunch theorems for machine learning. Before Wolpert's article was published, Cullen Schaffer independently proved a restricted version of one of Wolpert's theorems and used it to critique the current state of machine learning research on the problem of induction.
Algorithmic information theory (AIT) is a branch of theoretical computer science that concerns itself with the relationship between computation and information of computably generated objects, such as strings or any other data structure. In other words, it is shown within algorithmic information theory that computational incompressibility "mimics" the relations or inequalities found in information theory. According to Gregory Chaitin, it is "the result of putting Shannon's information theory and Turing's computability theory into a cocktail shaker and shaking vigorously."
In theoretical computer science, a computational problem is one that asks for a solution in terms of an algorithm. For example, the problem of factoring
Project complexity is the property of a project which makes it difficult to understand, foresee, and keep under control its overall behavior, even when given reasonably complete information about the project system. With a lens of systems thinking, project complexity can be defined as an intricate arrangement of the varied interrelated parts in which the elements can change and evolve constantly with an effect on the project objectives. The identification of complex projects is specifically important to multi-project engineering environments.
{{cite book}}
: CS1 maint: location missing publisher (link)