Causal AI

Causal AI is a technique in artificial intelligence that builds a causal model and can thereby make inferences using causality rather than just correlation. One practical use of causal AI is helping organisations explain their decision-making and identify the causes of a decision. [1] [2]

Systems based on causal AI, by identifying the underlying web of causality for a behaviour or event, provide insights that solely predictive AI models might fail to extract from historical data. [3] An analysis of causality may be used to supplement human decisions in situations where understanding the causes behind an outcome is necessary, such as quantifying the impact of different interventions, evaluating policy decisions, or performing scenario planning. [4] A 2024 paper from Google DeepMind demonstrated mathematically that "Any agent capable of adapting to a sufficiently large set of distributional shifts must have learned a causal model". [5] The paper offers the interpretation that learning to generalise beyond the original training set requires learning a causal model, concluding that causal AI is necessary for artificial general intelligence.
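The contrast between predicting from observations and reasoning about interventions can be sketched with a toy structural causal model. Everything below (the variables, graph, and probabilities) is invented for illustration and does not come from the cited sources: a hidden common cause Z drives both X and Y, and intervening on X severs Z's influence on it.

```python
import random

# Toy structural causal model (invented for illustration):
# Z -> X, Z -> Y, X -> Y.  Observing X=1 differs from intervening do(X=1)
# because the intervention cuts the Z -> X edge.

def sample(do_x=None):
    z = random.random() < 0.5                                  # hidden common cause
    x = do_x if do_x is not None else (z or random.random() < 0.1)
    y = (x and random.random() < 0.8) or (z and random.random() < 0.3)
    return z, x, y

random.seed(0)
obs = [s for s in (sample() for _ in range(100_000)) if s[1]]  # condition on X=1
inter = [sample(do_x=True) for _ in range(100_000)]            # do(X=1)

p_y_given_x = sum(y for _, _, y in obs) / len(obs)
p_y_do_x = sum(y for _, _, y in inter) / len(inter)
print(p_y_given_x, p_y_do_x)  # observational estimate exceeds interventional one
```

Conditioning on X = 1 in observational data overstates X's effect on Y, because observing X = 1 is evidence for the confounder Z; the intervention removes that confounding path, which is the kind of distinction a purely correlational model cannot draw.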

History

The concept of causal AI and the limits of machine learning were raised by Judea Pearl, the Turing Award-winning computer scientist and philosopher, in his 2018 book The Book of Why: The New Science of Cause and Effect. Pearl asserted: “Machines' lack of understanding of causal relations is perhaps the biggest roadblock to giving them human-level intelligence.” [6] [7]

In 2020, Columbia University established a Causal AI Lab under Director Elias Bareinboim. Professor Bareinboim's research focuses on causal and counterfactual inference and their applications to data-driven fields in the health and social sciences as well as artificial intelligence and machine learning. [8] Technological research and consulting firm Gartner included causal AI for the first time in its 2022 Hype Cycle report, citing it as one of five critical technologies in accelerated AI automation. [9] [10]

One significant advance in the field is the concept of Algorithmic Information Dynamics: [11] a model-driven approach to causal discovery using Algorithmic Information Theory and perturbation analysis. It solves inverse causal problems by studying dynamical systems computationally. A key application is causal deconvolution, which separates generative mechanisms in data with algorithmic models rather than traditional statistics. [12] This method identifies causal structures in networks and sequences, moving away from probabilistic and regression-based techniques, and marks one of the first practical causal AI approaches to use algorithmic complexity and algorithmic probability in machine learning. [13]

Related Research Articles

Causality is an influence by which one event, process, state, or object (a cause) contributes to the production of another event, process, state, or object (an effect), where the cause is at least partly responsible for the effect, and the effect is at least partly dependent on the cause. The cause of something may also be described as the reason for the event or process.

A Bayesian network is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). While it is one of several forms of causal notation, causal networks are special cases of Bayesian networks. Bayesian networks are ideal for taking an event that occurred and predicting the likelihood that any one of several possible known causes was the contributing factor. For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases.
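The disease/symptom example can be made concrete with a minimal two-node network and Bayes' rule. All diseases and probabilities below are made up for illustration:

```python
# Toy Bayesian network for the disease/symptom example (all numbers invented):
# Disease -> Symptom, here a single symptom "fever".
p_disease = {"flu": 0.10, "cold": 0.25, "none": 0.65}          # prior P(D)
p_fever_given = {"flu": 0.90, "cold": 0.60, "none": 0.05}      # P(fever | D)

# Bayes' rule: P(D | fever) = P(fever | D) P(D) / P(fever)
joint = {d: p_disease[d] * p_fever_given[d] for d in p_disease}
p_fever = sum(joint.values())                                  # marginal P(fever)
posterior = {d: joint[d] / p_fever for d in joint}
print(posterior)  # probability of each disease given that fever was observed
```

Observing the symptom shifts probability mass toward the diseases that predict it well; larger networks apply the same computation along the edges of the DAG.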

Minimum Description Length (MDL) is a model selection principle where the shortest description of the data is the best model. MDL methods learn through a data compression perspective and are sometimes described as mathematical applications of Occam's razor. The MDL principle can be extended to other forms of inductive inference and learning, for example to estimation and sequential prediction, without explicitly identifying a single model of the data.

Ray Solomonoff was an American mathematician who invented algorithmic probability, his General Theory of Inductive Inference, and was a founder of algorithmic information theory. He was an originator of the branch of artificial intelligence based on machine learning, prediction and probability. He circulated the first report on non-semantic machine learning in 1956.

Judea Pearl: Computer scientist (born 1936)

Judea Pearl is an Israeli-American computer scientist and philosopher, best known for championing the probabilistic approach to artificial intelligence and the development of Bayesian networks. He is also credited for developing a theory of causal and counterfactual inference based on structural models. In 2011, the Association for Computing Machinery (ACM) awarded Pearl with the Turing Award, the highest distinction in computer science, "for fundamental contributions to artificial intelligence through the development of a calculus for probabilistic and causal reasoning". He is the author of several books, including the technical Causality: Models, Reasoning and Inference, and The Book of Why, a book on causality aimed at the general public.

Trygve Haavelmo: Norwegian economist and econometrician

Trygve Magnus Haavelmo, born in Skedsmo, Norway, was an economist whose research interests centered on econometrics. He received the Nobel Memorial Prize in Economic Sciences in 1989.

Markov blanket: Subset of variables that contains all the useful information

In statistics and machine learning, when inferring a random variable from a set of variables, a subset usually suffices and the remaining variables carry no additional information. A subset that contains all the useful information is called a Markov blanket. If a Markov blanket is minimal, meaning that no variable can be dropped from it without losing information, it is called a Markov boundary. Identifying a Markov blanket or a Markov boundary helps to extract useful features. The terms Markov blanket and Markov boundary were coined by Judea Pearl in 1988. A Markov blanket can be constituted by a set of Markov chains.
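In a Bayesian network, a node's parents, its children, and its children's other parents ("spouses") together form a Markov blanket of the node. A minimal sketch, using an invented example graph:

```python
# Markov blanket in a DAG represented as {node: set of children}:
# parents, children, and the children's other parents ("spouses").
def markov_blanket(dag, node):
    parents = {u for u, kids in dag.items() if node in kids}
    children = set(dag.get(node, set()))
    spouses = {u for c in children for u, kids in dag.items()
               if c in kids and u != node}
    return parents | children | spouses

# Invented graph: A -> C, B -> C, C -> D, E -> D
dag = {"A": {"C"}, "B": {"C"}, "C": {"D"}, "E": {"D"}}
print(markov_blanket(dag, "C"))  # parents A, B; child D; spouse E
```

Here the blanket of C is {A, B, D, E}: given those four variables, C is conditionally independent of everything else in the network.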

The Markov condition, sometimes called the Markov assumption, is an assumption made in Bayesian probability theory, that every node in a Bayesian network is conditionally independent of its nondescendants, given its parents. Stated loosely, it is assumed that a node has no bearing on nodes which do not descend from it. In a DAG, this local Markov condition is equivalent to the global Markov condition, which states that d-separations in the graph also correspond to conditional independence relations. This also means that a node is conditionally independent of the entire network, given its Markov blanket.

Causal analysis is the field of experimental design and statistics pertaining to establishing cause and effect. Typically it involves establishing four elements: correlation, sequence in time, a plausible physical or information-theoretical mechanism for an observed effect to follow from a possible cause, and eliminating the possibility of common and alternative ("special") causes. Such analysis usually involves one or more artificial or natural experiments.

Clark N. Glymour is the Alumni University Professor Emeritus in the Department of Philosophy at Carnegie Mellon University. He is also a senior research scientist at the Florida Institute for Human and Machine Cognition.

Collider (statistics): Variable that is causally influenced by two or more variables

In statistics and causal graphs, a variable is a collider when it is causally influenced by two or more variables. The name "collider" reflects the fact that in graphical models, the arrow heads from variables that lead into the collider appear to "collide" on the node that is the collider. They are sometimes also referred to as inverted forks.
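Collider-induced dependence can be demonstrated with a short simulation; the variables and probabilities below are invented for illustration:

```python
import random

# Toy collider simulation: X and Y are generated independently, but both
# cause the collider C = X or Y.  Conditioning on C makes X and Y
# spuriously dependent.
random.seed(0)
data = []
for _ in range(200_000):
    x = random.random() < 0.5
    y = random.random() < 0.5
    data.append((x, y, x or y))          # (X, Y, collider C)

def p_x(rows):
    return sum(x for x, _, _ in rows) / len(rows)

given_c_y1 = [r for r in data if r[2] and r[1]]
given_c_y0 = [r for r in data if r[2] and not r[1]]
print(p_x(given_c_y1), p_x(given_c_y0))  # unequal: X depends on Y given C
```

Within the C = 1 subpopulation, learning Y = 0 forces X = 1 (something had to cause C), so conditioning on the collider opens a dependence between two variables that share no causal link.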

Causal inference is the process of determining the independent, actual effect of a particular phenomenon that is a component of a larger system. The main difference between causal inference and inference of association is that causal inference analyzes the response of an effect variable when a cause of the effect variable is changed. The study of why things occur is called etiology, and can be described using the language of scientific causal notation. Causal inference is said to provide the evidence of causality theorized by causal reasoning.

In statistics, econometrics, epidemiology, genetics and related disciplines, causal graphs are probabilistic graphical models used to encode assumptions about the data-generating process.

Rina Dechter: Computer scientist

Rina Dechter is a distinguished professor of computer science in the Donald Bren School of Information and Computer Sciences at the University of California, Irvine. Her research is on automated reasoning in artificial intelligence focusing on probabilistic and constraint-based reasoning. In 2013, she was elected a Fellow of the Association for Computing Machinery.

Explainable AI (XAI), often overlapping with interpretable AI or explainable machine learning (XML), is a field of research within artificial intelligence (AI) that explores methods giving humans intellectual oversight over AI algorithms. The main focus is on the reasoning behind the decisions or predictions made by the AI algorithms, to make them more understandable and transparent. This addresses users' need to assess safety and scrutinize automated decision making in applications. XAI counters the "black box" tendency of machine learning, where even the AI's designers cannot explain why it arrived at a specific decision.

Richard Neapolitan

Richard Eugene Neapolitan was an American scientist. Neapolitan is best known for his role in establishing the use of probability theory in artificial intelligence and in the development of the field of Bayesian networks.

Causal analysis is the field of experimental design and statistical analysis pertaining to establishing cause and effect. Exploratory causal analysis (ECA), also known as data causality or causal discovery, is the use of statistical algorithms to infer associations in observed data sets that are potentially causal under strict assumptions. ECA is a type of causal inference distinct from causal modeling and treatment effects in randomized controlled trials. It is exploratory research, usually preceding more formal causal research, in the same way exploratory data analysis often precedes statistical hypothesis testing in data analysis.

The Book of Why: 2018 book by Judea Pearl and Dana Mackenzie

The Book of Why: The New Science of Cause and Effect is a 2018 nonfiction book by computer scientist Judea Pearl and writer Dana Mackenzie. The book explores the subject of causality and causal inference from statistical and philosophical points of view for a general audience.

Hector Geffner is an Argentinian computer scientist, an Alexander von Humboldt Professor of artificial intelligence at RWTH Aachen University, and Wallenberg Guest Professor in AI at Linköping University. His research interests are focused on artificial intelligence, especially automated planning and the integration of model-based AI and data-based AI. He is best known for his work on domain-independent heuristic planning and has received several International Conference on Automated Planning and Scheduling (ICAPS) influential paper awards. From 2001 he held a research professorship at ICREA and in the Artificial Intelligence and Machine Learning Group at Pompeu Fabra University in Barcelona. He was a staff researcher at the IBM Thomas J. Watson Research Center from 1990 to 1992 and a professor at Simón Bolívar University in Caracas, Venezuela, from 1992 to 2001. Geffner was awarded an ERC Advanced Grant in 2020 to explore the connection between machine learning and model-based AI, and is a former board member and current fellow of the European Association for Artificial Intelligence (EurAI). He was elected an AAAI Fellow in 2007.

References

  1. Blogger, SwissCognitive Guest (18 January 2022). "Causal AI". SwissCognitive, World-Leading AI Network. Retrieved 11 October 2022.
  2. Sgaier, Sema K; Huang, Vincent; Grace, Charles (2020). "The Case for Causal AI". Stanford Social Innovation Review . 18 (3): 50–55. ISSN   1542-7099. ProQuest   2406979616.
  3. "Beyond the Limits of Historical Data | causa". causa.tech. 29 June 2024. Retrieved 29 June 2024.
  4. "How to Understand the World of Causality | causaLens". causalens.com. 28 February 2023. Retrieved 7 October 2023.
  5. "Robust agents learn causal world models". S2CID   267740124.{{cite web}}: Missing or empty |url= (help)
  6. Pearl, Judea (2019). The book of why : the new science of cause and effect. Dana Mackenzie. London, UK: Penguin Books. ISBN   978-0-14-198241-0. OCLC   1047822662.
  7. Hartnett, Kevin (15 May 2018). "To Build Truly Intelligent Machines, Teach Them Cause and Effect". Quanta Magazine. Retrieved 11 October 2022.
  8. "What AI still can't do". MIT Technology Review. Retrieved 18 October 2022.
  9. "What is New in the 2022 Gartner Hype Cycle for Emerging Technologies". Gartner. Retrieved 11 October 2022.
  10. Sharma, Shubham (10 August 2022). "Gartner picks emerging technologies that can drive differentiation for enterprises". VentureBeat. Retrieved 11 October 2022.
  11. Zenil, Hector (25 July 2020). "Algorithmic Information Dynamics". Scholarpedia. 15 (7). Bibcode:2020SchpJ..1553143Z. doi:10.4249/scholarpedia.53143. hdl:10754/666314. Zenil, Hector; Kiani, Narsis A.; Tegner, Jesper (2023). Algorithmic Information Dynamics: A Computational Approach to Causality with Applications to Living Systems. Cambridge University Press. doi:10.1017/9781108596619. ISBN 978-1-108-59661-9.
  12. Zenil, Hector; Kiani, Narsis A.; Zea, Allan A.; Tegner, Jesper (2019). "Causal deconvolution by algorithmic generative models". Nature Machine Intelligence. 1 (1): 58–66. doi:10.1038/s42256-018-0005-0. hdl: 10754/630919 .
  13. Hernández-Orozco, Santiago; Zenil, Hector; Riedel, Jürgen; Uccello, Adam; Kiani, Narsis A.; Tegnér, Jesper (2021). "Algorithmic Probability-Guided Machine Learning on Non-Differentiable Spaces". Frontiers in Artificial Intelligence. 3: 567356. doi: 10.3389/frai.2020.567356 . PMC   7944352 . PMID   33733213.