Statistical relational learning

Last updated February 04, 2024

Statistical relational learning (SRL) is a subdiscipline of artificial intelligence and machine learning that is concerned with domain models that exhibit both uncertainty (which can be dealt with using statistical methods) and complex, relational structure.^[1]^[2] Typically, the knowledge representation formalisms developed in SRL use (a subset of) first-order logic to describe relational properties of a domain in a general manner (universal quantification) and draw upon probabilistic graphical models (such as Bayesian networks or Markov networks) to model the uncertainty; some also build upon the methods of inductive logic programming. Significant contributions to the field have been made since the late 1990s.^[1]

As is evident from the characterization above, the field is not strictly limited to learning aspects; it is equally concerned with reasoning (specifically probabilistic inference) and knowledge representation. Therefore, alternative terms that reflect the main foci of the field include statistical relational learning and reasoning (emphasizing the importance of reasoning) and first-order probabilistic languages (emphasizing the key properties of the languages with which models are represented). Another term that is sometimes used in the literature is relational machine learning (RML).

Canonical tasks

A number of canonical tasks are associated with statistical relational learning, the most common ones being:^[3]

collective classification, i.e. the (simultaneous) prediction of the class of several objects given objects' attributes and their relations
link prediction, i.e. predicting whether or not two or more objects are related
link-based clustering, i.e. the grouping of similar objects, where similarity is determined according to the links of an object, and the related task of collaborative filtering, i.e. the filtering for information that is relevant to an entity (where a piece of information is considered relevant to an entity if it is known to be relevant to a similar entity)
social network modelling
object identification/entity resolution/record linkage, i.e. the identification of equivalent entries in two or more separate databases/datasets
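
As a rough illustration of two of the tasks above, the following sketch scores a candidate link by neighbourhood overlap (Jaccard similarity) and performs a very simple form of collective classification by repeatedly assigning each unlabelled object the majority label of its neighbours. The toy graph, the function names, and the tie-breaking rule are assumptions made for this example; SRL systems typically use richer probabilistic models than this heuristic.

```python
from collections import Counter

# Hypothetical toy graph: each object maps to the set of objects it is linked to.
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}


def jaccard_link_score(graph, u, v):
    """Link prediction heuristic: overlap of the two neighbourhoods."""
    nu, nv = graph[u], graph[v]
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def collective_classify(graph, known, max_iters=10):
    """Label unlabelled nodes from their neighbours' current labels.

    Labels are updated in place, so a label inferred early in a pass
    can influence nodes visited later in the same pass.
    """
    labels = dict(known)
    for _ in range(max_iters):
        changed = False
        for node in sorted(graph):
            if node in known:
                continue  # observed labels are never overwritten
            votes = Counter(labels[n] for n in graph[node] if n in labels)
            if not votes:
                continue  # no labelled neighbour yet
            # Deterministic tie-breaking: highest count, then alphabetical label.
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if labels.get(node) != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


if __name__ == "__main__":
    print(jaccard_link_score(GRAPH, "a", "d"))
    print(collective_classify(GRAPH, {"a": "x", "e": "y"}))
```

The point of the second function is that predictions are not made independently: the label chosen for one object feeds into the prediction for its neighbours, which is what distinguishes collective classification from classifying each object in isolation.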

Representation formalisms

One of the fundamental design goals of the representation formalisms developed in SRL is to abstract away from concrete entities and to represent instead general principles that are intended to be universally applicable. Since there are countless ways in which such principles can be represented, many representation formalisms have been proposed in recent years.^[1] In the following, some of the more common ones are listed in alphabetical order:

Bayesian logic program
BLOG model
Markov logic networks
Multi-entity Bayesian network
Probabilistic logic programs
Probabilistic relational model (PRM) – the counterpart of a Bayesian network in statistical relational learning.^[4]^[5]
Probabilistic soft logic
Recursive random field
Relational Bayesian network
Relational dependency network
Relational Kalman filtering
Relational Markov network
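
To make the idea of a general, universally quantified principle combined with a probabilistic model more concrete, the sketch below follows the Markov logic network style: a single weighted first-order rule, Smokes(x) ∧ Friends(x, y) ⇒ Smokes(y), is grounded over a small domain, and each possible world receives a weight proportional to exp(w · n), where n is the number of groundings of the rule that hold in that world. The domain, the weight of 1.5, and all function names are assumptions chosen for illustration, and inference is done by brute-force enumeration, which only works for tiny domains.

```python
import itertools
import math

# Hypothetical domain and evidence; the rule applies to any individuals.
PEOPLE = ["anna", "bob"]
FRIENDS = {("anna", "bob"), ("bob", "anna")}  # treated as fixed evidence
WEIGHT = 1.5  # assumed weight of the single rule


def count_true_groundings(smokes):
    """Count pairs (x, y) for which Smokes(x) & Friends(x, y) => Smokes(y) holds."""
    count = 0
    for x, y in itertools.product(PEOPLE, repeat=2):
        body = smokes[x] and (x, y) in FRIENDS
        if (not body) or smokes[y]:
            count += 1
    return count


def world_weight(smokes):
    """Unnormalised weight of a world: exp(weight * number of true groundings)."""
    return math.exp(WEIGHT * count_true_groundings(smokes))


def conditional_probability(query, evidence):
    """P(Smokes(query) | evidence), enumerating all worlds consistent with evidence."""
    numerator = 0.0
    denominator = 0.0
    for values in itertools.product([False, True], repeat=len(PEOPLE)):
        smokes = dict(zip(PEOPLE, values))
        if any(smokes[person] != value for person, value in evidence.items()):
            continue  # world contradicts the evidence
        weight = world_weight(smokes)
        denominator += weight
        if smokes[query]:
            numerator += weight
    return numerator / denominator


if __name__ == "__main__":
    print(conditional_probability("bob", {}))
    print(conditional_probability("bob", {"anna": True}))
```

Because the rule is stated over variables rather than named individuals, adding a person to `PEOPLE` changes the set of groundings but not the model itself, which is the abstraction away from concrete entities described above.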

Resources

Brian Milch and Stuart J. Russell: First-Order Probabilistic Languages: Into the Unknown, Inductive Logic Programming, volume 4455 of Lecture Notes in Computer Science, pages 10–24, Springer, 2006
Rodrigo de Salvo Braz, Eyal Amir, and Dan Roth: A Survey of First-Order Probabilistic Models, Innovations in Bayesian Networks, volume 156 of Studies in Computational Intelligence, Springer, 2008
Hassan Khosravi and Bahareh Bina: A Survey on Statistical Relational Learning, Advances in Artificial Intelligence, volume 6085 of Lecture Notes in Computer Science, pages 256–268, Springer, 2010
Ryan A. Rossi, Luke K. McDowell, David W. Aha, and Jennifer Neville: Transforming Graph Data for Statistical Relational Learning, Journal of Artificial Intelligence Research (JAIR), volume 45, pages 363–441, 2012
Luc De Raedt, Kristian Kersting, Sriraam Natarajan, and David Poole: Statistical Relational Artificial Intelligence: Logic, Probability, and Computation, Synthesis Lectures on Artificial Intelligence and Machine Learning, March 2016, ISBN 9781627058414

References

[1] Lise Getoor and Ben Taskar (2007). Introduction to Statistical Relational Learning. MIT Press. ISBN 978-0262072885.
[2] Ryan A. Rossi, Luke K. McDowell, David W. Aha, and Jennifer Neville (2012). "Transforming Graph Data for Statistical Relational Learning". Journal of Artificial Intelligence Research (JAIR), 45, pp. 363–441.
[3] Matthew Richardson and Pedro Domingos (2006). "Markov Logic Networks". Machine Learning, 62, pp. 107–136.
[4] N. Friedman, L. Getoor, D. Koller, and A. Pfeffer (1999). "Learning probabilistic relational models". In: International joint conferences on artificial intelligence, pp. 1300–09.
[5] Teodor Sommestad, Mathias Ekstedt, and Pontus Johnson (2010). "A probabilistic relational model for security risk analysis". Computers & Security, 29 (6), pp. 659–679. doi:10.1016/j.cose.2010.02.002

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
