Probabilistic soft logic

PSL
Developer(s) LINQS Lab
Initial release September 23, 2011
Stable release 2.2.2 [1] / May 20, 2020
Repository github.com/linqs/psl
Written in Java
Platform Linux, macOS, Windows
Type Machine Learning, Statistical relational learning
License Apache License 2.0
Website psl.linqs.org

Probabilistic Soft Logic (PSL) is a statistical relational learning (SRL) framework for modeling probabilistic and relational domains. [2] It is applicable to a variety of machine learning problems, such as collective classification, entity resolution, link prediction, and ontology alignment. PSL combines two tools: first-order logic, with its ability to succinctly represent complex phenomena, and probabilistic graphical models, which capture the uncertainty and incompleteness inherent in real-world knowledge. More specifically, PSL uses "soft" logic as its logical component and Markov random fields as its statistical model. PSL provides sophisticated inference techniques for finding the most likely answer (i.e. the maximum a posteriori (MAP) state). The "softening" of the logical formulas makes inference a polynomial time operation rather than an NP-hard operation.


Description

The SRL community has introduced multiple approaches that combine graphical models and first-order logic to allow the development of complex probabilistic models with relational structures. A notable example of such approaches is Markov logic networks (MLNs). [3] Like MLNs, PSL is a modelling language (with an accompanying implementation [4]) for learning and predicting in relational domains. Unlike MLNs, PSL uses soft truth values for predicates, drawn from the interval [0,1]. This allows the underlying inference to be solved quickly as a convex optimization problem. This is useful in problems such as collective classification, link prediction, social network modelling, and object identification/entity resolution/record linkage.

Probabilistic Soft Logic was first released in 2009 by Lise Getoor and Matthias Broecheler. [5] This first version focused heavily on reasoning about similarities between entities. Later versions of PSL would still keep the ability to reason about similarities, but generalize the language to be more expressive.

In 2017, a Journal of Machine Learning Research article detailing PSL and the underlying graphical model was published along with the release of a new major version of PSL (2.0.0). [2] The major new features in PSL 2.0.0 were a new type of rule, mainly used to specify constraints, and a command-line interface.

Syntax and Semantics

Syntax

A PSL model is composed of a series of weighted rules and constraints. PSL supports two types of rules: Logical and Arithmetic. [6]

Logical rules are composed of an implication with only a single atom or a conjunction of atoms in the body and a single atom or a disjunction of atoms in the head. Since PSL uses soft logic, hard logic operators are replaced with Łukasiewicz soft logical operators. An example of a logical rule expression is:

Similar(A, B) & HasLabel(A, X) -> HasLabel(B, X)

This rule can be interpreted to mean: If A and B are similar and A has the label X, then there is evidence that B also has the label X.
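The Łukasiewicz relaxation that replaces the hard logical operators can be sketched in a few lines of Python (an illustrative sketch, not PSL's implementation; the truth values below are hypothetical):

```python
# Lukasiewicz soft-logic operators over truth values in [0, 1].
def l_and(a, b):
    """Lukasiewicz t-norm: soft conjunction."""
    return max(0.0, a + b - 1.0)

def l_or(a, b):
    """Lukasiewicz t-conorm: soft disjunction."""
    return min(1.0, a + b)

def l_not(a):
    """Soft negation."""
    return 1.0 - a

def implication_truth(body, head):
    """Truth value of body -> head, rewritten as !body | head."""
    return l_or(l_not(body), head)

# Evaluate the example rule Similar(A,B) & HasLabel(A,X) -> HasLabel(B,X)
# with hypothetical soft truth values.
similar_ab, haslabel_ax, haslabel_bx = 0.9, 0.8, 0.4

body = l_and(similar_ab, haslabel_ax)
truth = implication_truth(body, haslabel_bx)
print(round(body, 6), round(truth, 6))  # 0.7 0.7
```

Note that with truth values restricted to {0, 1}, these operators reduce to the usual Boolean connectives.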

Arithmetic rules are relations of two linear combinations of atoms. Restricting each side to a linear combination ensures that the resulting potential is convex. The following relational operators are supported: =, <=, and >=.

Similar(A,B)=Similar(B,A)

This rule encodes the notion that similarity is symmetric in this model.

A commonly used feature of arithmetic rules is the summation operation. The summation operation can be used to aggregate multiple atoms. When used, the atom is replaced with the sum of all possible atoms where the non-summation variables are fixed. Summation variables are made by prefixing a variable with a +. For example:

HasLabel(A,+X)=1.0

If the possible values for X are label1, label2, and label3, then the above rule is equivalent to:

HasLabel(A,'label1')+HasLabel(A,'label2')+HasLabel(A,'label3')=1.0

Both of these rules force the sum of all possible labels for an entity to sum to 1.0. This type of rule is especially useful for collective classification problems, where only one class can be selected.
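The expansion of a summation atom into the equivalent sum of ground atoms can be sketched as follows (an illustrative Python sketch, not PSL code; `expand_summation` is a hypothetical helper):

```python
def expand_summation(predicate, fixed_args, summation_domain):
    """Replace a summation atom like HasLabel(A, +X) with the list of
    ground atoms HasLabel(A, x) for every x in the domain of X.
    (Spacing here is cosmetic; PSL accepts either form.)"""
    return [f"{predicate}({', '.join(list(fixed_args) + [repr(v)])})"
            for v in summation_domain]

atoms = expand_summation("HasLabel", ["A"], ["label1", "label2", "label3"])
print(" + ".join(atoms) + " = 1.0")
# HasLabel(A, 'label1') + HasLabel(A, 'label2') + HasLabel(A, 'label3') = 1.0
```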

Semantics

HL-MRF

A PSL program defines a family of probabilistic graphical models that are parameterized by data. More specifically, the family of graphical models it defines belongs to a special class of Markov random field known as a Hinge-Loss Markov Random Field (HL-MRF). An HL-MRF determines a density function over a set of continuous variables $\mathbf{y} = (y_1, \ldots, y_n)$ with joint domain $[0, 1]^n$ using a set of evidence $\mathbf{x} = (x_1, \ldots, x_m)$, weights $\mathbf{w} = (w_1, \ldots, w_k)$, and potential functions $\phi = (\phi_1, \ldots, \phi_k)$ of the form $\phi_j(\mathbf{y}, \mathbf{x}) = (\max\{\ell_j(\mathbf{y}, \mathbf{x}), 0\})^{p_j}$, where $\ell_j$ is a linear function and $p_j \in \{1, 2\}$. The conditional distribution of $\mathbf{y}$ given the observed data $\mathbf{x}$ is defined as

$$P(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{w}, \mathbf{x})} \exp\left( -\sum_{j=1}^{k} w_j \phi_j(\mathbf{y}, \mathbf{x}) \right)$$

where $Z(\mathbf{w}, \mathbf{x}) = \int_{\mathbf{y}} \exp\left( -\sum_{j=1}^{k} w_j \phi_j(\mathbf{y}, \mathbf{x}) \right) d\mathbf{y}$ is the partition function. The negative log of this density is a convex function of $\mathbf{y}$, and thus the common inference task in PSL of finding a maximum a posteriori estimate of the joint state of $\mathbf{y}$ is a convex optimization problem. This allows inference in PSL to be solved in polynomial time.
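The hinge-loss potentials and the unnormalized density can be computed directly; the following Python sketch (not part of PSL) uses two hypothetical ground potentials:

```python
import math

def hinge_potential(linear_value, p):
    """phi(y, x) = max(ell(y, x), 0)^p with exponent p in {1, 2}."""
    assert p in (1, 2)
    return max(linear_value, 0.0) ** p

def unnormalized_density(linear_values, weights, exponents):
    """exp(-sum_j w_j * phi_j), the numerator of the HL-MRF density;
    the partition function Z is omitted since it cancels in MAP inference."""
    total = sum(w * hinge_potential(l, p)
                for l, w, p in zip(linear_values, weights, exponents))
    return math.exp(-total)

# Two hypothetical ground potentials: ell_1 = 0.3 with p = 2, and
# ell_2 = -0.2 with p = 1. A negative linear value means the underlying
# rule is satisfied, so its potential is 0 and it contributes no penalty.
d = unnormalized_density([0.3, -0.2], weights=[5.0, 2.0], exponents=[2, 1])
print(round(d, 6))  # exp(-5 * 0.09) = exp(-0.45)
```

Because each potential is a convex function of the unobserved variables, maximizing this density (equivalently, minimizing the weighted sum of potentials) is a convex problem.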

Open/Closed Predicates -- Closed World Assumption

Predicates in PSL can be labeled as open or closed.

When a predicate is labeled closed, PSL makes the closed-world assumption: any ground atoms of that predicate that are not explicitly provided to PSL are assumed to be false, i.e., to have a truth value of 0. In other words, the closed-world assumption presumes that the truth values provided for a closed predicate are complete. For example, suppose the data contains constants representing two people and one movie, along with a closed predicate recording which people like which movies, and that PSL is given only the atom stating that the first person likes the movie. PSL would then assume that the second person does not like the movie, even though this was never explicitly provided to the system.

If a predicate is labeled as open, then PSL does not make the closed-world assumption. Instead, PSL will attempt to collectively infer the unobserved instances.
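The effect of the open/closed labeling on atom lookup can be sketched as follows (illustrative Python with made-up predicate and constant names, not PSL code):

```python
# Observed data maps ground atoms to soft truth values.
observed = {("Likes", ("alice", "movie1")): 1.0}

def truth_value(predicate, args, closed):
    """For a closed predicate, unobserved atoms default to 0.0 (false).
    For an open predicate, they are unknown and become targets of
    collective inference (represented here as None)."""
    key = (predicate, tuple(args))
    if key in observed:
        return observed[key]
    return 0.0 if closed else None

print(truth_value("Likes", ["bob", "movie1"], closed=True))   # 0.0
print(truth_value("Likes", ["bob", "movie1"], closed=False))  # None
```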

Grounding

Data is used to instantiate several potential functions in a process called grounding. The resulting potential functions are then used to define the HL-MRF.

Grounding predicates in PSL is the process of making all possible substitutions of the variables in each predicate with the existing constants in the data, resulting in a collection of ground atoms. Then, all possible substitutions of the ground atoms for the predicates in the rules are made to create ground rules.
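Enumerating all substitutions of constants for a predicate's variables can be sketched with a Cartesian product (an illustrative Python sketch with hypothetical constants, not PSL's grounding engine):

```python
from itertools import product

def ground_atoms(predicate, arity, constants):
    """All ground atoms of a predicate: one atom per assignment of
    constants to the predicate's argument positions."""
    return [(predicate, combo) for combo in product(constants, repeat=arity)]

constants = ["alice", "bob", "carol"]
atoms = ground_atoms("Knows", 2, constants)
print(len(atoms))  # 3^2 = 9 ground atoms
```

Real grounding engines prune this enumeration heavily (e.g., using the closed-world assumption), since the naive product grows exponentially in predicate arity.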

Each ground rule is interpreted as either a potential or a hard constraint in the induced HL-MRF. A logical rule is translated as a continuous relaxation of Boolean connectives using Łukasiewicz logic. A ground logical rule is transformed into its disjunctive normal form. Let $I^+$ be the set of indices of the variables that correspond to atoms that are not negated and, likewise, $I^-$ the set of indices corresponding to atoms that are negated, in the disjunctive clause. Then the logical rule maps to the inequality:

$$1 - \sum_{i \in I^+} y_i - \sum_{i \in I^-} (1 - y_i) \le 0$$

If the logical rule is weighted with a weight $w$ and exponentiated with $p \in \{1, 2\}$, then the potential

$$\phi(\mathbf{y}, \mathbf{x}) = \left( \max\left\{ 1 - \sum_{i \in I^+} y_i - \sum_{i \in I^-} (1 - y_i),\ 0 \right\} \right)^{p}$$

is added to the HL-MRF with a weight parameter of $w$.

An arithmetic rule is manipulated to the form $\ell(\mathbf{y}, \mathbf{x}) \le 0$, and the resulting potential takes the form $\phi(\mathbf{y}, \mathbf{x}) = (\max\{\ell(\mathbf{y}, \mathbf{x}), 0\})^{p}$.
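The translation of a ground disjunctive clause into its hinge-loss potential can be sketched as follows (illustrative Python, not PSL's implementation; the truth values are hypothetical):

```python
def clause_potential(y, pos_idx, neg_idx, p=1):
    """Potential of a ground clause: the indices in pos_idx are the
    non-negated atoms, neg_idx the negated ones. The potential measures
    the clause's distance to satisfaction under Lukasiewicz logic."""
    linear = 1.0 - sum(y[i] for i in pos_idx) - sum(1.0 - y[i] for i in neg_idx)
    return max(linear, 0.0) ** p

# The ground rule Similar & HasLabelA -> HasLabelB in disjunctive form is
# !Similar | !HasLabelA | HasLabelB, so pos_idx = [2] and neg_idx = [0, 1].
y = [0.9, 0.8, 0.4]  # hypothetical truth values for the three ground atoms
print(round(clause_potential(y, pos_idx=[2], neg_idx=[0, 1], p=1), 6))  # 0.3
```

Observe that the potential equals one minus the Łukasiewicz truth value of the clause, so a fully satisfied ground rule contributes zero penalty.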

Interfaces

PSL is available via three different language interfaces: CLI, Java, and Python. PSL's command line interface (CLI) is the recommended way to use PSL. [7] It supports all the features commonly used in a reproducible form that does not require compilation. Since PSL is written in Java, the PSL Java interface is the most expansive and users can call directly into the core of PSL. [8] The Java interface is available through the Maven central repository. [9] The PSL Python interface is available through PyPi [10] and uses pandas DataFrames to pass data between PSL and the user. [11]

PSL previously provided a Groovy interface. [12] It was deprecated in the 2.2.1 release of PSL and is scheduled to be removed in the 2.3.0 release. [13]

Examples

The LINQS lab, developers of the official PSL implementation, maintain a collection of PSL examples. [14] These examples cover both synthetic and real-world datasets and include examples from academic publications using PSL. Below is a toy example from this repository that can be used to infer relations in a social network. Along with each rule is a comment describing the motivating intuition behind the statements.

/* People living in the same location are more likely to know one another. */
20: Lived(P1, L) & Lived(P2, L) & (P1 != P2) -> Knows(P1, P2) ^2

/* People who have not lived in the same location are not likely to know one another. */
5: Lived(P1, L1) & Lived(P2, L2) & (P1 != P2) & (L1 != L2) -> !Knows(P1, P2) ^2

/* Two people with similar interests are more likely to know one another. */
10: Likes(P1, X) & Likes(P2, X) & (P1 != P2) -> Knows(P1, P2) ^2

/* People in the same circles tend to know one another (transitivity). */
5: Knows(P1, P2) & Knows(P2, P3) & (P1 != P3) -> Knows(P1, P3) ^2

/* Knowing one another is symmetric. */
Knows(P1, P2) = Knows(P2, P1) .

/* By default, assume that two arbitrary people do not know one another (negative prior). */
5: !Knows(P1, P2) ^2

References

  1. "PSL 2.2.2". GitHub. Retrieved 16 July 2020.
  2. 1 2 Bach, Stephen; Broecheler, Matthias; Huang, Bert; Getoor, Lise (2017). "Hinge-Loss Markov Random Fields and Probabilistic Soft Logic". Journal of Machine Learning Research. 18: 1–67.
  3. Getoor, Lise; Taskar, Ben (2007). Introduction to Statistical Relational Learning. MIT Press. ISBN   978-0262072885.
  4. "GitHub repository" . Retrieved 26 March 2018.
  5. Broecheler, Matthias; Getoor, Lise (2009). Probabilistic Similarity Logic. International Workshop on Statistical Relational Learning (SRL).
  6. "Rule Specification". psl.linqs.org. LINQS Lab. December 6, 2019. Retrieved July 10, 2020.
  7. Augustine, Eriq (15 July 2018). "Getting Started with PSL". Probabilistic Soft Logic. Retrieved 15 July 2020.
  8. "PSL API Reference". Probabilistic Soft Logic. Retrieved 15 July 2020.
  9. "Maven Repository: org.linqs » psl-java". mvnrepository.com. Retrieved 15 July 2020.
  10. "pslpython: A python inferface to the PSL SRL/ML software". Python Package Index. Retrieved 15 July 2020.
  11. Augustine, Eriq (6 December 2019). "PSL 2.2.1 Release". Probabilistic Soft Logic. Retrieved 15 July 2020.
  12. "Maven Repository: org.linqs » psl-groovy". mvnrepository.com.
  13. Augustine, Eriq (6 December 2019). "PSL 2.2.1 Release". Probabilistic Soft Logic. Retrieved 15 July 2020.
  14. "linqs/psl-examples". GitHub. linqs. 19 June 2020.