Ancestral graph


In statistics and Markov modeling, an ancestral graph is a type of mixed graph used to provide a graphical representation for the result of marginalizing one or more vertices in a graphical model that takes the form of a directed acyclic graph. Ancestral graphs can encode conditional independence relations that arise in directed acyclic graph (DAG) models with latent variables and selection bias. [1]


Definition

Ancestral graphs are mixed graphs with three kinds of edges: directed edges (→), drawn as an arrow from one vertex to another; bidirected edges (↔), which have an arrowhead at both ends; and undirected edges (—), which have no arrowheads. An ancestral graph must satisfy the following constraints: [1]

  - There is no directed cycle.
  - There is no almost directed cycle: if a ↔ b, then neither a nor b is an ancestor of the other.
  - If a vertex is an endpoint of an undirected edge, then no edge has an arrowhead at that vertex; equivalently, such a vertex has no parents and no spouses.
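These conditions can be checked mechanically. The sketch below uses an illustrative edge-mark representation chosen for this example (not taken from the cited sources): marks[(u, v)] stores the mark at v on the edge between u and v, '>' for an arrowhead and '-' for a tail.

```python
def is_ancestral(marks):
    """Check the ancestral-graph conditions on a mixed graph (sketch).

    marks[(u, v)] is the mark at v on the edge between u and v:
    '>' for an arrowhead, '-' for a tail. Each edge is stored under
    both ordered pairs.
    """
    def anc(v):  # vertices with a directed path to v, including v itself
        out, stack = {v}, [v]
        while stack:
            w = stack.pop()
            for (p, t) in marks:
                if (t == w and marks[(p, w)] == '>'
                        and marks[(w, p)] == '-' and p not in out):
                    out.add(p)
                    stack.append(p)
        return out

    for (u, v) in marks:
        # An arrowhead at v on an edge from u (u -> v or u <-> v) forbids
        # v from being an ancestor of u: this rules out directed and
        # almost directed cycles.
        if marks[(u, v)] == '>' and v in anc(u):
            return False
        # An endpoint of an undirected edge may have no arrowheads at all
        # (no parents and no spouses).
        if marks[(u, v)] == '-' and marks[(v, u)] == '-':
            if any(t == v and marks[(p, v)] == '>' for (p, t) in marks):
                return False
    return True
```

For instance, the directed cycle a → b → c → a fails the first condition, and a graph containing a → b together with the undirected edge b — c fails the third.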

A maximal ancestral graph (MAG) is an ancestral graph in which no additional edge may be added without violating the ancestral graph properties or changing the encoded independence model. [2]

In the context of causal inference, ancestral graphs assume the Causal Markov Condition (CMC) and Causal Faithfulness Condition (CFC). The CMC requires that every variable is independent of its non-descendants conditional on its parents in the causal graph, while the CFC stipulates that the only independence relationships are those implied by the CMC. [3]
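As a small numerical illustration of the CMC (the numbers here are arbitrary, not from the cited sources): a joint distribution that factorizes according to the chain DAG A → B → C makes C independent of its non-descendant A given its parent B.

```python
# Joint distribution built from the chain DAG A -> B -> C via the CMC
# factorization P(a, b, c) = P(a) * P(b | a) * P(c | b).
pA = {0: 0.6, 1: 0.4}
pB = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}   # pB[a][b] = P(b | a)
pC = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.5, 1: 0.5}}   # pC[b][c] = P(c | b)

joint = {(a, b, c): pA[a] * pB[a][b] * pC[b][c]
         for a in (0, 1) for b in (0, 1) for c in (0, 1)}

def cond(c, a, b):
    """P(C = c | A = a, B = b) computed from the joint distribution."""
    return joint[(a, b, c)] / sum(joint[(a, b, cc)] for cc in (0, 1))

# The CMC implies C is independent of its non-descendant A given its
# parent B: P(c | a, b) does not depend on a.
for b in (0, 1):
    for c in (0, 1):
        assert abs(cond(c, 0, b) - cond(c, 1, b)) < 1e-12
```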

Markov equivalence

Two ancestral graphs are Markov equivalent if they encode the same set of conditional independence relations. The independence relations in an ancestral graph are determined using m-separation, a graphical criterion that generalizes d-separation from DAGs to ancestral graphs. [2]
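A brute-force m-separation check can be written directly from the path criterion: a path m-connects two vertices given Z when every non-collider on it lies outside Z and every collider on it is an ancestor of some member of Z. The sketch below enumerates simple paths (exponential in the worst case); the representation and helper names are illustrative, not from the cited sources.

```python
def make_marks(directed=(), bidirected=(), undirected=()):
    """Edge marks: marks[(u, v)] is the mark at v on the edge u-v
    ('>' for an arrowhead, '-' for a tail)."""
    marks = {}
    for a, b in directed:      # a -> b
        marks[(a, b)], marks[(b, a)] = '>', '-'
    for a, b in bidirected:    # a <-> b
        marks[(a, b)], marks[(b, a)] = '>', '>'
    for a, b in undirected:    # a -- b
        marks[(a, b)], marks[(b, a)] = '-', '-'
    return marks

def neighbors(marks, v):
    return [b for (a, b) in marks if a == v]

def ancestors(marks, v):
    """Vertices with a directed path to v, including v itself."""
    anc, stack = {v}, [v]
    while stack:
        w = stack.pop()
        for u in neighbors(marks, w):
            if marks[(u, w)] == '>' and marks[(w, u)] == '-' and u not in anc:
                anc.add(u)
                stack.append(u)
    return anc

def m_separated(marks, X, Y, Z):
    """True iff every path between X and Y is blocked given Z."""
    anc_Z = set.union(set(), *(ancestors(marks, z) for z in Z))
    def connected(path):
        v = path[-1]
        if v in Y:
            return True
        for w in neighbors(marks, v):
            if w in path:
                continue
            if len(path) >= 2:
                u, mid = path[-2], path[-1]
                collider = marks[(u, mid)] == '>' and marks[(w, mid)] == '>'
                if collider and mid not in anc_Z:
                    continue   # blocked: collider not an ancestor of Z
                if not collider and mid in Z:
                    continue   # blocked: non-collider conditioned on
            if connected(path + [w]):
                return True
        return False
    return not any(connected([x]) for x in X)

# Example: in the chain a -> b -> c, a and c are m-separated given {b}.
chain = make_marks(directed=[('a', 'b'), ('b', 'c')])
assert m_separated(chain, {'a'}, {'c'}, {'b'})
```

On DAGs this coincides with d-separation; the collider and non-collider rules above are what generalize it to bidirected and undirected edges.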

A collider on a path is a vertex at which two arrowheads meet. A non-collider on a path is any vertex on the path that is not a collider. A collider may have an order, which indicates the minimum number of vertices on certain discriminating paths associated with that collider. A discriminating path for a vertex V in an ancestral graph is a path ⟨X, ..., W, V, Y⟩ such that: [1]

  - the path has at least three edges;
  - V is a non-endpoint vertex on the path and is adjacent to the endpoint Y;
  - X is not adjacent to Y, and every vertex between X and V on the path is a collider on the path and a parent of Y.

For maximal ancestral graphs, Markov equivalence can be characterized graphically: two MAGs are Markov equivalent if and only if they have: [1]

  - the same adjacencies (the same skeleton), and
  - the same colliders with order.

This characterization leads to an algorithm for determining whether two maximal ancestral graphs are Markov equivalent that runs in time polynomial in the number of vertices and edges. [2]
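Part of this characterization is easy to sketch in code. The following illustrative check compares skeletons and unshielded colliders only; the full criterion also requires agreement on colliders with order along discriminating paths, which is omitted here, so the check below is a necessary condition rather than the complete test. The edge-mark representation (marks[(u, v)] is the mark at v, '>' or '-') is an assumption of this sketch.

```python
def skeleton(marks):
    """Adjacencies of the graph, ignoring edge marks."""
    return {frozenset(e) for e in marks}

def unshielded_colliders(marks):
    """Triples (frozenset({a, b}), v) with arrowheads at v from both a
    and b, where a and b are not adjacent."""
    verts = {x for e in marks for x in e}
    out = set()
    for v in verts:
        into_v = [u for (u, t) in marks if t == v and marks[(u, v)] == '>']
        for i, a in enumerate(into_v):
            for b in into_v[i + 1:]:
                if frozenset((a, b)) not in skeleton(marks):
                    out.add((frozenset((a, b)), v))
    return out

def same_skeleton_and_unshielded_colliders(m1, m2):
    # Necessary (but not sufficient) condition for Markov equivalence.
    return (skeleton(m1) == skeleton(m2)
            and unshielded_colliders(m1) == unshielded_colliders(m2))
```

For example, a → b ← c and a ↔ b ↔ c share both skeleton and the unshielded collider at b, whereas a → b → c has no collider and fails the check against either.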

Applications

Ancestral graphs are used to depict conditional independence relations between variables in Markov models. [4] They are particularly useful for representing the independence structure over observed variables when some variables in an underlying DAG model are unobserved (latent) or when data are sampled from a specific sub-population determined by selection variables. [2]

Ancestral graphs are advantageous because they can represent causal relationships without explicitly including latent variables in the model, which makes the search space tractable for causal discovery algorithms. Unlike DAG models with latent variables, which create an infinite search space unless the number of latent variables or network topology is constrained, ancestral graphs provide a finite representation. [1]

Ancestral graphs are closely related to inducing path graphs (IPGs), another graphical representation for causal systems with latent variables. Syntactically, MAGs form a subclass of IPGs, since MAGs do not contain almost directed cycles while IPGs may. Moreover, partial ancestral graphs (PAGs) generally reveal more qualitative causal information than the corresponding partially oriented inducing path graphs (POIPGs), which makes ancestral graphs particularly advantageous for causal discovery and inference tasks. [5]

Ancestral graphs enable the estimation of causal effects from observational data even when the system is causally insufficient (containing unmeasured common causes). The LV-IDA algorithm (Latent Variable Intervention-calculus when the DAG is Absent) extends the IDA approach to estimate bounds on causal effects when latent variables may be present. [3] Unlike methods that assume causal sufficiency (no unmeasured confounders), this approach can identify when causal effects are non-identifiable and provide informative bounds instead of point estimates.

Causal discovery

The FCI algorithm (Fast Causal Inference) is a procedure for learning causal structure from observational data in the presence of latent confounders and selection bias. The algorithm outputs a partial ancestral graph (PAG), which represents a Markov equivalence class of MAGs. Zhang (2008) established that with additional orientation rules beyond those in the original FCI algorithm, the procedure is complete. That is, it can discover all aspects of causal structure that are uniquely determined by conditional independence relations, under standard causal inference assumptions. [1]

The completeness of orientation rules for PAGs has important implications for identifying invariant relationships under interventions. A conditional probability P(Y | Z) is invariant under an intervention on a set of variables X if it remains unchanged after the intervention, that is, if P(y | do(x), z) = P(y | z). PAGs provide complete graphical criteria for determining when such invariance holds, meaning they can identify all invariant relationships that are uniquely determined by the conditional independence structure. [5]

Do-calculus

Ancestral graphs support causal reasoning about the effects of interventions. Zhang (2008) extended Pearl's do-calculus to the context of ancestral graphs, enabling causal inference when only an equivalence class of causal structures (represented by a PAG) is known rather than a unique causal DAG. [5] This calculus provides rules for determining when post-intervention probability distributions can be identified from pre-intervention (observational) data, even when the exact causal structure is not uniquely determined.

The extension requires distinguishing between visible and invisible directed edges in a MAG. A directed edge A → B is visible if there exists a vertex C not adjacent to B such that either there is an edge between C and A that is into A (that is, with an arrowhead at A), or there is a collider path between C and A that is into A and on which every vertex is a parent of B. Otherwise the edge is invisible. [5] This distinction determines how graph manipulations affect the correspondence between a MAG and its underlying DAGs.
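The visibility test can be sketched as a search over the two clauses of the definition. The representation and names below are illustrative assumptions of this sketch, and the path search makes simplifying choices (for example, it excludes paths through B); it is not Zhang's algorithm verbatim.

```python
def is_visible(marks, a, b):
    """Sketch of the visibility test for a directed edge a -> b.

    marks[(u, v)] is the mark at v on the edge u-v ('>' arrowhead,
    '-' tail), stored under both ordered pairs.
    """
    verts = {x for e in marks for x in e}
    adj = lambda u, v: (u, v) in marks
    parent_of_b = lambda v: (marks.get((v, b)) == '>'
                             and marks.get((b, v)) == '-')
    for c in verts - {a, b}:
        if adj(c, b):
            continue            # c must not be adjacent to b
        # Clause (i): an edge between c and a with an arrowhead at a.
        if adj(c, a) and marks[(c, a)] == '>':
            return True
        # Clause (ii): a collider path from c to a, into a, on which
        # every interior vertex is a parent of b.
        stack = [[c]]
        while stack:
            path = stack.pop()
            v = path[-1]
            for w in verts - set(path):
                if not adj(v, w):
                    continue
                if len(path) >= 2:
                    u = path[-2]
                    # interior vertex v must be a collider on the path...
                    if not (marks[(u, v)] == '>' and marks[(w, v)] == '>'):
                        continue
                    # ...and a parent of b
                    if not parent_of_b(v):
                        continue
                if w == a:
                    if len(path) >= 2 and marks[(v, a)] == '>':
                        return True
                elif w != b:
                    stack.append(path + [w])
    return False
```

For instance, in c → a → b with c not adjacent to b, the edge a → b is visible by clause (i); an isolated edge a → b is invisible.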

The calculus involves two types of graph manipulations for a set of intervened vertices X: [5]

  - the lower-manipulation, which deletes all visible edges out of vertices in X, replaces all invisible edges out of vertices in X with bidirected edges, and otherwise leaves the graph unchanged (the analogue of Pearl's deletion of edges out of X); and
  - the upper-manipulation, which deletes all edges into vertices in X and otherwise leaves the graph unchanged (the analogue of Pearl's deletion of edges into X).

These manipulations allow formulation of graphical conditions under which certain probabilistic equalities hold, such as when P(y | do(x), z) = P(y | z). The PAG-based do-calculus is formulated in terms of definite m-separation, which requires the absence of possibly m-connecting paths rather than just m-connecting paths. [5]
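For concreteness, the first rule of Pearl's do-calculus (insertion/deletion of observations) has the form below; in the ancestral-graph extension, d-separation in the manipulated DAG is replaced by m-separation in the correspondingly manipulated MAG. The MAG analogue shown is a paraphrase under that reading, not Zhang's verbatim formulation:

```latex
% Pearl's Rule 1 for a causal DAG G:
P(y \mid \mathrm{do}(x), z, w) = P(y \mid \mathrm{do}(x), w)
  \quad \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}}.
% MAG analogue (paraphrase): the same equality holds when Y and Z are
% m-separated by X \cup W in the X-upper-manipulated MAG.
```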

A central application is determining invariance under interventions: a conditional probability is invariant under an intervention if the post-intervention probability equals the pre-intervention probability. Graphical conditions for such invariance can be checked directly from a PAG without enumerating all compatible DAGs. [5]

References

  1. Zhang, Jiji (2008), "On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias", Artificial Intelligence, 172 (16–17): 1873–1896, doi:10.1016/j.artint.2008.08.001
  2. Ali, R. Ayesha; Richardson, Thomas S.; Spirtes, Peter (2009), "Markov equivalence for ancestral graphs", The Annals of Statistics, 37 (5B): 2808–2837, doi:10.1214/08-AOS626, MR 2541442
  3. Malinsky, Daniel; Spirtes, Peter (2016), "Estimating causal effects with ancestral graph Markov models", Proceedings of the Eighth International Conference on Probabilistic Graphical Models, PMLR 52: 299–309
  4. Richardson, Thomas; Spirtes, Peter (2002), "Ancestral graph Markov models", The Annals of Statistics, 30 (4): 962–1030, CiteSeerX 10.1.1.33.4906, doi:10.1214/aos/1031689015, MR 1926166
  5. Zhang, Jiji (2008), "Causal reasoning with ancestral graphs", Journal of Machine Learning Research, 9: 1437–1474