From 6e7853e6937dce65ef1544a41d7587457e448acf Mon Sep 17 00:00:00 2001 From: zeptodoctor <44736852+zeptodoctor@users.noreply.github.com> Date: Sat, 7 Dec 2024 14:04:19 +0000 Subject: [PATCH] build based on ea50013 --- dev/comments/index.html | 2 +- dev/examples/backdoor_example/index.html | 2 +- dev/examples/ges_basic_examples/index.html | 2 +- dev/examples/pc_basic_examples/index.html | 2 +- dev/examples/pc_cmi_examples/index.html | 2 +- dev/examples/pc_real_example/index.html | 2 +- dev/index.html | 2 +- dev/library/index.html | 20 ++++++++++---------- dev/search/index.html | 2 +- 9 files changed, 18 insertions(+), 18 deletions(-) diff --git a/dev/comments/index.html b/dev/comments/index.html index 2a7b14e2..aebd3bc5 100644 --- a/dev/comments/index.html +++ b/dev/comments/index.html @@ -1,2 +1,2 @@ -Developer comments · CausalInference.jl

Developer comments

Whether to use Meek's or Chickering's completion algorithm.

Meek's approach, which we currently use in PC and GES, is: take the PDAG and apply rules R1-R4 repeatedly until no rule applies. To obtain the CPDAG from a pattern, the rules R1-R3 suffice; R4 is not needed.

Chickering's approach is to take the PDAG and find a DAG with the same skeleton and v-structures. (This can be done e.g. with the algorithm by Dor and Tarsi: https://ftp.cs.ucla.edu/pub/stat_ser/r185-dor-tarsi.pdf) From the DAG, construct the CPDAG (which is possible in linear time; we also have this implemented as cpdag(g)).

Implementing Chickering's approach would be an option, because the complicated part (from an implementation perspective) is DAG-to-CPDAG, which we already have. A straightforward implementation of PDAG-to-DAG by Dor and Tarsi is just a few lines. For the speed of PC and GES this will likely not matter much overall, as other parts are more costly. In the case of PC, the pattern learned from data might not have a DAG with the same skeleton and v-structures; then it is not clear how to use Chickering's approach.
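The "few lines" of PDAG-to-DAG can be sketched as follows (a hedged sketch of the Dor–Tarsi idea, not the package's code; names and the tuple-set graph representation are made up for illustration — an undirected edge x – y is stored as both (x, y) and (y, x)):

```julia
directed(E, x, y) = (x, y) in E && !((y, x) in E)
undirected(E, x, y) = (x, y) in E && (y, x) in E
adjacent(E, x, y) = (x, y) in E || (y, x) in E

function pdag2dag(E, verts)
    E = copy(E)
    rest = Set(verts)
    while !isempty(rest)
        x = nothing
        for c in rest
            # c must be a sink w.r.t. directed edges among the remaining vertices
            any(y -> directed(E, c, y), rest) && continue
            # every undirected neighbor of c must be adjacent to all other neighbors of c
            und = [y for y in rest if undirected(E, c, y)]
            nbs = [y for y in rest if adjacent(E, c, y)]
            all(y -> all(z -> z == y || adjacent(E, y, z), nbs), und) || continue
            x = c
            break
        end
        x === nothing && error("PDAG admits no consistent extension")
        for y in rest
            undirected(E, x, y) && delete!(E, (x, y))  # orient y → x
        end
        delete!(rest, x)
    end
    E  # a DAG with the same skeleton and v-structures (when one exists)
end

# e.g. the undirected chain 1 – 2 – 3 gets some consistent orientation:
pdag2dag(Set([(1, 2), (2, 1), (2, 3), (3, 2)]), 1:3)
```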

On Meek's rule 4

R4 is only needed in the case of background knowledge, i.e. some edge orientations that do not follow from v-structures. For soundness of the rule, it is necessary that v and l are adjacent. For completeness of the Meek rules, it suffices to apply it only if there is an undirected edge.
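For intuition, the shape of such an orientation rule can be sketched with R1, the simplest of the Meek rules (a toy sketch on a tuple-set PDAG representation, not the package's meek_rules! implementation; an undirected edge v – w is stored as both (v, w) and (w, v)):

```julia
directed(E, x, y) = (x, y) in E && !((y, x) in E)
undirected(E, x, y) = (x, y) in E && (y, x) in E
adjacent(E, x, y) = (x, y) in E || (y, x) in E

# R1: orient v – w as v → w whenever u → v exists and u, w are nonadjacent
function meek_r1!(E, verts)
    for u in verts, v in verts, w in verts
        if directed(E, u, v) && undirected(E, v, w) && !adjacent(E, u, w)
            delete!(E, (w, v))  # drop w → v, leaving v → w
        end
    end
    E
end

E = Set([(1, 2), (2, 3), (3, 2)])  # 1 → 2 and 2 – 3; 1 and 3 nonadjacent
meek_r1!(E, 1:3)                   # orients 2 – 3 as 2 → 3
```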

diff --git a/dev/examples/backdoor_example/index.html b/dev/examples/backdoor_example/index.html index aa2b8665..81051742 100644 --- a/dev/examples/backdoor_example/index.html +++ b/dev/examples/backdoor_example/index.html @@ -20,4 +20,4 @@ t = plot_pc_graph_tikz(dag)

We are interested in the average causal effect (ACE) of a treatment X (variable nr. 6) on an outcome Y (variable nr. 8), which stands for the expected increase of Y per unit of a controlled increase in X. Variables nr. 1 and nr. 2 are unobserved.

Regressing Y (nr. 8) on X (nr. 6) will fail to measure the effect because of the presence of a confounder C (variable nr. 4), which opens a backdoor path, a connection between X and Y via the path 6 ← 4 → 8 which is not causal.

Confounder

We can avert such problems by checking the backdoor criterion. Indeed

backdoor_criterion(dag, 6, 8) # false

reports a problem.

One might want to condition on the confounder C (nr. 4) to obtain the causal effect, but then

backdoor_criterion(dag, 6, 8, [4]) # still false

there is still a problem (because that conditioning opens up a non-causal path via 6 ← 3 ← 1 → 4 ← 2 → 5 → 8.)

Opened new path

But conditioning on Z = [4, 5] solves the problem, as verified by the backdoor criterion.

backdoor_criterion(dag, 6, 8, [3, 4]) # true
backdoor_criterion(dag, 6, 8, [4, 5]) # true

(Conditioning on Z = [3, 4] would also be possible.)

Thus, regressing Y on X and controlling for the variables Z = [4, 5], we measure the average causal effect.
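The d-separation check that backdoor_criterion performs can be illustrated by a minimal, self-contained sketch via moralization on a hypothetical 3-node confounding DAG (confounder 1, treatment 2, outcome 3; this is not the package's dsep implementation, and dsep_moral is a made-up name; the DAG is given as parent lists covering all vertices):

```julia
function dsep_moral(pa::Dict{Int,Vector{Int}}, u::Int, v::Int, S::Vector{Int})
    # ancestors of {u, v} ∪ S
    anc = Set{Int}()
    stack = [u; v; S]
    while !isempty(stack)
        x = pop!(stack)
        x in anc && continue
        push!(anc, x)
        append!(stack, pa[x])
    end
    # moralize: connect parent–child pairs and co-parents, drop directions
    nbr = Dict(x => Set{Int}() for x in anc)
    for x in anc, p in pa[x]
        p in anc || continue
        push!(nbr[x], p); push!(nbr[p], x)
        for q in pa[x]
            q in anc && q != p && (push!(nbr[p], q); push!(nbr[q], p))
        end
    end
    # remove S and test whether u and v are still connected
    blocked = Set(S)
    seen = Set([u]); todo = [u]
    while !isempty(todo)
        x = pop!(todo)
        x == v && return false  # connected ⇒ not d-separated
        for y in nbr[x]
            y in blocked || y in seen || (push!(seen, y); push!(todo, y))
        end
    end
    true
end

# Backdoor subgraph (outgoing edge 2 → 3 removed): conditioning on the
# confounder 1 d-separates treatment 2 from outcome 3; without it, it does not.
dsep_moral(Dict(1 => Int[], 2 => [1], 3 => [1]), 2, 3, [1])       # true
dsep_moral(Dict(1 => Int[], 2 => [1], 3 => [1, 2]), 2, 3, Int[])  # false
```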

Good controls

What we have done by hand here is the search for a backdoor adjustment set. We could have directly queried

Zs = list_covariate_adjustment(dag, 6, 8, Int[], setdiff(Set(1:8), [1, 2])) # exclude variables nr. 1 and nr. 2 because they are unobserved.

which lists possible adjustment sets,

println.(Zs);
Set([4, 3])
Set([5, 4])
Set([5, 4, 3])

Here, list_backdoor_adjustment gives adjustment sets for a backdoor criterion similar to backdoor_criterion, and list_covariate_adjustment gives the more versatile adjustment sets based on the sound and complete graphical criterion for covariate adjustment given in https://arxiv.org/abs/1203.3515, using the algorithmic approach proposed in https://arxiv.org/abs/1803.00116.

diff --git a/dev/examples/ges_basic_examples/index.html b/dev/examples/ges_basic_examples/index.html index f34b7d67..7c199684 100644 --- a/dev/examples/ges_basic_examples/index.html +++ b/dev/examples/ges_basic_examples/index.html @@ -29,4 +29,4 @@ [1=>2, 1=>3, 1=>5, 2=>1, 2=>4, 3=>1, 3=>4, 4=>5] => 0.000565705 [1=>2, 1=>3, 1=>4, 2=>1, 2=>4, 3=>1, 3=>4, 4=>5] => 0.000543214 [1=>2, 1=>3, 2=>1, 2=>4, 2=>5, 3=>1, 3=>4, 4=>5] => 0.000517662 - ...

From this we see that the maximum a posteriori graph coincides with the GES estimate and has high posterior weight, showing that the result of the GES algorithm is not incidental.

Exact Algorithm

We also provide an implementation of the exact optimization algorithm by Silander and Myllymäki (2006). Given observed data, it outputs the CPDAG representing the set of DAGs with maximum BIC score. The algorithm is exact, meaning it is guaranteed to find the optimal CPDAG with respect to the BIC score.

The time and memory complexity of the algorithm is in the order of $O(n \cdot 2^{n-1})$ with $n$ being the number of variables (which is significantly better than a naive brute-force algorithm, which considers all CPDAGs on $n$ vertices, but still amounts to exponential time). The algorithm generally runs reasonably fast for less than 20 variables. It can be scaled further, provided enough RAM is available, up to a little more than 25 variables, but cases beyond that are likely infeasible.
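To make the $O(n \cdot 2^{n-1})$ scaling concrete, a throwaway helper (the name states is made up for this illustration) shows how quickly the count grows between 20 and 25 variables:

```julia
# Growth of the n · 2^(n-1) term in the complexity bound (illustrative only)
states(n) = n * 2^(n - 1)
states(20)  # 10_485_760
states(25)  # 419_430_400, a 40-fold increase for five more variables
```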

It can be called similarly to GES:

est_g = exactscorebased(df; penalty=1.0, parallel=true)

Reference:

diff --git a/dev/examples/pc_basic_examples/index.html b/dev/examples/pc_basic_examples/index.html index ef018989..53cae8db 100644 --- a/dev/examples/pc_basic_examples/index.html +++ b/dev/examples/pc_basic_examples/index.html @@ -30,4 +30,4 @@ df = (x=x, v=v, w=w, z=z, s=s) plot_pc_graph_tikz(pcalg(df, 0.01, gausscitest), [String(k) for k in keys(df)]) -

PC results for alternative DAG

We can, however, conclude unequivocally from the available (observational) data alone that w and v are causal for z and z for s. At no point did we have to resort to things like direct interventions, experiments or A/B tests to arrive at these conclusions!

diff --git a/dev/examples/pc_cmi_examples/index.html b/dev/examples/pc_cmi_examples/index.html index fe68f224..4bd82d7e 100644 --- a/dev/examples/pc_cmi_examples/index.html +++ b/dev/examples/pc_cmi_examples/index.html @@ -14,4 +14,4 @@ z = 3 * v.^2 - w + randn(N)*0.25 s = z.^2 + randn(N)*0.25 -df = (x=x,v=v,w=w,z=z,s=s)

The data set created above exhibits the same causal structure, but now the individual variables are related to each other in a (deliberately) nonlinear fashion. As before, we can run the PC algorithm on this data set using the Gaussian conditional independence test:

plot_pc_graph_tikz(pcalg(df, 0.01, gausscitest), [String(k) for k in keys(df)])

PC algorithm on nonlinear data with gaussci

Interestingly enough, the resulting graph bears no resemblance to the true graph we would expect to see. To understand what's happening here, we need to remind ourselves that the PC algorithm uses conditional independence tests to determine the causal relationships between different parts of our data set. It seems as if the Gaussian conditional independence test we are using here can't quite handle the nonlinear relationships present in the data (note that increasing the number of data points won't change this!). To handle cases like this, we implemented additional conditional independence tests based on the concept of conditional mutual information (CMI). These tests have the benefit of being independent of the type of relationship between different parts of the analysed data. To use them, simply pass cmitest instead of gausscitest as the conditional independence test to the PC algorithm:

plot_pc_graph_tikz(pcalg(df, 0.01, cmitest), [String(k) for k in keys(df)])

PC algorithm on nonlinear data with cmitest

The result of the PC algorithm using the CMI test again looks like what we'd expect to see. It should be pointed out that using cmitest is significantly more computationally demanding and takes much longer than using gausscitest.
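The quantity behind cmitest can be illustrated with a minimal empirical mutual-information estimator for discrete data (a sketch only; the package's test additionally handles conditioning sets, continuous data and permutation-based p-values, and empirical_mi is a made-up name):

```julia
function empirical_mi(x::Vector{Int}, y::Vector{Int})
    n = length(x)
    pxy = Dict{Tuple{Int,Int},Float64}()
    px = Dict{Int,Float64}()
    py = Dict{Int,Float64}()
    # accumulate empirical joint and marginal frequencies
    for (a, b) in zip(x, y)
        pxy[(a, b)] = get(pxy, (a, b), 0.0) + 1 / n
        px[a] = get(px, a, 0.0) + 1 / n
        py[b] = get(py, b, 0.0) + 1 / n
    end
    # I(X; Y) = Σ p(x,y) log( p(x,y) / (p(x) p(y)) )
    sum(p * log(p / (px[a] * py[b])) for ((a, b), p) in pxy)
end

empirical_mi([1, 1, 2, 2], [1, 2, 1, 2])  # 0.0: independent
empirical_mi([1, 1, 2, 2], [1, 1, 2, 2])  # log(2): fully dependent
```

Mutual information is zero exactly when the variables are independent, regardless of whether their relationship is linear, which is why such tests can succeed where the Gaussian test fails.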

diff --git a/dev/examples/pc_real_example/index.html b/dev/examples/pc_real_example/index.html index d911928d..e62fb699 100644 --- a/dev/examples/pc_real_example/index.html +++ b/dev/examples/pc_real_example/index.html @@ -17,4 +17,4 @@ end # make variable names a bit easier to read -variables = map(x->replace(x,"_"=>" "), names(df))

This dataframe can be used as input to the PC algorithm just like the dataframes holding synthetic data before:

plot_pc_graph_tikz(pcalg(df, 0.025, gausscitest), variables)

Plot of PC output for true data

At first glance, the output of the PC algorithm looks sensible and appears to provide some first insight into the data we are working with. In particular, the causal factors influencing faculty pay appear rather sensible. However, before drawing any further conclusions, different versions of the PC algorithm (e.g., different p-values, different independence tests, etc.) should be explored.

diff --git a/dev/index.html b/dev/index.html index 1910d620..ce0b0e60 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,2 +1,2 @@ -Home · CausalInference.jl

CausalInference.jl

A Julia package for causal inference, graphical models and structure learning. This package contains constraint-based and score-based algorithms for causal structure learning, as well as functionality to compute adjustment sets which can be used as control variables to compute causal effects by regression.

Introduction

The first aim of this package is to provide Julia implementations of popular algorithms for causal structure identification: the PC algorithm, the GES algorithm, an exact score-based method, and the FCI algorithm. The aim of these algorithms is to identify causal relationships from observational data alone, in circumstances where running experiments or A/B tests is impractical or even impossible. While identification of all causal relationships from observational data is not always possible, the algorithms clearly indicate which causal relations can and which cannot be determined from observational data. Secondly, similarly to DAGitty, covariate adjustment sets for estimating causal effects, for example by regression, can be computed.

Causal inference is by no means an easy subject. Readers without any prior exposure to these topics are encouraged to go over the following resources in order to get a basic idea of what's involved in causal inference:

There are also tutorials and examples linked in the navigation bar of this package.

Implementation Details

The PC algorithm was tested on random DAGs by comparing the result of the PC algorithm using the d-separation oracle with the CPDAG computed with Chickering's DAG->CPDAG conversion algorithm (implemented as dsep and cpdag in this package).

See the Library for other implemented functionality.

The algorithms use the SimpleGraph and SimpleDiGraph graph representation of the Julia package Graphs. Both types of graphs are represented by sorted adjacency lists (vectors of vectors in the Graphs implementation).

CPDAGs are just modeled as SimpleDiGraphs, where unoriented edges are represented by a forward and a backward directed edge.
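The encoding of undirected CPDAG edges can be sketched with plain tuples (the package uses Graphs.jl's SimpleDiGraph, whose sorted adjacency lists play the role of the set here; the helper names are made up):

```julia
# v – w undirected is stored as both v → w and w → v; a directed edge as one pair
E = Set([(1, 2), (2, 1), (2, 3)])  # 1 – 2 undirected, 2 → 3 directed
isundirected(E, v, w) = (v, w) in E && (w, v) in E
isdirected(E, v, w) = (v, w) in E && !((w, v) in E)
```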

The listing algorithms for adjustment sets are implemented from scratch using a memory-efficient iterator protocol to handle large problems.

Performance

The speed of the PC and GES algorithm is comparable with the C++ code of the R package pcalg. The exact score-based algorithm scales to 20-25 variables on consumer hardware, which comes close to the theoretical limits of these approaches.

Plotting

The main package provides text-based output describing all identified edges for the PC and FCI algorithms (plot_pc_graph_text and plot_fci_graph_text, respectively).

In addition, further plotting backends are supported via lazy code loading orchestrated by Requires.jl. Upon importing TikzGraphs.jl, the plotting methods plot_pc_graph_tikz and plot_fci_graph_tikz are loaded (these are also aliased as plot_pc_graph and plot_fci_graph for backward compatibility). Similarly, upon importing both GraphRecipes.jl and Plots.jl, the plotting methods plot_pc_graph_recipes and plot_fci_graph_recipes are loaded.

At the time of writing (December 2022), TikzGraphs.jl cannot be installed on ARM-based systems, so GraphRecipes.jl + Plots.jl is the recommended plotting backend in such cases.

Contribution

See issue #1 (Roadmap/Contribution) for questions and coordination of the development.

References

  • P. Spirtes, C. Glymour, R. Scheines, R. Tillman: Automated search for causal relations: Theory and practice. Heuristics, Probability and Causality: A Tribute to Judea Pearl 2010
  • P. Spirtes, C. Glymour, R. Scheines: Causation, Prediction, and Search. MIT Press 2000
  • J. Zhang: On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias. Artificial Intelligence 16-17 (2008), 1873-1896
  • T. Richardson, P. Spirtes: Ancestral Graph Markov Models. The Annals of Statistics 30 (2002), 962-1030
  • D. M. Chickering: Learning Equivalence Classes of Bayesian-Network Structures. Journal of Machine Learning Research 2 (2002), 445-498.
  • D. Colombo, M. H. Maathuis: Order-Independent Constraint-Based Causal Structure Learning. Journal of Machine Learning Research 15 (2014), 3921-3962.
  • B. van der Zander, M. Liśkiewicz, J. Textor: Separators and Adjustment Sets in Causal Graphs: Complete Criteria and an Algorithmic Framework. (https://doi.org/10.48550/arXiv.1803.00116)
  • M. Schauer, M. Wienöbst: Causal structure learning with momentum: Sampling distributions over Markov Equivalence Classes of DAGs. Proceedings of The 12th International Conference on Probabilistic Graphical Models, PMLR 246 (2024), 382-400. (https://proceedings.mlr.press/v246/schauer24a.html)
diff --git a/dev/library/index.html b/dev/library/index.html index 6d887aeb..6aa234de 100644 --- a/dev/library/index.html +++ b/dev/library/index.html @@ -1,22 +1,22 @@ -Library · CausalInference.jl

Library

(Partially) directed acyclic graphs (PDAGs and DAGs)

CausalInference.has_a_pathFunction
has_a_path(g::AbstractGraph, U::Vector, V::VectorOrVertex, exclude_vertices::AbstractVector = T[], nbs=Graphs.outneighbors)

Determine whether there is a (semi-directed) path connecting U with V, not passing through exclude_vertices, where nbs=Graphs.outneighbors determines the direction of traversal.

source
CausalInference.iscliqueFunction
isclique(g, nodes)

Return true if all vertices in nodes are adjacent to each other in the graph g. Chickering, "Learning equivalence classes" (2002). Note that a 3-clique with already two undirected edges is also a clique in the neighbor sense.

source
CausalInference.childrenFunction
children(g, x)

Children of x in g are vertices y such that there is a directed edge x → y. Returns a sorted array.

source
CausalInference.isorientedFunction
isoriented(g, edge::Edge)
isoriented(g, x, y)

Test if x and y are connected by a directed edge in the graph g, either x←y OR x→y. Can also perform the same test given an edge.

source
CausalInference.isadjacentFunction
isadjacent(g, x, y)

Test if x and y are connected by any edge in the graph g (i.e. x – y, x → y, or x ← y.)

source

Causal graphs

CausalInference.dsepFunction
dsep(g::AbstractGraph, u, v, s; verbose = false)

Check whether u and v are d-separated given set s. Algorithm: unrolled https://arxiv.org/abs/1304.1505

source
CausalInference.cpdagFunction
cpdag(skel::DiGraph)

Computes the CPDAG from a DAG using Chickering's conversion algorithm (equivalent to the PC algorithm with d-separation as independence test, see pc_oracle.)

Reference: M. Chickering: Learning equivalence classes of Bayesian network structures. Journal of Machine Learning Research 2 (2002). M. Chickering: A Transformational Characterization of Equivalent Bayesian Network Structures. (1995).

Note that the edge order defined there is already partly encoded into the representation of a DiGraph.

source
CausalInference.alt_cpdagFunction
alt_cpdag(g::DiGraph)

Computes the CPDAG from a DAG using a simpler adaptation of Chickering's conversion algorithm without ordering all edges (equivalent to the PC algorithm with d-separation as independence test, see pc_oracle.)

Reference: M. Chickering: Learning equivalence classes of Bayesian network structures. Journal of Machine Learning Research 2 (2002). M. Chickering: A Transformational Characterization of Equivalent Bayesian Network Structures. (1995).

source
CausalInference.has_recanting_witnessFunction
has_recanting_witness(g::AbstractGraph, u, v,  blocked_edges::AbstractGraph) -> Bool

In a causal DAG with edges g, determine whether the path-specific causal effect from vertex u to v, with the edges in blocked_edges blocked, can be computed uniquely from the data available to the investigator (assuming complete observations), which is the case if there is no "recanting witness". Essentially this means that blocked_edges could equivalently be replaced by blocking only outgoing edges of u.

See Avin, Shpitser, Pearl (2005): "Identifiability of Path-Specific Effects", https://ftp.cs.ucla.edu/pub/stat_ser/r321-L.pdf.

source
CausalInference.backdoor_criterionFunction
backdoor_criterion(g::AbstractGraph, u::Integer, v::Integer, Z; verbose = false)

Test that, given the directed graph g, no node in Z is a descendant of u and Z d-separates u from v in the subgraph that only has the backdoors of u left (outgoing edges of u removed).

If so, the causal effect of u on v is identifiable and is given by the formula:

∑_z p(v | u, z) p(z)

In the linear Gaussian model, find E[Y | X = x, Z = z] = α + βx + γ'z and obtain E[Y | do(x)] = α + βx + γ'E[Z].

source
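The linear Gaussian remark can be checked numerically: regressing Y on X together with the adjustment set recovers the causal slope even though X is confounded (a sketch with made-up coefficients and synthetic data, not part of the package API):

```julia
using Random, LinearAlgebra

Random.seed!(1)
n = 10_000
z = randn(n)                          # confounder
x = 0.8 .* z .+ randn(n)              # treatment, confounded by z
y = 2.0 .* x .+ 1.5 .* z .+ randn(n)  # true causal effect of x on y is 2.0
β = [ones(n) x z] \ y                 # OLS of y on (1, x, z)
β[2]                                  # ≈ 2.0: the causal slope, recovered by adjustment
```

Omitting z from the regression would instead bias the slope upward, since the backdoor path through the confounder remains open.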
CausalInference.meek_rules!Function
meek_rules!(g; rule4=false)

Apply Meek's rules 1-3 or 1-4 with rule4=true to orient edges in a partially directed graph without creating cycles or new v-structures. Rule 4 is needed if edges are compelled/preoriented using external knowledge.

source
CausalInference.meek_rule1Function
meek_rule1(g, v, w)

Rule 1: Orient v-w into v->w whenever there is u->v such that u and w are not adjacent (otherwise a new v-structure is created.)

source
CausalInference.meek_rule2Function
meek_rule2(g, v, w)

Rule 2: Orient v-w into v->w whenever there is a chain v->k->w (otherwise a directed cycle is created.)

source
CausalInference.meek_rule3Function
meek_rule3(g, v, w)

Rule 3 (Diagonal): Orient v-w into v->w whenever there are two chains v-k->w and v-l->w such that k and l are nonadjacent (otherwise a new v-structure or a directed cycle is created.)

source
CausalInference.do!Function

do!(g, v)

Graphical do operator, removes all incoming edges to vertex v in a DiGraph g. Returns the modified graph.

source

PC algorithm

CausalInference.pcalgFunction
pcalg(g, I; stable=true)

Perform the PC algorithm for a set of 1:n variables using the tests

I(u, v, [s1, ..., sn])

Use IClosure(I, args) to wrap a function f with signature

f(u, v, [s1, ..., sn], par...)

Returns the CPDAG as a DiGraph. By default uses a stable and threaded version of the skeleton algorithm. (This is the most recent interface.)

source
pcalg(n::V, I, par...; stable=true)

Perform the PC algorithm for a set of 1:n variables using the tests

I(u, v, [s1, ..., sn], par...)

Returns the CPDAG as a DiGraph. By default uses a stable and threaded version of the skeleton algorithm.

source
pcalg(t, p::Float64, test::typeof(gausscitest); kwargs...)

Run the PC algorithm for tabular input data t using a p-value p to test for conditional independences using Fisher's z-transformation.

source
pcalg(t::T, p::Float64, cmitest::typeof(cmitest); kwargs...) where {T}

Run the PC algorithm for tabular input data t using a p-value p to detect conditional independences using a conditional mutual information permutation test.

source
CausalInference.orient_unshieldedFunction
orient_unshielded(g, dg, S)

Orient unshielded triples using the separating sets. g is an undirected graph containing edges of unknown direction; dg is a directed graph containing edges of known direction, with both v=>w and w=>v present if the direction of Edge(v,w) is unknown. S are the separating sets of edges.

Returns g, dg.

source
CausalInference.apply_pc_rulesFunction
apply_pc_rules(g, dg)

g is an undirected graph containing edges of unknown direction; dg is a directed graph containing edges of known direction, with both v=>w and w=>v present if the direction of Edge(v,w) is unknown.

Returns the CPDAG as DiGraph.

source
CausalInference.skeletonFunction
skeleton(n::Integer, I) -> g, S
skeleton(g, I) -> g, S

Perform the undirected PC skeleton algorithm for a set of 1:n variables using the test I. Start with a subgraph g or the complete undirected graph on n vertices. Returns skeleton graph and separating set.

source
CausalInference.skeleton_stableFunction
skeleton_stable(n::Integer, I) -> g, S
-skeleton_stable(g, I) -> g, S

Perform the undirected stable PC skeleton algorithm for a set of 1:n variables using the test I. Start with a subgraph g or the complete undirected graph on n vertices. Returns skeleton graph and separating set.

source
CausalInference.unshieldedFunction
unshielded(g)

Find the unshielded triples in the cyclefree skeleton. Triples are connected vertices v-w-z where z is not a neighbour of v. Uses that edges iterates in lexicographical order.

source
CausalInference.orientable_unshieldedFunction
orientable_unshielded(g, S)

Find the orientable unshielded triples in the skeleton. Triples are connected vertices v-w-z where z is not a neighbour of v. Uses that edges iterates in lexicographical order.

source
CausalInference.plot_pc_graph_tikzFunction
CausalInference.plot_pc_graph_tikz(g, node_labels::AbstractVector{<:AbstractString}=String[])

Plot the output of the PC algorithm (TikzGraphs backend).

Requires TikzGraphs to be imported

source
CausalInference.plot_pc_graph_recipesFunction
plot_pc_graph_recipes(g, node_labels::AbstractVector{<:AbstractString}=String[])

Plot the output of the PC algorithm (GraphRecipes backend).

Requires GraphRecipes and Plots to be imported

source
CausalInference.plot_pc_graph_textFunction
plot_pc_graph_text(g::AbstractGraph, node_labels::AbstractVector{<:AbstractString}=String[])

Plot the output of the PC algorithm (text-based output).

See also: plot_pc_graph and plot_pc_graph_tikz (for TikzGraphs.jl-based plotting), plot_pc_graph_recipes (for GraphRecipes.jl-based plotting)

source
CausalInference.prepare_pc_graphFunction
prepare_pc_graph(g::AbstractGraph, node_labels::AbstractVector{<:AbstractString}=String[])

Prepare resulting graph for plotting with various backends.

source

Statistics

CausalInference.gausscitestFunction
gausscitest(i, j, s, (C,n), c)

Test for conditional independence of variable no i and j given variables in s with Gaussian test at the critical value c. C is covariance of n observations.

source
CausalInference.partialcorFunction
partialcor(i, j, s, C)

Compute the partial correlation of nodes i and j given list of nodes s using the correlation matrix C.

source
CausalInference.cmitestFunction
cmitest(i,j,s,data,crit; kwargs...)

Test for conditional independence of variables i and j given variables in s with permutation test using nearest neighbor conditional mutual information estimates at p-value crit.

keyword arguments: kwargs...: keyword arguments passed to independence tests

source

KL Entropy Estimators

CausalInference.kl_entropyFunction
kl_entropy(data::Array{Float64, 2}; k=5)

Compute the nearest-neighbor estimate of the differential entropy of data.

data is a 2d array, with every column representing one data point. For further information, see

"A class of Rényi information estimators for multidimensional densities" Nikolai Leonenko, Luc Pronzato, and Vippal Savani The Annals of Statistics, 2008 https://projecteuclid.org/euclid.aos/1223908088

keyword arguments: k=5: number of nearest neighbors

source
CausalInference.kl_renyiFunction
kl_renyi(data::Array{Float64, 2}, q; k=5)

Compute the nearest-neighbor estimate of the Renyi-alpha entropy of data.

data is a 2d array, with every column representing one data point. For further information, see

"A class of Rényi information estimators for multidimensional densities" Nikolai Leonenko, Luc Pronzato, and Vippal Savani The Annals of Statistics, 2008 https://projecteuclid.org/euclid.aos/1223908088

keyword arguments: k=5: number of nearest neighbors

source
CausalInference.kl_mutual_informationFunction
kl_mutual_information(x, y; k=5, bias_correction=true)

compute the nearest-neighbor 'KGS' estimate of the mutual information between x and y.

x and y are 2d arrays, with every column representing one data point. For further information, see

"Estimating Mutual Information" Alexander Kraskov, Harald Stoegbauer, and Peter Grassberger Physical Review E https://arxiv.org/pdf/cond-mat/0305641.pdf

"Demystifying Fixed k-Nearest Neighbor Information Estimators" Weihao Gao, Sewoong Oh, Pramod Viswanath EEE International Symposium on Information Theory - Proceedings https://arxiv.org/pdf/1604.03006.pdf

keyword arguments: k=5: number of nearest neighbors bias_correction=true: flag to apply Gao's bias correction

source
CausalInference.kl_cond_miFunction
kl_cond_mi(x, y, z; k=5, bias_correction=true)

compute the nearest-neighbor 'KGS' estimate of the conditional mutual information between x and y given z.

x, y, and z are 2d arrays, with every column representing one data point. keyword arguments: k=5: number of nearest neighbors bias_correction=true: flag to apply Gao's bias correction

source
CausalInference.kl_perm_mi_testFunction
kl_perm_mi_test(x, y; k=5, B=100, bias_correction=true)

compute permutation test of independence of x and y.

keyword arguments: k=5: number of nearest neighbors to use for mutual information estimate B=100: number of permutations bias_correction=true: flag to apply Gao's bias correction

source
CausalInference.kl_perm_cond_mi_testFunction
kl_perm_cond_mi_test(x, y, z; k=5, B=100, kp=5, bias_correction=true)

compute permutation test of conditional independence of x and y given z.

For further information, see: "Conditional independence testing based on a nearest-neighbor estimator of conditional mutual information" Jakob Runge Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS) 2018, Lanzarote, Spain. http://proceedings.mlr.press/v84/runge18a/runge18a.pdf

keyword arguments: k=5: number of nearest neighbors to use for mutual information estimate B=100: number of permutations bias_correction=true: flag to apply Gao's bias correction

source

FCI algorithm

CausalInference.has_marksFunction
has_marks(dg, v1, v2, t::Tuple{Symbol, Symbol}

Test if the edge between node v1 and v2 has the edge markers given by the tuple t (use the arrow macro to simplify use)

Example: has_marks(dg, 1, 2, arrow"o->")

source
CausalInference.set_marks!Function
set_marks!(dg, v1, v2, t::Tuple{Symbol, Symbol})

Set edge marks between node v1 and v2.

Example: set_marks!(dg, 1, 2, arrow"*->")

source
CausalInference.fcialgFunction
fcialg(n::V, I, par...; augmented=true, verbose=false, kwargs...)

Perform the FCI algorithm for a set of n variables using the test

I(u, v, [s1, ..., sn], par...)

Returns the PAG as a MetaDiGraph

source
CausalInference.plot_fci_graph_tikzFunction
plot_fci_graph_tikz(g, node_labels::AbstractVector{<:AbstractString}=String[])

Plot the output of the FCI algorithm (TikzGraphs backend).

Requires TikzGraphs to be imported

source
CausalInference.plot_fci_graph_recipesFunction
plot_fci_graph_recipes(g, node_labels::AbstractVector{<:AbstractString}=String[])

Plot the output of the FCI algorithm (GraphRecipes backend).

Requires GraphRecipes and Plots to be imported

source
CausalInference.plot_fci_graph_textFunction
plot_fci_graph_text(g::AbstractGraph, node_labels::AbstractVector{<:AbstractString}=String[])

Plot the output of the FCI algorithm (text-based output).

See also: plot_fci_graph and plot_fci_graph_tikz (for TikzGraphs.jl-based plotting), plot_fci_graph_recipes (for GraphRecipes.jl-based plotting)

source
CausalInference.prepare_fci_graphFunction
prepare_fci_graph(g::AbstractGraph, node_labels::AbstractVector{<:AbstractString}=String[])

Prepare the output of the FCI algorithm for plotting with various backends.

source

Miscellaneous

CausalInference.randdagFunction
randdag(n, alpha = 0.1)

Create Erdős–Rényi random DAG from randomly permuted random triangular matrix with edge probability alpha.

source
CausalInference.ordered_edgesFunction
ordered_edges(dag)

Iterator of edges of a dag, ordered in Chickering order:

Perform a topological sort on the NODES
+Library · CausalInference.jl

Library

(Partially) directed acyclic graphs (PDAGs and DAGs)

CausalInference.has_a_pathFunction
has_a_path(g::AbstractGraph, U::Vector, V::VectorOrVertex, exclude_vertices::AbstractVector = T[], nbs=Graphs.outneighbors)

Find if there is a (semi-directed) path connecting U with V not passing exclude_vertices, where nbs=Graphs.outneighbors determines the direction of traversal.

source
CausalInference.iscliqueFunction
isclique(g, nodes)

Return true if all vertices in nodes are adjacent to each other in the graph g (cf. Chickering, "Learning equivalence classes of Bayesian network structures", 2002). Note that a 3-clique with already two undirected edges is also a clique in this neighbor sense.

source
CausalInference.childrenFunction
children(g, x)

Children of x in g are the vertices y such that there is a directed edge x –> y. Returns a sorted array.

source
CausalInference.isorientedFunction
isoriented(g, edge::Edge)
isoriented(g, x, y)

Test if x and y are connected by a directed edge in the graph g, either x←y OR x→y. Can also perform the same test given an edge.

source
CausalInference.isadjacentFunction
isadjacent(g, x, y)

Test if x and y are connected by any edge in the graph g (i.e. x – y, x –> y, or x <– y).

source

Causal graphs

CausalInference.dsepFunction
dsep(g::AbstractGraph, u, v, s; verbose = false)

Check whether u and v are d-separated given set s. Algorithm: unrolled https://arxiv.org/abs/1304.1505

source
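A minimal sketch of dsep on a collider, assuming the standard Graphs.jl constructors and a plain Vector for the conditioning set s (the three-vertex graph is illustrative):

```julia
using CausalInference, Graphs

# Collider 1 → 3 ← 2: 1 and 2 are marginally d-separated,
# but conditioning on the collider 3 d-connects them.
g = DiGraph(3)
add_edge!(g, 1, 3)
add_edge!(g, 2, 3)

dsep(g, 1, 2, Int[])   # true: no conditioning, path blocked at the collider
dsep(g, 1, 2, [3])     # false: conditioning on the collider opens the path
```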
CausalInference.cpdagFunction
cpdag(skel::DiGraph)

Computes CPDAG from a DAG using Chickering's conversion algorithm (equivalent to the PC algorithm with d-separation as independence test, see pc_oracle.)

Reference: M. Chickering: Learning equivalence classes of Bayesian network structures. Journal of Machine Learning Research 2 (2002). M. Chickering: A Transformational Characterization of Equivalent Bayesian Network Structures. (1995).

Note that the edge order defined there is already partly encoded into the representation of a DiGraph.

source
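A small sketch of cpdag on a hand-built DAG (the chain used here is illustrative; undirected edges of the CPDAG are represented by edges in both directions of the returned DiGraph):

```julia
using CausalInference, Graphs

# The chain 1 → 2 → 3 contains no v-structure, so its Markov
# equivalence class leaves both edges undirected in the CPDAG.
d = DiGraph(3)
add_edge!(d, 1, 2)
add_edge!(d, 2, 3)

c = cpdag(d)  # CPDAG of the equivalence class of d, as a DiGraph
```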
CausalInference.alt_cpdagFunction
alt_cpdag(g::DiGraph)

Computes CPDAG from a DAG using a simpler adaption of Chickering's conversion algorithm without ordering all edges (equivalent to the PC algorithm with d-separation as independence test, see pc_oracle.)

Reference: M. Chickering: Learning equivalence classes of Bayesian network structures. Journal of Machine Learning Research 2 (2002). M. Chickering: A Transformational Characterization of Equivalent Bayesian Network Structures. (1995).

source
CausalInference.has_recanting_witnessFunction
has_recanting_witness(g::AbstractGraph, u, v,  blocked_edges::AbstractGraph) -> Bool

In a causal DAG with edges g, determine whether the path-specific causal effect from vertex u to v with the edges in blocked_edges blocked can be computed uniquely from the data available to the investigator (assuming complete observations), which is the case if there is no "recanting witness". Essentially this means that blocked_edges could equivalently be replaced by a set blocking only outgoing edges of u.

See Avin, Shpitser, Pearl (2005): "Identifiability of Path-Specific Effects", https://ftp.cs.ucla.edu/pub/stat_ser/r321-L.pdf.

source
CausalInference.backdoor_criterionFunction
backdoor_criterion(g::AbstractGraph, u::Integer, v::Integer, Z; verbose = false)

Test that, given the directed graph g, no node in Z is a descendant of u and Z d-separates u from v in the subgraph that only has the backdoors of u left (outgoing edges of u removed).

If so, the causal effect of u on v is identifiable and is given by the formula:

∑_{z∈Z} p(v | u, z)p(z)

In the linear Gaussian model, find E[Y | X = x, Z = z] = α + βx + γ'z and obtain E[Y | do(x)] = α + βx + γ'E[Z].

source
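A minimal sketch of backdoor_criterion on a confounded triangle, assuming Z may be passed as a Vector (the vertex numbering is illustrative):

```julia
using CausalInference, Graphs

# u = 1, v = 2, confounder 3: edges 3 → 1, 3 → 2 and 1 → 2.
g = DiGraph(3)
add_edge!(g, 3, 1)
add_edge!(g, 3, 2)
add_edge!(g, 1, 2)

backdoor_criterion(g, 1, 2, [3])    # true: 3 blocks the backdoor 1 ← 3 → 2
backdoor_criterion(g, 1, 2, Int[])  # false: the backdoor path stays open
```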
CausalInference.meek_rules!Function
meek_rules!(g; rule4=false)

Apply Meek's rules 1-3 or 1-4 with rule4=true to orient edges in a partially directed graph without creating cycles or new v-structures. Rule 4 is needed if edges are compelled/preoriented using external knowledge.

source
CausalInference.meek_rule1Function
meek_rule1(g, v, w)

Rule 1: Orient v-w into v->w whenever there is u->v such that u and w are not adjacent (otherwise a new v-structure is created.)

source
CausalInference.meek_rule2Function
meek_rule2(g, v, w)

Rule 2: Orient v-w into v->w whenever there is a chain v->k->w (otherwise a directed cycle is created.)

source
CausalInference.meek_rule3Function
meek_rule3(g, v, w)

Rule 3 (Diagonal): Orient v-w into v->w whenever there are two chains v-k->w and v-l->w such that k and l are nonadjacent (otherwise a new v-structure or a directed cycle is created.)

source
CausalInference.do!Function

do!(g, v)

Graphical do operator, removes all incoming edges to vertex v in a DiGraph g. Returns the modified graph.

source
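A small sketch of do! (the graph is illustrative), showing that only the incoming edges of the intervened vertex are removed:

```julia
using CausalInference, Graphs

g = DiGraph(3)
add_edge!(g, 1, 2)  # incoming edge of 2
add_edge!(g, 3, 2)  # incoming edge of 2
add_edge!(g, 2, 3)  # outgoing edge of 2

do!(g, 2)           # removes 1 → 2 and 3 → 2, keeps 2 → 3
collect(edges(g))   # only the edge 2 → 3 should remain
```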

PC algorithm

CausalInference.pcalgFunction
pcalg(g, I; stable=true)

Perform the PC algorithm for a set of 1:n variables using the tests

I(u, v, [s1, ..., sn])

Use IClosure(I, args) to wrap a function f with signature

f(u, v, [s1, ..., sn], par...)

Returns the CPDAG as DiGraph. By default uses a stable and threaded version of the skeleton algorithm. (This is the most recent interface.)

source
pcalg(n::V, I, par...; stable=true)

Perform the PC algorithm for a set of 1:n variables using the tests

I(u, v, [s1, ..., sn], par...)

Returns the CPDAG as DiGraph. By default uses a stable and threaded version of the skeleton algorithm.

source
pcalg(t, p::Float64, test::typeof(gausscitest); kwargs...)

Run PC algorithm for tabular input data t using a p-value p to test for conditional independences using Fisher's z-transformation.

source
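A hedged usage sketch for the tabular interface above, assuming t may be any Tables.jl-compatible source such as a NamedTuple of columns (the simulated structure and column names are illustrative):

```julia
using CausalInference, Random
Random.seed!(42)

# Simulate x → v, x → w, v → z ← w with Gaussian noise.
N = 1000
x = randn(N)
v = x .+ 0.25 .* randn(N)
w = x .+ 0.25 .* randn(N)
z = v .+ w .+ 0.25 .* randn(N)
df = (x = x, v = v, w = w, z = z)

est = pcalg(df, 0.01, gausscitest)  # estimated CPDAG as a DiGraph
```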
pcalg(t::T, p::Float64, test::typeof(cmitest); kwargs...) where {T}

Run PC algorithm for tabular input data t using a p-value p to detect conditional independences using a conditional mutual information permutation test.

source
CausalInference.orient_unshieldedFunction
orient_unshielded(g, dg, S)

Orient unshielded triples using the separating sets. g is an undirected graph containing edges of unknown direction, dg is a directed graph containing edges of known direction and both v=>w and w=>v if the direction of Edge(v,w) is unknown. S are the separating sets of edges.

Returns g, dg.

source
CausalInference.apply_pc_rulesFunction
apply_pc_rules(g, dg)

g is an undirected graph containing edges of unknown direction, dg is a directed graph containing edges of known direction and both v=>w and w=>v if the direction of Edge(v,w) is unknown.

Returns the CPDAG as DiGraph.

source
CausalInference.skeletonFunction
skeleton(n::Integer, I) -> g, S
skeleton(g, I) -> g, S

Perform the undirected PC skeleton algorithm for a set of 1:n variables using the test I. Start with a subgraph g or the complete undirected graph on n vertices. Returns skeleton graph and separating set.

source
CausalInference.skeleton_stableFunction
skeleton_stable(n::Integer, I) -> g, S
skeleton_stable(g, I) -> g, S

Perform the undirected stable PC skeleton algorithm for a set of 1:n variables using the test I. Start with a subgraph g or the complete undirected graph on n vertices. Returns skeleton graph and separating set.

source
CausalInference.unshieldedFunction
unshielded(g)

Find the unshielded triples in the cycle-free skeleton. Triples are connected vertices v-w-z where z is not a neighbour of v. Uses the fact that edges iterate in lexicographic order.

source
CausalInference.orientable_unshieldedFunction
orientable_unshielded(g, S)

Find the orientable unshielded triples in the skeleton. Triples are connected vertices v-w-z where z is not a neighbour of v. Uses the fact that edges iterate in lexicographic order.

source
CausalInference.plot_pc_graph_tikzFunction
CausalInference.plot_pc_graph_tikz(g, node_labels::AbstractVector{<:AbstractString}=String[])

Plot the output of the PC algorithm (TikzGraphs backend).

Requires TikzGraphs to be imported

source
CausalInference.plot_pc_graph_recipesFunction
plot_pc_graph_recipes(g, node_labels::AbstractVector{<:AbstractString}=String[])

Plot the output of the PC algorithm (GraphRecipes backend).

Requires GraphRecipes and Plots to be imported

source
CausalInference.plot_pc_graph_textFunction
plot_pc_graph_text(g::AbstractGraph, node_labels::AbstractVector{<:AbstractString}=String[])

Plot the output of the PC algorithm (text-based output).

See also: plot_pc_graph and plot_pc_graph_tikz (for TikzGraphs.jl-based plotting), plot_pc_graph_recipes (for GraphRecipes.jl-based plotting)

source
CausalInference.prepare_pc_graphFunction
prepare_pc_graph(g::AbstractGraph, node_labels::AbstractVector{<:AbstractString}=String[])

Prepare resulting graph for plotting with various backends.

source

Statistics

CausalInference.gausscitestFunction
gausscitest(i, j, s, (C,n), c)

Test for conditional independence of variables i and j given the variables in s with a Gaussian test at the critical value c. C is the covariance matrix of n observations.

source
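A hedged sketch of gausscitest and partialcor on simulated data. The docstring asks for the covariance of the observations; for standardized data the correlation matrix coincides with it, which is assumed here, and the critical value 1.96 is an illustrative choice:

```julia
using CausalInference, Random, Statistics
Random.seed!(1)

# Chain x → y → z: x and z are conditionally independent given y.
n = 2000
x = randn(n)
y = x .+ 0.5 .* randn(n)
z = y .+ 0.5 .* randn(n)
C = cor(hcat(x, y, z))

gausscitest(1, 3, [2], (C, n), 1.96)  # x ⫫ z | y: expected to hold
partialcor(1, 3, [2], C)              # partial correlation near zero
```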
CausalInference.partialcorFunction
partialcor(i, j, s, C)

Compute the partial correlation of nodes i and j given list of nodes s using the correlation matrix C.

source
CausalInference.cmitestFunction
cmitest(i,j,s,data,crit; kwargs...)

Test for conditional independence of variables i and j given variables in s with permutation test using nearest neighbor conditional mutual information estimates at p-value crit.

keyword arguments: kwargs...: keyword arguments passed to independence tests

source

KL Entropy Estimators

CausalInference.kl_entropyFunction
kl_entropy(data::Array{Float64, 2}; k=5)

Compute the nearest-neighbor estimate of the differential entropy of data.

data is a 2d array, with every column representing one data point. For further information, see

"A class of Rényi information estimators for multidimensional densities" Nikolai Leonenko, Luc Pronzato, and Vippal Savani The Annals of Statistics, 2008 https://projecteuclid.org/euclid.aos/1223908088

keyword arguments: k=5: number of nearest neighbors

source
CausalInference.kl_renyiFunction
kl_renyi(data::Array{Float64, 2}, q; k=5)

Compute the nearest-neighbor estimate of the Rényi-α entropy of data.

data is a 2d array, with every column representing one data point. For further information, see

"A class of Rényi information estimators for multidimensional densities" Nikolai Leonenko, Luc Pronzato, and Vippal Savani The Annals of Statistics, 2008 https://projecteuclid.org/euclid.aos/1223908088

keyword arguments: k=5: number of nearest neighbors

source
CausalInference.kl_mutual_informationFunction
kl_mutual_information(x, y; k=5, bias_correction=true)

Compute the nearest-neighbor 'KSG' estimate of the mutual information between x and y.

x and y are 2d arrays, with every column representing one data point. For further information, see

"Estimating Mutual Information" Alexander Kraskov, Harald Stoegbauer, and Peter Grassberger Physical Review E https://arxiv.org/pdf/cond-mat/0305641.pdf

"Demystifying Fixed k-Nearest Neighbor Information Estimators" Weihao Gao, Sewoong Oh, Pramod Viswanath IEEE International Symposium on Information Theory - Proceedings https://arxiv.org/pdf/1604.03006.pdf

keyword arguments: k=5: number of nearest neighbors bias_correction=true: flag to apply Gao's bias correction

source
CausalInference.kl_cond_miFunction
kl_cond_mi(x, y, z; k=5, bias_correction=true)

Compute the nearest-neighbor 'KSG' estimate of the conditional mutual information between x and y given z.

x, y, and z are 2d arrays, with every column representing one data point. keyword arguments: k=5: number of nearest neighbors bias_correction=true: flag to apply Gao's bias correction

source
CausalInference.kl_perm_mi_testFunction
kl_perm_mi_test(x, y; k=5, B=100, bias_correction=true)

Compute a permutation test of independence of x and y.

keyword arguments: k=5: number of nearest neighbors to use for mutual information estimate B=100: number of permutations bias_correction=true: flag to apply Gao's bias correction

source
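A hedged usage sketch of kl_perm_mi_test; the 2d-array layout with columns as data points follows the estimator docstrings above, and the returned value is assumed to be a p-value of the permutation test:

```julia
using CausalInference, Random
Random.seed!(1)

# Columns are data points, so a 1×500 array is 500 univariate samples.
x = reshape(randn(500), 1, 500)
y = x .+ 0.1 .* reshape(randn(500), 1, 500)  # strongly dependent on x

p = kl_perm_mi_test(x, y; k = 5, B = 100)  # small p suggests dependence
```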
CausalInference.kl_perm_cond_mi_testFunction
kl_perm_cond_mi_test(x, y, z; k=5, B=100, kp=5, bias_correction=true)

Compute a permutation test of conditional independence of x and y given z.

For further information, see: "Conditional independence testing based on a nearest-neighbor estimator of conditional mutual information" Jakob Runge Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS) 2018, Lanzarote, Spain. http://proceedings.mlr.press/v84/runge18a/runge18a.pdf

keyword arguments: k=5: number of nearest neighbors to use for mutual information estimate B=100: number of permutations bias_correction=true: flag to apply Gao's bias correction

source

FCI algorithm

CausalInference.has_marksFunction
has_marks(dg, v1, v2, t::Tuple{Symbol, Symbol})

Test if the edge between nodes v1 and v2 has the edge marks given by the tuple t (use the arrow string macro to simplify use).

Example: has_marks(dg, 1, 2, arrow"o->")

source
CausalInference.set_marks!Function
set_marks!(dg, v1, v2, t::Tuple{Symbol, Symbol})

Set edge marks between node v1 and v2.

Example: set_marks!(dg, 1, 2, arrow"*->")

source
CausalInference.fcialgFunction
fcialg(n::V, I, par...; augmented=true, verbose=false, kwargs...)

Perform the FCI algorithm for a set of n variables using the test

I(u, v, [s1, ..., sn], par...)

Returns the PAG as a MetaDiGraph

source
CausalInference.plot_fci_graph_tikzFunction
plot_fci_graph_tikz(g, node_labels::AbstractVector{<:AbstractString}=String[])

Plot the output of the FCI algorithm (TikzGraphs backend).

Requires TikzGraphs to be imported

source
CausalInference.plot_fci_graph_recipesFunction
plot_fci_graph_recipes(g, node_labels::AbstractVector{<:AbstractString}=String[])

Plot the output of the FCI algorithm (GraphRecipes backend).

Requires GraphRecipes and Plots to be imported

source
CausalInference.plot_fci_graph_textFunction
plot_fci_graph_text(g::AbstractGraph, node_labels::AbstractVector{<:AbstractString}=String[])

Plot the output of the FCI algorithm (text-based output).

See also: plot_fci_graph and plot_fci_graph_tikz (for TikzGraphs.jl-based plotting), plot_fci_graph_recipes (for GraphRecipes.jl-based plotting)

source
CausalInference.prepare_fci_graphFunction
prepare_fci_graph(g::AbstractGraph, node_labels::AbstractVector{<:AbstractString}=String[])

Prepare the output of the FCI algorithm for plotting with various backends.

source

Miscellaneous

CausalInference.randdagFunction
randdag(n, alpha = 0.1)

Create an Erdős–Rényi random DAG from a randomly permuted random triangular matrix with edge probability alpha.

source
CausalInference.ordered_edgesFunction
ordered_edges(dag)

Iterator of edges of a dag, ordered in Chickering order:

Perform a topological sort on the NODES
while there are unordered EDGES in g
    Let y be the lowest ordered NODE that has an unordered EDGE incident into it
    Let x be the highest ordered NODE for which x => y is not ordered
    return x => y
end

source
CausalInference.graph_to_textFunction
graph_to_text(g::AbstractGraph, node_labels::AbstractVector{<:AbstractString}=[]; edge_styles::AbstractDict=Dict())

Print out a graph g as a table of edges labeled by node_labels.

Arguments

  • g::AbstractGraph: a graph to print
  • node_labels::AbstractVector{<:AbstractString}=[]: labels for nodes (same order as indices of nodes in g)
  • edge_styles::AbstractDict=Dict(): dictionary of edge styles (e.g. Dict((1, 2) => "->", (2, 3) => "<->"))

Example

g = DiGraph(4)
 for (i, j) in [(1, 2), (2, 3), (2, 4)]
     add_edge!(g, i, j)
 end
graph_to_text(g)
source
CausalInference.kwargs_pdag_graphmakieFunction
kwargs_pdag_graphmakie(g; ilabels=1:nv(g), arrowsize=25, ilabels_fontsize=25)

Generates the keywords for GraphMakie.graphplot to plot causal graphs and (C)PDAGs represented as SimpleDiGraph as partially directed graphs.

Usage:

graphplot(g; kwargs_pdag_graphmakie(g)...)
source
CausalInference.combinations_withoutFunction
combinations_without(a, n::Integer, w)

Generate all combinations of n elements from an indexable object except those with index w. Note that the combinations are modified in place.

source

Bayes ball

CausalInference.bayesballFunction
bayesball(g, X, S = Set{eltype(g)}())

Return the set of vertices d-connected to the set of vertices X given set of vertices S in dag g.

source
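A minimal sketch of bayesball on a chain, assuming X and S are passed as Sets of vertices (the graph is illustrative):

```julia
using CausalInference, Graphs

# Chain 1 → 2 → 3.
g = DiGraph(3)
add_edge!(g, 1, 2)
add_edge!(g, 2, 3)

bayesball(g, Set([1]))            # given ∅, the chain d-connects 1 to 2 and 3
bayesball(g, Set([1]), Set([2]))  # conditioning on 2 blocks the path to 3
```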
CausalInference.bayesball_graphFunction
bayesball_graph(g, X, S = Set{eltype(g)}(); back=false)

Return a mixed graph b containing edges for the possible moves of the Bayes ball. Vertex x of g is vertex "x forward" at 2x-1 of b if entered forward and "x backward" at 2x if entered backward. y is d-connected to x given S if and only if there is a semi-directed path in b from "x backward" to "y forward" or "y backward". back=true allows paths through X.

source

Adjustment

CausalInference.ancestorsFunction
ancestors(g, X, veto = no_veto)

Return the set of ancestors of the set of vertices X in graph g.

Every vertex is an ancestor of itself.

source
CausalInference.descendantsFunction
descendants(g, X, veto = no_veto)

Return the set of descendants of the set of vertices X in graph g.

Every vertex is a descendant of itself.

source
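A small sketch of ancestors and descendants (Set-valued X assumed, as in the signatures above; the graph is illustrative):

```julia
using CausalInference, Graphs

# 1 → 2, 2 → 3, 2 → 4.
g = DiGraph(4)
add_edge!(g, 1, 2)
add_edge!(g, 2, 3)
add_edge!(g, 2, 4)

ancestors(g, Set([3]))    # contains 1, 2 and 3 (every vertex is its own ancestor)
descendants(g, Set([2]))  # contains 2, 3 and 4
```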
CausalInference.alt_test_dsepFunction
alt_test_dsep(g, X, Y, S, veto = no_veto)

Check if sets of vertices X and Y are d-separated in g given S.

An alternative to the test_dsep function, which uses gensearch under the hood. Might be (a bit) slower.

source
CausalInference.test_covariate_adjustmentFunction
test_covariate_adjustment(g, X, Y, S)

Check if S is a covariate adjustment set relative to (X, Y) in graph g.

Based on the sound and complete graphical criterion for covariate adjustment given in https://arxiv.org/abs/1203.3515 using the algorithmic approach proposed in https://arxiv.org/abs/1803.00116. Output is a boolean.

source
CausalInference.alt_test_backdoorFunction
alt_test_backdoor(g, X, Y, S)

Check if S satisfies the backdoor criterion relative to (X, Y) in graph g.

The generalization to sets X and Y differs from, e.g., Pearl (2009). See the Example section (TODO: ref).

source
CausalInference.find_dsepFunction
find_dsep(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y), veto = no_veto)

Find a d-separator Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

Based on the algorithmic approach proposed in https://arxiv.org/abs/1803.00116.

source
CausalInference.find_min_dsepFunction
find_min_dsep(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y), veto = no_veto)

Find an inclusion minimal d-separator Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, i.e., one for which no subset is a d-separator, else return false.

Based on the algorithmic approach proposed in http://auai.org/uai2019/proceedings/papers/222.pdf.

source
CausalInference.find_covariate_adjustmentFunction
find_covariate_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

Find a covariate adjustment set Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

Based on the algorithmic approach proposed in https://arxiv.org/abs/1803.00116.

source
CausalInference.find_backdoor_adjustmentFunction
find_backdoor_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

Find a backdoor adjustment set Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

The generalization to sets X and Y differs from, e.g., Pearl (2009). See the Example section (TODO: ref).

source
CausalInference.find_frontdoor_adjustmentFunction
find_frontdoor_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

Find a frontdoor adjustment set Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

Based on the algorithm given in https://arxiv.org/abs/2211.16468.

source
CausalInference.find_min_covariate_adjustmentFunction
find_min_covariate_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

Find an inclusion minimal covariate adjustment set Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

Based on the algorithmic approach proposed in https://arxiv.org/abs/1803.00116.

source
CausalInference.find_min_backdoor_adjustmentFunction
find_min_backdoor_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

Find an inclusion minimal backdoor adjustment set Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

The generalization to sets X and Y differs from, e.g., Pearl (2009). See the Example section (TODO: ref).

source
CausalInference.find_min_frontdoor_adjustmentFunction
find_min_frontdoor_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

Find an inclusion minimal frontdoor adjustment set Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else returns false.

Based on the algorithm given in https://arxiv.org/abs/2211.16468.

source
CausalInference.list_dsepsFunction
list_dseps(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

List all d-separators Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g.

source
CausalInference.list_covariate_adjustmentFunction
list_covariate_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

List all covariate adjustment sets Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g.

source
CausalInference.list_backdoor_adjustmentFunction
list_backdoor_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

List all back-door adjustment sets Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g.

source
CausalInference.list_frontdoor_adjustmentFunction
list_frontdoor_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

List all front-door adjustment sets Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g.

source
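A hedged sketch tying the find/list adjustment functions together on the confounded triangle; the Set-valued X and Y follow the signatures above, and the returned set is expected to contain the confounder:

```julia
using CausalInference, Graphs

# X = {1}, Y = {2}, confounder 3: edges 3 → 1, 3 → 2, 1 → 2.
g = DiGraph(3)
add_edge!(g, 3, 1)
add_edge!(g, 3, 2)
add_edge!(g, 1, 2)

Z = find_covariate_adjustment(g, Set([1]), Set([2]))   # a set containing 3
all_Z = collect(list_covariate_adjustment(g, Set([1]), Set([2])))
```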

GES

CausalInference.gesFunction
ges(X; method=:gaussian_bic, penalty=0.5, parallel=false, verbose=false)

Compute a causal graph for the given observed data X (variables in columns) using GES. Returns the CPDAG, the score improvement relative to the empty graph, and time measurements of the first and second phases.

source
ges(n, local_score; score=0.0, parallel=false, verbose=false)

Internal method called by ges.

source
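A hedged usage sketch of the data-facing ges method; the simulated structure and the penalty value are illustrative, and only the first two returned values (CPDAG and score improvement) are destructured:

```julia
using CausalInference, Random
Random.seed!(0)

# Simulate x → v and x → w; variables are the columns of X.
N = 2000
x = randn(N)
v = x .+ 0.25 .* randn(N)
w = x .+ 0.25 .* randn(N)
X = hcat(x, v, w)

g, score = ges(X; penalty = 1.0)  # estimated CPDAG and score improvement
```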
CausalInference.local_scoreFunction
local_score(os::GaussianScore, p, v)

Local Gaussian BIC score. Memoized for GaussianScore{Float64, Symmetric{Float64, Matrix{Float64}}}.

source
CausalInference.Insert!Function
Insert!(g, x, y, T)

Inserts x->y and directs previously undirected edges t->y, t ∈ T. Here x and y are not adjacent and T are undirected-neighbors of y that are not adjacent to x.

source
CausalInference.Delete!Function
Delete!(g, x, y, H)

Deletes x-y or x->y and directs previously undirected edges x->h and y->h for h in H.

source

Causal Zig-Zag

CausalInference.nupFunction
nup(g, total)

Number of directed edges that can be added to a PDAG g having total edges.

source
CausalInference.ndownFunction
ndown(g, total)

Number of edges that can be removed from a PDAG, counting undirected edges twice.

source
CausalInference.count_moves_uniformFunction
count_moves_uniform(g, κ=nv(g) - 1) = s1, s2, (x1, y1, T1), (x2, y2, H2)

Counts and samples operators in polynomial time by avoiding full enumeration (only works for the uniform score.) Count the number s1 of Insert and s2 of Delete operators for a CPDAG g with degree bound κ and return a uniformly selected Insert(x1, y1, T1) and a uniformly selected Delete(x2, y2, H2) operator.

source
CausalInference.causalzigzagFunction
causalzigzag(n, G = (DiGraph(n), 0); balance = metropolis_balance, prior = (_,_)->1.0, score=UniformScore(),
+graph_to_text(g)
source
CausalInference.kwargs_pdag_graphmakieFunction
kwargs_pdag_graphmakie(g; ilabels=1:nv(g), arrowsize=25, ilabels_fontsize=25)

Generates the keyword arguments for GraphMakie.graphplot to plot causal graphs and (C)PDAGs, represented as SimpleDiGraph, as partially directed graphs.

Usage:

graphplot(g; kwargs_pdag_graphmakie(g)...)
source
CausalInference.combinations_withoutFunction
combinations_without(a, n::Integer, w)

Generate all combinations of n elements from an indexable object, except those containing the index w. Note that the combinations are modified in place.

source
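As an illustrative sketch (not the package's Julia implementation), the idea of enumerating combinations while skipping one index can be written in Python with itertools:

```python
from itertools import combinations

def combinations_without(a, n, w):
    """Yield all n-element combinations of the indexable object `a`
    whose index tuples avoid the index `w` (illustrative sketch)."""
    indices = [i for i in range(len(a)) if i != w]
    for idx in combinations(indices, n):
        yield tuple(a[i] for i in idx)

# Example: pairs from [10, 20, 30, 40] that never use index 1 (value 20)
pairs = list(combinations_without([10, 20, 30, 40], 2, 1))
# pairs == [(10, 30), (10, 40), (30, 40)]
```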

Bayes ball

CausalInference.bayesballFunction
bayesball(g, X, S = Set{eltype(g)}())

Return the set of vertices d-connected to the set of vertices X given set of vertices S in dag g.

source
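The Bayes-ball idea can be sketched as a breadth-first search over (vertex, direction) states, following the "reachable" procedure of Koller and Friedman. This Python toy (names and graph representation are my own, not the package API) returns the vertices d-connected to X given S in a DAG given as parent and child lists:

```python
from collections import deque

def ancestors_of(parents, nodes):
    """All vertices with a directed path into `nodes`, including `nodes`."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        v = stack.pop()
        for p in parents.get(v, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def bayes_ball(parents, children, X, S):
    """Vertices d-connected to the set X given S in a DAG
    (illustrative sketch of the Bayes-ball search)."""
    anc_S = ancestors_of(parents, S)
    reachable, visited = set(), set()
    queue = deque((x, 'up') for x in X)  # start as if arriving from a child
    while queue:
        v, d = queue.popleft()
        if (v, d) in visited:
            continue
        visited.add((v, d))
        if v not in S:
            reachable.add(v)
        if d == 'up' and v not in S:
            # non-collider passage: continue to parents and children
            for p in parents.get(v, ()): queue.append((p, 'up'))
            for c in children.get(v, ()): queue.append((c, 'down'))
        elif d == 'down':
            if v not in S:
                for c in children.get(v, ()): queue.append((c, 'down'))
            if v in anc_S:
                # collider with (an ancestor of) S observed: path opens upward
                for p in parents.get(v, ()): queue.append((p, 'up'))
    return reachable

# x -> z -> y and x -> w <- y: conditioning on the collider w opens x--y
parents  = {'z': ['x'], 'y': ['z'], 'w': ['x', 'y']}
children = {'x': ['z', 'w'], 'z': ['y'], 'y': ['w']}
con_given_z  = bayes_ball(parents, children, {'x'}, {'z'})
con_given_zw = bayes_ball(parents, children, {'x'}, {'z', 'w'})
```

Here `'y' in con_given_z` is false (the chain is blocked by z and the collider w is closed), while `'y' in con_given_zw` is true.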
CausalInference.bayesball_graphFunction
bayesball_graph(g, X, S = Set{eltype(g)}(); back=false)

Return a mixed graph b containing edges for the possible moves of the Bayes ball. Vertex x of g becomes vertex "x forward" at index 2x-1 of b if entered forward, and "x backward" at index 2x if entered backward. y is d-connected to x given S if and only if there is a semi-directed path in b from "x backward" to "y forward" or "y backward". back=true allows paths through X.

source

Adjustment

CausalInference.ancestorsFunction
ancestors(g, X, veto = no_veto)

Return the set of ancestors of the set of vertices X in graph g.

Every vertex is an ancestor of itself.

source
CausalInference.descendantsFunction
descendants(g, X, veto = no_veto)

Return the set of descendants of the set of vertices X in graph g.

Every vertex is a descendant of itself.

source
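Both ancestors and descendants are plain graph reachability, once in the reverse and once in the forward direction. A minimal Python sketch (dict-of-lists representation, my own names, not the package API):

```python
def reach(adj, X):
    """Vertices reachable from the set X by following `adj` edges;
    every vertex reaches itself (illustrative sketch)."""
    seen, stack = set(X), list(X)
    while stack:
        v = stack.pop()
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

# DAG 1 -> 2 -> 3 <- 4 as child lists and the reversed parent lists
children = {1: [2], 2: [3], 4: [3]}
parents  = {2: [1], 3: [2, 4]}

desc = reach(children, {1})  # descendants of 1: {1, 2, 3}
anc  = reach(parents, {3})   # ancestors of 3: {1, 2, 3, 4}
```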
CausalInference.alt_test_dsepFunction
alt_test_dsep(g, X, Y, S, veto = no_veto)

Check if sets of vertices X and Y are d-separated in g given S.

An alternative to the test_dsep function, which uses gensearch under the hood. Might be (a bit) slower.

source
CausalInference.test_covariate_adjustmentFunction
test_covariate_adjustment(g, X, Y, S)

Check if S is a covariate adjustment set relative to (X, Y) in graph g.

Based on the sound and complete graphical criterion for covariate adjustment given in https://arxiv.org/abs/1203.3515 using the algorithmic approach proposed in https://arxiv.org/abs/1803.00116. Output is a boolean.

source
CausalInference.alt_test_backdoorFunction
alt_test_backdoor(g, X, Y, S)

Check if S satisfies the backdoor criterion relative to (X, Y) in graph g.

The generalization to sets X and Y differs from, e.g., Pearl (2009). See the Example section (TODO: ref).

source
CausalInference.find_dsepFunction
find_dsep(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y), veto = no_veto)

Find a d-separator Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

Based on the algorithmic approach proposed in https://arxiv.org/abs/1803.00116.

source
CausalInference.find_min_dsepFunction
find_min_dsep(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y), veto = no_veto)

Find an inclusion minimal d-separator Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, i.e., one for which no subset is a d-separator, else return false.

Based on the algorithmic approach proposed in http://auai.org/uai2019/proceedings/papers/222.pdf.

source
CausalInference.find_covariate_adjustmentFunction
find_covariate_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

Find a covariate adjustment set Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

Based on the algorithmic approach proposed in https://arxiv.org/abs/1803.00116.

source
CausalInference.find_backdoor_adjustmentFunction
find_backdoor_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

Find a backdoor adjustment set Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

The generalization to sets X and Y differs from, e.g., Pearl (2009). See the Example section (TODO: ref).

source
CausalInference.find_frontdoor_adjustmentFunction
find_frontdoor_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

Find a frontdoor adjustment set Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

Based on the algorithm given in https://arxiv.org/abs/2211.16468.

source
CausalInference.find_min_covariate_adjustmentFunction
find_min_covariate_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

Find an inclusion minimal covariate adjustment set Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

Based on the algorithmic approach proposed in https://arxiv.org/abs/1803.00116.

source
CausalInference.find_min_backdoor_adjustmentFunction
find_min_backdoor_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

Find an inclusion minimal backdoor adjustment set Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

The generalization to sets X and Y differs from, e.g., Pearl (2009). See the Example section (TODO: ref).

source
CausalInference.find_min_frontdoor_adjustmentFunction
find_min_frontdoor_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

Find an inclusion minimal frontdoor adjustment set Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g, else return false.

Based on the algorithm given in https://arxiv.org/abs/2211.16468.

source
CausalInference.list_dsepsFunction
list_dseps(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

List all d-separators Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g.

source
CausalInference.list_covariate_adjustmentFunction
list_covariate_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

List all covariate adjustment sets Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g.

source
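All of the list_* functions enumerate sets Z with I ⊆ Z ⊆ R that pass some graphical criterion. A naive, exponential Python sketch of that enumeration pattern, with the criterion left as a pluggable predicate (the package uses a cleverer search, and the predicate below is a toy stand-in):

```python
from itertools import combinations

def list_sets_between(I, R, is_valid):
    """Yield every Z with I subseteq Z subseteq R for which is_valid(Z)
    holds. Naive power-set enumeration (illustrative sketch)."""
    I, free = set(I), sorted(set(R) - set(I))
    for k in range(len(free) + 1):
        for extra in combinations(free, k):
            Z = I | set(extra)
            if is_valid(Z):
                yield frozenset(Z)

# Toy criterion: Z must contain vertex 2 (stand-in for an adjustment test)
found = list(list_sets_between({1}, {1, 2, 3}, lambda Z: 2 in Z))
# found == [frozenset({1, 2}), frozenset({1, 2, 3})]
```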
CausalInference.list_backdoor_adjustmentFunction
list_backdoor_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

List all back-door adjustment sets Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g.

source
CausalInference.list_frontdoor_adjustmentFunction
list_frontdoor_adjustment(g, X, Y, I = Set{eltype(g)}(), R = setdiff(Set(vertices(g)), X, Y))

List all front-door adjustment sets Z with $I ⊆ Z ⊆ R$ for sets of vertices X and Y in g.

source

GES

CausalInference.gesFunction
ges(X; method=:gaussian_bic, penalty=0.5, parallel=false, verbose=false)

Compute a causal graph for the given observed data X (variables in columns) using GES. Returns the CPDAG, the score improvement relative to the empty graph, and time measurements of the first and second phases.

source
ges(n, local_score; score=0.0, parallel=false, verbose=false)

Internal method called by ges.

source
CausalInference.local_scoreFunction
local_score(os::GaussianScore, p, v)

Local Gaussian BIC score. Memoized for GaussianScore{Float64, Symmetric{Float64, Matrix{Float64}}}.

source
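One common form of a Gaussian BIC local score fits v linearly on its parents and penalizes the residual variance by the number of parameters. The following numpy sketch uses generic conventions and constants; it is not necessarily the exact formula or memoization scheme CausalInference.jl uses:

```python
import numpy as np

def local_gaussian_bic(X, v, parents, penalty=0.5):
    """Generic Gaussian BIC local score for column v of data matrix X
    given a list of parent columns: -(n/2) * log residual variance,
    minus penalty * (number of parameters) * log n. Illustrative only."""
    n = X.shape[0]
    y = X[:, v]
    if parents:
        A = np.column_stack([X[:, parents], np.ones(n)])  # design + intercept
    else:
        A = np.ones((n, 1))
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    rss = float(resid @ resid)
    return -0.5 * n * np.log(rss / n) - penalty * (len(parents) + 1) * np.log(n)

# With y strongly driven by x, scoring y with parent x should win
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 2.0 * x + rng.normal(size=500)
X = np.column_stack([x, y])
with_parent = local_gaussian_bic(X, 1, [0])
without     = local_gaussian_bic(X, 1, [])
```

Decomposability of such a score is what lets GES evaluate Insert and Delete operators by rescoring only the affected vertex.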
CausalInference.Insert!Function
Insert!(g, x, y, T)

Inserts x->y and directs previously undirected edges t->y for t ∈ T. Here x and y are not adjacent, and T is a set of undirected neighbors of y that are not adjacent to x.

source
CausalInference.Delete!Function
Delete!(g, x, y, H)

Deletes x-y or x->y and directs previously undirected edges x->h and y->h for h in H.

source

Causal Zig-Zag

CausalInference.nupFunction
nup(g, total)

Number of directed edges that can be added to a PDAG g that has total edges.

source
CausalInference.ndownFunction
ndown(g, total)

Number of edges that can be removed from a PDAG g that has total edges, counting undirected edges twice.

source
CausalInference.count_moves_uniformFunction
count_moves_uniform(g, κ=nv(g) - 1) = s1, s2, (x1, y1, T1), (x2, y2, H2)

Counts and samples operators in polynomial time by avoiding full enumeration (only works for the uniform score). Count the number s1 of Insert and s2 of Delete operators for a CPDAG g with degree bound κ and return a uniformly selected Insert(x1, y1, T1) operator and a uniformly selected Delete(x2, y2, H2) operator.

source
CausalInference.causalzigzagFunction
causalzigzag(n, G = (DiGraph(n), 0); balance = metropolis_balance, prior = (_,_)->1.0, score=UniformScore(),
                     coldness = 1.0, σ = 0.0, ρ = 1.0, naive=false,
                     κ = min(n - 1, 10), iterations=10, verbose=false, save=true)

Run the causal zigzag algorithm starting in a CPDAG (G, t) with t oriented or unoriented edges, using the balance function balance ∈ {metropolis_balance, barker_balance, sqrt}, a score function (see the ges algorithm) and the coldness parameter, for iterations steps. σ = 1.0, ρ = 0.0 gives purely diffusive behaviour; σ = 0.0, ρ = 1.0 gives Zig-Zag behaviour.

Returns a vector of tuples, each containing a graph, the time spent, the current direction, the number of edges and the score.

source
CausalInference.keyedreduceFunction
keyedreduce(op, key::AbstractVector{T}, a, init=0.0) where T

Similar to countmap, returning a dictionary mapping each unique key in key to the reduction of the collection a with the given binary operator op.

julia> keyedreduce(+, [:a, :b, :a], [7, 3, 2])
 Dict{Symbol, Float64} with 2 entries:
   :a => 9.0
   :b => 3.0
source
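keyedreduce behaves like a keyed fold. The same computation as the Julia example above can be sketched in Python (illustrative equivalent, not the package function):

```python
def keyedreduce(op, keys, values, init=0.0):
    """Reduce `values` into a dict keyed by the parallel `keys` list,
    mirroring the Julia keyedreduce example (illustrative sketch)."""
    out = {}
    for k, v in zip(keys, values):
        out[k] = op(out.get(k, init), v)
    return out

result = keyedreduce(lambda acc, v: acc + v, ['a', 'b', 'a'], [7, 3, 2])
# result == {'a': 9.0, 'b': 3.0}
```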
CausalInference.operator_mcsFunction
operator_mcs(G, K)

Perform a Maximum Cardinality Search on graph G. The elements of the clique K are prioritized and chosen first.

source
CausalInference.precompute_semidirectedFunction
precompute_semidirected(g, y)

Computes for vertex y all vertices reachable via a semi-directed path from any undirected neighbor of y or from y itself, with all vertices in this same set blocked.

source

DAG Zig-Zag

CausalInference.dagzigzagFunction
dagzigzag(n, G = DiGraph(n); balance = metropolis_balance, prior = (_,_)->1.0, score=UniformScore(),
                     coldness = 1.0, σ = 0.0, ρ = 1.0, 
                     κ = min(n - 1, 10), iterations=10, verbose=false, save=true)

Run the causal zigzag algorithm starting in a DAG G, using the balance function balance ∈ {metropolis_balance, barker_balance, sqrt}, a score function (see the ges algorithm) and the coldness parameter, for iterations steps. σ = 1.0, ρ = 0.0 gives purely diffusive behaviour; σ = 0.0, ρ = 1.0 gives Zig-Zag behaviour.

Returns a vector of tuples, each containing a graph, the time spent, the current direction, the number of edges and the score.

source

Exact

CausalInference.exactscorebasedFunction
exactscorebased(X; method=:gaussian_bic, penalty=0.5, parallel=false, verbose=false)

Compute a CPDAG for the given observed data X (variables in columns) using the exact algorithm proposed by Silander and Myllymäki (2006) for optimizing the BIC score (or any decomposable score). The complexity is n*2^n and the algorithm should scale up to ~20-25 variables; beyond that, memory becomes a problem.

  • Silander, T., & Myllymäki, P. (2006). A simple approach for finding the globally optimal Bayesian network structure. In Uncertainty in Artificial Intelligence (pp. 445-452).
source
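The core of the Silander-Myllymäki search is dynamic programming over variable subsets: the best network over a set W ends in some sink v, scored with its best parent set drawn from W \ {v}. A toy Python sketch of that recursion (the made-up local score and names are purely illustrative, not the package's implementation):

```python
from itertools import combinations

def best_network(variables, local_score):
    """Exact structure search by DP over subsets (Silander & Myllymäki,
    sketched): best[W] = max over sinks v in W of best[W - {v}] plus
    the best local score of v with parents from W - {v}."""
    variables = tuple(variables)

    def best_parents(v, cand):
        # brute-force best parent set within the candidate set
        return max(
            ((local_score(v, frozenset(S)), frozenset(S))
             for k in range(len(cand) + 1)
             for S in combinations(cand, k)),
            key=lambda t: t[0],
        )

    best = {frozenset(): (0.0, [])}  # subset -> (score, [(vertex, parents)])
    for size in range(1, len(variables) + 1):
        for W in map(frozenset, combinations(variables, size)):
            options = []
            for v in W:
                rest = W - {v}
                s, ps = best_parents(v, tuple(sorted(rest)))
                prev_s, prev_net = best[rest]
                options.append((prev_s + s, prev_net + [(v, ps)]))
            best[W] = max(options, key=lambda t: t[0])
    return best[frozenset(variables)]

# Made-up decomposable score: each parent costs 1, the "true" set earns 2
true_parents = {1: frozenset(), 2: frozenset({1}), 3: frozenset({2})}
score = lambda v, S: -len(S) + (2.0 if S == true_parents[v] else 0.0)
total, net = best_network([1, 2, 3], score)
# total == 4.0 and dict(net) recovers true_parents
```

The DAG found this way is then turned into its CPDAG, which is why the function above is documented as returning a CPDAG.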