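The dagitty (Textor and van der Zander 2016) and ggdag (Barrett 2018b) packages can be used to analyze probabilistic graphical models[1], and they pair well with the course by Koller (2018).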
library(ggdag)
library(dagitty)
Comparing dagitty and ggdag
dag <- dagitty(
"
dag {
y <- x <- z1 <- v -> z2 -> y
z1 <- w1 <-> w2 -> z2
x <- w1 -> y
x <- w2 -> y
x [exposure]
y [outcome]
}
"
)
tidy_dag <- tidy_dagitty(dag)
tidy_dag
## # A DAG with 7 nodes and 12 edges
## #
## # Exposure: x
## # Outcome: y
## #
## # A tibble: 13 x 8
## name x y direction to xend yend circular
## <chr> <dbl> <dbl> <fct> <chr> <dbl> <dbl> <lgl>
## 1 v 11.4 6.18 -> z1 10.1 5.51 FALSE
## 2 v 11.4 6.18 -> z2 12.0 4.93 FALSE
## 3 w1 10.1 4.06 -> x 10.6 4.57 FALSE
## 4 w1 10.1 4.06 -> y 11.2 3.63 FALSE
## 5 w1 10.1 4.06 -> z1 10.1 5.51 FALSE
## 6 w1 10.1 4.06 <-> w2 11.5 4.15 FALSE
## 7 w2 11.5 4.15 -> x 10.6 4.57 FALSE
## 8 w2 11.5 4.15 -> y 11.2 3.63 FALSE
## 9 w2 11.5 4.15 -> z2 12.0 4.93 FALSE
## 10 x 10.6 4.57 -> y 11.2 3.63 FALSE
## 11 z1 10.1 5.51 -> x 10.6 4.57 FALSE
## 12 z2 12.0 4.93 -> y 11.2 3.63 FALSE
## 13 y 11.2 3.63 <NA> <NA> NA NA FALSE
ggdag(tidy_dag)
- The graph syntax here resembles that of the DiagrammeR package developed by Iannone (2018).
- <-> denotes a bidirected arrow.
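To see how dagitty stores a bidirected edge, the sketch below builds a small graph and lists its edges with dagitty::edges(). The three-node graph and its node names (a, b, c) are made up for illustration and are not part of the example above.

```r
library(dagitty)

# Hypothetical toy graph: a -> b, plus a bidirected edge between b and c
toy <- dagitty("dag { a -> b <-> c }")

# edges() returns a data frame; the `e` column holds the edge type
toy_edges <- edges(toy)
toy_edges[, c("v", "e", "w")]

# Keep only the bidirected edges
bidirected <- toy_edges[as.character(toy_edges$e) == "<->", ]
bidirected
```

The same call on the larger dag object should list the w1 <-> w2 edge alongside the directed ones.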
tidy_ggdag <-
  ggdag::dagify(
    y ~ x + z2 + w2 + w1,
    x ~ z1 + w1,
    z1 ~ w1 + v,
    z2 ~ w2 + v,
    w1 ~~ w2, # bidirected path
    exposure = "x",
    outcome = "y"
  ) %>%
  tidy_dagitty()
tidy_ggdag
## # A DAG with 7 nodes and 11 edges
## #
## # Exposure: x
## # Outcome: y
## #
## # A tibble: 12 x 8
## name x y direction to xend yend circular
## <chr> <dbl> <dbl> <fct> <chr> <dbl> <dbl> <lgl>
## 1 v 11.8 10.5 -> z1 12.2 9.19 FALSE
## 2 v 11.8 10.5 -> z2 13.1 10.9 FALSE
## 3 w1 13.4 9.50 -> x 13.4 8.61 FALSE
## 4 w1 13.4 9.50 -> y 14.1 9.84 FALSE
## 5 w1 13.4 9.50 -> z1 12.2 9.19 FALSE
## 6 w1 13.4 9.50 <-> w2 14.2 10.7 FALSE
## 7 w2 14.2 10.7 -> y 14.1 9.84 FALSE
## 8 w2 14.2 10.7 -> z2 13.1 10.9 FALSE
## 9 x 13.4 8.61 -> y 14.1 9.84 FALSE
## 10 z1 12.2 9.19 -> x 13.4 8.61 FALSE
## 11 z2 13.1 10.9 -> y 14.1 9.84 FALSE
## 12 y 14.1 9.84 <NA> <NA> NA NA FALSE
ggdag(tidy_ggdag)
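- ggdag is built on top of ggplot2, which makes it very convenient.
- ~~ denotes a bidirected arrow.
- y ~ x + z2 + w2 + w1 means that all four variables cause y, analogous to the relationship between the independent and dependent variables in a regression formula.
- The input does not need to be a string.
- Note that this specification leaves out the w2 -> x edge included in the dagitty version above, which is why it reports 11 edges rather than 12.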
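Since dagify() returns a dagitty object, a formula specification can be checked against a string specification directly. The sketch below does this for a small three-node graph; the graph and the helper edge_key() are hypothetical and only serve the comparison.

```r
library(dagitty)
library(ggdag)

# The same toy graph written twice: as a dagitty string and as formulas
from_string  <- dagitty("dag { x -> y ; z -> x ; z -> y }")
from_formula <- dagify(y ~ x + z, x ~ z)

# Hypothetical helper: turn each edge into a sortable "from type to" key
edge_key <- function(g) {
  e <- dagitty::edges(g)
  sort(paste(e$v, e$e, e$w))
}

# TRUE when both specifications describe the same set of edges
same_edges <- identical(edge_key(from_string), edge_key(from_formula))
same_edges

# setdiff() on the keys shows which edges one version has and the other lacks
setdiff(edge_key(from_string), edge_key(from_formula))
```

Applying the same helper to the two seven-node graphs is one way to find out which edge accounts for the different edge counts.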
Adding labels
confounder_triangle is a built-in ggdag function; Barrett (2018a) gives an example with labels added.
confounder_triangle(x = "Coffee", y = "Lung Cancer", z = "Smoking") %>%
ggdag_dconnected(text = FALSE, use_labels = "label")
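What ggdag_dconnected() draws can also be checked numerically with dagitty. The sketch below writes a confounder triangle by hand, assuming it contains only z -> x and z -> y (no direct x -> y edge), and asks whether x and y are d-connected with and without conditioning on z.

```r
library(dagitty)

# Hand-written confounder triangle (assumed structure): z causes both x and y
triangle <- dagitty("dag { z -> x ; z -> y }")

# Without conditioning, the path x <- z -> y is open
open_path <- dconnected(triangle, "x", "y")

# Conditioning on the confounder z blocks that path
blocked <- dseparated(triangle, "x", "y", "z")

c(open_path = open_path, blocked = blocked)
```

In the coffee example, this corresponds to Coffee and Lung Cancer being associated only through Smoking.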
Adding labels in bulk
coffee_dag <- dagify(cancer ~ smoking,
                     smoking ~ addictive,
                     coffee ~ addictive,
                     exposure = "coffee",
                     outcome = "cancer",
                     labels = c("coffee" = "Coffee", "cancer" = "Lung Cancer",
                                "smoking" = "Smoking", "addictive" = "Addictive \nBehavior")) %>%
  tidy_dagitty(layout = "tree")
ggdag(coffee_dag, text = FALSE, use_labels = "label")
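The value passed to use_labels names a column in the tidy data. The sketch below checks this on a two-node graph, assuming the tidy object exposes its data frame as $data; the graph and its label texts are made up for illustration.

```r
library(ggdag)

# Hypothetical two-node graph with one label per node
labelled <- dagify(
  y ~ x,
  labels = c("x" = "Exposure", "y" = "Outcome")
) %>%
  tidy_dagitty()

# The labels travel with the tidy data as a `label` column
has_label_column <- "label" %in% names(labelled$data)
has_label_column

# One row per node name with its label
unique(labelled$data[, c("name", "label")])
```

Any other character column in the tidy data could presumably be passed to use_labels in the same way.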
References
Barrett, Malcolm. 2018a. “Common Structures of Bias.” 2018. https://cran.r-project.org/web/packages/ggdag/vignettes/bias-structures.html.
———. 2018b. Ggdag: Analyze and Create Elegant Directed Acyclic Graphs. https://CRAN.R-project.org/package=ggdag.
Iannone, Richard. 2018. DiagrammeR: Graph/Network Visualization. https://CRAN.R-project.org/package=DiagrammeR.
Koller, Daphne. 2018. “Probabilistic Graphical Models 1: Representation.” 2018. https://www.coursera.org/learn/probabilistic-graphical-models.
Textor, Johannes, and Benito van der Zander. 2016. Dagitty: Graphical Analysis of Structural Causal Models. https://CRAN.R-project.org/package=dagitty.
[1] Barrett (2018b) is built on top of Textor and van der Zander (2016).