node_zeroinfl {simDAG} | R Documentation |
Simulate a Node Using a Zero-Inflated Count Model
Description
Data from the parents is first used to simulate the regular count model, which may follow either a Poisson regression or a negative binomial regression, as implemented in node_poisson and node_negative_binomial respectively. Then, zeros are simulated using a logistic regression model as implemented in node_binomial. Whenever the binomial part returns a 0, the corresponding value of the count part is set to 0; all other values are left untouched. Supports random effects and random slopes (if possible) in both models. See examples.
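The two-step mechanism described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: it assumes a log link for the Poisson part and a logit link for the binomial part, and the variable names (eta_count, eta_zero, y_count, y_zero) are hypothetical.

```r
set.seed(42)
n <- 1000
A <- rnorm(n)
B <- rnorm(n)

# count part: Poisson regression (log link assumed)
eta_count <- -2 + 0.2 * A + 0.1 * B
y_count <- rpois(n, lambda = exp(eta_count))

# zero part: logistic regression; a 0 here forces the outcome to 0
eta_zero <- 1 + 1 * A + 2 * B
y_zero <- rbinom(n, size = 1, prob = plogis(eta_zero))

# wherever the binomial part is 0, the count is set to 0;
# all other counts are left untouched
y <- ifelse(y_zero == 0, 0, y_count)
```

Because the zero indicator only takes the values 0 and 1, the last step is equivalent to the product y_zero * y_count.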
Usage
node_zeroinfl(data, parents, parents_count,
              parents_zero, formula_count, formula_zero,
              betas_count, betas_zero,
              intercept_count, intercept_zero,
              family_count="poisson", theta,
              var_corr_count, var_corr_zero)
Arguments
data |
A |
parents |
A character vector specifying the names of the parents that this particular child node has. Note that this argument does not have to be specified if |
parents_count |
Same as |
parents_zero |
Same as |
formula_count |
An enhanced formula passed to the |
formula_zero |
An enhanced formula passed to the |
betas_count |
A numeric vector with length equal to |
betas_zero |
A numeric vector with length equal to |
intercept_count |
A single number specifying the intercept that should be used when generating the count model part of the node. |
intercept_zero |
A single number specifying the intercept that should be used when generating the zero-inflated part of the node. |
family_count |
Either |
theta |
A single number specifying the theta parameter ( |
var_corr_count |
If random effects or random slopes are included in |
var_corr_zero |
If random effects or random slopes are included in |
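As a rough guide to how the component arguments might fit together, the following base R sketch builds one linear predictor per model from the parents, betas and intercept, and then draws the outcome. The function sim_zeroinfl_sketch is hypothetical and is not part of simDAG. It assumes a plain data.frame as input, a log link for the count part, a logit link for the zero part, and that theta plays the role of the size argument of rnbinom; none of these details are confirmed by the text above.

```r
# Hypothetical helper, for illustration only (not the simDAG implementation)
sim_zeroinfl_sketch <- function(data, parents_count, betas_count,
                                intercept_count, parents_zero, betas_zero,
                                intercept_zero,
                                family_count = c("poisson", "negative_binomial"),
                                theta = NULL) {
  family_count <- match.arg(family_count)
  n <- nrow(data)

  # linear predictors: intercept + sum of beta * parent value
  x_count <- as.matrix(data[, parents_count, drop = FALSE])
  x_zero <- as.matrix(data[, parents_zero, drop = FALSE])
  eta_count <- intercept_count + drop(x_count %*% betas_count)
  eta_zero <- intercept_zero + drop(x_zero %*% betas_zero)

  # count part (log link assumed)
  if (family_count == "poisson") {
    y_count <- rpois(n, lambda = exp(eta_count))
  } else {
    # assumption: theta is used as the dispersion ("size") parameter
    stopifnot(!is.null(theta))
    y_count <- rnbinom(n, size = theta, mu = exp(eta_count))
  }

  # zero part (logit link assumed); 0 marks a forced zero
  y_zero <- rbinom(n, size = 1, prob = plogis(eta_zero))

  y_count * y_zero
}

set.seed(123)
df <- data.frame(A = rnorm(200), B = rnorm(200))
out <- sim_zeroinfl_sketch(df,
                           parents_count = c("A", "B"),
                           betas_count = c(0.2, 0.1),
                           intercept_count = -2,
                           parents_zero = c("A", "B"),
                           betas_zero = c(1, 2),
                           intercept_zero = 1,
                           family_count = "negative_binomial",
                           theta = 1)
```

The sketch returns one value per row of the input, which matches the shape of the return value documented for this node type.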
Details
It is important to note that the data for the two underlying models (the count model and the zero-inflation model) are simulated completely independently of each other. When random effects are used in either of the two models, each process may therefore use completely different random effect values.
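To make the consequence of this independence concrete, here is a minimal base R sketch with a random intercept per group in both models. It is an illustration only: the grouping variable, the standard deviations of 0.5 and the names u_count and u_zero are hypothetical, and the way simDAG draws random effects internally is not described here.

```r
set.seed(1)
n <- 500
n_groups <- 20
group <- sample(seq_len(n_groups), size = n, replace = TRUE)
A <- rnorm(n)

# one random intercept per group for each model, drawn separately,
# so the same group gets unrelated values in the two processes
u_count <- rnorm(n_groups, mean = 0, sd = 0.5)
u_zero <- rnorm(n_groups, mean = 0, sd = 0.5)

eta_count <- -1 + 0.2 * A + u_count[group]
eta_zero <- 1 + 0.5 * A + u_zero[group]

y_count <- rpois(n, lambda = exp(eta_count))
y_zero <- rbinom(n, size = 1, prob = plogis(eta_zero))
y <- y_count * y_zero
```

In this sketch a group with a large positive intercept in the count model can still have a strongly negative one in the zero model, which is the behaviour the paragraph above warns about.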
Value
Returns a numeric vector of length nrow(data).
Author(s)
Robin Denz
See Also
empty_dag, node, node_td, sim_from_dag, sim_discrete_time
Examples
library(simDAG)
set.seed(5425)
# zero-inflated poisson regression
dag <- empty_dag() +
  node(c("A", "B"), type="rnorm", mean=0, sd=1) +
  node("Y", type="zeroinfl",
       formula_count= ~ -2 + A*0.2 + B*0.1 + A:B*0.4,
       formula_zero= ~ 1 + A*1 + B*2,
       family_count="poisson",
       parents=c("A", "B"))
data <- sim_from_dag(dag, n_sim=100)
# above is functionally the same as:
dag <- empty_dag() +
  node(c("A", "B"), type="rnorm", mean=0, sd=1) +
  node("Y_count", type="poisson", formula= ~ -2 + A*0.2 + B*0.1 + A:B*0.4) +
  node("Y_zero", type="binomial", formula= ~ 1 + A*1 + B*2) +
  node("Y", type="identity", formula= ~ Y_zero * Y_count)
data <- sim_from_dag(dag, n_sim=100)
# similar to the first example (but without the A:B interaction),
# specifying each individual component instead of formulas
dag <- empty_dag() +
  node(c("A", "B", "C"), type="rnorm", mean=0, sd=1) +
  node("Y", type="zeroinfl",
       parents_count=c("A", "B"),
       betas_count=c(0.2, 0.1),
       intercept_count=-2,
       parents_zero=c("A", "B"),
       betas_zero=c(1, 2),
       intercept_zero=1,
       family_count="poisson",
       parents=c("A", "B"))
data <- sim_from_dag(dag, n_sim=100)
# zero-inflated negative-binomial regression
dag <- empty_dag() +
  node(c("A", "B"), type="rnorm", mean=0, sd=1) +
  node("Y", type="zeroinfl",
       formula_count= ~ -2 + A*0.2 + B*3 + A:B*0.4,
       formula_zero= ~ 3 + A*0.1 + B*0.3,
       family_count="negative_binomial", theta=1,
       parents=c("A", "B"))
data <- sim_from_dag(dag, n_sim=100)