generate_qualitative_data_soo {causalQual} | R Documentation |
Generate Qualitative Data (Selection-on-Observables)
Description
Generate a synthetic data set with qualitative outcomes under a selection-on-observables design. The data include a binary treatment indicator and a matrix of covariates. Depending on the value of assignment, the treatment is either unconditionally independent of potential outcomes or independent of them conditional on the covariates.
Usage
generate_qualitative_data_soo(n, assignment, outcome_type)
Arguments
n |
Sample size. |
assignment |
String controlling treatment assignment. Must be either "randomized" or "observational". |
outcome_type |
String controlling the outcome type. Must be either "multinomial" or "ordered". |
Details
Outcome type
Potential outcomes are generated differently according to outcome_type. If outcome_type == "multinomial", generate_qualitative_data_soo computes linear predictors for each class using the covariates:
\eta_{mi} (d) = \beta_{m1}^d X_{i1} + \beta_{m2}^d X_{i2} + \beta_{m3}^d X_{i3}, \quad d = 0, 1,
and then transforms \eta_{mi} (d) into valid probability distributions using the softmax function:
P(Y_i(d) = m | X_i) = \frac{\exp(\eta_{mi} (d))}{\sum_{m'} \exp(\eta_{m'i}(d))}, \quad d = 0, 1.
It then generates potential outcomes Y_i(1) and Y_i(0) by sampling from {1, 2, 3} using P(Y_i(d) = m | X_i), \, d = 0, 1.
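The multinomial step can be sketched in a few lines of R. This is an illustrative reconstruction, not the package's internal code: the coefficient matrix beta_d is hypothetical (the documentation does not report the values of \beta_{mj}^d), and only one treatment arm d is shown.

```r
set.seed(1)
n <- 5
X <- matrix(runif(n * 3), ncol = 3)      # three independent U(0,1) covariates

# Hypothetical coefficients for one arm d: one row per class m,
# one column per covariate. The package's actual values are not documented here.
beta_d <- matrix(c(1, 0, 0,
                   0, 1, 0,
                   0, 0, 1), nrow = 3, byrow = TRUE)

eta <- X %*% t(beta_d)                   # n x 3 matrix of linear predictors
probs <- exp(eta) / rowSums(exp(eta))    # row-wise softmax

# Draw one class per unit from {1, 2, 3} using that unit's probabilities.
y_d <- apply(probs, 1, function(p) sample(1:3, size = 1, prob = p))
```

Each row of probs sums to one, so it is a valid distribution over the three classes. Repeating the same steps with a second coefficient matrix would give the potential outcome for the other arm.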
If instead outcome_type == "ordered", generate_qualitative_data_soo first generates latent potential outcomes:
Y_i^* (d) = \tau d + X_{i1} + X_{i2} + X_{i3} + N (0, 1), \quad d = 0, 1,
with \tau = 2. It then constructs Y_i (d) by discretizing Y_i^* (d) using threshold parameters \zeta_1 = 2 and \zeta_2 = 4. Then, for m = 1, 2, 3 and with the conventions \zeta_0 = -\infty and \zeta_3 = +\infty,
P(Y_i(d) = m | X_i) = P(\zeta_{m-1} < Y_i^*(d) \leq \zeta_m | X_i) = \Phi (\zeta_m - \sum_j X_{ij} - \tau d) - \Phi (\zeta_{m-1} - \sum_j X_{ij} - \tau d), \quad d = 0, 1,
where \Phi is the standard normal distribution function. This closed form allows us to analytically compute the probabilities of shift.
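The ordered design can be sketched as follows, using \tau = 2 and thresholds 2 and 4 as stated above. This is an illustration under assumptions rather than the package's implementation: in particular, drawing separate noise terms for d = 0 and d = 1 is an assumption, and the helper class_probs is a hypothetical name.

```r
set.seed(1)
n <- 5
tau <- 2
zeta <- c(-Inf, 2, 4, Inf)               # zeta_0, zeta_1, zeta_2, zeta_3
X <- matrix(runif(n * 3), ncol = 3)
index <- rowSums(X)                      # X_i1 + X_i2 + X_i3

# Latent potential outcomes (independent noise per arm is an assumption).
y_star_0 <- index + rnorm(n)
y_star_1 <- tau + index + rnorm(n)

# Discretize into classes 1, 2, 3 using intervals (zeta_{m-1}, zeta_m].
y_0 <- cut(y_star_0, breaks = zeta, labels = FALSE)
y_1 <- cut(y_star_1, breaks = zeta, labels = FALSE)

# Conditional class probabilities from the normal CDF (hypothetical helper).
class_probs <- function(d) {
  mu <- index + tau * d
  upper <- outer(mu, zeta[-1], function(m, z) pnorm(z - m))
  lower <- outer(mu, zeta[-4], function(m, z) pnorm(z - m))
  upper - lower                          # n x 3 matrix, rows sum to 1
}
p_0 <- class_probs(0)
p_1 <- class_probs(1)
```

Because \tau is positive, treatment moves probability mass toward the highest class, so every entry in the third column of p_1 is at least as large as the matching entry of p_0.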
Treatment assignment
Treatment is always assigned as D_i \sim \text{Bernoulli}(\pi(X_i)). If assignment == "randomized", then the propensity score is specified as \pi(X_i) = P ( D_i = 1 | X_i) = 0.5.
If instead assignment == "observational", then \pi(X_i) = (X_{i1} + X_{i3}) / 2.
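A minimal sketch of the two assignment mechanisms, assuming covariates drawn from U(0,1) as described in this page. The variable names (pscore, D) are illustrative and need not match the package's internals.

```r
set.seed(1)
n <- 1000
X <- matrix(runif(n * 3), ncol = 3)
assignment <- "observational"            # or "randomized"

# Propensity score: constant 0.5 under randomization, otherwise
# the average of the first and third covariates.
pscore <- if (assignment == "randomized") {
  rep(0.5, n)
} else {
  (X[, 1] + X[, 3]) / 2
}

# Bernoulli draw per unit with unit-specific success probability.
D <- rbinom(n, size = 1, prob = pscore)
```

Since both covariates lie in [0, 1], the observational propensity score is also in [0, 1], although values arbitrarily close to 0 or 1 can occur.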
Other details
The function always generates three independent covariates from U(0,1). Observed outcomes Y_i are always constructed using the usual observational rule, Y_i = D_i Y_i(1) + (1 - D_i) Y_i(0).
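As a small worked example, the observed outcome keeps Y_i(1) for treated units and Y_i(0) for untreated units, assuming the standard rule Y_i = D_i Y_i(1) + (1 - D_i) Y_i(0) is the one meant here. The vectors below are made-up values for illustration only.

```r
# Made-up potential outcomes and treatment indicators for four units.
y_1 <- c(3, 2, 3, 1)
y_0 <- c(1, 2, 2, 1)
D   <- c(1, 0, 0, 1)

# Treated units reveal y_1, untreated units reveal y_0.
y <- D * y_1 + (1 - D) * y_0
y  # 3 2 2 1
```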
Value
A list storing a data frame with the observed data, the true propensity score, and the true probabilities of shift.
Author(s)
Riccardo Di Francesco
See Also
generate_qualitative_data_iv
generate_qualitative_data_rd
generate_qualitative_data_did
Examples
## Generate synthetic data.
library(causalQual)
set.seed(1986)

data <- generate_qualitative_data_soo(100,
                                      assignment = "observational",
                                      outcome_type = "ordered")

## Inspect the true probabilities of shift.
data$pshifts