fit_IVDML {IVDML} | R Documentation |
Fitting Double Machine Learning Models with Instrumental Variables and Potentially Heterogeneous Treatment Effect
Description
This function fits a Double Machine Learning (DML) model with Instrumental Variables (IV), with the goal of performing inference on potentially heterogeneous treatment effects. The model under study is Y = \beta(A)D + g(X) + \epsilon, where the error \epsilon is potentially correlated with the treatment D, but there is an IV Z satisfying \mathbb E[\epsilon|Z,X] = 0. The object of interest is the treatment effect \beta of the treatment D on the response Y. The treatment effect \beta is either constant or can depend on the univariate quantity A, which is typically a component of the covariates X.
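As a rough illustration of the model above, the following base R sketch simulates data with a constant effect \beta and a linear g(X), then compares a naive regression with a simple hand-rolled IV estimate. The variable names, the sample size, and the data-generating choices are all assumptions made for illustration; this is not how fit_IVDML() computes its estimates.

```r
set.seed(1)
n <- 5000
Z <- rnorm(n)                      # instrument
X <- rnorm(n)                      # observed covariate
H <- rnorm(n)                      # unobserved confounder
D <- Z + 0.5 * X + H + rnorm(n)    # treatment, driven by Z, X and H
beta <- 2                          # constant treatment effect (assumed value)
eps <- -H + rnorm(n)               # error, correlated with D through H
Y <- beta * D + 0.5 * X + eps      # here g(X) = 0.5 * X (assumed)

# Naive regression of Y on D and X ignores the confounding through H
beta_ols <- unname(coef(lm(Y ~ D + X))["D"])

# Partial X out of Y, D and Z, then form the IV ratio of cross-products
rY <- resid(lm(Y ~ X))
rD <- resid(lm(D ~ X))
rZ <- resid(lm(Z ~ X))
beta_iv <- sum(rY * rZ) / sum(rD * rZ)

c(ols = beta_ols, iv = beta_iv)
```

In this setup the naive estimate is pulled away from the true value 2 because H enters both D and the error, while the IV ratio uses only the variation in D that is induced by Z.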
Usage
fit_IVDML(
Y,
D,
Z,
X = NULL,
A = NULL,
ml_method,
ml_par = list(),
A_deterministic_X = TRUE,
K_dml = 5,
iv_method = c("linearIV", "mlIV"),
S_split = 1
)
Arguments
Y | Numeric vector. Response variable. |
D | Numeric vector. Treatment variable. |
Z | Matrix, vector, or data frame. Instrumental variables. |
X | Matrix, vector, or data frame. Additional covariates (default: NULL). |
A | Numeric vector. Variable with respect to which treatment effect heterogeneity is considered. Usually equal to a column of X and in this case it can also be specified later (default: NULL). |
ml_method | Character. Machine learning method to use. Options are "gam", "xgboost", and "randomForest". |
ml_par | List. Parameters for the machine learning method: |
A_deterministic_X | Logical. Whether |
K_dml | Integer. Number of cross-fitting folds (default: 5). |
iv_method | Character vector. Instrumental variables estimation method. Options: "linearIV", "mlIV", "mlIV_direct" (default: c("linearIV", "mlIV")). "linearIV" corresponds to using instruments linearly and "mlIV" corresponds to using machine learning instruments. "mlIV_direct" is a variant of "mlIV" that uses the same estimate of |
S_split | Integer. Number of sample splits for cross-fitting (default: 1). |
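The roles of K_dml and S_split can be sketched in base R. The helper crossfit_residuals() below is hypothetical and uses lm() as a stand-in learner. It only illustrates the general cross-fitting idea (fit on K - 1 folds, compute residuals on the held-out fold, repeat over several random splits) and should not be read as the package's internal code.

```r
# Hypothetical helper: out-of-fold residuals of y given x, with lm() as learner
crossfit_residuals <- function(y, x, K = 5) {
  n <- length(y)
  fold <- sample(rep(seq_len(K), length.out = n))  # random fold labels
  res <- numeric(n)
  dat <- data.frame(y = y, x = x)
  for (k in seq_len(K)) {
    test <- fold == k
    fit <- lm(y ~ x, data = dat[!test, ])          # fit on the other K - 1 folds
    res[test] <- dat$y[test] - predict(fit, newdata = dat[test, ])
  }
  res
}

set.seed(1)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

S <- 3  # number of repeated random splits, analogous in spirit to S_split
res_splits <- lapply(seq_len(S), function(s) crossfit_residuals(y, x, K = 5))

# Out-of-fold mean squared residual for each split
sapply(res_splits, function(r) mean(r^2))
```

Each observation's residual comes from a model that never saw that observation, and repeating the random split several times reduces the dependence of the result on one particular fold assignment.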
Value
An object of class IVDML, containing:
- results_splits: A list of S_split lists of cross-fitted residuals from the different sample splits.
- A: The argument A of the function.
- ml_method: The argument ml_method of the function.
- A_deterministic_X: The argument A_deterministic_X of the function.
- iv_method: The argument iv_method of the function.
The treatment effect estimates, standard errors, and confidence intervals can be calculated from the IVDML object using the functions coef.IVDML(), se(), standard_confint(), and robust_confint().
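A minimal sketch of these inference calls, reusing the simulated data from the Examples section. It assumes that se(), standard_confint() and robust_confint() accept the same iv_method, a, A, kernel_name and bandwidth arguments as the coef() call shown in the Examples; that is an assumption, so check the help page of each function before relying on it. The code is skipped when the IVDML package is not installed.

```r
has_ivdml <- requireNamespace("IVDML", quietly = TRUE)

if (has_ivdml) {
  library(IVDML)

  # Same simulated data as in the Examples section
  set.seed(1)
  Z <- rnorm(100)
  X <- Z + rnorm(100)
  H <- rnorm(100)
  D <- Z^2 + sin(X) + H + rnorm(100)
  A <- X
  Y <- tanh(A) * D + cos(X) - H + rnorm(100)

  fit <- fit_IVDML(Y = Y, D = D, Z = Z, X = X, A = A, ml_method = "gam")

  # Point estimate of beta(a) at a = 0 (call taken from the Examples)
  est <- coef(fit, iv_method = "mlIV", a = 0, A = A,
              kernel_name = "boxcar", bandwidth = 0.2)

  # ASSUMPTION: the functions below take the same arguments as coef() above
  std_err <- se(fit, iv_method = "mlIV", a = 0, A = A,
                kernel_name = "boxcar", bandwidth = 0.2)
  ci_std <- standard_confint(fit, iv_method = "mlIV", a = 0, A = A,
                             kernel_name = "boxcar", bandwidth = 0.2)
  ci_rob <- robust_confint(fit, iv_method = "mlIV", a = 0, A = A,
                           kernel_name = "boxcar", bandwidth = 0.2)

  print(est)
  print(std_err)
  print(ci_std)
  print(ci_rob)
}
```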
References
Cyrill Scheidegger, Zijian Guo and Peter Bühlmann. Inference for heterogeneous treatment effects with efficient instruments and machine learning. Preprint, arXiv:2503.03530, 2025.
See Also
Inference for a fitted IVDML object is done with the functions coef.IVDML(), se(), standard_confint() and robust_confint().
Examples
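set.seed(1)
# Simulate data: Z is the instrument, X the covariate, H an unobserved confounder
Z <- rnorm(100)
X <- Z + rnorm(100)
H <- rnorm(100)
# The treatment depends nonlinearly on Z and X, and on the confounder H
D <- Z^2 + sin(X) + H + rnorm(100)
# The heterogeneity variable A equals X; the true effect is beta(A) = tanh(A)
A <- X
Y <- tanh(A) * D + cos(X) - H + rnorm(100)
# Fit the model with "gam" as the machine learning method
fit <- fit_IVDML(Y = Y, D = D, Z = Z, X = X, A = A, ml_method = "gam")
# Estimate beta(a) at a = 0 with machine learning instruments and a boxcar kernel
coef(fit, iv_method = "mlIV", a = 0, A = A, kernel_name = "boxcar", bandwidth = 0.2)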
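The arguments a, kernel_name = "boxcar" and bandwidth in the last call localize the estimate around A = a. The base R sketch below gives a rough picture of that idea, assuming a boxcar kernel K(u) = 1/2 for |u| <= 1 and a plain kernel-weighted IV ratio. The helpers boxcar() and local_iv(), the simulated data, and the estimator itself are illustrative assumptions, not the package's implementation.

```r
# Boxcar (uniform) kernel: weight 1/2 on [-1, 1], zero elsewhere
boxcar <- function(u) 0.5 * (abs(u) <= 1)

# Hypothetical helper: kernel-weighted IV ratio around A = a with bandwidth h.
# The 1/h scaling cancels in the ratio and is kept only for readability.
local_iv <- function(rY, rD, rZ, A, a, h) {
  w <- boxcar((A - a) / h) / h
  sum(w * rY * rZ) / sum(w * rD * rZ)
}

set.seed(1)
n <- 20000
A <- runif(n, -2, 2)              # heterogeneity variable
Z <- rnorm(n)                     # instrument
H <- rnorm(n)                     # unobserved confounder
D <- Z + H + rnorm(n)             # treatment
Y <- tanh(A) * D - H + rnorm(n)   # true effect beta(a) = tanh(a); no g(X) here

# With no covariates to partial out, centred variables play the role of residuals
rY <- Y - mean(Y)
rD <- D - mean(D)
rZ <- Z - mean(Z)

est <- local_iv(rY, rD, rZ, A, a = 1, h = 0.2)
c(estimate = est, truth = tanh(1))
```

Only observations with A within one bandwidth of a contribute, so a smaller bandwidth reduces smoothing bias at the cost of using fewer observations.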