FactorHet {FactorHet} | R Documentation |
Estimate heterogeneous effects in factorial and conjoint experiments
Description
Fit a model to estimate heterogeneous effects in factorial or conjoint
experiments using a "mixture of experts" (i.e. a finite mixture of
regularized regressions with covariates affecting group assignment). Effects
are regularized using an overlapping group LASSO. FactorHet_mbo finds
an optimal lambda via Bayesian optimization, whereas FactorHet requires
a lambda to be provided. FactorHet_mbo is typically used in practice.
Usage
FactorHet(
  formula,
  design,
  K,
  lambda,
  moderator = NULL,
  group = NULL,
  task = NULL,
  choice_order = NULL,
  weights = NULL,
  control = FactorHet_control(),
  initialize = FactorHet_init(),
  verbose = TRUE
)

FactorHet_mbo(
  formula,
  design,
  K,
  moderator = NULL,
  weights = NULL,
  group = NULL,
  task = NULL,
  choice_order = NULL,
  control = FactorHet_control(),
  initialize = FactorHet_init(),
  mbo_control = FactorHet_mbo_control()
)
Arguments
formula |
Formula specifying model. The syntax is |
design |
A data.frame containing the data to be analyzed. |
K |
An integer specifying the number of groups; |
lambda |
A positive numeric value denoting regularization strength; this
is scaled internally by the number of observations, see
|
moderator |
A formula of variables (moderators) that affect the prior
probability of group membership. This is ignored when |
group |
A formula of a single variable, e.g. |
task |
A formula of a single variable that indicates the task number
performed by each individual. This is not used when |
choice_order |
A formula of a single variable that indicates which profile is on the "left" or "right" in a conjoint experiment. |
weights |
A formula of a single variable that indicates the weights for
each observation (e.g., survey weights). If |
control |
An object from |
initialize |
An object from |
verbose |
A logical value that prints intermediate information about
model fitting. The default is |
mbo_control |
A list of control parameters for MBO; see
|
Details
Caution: Many settings in FactorHet_control can be modified
to allow for slight variations in how the model is estimated. Some of these
are faster but may introduce numerical differences across versions of R
and machines. The default settings aim to mitigate this. One of
the default settings (FactorHet_control(step_SQUAREM=NULL))
considerably increases the speed of convergence and the quality of the
optimum located, at the expense of sometimes introducing numerical
differences across machines. To address this, one can either disable SQUAREM
(do_SQUAREM=FALSE) or set it to use a fixed step-size (e.g.,
step_SQUAREM=-10). If SQUAREM produces a large step, a message to
this effect will be issued.
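The two workarounds described above can be written as control objects. This is a minimal sketch: the argument names do_SQUAREM and step_SQUAREM and the value -10 come from the text, while the object names (squarem_off, squarem_fixed, control_off, control_fixed) are illustrative. The package calls are guarded so the snippet also runs where FactorHet is not installed.

```r
# Argument lists for the two workarounds described in the text
squarem_off   <- list(do_SQUAREM = FALSE)   # disable SQUAREM entirely
squarem_fixed <- list(step_SQUAREM = -10)   # keep SQUAREM, but with a fixed step size

# Build the control objects only if FactorHet is available
if (requireNamespace("FactorHet", quietly = TRUE)) {
  control_off   <- do.call(FactorHet::FactorHet_control, squarem_off)
  control_fixed <- do.call(FactorHet::FactorHet_control, squarem_fixed)
  # Either object can then be supplied as `control = ...` to
  # FactorHet() or FactorHet_mbo()
}
```

Fixing these settings trades some speed for results that should be easier to reproduce across machines.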
Factorial vs. Conjoint Experiment: A factorial experiment, i.e., one
without a forced choice between profiles, can be modeled by omitting the
choice_order argument and ensuring that each group and task
combination corresponds to exactly one observation in the design.
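The "exactly one observation per group and task combination" condition can be checked before fitting. The helper below is a hypothetical base-R sketch, not a function from FactorHet; the column names id and task and the toy data are made up for illustration.

```r
# Hypothetical helper (not part of FactorHet): TRUE if every group/task
# combination that appears in `design` appears exactly once
is_factorial_layout <- function(design, group, task) {
  keys <- design[, c(group, task), drop = FALSE]
  !any(duplicated(keys))
}

# Toy data: two respondents, two tasks each, one row per task
toy <- data.frame(
  id   = c(1, 1, 2, 2),
  task = c(1, 2, 1, 2),
  y    = c(1, 0, 0, 1)
)
is_factorial_layout(toy, "id", "task")           # TRUE

# A forced-choice conjoint has two profiles (rows) per task, so the check fails
conjoint_toy <- rbind(toy, toy)
is_factorial_layout(conjoint_toy, "id", "task")  # FALSE
```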
Estimation: All models are estimated using an AECM algorithm
described in Goplerud et al. (2025). Calibration of the amount of
regularization (i.e., choosing \lambda) should be done using
FactorHet_mbo. This uses a small number (default 15) of attempts to
calibrate the amount of regularization by minimizing a user-specified
criterion (defaulting to the BIC), and then fits a final model using the
\lambda that is predicted to minimize the criterion.
Options for the model-based optimization (mbo) can be set using
FactorHet_mbo_control. Options for model estimation can be
set using FactorHet_control.
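To make the idea of choosing \lambda by minimizing a criterion concrete, the sketch below compares made-up fits using the standard BIC formula. It is a deliberate simplification: FactorHet_mbo searches with Bayesian optimization rather than a fixed set of candidates, and how the package counts degrees of freedom and observations is not specified here. The function bic, the object names, and all numbers are illustrative assumptions.

```r
# Standard BIC: -2 * log-likelihood + degrees of freedom * log(n)
bic <- function(loglik, df, n) -2 * loglik + df * log(n)

# Made-up fits at three candidate values of lambda (illustrative numbers only)
candidates <- data.frame(
  lambda = c(0.001, 0.01, 0.1),
  loglik = c(-980, -985, -1100),
  df     = c(40, 22, 8)
)
n_obs <- 2000
candidates$BIC <- bic(candidates$loglik, candidates$df, n_obs)

# The candidate with the smallest BIC would be preferred
best <- candidates[which.min(candidates$BIC), ]
best$lambda
```

Stronger regularization lowers the degrees of freedom but also the log-likelihood; the criterion balances the two.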
Ridge Regression: While more experimental, ridge regression can be
estimated either by setting lambda = 0 (in FactorHet) and then
setting prior_var_beta in FactorHet_control, or by
using FactorHet_mbo and setting mbo_type = "ridge".
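The two routes to ridge regression might be set up as follows. This is a sketch under stated assumptions: the value 10 for prior_var_beta is purely illustrative (the text recommends no specific value), and mbo_type is assumed to be an argument of FactorHet_mbo_control since it does not appear in the FactorHet_mbo usage. The package calls are guarded so the snippet runs without FactorHet installed.

```r
# Illustrative value only; the text does not recommend a specific prior variance
ridge_args <- list(prior_var_beta = 10)

if (requireNamespace("FactorHet", quietly = TRUE)) {
  # Route 1: fixed lambda = 0 plus a prior variance set in the control object.
  # Pass as FactorHet(..., lambda = 0, control = ridge_control)
  ridge_control <- do.call(FactorHet::FactorHet_control, ridge_args)

  # Route 2: let FactorHet_mbo calibrate; mbo_type is ASSUMED to be an
  # argument of FactorHet_mbo_control().
  # Pass as FactorHet_mbo(..., mbo_control = ridge_mbo)
  ridge_mbo <- FactorHet::FactorHet_mbo_control(mbo_type = "ridge")
}
```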
Moderators: Moderators can be provided via the moderator
argument. These are important when K > 1 for ensuring the stability
of the model. Repeated observations per individual can be specified by
group and/or task if relevant for a forced-choice conjoint.
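As a rough illustration of how moderators can drive the prior probability of group membership, the sketch below computes group probabilities from a multinomial-logit (softmax) link in base R. The softmax form is an assumption about the gating model; the function group_prior, the coefficient matrix phi, and the toy values are hypothetical and are not taken from the package internals.

```r
# Multinomial-logit ("softmax") group probabilities for K groups.
# X: n x p moderator design matrix (with intercept); phi: p x K coefficients
group_prior <- function(X, phi) {
  eta <- X %*% phi
  eta <- eta - apply(eta, 1, max)   # subtract row maxima for numerical stability
  expeta <- exp(eta)
  expeta / rowSums(expeta)
}

# Toy example: one binary moderator, K = 2, first group as baseline
X <- cbind(intercept = 1, moderator = c(0, 1, 1))
phi <- cbind(group1 = c(0, 0), group2 = c(-0.5, 1.5))
pi_k <- group_prior(X, phi)
rowSums(pi_k)   # each row sums to 1
```

In this toy setup, observations with the moderator equal to 1 are more likely to belong to the second group than those with it equal to 0.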
Value
Returns an object of class FactorHet. Typical use will involve
examining the patterns of estimated treatment effects.
cjoint_plot shows the raw (logistic) coefficients.
Marginal effects of treatments (e.g., average marginal effects) can be
computed using AME, ACE, or AMIE.
The impact of moderators on group membership can be examined using
margeff_moderators or posterior_by_moderators.
The returned object is a list containing the following elements:
- parameters:
Estimated model parameters. These are usually obtained via coef.FactorHet.
- K:
The number of groups.
- posterior:
Posterior group probability for each observation. This is a list of two data.frames: one with posterior probabilities ("posterior") and one ("posterior_predictive") implied solely by the moderators, i.e., \pi_{k}(X_i) from Goplerud et al. (2025).
- information_criterion:
Information on the BIC, degrees of freedom, log-likelihood, and number of iterations.
- internal_parameters:
A list of many internal parameters. This is used for debugging or by other post-estimation functions.
- vcov:
Named list containing the estimated variance-covariance matrix. This is usually extracted with vcov.
- lp_shortEM:
If "short EM" is applied (only applicable if FactorHet, not FactorHet_mbo, is used), it lists the log-posterior at the end of each short run.
- MBO:
If FactorHet_mbo is used, information about the model-based optimization (MBO) is stored here. visualize_MBO provides a quick graphical summary of the BIC at different \lambda.
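The listed elements can be pulled out of a fitted object with ordinary list indexing. The sketch below uses a mock list that only mimics the element names given above; it is not a real fit, and the column names and values inside the mock data.frames are placeholders. The helper summarize_fit is hypothetical.

```r
# Hypothetical helper: collect a few of the documented elements
summarize_fit <- function(fit) {
  list(
    K = fit$K,
    posterior = head(fit$posterior$posterior),
    prior_from_moderators = head(fit$posterior$posterior_predictive),
    information_criterion = fit$information_criterion
  )
}

# Mock object with the documented element names (NOT a real fit;
# column names and values are placeholders)
mock_fit <- list(
  K = 2,
  posterior = list(
    posterior = data.frame(group_1 = c(0.9, 0.2), group_2 = c(0.1, 0.8)),
    posterior_predictive = data.frame(group_1 = c(0.6, 0.4), group_2 = c(0.4, 0.6))
  ),
  information_criterion = data.frame(BIC = NA_real_)
)
fit_summary <- summarize_fit(mock_fit)
str(fit_summary)
```

With a real fit, the same helper would be called on the object returned by FactorHet or FactorHet_mbo; coefficients and the variance-covariance matrix are better obtained through coef and vcov, as noted above.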
Examples
# Use a small subset of the immigration data from Hainmueller and Hopkins
data(immigration)
set.seed(1)
# Fit with two groups and tune regularization via MBO
fit_MBO <- FactorHet_mbo(
  formula = Chosen_Immigrant ~ Country + Ed + Gender + Plans,
  design = immigration, group = ~ CaseID,
  task = ~ contest_no, choice_order = ~ choice_id,
  # Only do one guess after initialization for speed
  mbo_control = FactorHet_mbo_control(iters = 1),
  K = 2)
# Plot the raw coefficients
cjoint_plot(fit_MBO)
# Check how MBO fared at calibrating regularization
visualize_MBO(fit_MBO)
# Visualize posterior distribution of group membership
posterior_FactorHet(fit_MBO)
# Get AMEs
AME(fit_MBO)