genpeer {QuantilePeer} | R Documentation |
Estimating Peer Effects Models
Description
qpeer estimates the quantile peer effect models introduced by Houndetoungan (2025). In the linpeer function, quantile peer variables are replaced with the average peer variable, and they can be replaced with other peer variables in the genpeer function.
Usage
genpeer(
formula,
excluded.instruments,
endogenous.variables,
Glist,
data,
estimator = "IV",
structural = FALSE,
drop = NULL,
fixed.effects = FALSE,
HAC = "iid",
checkrank = FALSE,
compute.cov = TRUE,
tol = 1e-10
)
linpeer(
formula,
excluded.instruments,
Glist,
data,
estimator = "IV",
structural = FALSE,
drop = NULL,
fixed.effects = FALSE,
HAC = "iid",
checkrank = FALSE,
compute.cov = TRUE,
tol = 1e-10
)
qpeer(
formula,
excluded.instruments,
Glist,
tau,
type = 7,
data,
estimator = "IV",
structural = FALSE,
fixed.effects = FALSE,
HAC = "iid",
checkrank = FALSE,
drop = NULL,
compute.cov = TRUE,
tol = 1e-10
)
Arguments
formula |
An object of class formula: a symbolic description of the model. |
excluded.instruments |
An object of class formula to indicate excluded instruments. It should be specified as |
endogenous.variables |
An object of class formula that allows specifying endogenous variables. It is used to indicate the peer variables whose effects will be estimated. These can include average peer variables, quantile peer variables,
or a combination of multiple variables. It should be specified as |
Glist |
The adjacency matrix. For networks consisting of multiple subnets (e.g., schools), |
data |
An optional data frame, list, or environment (or an object that can be coerced by as.data.frame to a data frame) containing the variables
in the model. If not found in |
estimator |
A character string specifying the estimator to be used. The available options are:
|
structural |
A logical value indicating whether the reduced-form or structural specification should be estimated (see Details). |
drop |
A dummy vector of the same length as the sample, indicating whether an observation should be dropped. This can be used, for example, to remove false isolates or to estimate the model only on non-isolated agents. These observations cannot be directly removed from the network by the user because they may still be friends with other agents. |
fixed.effects |
A logical value or string specifying whether the model includes subnet fixed effects. The fixed effects may differ between isolated and non-isolated nodes. Accepted values are |
HAC |
A character string specifying the correlation structure among the idiosyncratic error terms for covariance computation. Options are |
checkrank |
A logical value indicating whether the instrument matrix should be checked for full rank. If the matrix is not of full rank, redundant (linearly dependent) columns will be removed to obtain a full-rank matrix. |
compute.cov |
A logical value indicating whether the covariance matrix of the estimator should be computed. |
tol |
A tolerance value used in the QR factorization to identify columns of explanatory variable and instrument matrices that ensure a full-rank matrix (see the qr function). |
tau |
A numeric vector specifying the quantile levels. |
type |
An integer between 1 and 9 selecting one of the nine quantile algorithms used to compute peer quantiles (see the quantile function). |
Details
Let \mathcal{N} be a set of n agents indexed by the integer i \in [1, n]. Agents are connected through a network that is characterized by an adjacency matrix \mathbf{G} = [g_{ij}] of dimension n \times n, where g_{ij} = 1 if agent j is a friend of agent i, and g_{ij} = 0 otherwise. In weighted networks, g_{ij} can be a nonnegative variable (not necessarily binary) that measures the intensity of the outgoing link from i to j. The model can also accommodate such networks. Note that the network generally consists of many independent subnets (e.g., schools). The Glist argument is the list of subnets. In the case of a single subnet, Glist will be a list containing one matrix.
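As a minimal sketch in base R, a network with two subnets can be stored as a list of adjacency matrices, where row i holds the outgoing links of agent i. The object names (G1, G2, Glist, Gsingle) and the subnet sizes are illustrative assumptions, not part of the package.

```r
set.seed(1)
# Two illustrative subnets of sizes 4 and 3 (binary, directed links)
G1 <- matrix(rbinom(16, 1, 0.5), 4, 4)
G2 <- matrix(rbinom(9, 1, 0.5), 3, 3)
diag(G1) <- 0  # no self-links
diag(G2) <- 0

Glist   <- list(G1, G2)  # several subnets: one matrix per subnet
Gsingle <- list(G1)      # a single subnet is still wrapped in a list

# Total number of agents across subnets
n <- sum(sapply(Glist, nrow))
```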
Let \mathcal{T} be a set of quantile levels. The reduced-form specification of quantile peer effect models is given by:
y_i = \sum_{\tau \in \mathcal{T}} \lambda_{\tau} q_{\tau,i}(\mathbf{y}_{-i}) + \mathbf{x}_i^{\prime}\beta + \varepsilon_i,
where \mathbf{y}_{-i} = (y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_n)^{\prime} is the vector of outcomes for other units, and q_{\tau,i}(\mathbf{y}_{-i}) is the sample \tau-quantile of peer outcomes. The term \varepsilon_i is an idiosyncratic error term, \lambda_{\tau} captures the effect of the \tau-quantile of peer outcomes on y_i, and \beta captures the effect of \mathbf{x}_i on y_i. For the definition of the sample \tau-quantile, see Hyndman and Fan (1996).
If the network matrix is weighted, the sample weighted quantile can be used, where the outcome for friend j of i is weighted by g_{ij}. It can be shown that the sample \tau-quantile is a weighted average of two peer outcomes. For more details, see the quantile and qpeer.instruments functions.
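The weighted-average property can be checked in base R for an unweighted network. The sketch below assumes the default type 7 definition of Hyndman and Fan (1996). The peer outcomes and the quantile level are made-up illustrative values, and only the base quantile function is used.

```r
# Outcomes of the friends of a hypothetical agent i
y_peers <- c(2.0, 5.0, 3.5, 9.0, 7.0)
tau <- 1/3

# Type-7 position of the tau-quantile among the sorted peer outcomes
ys <- sort(y_peers)
h  <- (length(ys) - 1) * tau + 1
lo <- floor(h)
hi <- ceiling(h)
w  <- h - lo  # weight placed on the upper order statistic

# Weighted average of two adjacent order statistics
q_manual <- (1 - w) * ys[lo] + w * ys[hi]

# Same quantity from base R
q_base <- unname(quantile(y_peers, probs = tau, type = 7))
all.equal(q_manual, q_base)
```

Other values of type change how the position h and the weight w are defined, but the quantile remains a combination of at most two peer outcomes.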
The quantile q_{\tau,i}(\mathbf{y}_{-i}) can be replaced with the average peer variable in linpeer or with other measures in genpeer through the endogenous.variables argument. In genpeer, it is possible to specify multiple peer variables, such as male peer averages and female peer averages. Additionally, both quantiles and averages can be included (genpeer is general and encompasses qpeer and linpeer). See examples.
One issue in linear peer effect models is that individual preferences with conformity and complementarity/substitution lead to the same reduced form. However, it is possible to disentangle both types of preferences using isolated individuals (individuals without friends). The structural specification of the model differs between isolated and nonisolated individuals. For isolated i, the specification is similar to a standard linear-in-means model without social interactions, given by:
y_i = \mathbf{x}_i^{\prime}\beta + \varepsilon_i.
If node i is non-isolated, the specification is given by:
y_i = \sum_{\tau \in \mathcal{T}} \lambda_{\tau} q_{\tau,i}(\mathbf{y}_{-i}) + (1 - \lambda_2)(\mathbf{x}_i^{\prime}\beta + \varepsilon_i),
where \lambda_2 determines whether preferences exhibit conformity or complementarity/substitution. In general, \lambda_2 > 0, which means that preferences are conformist (anti-conformity may be possible in some models when \lambda_2 < 0). In contrast, when \lambda_2 = 0, there is complementarity/substitution between individuals depending on the signs of the \lambda_{\tau} parameters. Because the equation for isolated individuals involves \beta alone, whereas the equation for non-isolated individuals involves only the product (1 - \lambda_2)\beta, \beta and \lambda_2 can be identified only if the network includes enough isolated individuals.
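Before estimating the structural specification, it can help to check how many isolated individuals the network contains. The base R sketch below assumes that row i of each adjacency matrix holds the outgoing links of agent i, so an agent is isolated when its row sums to zero. The small matrices and object names are illustrative.

```r
# Illustrative list of two subnets; a row of zeros is an isolated agent
G1 <- matrix(c(0, 1, 0,
               0, 0, 0,
               1, 1, 0), 3, 3, byrow = TRUE)
G2 <- matrix(c(0, 0,
               1, 0), 2, 2, byrow = TRUE)
Glist <- list(G1, G2)

# Flag agents without friends, stacked in the order of the subnets
isolated <- unlist(lapply(Glist, function(g) rowSums(g) == 0))

n_isolated     <- sum(isolated)   # number of isolated agents
share_isolated <- mean(isolated)  # share of isolated agents in the sample
```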
Value
A list containing:
model.info |
A list with information about the model, such as the number of subnets, number of observations, and other key details. |
gmm |
A list of GMM estimation results, including parameter estimates, the covariance matrix, and related statistics. |
data |
A list containing the outcome, outcome quantiles among peers, control variables, and excluded instruments used in the model. |
References
Houndetoungan, A. (2025). Quantile peer effect models. arXiv preprint arXiv:2506.12920, doi:10.48550/arXiv.2506.12920.
Hyndman, R. J., & Fan, Y. (1996). Sample quantiles in statistical packages. The American Statistician, 50(4), 361-365, doi:10.1080/00031305.1996.10473566.
See Also
Examples
set.seed(123)
ngr <- 50 # Number of subnets
nvec <- rep(30, ngr) # Size of subnets
n <- sum(nvec)
### Simulating Data
## Network matrix
G <- lapply(1:ngr, function(z) {
Gz <- matrix(rbinom(nvec[z]^2, 1, 0.3), nvec[z], nvec[z])
diag(Gz) <- 0
# Adding isolated nodes (important for the structural model)
niso <- sample(0:nvec[z], 1, prob = (nvec[z] + 1):1 / sum((nvec[z] + 1):1))
if (niso > 0) {
Gz[sample(1:nvec[z], niso), ] <- 0
}
Gz
})
tau <- seq(0, 1, 1/3)
X <- cbind(rnorm(n), rpois(n, 2))
l <- c(0.2, 0.15, 0.1, 0.2)
b <- c(2, -0.5, 1)
eps <- rnorm(n, 0, 0.4)
## Generating `y`
y <- qpeer.sim(formula = ~ X, Glist = G, tau = tau, lambda = l,
beta = b, epsilon = eps)$y
### Estimation
## Computing instruments
Z <- qpeer.inst(formula = ~ X, Glist = G, tau = seq(0, 1, 0.1),
max.distance = 2, checkrank = TRUE)
Z <- Z$instruments
## Reduced-form model
rest <- qpeer(formula = y ~ X, excluded.instruments = ~ Z, Glist = G, tau = tau)
summary(rest)
summary(rest, diagnostic = TRUE) # Summary with diagnostics
## Structural model
sest <- qpeer(formula = y ~ X, excluded.instruments = ~ Z, Glist = G, tau = tau,
structural = TRUE)
summary(sest, diagnostic = TRUE)
# The lambda^* parameter is y_q (conformity) in the outputs.
# There is no conformity in the data, so the estimate will be approximately 0.
## Structural model with double fixed effects per subnet using optimal GMM
## and controlling for heteroskedasticity
sesto <- qpeer(formula = y ~ X, excluded.instruments = ~ Z, Glist = G, tau = tau,
structural = TRUE, fixed.effects = "separate", HAC = "hetero",
estimator = "gmm.optimal")
summary(sesto, diagnostic = TRUE)
## Average peer effect model
# Row-normalized network to compute instruments
Gnorm <- lapply(G, function(g) {
d <- rowSums(g)
d[d == 0] <- 1
g / d
})
# GX and GGX
Gall <- Matrix::bdiag(Gnorm)
GX <- as.matrix(Gall %*% X)
GGX <- as.matrix(Gall %*% GX)
# Standard linear model
lpeer <- linpeer(formula = y ~ X + GX, excluded.instruments = ~ GGX, Glist = Gnorm)
summary(lpeer, diagnostic = TRUE)
# Note: The normalized network is used here by definition of the model.
# Contextual effects are also included (this is also possible for the quantile model).
# The standard model can also be structural
lpeers <- linpeer(formula = y ~ X + GX, excluded.instruments = ~ GGX, Glist = Gnorm,
structural = TRUE, fixed.effects = "separate")
summary(lpeers, diagnostic = TRUE)
## Estimation using `genpeer`
# Average peer variable computed manually and included as an endogenous variable
Gy <- as.vector(Gall %*% y)
gpeer1 <- genpeer(formula = y ~ X + GX, excluded.instruments = ~ GGX,
endogenous.variables = ~ Gy, Glist = Gnorm, structural = TRUE,
fixed.effects = "separate")
summary(gpeer1, diagnostic = TRUE)
# Using both average peer variables and quantile peer variables as endogenous,
# or only the quantile peer variable
# Quantile peer `y`
qy <- qpeer.inst(formula = y ~ 1, Glist = G, tau = tau)
qy <- qy$qy
# Model estimation
gpeer2 <- genpeer(formula = y ~ X + GX, excluded.instruments = ~ GGX + Z,
endogenous.variables = ~ Gy + qy, Glist = Gnorm, structural = TRUE,
fixed.effects = "separate")
summary(gpeer2, diagnostic = TRUE)