genpeer {QuantilePeer}    R Documentation

Estimating Peer Effects Models

Description

qpeer estimates the quantile peer effect models introduced by Houndetoungan (2025). linpeer replaces the quantile peer variables with the average peer variable, and genpeer allows them to be replaced with other peer variables supplied by the user.

Usage

genpeer(
  formula,
  excluded.instruments,
  endogenous.variables,
  Glist,
  data,
  estimator = "IV",
  structural = FALSE,
  drop = NULL,
  fixed.effects = FALSE,
  HAC = "iid",
  checkrank = FALSE,
  compute.cov = TRUE,
  tol = 1e-10
)

linpeer(
  formula,
  excluded.instruments,
  Glist,
  data,
  estimator = "IV",
  structural = FALSE,
  drop = NULL,
  fixed.effects = FALSE,
  HAC = "iid",
  checkrank = FALSE,
  compute.cov = TRUE,
  tol = 1e-10
)

qpeer(
  formula,
  excluded.instruments,
  Glist,
  tau,
  type = 7,
  data,
  estimator = "IV",
  structural = FALSE,
  fixed.effects = FALSE,
  HAC = "iid",
  checkrank = FALSE,
  drop = NULL,
  compute.cov = TRUE,
  tol = 1e-10
)

Arguments

formula

An object of class formula: a symbolic description of the model. formula should be specified as y ~ x1 + x2, where y is the outcome and x1 and x2 are control variables, which can include contextual variables such as averages or quantiles among peers.

excluded.instruments

An object of class formula to indicate excluded instruments. It should be specified as ~ z1 + z2, where z1 and z2 are excluded instruments for the quantile peer outcomes.

endogenous.variables

An object of class formula that allows specifying endogenous variables. It is used to indicate the peer variables whose effects will be estimated. These can include average peer variables, quantile peer variables, or a combination of multiple variables. It should be specified as ~ y1 + y2, where y1 and y2 are the endogenous peer variables.

Glist

The adjacency matrix. For networks consisting of multiple subnets (e.g., schools), Glist must be a list of subnets, with the m-th element being an n_m \times n_m adjacency matrix, where n_m is the number of nodes in the m-th subnet.

data

An optional data frame, list, or environment (or an object that can be coerced by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which the estimation function is called.

estimator

A character string specifying the estimator to be used. The available options are: "IV" for the standard instrumental variable estimator, "gmm.identity" for the GMM estimator with the identity matrix as the weight, "gmm.optimal" for the GMM estimator with the optimal weight matrix, "JIVE" for the Jackknife instrumental variable estimator, and "JIVE2" for the Type 2 Jackknife instrumental variable estimator.

structural

A logical value indicating which specification should be estimated: TRUE for the structural specification and FALSE (the default) for the reduced-form specification (see Details).

drop

A dummy vector of the same length as the sample, indicating whether an observation should be dropped. This can be used, for example, to remove false isolates or to estimate the model only on non-isolated agents. These observations cannot be directly removed from the network by the user because they may still be friends with other agents.

fixed.effects

A logical value or string specifying whether the model includes subnet fixed effects. The fixed effects may differ between isolated and non-isolated nodes. Accepted values are "no" or FALSE (indicating no fixed effects), "join" or TRUE (indicating the same fixed effects for isolated and non-isolated nodes within each subnet), and "separate" (indicating different fixed effects for isolated and non-isolated nodes within each subnet). Note that "join" fixed effects are not applicable to structural models; in that case, "join" and TRUE are automatically converted to "separate".

HAC

A character string specifying the correlation structure among the idiosyncratic error terms for covariance computation. Options are "iid" for independent errors, "hetero" for heteroskedastic non-autocorrelated errors, and "cluster" for heteroskedastic errors with potential within-subnet correlation.

checkrank

A logical value indicating whether the instrument matrix should be checked for full rank. If the matrix is not of full rank, redundant columns will be removed to obtain a full-rank matrix.

compute.cov

A logical value indicating whether the covariance matrix of the estimator should be computed.

tol

A tolerance value used in the QR factorization to identify columns of explanatory variable and instrument matrices that ensure a full-rank matrix (see the qr function).

tau

A numeric vector specifying the quantile levels.

type

An integer between 1 and 9 selecting one of the nine quantile algorithms used to compute peer quantiles (see the quantile function).

Details

Let \mathcal{N} be a set of n agents indexed by the integer i \in [1, n]. Agents are connected through a network that is characterized by an adjacency matrix \mathbf{G} = [g_{ij}] of dimension n \times n, where g_{ij} = 1 if agent j is a friend of agent i, and g_{ij} = 0 otherwise. In weighted networks, g_{ij} can be a nonnegative variable (not necessarily binary) that measures the intensity of the outgoing link from i to j. The model can also accommodate such networks. Note that the network generally consists of many independent subnets (e.g., schools). The Glist argument is the list of subnets. In the case of a single subnet, Glist will be a list containing one matrix.
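The layout of Glist described above can be illustrated with a minimal base-R sketch. The object names (sizes, Glist_toy, G_single) and the toy subnet sizes are hypothetical and chosen only for illustration; no function of the package is called.

```r
# Hypothetical toy data: two subnets (e.g., two schools) with 3 and 4 agents.
set.seed(1)
sizes <- c(3, 4)

# One adjacency matrix per subnet: g_ij = 1 if agent j is a friend of agent i.
Glist_toy <- lapply(sizes, function(nm) {
  Gm <- matrix(rbinom(nm^2, 1, 0.5), nm, nm)
  diag(Gm) <- 0  # no self-links
  Gm
})

# A single subnet is still wrapped in a list containing one matrix.
G_single <- list(Glist_toy[[1]])

# Each element is a square n_m x n_m matrix; the total number of nodes is sum(n_m).
sapply(Glist_toy, dim)
sum(sapply(Glist_toy, nrow))
```

In the Examples, the sample size n is likewise the sum of the subnet sizes, and the variables are stacked subnet by subnet.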

Let \mathcal{T} be a set of quantile levels. The reduced-form specification of quantile peer effect models is given by:

y_i = \sum_{\tau \in \mathcal{T}} \lambda_{\tau} q_{\tau,i}(\mathbf{y}_{-i}) + \mathbf{x}_i^{\prime}\beta + \varepsilon_i,

where \mathbf{y}_{-i} = (y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_n)^{\prime} is the vector of outcomes for other units, and q_{\tau,i}(\mathbf{y}_{-i}) is the sample \tau-quantile of peer outcomes. The term \varepsilon_i is an idiosyncratic error term, \lambda_{\tau} captures the effect of the \tau-quantile of peer outcomes on y_i, and \beta captures the effect of \mathbf{x}_i on y_i. For the definition of the sample \tau-quantile, see Hyndman and Fan (1996). If the network matrix is weighted, the sample weighted quantile can be used, where the outcome for friend j of i is weighted by g_{ij}. It can be shown that the sample \tau-quantile is a weighted average of two peer outcomes. For more details, see the quantile and qpeer.instruments functions.
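To make the peer quantile q_{\tau,i}(\mathbf{y}_{-i}) concrete, the following sketch computes it for an unweighted toy network using only the base quantile function. The network, the outcomes, and the object name peer_quantiles are hypothetical, and assigning 0 to agents without friends is an assumption of this sketch rather than a documented behavior of the package.

```r
# Hypothetical toy subnet with 5 agents; row i marks the friends of agent i.
G <- matrix(0, 5, 5)
G[1, c(2, 3, 4)]    <- 1
G[2, c(1, 5)]       <- 1
G[3, c(1, 2, 4, 5)] <- 1
# Agents 4 and 5 have no friends (isolated).

y   <- c(1, 2, 4, 8, 16)  # toy outcomes
tau <- c(0, 0.5, 1)       # quantile levels

# Sample tau-quantiles of friends' outcomes (type 7), one row per agent.
# Assumption of this sketch: quantiles are set to 0 for isolated agents.
peer_quantiles <- t(sapply(seq_len(nrow(G)), function(i) {
  friends <- which(G[i, ] > 0)
  if (length(friends) == 0) return(rep(0, length(tau)))
  quantile(y[friends], probs = tau, type = 7, names = FALSE)
}))
colnames(peer_quantiles) <- paste0("q_", tau)
peer_quantiles
```

Agent 2 has two friends with outcomes 1 and 16, so its 0.5-quantile under type 7 is 8.5 = 0.5 * 1 + 0.5 * 16, which illustrates the weighted-average property mentioned above.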

The quantile q_{\tau,i}(\mathbf{y}_{-i}) can be replaced with the average peer variable in linpeer or with other measures in genpeer through the endogenous.variables argument. In genpeer, it is possible to specify multiple peer variables, such as male peer averages and female peer averages. Additionally, both quantiles and averages can be included (genpeer is general and encompasses qpeer and linpeer). See examples.

One issue in linear peer effect models is that individual preferences with conformity and complementarity/substitution lead to the same reduced form. However, it is possible to disentangle both types of preferences using isolated individuals (individuals without friends). The structural specification of the model differs between isolated and non-isolated individuals. For isolated i, the specification is similar to a standard linear-in-means model without social interactions, given by:

y_i = \mathbf{x}_i^{\prime}\beta + \varepsilon_i.

If node i is non-isolated, the specification is given by:

y_i = \sum_{\tau \in \mathcal{T}} \lambda_{\tau} q_{\tau,i}(\mathbf{y}_{-i}) + (1 - \lambda_2)(\mathbf{x}_i^{\prime}\beta + \varepsilon_i),

where \lambda_2 determines whether preferences exhibit conformity or complementarity/substitution. In general, \lambda_2 > 0, which means that preferences are conformist (anti-conformity may be possible in some models when \lambda_2 < 0). In contrast, when \lambda_2 = 0, there is complementarity/substitution between individuals, depending on the signs of the \lambda_{\tau} parameters. Because \beta enters the non-isolated specification only through the product (1 - \lambda_2)\beta, \beta and \lambda_2 can be separately identified only if the network includes enough isolated individuals.
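Since the structural specification relies on isolated individuals, it may help to check how many the network contains before estimation. The sketch below does this in base R for a hypothetical list of subnets; the names Glist_toy and isolated are illustrative and are not part of the package.

```r
set.seed(2)
# Hypothetical list of two subnets with 6 and 8 agents.
Glist_toy <- lapply(c(6, 8), function(nm) {
  Gm <- matrix(rbinom(nm^2, 1, 0.4), nm, nm)
  diag(Gm) <- 0
  Gm[sample(nm, 2), ] <- 0  # force two agents without friends in each subnet
  Gm
})

# An agent is isolated when its row of the adjacency matrix sums to zero.
isolated <- unlist(lapply(Glist_toy, function(Gm) rowSums(Gm) == 0))

table(isolated)  # counts of isolated and non-isolated agents
mean(isolated)   # share of isolated agents in the sample
```

An indicator of this kind could also serve as a starting point for the drop argument, for instance when the model is to be estimated on non-isolated agents only.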

Value

A list containing:

model.info

A list with information about the model, such as the number of subnets, number of observations, and other key details.

gmm

A list of GMM estimation results, including parameter estimates, the covariance matrix, and related statistics.

data

A list containing the outcome, outcome quantiles among peers, control variables, and excluded instruments used in the model.

References

Houndetoungan, A. (2025). Quantile peer effect models. arXiv preprint arXiv:2506.12920, doi:10.48550/arXiv.2506.12920.

Hyndman, R. J., & Fan, Y. (1996). Sample quantiles in statistical packages. The American Statistician, 50(4), 361-365, doi:10.1080/00031305.1996.10473566.

See Also

qpeer.sim, qpeer.instruments

Examples


set.seed(123)
ngr  <- 50  # Number of subnets
nvec <- rep(30, ngr)  # Size of subnets
n    <- sum(nvec)

### Simulating Data
## Network matrix
G <- lapply(1:ngr, function(z) {
  Gz <- matrix(rbinom(nvec[z]^2, 1, 0.3), nvec[z], nvec[z])
  diag(Gz) <- 0
  # Adding isolated nodes (important for the structural model)
  niso <- sample(0:nvec[z], 1, prob = (nvec[z] + 1):1 / sum((nvec[z] + 1):1))
  if (niso > 0) {
    Gz[sample(1:nvec[z], niso), ] <- 0
  }
  Gz
})

tau <- seq(0, 1, 1/3)
X   <- cbind(rnorm(n), rpois(n, 2))
l   <- c(0.2, 0.15, 0.1, 0.2)
b   <- c(2, -0.5, 1)
eps <- rnorm(n, 0, 0.4)

## Generating `y`
y <- qpeer.sim(formula = ~ X, Glist = G, tau = tau, lambda = l, 
               beta = b, epsilon = eps)$y

### Estimation
## Computing instruments
Z <- qpeer.inst(formula = ~ X, Glist = G, tau = seq(0, 1, 0.1), 
                max.distance = 2, checkrank = TRUE)
Z <- Z$instruments

## Reduced-form model 
rest <- qpeer(formula = y ~ X, excluded.instruments = ~ Z, Glist = G, tau = tau)
summary(rest)
summary(rest, diagnostic = TRUE)  # Summary with diagnostics

## Structural model
sest <- qpeer(formula = y ~ X, excluded.instruments = ~ Z, Glist = G, tau = tau,
              structural = TRUE)
summary(sest, diagnostic = TRUE)
# The conformity parameter (lambda_2 in Details) is reported as y_q (conformity) in the outputs.
# There is no conformity in the simulated data, so the estimate will be approximately 0.

## Structural model with double fixed effects per subnet using optimal GMM 
## and controlling for heteroskedasticity
sesto <- qpeer(formula = y ~ X, excluded.instruments = ~ Z, Glist = G, tau = tau,
               structural = TRUE, fixed.effects = "separate", HAC = "hetero", 
               estimator = "gmm.optimal")
summary(sesto, diagnostic = TRUE)

## Average peer effect model
# Row-normalized network to compute instruments
Gnorm <- lapply(G, function(g) {
  d <- rowSums(g)
  d[d == 0] <- 1
  g / d
})

# GX and GGX
Gall <- Matrix::bdiag(Gnorm)
GX   <- as.matrix(Gall %*% X)
GGX  <- as.matrix(Gall %*% GX)

# Standard linear model
lpeer <- linpeer(formula = y ~ X + GX, excluded.instruments = ~ GGX, Glist = Gnorm)
summary(lpeer, diagnostic = TRUE)
# Note: The normalized network is used here by definition of the model.
# Contextual effects are also included (this is also possible for the quantile model).

# The standard model can also be structural
lpeers <- linpeer(formula = y ~ X + GX, excluded.instruments = ~ GGX, Glist = Gnorm,
                  structural = TRUE, fixed.effects = "separate")
summary(lpeers, diagnostic = TRUE)

## Estimation using `genpeer`
# Average peer variable computed manually and included as an endogenous variable
Gy     <- as.vector(Gall %*% y)
gpeer1 <- genpeer(formula = y ~ X + GX, excluded.instruments = ~ GGX, 
                  endogenous.variables = ~ Gy, Glist = Gnorm, structural = TRUE, 
                  fixed.effects = "separate")
summary(gpeer1, diagnostic = TRUE)

# Using both average peer variables and quantile peer variables as endogenous,
# or only the quantile peer variable
# Quantile peer `y`
qy <- qpeer.inst(formula = y ~ 1, Glist = G, tau = tau)
qy <- qy$qy

# Model estimation
gpeer2 <- genpeer(formula = y ~ X + GX, excluded.instruments = ~ GGX + Z, 
                  endogenous.variables = ~ Gy + qy, Glist = Gnorm, structural = TRUE, 
                  fixed.effects = "separate")
summary(gpeer2, diagnostic = TRUE)

[Package QuantilePeer version 0.0.1 Index]