uwt {AuxSurvey}    R Documentation

Weighted or Unweighted Sample Mean

Description

This function estimates the sample mean of an outcome variable using either weighted or unweighted methods. It supports calculating the sample mean with finite population correction (FPC) when a population dataset is provided. The method can also compute confidence intervals (CIs) for the sample mean using the specified distribution family (Gaussian or Binomial).

Usage

uwt(
  svysmpl,
  svyVar,
  svypopu = NULL,
  subset = NULL,
  family = gaussian(),
  invlvls,
  weights = NULL
)

Arguments

svysmpl

A dataframe or tibble representing the sample data (samples). This should contain the outcome variable and any additional covariates.

svyVar

The outcome variable to estimate the sample mean for (e.g., Y1).

svypopu

A dataframe or tibble representing the population data (population). This is used to compute the finite population correction (FPC) when calculating the sample mean. Default is NULL.

subset

A character vector representing filtering conditions to select subsets of the sample and population. Default is NULL, in which case the analysis is performed on the entire dataset. If subsets are specified, estimates for both the whole data and the subsets will be calculated.

family

The distribution family of the outcome variable. Supported options are gaussian for continuous outcomes and binomial for binary outcomes. Default is gaussian().

invlvls

A numeric vector specifying the confidence levels of the confidence intervals (CIs), e.g., 0.95. If more than one value is provided, one CI is calculated for each level.

weights

A numeric vector of case weights. Its length should match the number of cases in svysmpl. These weights are used to calculate the weighted sample mean. Default is NULL, in which case the unweighted sample mean is calculated.
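As an illustration of the subset argument, the following base R sketch shows one way condition strings such as "Z1 == 1 & Z2 == 1" could be turned into row filters. How uwt evaluates these strings internally is not stated here, so the approach is an assumption; toy, conds, and subsets are hypothetical names used only for this sketch.

```r
# Toy data standing in for svysmpl (hypothetical values; column names follow the Examples).
toy <- data.frame(Y1 = c(1.2, 0.4, 2.5, 1.9),
                  Z1 = c(1, 0, 1, 1),
                  Z2 = c(1, 1, 0, 1))

# Filtering conditions written as character strings, as in the subset argument.
conds <- c("Z1 == 1 & Z2 == 1", "Z1 == 0")

# Assumption: each string is parsed and evaluated against the data frame's columns.
subsets <- lapply(conds, function(cond) {
  keep <- eval(parse(text = cond), envir = toy)
  toy[keep, , drop = FALSE]
})
names(subsets) <- conds

# Number of sample cases retained by each condition.
sapply(subsets, nrow)
```

Because the same conditions are applied to both the sample and the population, every column named in a condition would need to exist in both data frames.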

Value

A list, where each element contains the sample mean estimate and CIs for a subset or the entire data. The list includes:

- est: The sample mean estimate.
- se: The standard error of the sample mean estimate.
- tCI: The confidence intervals for the sample mean.
- sample_size: The sample size for the subset or entire dataset.
- population_size: The population size, if a population dataset is provided (applicable to finite population correction).

The list is returned for each subset specified.

Examples

## Load the package, then simulate data with nonlinear association (setting 3).
library(AuxSurvey)
data = simulate(N = 3000, discretize = 3, setting = 3, seed = 123)
population = data$population  # Population data (3000 cases)
samples = data$samples        # Sample data (600 cases)
ipw = 1 / samples$true_pi    # Compute inverse probability weights

## Estimate the weighted sample mean with IPW
IPW_sample_mean = uwt(svysmpl = samples, svyVar = "Y1", svypopu = population,
                      subset = c("Z1 == 1 & Z2 == 1"), family = gaussian(),
                      invlvls = c(0.95), weights = ipw)
IPW_sample_mean

## Estimate the unweighted sample mean
unweighted_sample_mean = uwt(svysmpl = samples, svyVar = "Y1", svypopu = population,
                             subset = NULL, family = gaussian(),
                             invlvls = c(0.95), weights = NULL)
unweighted_sample_mean
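
To make the quantities in the Value section concrete, here is a minimal base R sketch of a weighted mean with an optional finite population correction and normal-approximation CIs. It uses textbook formulas and is not necessarily what uwt computes internally: sketch_mean_ci is a hypothetical helper, the linearization standard error and the sqrt(1 - n/N) correction are assumptions, and the binomial family is not covered.

```r
# Hypothetical helper, not part of AuxSurvey: textbook weighted mean with an
# optional finite population correction (FPC) and normal-approximation CIs.
sketch_mean_ci <- function(y, w = NULL, N = NULL, levels = 0.95) {
  n <- length(y)
  if (is.null(w)) w <- rep(1, n)            # unweighted case: equal weights
  stopifnot(length(w) == n)
  est <- sum(w * y) / sum(w)                # weighted mean
  # Linearization standard error; reduces to sd(y) / sqrt(n) for equal weights.
  se <- sqrt(n / (n - 1) * sum((w * (y - est))^2)) / sum(w)
  if (!is.null(N)) se <- se * sqrt(1 - n / N)   # assumed FPC form
  z <- qnorm(1 - (1 - levels) / 2)          # one quantile per confidence level
  ci <- cbind(lower = est - z * se, upper = est + z * se)
  rownames(ci) <- paste0(100 * levels, "%")
  list(est = est, se = se, CI = ci, sample_size = n, population_size = N)
}

# Usage on simulated values (hypothetical data, unrelated to simulate()).
set.seed(1)
y <- rnorm(600)
w <- runif(600, min = 1, max = 5)
res <- sketch_mean_ci(y, w = w, N = 3000, levels = c(0.90, 0.95))
res$CI
```

Passing several values in levels yields one CI row per level, mirroring the behavior described for invlvls.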


[Package AuxSurvey version 1.0 Index]