other {PopulateR} | R Documentation |
Match people into new households
Description
Creates a data frame of households, each containing the specified number of inhabitants. One data frame, containing the people to match, is required. Matching uses an age distribution, which ensures that the households have an age structure. Entering a larger standard deviation produces a less correlated age structure. The output data frame of matches contains only households of the required size. If the number of rows in the people data frame is not divisible by the household size, the leftover people are output to a separate data frame.
Usage
other(
people,
pplid,
pplage,
numppl = NULL,
sdused,
HHStartNum,
HHNumVar,
userseed = NULL,
ptostop = NULL,
numiters = 1e+06,
verbose = FALSE
)
Arguments
people |
A data frame containing the people to be matched into households. |
pplid |
The variable containing the unique ID for each person. |
pplage |
The age variable. |
numppl |
The number of people in the households. |
sdused |
The standard deviation of the normal distribution used for the ages in a household. |
HHStartNum |
The starting value for HHNumVar. Must be numeric. |
HHNumVar |
The name for the household variable. |
userseed |
If specified, this will set the seed to the number provided. If not, the normal set.seed() function will be used. |
ptostop |
The critical p-value used as the stopping rule for the function. If this value is not set, a critical p-value of 0.01 is used. |
numiters |
The maximum number of iterations used to construct the output data frame ($Matched) containing the household inhabitants. The default value is 1000000 (1e+06); it acts as the stopping rule if the algorithm does not converge. |
verbose |
Whether the number of iterations used, the critical chi-squared value, and the final chi-squared value are printed to the console. The information will be printed for each set of pairs. For example, if there are three people in each household, the information will be printed twice. The default is FALSE, so no information will be printed to the console. |
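The effect of sdused can be illustrated in base R. The sketch below is not the matching algorithm used by this function; it only assumes that age differences are drawn from a normal distribution centred on zero, and shows that a larger standard deviation yields a wider spread of ages. All variable names are hypothetical.

```r
# Illustration only: NOT the PopulateR matching algorithm.
# Assumption: age differences within a household follow a zero-centred normal.
set.seed(1)                                # reproducible draws
n <- 10000
narrow <- rnorm(n, mean = 0, sd = 3)       # differences with a small sd
wide   <- rnorm(n, mean = 0, sd = 10)      # differences with a large sd

spread_narrow <- sd(narrow)                # empirical spread, small sd
spread_wide   <- sd(wide)                  # empirical spread, large sd

# Share of simulated differences that fall within five years of zero
within5_narrow <- mean(abs(narrow) <= 5)
within5_wide   <- mean(abs(wide) <= 5)
```

With the smaller standard deviation, most simulated differences stay within a few years, which corresponds to a more tightly correlated age structure.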
Value
A list of two data frames. $Matched contains the households of matched people; all households will be of the specified size. $Unmatched, if populated, contains the people who were not allocated to households. If the number of rows in the people data frame is divisible by the required household size, $Unmatched will be an empty data frame.
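The expected sizes of the two data frames can be worked out with integer arithmetic. The sketch below assumes a people data frame of 2213 rows (the count quoted in the example comment on this page) and assumes that every complete household is filled, so that only the remainder is left unmatched. The variable names are hypothetical.

```r
# Assumed inputs: 2213 people, three people per household
n_people <- 2213L
numppl   <- 3L

n_households <- n_people %/% numppl      # complete households: 737
n_matched    <- n_households * numppl    # rows expected in $Matched: 2211
n_unmatched  <- n_people %% numppl       # rows expected in $Unmatched: 2
```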
Examples
library(dplyr)
# creating three-person households, toy example with few iterations
NewHouseholds <- other(AdultsNoID, pplid = "ID", pplage = "Age", numppl = 3, sdused = 3,
                       HHStartNum = 1, HHNumVar = "Household", userseed = 4, ptostop = .05,
                       numiters = 500, verbose = TRUE)
PeopleInHouseholds <- NewHouseholds$Matched
PeopleNot <- NewHouseholds$Unmatched # 2213 not divisible by 3
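A quick sanity check on the matched output might look like the following. To keep the sketch self-contained, a small hand-made data frame stands in for NewHouseholds$Matched; its column names mirror the example above, but its values are invented for illustration.

```r
# Hypothetical stand-in for NewHouseholds$Matched (values are made up)
matched <- data.frame(
  ID        = 1:6,
  Age       = c(34, 36, 31, 58, 61, 55),
  Household = rep(1:2, each = 3)
)

# Count the people in each household and confirm every household has three
sizes <- table(matched$Household)
all_correct_size <- all(sizes == 3)

# Age range within each household, a rough view of the age structure
age_range <- tapply(matched$Age, matched$Household,
                    function(a) diff(range(a)))
```

Replacing matched with PeopleInHouseholds would apply the same checks to the real output, assuming the household variable is named "Household" as in the call above.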