dist_sum {sumvar} | R Documentation |
Explore a continuous variable.
Description
Summarises a continuous variable by its median, interquartile range, mean, standard deviation and confidence intervals of the mean, and produces a density plot. The summary can be stratified by a grouping variable.
Provides frequentist hypothesis tests for comparison between groups: t-test and Wilcoxon rank-sum test for two groups, ANOVA and Kruskal-Wallis test for three or more groups.
The function accepts input from a dplyr pipe (%>%) and returns the results as a tibble.
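As a brief illustration of the piped form, the sketch below passes a data frame to dist_sum() with %>%. It assumes the dplyr and sumvar packages are installed, so the call is guarded; the data frame and its column names (age, sex) are made up for the example.

```r
# Illustrative data; the column names are assumptions for this sketch.
set.seed(1)
example_data <- data.frame(
  id  = 1:100,
  age = rnorm(100, mean = 30, sd = 10),
  sex = sample(c("male", "female"), size = 100, replace = TRUE)
)

# Run the piped call only if both packages are available.
if (requireNamespace("dplyr", quietly = TRUE) &&
    requireNamespace("sumvar", quietly = TRUE)) {
  library(dplyr)
  # The pipe supplies example_data as the first argument (data).
  example_data %>% sumvar::dist_sum(age, sex)
}
```

The piped call is equivalent to dist_sum(example_data, age, sex).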
Usage
dist_sum(data, var, by = NULL)
Arguments
data | The data frame or tibble. |
var | The variable you would like to summarise. |
by | The grouping variable (defaults to NULL). |
Value
A tibble summarising the variable: number of observations (n), number of missing observations (n_miss), median, interquartile range, mean, standard deviation (SD), 95% confidence intervals of the mean (using the Z distribution), and density plots.
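To make the Z-based confidence interval concrete, here is a minimal base R sketch of how such summary statistics can be computed by hand. It is not taken from the sumvar source; the object names are illustrative, and the package's internal choices (for example the quantile type used for the interquartile range) may differ.

```r
# Simulated data for illustration only.
set.seed(42)
age <- rnorm(100, mean = 30, sd = 10)

n      <- sum(!is.na(age))                    # non-missing observations
n_miss <- sum(is.na(age))                     # missing observations
med    <- median(age, na.rm = TRUE)
iqr    <- quantile(age, c(0.25, 0.75), na.rm = TRUE)
m      <- mean(age, na.rm = TRUE)
s      <- sd(age, na.rm = TRUE)

# 95% confidence interval of the mean using the Z (standard normal) distribution:
# mean +/- z * SD / sqrt(n), where z is the 97.5th percentile (about 1.96).
z  <- qnorm(0.975)
se <- s / sqrt(n)
ci <- c(lower = m - z * se, upper = m + z * se)
```

With small samples a t-based interval (qt(0.975, df = n - 1) in place of z) is wider; the Z interval described here is a large-sample approximation.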
Shows the t-test (p_ttest) and Wilcoxon rank-sum test (p_wilcox) when there are two groups, and the ANOVA (p_anova) and Kruskal-Wallis test (p_kruskal) when there are three or more groups.
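For orientation, the four tests named here correspond to standard functions in base R's stats package. The sketch below computes comparable p-values directly; it is an assumption that dist_sum() uses these functions with their default options (for instance, t.test() defaults to the Welch unequal-variance test), so results may not match the package exactly. The data frames and variable names are illustrative.

```r
set.seed(42)

# Two groups: t-test and Wilcoxon rank-sum test.
two_groups <- data.frame(
  age = rnorm(100, mean = 30, sd = 10),
  sex = sample(c("male", "female"), size = 100, replace = TRUE)
)
p_ttest  <- t.test(age ~ sex, data = two_groups)$p.value       # Welch t-test by default
p_wilcox <- wilcox.test(age ~ sex, data = two_groups)$p.value

# Three or more groups: one-way ANOVA and Kruskal-Wallis test.
four_groups <- data.frame(
  age   = rnorm(100, mean = 30, sd = 10),
  group = sample(c("a", "b", "c", "d"), size = 100, replace = TRUE)
)
p_anova   <- summary(aov(age ~ group, data = four_groups))[[1]][["Pr(>F)"]][1]
p_kruskal <- kruskal.test(age ~ group, data = four_groups)$p.value
```

The t-test and ANOVA compare means and assume approximately normal data within groups; the Wilcoxon and Kruskal-Wallis tests are rank-based alternatives that do not rely on that assumption.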
Examples
# Four groups: reports the ANOVA and Kruskal-Wallis tests.
example_data <- dplyr::tibble(id = 1:100, age = rnorm(100, mean = 30, sd = 10),
                              group = sample(c("a", "b", "c", "d"),
                                             size = 100, replace = TRUE))
dist_sum(example_data, age, group)

# Two groups: reports the t-test and Wilcoxon rank-sum test.
example_data <- dplyr::tibble(id = 1:100, age = rnorm(100, mean = 30, sd = 10),
                              sex = sample(c("male", "female"),
                                           size = 100, replace = TRUE))
dist_sum(example_data, age, sex)

# Save the summary statistics as a tibble (named to avoid masking base::summary).
age_summary <- dist_sum(example_data, age, sex)