calculateDiversity {doblin} | R Documentation
Calculate diversity indices for a sample
Description
calculate_diversity() is the main function for computing barcode diversity indices for a given sample. It calculates three common diversity measures: species richness (q = 0), Shannon diversity (q = 1), and dominance-based diversity (q = infinity).
format_sample() reshapes a long-format data frame into wide format, with lineage IDs as rows and time points as columns, and replaces missing values with zeros.
calculate_q_0() calculates the number of lineages with nonzero frequency at each time point.
calculate_q_1() calculates the Shannon entropy at each time point and returns its exponential. This measure accounts for both the number and the evenness of lineages.
calculate_q_inf() calculates the reciprocal of the most abundant lineage's frequency at each time point. This measure reflects how strongly the most frequent lineage dominates.
Usage
calculate_diversity(input_data)
format_sample(sample)
calculate_q_0(mat)
calculate_q_1(mat)
calculate_q_inf(mat)
Arguments
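input_data | A long-format data frame with columns ID, Time, and Reads, giving the barcode read count of each lineage at each time point. Used by calculate_diversity().
sample | A long-format data frame with columns ID, Time, and Reads. Used by format_sample().
mat | A matrix of relative abundances, with lineage IDs as rows and time points as columns. Used by calculate_q_0(), calculate_q_1(), and calculate_q_inf().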
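The two input shapes can be illustrated with a small base-R sketch. This is not the package's implementation: the toy data, the names toy_long, toy_wide, and toy_rel, and the use of tapply() are assumptions made for illustration. The step that divides each column by its total to obtain relative abundances is also an assumption, since this page does not say where that normalization happens.

```r
# Toy long-format input: one row per (lineage, time point) with a read count.
# Lineage "C" has no row at Time 2, so that cell is missing after reshaping.
toy_long <- data.frame(
  ID    = c("A", "B", "C", "A", "B"),
  Time  = c(1, 1, 1, 2, 2),
  Reads = c(50, 30, 20, 90, 10)
)

# Long -> wide: lineage IDs as rows, time points as columns.
# tapply() leaves NA for (ID, Time) combinations that never occur.
toy_wide <- tapply(toy_long$Reads, list(toy_long$ID, toy_long$Time), sum)

# Replace missing values with zeros, as described for format_sample().
toy_wide[is.na(toy_wide)] <- 0

# Assumed normalization: divide each column by its total so that every
# time point sums to 1 (relative abundances).
toy_rel <- sweep(toy_wide, 2, colSums(toy_wide), "/")

print(toy_wide)
print(toy_rel)
```

Here toy_long has the shape expected for input_data and sample, and toy_rel has the shape described for mat.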
Details
Internally, calculate_diversity() calls:
format_sample() to reshape the data
calculate_q_0(), calculate_q_1(), and calculate_q_inf() to compute each diversity index
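The three indices can be sketched in base R from their descriptions on this page. This is a minimal illustration under stated assumptions, not the code doblin runs: the matrix p and the variable names are hypothetical, and zero frequencies are dropped before taking logarithms so that 0 * log(0) is treated as 0.

```r
# Hypothetical relative-abundance matrix: 3 lineages (rows), 2 time points
# (columns). Each column sums to 1.
p <- matrix(
  c(0.5, 0.3, 0.2,   # time point t1
    0.9, 0.1, 0.0),  # time point t2
  nrow = 3,
  dimnames = list(c("A", "B", "C"), c("t1", "t2"))
)

# q = 0: number of lineages with nonzero frequency at each time point.
q_0 <- colSums(p > 0)

# q = 1: exponential of the Shannon entropy at each time point.
# Zero frequencies are removed first (assumption: 0 * log(0) counts as 0).
shannon <- apply(p, 2, function(x) {
  x <- x[x > 0]
  -sum(x * log(x))
})
q_1 <- exp(shannon)

# q = infinity: reciprocal of the most abundant lineage's frequency.
q_inf <- 1 / apply(p, 2, max)

# One row per time point, one column per index.
sketch <- data.frame(q_0 = q_0, q_1 = q_1, q_inf = q_inf)
print(sketch)
```

For this matrix q_0 is 3 at t1 and 2 at t2, and q_inf is 2 at t1 and 1/0.9 at t2. At every time point q_inf <= q_1 <= q_0, which is a quick sanity check on any output of this kind.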
Value
calculate_diversity(): A data frame containing three diversity indices over time: q_0 (richness), q_1 (Shannon), and q_inf (dominance).
format_sample(): A wide-format data frame suitable for diversity calculations.
calculate_q_0(): A data frame with one column: q_0.
calculate_q_1(): A data frame with one column: q_1.
calculate_q_inf(): A data frame with one column: q_inf.
Examples
# Load demo barcode count data (installed with the package)
demo_file <- system.file("extdata", "demo_input.csv", package = "doblin")
input_dataframe <- readr::read_csv(demo_file, show_col_types = FALSE)
# Calculate diversity indices over time
diversity_df <- calculate_diversity(input_dataframe)