calculate_tv_distance_empirical {covalchemy} | R Documentation
Calculate Total Variation (TV) Distance Empirically
Description
This function calculates the Total Variation (TV) distance between the empirical cumulative distribution functions (ECDFs) of two datasets, the original data and the generated data. The TV distance is defined here as half the sum of the absolute differences between the two ECDFs, taken over the points in the domain.
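The definition above can be sketched in base R. This is only a literal reading of the description, not the package's source: the helper name `tv_distance_sketch` is hypothetical, and treating "each point in the domain" as the pooled unique observations of both samples is an assumption.

```r
# Hypothetical helper, NOT part of covalchemy: a literal reading of the
# description (half the sum of absolute ECDF differences).
tv_distance_sketch <- function(original_data, generated_data) {
  stopifnot(is.numeric(original_data), is.numeric(generated_data))

  # Empirical CDFs as step functions, via base R's stats::ecdf()
  ecdf_original <- stats::ecdf(original_data)
  ecdf_generated <- stats::ecdf(generated_data)

  # Assumption: the "domain" is the set of pooled unique observations
  grid <- sort(unique(c(original_data, generated_data)))

  # Half the sum of absolute differences between the two ECDFs on the grid
  0.5 * sum(abs(ecdf_original(grid) - ecdf_generated(grid)))
}

tv_distance_sketch(c(1, 2, 3), c(1, 2, 3)) # identical samples give 0
tv_distance_sketch(c(1, 2), c(3, 4))       # non-overlapping samples give 1
```

Under this reading the result accumulates over every grid point where the two ECDFs disagree, so it depends on the choice of grid and need not lie in [0, 1]. The packaged function may evaluate the ECDFs on a different set of points.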
Usage
calculate_tv_distance_empirical(original_data, generated_data)
Arguments
original_data | A numeric vector of the original data.
generated_data | A numeric vector of the generated data.
Value
A numeric value representing the Total Variation distance between the empirical CDFs of the original and generated data.
Examples
# Test Case 1: Two independent samples from the same distribution
original_data <- rnorm(1000, mean = 0, sd = 1) # Normal distribution (mean = 0, sd = 1)
generated_data <- rnorm(1000, mean = 0, sd = 1) # Independent sample, same normal distribution
tv_distance <- calculate_tv_distance_empirical(original_data, generated_data)
print(tv_distance) # Expected to be close to 0, as both datasets are similar
# Test Case 2: Data from different distributions
original_data <- rnorm(1000, mean = 0, sd = 1) # Normal distribution (mean = 0, sd = 1)
generated_data <- rnorm(1000, mean = 5, sd = 2) # Different normal distribution
tv_distance <- calculate_tv_distance_empirical(original_data, generated_data)
print(tv_distance) # Expected to be larger, as the datasets are quite different
[Package covalchemy version 1.0.0 Index]