feature_standardization {bdsm}    R Documentation

Perform feature standardization

Description

This function performs feature standardization (also known as z-score normalization): each feature is centered at its mean and scaled by its standard deviation.
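
As a rough illustration of the transformation described above, the sketch below computes a z-score by hand in base R and compares it with base::scale(). It does not call feature_standardization() and is not taken from the package source; the vector x is made up for the example.

```r
# Made-up numeric feature (illustrative only).
x <- c(1, 2, 3, 4, 5)

# Manual z-score: subtract the mean, then divide by the sample standard deviation.
z <- (x - mean(x)) / sd(x)

# base::scale() centers and scales in the same way; as.numeric() drops
# the matrix shape and attributes it returns.
z_scale <- as.numeric(scale(x))

# Compare the two results up to numerical tolerance.
all.equal(z, z_scale)
```

After this transformation the feature has mean 0 and standard deviation 1.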

Usage

feature_standardization(df, excluded_cols, group_by_col, scale = TRUE)

Arguments

df

Data frame with the data.

excluded_cols

Unquoted column names to exclude from standardization. If missing, all columns are standardized.

group_by_col

Unquoted column names to group the data by before applying standardization. If missing, no grouping is performed.

scale

Logical. If TRUE (default), each centered feature is divided by its standard deviation. If FALSE, features are only centered.
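
The effect of grouping and of the scale argument can be sketched in base R with ave(). This is a minimal illustration, not the package's implementation: the data frame toy and its columns are hypothetical, and it assumes that scale = FALSE corresponds to centering without scaling, as in base::scale().

```r
# Hypothetical toy data; not taken from the package.
toy <- data.frame(
  country = c("A", "A", "B", "B"),
  gdp     = c(1, 2, 3, 5)
)

# Group-wise z-score: center and scale gdp within each country.
toy$gdp_z <- ave(toy$gdp, toy$country,
                 FUN = function(x) (x - mean(x)) / sd(x))

# Group-wise centering only (the assumed analogue of scale = FALSE).
toy$gdp_centered <- ave(toy$gdp, toy$country,
                        FUN = function(x) x - mean(x))

toy
```

In base R, sd() of a single value is NA, so under this sketch a group with only one row would produce NA for the scaled column.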

Value

A data frame with standardized features.

Examples

df <- data.frame(
  year = c(2000, 2001, 2002, 2003, 2004),
  country = c("A", "A", "B", "B", "C"),
  gdp = c(1, 2, 3, 4, 5),
  ish = c(2, 3, 4, 5, 6),
  sed = c(3, 4, 5, 6, 7)
)

# Standardize every column
df_with_only_numeric_values <- df[, setdiff(names(df), "country")]
feature_standardization(df_with_only_numeric_values)

# Standardize all columns except 'country'
feature_standardization(df, excluded_cols = country)

# Standardize within each country (grouped by 'country')
feature_standardization(df, group_by_col = country)

# Standardize within each 'year' group, excluding 'country'
feature_standardization(df, excluded_cols = country, group_by_col = year)


[Package bdsm version 0.2.1 Index]