calc_rf {qtkit}    R Documentation

Internal Functions for Calculating Dispersion and Frequency Metrics

Description

A collection of internal helper functions that calculate various dispersion and frequency metrics from term-document matrices. These functions support the main calc_type_metrics function by providing specialized calculations for different statistical measures.

calc_rf computes the relative frequency (RF) of each term in a term-document matrix: the share of all term occurrences in the corpus that belongs to that term.

Usage

calc_rf(tdm)

Arguments

tdm

A sparse term-document matrix (Matrix package format)
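
As an illustration of the expected input, the sketch below converts a small dense count matrix into a sparse matrix with the Matrix package. The counts, term names, and document names are made up, and the orientation (terms in rows, documents in columns) is an assumption rather than something this page states.

```r
library(Matrix)

# Hypothetical counts: 3 terms (rows) by 2 documents (columns).
counts <- matrix(
  c(2, 0,
    0, 5,
    1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("alpha", "beta", "gamma"), c("doc1", "doc2"))
)

# Convert to a sparse representation; zero cells are not stored.
tdm_example <- Matrix(counts, sparse = TRUE)

# Confirm that the object is a sparse Matrix-package matrix.
is_sparse <- methods::is(tdm_example, "sparseMatrix")
```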

Details

The package implements these metrics:

Dispersion measures:

Frequency measures:

Implementation notes:

The calculation process:

  1. Sums the occurrences of each term across all documents

  2. Divides each term total by the corpus size (the sum of all term occurrences in the matrix)

  3. Returns the resulting proportions, each between 0 and 1
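
The three steps can be sketched in a few lines of R. This is an illustrative re-implementation, not the package's source: the function name rf_sketch and the toy data are hypothetical, and the code assumes terms are stored in rows and documents in columns.

```r
library(Matrix)

# Hypothetical toy term-document matrix: rows are terms, columns are documents.
tdm <- sparseMatrix(
  i = c(1, 1, 2, 3, 3),
  j = c(1, 2, 1, 2, 3),
  x = c(2, 1, 3, 1, 3),
  dims = c(3, 3),
  dimnames = list(c("alpha", "beta", "gamma"), c("doc1", "doc2", "doc3"))
)

# Illustrative sketch of the relative frequency calculation.
rf_sketch <- function(tdm) {
  term_totals <- Matrix::rowSums(tdm)  # step 1: per-term totals across documents
  corpus_size <- sum(tdm)              # total number of term occurrences
  term_totals / corpus_size            # steps 2 and 3: proportions in [0, 1]
}

rf <- rf_sketch(tdm)
```

With these counts the term totals are 3, 3, and 4 out of 10 occurrences, so the proportions are 0.3, 0.3, and 0.4. Because every occurrence is counted exactly once in the denominator, the relative frequencies of all terms sum to 1.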

Value

A numeric vector where each element represents a term's relative frequency in the corpus (range: 0-1)

References

Gries, S. T. (2008). Dispersions and adjusted frequencies in corpora. International Journal of Corpus Linguistics, 13(4), 403-437.


[Package qtkit version 1.1.1 Index]