train_full_model {text2emotion}	R Documentation

Train a Full Model Pipeline

Description

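Runs the full training pipeline in one call: text preprocessing (with optional custom slang replacements and a stopwords file), TF-IDF vectorization, tuning of a random forest over the supplied 'mtry' and 'ntree' grids, and training of the final model. The trained vectorizer, TF-IDF model, and random forest model are saved to the file paths given in the arguments.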

Arguments

custom_slang

A named list for custom slang replacements (optional).

max_features

Maximum number of features for TF-IDF vectorizer (default 10000).

min_df

Minimum document frequency for TF-IDF; terms that occur in fewer documents are dropped (default 2).

max_df

Maximum document frequency for TF-IDF; terms that occur in more than this proportion of documents are dropped (default 0.8).

mtry_grid

Grid of values for the 'mtry' parameter to tune in the random forest (default c(5, 10, 20)).

ntree_grid

Grid of values for the 'ntree' parameter to tune in the random forest (default c(100, 200, 300)).

stopwords_file

Path to the stopwords RDS file (default: "final_stopwords.rds").

vectorizer_file

Path to save the trained vectorizer (default: "trained_vectorizer.rds").

tfidf_model_file

Path to save the trained TF-IDF model (default: "trained_tfidf_model.rds").

rf_model_file

Path to save the trained random forest model (default: "trained_rf_ranger_model.rds").

train_df_cache_path

Path to cache the training data frame (default: "train_df_cached.rds").
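
To make the pruning and tuning arguments above concrete, the following base R sketch shows what document-frequency limits and a tuning grid typically mean. It is an illustration only, not the package's internal code: the toy corpus and the names docs, doc_freq, keep, vocab, and grid are hypothetical, and it assumes that min_df is a count of documents, that max_df is a proportion of documents, and that tuning covers every combination of the two grids.

```r
# Illustration only: base R, not the package's internal code.

# Toy corpus: each element is one already-tokenized document.
docs <- list(
  c("happy", "day", "the"),
  c("sad", "day", "the"),
  c("happy", "news", "the"),
  c("angry", "the", "rain"),
  c("the", "calm", "day")
)

# Document frequency: in how many documents does each term occur?
tab <- table(unlist(lapply(docs, unique)))
doc_freq <- setNames(as.integer(tab), names(tab))

min_df <- 2    # assumed to be a count of documents
max_df <- 0.8  # assumed to be a proportion of documents

# Keep terms that are neither too rare nor too common.
keep <- doc_freq >= min_df & doc_freq / length(docs) <= max_df
vocab <- names(doc_freq)[keep]
print(vocab)

# Every (mtry, ntree) pair from the default grids, assuming a full grid search.
grid <- expand.grid(mtry = c(5, 10, 20), ntree = c(100, 200, 300))
print(nrow(grid))
```

On this toy corpus, "the" occurs in all five documents and is dropped by max_df, terms seen in a single document are dropped by min_df, and only "day" and "happy" remain. The default grids give nine candidate combinations, so larger grids increase tuning time multiplicatively.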

Value

A list containing the trained TF-IDF model, vectorizer, random forest model, and test accuracy.
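
Examples

A minimal sketch of a call, written under several assumptions: that text2emotion is installed and exports train_full_model(), that the argument names and defaults match those listed under Arguments, and that the stopwords file is present in the working directory. The arguments list has no data parameter, so the sketch also assumes the function obtains its training data itself. The guard makes the code a no-op when these conditions are not met.

```r
# Sketch only: assumes text2emotion is installed, exports train_full_model(),
# and that the stopwords file exists in the working directory.
result <- NULL
if (requireNamespace("text2emotion", quietly = TRUE) &&
    file.exists("final_stopwords.rds")) {
  result <- text2emotion::train_full_model(
    max_features = 10000,
    min_df = 2,
    max_df = 0.8,
    mtry_grid = c(5, 10, 20),
    ntree_grid = c(100, 200, 300),
    stopwords_file = "final_stopwords.rds"
  )
  # Element names of the returned list are not documented here, so inspect them.
  str(result, max.level = 1)
}
```

Because the element names of the returned list are not given in this page, the sketch inspects the result with str() rather than accessing components by name.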


[Package text2emotion version 0.1.0 Index]