train_full_model {text2emotion} | R Documentation |
Train the full model pipeline
Description
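Runs the complete training pipeline in one call: text preprocessing, TF-IDF vectorization, random forest hyperparameter tuning over the supplied grids, and training of the final model. The trained vectorizer, TF-IDF model, and random forest model are saved to the file paths given in the arguments.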
Arguments
custom_slang | A named list for custom slang replacements (optional). |
max_features | Maximum number of features for TF-IDF vectorizer (default 10000). |
min_df | Minimum document frequency for TF-IDF (default 2). |
max_df | Maximum document frequency for TF-IDF (default 0.8). |
mtry_grid | Grid of values for 'mtry' parameter to tune in random forest (default: c(5, 10, 20)). |
ntree_grid | Grid of values for 'ntree' parameter to tune in random forest (default: c(100, 200, 300)). |
stopwords_file | Path to the stopwords RDS file (default: "final_stopwords.rds"). |
vectorizer_file | Path to save the trained vectorizer (default: "trained_vectorizer.rds"). |
tfidf_model_file | Path to save the trained TF-IDF model (default: "trained_tfidf_model.rds"). |
rf_model_file | Path to save the trained random forest model (default: "trained_rf_ranger_model.rds"). |
train_df_cache_path | Path to cache the training data frame (default: "train_df_cached.rds"). |
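The two grid arguments are easier to reason about when written out as explicit candidates. The sketch below uses only base R and assumes that tuning evaluates every pairing of an 'mtry' value with an 'ntree' value. This page does not state how the grids are combined, so treat the full cross product as an illustration rather than the package's actual search strategy. The name tuning_grid is introduced here for the example only.

```r
# Documented default grids for the two tuned hyperparameters.
mtry_grid  <- c(5, 10, 20)
ntree_grid <- c(100, 200, 300)

# Assumption: every (mtry, ntree) pair is a candidate configuration.
# expand.grid() builds the full cross product as a data frame.
tuning_grid <- expand.grid(mtry = mtry_grid, ntree = ntree_grid)

# Three values on each axis give nine candidate configurations.
nrow(tuning_grid)
tuning_grid
```

Under that assumption, each value added to either grid adds one model fit per value in the other grid, so larger grids lengthen tuning time multiplicatively.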
Value
A list containing the trained TF-IDF model, vectorizer, random forest model, and test accuracy.
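Examples

A minimal sketch of a call that spells out the documented defaults. It assumes the function is exported from text2emotion and can be called without a data argument, since none is listed above. The optional custom_slang argument is omitted because its default is not documented here. The element names of the returned list are also not documented, so the sketch inspects the result with str() instead of guessing them. The call is guarded so that training only runs in an interactive session with the package installed.

```r
# Arguments as named on this page, set to their documented defaults.
train_args <- list(
  max_features        = 10000,
  min_df              = 2,
  max_df              = 0.8,
  mtry_grid           = c(5, 10, 20),
  ntree_grid          = c(100, 200, 300),
  stopwords_file      = "final_stopwords.rds",
  vectorizer_file     = "trained_vectorizer.rds",
  tfidf_model_file    = "trained_tfidf_model.rds",
  rf_model_file       = "trained_rf_ranger_model.rds",
  train_df_cache_path = "train_df_cached.rds"
)

# Assumption: train_full_model is exported and needs no further arguments.
if (interactive() && requireNamespace("text2emotion", quietly = TRUE)) {
  result <- do.call(text2emotion::train_full_model, train_args)

  # Inspect the top-level structure of the returned list.
  str(result, max.level = 1)

  # The saved random forest model can be reloaded from its documented path.
  rf_model <- readRDS(train_args$rf_model_file)
}
```

Keeping the arguments in a list makes it straightforward to change one path or grid and rerun the pipeline with everything else unchanged.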