select_by_CHull {adproclus} | R Documentation |
Automatic Model Selection for ADPROCLUS with CHull Method
Description
For a set of full dimensional ADPROCLUS models (each with a different number of clusters),
this function finds the "elbow" in the scree plot by using the
CHull procedure (Wilderjans, Ceulemans & Meers, 2013) implemented in
the multichull package.
For a matrix of low dimensional ADPROCLUS models
(each with a different number of clusters and components),
this function finds the "elbow" in the scree plot for each
number of clusters with the CHull method.
That is, it reduces the number of models to choose from to the number of
different cluster parameter values by choosing the "elbow" number of
components for a given number of clusters. The resulting list can in turn
be visualized with plot_scree_adpc_preselected.
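The row-wise reduction described above can be sketched in base R. This is only an illustration, not the package's implementation: the grid layout (rows for numbers of clusters, columns for numbers of components), the SSE values, the helper elbow_index and the returned data frame are all hypothetical, and the helper uses a plain ratio of successive SSE drops rather than the full CHull procedure.

```r
# Hypothetical SSE grid (made-up numbers): rows = 2, 3, 4 clusters,
# columns = 1, 2, 3, 4 components
sse_grid <- rbind(
  c(90, 50, 45, 43),
  c(80, 50, 25, 22),
  c(70, 30, 20, 15)
)
rownames(sse_grid) <- c("2", "3", "4")

# Hypothetical helper: position of the "elbow" in one row of SSE values,
# assuming each step adds exactly one component
elbow_index <- function(fit) {
  drops <- -diff(fit)                          # SSE reduction per extra component
  ratios <- drops[-length(drops)] / drops[-1]  # drop before vs. drop after
  which.max(ratios) + 1L                       # ratio j belongs to model j + 1
}

# One chosen number of components per number of clusters
chosen <- unname(apply(sse_grid, 1, elbow_index))

preselection <- data.frame(
  n_clusters   = as.integer(rownames(sse_grid)),
  n_components = chosen,
  SSE          = sse_grid[cbind(seq_len(nrow(sse_grid)), chosen)]
)
preselection
```

The twelve candidate models shrink to three, one per number of clusters, which is the kind of reduced list a scree plot of preselected models would then display.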
For this procedure to work, the SSE or unexplained variance values must be
decreasing in the number of clusters (components). If that is not the case,
increasing the number of (semi-)random starts can help.
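A quick way to check this requirement before calling the function might look as follows. The SSE values are made up for illustration, and treating the requirement as strictly decreasing is an assumption.

```r
# Hypothetical SSE values for models with 1 to 5 clusters (made-up numbers)
sse <- c(120.0, 64.5, 41.2, 43.0, 30.1)

# TRUE only if every additional cluster lowers the SSE
is_decreasing <- all(diff(sse) < 0)
is_decreasing

# Positions of models whose SSE did not improve on the previous model
violations <- which(diff(sse) >= 0) + 1
violations
```

Here the model with four clusters fits worse than the one with three, which suggests the estimation for four clusters ended in a poor local optimum and may benefit from more starts.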
Usage
select_by_CHull(model_fit, percentage_fit = 1e-04, ...)
Arguments
model_fit | Matrix containing SSEs or unexplained variance of all models, as in the output of mselect_adproclus or mselect_adproclus_low_dim. |
percentage_fit | Required proportion of increase in fit of a more complex model. |
... | Additional parameters to be passed on to the CHull procedure of the multichull package. |
Details
This procedure cannot choose the model with the largest or smallest number of clusters (components), i.e. for a set of three models it will always choose the middle one. If for a given number of clusters exactly two models were estimated, this function chooses the model with the lower SSE/unexplained variance.
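Why the boundary models can never be chosen becomes clearer with a simplified version of the scree ratio that CHull-type methods rely on: each model is scored by the fit gained when moving to it, divided by the fit gained when moving past it, so a score only exists for models that have both a simpler and a more complex neighbour. The sketch below is a base R illustration under stated assumptions, not the multichull implementation: scree_ratio is a hypothetical helper, the SSE values are made up, and the convex hull filtering step is skipped because the example values already lie on the hull.

```r
# Hypothetical SSE values for models with 1 to 5 clusters (made-up numbers)
n_clusters <- 1:5
sse <- c(100, 60, 30, 25, 22)

# Hypothetical helper: scree ratio for every interior model i, i.e.
# (SSE drop per unit of complexity when moving to i) divided by
# (SSE drop per unit of complexity when moving from i to i + 1)
scree_ratio <- function(complexity, fit) {
  n <- length(fit)
  stopifnot(n >= 3, length(complexity) == n)
  i <- 2:(n - 1)
  gain_before <- (fit[i - 1] - fit[i]) / (complexity[i] - complexity[i - 1])
  gain_after  <- (fit[i] - fit[i + 1]) / (complexity[i + 1] - complexity[i])
  setNames(gain_before / gain_after, complexity[i])
}

ratios <- scree_ratio(n_clusters, sse)
ratios

# The "elbow" is the interior model with the largest ratio
elbow <- as.integer(names(which.max(ratios)))
elbow

# With three models only the middle one has a ratio at all
three <- scree_ratio(1:3, c(100, 60, 30))
names(three)

# With two models no ratio exists; the documented fallback is the lower SSE
two_models <- c(80, 55)
fallback <- which.min(two_models)
fallback
```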
The name of the model fit criterion is propagated from the input matrix based on the first column name. It is either "SSE" or "Unexplained_Variance".
Value
For full dimensional ADPROCLUS, a CHull object describing the chosen model.
For low dimensional ADPROCLUS, a matrix containing the list of chosen models
and the relevant model parameters, compatible with plot_scree_adpc_preselected.
References
Wilderjans, T. F., Ceulemans, E., & Meers, K. (2013). CHull: A generic convex hull based model selection method. Behavior Research Methods, 45, 1-15.
See Also
mselect_adproclus to obtain the model_fit input from the possible ADPROCLUS models
mselect_adproclus_low_dim to obtain the model_fit input from the possible low dimensional ADPROCLUS models
plot_scree_adpc for plotting the model fits
Examples
# Attaching the package that provides the functions used below
library(adproclus)
# Loading a test dataset into the global environment
x <- stackloss
# Estimating models with cluster parameter values ranging from 1 to 4
model_fits <- mselect_adproclus(data = x, min_nclusters = 1, max_nclusters = 4)
# Use and visualize CHull method
selected_model <- select_by_CHull(model_fits)
selected_model
plot(selected_model)
# Estimating low dimensional models with cluster parameter values
# ranging from 1 to 4 and component parameter values also ranging from 1 to 4
model_fits <- mselect_adproclus_low_dim(data = x, 1, 4, 1, 4, nsemirandomstart = 10, seed = 1)
# Using the CHull method
pre_selection <- select_by_CHull(model_fits)
# Visualize pre-selected models
plot_scree_adpc_preselected(pre_selection)