modelSelection_Q {ppsbm} | R Documentation |
Selects the number of groups with ICL criterion
Description
Selects the number of groups with Integrated Classification Likelihood (ICL) criterion.
Usage
modelSelection_Q(
data,
n,
Qmin = 1,
Qmax,
directed = TRUE,
sparse = FALSE,
sol.hist.sauv
)
Arguments
data |
List with 2 components: Nijk, the data matrix of counts per subinterval, and Time, the length of the observation period. |
n |
Total number of nodes. |
Qmin |
Minimum number of groups. |
Qmax |
Maximum number of groups. |
directed |
Boolean for directed (TRUE) or undirected (FALSE) case. |
sparse |
Boolean for sparse (TRUE) or not sparse (FALSE) case. |
sol.hist.sauv |
List of size Qmax-Qmin+1 obtained from running mainVEM on the data with method='hist'. |
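Because sol.hist.sauv must contain exactly Qmax-Qmin+1 solutions, a quick length check before calling modelSelection_Q can catch a mismatch between the list and the requested range. The following is a minimal sketch in base R; check_sol_length is a hypothetical helper, not part of ppsbm, and mock_sol is a placeholder standing in for the output of mainVEM.

```r
# Hypothetical helper (not part of ppsbm): verify that a list of solutions
# holds one entry per candidate number of groups in [Qmin, Qmax].
check_sol_length <- function(sol_list, Qmin, Qmax) {
  expected <- Qmax - Qmin + 1
  if (!is.list(sol_list)) {
    stop("sol_list must be a list")
  }
  if (length(sol_list) != expected) {
    stop("expected ", expected, " solutions for Q in [", Qmin, ", ", Qmax,
         "], got ", length(sol_list))
  }
  invisible(TRUE)
}

# Placeholder list of 4 empty slots, standing in for mainVEM output
# obtained with Qmin = 1 and Qmax = 4.
mock_sol <- vector("list", 4)
check_sol_length(mock_sol, Qmin = 1, Qmax = 4)
```

With a real solution list, the same call would be made with the Qmin and Qmax values later passed to modelSelection_Q.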
Value
The function outputs a list of 7 components:
- Qbest
Selected value of the number of groups in [Qmin, Qmax].
- sol.Qbest
Solution of the mainVEM function for the number of groups Qbest.
- Qmin
Minimum number of groups used.
- all.J
Vector of length Qmax-Qmin+1. Each value is the estimated ELBO function J for estimation with Q groups, Qmin ≤ Q ≤ Qmax.
- all.ICL
Vector of length Qmax-Qmin+1. Each value is the ICL value for estimation with Q groups, Qmin ≤ Q ≤ Qmax.
- all.compl.log.likelihood
Vector of length Qmax-Qmin+1. Each value is the estimated complete log-likelihood value for estimation with Q groups, Qmin ≤ Q ≤ Qmax.
- all.pen
Vector of length Qmax-Qmin+1. Each value is the penalty term in ICL for estimation with Q groups, Qmin ≤ Q ≤ Qmax.
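The vectors in the returned list are indexed from 1, while the number of groups starts at Qmin, so entry i corresponds to Q = Qmin + i - 1. The sketch below illustrates that mapping in base R. The list sol is a hypothetical stand-in with invented numbers, not output of the package, and two conventions are assumed rather than taken from this page: that ICL equals the complete log-likelihood minus the penalty, and that the selected Q is the one with the largest ICL.

```r
# Hypothetical stand-in for the list returned by modelSelection_Q();
# all numbers are invented for illustration only.
sol <- list(
  Qmin = 2,
  all.compl.log.likelihood = c(-520, -480, -475),
  all.pen = c(10, 25, 45)
)

# Assumed convention: ICL = complete log-likelihood - penalty.
sol$all.ICL <- sol$all.compl.log.likelihood - sol$all.pen

# Entry i of each vector corresponds to Q = Qmin + i - 1.
Q_values <- sol$Qmin + seq_along(sol$all.ICL) - 1

# Assumed selection rule: keep the Q with the largest ICL.
Q_selected <- Q_values[which.max(sol$all.ICL)]

# One row per candidate number of groups.
summary_table <- data.frame(
  Q = Q_values,
  compl.log.likelihood = sol$all.compl.log.likelihood,
  pen = sol$all.pen,
  ICL = sol$all.ICL
)
print(summary_table)
print(Q_selected)
```

On a real result, the same table can be built from the Qmin, all.compl.log.likelihood, all.pen and all.ICL components directly, without recomputing all.ICL.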
References
BIERNACKI, C., CELEUX, G. & GOVAERT, G. (2000). Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans. Pattern Anal. Machine Intel. 22, 719–725.
CORNELI, M., LATOUCHE, P. & ROSSI, F. (2016). Exact ICL maximization in a non-stationary temporal extension of the stochastic block model for dynamic networks. Neurocomputing 192, 81–91.
DAUDIN, J.-J., PICARD, F. & ROBIN, S. (2008). A mixture model for random graphs. Statist. Comput. 18, 173–183.
MATIAS, C., REBAFKA, T. & VILLERS, F. (2018). A semiparametric extension of the stochastic block model for longitudinal networks. Biometrika 105(3), 665–680.
Examples
# load data of a synthetic graph with 50 individuals and 3 clusters
n <- 50
# compute data matrix of counts per subinterval with precision d_max = 3
# (i.e. number of parts K = 2^d_max = 8).
K <- 2^3
data <- list(Nijk=statistics(generated_Q3$data,n,K,directed=FALSE),
             Time=generated_Q3$data$Time)
# ICL-model selection with groups ranging from 1 to 4
sol.selec_Q <- modelSelection_Q(data,n,Qmin=1,Qmax=4,directed=FALSE,
                                sparse=FALSE,sol.hist.sauv=generated_sol_hist)
# best number Q of clusters:
sol.selec_Q$Qbest