MSOpt {multiDoE} | R Documentation |
Experimental setup
Description
The MSOpt
function allows the user to define the
structure of the experiment, the set of optimization criteria and the a priori
model to be considered. The output is a list containing all information about
the settings of the experiment. According to the declared criteria, the list
also contains the basic matrices for their implementation, such as information
matrix, matrix of moments and matrix of weights. The returned list
is argument of the Score
and MSSearch
functions of the multiDoE
package.
Usage
MSOpt(facts, units, levels, etas, criteria, model)
Arguments
facts |
A list of vectors representing the distribution of factors
across strata. Each item in the list represents a stratum and the first item
is the highest stratum of the multi-stratum structure of the experiment.
Within the vectors, experimental factors are indicated by progressive integer
from |
units |
A list whose |
levels |
A vector containing the number of available levels for each
experimental factor in |
etas |
A list specifying the ratios of error variance between subsequent
strata. It follows that |
criteria |
A list specifying the criteria to be optimized. It can contain any combination of:
More detailed information on the available criteria is given under Details. |
model |
A string which indicates the type of model, among “main", “interaction" and “quadratic". |
Details
A little notation is introduced to show the criteria that can be
used in the multi-objective approach of the multiDoE
package.
For an experiment with N
runs and s
strata, with stratum i
having n_i
units within each unit at stratum i-1
and
stratum 0 being defined as the entire experiment (n_0 = 1
), the
general form of the model can be written as:
y = X\beta + \sum\limits_{i = 1}^{s} Z_i\varepsilon_i
where
y
is an N
-dimensional vector of responses (N = \prod_{j = 1}^{s}n_j
),
X
is an N
by p
model matrix,
\beta
is a p
-dimensional vector containing the p
fixed model parameters,
Z_i
is an N
by b_i
indicator matrix of 0
and
1
for the units in stratum i
(i.e. the (k,l
)th element
of Z_i
is 1
if the k
th run belongs to the l
th
block in stratum i
and 0
otherwise) and
b_i = \prod_{j = 1}^{i}n_j
.
Finally, the vector \varepsilon_i \sim N(0,\sigma_i^2I_{b_i})
is a b_i
-dimensional vector
containing the random effects, which are all uncorrelated. The variance
components \sigma^{2}_{i} (i = 1, \dots, s)
have to be estimated and this is usually done using
the REML (REstricted Maximum Likelihood) method.
The best linear unbiased estimator for the parameter vector \beta
is
the generalized least square estimator:
\hat{\beta}_{GLS} = (X'V^{-1}X)^{-1}X'V^{-1}y
This estimator has variance-covariance matrix:
Var(\hat{\beta}_{GLS}) = \sigma^{2}(X'V^{-1}X)^{-1}
where
V = \sum\limits_{i = 1}^{s}\eta_i Z_i'Zi
,
\eta_i = \frac{\sigma_i^{2}}{\sigma^{2}}
and
\sigma^{2} = \sigma^{2}_{s}
.
Let M = (X' V^{-1} X)
be the information matrix about \hat{\beta}
and let \eta
be the vector of the variance components.
-
D-optimality. It is based on minimizing the generalized variance of the parameter estimates. This can be done either by minimizing the determinant of the variance-covariance matrix of the factor effects' estimates or by maximizing the determinant of
M
.
The objective function to be minimized is:f_{D}(d; \eta) = \left(\frac{1}{\det(M)}\right)^{1/p}
where
d
is the design with information matrixM
andp
is the number of model parameters. -
A-optimality. This criterion is based on minimizing the average variance of the estimates of the regression coefficients. The sum of the variances of the parameter estimates (elements of
\hat{\beta}
) is taken as a measure, which is equivalent to considering the trace ofM^{-1}
.
The objective function to be minimized is:f_{A}(d; \eta) = trace(M^{-1})/p
where
d
is the design with information matrixM
andp
is the number of model parameters. -
I-optimality. It seeks to minimize the average prediction variance.
The objective function to be minimized is:f_{I}(d; \eta) = \frac{\int_{\chi} f'(x)(M)^{-1}f(x)\,dx }{\int_{\chi} \,dx}
where
d
is the design with information matrixM
and\chi
represents the design region.
It can be proved that when there arek
treatment factors each with two levels, so that the experimental region is of the form[-1, +1]^{k}
, the objective function can also be written as:f_{I}(d; \eta) = trace \left[(M)^{-1} B\right]
where
d
is the design with information matrixM
andB = 2^{-k} \int_{\chi}f'(x)f(x) \,dx
is the moments matrix. To know the implemented expression for calculating the moments matrix for a cuboidal design region see section 2.3 of Sambo, Borrotti, Mylona, and Gilmour (2016). -
Ds-optimality. Its aim is to minimize the generalized variance of the parameter estimates by excluding the intercept from the set of parameters of interest. Let
\beta_i
be the model parameter vector of dimension (p_i - 1
) to be estimated in stratumi
. LetX_i
be the associated model matrixm_i
by(p_i-1)
, wherem_i
is the number of units in stratumi
. The partition of interest of the matrix of variances and covariances of\hat{\beta}_i
is(M_i^{-1})_{22} = [X'_i (I - \frac{1}{m_i} 11^{'})X_i]^{-1}
The objective function to be minimized is:f_{D_s}(d; \eta) = (|(M_i^{-1})_{22}|)^{1/(p_i-1)}
As-optimality. This criterion is based on minimizing the average variance of the estimates of the regression coefficients by excluding the intercept from the set of parameters of interest.
With reference to the notation introduced for the previous criterion, the objective function to be minimized is:f_{A_s}(d; \eta) = trace(W_i(M_i^{-1})_{22})
where
W_i
is a diagonal matrix of weights, with the weights scaled so that the trace ofW_i
is equal to 1. Specifically the implemented matrix assigns to each main effect and each interaction effect an absolute weight equal to 1, while to the quadratic effects it assigns an absolute weight equal to 1/4.-
Id-optimality. It seeks to minimize the average prediction variance by excluding the intercept from the set of parameters of interest.
The objective function to be minimized is the same as the I-optimality criterion where the first row and columns of theB
matrix (see the Id-optimality criterion) are deleted.
Value
MSOpt
returns a list containing the following components:
facts
: The argumentfacts
.nfacts
: An integer. The number of experimental factors (blocking factors are excluded from the count).nstrat
: An integer. The number of strata.units
: The argumentunits
.runs
: An integer. The total number of runs.etas
: The argumentetas
.avlev
: A list showing the available levels for each experimental factor. The design space for each factor is [-1, 1].levs
: A vector showing the number of available levels for each experimental factor.Vinv
: The inverse of the variance-covariance matrix of the responses.model
: The argumentmodel
.crit
: The argumentcriteria
.ncrit
: An integer. The number of criteria considered.M
: The moment matrix. Only with I-optimality criteria.M0
: The moment matrix. Only with Id-optimality criteria.W
: The diagonal matrix of weights. Only with As-optimality criteria.
References
M. Borrotti and F. Sambo and K. Mylona and S. Gilmour. A multi-objective coordinate-exchange two-phase local search algorithm for multi-stratum experiments. Statistics & Computing, 2017.
S. G. Gilmour, J. M. Pardo, L. A. Trinca, K. Niranjan, D.S. Mottram. A split-plot response surface design for improving aroma retention in freeze dried coffee. In: Proceedings of the 6th. European conference on Food-Industry Statist, 2000.
Examples
## This example uses MSOpt to setup a split-plot design with
## 1 whole-plot factor and 4 subplot factors, which in the \code{facts}
## element appear numbered from 2 to 5.
## The experiment must be structured as follows: 6 whole plots and 5 subplots
## per whole plot, for a total of 30 runs.
## Each experimental factor has 3 different levels.
## To check the number of digits to be printed.
backup_options <- options()
options(digits = 10)
facts <- list(1, 2:5)
units <- list(6, 5)
levels <- 3
etas <- list(1)
criteria <- c('I', 'D', 'A')
model <- "quadratic"
msopt <- MSOpt(facts, units, levels, etas, criteria, model)
options(backup_options)