wsdm {VGAM} | R Documentation |
The WSDM Function
Description
Computes the WSDM statistic for each regression coefficient of a fitted VGLM.
Usage
wsdm(object, hdiff = 0.005, retry = TRUE, mux.hdiff = 1,
maxderiv = 5, theta0 = 0, use.hdeff = FALSE,
doffset = NULL, subset = NULL,
derivs.out = FALSE, fixed.hdiff = TRUE,
eps.wsdm = 0.15, Mux.div = 3, warn.retry = TRUE,
with1 = TRUE, ...)
Arguments
object |
A fitted |
hdiff |
Numeric;
the difference |
retry |
Logical, compute with two other |
mux.hdiff |
Numeric, positive and of unit length;
multiplier for |
maxderiv |
Numeric, positive, integer-valued
and of unit length.
The highest order derivative to be computed.
This means the highest value that the
function will return is |
theta0 |
Numeric; the hypothesized value.
The default is appropriate for most symmetric
|
use.hdeff |
Logical, use |
doffset |
Numeric; denominator offset.
If inputted and not of sufficient
length, then the remaining values are 1s.
A value |
subset |
Specify a subset of the regression
coefficients?
May be numeric, logical or character.
Should work if |
derivs.out |
Logical; return the derivatives?
If |
fixed.hdiff |
Logical; treat |
eps.wsdm |
Numeric (positive and scalar). Tolerance for computing the WSDM value. Unless the three values are within this quantity of each other, a warning will be issued. It is usually not necessary to compute WSDM statistics very accurately because they are used as a diagnostic, hence this argument should not be too small. |
Mux.div |
Numeric ( |
warn.retry |
logical; if |
with1 |
Logical.
Include the intercepts?
(This is a |
... |
Unused just now. |
Details
This function, which is currently experimental
and very rough-and-ready,
may be used as a diagnostic
to see if any regression coefficients are
alarmingly close to the
parameter space boundary for the
regularity conditions to
be valid.
A zero value denotes the centre of the
parameter space (cops
; COPS),
which can be considered the
heart of the interior of the parameter space
where the Hauck–Donner effect (HDE) does not
occur.
A unit value occurs at w_1[0]
, the
locations where the HDE starts taking place.
As the WSDM statistic increases, the estimate
is approaching the parameter space boundary,
hence standard inference is breaking down
and becoming more fraught with various
dangers and inaccuracies.
The WSDM (pronounced wisdom)
and the WSDM function
are invariant to the sample size n
for intercept-only models under random sampling.
They are intended to be useful as a
regression diagnostic tool for most VGLMs.
In summaryvglm
,
if the max-WSDM statistic,
which is the maximum WSDM over all the
regression coefficients bar the intercepts,
is greater than 1.3, say, then the model
should definitely not be used as it stands.
One reason for the HDE is because a covariate
is heavily skewed. If so, a suitable
transformation can remedy the problem.
The HDE may also be caused by
complete separation in the covariate
space.
Incidentally,
another thing to check is the number of Fisher
scoring iterations needed for convergence,
e.g., any value greater than 10, say, should
raise concern.
Set trace = TRUE
or look at
niters(object)
or
summary(object)
.
Value
By default this function
returns the WSDM statistics
as a labelled numeric vector with
nonnegative values, i.e., with names
names(coef(object))
.
The attribute (see attr
)
"seems.okay"
will
always be attached to the answer and will
be FALSE
, TRUE
,
or NA
or NULL
if uncertain.
If FALSE
, retry by changing hdiff
or mux.hdiff
.
The following table is suggested for
all link functions except for
cauchit
:
Range | Comments |
[0.0, 0.5) | No HDE. Fine. |
[0.5, 0.7) | Faint HDE. A borderline case, approaching something to be concerned with. |
[0.7, 1.0) | Weak HDE. Should be of concern. Action is recommended but could possibly be optional. |
[1.0, 1.3) | Moderate HDE. Action needed here and beyond. |
[1.3, 2.0) | Strong HDE. Action definitely needed for this case. |
[2.0, 3.0) | Extreme I HDE. Model should not be used or remedial action urgently needed. |
[3.0, 4.0) | Extreme II HDE. Ditto. |
[4.0, 5.0) | Extreme III HDE. Ditto. |
... | ... |
This table supersedes the one given in
Yee (2022),
as this one is totally independent of n
and has several advantages.
Consequently,
hdeffsev
has been rewritten.
No more than two or three decimal places
should be used because the WSDM statistics
are approximated by finite differences and
are mainly used as a diagnostic.
Probably, for most applications, large WSDM
statistics for the intercepts should not be
a problem, hence the max-WSDM excludes these.
Being mainly used as a diagnostic, WSDM values need not be computed or stated very accurately. It is suggested that 2 (or maybe 3) decimals places is fine.
If derivs.out = TRUE
then higher-order
derivatives are returned also within a
list()
.
Warning
Use with caution. This function has been tested the most for logistic regression, and less so for other VGAM family functions. It will not work for all VGAM family functions.
Note
The results can be sensitive to
hdiff
so it is recommended that several
h
be tried,
especially for regression
coefficients that are near the parameter
space boundary.
Hence retry = TRUE
is definitely
recommended.
This function could change in the short
future because it is under active development
and requires further fine-tuning.
Author(s)
Thomas W. Yee.
References
Yee, T. W. (2022). On the Hauck-Donner effect in Wald tests: Detection, tipping points and parameter space characterization, Journal of the American Statistical Association, 117, 1763–1774. doi:10.1080/01621459.2021.1886936.
Yee, T. W. (2025). Mapping the parameter space with the WSDM function: A diagnostic for logistic regression and beyond. In preparation.
See Also
summaryvglm
,
hdeffsev
,
hdeff
,
cops
,
niters
.
Examples
# Kosmidis (2014, Table 2), JRSSB 76: 169--196
ppom.wine2 <-
vglm(cbind(bitter1, bitter2, bitter3, bitter4, bitter5) ~
temp + contact,
cumulative(reverse = TRUE,
parallel = TRUE ~ contact - 1),
wine, trace = TRUE)
coef(ppom.wine2, matrix = TRUE)
summary(ppom.wine2, wsdm = TRUE)
max(wsdm(ppom.wine2, with1 = FALSE))