arguments {BTSR} R Documentation

Shared documentation for arguments

Description

This is the common documentation for all parameters in the BTSR package.

The package handles function arguments in two compatible formats

All functions accept both formats seamlessly, ensuring backward compatibility. The internal processing automatically standardizes to the new structure.

Arguments

model

character string (case-insensitive) indicating the model to be fitted to the data. Must be one of the options listed in the Section Supported Models.

n

the sample size of the output time series yt after burn-in (simulation only). Default is n = 1.

nnew

optional; the number of out-of-sample predicted values required (extract and fit only). Default is nnew = 0.

burn

the length of the ‘burn-in’ period (simulation only). Default is burn = 0. The first burn values of the time series are discarded.

yt

numeric vector with the observed time series (extract and fit only). Missing values (NA's) are not allowed.

y.start

optional; an initial value for Y_t (to initialize recursions when t < 1). Default is y.start = NULL, in which case, the recursion assumes that Y_t = g_{12}^{-1}(0), for t < 1. Only relevant if p > 0.

rho

the quantile being considered in the conditional distribution of Y_t (only present in Kumaraswamy and Unit-Weibull based models). It can be any positive number between 0 and 1. Default is rho = 0.5, which corresponds to the median.

y.lower

the lower limit for the Kumaraswamy density support. Default is y.lower = 0.

y.upper

the upper limit for the Kumaraswamy density support. Default is y.upper = 1.

vt.start

optional; an initial value for \vartheta_t (to initialize recursions when t < 1). Default is vt.start = NULL, in which case, the recursion assumes that \vartheta_t = g_{22}^{-1}(0), for t < 1. Only relevant if \nu is time-varying and p_2 > 0.

e2.start

optional; an initial value for g_{23}(e_{1t}) (to initialize recursions when t < 1). Default is e2.start = NULL, in which case, the recursion assumes that e_{1t} = g_{23}^{-1}(0), for t < 1. Only relevant if \nu is time-varying and q_2 > 0 or d_2 > 0.

xreg

optional; external regressors. Can be specified as a vector, a matrix or a list. Default is xreg = NULL. For details, see the Section Regressors format.

xnew

optional; nnew new observations of the external regressors (extract and fit only). Follows the same format as xreg. Default is xnew = NULL.

xreg.start

optional; initial value for the regressors (to initialize recursion). Can be specified as a vector or a list. Default is xreg.start = NULL, in which case, the average of the first p values (AR order) is used. Only relevant if xreg is provided, xregar = TRUE and p > 0. For details, see the Section Regressors format.

xregar

a length 1 or 2 logical vector indicating whether xreg should be included in the AR recursion for each part of the model. Default is xregar = TRUE. Only relevant if p > 0. If a single value is provided and \nu is time-varying, the same option is assumed for both parts of the model. See the Section ‘The BTSR structure’ in btsr-package for details.

inf

a length 1 or 2 integer vector giving the truncation point for infinite sums. Default is inf = 1000. See the Section Model Order for details.

p

optional; a length 1 or 2 integer vector giving the order of the AR polynomial (extract and fit only). Default is p = NULL. See the Section Model Order for details.

q

optional; a length 1 or 2 integer vector giving the order of the MA polynomial (extract and fit only). Default is q = NULL. See the Section Model Order for details.

d

a length 1 or 2 logical vector indicating whether the long memory parameter d should be included in the model either as a fixed or non-fixed parameter (fit only). If d = FALSE, internally the value of the parameter d is fixed as 0. In this case, if start or fixed.values include d, the value provided by the user is ignored. If \nu is time-varying and a single value is provided it is assumed that d_1 = d_2 = d.

ignore.start

optional; logical value indicating whether the argument start should be ignored (fit only). If starting values are not provided, the function uses the default values and ignore.start is ignored. In case starting values are provided and ignore.start = TRUE, those starting values are ignored and recalculated. The default is ignore.start = FALSE. Partial starting values are not allowed.

start

optional; a list with the starting values for the non-fixed coefficients of the model (fit only). The default is start = NULL, in which case the function coefs.start is used internally to obtain starting values for the parameters. For details on the expected format and the arguments that can be passed through start, see the Section Model coefficients.

coefs

a list with the coefficients of the model (simulation and extraction only). The default is coefs = NULL. For details on the expected format and the arguments that can be passed through coefs, see the Section Model coefficients.

lags

optional; a list with the lags (integer values) that the entries in coefs or start correspond to (extract and fit only). The default is lags = NULL, in which case the lags are computed from the fixed.lags argument (if provided). When components are missing or empty in both lags and fixed.lags, the default behavior is to include all lags based on nreg = ncol(xreg), p, and q. For details, see the Section Model coefficients.

fixed.values

optional; a list with the values of the coefficients that are fixed (extract and fit only). The default is fixed.values = NULL. See the Section Model coefficients.

fixed.lags

optional; a list with the lags (integer values) that the fixed values in fixed.values correspond to (extract and fit only). The default is fixed.lags = NULL. For missing components, fixed values are set based on lags.

lower

optional; list with the lower bounds for the parameters (fit only). Default is lower = NULL. The default is to assume that the parameters have no lower bound except for nu, for which the default is 0. Only the bounds for bounded parameters need to be specified. The format of lower and the arguments that can be passed through this list are the same as the ones for start.

upper

optional; list with the upper bounds for the parameters (fit only). Default is upper = NULL. The default is to assume that the parameters have no upper bound. Only the bounds for bounded parameters need to be specified. The format of upper and the arguments that can be passed through this list are the same as the ones for start.

map

an integer from 1 to 5 corresponding to the map function. Default is map = 4. See the Section The map function.

error.scale

either 0 or 1; the scale for the error term. Default is error.scale = 1 (predictive scale).

linkg

link functions. Can be specified as a character, two-character vector or a named list. The corresponding text strings for currently available links are listed in link.btsr. Default values depend on the model. For some models default values override user specifications. See the Section Link defaults for details.

linkh

a character indicating which link must be associated to the chaotic process. See the Section ‘The BTSR structure’ in btsr-package for details and link.btsr for valid links. Default is linkh = "linear".

configs.linkg

a list with two elements, ctt and power, which define the constant a and the exponent b in the link function g(x) = a x^b. Each element can be specified as a numeric value, a vector of size 2 or a named list. The default is configs.linkg = NULL. See the Section Link defaults for details.

configs.linkh

a list with extra configurations for the link h. For now, only used if linkh = "linear" or "polynomial". Default is configs.linkh = list(ctt = 1, power = 1).

m

a non-negative integer indicating the starting time for the sum of the partial log-likelihood, given by \ell = \sum_{t = m+1}^n \ell_t (extract and fit only). Default is m = 0. For details, see the Section The log-likelihood.

llk

logical; indicates whether the value of the log-likelihood function should be returned (extract and fit only). Default is llk = TRUE.

sco

logical; indicates whether the score vector should be returned (extract and fit only). Default is sco = FALSE.

info

logical; indicates whether the information matrix should be returned (extract and fit only). Default is info = FALSE. For the fitting function, info is automatically set to TRUE when report = TRUE.

extra

logical, if TRUE the matrices and vectors used to calculate the score vector and the information matrix are returned (extract and fit only). Default is extra = FALSE. Ignored by BARC models.

control

a list with configurations to be passed to the optimization subroutines (fit only). Default is control = NULL. Missing arguments will receive default values. For details, see fit.control.

report

logical; indicates whether the summary from the fitted model should be printed (fit only). Default is report = TRUE, in which case info is automatically set to TRUE.

complete

logical; if FALSE returns only yt, else returns additional components (simulation only). Default is complete = FALSE.

debug

logical, if TRUE the output from FORTRAN is returned (for debugging purposes). Default is debug = FALSE.

...

further arguments passed to the internal functions. See, for instance, summary.btsr for details.

Supported Models

Internally, all models are handled by the same function and all models can be obtained from the more general case "*ARFIMAV". When a particular model (e.g. "BREG" or "BARMA") is invoked some default values are assumed.

The following table summarizes the available distributions and the corresponding string to generate each model type. The character V at the end of the string indicates that \nu is time-varying.

+--------------+--------+------------+---------+-----------+---------+
| Distribution | i.i.d. | Regression | Short   | Long      | Chaotic |
|              | sample |            | Memory  | Memory    |         |
+--------------+--------+------------+---------+-----------+---------+
| Beta         | BETA   | BREG       | BARMA   | BARFIMA   | BARC    |
|              |        | BREGV      | BARMAV  | BARFIMAV  |         |
+--------------+--------+------------+---------+-----------+---------+
| Gamma        | GAMMA  | GREG       | GARMA   | GARFIMA   |         |
|              |        | GREGV      | GARMAV  | GARFIMAV  |         |
+--------------+--------+------------+---------+-----------+---------+
| Kumaraswamy  | KUMA   | KREG       | KARMA   | KARFIMA   |         |
|              |        | KREGV      | KARMAV  | KARFIMAV  |         |
+--------------+--------+------------+---------+-----------+---------+
| Matsuoka     | MATSU  | MREG       | MARMA   | MARFIMA   |         |
+--------------+--------+------------+---------+-----------+---------+
| Unit-Lindley | UL     | ULREG      | ULARMA  | ULARFIMA  |         |
+--------------+--------+------------+---------+-----------+---------+
| Unit-Weibull | UW     | UWREG      | UWARMA  | UWARFIMA  |         |
|              |        | UWREGV     | UWARMAV | UWARFIMAV |         |
+--------------+--------+------------+---------+-----------+---------+

Default values

All models are special cases of the general "*ARFIMAV" structure. When a specific model is selected via model = "NAME", the package automatically applies the default configurations below (any parameter that does not appear in the equations is ignored).

i.i.d. samples (e.g., BETA, GAMMA,...)

\eta_{1t} = \alpha_1 = \mu, \quad \eta_{2t} = \alpha_2 = \nu.

Fixed

p <- q <- d <- 0
xreg <- NULL
linkg <- list(g11 = "linear", g2 = "linear",
              g21 = "linear", g23 = "linear")

Regression models with \nu_t constant over time (e.g., BREG, GREG,...)

\eta_{1t} = g_{11}(\mu_t) = \alpha_1 + \boldsymbol{X}_{1t}'\boldsymbol{\beta}_1, \quad \eta_{2t} = \alpha_2 = \nu.

Fixed

p <- q <- d <- 0
xreg <- list(part1 = "user's regressors", part2 = NULL)
linkg <- list(g11 = "user's choice", g12 = "linear",
              g2 = "linear", g21 = "linear", g23 = "linear")

Regression models with \nu_t varying over time (e.g. BREGV, GREGV)

\eta_{1t} = g_{11}(\mu_t) = \alpha_1 + \boldsymbol{X}_{1t}'\boldsymbol{\beta}_1, \quad \eta_{2t} = g_{21}(g_2(\nu_t)) = \alpha_2 + \boldsymbol{X}_{2t}'\boldsymbol{\beta}_2.

Fixed

p <- q <- d <- 0
linkg <- list(g11 = "user's choice", g12 = "linear",
              g2 = "user's choice", g21 = "user's choice",
              g22 = "linear", g23 = "linear")

Short-memory models with \nu constant over time (ARMA-like) (e.g. BARMA, GARMA,...)

\begin{aligned} \eta_{1t} & = g_{11}(\mu_t) = \alpha_1 + \boldsymbol{X}_{1t}'\boldsymbol{\beta}_1 + \sum_{i=1}^{p_1} \phi_{1i}\bigl(g_{12}(Y_{t-i})- I_{X_1}\boldsymbol{X}_{1(t-i)}'\boldsymbol{\beta}_1\bigr) + \sum_{k=1}^{q_1} \theta_{1k} e_{1,t-k}, \\ \eta_{2t} & = \alpha_2 = \nu. \end{aligned}

Fixed

d <- 0
xreg <- list(part1 = "user's regressors", part2 = NULL)
linkg <- list(g11 = "user's choice", g12 = "user's choice",
              g2 = "linear", g21 = "linear", g23 = "linear")
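As a rough illustration of the short-memory recursion above, the following base R sketch simulates a beta ARMA(1,1)-type series without calling the package. It assumes logit links for g11 and g12, the mean/precision beta parameterization (shape1 = mu * nu, shape2 = (1 - mu) * nu), errors on the predictive scale, a zero error term before the sample starts, and no regressors. The function name sim_barma11 and all parameter values are hypothetical.

```r
# Illustrative sketch only; this is not the package's implementation.
sim_barma11 <- function(n, burn = 0, alpha = 0, phi = 0.4, theta = 0.3,
                        nu = 20, seed = 1) {
  set.seed(seed)
  total <- n + burn
  g <- qlogis       # assumed link for g11 and g12 (logit)
  g_inv <- plogis   # inverse of the logit link
  y <- numeric(total)
  y_prev <- g_inv(0)  # Y_t = g12^{-1}(0) for t < 1 (the y.start = NULL rule)
  e_prev <- 0         # assumption: error term is zero before the sample starts
  for (t in seq_len(total)) {
    eta <- alpha + phi * g(y_prev) + theta * e_prev
    mu <- g_inv(eta)
    y[t] <- rbeta(1, shape1 = mu * nu, shape2 = (1 - mu) * nu)
    e_prev <- g(y[t]) - eta  # predictive-scale error (assumed definition)
    y_prev <- y[t]
  }
  # Discard the first `burn` values, as described for the burn argument.
  y[(burn + 1):total]
}

y <- sim_barma11(n = 100, burn = 50)
```

The returned series has length n, since the burn-in values are generated first and then dropped.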

Short-memory models with \nu_t varying over time (e.g. BARMAV, GARMAV,...)

\begin{aligned} \eta_{1t} & = g_{11}(\mu_t) =\alpha_1 + \boldsymbol{X}_{1t}'\boldsymbol{\beta}_1 + \sum_{i=1}^{p_1} \phi_{1i}\big(g_{12}(Y_{t-i})- I_{X_1}\boldsymbol{X}_{1(t-i)}'\boldsymbol{\beta}_1\big) + \sum_{k=1}^{q_1} \theta_{1k} e_{1,t-k},\\ \vartheta_t & = g_2(\nu_t),\\ \eta_{2t} & = g_{21}(\vartheta_t) =\alpha_2 + \boldsymbol{X}_{2t}' \boldsymbol{\beta}_2 + \sum_{i=1}^{p_2} \phi_{2i}\big(g_{22}(\vartheta_{t-i})- I_{X_2}\boldsymbol{X}_{2(t-i)}'\boldsymbol{\beta}_2\big) + \sum_{k=1}^{q_2} \theta_{2k} g_{23}(e_{1,t-k}). \end{aligned}

Fixed

d <- 0

Long-memory models with \nu constant over time (ARFIMA-like models) (e.g. BARFIMA, GARFIMA,...)

\begin{aligned} \eta_{1t} & = g_{11}(\mu_t) =\alpha_1 + \boldsymbol{X}_{1t}'\boldsymbol{\beta}_1 + \sum_{i=1}^{p_1} \phi_{1i}\big(g_{12}(Y_{t-i})- I_{X_1}\boldsymbol{X}_{1(t-i)}'\boldsymbol{\beta}_1\big) + \sum_{k=1}^\infty c_{1k} e_{1,t-k},\\ \eta_{2t} & =\alpha_2 = \nu. \end{aligned}

Fixed

p <- c("user's p", 0)
q <- c("user's q", 0)
d <- c("user's d", 0)
xreg <- list(part1 = "user's regressors", part2 = NULL)
linkg <- list(g11 = "user's choice", g12 = "user's choice",
              g2 = "linear", g21 = "linear", g23 = "linear")

Reproducing Models from the Literature

This section summarizes how to replicate well-known time series models from the literature using the BTSR package. For each model type, we provide the necessary parameter settings and references to the original publications. These configurations act as templates, helping users correctly apply the package to reproduce results or extend established models.

Key arguments (e.g., error.scale, xregar, y.lower, y.upper, rho) should be set to match the specifications in the referenced articles. While we focus on the btsr.* functions (see BTSR.functions), all models can also be implemented using the corresponding parent model functions (for details, see BTSR.parent.models).

i.i.d. samples: The arguments error.scale and xregar are ignored.

Regression models: The argument error.scale and all entries in linkg except g11 are ignored.

ARMA-like models

ARFIMA-like models

Chaotic models

Regressors format

In-sample (xreg) and out-of-sample values (xnew) for regressors can be provided in two formats

xreg.start can be provided in two formats

The following rules apply to xreg, xnew and xreg.start

Model Order

The coefficients \{c_{ik}\}_{k\geq 0} are defined through the relation (see the section ‘The BTSR Structure’ in btsr-package)

c_i(z) := (1-z)^{-d_i}\theta_i(z) = \sum_{k = 0}^\infty c_{ik}z^k, \quad i \in \{1,2\},

where \theta_i(z) = \sum_{k = 0}^{q_i} \theta_{ik}z^k is the moving average characteristic polynomial, with order q_i. For practical purposes, the following approximation is used

c_i(z) \approx \sum_{k = 0}^{K_i} c_{ik}z^k,

for some K_i sufficiently large.

inf corresponds to the truncation point for all infinite sums using the coefficients \{c_{ik}\}_{k\geq 0}, i \in \{1,2\}, including sample generation and the calculation of derivatives. It can be provided as either a single integer (legacy format) or a length 2 integer vector (new format) specifying the truncation points for part1/part2. If \nu is time-varying and a single value is provided, the same value is used for both parts. When d = 0, Fortran automatically sets inf to q (MA order).
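The truncation can be sketched in base R as follows. The code uses the standard binomial-series recursion for the fractional factor in c_i(z), namely \pi_0 = 1 and \pi_k = \pi_{k-1}(k - 1 + d)/k, and convolves it with the MA coefficients. The function name trunc_coefs is hypothetical, \theta_{i0} = 1 is assumed, and the package's Fortran routines may organise the computation differently.

```r
# Illustrative sketch: first K + 1 coefficients of (1 - z)^(-d) * theta(z).
trunc_coefs <- function(d, theta = numeric(0), K = 1000) {
  # Binomial series of (1 - z)^(-d): pi_0 = 1, pi_k = pi_{k-1} * (k - 1 + d) / k
  pi_k <- numeric(K + 1)
  pi_k[1] <- 1
  for (k in seq_len(K)) pi_k[k + 1] <- pi_k[k] * (k - 1 + d) / k
  th <- c(1, theta)  # assumption: theta_0 = 1
  ck <- numeric(K + 1)
  # Convolution: c_k = sum_j theta_j * pi_{k - j}
  for (j in seq_along(th)) {
    lag <- j - 1
    if (lag <= K) {
      idx <- (lag + 1):(K + 1)
      ck[idx] <- ck[idx] + th[j] * pi_k[idx - lag]
    }
  }
  ck
}

ck <- trunc_coefs(d = 0.3, theta = 0.5, K = 5)
ck0 <- trunc_coefs(d = 0, theta = c(0.5, -0.2), K = 4)
```

With d = 0 the coefficients beyond lag q are exactly zero in this sketch, which is consistent with the truncation point being reduced to the MA order in that case.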

By default p and q are set to NULL, in which case their values are computed internally, based on the sizes of the arguments phi and theta, respectively, in the lists of coefficients (or starting values), fixed lags, and fixed values. For fitting purposes, if p (analogously, q) and start are both NULL, an error message is issued. These parameters can be provided as either a single integer (legacy format) or a length 2 integer vector (new format) specifying orders for part1/part2. If \nu is time-varying and a single value of p (analogously, q) is provided, it is assumed that p_1 = p_2 = p (analogously, q_1 = q_2 = q).
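The legacy-to-new conversion described above might look like the following base R sketch. The helper names expand_two_parts and infer_order are hypothetical and are not part of the package; the part1/part2 names follow the wording used in this section. For simplicity, infer_order only looks at the number of supplied coefficients and ignores fixed lags and fixed values.

```r
# Illustrative sketch; helper names are hypothetical.
expand_two_parts <- function(x) {
  stopifnot(length(x) %in% c(1L, 2L))
  # A single value is reused for both parts of the model.
  x <- rep(as.integer(x), length.out = 2L)
  c(part1 = x[1], part2 = x[2])
}

# Infer an order from the number of coefficients supplied (e.g. phi or theta).
infer_order <- function(order = NULL, coefficients = NULL) {
  if (!is.null(order)) return(as.integer(order))
  if (is.null(coefficients)) stop("order and coefficients are both NULL")
  length(coefficients)
}

p <- expand_two_parts(2)              # legacy format: p_1 = p_2 = 2
q <- expand_two_parts(c(1, 0))        # new format: q_1 = 1, q_2 = 0
p1 <- infer_order(NULL, c(0.4, 0.2))  # order taken from the length of phi
```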

Model coefficients

start, coefs, fixed.values, lags and fixed.lags can be specified in one of two ways

The optional arguments in these lists are

The following rules apply for these lists and their arguments.

Simulation:

Extraction:

Fitting:

Extraction and fitting:

The map function

The map function T:[0,1] \to [0,1] in BARC models is a dynamical system, i.e., a function, potentially depending on an r-dimensional vector of parameters \theta. As of today, for all implemented maps, r = 1.
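To make the idea of a parametrised map concrete, the sketch below iterates a map T_\theta on [0,1] in base R. The map used here, T(u) = (\theta u) mod 1, is only an illustrative choice with r = 1 and is not claimed to correspond to any particular value of map; the function names are hypothetical.

```r
# Illustrative sketch of iterating a one-parameter map on [0, 1].
shift_map <- function(u, theta) (theta * u) %% 1

iterate_map <- function(u0, theta, n, map_fun = shift_map) {
  u <- numeric(n)
  u[1] <- u0
  # Each new value is the map applied to the previous one.
  for (t in seq_len(n - 1)) u[t + 1] <- map_fun(u[t], theta)
  u
}

orbit <- iterate_map(u0 = 0.3, theta = 3, n = 4)
```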

Available choices are

Link defaults

linkh and configs.linkh only apply to BARC models.

linkg can be specified in one of two ways

For models that do not have the \nu parameter, the links g2, g21, g22 and g23 are set to "linear" for compatibility with Fortran subroutines.

Missing entries in the linkg list follow these rules

Default linkg values are model-dependent (based on the string provided with model):

If provided, configs.linkg must be a list with optional elements ctt and power, which define the constant a and the exponent b in the link function g(x) = a x^b. Each element in this list can be specified in one of two ways

For now, the arguments ctt and power are only used when the link function is "linear" or "polynomial". If NULL, default is to assume that ctt and power are both equal to 1 for all links.
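The roles of ctt and power can be illustrated with a small base R sketch of the link g(x) = a x^b and its inverse. The constructor name make_power_link and its return structure are hypothetical; the sketch assumes x > 0, a != 0 and b != 0 so that the inverse is well defined.

```r
# Illustrative sketch of g(x) = ctt * x^power and its inverse.
make_power_link <- function(ctt = 1, power = 1) {
  list(
    linkfun = function(x) ctt * x^power,
    linkinv = function(eta) (eta / ctt)^(1 / power)
  )
}

lk <- make_power_link(ctt = 2, power = 3)
eta <- lk$linkfun(1.5)  # 2 * 1.5^3
x <- lk$linkinv(eta)    # maps eta back to the original scale
```

With the defaults ctt = 1 and power = 1, the link reduces to the identity, which matches the behaviour described for configs.linkg = NULL.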

The log-likelihood

Let \boldsymbol\gamma = (\boldsymbol \rho', \boldsymbol \lambda')' be the vector of unknown parameters in the model where

The log-likelihood function, conditioned on a set of initial conditions \mathcal{F}_m is given by

\ell(\boldsymbol\gamma) = \sum_{t = m+1}^n \ell_t = \displaystyle\sum_{t=m+1}^n\log\!\big(f(Y_t \mid \mathcal{F}_{t-1}, \boldsymbol{\gamma})\big).
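The role of m in this sum can be sketched in base R using a beta density as an example. The mean/precision parameterization (shape1 = mu * nu, shape2 = (1 - mu) * nu) is assumed here for illustration, the conditional means are taken as given rather than computed from a recursion, and partial_loglik is a hypothetical name.

```r
# Illustrative sketch: log-likelihood summed over t = m + 1, ..., n.
partial_loglik <- function(y, mu, nu, m = 0) {
  n <- length(y)
  stopifnot(m >= 0, m < n, length(mu) == n)
  # ell_t = log f(Y_t | F_{t-1}) under the assumed beta parameterization
  ell_t <- dbeta(y, shape1 = mu * nu, shape2 = (1 - mu) * nu, log = TRUE)
  sum(ell_t[(m + 1):n])
}

y  <- c(0.42, 0.55, 0.61, 0.48, 0.52)
mu <- rep(0.5, 5)
full    <- partial_loglik(y, mu, nu = 20, m = 0)
partial <- partial_loglik(y, mu, nu = 20, m = 2)
```

The difference full - partial equals the contribution of the first m = 2 observations, which are excluded from the partial sum.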

For simplicity of notation assume m = 0. The score vector U(\boldsymbol\gamma) = \big(U_{\boldsymbol\rho}(\boldsymbol\gamma)', U_{\boldsymbol\lambda}(\boldsymbol\gamma)'\big)' can be written as

U_{\boldsymbol\rho}(\boldsymbol\gamma) = D_{\boldsymbol\rho}' T_1\boldsymbol h_1 + M_{\boldsymbol\rho}' T_2\boldsymbol h_2 \qquad \mbox{and} \qquad U_{\boldsymbol\lambda}(\boldsymbol\gamma) = D_{\boldsymbol\lambda}' T_2\boldsymbol h_2,

where

For the models implemented so far, \partial\eta_{1t}/\partial\lambda_j = 0 so that we don't need a matrix for these derivatives.

The conditional Fisher information matrix for \boldsymbol\gamma is given by

K_n(\boldsymbol\gamma) = \begin{pmatrix} K_{\boldsymbol\rho,\boldsymbol\rho} & K_{\boldsymbol\rho,\boldsymbol\lambda}\\ K_{\boldsymbol\lambda,\boldsymbol\rho}& K_{\boldsymbol\lambda,\boldsymbol\lambda} \end{pmatrix}

with

\begin{aligned} K_{\boldsymbol\rho,\boldsymbol\rho} &= D'_{\boldsymbol \rho}T_1E_\mu T_1 D_{\boldsymbol \rho} + M'_{\boldsymbol \rho}T_2E_{\mu\nu}T_1 D_{\boldsymbol \rho} + D'_{\boldsymbol \rho}T_1E_{\mu\nu} T_2 M_{\boldsymbol \rho} + M'_{\boldsymbol \rho}T_2 E_\nu T_2 M_{\boldsymbol \rho}\\ K_{\boldsymbol\rho,\boldsymbol\lambda} &= K_{\boldsymbol\lambda,\boldsymbol\rho}'= D_{\boldsymbol \rho}' T_1E_{\mu\nu}T_2D_{\boldsymbol \lambda} + M_{\boldsymbol \rho}' T_2 E_\nu T_2 D_{\boldsymbol \lambda},\\ K_{\boldsymbol\lambda,\boldsymbol\lambda} &= D_{\boldsymbol \lambda}' T_2E_\nu T_2D_{\boldsymbol \lambda} \end{aligned}

where E_\mu, E_{\mu\nu} and E_\nu are diagonal matrices for which the (t,t)th element is given by

[E_\mu ]_{t,t} = -\mathbb{E}\bigg(\dfrac{\partial^2 \ell_t}{\partial \mu_t^2} \bigg| \mathcal{F} _{t-1} \bigg), \quad [E_{\mu\nu}]_{t,t} = -\mathbb{E}\bigg(\dfrac{\partial^2 \ell_t}{\partial\mu_t\partial \nu_t} \bigg| \mathcal{F} _{t-1} \bigg) \quad \mbox{and} \quad [E_\nu]_{t,t} = - \mathbb{E}\bigg(\dfrac{\partial^2 \ell_t}{ \partial \nu_t^2} \bigg| \mathcal{F} _{t-1} \bigg).
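The block structure of K_n can be checked numerically with placeholder inputs. In the base R sketch below all matrices are filled with arbitrary numbers purely to verify dimensions and symmetry; T_1, T_2 and the E matrices are taken to be diagonal, the dimensions are hypothetical, and none of the values come from a fitted model.

```r
# Illustrative sketch: assemble K_n from its blocks using placeholder matrices.
set.seed(123)
n <- 6; n_rho <- 3; n_lambda <- 2
D_rho    <- matrix(rnorm(n * n_rho), n, n_rho)
M_rho    <- matrix(rnorm(n * n_rho), n, n_rho)
D_lambda <- matrix(rnorm(n * n_lambda), n, n_lambda)
T1 <- diag(runif(n)); T2 <- diag(runif(n))
E_mu <- diag(runif(n)); E_munu <- diag(runif(n)); E_nu <- diag(runif(n))

K_rr <- t(D_rho) %*% T1 %*% E_mu %*% T1 %*% D_rho +
  t(M_rho) %*% T2 %*% E_munu %*% T1 %*% D_rho +
  t(D_rho) %*% T1 %*% E_munu %*% T2 %*% M_rho +
  t(M_rho) %*% T2 %*% E_nu %*% T2 %*% M_rho
K_rl <- t(D_rho) %*% T1 %*% E_munu %*% T2 %*% D_lambda +
  t(M_rho) %*% T2 %*% E_nu %*% T2 %*% D_lambda
K_ll <- t(D_lambda) %*% T2 %*% E_nu %*% T2 %*% D_lambda

# Full matrix: the lower-left block is the transpose of the upper-right one.
K <- rbind(cbind(K_rr, K_rl), cbind(t(K_rl), K_ll))
```

The resulting matrix has one row and column per element of \boldsymbol\gamma and is symmetric up to floating-point error.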

References

Bayer FM, Bayer DM, Pumi G (2017). “Kumaraswamy autoregressive moving average models for double bounded environmental data.” Journal of Hydrology, 555, 385–396. doi:10.1016/j.jhydrol.2017.10.006.

Ferrari SLP, Cribari-Neto F (2004). “Beta Regression for Modelling Rates and Proportions.” Journal of Applied Statistics, 31(7), 799–815. doi:10.1080/0266476042000214501.

Kumaraswamy P (1980). “A generalized probability density function for double-bounded random processes.” Journal of Hydrology, 46(1-2), 79–88. doi:10.1016/0022-1694(80)90036-0.

Matsuoka DH, Pumi G, Torrent HS, Valk M (2024). “A three-step approach to production frontier estimation and the Matsuoka's distribution.” doi:10.48550/arXiv.2311.06086.

Mazucheli J, Menezes AFB, Fernandes LB, de Oliveira RP, Ghitany ME (2019). “The unit-Weibull distribution as an alternative to the Kumaraswamy distribution for the modeling of quantiles conditional on covariates.” Journal of Applied Statistics. doi:10.1080/02664763.2019.1657813.

Mazucheli J, Menezes AFB, Chakraborty S (2018). “On the one parameter unit-Lindley distribution and its associated regression model for proportion data.” Journal of Applied Statistics. doi:10.1080/02664763.2018.1511774.

Mitnik PA, Baek S (2013). “The Kumaraswamy distribution: median-dispersion re-parameterizations for regression modeling and simulation-based estimation.” Statistical Papers, 54, 177–192. doi:10.1007/s00362-011-0417-y.

Prass TS, Pumi G, Taufemback CG, Carlos JH (2025). “Positive time series regression models: theoretical and computational aspects.” Computational Statistics, 40, 1185–1215. doi:10.1007/s00180-024-01531-z.

Pumi G, Matsuoka DH, Prass TS (2025). “A GARMA Framework for Unit-Bounded Time Series Based on the Unit-Lindley Distribution with Application to Renewable Energy Data.” doi:10.48550/arXiv.2504.07351.

Pumi G, Matsuoka DH, Prass TS, Palm BG (2025). “A Matsuoka-Based GARMA Model for Hydrological Forecasting: Theory, Estimation, and Applications.” doi:10.48550/arXiv.2502.18645.

Pumi G, Prass TS, Souza RR (2021). “A dynamic model for double bounded time series with chaotic driven conditional averages.” Scandinavian Journal of Statistics, 48(1), 68–86. doi:10.1111/sjos.12439.

Pumi G, Valk M, Bisognin C, Bayer FM, Prass TS (2019). “Beta autoregressive fractionally integrated moving average models.” Journal of Statistical Planning and Inference, 200, 196–212. doi:10.1016/j.jspi.2018.10.001.

Rocha AV, Cribari-Neto F (2009). “Beta autoregressive moving average models.” Test, 18, 529–545. doi:10.1007/s11749-008-0112-z.

Rocha AV, Cribari-Neto F (2017). “Erratum to: Beta autoregressive moving average models.” Test, 26, 451–459. doi:10.1007/s11749-017-0528-4.

See Also

BTSR.model.defaults: function to print default settings for a specified model


[Package BTSR version 1.0.0 Index]