build_balancing_problem {gseries} | R Documentation
Build the elements of balancing problems.
Description
(version française: https://StatCan.github.io/gensol-gseries/fr/reference/build_balancing_problem.html)
This function is used internally by tsbalancing() to build the elements of the balancing problems.
It can also be useful to derive the indirect series associated with equality balancing constraints manually
(outside of the tsbalancing() context).
Usage
build_balancing_problem(
in_ts,
problem_specs_df,
in_ts_name = deparse1(substitute(in_ts)),
ts_freq = stats::frequency(in_ts),
periods = gs.time2str(in_ts),
n_per = nrow(as.matrix(in_ts)),
specs_df_name = deparse1(substitute(problem_specs_df)),
temporal_grp_periodicity = 1,
alter_pos = 1,
alter_neg = 1,
alter_mix = 1,
lower_bound = -Inf,
upper_bound = Inf,
validation_only = FALSE
)
Arguments
in_ts |
(mandatory) Time series (object of class "ts" or "mts") that contains the time series data to be reconciled. They are the balancing problems' input data (initial solutions).
problem_specs_df |
(mandatory) Balancing problem specifications data frame (object of class "data.frame"). Using a sparse format inspired from the
SAS/OR The information is provided using four mandatory variables (
Note that empty strings (
Finally, the following table lists valid aliases for the
Reviewing the Examples should help conceptualize the balancing problem specifications data frame.
in_ts_name |
(optional) String containing the value of argument in_ts. Default value is in_ts_name = deparse1(substitute(in_ts)).
ts_freq |
(optional) Frequency of the time series object (argument in_ts). Default value is ts_freq = stats::frequency(in_ts).
periods |
(optional) Character vector describing the periods of the time series object (argument in_ts). Default value is periods = gs.time2str(in_ts).
n_per |
(optional) Number of periods of the time series object (argument in_ts). Default value is n_per = nrow(as.matrix(in_ts)).
specs_df_name |
(optional) String containing the value of argument problem_specs_df. Default value is specs_df_name = deparse1(substitute(problem_specs_df)).
temporal_grp_periodicity |
(optional) Positive integer defining the number of periods in temporal groups for which the totals should be preserved.
E.g., specify
Default value is temporal_grp_periodicity = 1.
alter_pos |
(optional) Nonnegative real number specifying the default alterability coefficient associated to the values of time series with positive
coefficients in all balancing constraints in which they are involved (e.g., component series in aggregation table raking problems).
Alterability coefficients provided in the problem specification data frame (argument problem_specs_df) override this value. Default value is alter_pos = 1.
alter_neg |
(optional) Nonnegative real number specifying the default alterability coefficient associated to the values of time series with negative
coefficients in all balancing constraints in which they are involved (e.g., marginal totals in aggregation table raking problems).
Alterability coefficients provided in the problem specification data frame (argument problem_specs_df) override this value. Default value is alter_neg = 1.
alter_mix |
(optional) Nonnegative real number specifying the default alterability coefficient associated to the values of time series with a mix of
positive and negative coefficients in the balancing constraints in which they are involved. Alterability coefficients provided
in the problem specification data frame (argument problem_specs_df) override this value. Default value is alter_mix = 1.
lower_bound |
(optional) Real number specifying the default lower bound for the time series values. Lower bounds provided in the problem specification
data frame (argument problem_specs_df) override this value. Default value is lower_bound = -Inf.
upper_bound |
(optional) Real number specifying the default upper bound for the time series values. Upper bounds provided in the problem specification
data frame (argument problem_specs_df) override this value. Default value is upper_bound = Inf.
validation_only |
(optional) Logical argument specifying whether the function should only perform input data validation or not. When
Default value is validation_only = FALSE.
Details
See tsbalancing() for a detailed description of time series balancing problems.
Any missing (NA) value found in the input time series object (argument in_ts) is replaced with 0 in values_ts and triggers a warning message.
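Because missing values are zeroed rather than rejected, it can help to locate them before building the problem. The following is a minimal base R sketch of such a pre-check; the series `x` and its values are hypothetical, and the check itself is not part of the package.

```r
# Hypothetical quarterly input with one missing value in each series.
x <- ts(matrix(c(1, NA, 3, 4,
                 5, 6, NA, 8),
               ncol = 2, dimnames = list(NULL, c("a", "b"))),
        start = 2019, frequency = 4)

# Number of missing values per series (column).
na_count <- colSums(is.na(x))
na_count

# Periods (row indices) that contain at least one missing value.
na_rows <- which(rowSums(is.na(x)) > 0)
na_rows
```

Any series with a nonzero count would have those data points treated as 0 in the initial solution.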
The returned elements of the balancing problems do not include the implicit temporal totals (i.e., elements A2, op2 and b2 only contain the balancing constraints).
Multi-period balancing problem elements A2, op2 and b2 (when temporal_grp_periodicity > 1) are constructed
column by column (in "column-major order"), corresponding to the default behaviour of R for converting objects of class
"matrix" into vectors. I.e., the balancing constraints conceptually correspond to:
- A1 %*% values_ts[t, ] op1 b1 for problems involving a single period (t)
- A2 %*% as.vector(values_ts[t1:t2, ]) op2 b2 for problems involving temporal_grp_periodicity periods (t1:t2).
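The column-major convention can be illustrated with base R alone. The sketch below uses a hypothetical single-period matrix `A1` and hypothetical data; the `kronecker()` construction is an assumption chosen to be consistent with the ordering described above, not necessarily how the function builds `A2` internally.

```r
# Hypothetical single-period constraints on 3 series (s1, s2, s3):
#   s1 + s2 - s3 = 0   and   s1 - s2 = 0
A1 <- matrix(c(1,  1, -1,
               1, -1,  0), nrow = 2, byrow = TRUE)

# Hypothetical data: 4 periods (rows) x 3 series (columns).
n_per <- 4
values <- matrix(c(1,  2,  3,  4,    # s1
                   5,  6,  7,  8,    # s2
                   9, 10, 11, 12),   # s3
                 nrow = n_per)

# as.vector() stacks the matrix column by column: all periods of s1,
# then all periods of s2, then all periods of s3.
v <- as.vector(values)

# Assumed multi-period matrix: one block per (constraint, series) pair.
A2 <- kronecker(A1, diag(n_per))
dim(A2)  # 8 constraints x 12 data points

# Left-hand sides from the multi-period form and from the period-by-period
# form (rows = periods, columns = constraints, then stacked by column).
lhs_multi  <- as.vector(A2 %*% v)
lhs_single <- as.vector(values %*% t(A1))
all.equal(lhs_multi, lhs_single)
```

Under this construction the rows of `A2` are ordered by constraint first and by period within each constraint, which matches the order in which a vector assigned to a matrix fills its columns.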
Notes:
- Argument alter_temporal has not been applied yet at this point and altertmp$coefs_ts only contains the coefficients specified in the problem specs data frame (argument problem_specs_df). I.e., altertmp$coefs_ts contains missing (NA) values except for the temporal total alterability coefficients included in (specified with) problem_specs_df. This is done in order to simplify the identification of the first non-missing (non-NA) temporal total alterability coefficient of each complete temporal group (to occur later, when applicable, inside tsbalancing()).
- Argument validation is not performed here; it is (bluntly) assumed that the function is called by tsbalancing() where a thorough validation of the arguments is done.
Value
A list with the elements of the balancing problems (excluding the temporal totals info):
- labels_df: cleaned-up version of the label definition records from problem_specs_df (type is not missing (is not NA)); extra columns:
  - type.lc: tolower(type)
  - row.lc: tolower(row)
  - con.flag: type.lc %in% c("eq", "le", "ge")
- coefs_df: cleaned-up version of the information specification records from problem_specs_df (type is missing (is NA)); extra columns:
  - row.lc: tolower(row)
  - con.flag: labels_df$con.flag allocated through row.lc
- values_ts: reduced version of in_ts with only the relevant series (see vector ser_names)
- lb: lower bound info (type.lc = "lowerbd") for the relevant series; list object with the following elements:
  - coefs_ts: lower bound values for series and period
  - nondated_coefs: vector of nondated lower bounds from problem_specs_df (timeVal is NA)
  - nondated_id_vec: vector of ser_names id's associated to vector nondated_coefs
  - dated_id_vec: vector of ser_names id's associated to dated lower bounds from problem_specs_df (timeVal is not NA)
- ub: lb equivalent for upper bounds (type.lc = "upperbd")
- alter: lb equivalent for period value alterability coefficients (type.lc = "alter")
- altertmp: lb equivalent for temporal total alterability coefficients (type.lc = "altertmp")
- ser_names: vector of the relevant series names (set of series involved in the balancing constraints)
- pos_ser: vector of series names that have only positive nonzero coefficients across all balancing constraints
- neg_ser: vector of series names that have only negative nonzero coefficients across all balancing constraints
- mix_ser: vector of series names that have both positive and negative nonzero coefficients across all balancing constraints
- A1, op1, b1: balancing constraint elements for problems involving a single period (e.g., each period of an incomplete temporal group)
- A2, op2, b2: balancing constraint elements for problems involving temporal_grp_periodicity periods (e.g., the set of periods of a complete temporal group)
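The distinction between pos_ser, neg_ser and mix_ser can be sketched in base R from the signs of a constraint matrix. The matrix `A` below, its column names and the three alterability values are hypothetical; this only illustrates the definitions given above and is not the package's internal code.

```r
# Hypothetical single-period constraint matrix (rows = constraints):
#   a + b - tot = 0   and   c - b = 0
A <- matrix(c(1,  1, 0, -1,
              0, -1, 1,  0),
            nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("a", "b", "c", "tot")))

# Does each series have at least one positive / negative coefficient?
has_pos <- colSums(A > 0) > 0
has_neg <- colSums(A < 0) > 0

pos_ser <- colnames(A)[has_pos & !has_neg]  # only positive coefficients
neg_ser <- colnames(A)[has_neg & !has_pos]  # only negative coefficients
mix_ser <- colnames(A)[has_pos & has_neg]   # both signs

# Hypothetical default alterability coefficients assigned by sign pattern,
# mirroring the roles of arguments alter_pos, alter_neg and alter_mix.
alter_pos <- 1
alter_neg <- 0
alter_mix <- 0.5
alter_default <- ifelse(has_pos & has_neg, alter_mix,
                        ifelse(has_neg, alter_neg, alter_pos))
alter_default
```

Here series `b` enters one constraint with a positive coefficient and another with a negative one, so it falls in the mixed group.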
See Also
tsbalancing()
build_raking_problem()
Examples
######################################################################################
# Indirect series derivation framework with `tsbalancing()` metadata
######################################################################################
#
# It is assumed (agreed) that...
#
# a) All balancing constraints are equality constraints (`type = EQ`).
# b) All constraints have only one nonbinding (free) series: the series to be derived
# (i.e., all series have an alter. coef of 0 except the series to be derived).
# c) Each constraint derives a different (new) series.
# d) Constraints are the same for all periods (i.e., no "dated" alter. coefs
# specified with column `timeVal`).
######################################################################################
# Derive the 5 marginal totals of a 2 x 3 two-dimensional data cube using `tsbalancing()`
# metadata (data cube aggregation constraints respect the above assumptions).
# Build the balancing problem specs through the (simpler) raking metadata.
my_specs <- rkMeta_to_blSpecs(
  data.frame(series = c("A1", "A2", "A3",
                        "B1", "B2", "B3"),
             total1 = c(rep("totA", 3),
                        rep("totB", 3)),
             total2 = rep(c("tot1", "tot2", "tot3"), 2)),
  alterSeries = 0,  # binding (fixed) component series
  alterTotal1 = 1,  # nonbinding (free) marginal totals (to be derived)
  alterTotal2 = 1)  # nonbinding (free) marginal totals (to be derived)
my_specs
# 6 periods (quarters) of data with marginal totals set to zero (0): they MUST exist
# in the input data AND contain valid (non missing) data.
my_ts <- ts(data.frame(A1 = c(12, 10, 12, 9, 15, 7),
                       B1 = c(20, 21, 15, 17, 19, 18),
                       A2 = c(14, 9, 8, 9, 11, 10),
                       B2 = c(20, 29, 20, 24, 21, 17),
                       A3 = c(13, 15, 17, 14, 16, 12),
                       B3 = c(24, 20, 30, 23, 21, 19),
                       tot1 = rep(0, 6),
                       tot2 = rep(0, 6),
                       tot3 = rep(0, 6),
                       totA = rep(0, 6),
                       totB = rep(0, 6)),
            start = 2019, frequency = 4)
# Get the balancing problem elements.
n_per <- nrow(my_ts)
p <- build_balancing_problem(my_ts, my_specs,
                             temporal_grp_periodicity = n_per)
# `A2`, `op2` and `b2` define 30 constraints (5 marginal totals X 6 periods)
# involving a total of 66 time series data points (11 series X 6 periods) of which
# 36 belong to the 6 component series and 30 belong to the 5 marginal totals.
dim(p$A2)
# Get the names of the marginal totals (series with a nonzero alter. coef), in the order
# in which the corresponding constraints appear in the specs (constraints specification
# order).
tmp <- p$coefs_df$col[p$coefs_df$con.flag]
tot_names <- tmp[tmp %in% p$ser_names[p$alter$nondated_id_vec[p$alter$nondated_coefs != 0]]]
# Define logical flags identifying the marginal total columns:
# - `tot_col_logi1`: for single-period elements (of length 11 = number of series)
# - `tot_col_logi2`: for multi-period elements (of length 66 = number of data points),
# in "column-major order" (the `A2` matrix element construction order)
tot_col_logi1 <- p$ser_names %in% tot_names
tot_col_logi2 <- rep(tot_col_logi1, each = n_per)
# Order of the marginal totals to be derived based on
# ... the input data columns ("mts" object `my_ts`)
p$ser_names[tot_col_logi1]
# ... the constraints specification (data frame `my_specs`)
tot_names
# Calculate the 5 marginal totals for all 6 periods
# Note: the following calculation allows for general linear equality constraints, i.e.,
# a) nonzero right-hand side (RHS) constraint values (`b2`) and
# b) nonzero constraint coefs other than 1 for the component series and -1 for
# the derived series.
my_ts[, tot_names] <- {
  (
    # Constraints RHS.
    p$b2 -
      # Sums of the components ("weighted" by the constraint coefficients).
      p$A2[, !tot_col_logi2, drop = FALSE] %*% as.vector(p$values_ts[, !tot_col_logi1])
  ) /
    # Derived series constraint coefficients: `t()` allows for a "row-major order" search
    # in matrix `A2` (i.e., according to the constraints specification order).
    # Note: `diag(p$A2[, tot_col_logi2])` would work if `p$ser_names[tot_col_logi1]` and
    #       `tot_names` were identical (same totals order); however, the following search
    #       in "row-major order" will always work (and is necessary in the current case).
    t(p$A2[, tot_col_logi2])[t(p$A2[, tot_col_logi2]) != 0]
}
my_ts
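As an independent cross-check of the derivation above, the same five marginal totals can be computed directly in base R. This sketch assumes that the aggregation constraints generated from the raking metadata are plain sums (each total equals the sum of its components, with a zero right-hand side); the object names `comp` and `tot` are introduced here for illustration only.

```r
# Component series of the 2 x 3 data cube (same values as in `my_ts`).
comp <- data.frame(A1 = c(12, 10, 12, 9, 15, 7),
                   B1 = c(20, 21, 15, 17, 19, 18),
                   A2 = c(14, 9, 8, 9, 11, 10),
                   B2 = c(20, 29, 20, 24, 21, 17),
                   A3 = c(13, 15, 17, 14, 16, 12),
                   B3 = c(24, 20, 30, 23, 21, 19))

# Marginal totals, following the `total1` and `total2` columns of the
# raking metadata (assumed to be simple sums of the components).
tot <- data.frame(tot1 = comp$A1 + comp$B1,
                  tot2 = comp$A2 + comp$B2,
                  tot3 = comp$A3 + comp$B3,
                  totA = comp$A1 + comp$A2 + comp$A3,
                  totB = comp$B1 + comp$B2 + comp$B3)
tot

# Both sets of marginal totals must add up to the same grand total.
all(tot$tot1 + tot$tot2 + tot$tot3 == tot$totA + tot$totB)
```

If the assumption holds, the columns of `tot` should match the derived columns of `my_ts` for the corresponding series names.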