SuppressLinkedTables {GaussSuppression} | R Documentation |
Consistent Suppression of Linked Tables
Description
Provides alternatives to global protection for linked tables through methods that may reduce the computational burden.
Usage
SuppressLinkedTables(
data = NULL,
fun,
...,
withinArg = NULL,
linkedGauss = "consistent",
recordAware = TRUE,
iterBackTracking = Inf,
whenEmptyUnsuppressed = NULL,
lpPackage = NULL
)
Arguments
data |
The |
fun |
A function: |
... |
Arguments to |
withinArg |
A list of named lists. Arguments to |
linkedGauss |
Specifies the strategy for protecting linked tables. Possible values are:
|
recordAware |
If |
iterBackTracking |
Maximum number of back-tracking iterations. |
whenEmptyUnsuppressed |
Parameter to |
lpPackage |
Currently ignored. If specified, a warning will be issued. |
Details
The reason for introducing the new method "consistent", which has not yet been extensively tested in practice, is to provide something that works better than "back-tracking", while still offering equally strong protection.
Note that for singleton methods of the elimination type (see SSBtools::NumSingleton()), "back-tracking" may lead to the creation of a large number of redundant secondary cells. This is because, during the method's iterations, all secondary cells are eventually treated as primary. As a result, protection is applied to prevent a singleton contributor from inferring a secondary cell that was only included to protect that same contributor.
Note that the frequency singleton methods "subSpace", "anySum0", and "anySumNOTprimary" are currently not implemented and will result in an error. As a result, the singletonZeros parameter in the SuppressDominantCells() function cannot be set to TRUE, and the SuppressKDisclosure() function is not available for use.
Also note that automatic forcing of "anySumNOTprimary" is disabled. That is, SSBtools::GaussSuppression() is called with auto_anySumNOTprimary = FALSE. See the parameter documentation for an explanation of why FALSE is required.
The combination of intervals with the various linked table strategies is not yet implemented, so the lpPackage parameter is currently ignored.
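The behaviour of lpPackage described above can be checked with a minimal sketch. It assumes the GaussSuppression and SSBtools packages are installed and reuses the linked tables from the second example on this page. The value "lpSolve" is chosen only for illustration, since the parameter is documented as ignored, and the variable names msgs and b_lp are hypothetical.

```r
library(GaussSuppression)

# Example data shipped with SSBtools (same setup as the second example on this page)
z1 <- SSBtools::SSBtoolsData("z1")
z2 <- SSBtools::SSBtoolsData("z2")
z2b <- z2[3:5]
names(z2b)[1] <- "region"

# Collect warning messages instead of printing them immediately
msgs <- character(0)

b_lp <- withCallingHandlers(
  SuppressLinkedTables(fun = SuppressSmallCounts,
                       linkedGauss = "consistent",
                       recordAware = FALSE,
                       withinArg = list(
                         list(data = z1, dimVar = 1:2, freqVar = 3, maxN = 5),
                         list(data = z2b, dimVar = 1:2, freqVar = 3, maxN = 5),
                         list(data = z2, dimVar = 1:4, freqVar = 5, maxN = 1)),
                       lpPackage = "lpSolve"),  # documented as ignored, with a warning
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  })

# According to the argument description, at least one warning is expected here
print(msgs)
```

The suppression itself should proceed as if lpPackage had not been supplied, so b_lp can be used like any other result from SuppressLinkedTables().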
Value
A list of data frames, or, if withinArg is NULL, the ordinary output from fun.
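As a minimal sketch of how the returned list might be inspected, the code below assumes that GaussSuppression and SSBtools are installed and that each data frame carries the logical suppressed column that the package's suppression functions normally return. It repeats the setup of the second example on this page so that it runs on its own.

```r
library(GaussSuppression)

# Example data shipped with SSBtools
z1 <- SSBtools::SSBtoolsData("z1")
z2 <- SSBtools::SSBtoolsData("z2")
z2b <- z2[3:5]
names(z2b)[1] <- "region"

b <- SuppressLinkedTables(fun = SuppressSmallCounts,
                          linkedGauss = "consistent",
                          recordAware = FALSE,
                          withinArg = list(
                            list(data = z1, dimVar = 1:2, freqVar = 3, maxN = 5),
                            list(data = z2b, dimVar = 1:2, freqVar = 3, maxN = 5),
                            list(data = z2, dimVar = 1:4, freqVar = 5, maxN = 1)))

# One data frame per element of withinArg
length(b)
vapply(b, is.data.frame, logical(1))

# Number of suppressed cells in each table (assumes a logical 'suppressed' column)
vapply(b, function(tab) sum(tab$suppressed), numeric(1))

# First rows of the first table
head(b[[1]])
```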
Note
Note on differences between SuppressLinkedTables() and alternative approaches. By alternatives, we refer to using the linkedGauss parameter via GaussSuppressionFromData(), its wrappers, or through tables_by_formulas(), as shown in the examples below.
- Alternatives can be used when only the formula parameter varies between the linked tables.
- SuppressLinkedTables() creates several smaller model matrices, which may be combined into a single block-diagonal matrix. A large overall matrix is never created. With the alternatives, a large overall matrix is created first. Smaller matrices are then derived from it. If the size of the full matrix is a bottleneck, SuppressLinkedTables() is the better choice.
- The "global" method is available with the alternatives, but not with SuppressLinkedTables().
- Due to differences in candidate ordering, the two methods may not always produce identical results. With the alternatives, candidate order is constructed globally across all cells (as with the global method). In contrast, SuppressLinkedTables() uses a locally determined candidate order within each table. The ordering across tables is coordinated to ensure the method works, but it is not based on a strictly defined global order. This may lead to some differences.
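The idea of combining several smaller model matrices into one block-diagonal matrix can be illustrated with a small sketch. It assumes the SSBtools and Matrix packages are installed and uses SSBtools::ModelMatrix() and Matrix::bdiag() purely for illustration. It is not a description of how SuppressLinkedTables() builds its matrices internally, and the names x1, x2 and x_block are hypothetical.

```r
library(Matrix)

d <- SSBtools::SSBtoolsData("magnitude1")

# One sparse model matrix per table, built from two of the formulas in the Examples
x1 <- SSBtools::ModelMatrix(d, formula = ~(geo + eu) * sector2)
x2 <- SSBtools::ModelMatrix(d, formula = ~eu:sector4 - 1)

# Block-diagonal combination: each table keeps its own rows and columns
x_block <- Matrix::bdiag(x1, x2)

dim(x1)
dim(x2)
dim(x_block)  # row and column counts are the sums over the blocks
```

Because the off-diagonal blocks are entirely zero, a sparse block-diagonal matrix stores no more non-zero entries than the individual table matrices together.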
Examples
### The first example can be performed in three ways
### Alternatives are possible since only the formula parameter varies between the linked tables
# With trick "sector4 - sector4" and "geo - geo" to ensure same names in output
a <- SuppressLinkedTables(data = SSBtoolsData("magnitude1"),
                          fun = SuppressDominantCells,
                          withinArg = list(list(formula = ~(geo + eu) * sector2 + sector4 - sector4),
                                           list(formula = ~eu:sector4 - 1 + geo - geo),
                                           list(formula = ~geo + eu + sector4 - 1)),
                          dominanceVar = "value",
                          pPercent = 10,
                          contributorVar = "company",
                          linkedGauss = "consistent")
print(a)
# Alternatively, SuppressDominantCells() can be run directly using the linkedGauss parameter
a1 <- SuppressDominantCells(SSBtoolsData("magnitude1"),
                            formula = list(table_1 = ~(geo + eu) * sector2,
                                           table_2 = ~eu:sector4 - 1,
                                           table_3 = ~(geo + eu) + sector4 - 1),
                            dominanceVar = "value",
                            pPercent = 10,
                            contributorVar = "company",
                            linkedGauss = "consistent")
print(a1)
# In fact, tables_by_formulas() is also a possibility
a2 <- tables_by_formulas(SSBtoolsData("magnitude1"),
                         table_fun = SuppressDominantCells,
                         table_formulas = list(table_1 = ~region * sector2,
                                               table_2 = ~region1:sector4 - 1,
                                               table_3 = ~region + sector4 - 1),
                         substitute_vars = list(region = c("geo", "eu"), region1 = "eu"),
                         collapse_vars = list(sector = c("sector2", "sector4")),
                         dominanceVar = "value",
                         pPercent = 10,
                         contributorVar = "company",
                         linkedGauss = "consistent")
print(a2)
#### The second example cannot be handled using the alternative methods.
#### This is similar to the (old) LazyLinkedTables() example.
z1 <- SSBtoolsData("z1")
z2 <- SSBtoolsData("z2")
z2b <- z2[3:5] # As in ChainedSuppression example
names(z2b)[1] <- "region"
# As 'f' and 'e' in ChainedSuppression example.
# 'A' 'annet'/'arbeid' suppressed in b[[1]], since suppressed in b[[3]].
b <- SuppressLinkedTables(fun = SuppressSmallCounts,
                          linkedGauss = "consistent",
                          recordAware = FALSE,
                          withinArg = list(
                            list(data = z1, dimVar = 1:2, freqVar = 3, maxN = 5),
                            list(data = z2b, dimVar = 1:2, freqVar = 3, maxN = 5),
                            list(data = z2, dimVar = 1:4, freqVar = 5, maxN = 1)))
print(b)