chooseGroupNames {wrMisc} | R Documentation |
Choose Column Most Likely For Sample-Names
Description
This function examines all columns of mat
to determine which columns are likely choices for sample-names, and then derives group-names after stripping terminal enumerators.
Ideally, sample-names contain replicates indicated by terminal enumerators.
Usage
chooseGroupNames(
mat,
useCoNa = NULL,
method = "median",
sep = c("_", "-", " ", ".", "=", ";"),
rmTxt = NULL,
asUnique = TRUE,
partEnumerator = FALSE,
fullReport = FALSE,
silent = FALSE,
debug = FALSE,
callFrom = NULL
)
Arguments
mat |
(matrix or data.frame) contains possible choices for sample-names |
useCoNa |
(character) optional custom choice for columns of |
method |
(character) how to choose the number of groups: min, low, med, high, max or mode
Note arguments |
sep |
(character) separators considered when searching and removing common words |
rmTxt |
(character, length=1) optional removal of custom text (e.g. variable file-extensions); no obligation that |
asUnique |
(logical) requires all (potential) sample-names to be unique (i.e. no repeats) to be considered for group-names; also removes all candidate columns with all different names |
partEnumerator |
(logical) when |
fullReport |
(logical) if |
silent |
(logical) suppress messages if |
debug |
(logical) additional messages for debugging |
callFrom |
(character) allows easier tracking of messages produced |
Details
The basic idea is that the column containing (good) sample-names has all-different entries, and that stripping terminal enumerators reveals the grouping of replicates.
Note that the arguments asUnique and partEnumerator influence which columns of mat will be evaluated/checked.
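The idea described above can be sketched in base R. This is only an illustration under simplifying assumptions, not the implementation used in wrMisc: the helper stripEnumerator, the example matrix and the regular expression are hypothetical, and an enumerator is assumed to be a trailing run of digits with at most one separator in front of it.

```r
# Base-R sketch only; NOT the wrMisc implementation.
# Hypothetical example matrix with three candidate columns
mat <- cbind(id = letters[1:6],
             smp = paste(rep(c("ctrl", "treat"), each = 3), 1:3, sep = "_"),
             const = rep("x", 6))

# Hypothetical helper: remove a trailing run of digits and one optional
# separator in front of it (the hyphen is last in the class, so it is literal)
stripEnumerator <- function(x) sub("[_ .=;-]?[0-9]+$", "", x)

# A good sample-name column has all-different entries ...
allDifferent <- apply(mat, 2, function(x) !any(duplicated(x)))
# ... and collapses to fewer groups once the enumerators are stripped
nGroups <- apply(mat, 2, function(x) length(unique(stripEnumerator(x))))
candidates <- names(which(allDifferent & nGroups < nrow(mat)))

# Group-names, carrying the original sample-names as names of the entries
groupNames <- stripEnumerator(mat[, candidates[1]])
names(groupNames) <- mat[, candidates[1]]
groupNames
```

In this sketch only column smp qualifies: column id has all-different entries but no enumerators to strip, and column const is not unique.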
Value
This function returns a character vector with group-names (and sample-names as names of entries) or,
if fullReport=TRUE,
a list with $group, $sampleNames, $col (index of column from mat
and name of column)
See Also
rmSharedWords, replicateStructure, protectSpecChar
Examples
mat <- cbind(a=letters[1:6], b=paste(rep(c("b","B"), each=3), 1:3), c=rep(1,6),
d=gl(3,2), e=rep(c("e","E"),3), f=paste(rep(c("F","f","ff"), each=2), 1:2))
chooseGroupNames(mat, method="median") # col 2 (b/B)
chooseGroupNames(mat, method="median", fullReport=TRUE)
chooseGroupNames(mat, method="min") # col 2 (b/B)
chooseGroupNames(mat, method="max") # col 6 (F/f/ff)
chooseGroupNames(mat, method="max", asUnique=FALSE) # col 1 (a..)