detect.outliers {OutSeekR} | R Documentation |
Detect outliers
Description
Detect outliers in normalized RNA-seq data.
Usage
detect.outliers(
    data,
    num.null = 1000,
    initial.screen.method = c("fdr", "p.value"),
    p.value.threshold = 0.05,
    fdr.threshold = 0.01,
    kmeans.nstart = 1
    )
Arguments
data |
A matrix or data frame of normalized RNA-seq data, organized with transcripts on rows and samples on columns. Transcript identifiers should be stored as |
num.null |
The number of transcripts to generate when simulating from null distributions; default is 1000. We recommend using at least 10,000 iterations for publication-level results, with 100,000 or even one million iterations providing more robust estimates. |
initial.screen.method |
The statistical criterion for initial gene selection; valid options are "fdr" and "p.value", spelled as shown in Usage. |
p.value.threshold |
The p-value threshold for the outlier test; default is 0.05. Once the p-value for a sample exceeds |
fdr.threshold |
The false discovery rate (FDR)-adjusted p-value threshold for determining the final count of outliers; default is 0.01. |
kmeans.nstart |
The number of random starts when computing k-means fraction; default is 1. See |
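To make the difference between p.value.threshold and fdr.threshold concrete, the sketch below adjusts a handful of made-up p-values with base R's p.adjust() and compares both sets against the documented defaults. It assumes a Benjamini-Hochberg adjustment and a strict "less than" comparison; neither is confirmed here to be what detect.outliers does internally, and the p-values are illustrative rather than package output.

```r
# Hypothetical unadjusted p-values for five transcripts (illustrative only).
p.values <- c(0.001, 0.002, 0.03, 0.2, 0.6);

# Benjamini-Hochberg adjustment from base R. This is an assumption, not
# necessarily the adjustment that detect.outliers applies.
fdr <- p.adjust(p.values, method = 'fdr');

# Compare against the documented default thresholds; the strict '<' is
# also an assumption.
passes.p.value <- p.values < 0.05;
passes.fdr <- fdr < 0.01;

print(data.frame(p.values, fdr, passes.p.value, passes.fdr));
```

With these values, more transcripts pass the unadjusted threshold than the FDR threshold, which is why the two criteria can select different sets.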
Value
A list consisting of the following entries:
- p.values: a matrix of unadjusted p-values for the outlier test run on each transcript in data.
- fdr: a matrix of FDR-adjusted p-values for the outlier test run on each transcript in data.
- num.outliers: a vector giving the number of outliers detected for each transcript based on fdr.threshold.
- outlier.test.results.list: a list of length max(num.outliers) + 1 containing entries roundN, where N is between 1 and max(num.outliers) + 1. roundN is the data frame of results for the outlier test after excluding the (N-1)th outlier sample, with round1 being for the original data set (i.e., before excluding any outlier samples).
- distributions: a numeric vector indicating the optimal distribution for each transcript. Possible values are 1 (normal), 2 (log-normal), 3 (exponential), and 4 (gamma).
- initial.screen.method: the statistical criterion used for initial feature selection, either "fdr" or "p.value".
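The integer codes in the distributions entry can be turned into readable labels with base R. The sketch below uses a hypothetical vector in place of real output; the transcript names T1 to T5 and the assigned codes are invented for illustration, and only the code-to-label mapping comes from the description above.

```r
# Hypothetical 'distributions' vector; names and values are made up.
distributions <- c(T1 = 1, T2 = 4, T3 = 2, T4 = 2, T5 = 3);

# Map the documented codes (1 to 4) to their distribution names.
distribution.labels <- c('normal', 'log-normal', 'exponential', 'gamma');
distribution.names <- factor(
    distributions,
    levels = 1:4,
    labels = distribution.labels
    );

# Count how many transcripts were assigned each distribution.
distribution.counts <- table(distribution.names);
print(distribution.counts);
```

Using a factor with all four levels keeps distributions that no transcript was assigned in the table, with a count of zero.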
Examples
data(outliers);
outliers.subset <- outliers[1:10,];
results <- detect.outliers(
    data = outliers.subset,
    num.null = 10
    );