Bloodplatelet {EBcoBART}    R Documentation

Bloodplatelet

Description

Contains non-standardized messenger-RNA (mRNA) expression measurements, derived from blood platelets, which are used to classify breast cancer versus non-small-cell lung cancer patients. Co-data is available for the 500 mRNA variables. It consists of estimated p-values (on the -logit scale) for all 500 mRNA variables from three different classification tasks: 1) colorectal cancer vs. control patients, 2) pancreas cancer vs. control patients, and 3) pancreas cancer vs. colorectal cancer. The co-data is therefore informative if the different cancer classification tasks share important mRNA variables. See Novianti and others (2017) doi:10.1093/bioinformatics/btw837 for details on the complete data set from which these data are derived.

Usage

data(Bloodplatelet)
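
A minimal sketch of loading the data and checking that its components line up. The components Xtrain, Y, and CoData are the ones documented under Format. The helper `check_alignment` is hypothetical and not part of EBcoBART. The loading step assumes the EBcoBART package is installed and is skipped otherwise.

```r
# Hypothetical helper (not part of EBcoBART): checks that samples in Xtrain
# match the length of Y, and that variables in Xtrain match the rows of CoData.
check_alignment <- function(dat) {
  c(
    samples_match   = nrow(dat$Xtrain) == length(dat$Y),
    variables_match = ncol(dat$Xtrain) == nrow(dat$CoData)
  )
}

# Load and inspect the data only if the package is available.
if (requireNamespace("EBcoBART", quietly = TRUE)) {
  data("Bloodplatelet", package = "EBcoBART")
  str(Bloodplatelet, max.level = 1)        # top-level structure of the list
  print(check_alignment(Bloodplatelet))    # named logical vector
}
```

Running the check before model fitting is a cheap way to confirm that each co-data row corresponds to one explanatory variable.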

Format

A list object with three objects:

Xtrain

Data frame with 101 rows (samples) and 500 columns (variables). Explanatory variables used for fitting BART. Variable names are present.

Y

Numeric of length 100. Binary training response (0: breast cancer, 1: non-small-cell lung cancer).

CoData

Matrix with 500 rows and 4 columns. Auxiliary information on the 500 variables. Contains, for each variable, estimated p-values from three different classification tasks. P-values are -logit transformed. An intercept column is included in the co-data matrix.
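
To illustrate the layout described for CoData, the sketch below builds a small matrix of the same shape: an intercept column plus three columns of -logit transformed p-values. The p-values are randomly generated for illustration only, and the names `neg_logit`, `pvals`, `codata_sketch`, the column labels, and the position of the intercept column are assumptions, not taken from the package.

```r
# -logit transform: -log(p / (1 - p)).
# Small p-values map to large positive values, p = 0.5 maps to 0.
neg_logit <- function(p) -log(p / (1 - p))

# Hypothetical p-values for 5 variables and three classification tasks
# (random numbers, not real results).
set.seed(1)
pvals <- matrix(
  runif(15, min = 0.001, max = 0.999),
  nrow = 5, ncol = 3,
  dimnames = list(paste0("gene", 1:5), c("task1", "task2", "task3"))
)

# Co-data matrix: one row per variable, an intercept column of ones
# followed by the transformed p-values.
codata_sketch <- cbind(Intercept = 1, neg_logit(pvals))
dim(codata_sketch)
```

On this scale, a variable that was strongly associated with the outcome in another classification task receives a large co-data value.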

Author(s)

Jeroen M. Goedhart, j.m.goedhart@amsterdamumc.nl

Mark A. van de Wiel

References

P. W. Novianti, B. C. Snoek, S. Wilting, and M. A. van de Wiel (2017). Better diagnostic signatures from RNAseq data through use of auxiliary co-data. Bioinformatics, Vol. 33, No. 10, pp. 1572-1574.


[Package EBcoBART version 1.1.1 Index]