riboflavin {FPCdpca} | R Documentation
Riboflavin Production Data
Description
This dataset contains measurements of riboflavin (vitamin B2) production by Bacillus subtilis, a Gram-positive bacterium commonly used in industrial fermentation processes. The dataset includes n = 71 observations with p = 4088 predictors, representing the logarithm of the expression levels of 4088 genes. The response variable is the log-transformed riboflavin production rate.
Usage
data(riboflavin)
Format
- y
Log-transformed riboflavin production rate (original name: q_RIBFLV). This is a continuous variable indicating the efficiency of riboflavin production by the bacterial strain.
- x
A matrix of dimension 71 × 4088 containing the logarithm of the expression levels of 4088 genes. Each column corresponds to a gene, and each row corresponds to an observation (experimental condition or time point).
Details
The riboflavin dataset is a high-dimensional dataset commonly used in statistical research, particularly in the fields of bioinformatics and systems biology. It was originally collected to study the genetic regulation of riboflavin biosynthesis in Bacillus subtilis. The data were generated using DNA microarray technology to measure gene expression levels under various experimental conditions.
Note
The dataset is provided by DSM Nutritional Products Ltd., a leading company in the field of nutritional ingredients. The data have been preprocessed and normalized to account for technical variations in the microarray measurements.
Source
DSM Nutritional Products Ltd., Basel, Switzerland.
References
Bühlmann, P., Kalisch, M., & Meier, L. (2014). 'High-dimensional statistics with a view toward applications in biology.' Annual Review of Statistics and Its Application, 1, 255–278.
DSM Nutritional Products Ltd. (2005). 'Genome-scale analysis of Bacillus subtilis riboflavin production.' Internal Report.
Examples
# Load the riboflavin dataset
data(riboflavin)
# Display the dimensions of the dataset
print(dim(riboflavin$x))
print(length(riboflavin$y))
# Summary statistics for the response variable
summary(riboflavin$y)
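
Because p is much larger than n, a common first look at data of this shape is to rank the predictors by their marginal correlation with the response. The following is a minimal sketch of that idea, not a method provided by this package: `top_genes` is a hypothetical helper introduced here for illustration, and the code assumes that `riboflavin$x` can be coerced to a numeric matrix as described under Format.

```r
# Hypothetical helper (not part of FPCdpca): rank the columns of x by the
# absolute value of their Pearson correlation with y and return the top k.
top_genes <- function(x, y, k = 10) {
  x <- as.matrix(x)
  stopifnot(length(y) == nrow(x))
  r <- as.vector(cor(x, y))  # one correlation per column of x
  ord <- order(abs(r), decreasing = TRUE)[seq_len(min(k, ncol(x)))]
  data.frame(column = ord, correlation = r[ord])
}

# Apply it to the riboflavin data only if the package is installed
if (requireNamespace("FPCdpca", quietly = TRUE)) {
  data(riboflavin, package = "FPCdpca")
  print(top_genes(riboflavin$x, riboflavin$y, k = 5))
}
```

Marginal screening of this kind looks at each gene in isolation, so it is best read as an exploratory summary rather than a selection procedure.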