DataFrame-class {S4Vectors} | R Documentation
DataFrame objects
Description
The DataFrame class extends the RectangularData virtual class and supports the storage of any type of object (with length and [ methods) as columns.
Details
On the whole, DataFrame behaves very similarly to data.frame in terms of construction, subsetting, splitting, combining, etc. The most notable exceptions have to do with the handling of the row names:
- The row names are optional. This means calling rownames(x) will return NULL if there are no row names. Of course, it could return seq_len(nrow(x)), but returning NULL informs, for example, combination functions that no row names are desired (they are often a luxury when dealing with large data).
- The row names are not required to be unique.
- Subsetting by row names does not use partial matching.
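The row-name rules above can be sketched as follows. This is a minimal illustration, assuming S4Vectors is installed and recent enough to accept repeated row names as described; the object names df and dup are made up for the example.

```r
library(S4Vectors)

# Without explicit row names, none are stored.
df <- DataFrame(a = 1:3, b = letters[1:3])
is.null(rownames(df))

# Row names can be supplied and, per the text above, may repeat.
dup <- DataFrame(a = 1:3, row.names = c("r", "r", "s"))
rownames(dup)
```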
As DataFrame derives from Vector, it is possible to set an annotation string. Also, another DataFrame can hold metadata on the columns.
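As a rough sketch of the two kinds of annotation, the snippet below uses the metadata() and mcols() accessors that S4Vectors provides for Vector derivatives. Treating the metadata() list as the place where an annotation string would live is an assumption made for this example, and the column names and values are invented.

```r
library(S4Vectors)

df <- DataFrame(height = c(170, 182), weight = c(65, 80))

# Object-level annotation (assumed here to go into the metadata list).
metadata(df) <- list(note = "toy measurements")
metadata(df)$note

# Column-level metadata: a DataFrame with one row per column of df.
mcols(df) <- DataFrame(unit = c("cm", "kg"))
mcols(df)$unit
```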
For a class to be supported as a column, it must have length and [ methods, where [ supports subsetting only by i and respects drop=FALSE. Optionally, a method may be defined for the showAsCell generic, which should return a vector of the same length as the subset of the column passed to it. This vector is then placed into a data.frame and converted to text with format. Thus, each element of the vector should be some simple, usually character, representation of the corresponding element in the column.
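To illustrate a non-basic column type, the sketch below stores an Rle (a class from S4Vectors that already has length and [ methods) as a column. It does not define a new class or a new showAsCell method; it only calls the existing generic on the column, and the variable names are hypothetical.

```r
library(S4Vectors)

# An Rle has length() and `[` methods, so it can be stored as a column as is.
chrom <- Rle(c("chr1", "chr2"), c(2L, 1L))
df <- DataFrame(pos = c(10L, 20L, 30L), chrom = chrom)

class(df$chrom)        # the column is kept as an Rle

# Subsetting rows subsets every column, including the Rle.
sub <- df[2:3, ]
length(sub$chrom)

# showAsCell() yields one display value per element of the column.
cells <- showAsCell(df$chrom)
length(cells)
```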
Constructor
DataFrame(..., row.names = NULL, check.names = TRUE, stringsAsFactors):
Constructs a DataFrame in similar fashion to data.frame. Each argument in ... is coerced to a DataFrame and combined column-wise. The row names should be given in row.names; otherwise, they are inherited from the arguments, as in data.frame. Explicitly passing NULL to row.names ensures that there are no row names. If check.names is TRUE, the column names will be checked for syntactic validity and made unique, if necessary.
To store an object of a class that does not support coercion to DataFrame, wrap it in I(). The class must still have methods for length and [.
The stringsAsFactors argument is ignored. The coercion of column arguments to DataFrame determines whether strings become factors.
make_zero_col_DFrame(nrow):
Constructs a zero-column DFrame object with nrow rows. Intended for developers to use in other packages and typically not needed by the end user.
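The constructor arguments described above might be exercised like this. It is a small sketch with invented column names, assuming S4Vectors is attached; no output is shown because the exact repaired names depend on the name-checking rules.

```r
library(S4Vectors)

# check.names = TRUE (the default) repairs non-syntactic and duplicated names.
checked <- DataFrame(`my col` = 1:2, `my col` = 3:4)
colnames(checked)

# check.names = FALSE keeps the names as given.
raw <- DataFrame(`my col` = 1:2, check.names = FALSE)
colnames(raw)

# I() stores a matrix as one column instead of splitting it into its columns.
wrapped <- DataFrame(m = I(matrix(1:4, nrow = 2)))
ncol(wrapped)

# A DFrame with rows but no columns.
empty <- make_zero_col_DFrame(3)
dim(empty)
```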
Accessors
In the following code snippets, x is a DataFrame.
dim(x):
Get the length-two integer vector whose first and second elements are the number of rows and columns, respectively.
dimnames(x), dimnames(x) <- value:
Get and set the two-element list containing the row names (character vector of length nrow(x) or NULL) and the column names (character vector of length ncol(x)).
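A short sketch of these accessors on a toy object (the names df, r1, x and so on are made up for the example):

```r
library(S4Vectors)

df <- DataFrame(a = 1:3, b = letters[1:3])

dim(df)        # number of rows, then number of columns
dimnames(df)   # row names (NULL here) and column names

# Set row and column names in one step.
dimnames(df) <- list(c("r1", "r2", "r3"), c("x", "y"))
rownames(df)
colnames(df)
```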
Coercion
as(from, "DataFrame"):
By default, constructs a new DataFrame with from as its only column. If from is a matrix or data.frame, all of its columns become columns in the new DataFrame. If from is a list, each element becomes a column, recycling as necessary. Note that for the DataFrame to behave correctly, each column object must support element-wise subsetting via the [ method and return the number of elements with length. It is recommended to use the DataFrame constructor, rather than this interface.
as.list(x):
Coerces x, a DataFrame, to a list.
as.data.frame(x, row.names=NULL, optional=FALSE):
Coerces x, a DataFrame, to a data.frame. Each column is coerced to a data.frame and then column bound together. If row.names is NULL, the row names are retrieved from x, if it has any. Otherwise, they are inferred by the data.frame constructor.
NOTE: conversion of x to a data.frame is not supported if x contains any list, SimpleList, or CompressedList columns.
as(from, "data.frame"):
Coerces a DataFrame to a data.frame by calling as.data.frame(from).
as.matrix(x):
Coerces the DataFrame to a matrix, if possible.
as.env(x, enclos = parent.frame()):
Creates an environment from x with a symbol for each column name in colnames(x). The values are not actually copied into the environment. Rather, they are dynamically bound using makeActiveBinding. This prevents unnecessary copying of the data from the external vectors into R vectors. The values are cached, so that the data is not copied every time the symbol is accessed.
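The coercions listed above can be tried on small inputs as in the following sketch. The inputs are invented, and the snippet only shows that each call produces an object of the expected kind; it does not show how the package implements them.

```r
library(S4Vectors)

# list -> DataFrame: each list element becomes a column.
df <- as(list(a = 1:2, b = c("x", "y")), "DataFrame")

# DataFrame -> data.frame and list.
plain <- as.data.frame(df)
is.data.frame(plain)
lst <- as.list(df)
names(lst)

# DataFrame -> matrix (possible here because both columns are integer).
m <- as.matrix(DataFrame(a = 1:2, b = 3:4))
dim(m)

# DataFrame -> environment with one symbol per column.
env <- as.env(df)
get("a", envir = env)
```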
Subsetting
In the following code snippets, x is a DataFrame.
x[i,j,drop]:
Behaves very similarly to the [.data.frame method, except i can be a logical Rle object and subsetting by matrix indices is not supported. Indices containing NAs are also not supported.
x[i,j] <- value:
Behaves very similarly to the [<-.data.frame method.
x[[i]]:
Behaves very similarly to the [[.data.frame method, except arguments j and exact are not supported. Column name matching is always exact. Subsetting by matrices is not supported.
x[[i]] <- value:
Behaves very similarly to the [[<-.data.frame method, except argument j is not supported.
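Two of the differences from data.frame noted above, a logical Rle as row index and exact column-name matching in [[, can be sketched as follows (toy data, hypothetical names):

```r
library(S4Vectors)

df <- DataFrame(score = c(1L, 3L, 5L), label = c("a", "b", "c"))

# A logical Rle can be used as the row index.
keep <- Rle(c(TRUE, FALSE), c(2L, 1L))
first_two <- df[keep, ]
nrow(first_two)

# Column names are matched exactly by [[ ]].
df[["score"]]
df[["sco"]]     # no partial matching, so nothing is found

# drop = FALSE keeps a one-column DataFrame.
one_col <- df[, "label", drop = FALSE]
class(one_col)
```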
Displaying
The show() method for DataFrame objects obeys global options showHeadLines and showTailLines for controlling the number of head and tail rows to display. See ?get_showHeadLines for more information.
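As a small sketch, the two options can be set temporarily around a call to show(). The values 2 and 1 and the object name big are arbitrary choices for the example.

```r
library(S4Vectors)

big <- DataFrame(i = 1:100, sq = (1:100)^2)

# Show only a couple of head rows and one tail row, then restore the options.
old <- options(showHeadLines = 2, showTailLines = 1)
show(big)
options(old)
```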
Author(s)
Michael Lawrence
See Also
- DataFrame-combine for combining DataFrame objects.
- DataFrame-utils for other common operations on DataFrame objects.
- TransposedDataFrame objects.
- RectangularData and SimpleList, which DataFrame extends directly.
- get_showHeadLines for controlling the number of DataFrame rows to display.
Examples
score <- c(1L, 3L, NA)
counts <- c(10L, 2L, NA)
row.names <- c("one", "two", "three")
df <- DataFrame(score) # single column
df[["score"]]
df <- DataFrame(score, row.names = row.names) # with row names
rownames(df)
df <- DataFrame(vals = score) # explicit naming
df[["vals"]]
# arrays
ary <- array(1:4, c(2,1,2))
sw <- DataFrame(I(ary))
# a data.frame
sw <- DataFrame(swiss)
as.data.frame(sw) # swiss, without row names
# now with row names
sw <- DataFrame(swiss, row.names = rownames(swiss))
as.data.frame(sw) # swiss
# subsetting
sw[] # identity subset
sw[,] # same
sw[NULL] # no columns
sw[,NULL] # no columns
sw[NULL,] # no rows
## select columns
sw[1:3]
sw[,1:3] # same as above
sw[,"Fertility"]
sw[,c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)]
## select rows and columns
sw[4:5, 1:3]
sw[1] # one-column DataFrame
## the same
sw[, 1, drop = FALSE]
sw[, 1] # a (unnamed) vector
sw[[1]] # the same
sw[["Fertility"]]
sw[["Fert"]] # should return 'NULL'
sw[1,] # a one-row DataFrame
sw[1,, drop=TRUE] # a list
## duplicate row, unique row names are created
sw[c(1, 1:2),]
## indexing by row names
sw["Courtelary",]
subsw <- sw[1:5,1:4]
subsw["C",] # no partial match (unlike with data.frame)
## row and column names
cn <- paste("X", seq_len(ncol(swiss)), sep = ".")
colnames(sw) <- cn
colnames(sw)
rn <- seq(nrow(sw))
rownames(sw) <- rn
rownames(sw)
## column replacement
df[["counts"]] <- counts
df[["counts"]]
df[[3]] <- score
df[["X"]]
df[[3]] <- NULL # deletion