seqinfo {GenomeInfoDb} | R Documentation |
Accessing/modifying sequence information
Description
A set of generic functions for getting/setting/modifying the sequence information stored in an object.
Usage
seqinfo(x)
seqinfo(x,
new2old=NULL,
pruning.mode=c("error", "coarse", "fine", "tidy")) <- value
seqnames(x)
seqnames(x) <- value
seqlevels(x)
seqlevels(x,
pruning.mode=c("error", "coarse", "fine", "tidy")) <- value
sortSeqlevels(x, X.is.sexchrom=NA)
seqlevelsInUse(x)
seqlevels0(x)
seqlengths(x)
seqlengths(x) <- value
isCircular(x)
isCircular(x) <- value
genome(x)
genome(x) <- value
Arguments
x |
Any object containing sequence information i.e. with a |
new2old |
The
If Note that most of the times it's easier to proceed in 2 steps:
This 2-step approach will typically look like this: seqlevels(x) <- seqlevels(value) # align seqlevels seqinfo(x) <- seqinfo(value) # guaranteed to work Or, if seqlevels(x, pruning.mode="coarse") <- seqlevels(value) seqinfo(x) <- seqinfo(value) # guaranteed to work The |
pruning.mode |
When some of the seqlevels to drop from
See the "B. DROP SEQLEVELS FROM A LIST-LIKE OBJECT" section in the examples below for an extensive illustration of these pruning modes. |
value |
Typically a Seqinfo object for the Either a named or unnamed character vector for the A vector containing the sequence information to store for the other setters. |
X.is.sexchrom |
A logical indicating whether X refers to the sexual chromosome
or to chromosome with Roman Numeral X. If |
It all revolves around Seqinfo objects
The Seqinfo class plays a central role for the functions described in this man page because:
All these functions (except
seqinfo
,seqlevelsInUse
, andseqlevels0
) work on a Seqinfo object.For classes that implement it, the
seqinfo
getter should return a Seqinfo object.Default
seqlevels
,seqlengths
,isCircular
, andgenome
getters and setters are provided. By default,seqlevels(x)
doesseqlevels(seqinfo(x))
,seqlengths(x)
doesseqlengths(seqinfo(x))
,isCircular(x)
doesisCircular(seqinfo(x))
, andgenome(x)
doesgenome(seqinfo(x))
. So any class with aseqinfo
getter will have all the above getters work out-of-the-box. If, in addition, the class defines aseqinfo
setter, then all the corresponding setters will also work out-of-the-box.Examples of containers that have a
seqinfo
getter and setter:the GRanges and GRangesList classes in the GenomicRanges package;
the SummarizedExperiment class in the SummarizedExperiment package;
the GAlignments, GAlignmentPairs, and GAlignmentsList classes in the GenomicAlignments package;
the TxDb class in the GenomicFeatures package;
the BSgenome class in the BSgenome package;
and more...
Note
The full list of methods defined for a given generic function can be seen
with e.g. showMethods("seqinfo")
or showMethods("seqnames")
(for the getters), and showMethods("seqinfo<-")
or
showMethods("seqnames<-")
(for the setters a.k.a.
replacement methods). Please be aware that this shows only methods
defined in packages that are currently attached.
The GenomicRanges package defines seqinfo
and seqinfo<-
methods for these low-level data types: List and
IntegerRangesList. Those objects do not have the means
to formally store sequence information. Thus, the wrappers simply store
the Seqinfo object within metadata(x)
. Initially, the
metadata is empty, so there is some effort to generate a reasonable
default Seqinfo. The names of any List are
taken as the seqnames
, and the universe
of
IntegerRangesList is taken as the genome
.
Author(s)
H. Pagès
See Also
The seqlevelsStyle generic getter and setter for conveniently renaming the seqlevels of an object according to a given naming convention (e.g. NCBI or UCSC).
-
Seqinfo objects.
-
GRanges and GRangesList objects in the GenomicRanges package.
-
SummarizedExperiment objects in the SummarizedExperiment package.
-
GAlignments, GAlignmentPairs, and GAlignmentsList objects in the GenomicAlignments package.
-
TxDb objects in the GenomicFeatures package.
-
BSgenome objects in the BSgenome package.
-
seqlevels-wrappers for convenience wrappers to the
seqlevels
getter and setter. -
rankSeqlevels
, on whichsortSeqlevels
is based.
Examples
## ---------------------------------------------------------------------
## A. BASIC USAGE OF THE seqlevels() GETTER AND SETTER
## ---------------------------------------------------------------------
## Operations between 2 or more objects containing genomic ranges (e.g.
## finding overlaps, comparing, or matching) only make sense if the
## operands have the same seqlevels. So before performing such
## operations, it is often necessary to adjust the seqlevels in
## the operands so that they all have the same seqlevels. This is
## typically done with the seqlevels() setter. The setter can be used
## to rename, drop, add and/or reorder seqlevels of an object. The
## examples below show how to mofify the seqlevels of a GRanges object
## but the same would apply to any object containing sequence
## information (i.e. with a seqinfo() component).
library(GenomicRanges)
gr <- GRanges(rep(c("chr2", "chr3", "chrM"), 2), IRanges(1:6, 10))
## Add new seqlevels:
seqlevels(gr) <- c("chr1", seqlevels(gr), "chr4")
seqlevels(gr)
seqlevelsInUse(gr)
## Reorder existing seqlevels:
seqlevels(gr) <- rev(seqlevels(gr))
seqlevels(gr)
## Drop all unused seqlevels:
seqlevels(gr) <- seqlevelsInUse(gr)
## Drop some seqlevels in use:
seqlevels(gr, pruning.mode="coarse") <- setdiff(seqlevels(gr), "chr3")
gr
## Rename, add, and reorder the seqlevels all at once:
seqlevels(gr) <- c("chr1", chr2="chr2", chrM="Mitochondrion")
seqlevels(gr)
## ---------------------------------------------------------------------
## B. DROP SEQLEVELS FROM A LIST-LIKE OBJECT
## ---------------------------------------------------------------------
grl0 <- GRangesList(A=GRanges("chr2", IRanges(3:2, 5)),
B=GRanges(c("chr2", "chrMT"), IRanges(7:6, 15)),
C=GRanges(c("chrY", "chrMT"), IRanges(17:16, 25)),
D=GRanges())
grl0
grl1 <- grl0
seqlevels(grl1, pruning.mode="coarse") <- c("chr2", "chr5")
grl1 # grl0[[2]] was fully removed! (even if it had a range on chr2)
## If what is desired is to remove the 2nd range in grl0[[2]] only (i.e.
## the chrMT:6-15 range), or, more generally speaking, to remove the
## ranges within each list element that are located on the seqlevels to
## drop, then use pruning.mode="fine" or pruning.mode="tidy":
grl2 <- grl0
seqlevels(grl2, pruning.mode="fine") <- c("chr2", "chr5")
grl2 # grl0[[2]] not removed, but chrMT:6-15 range removed from it
## Like pruning.mode="fine" but also removes grl0[[3]].
grl3 <- grl0
seqlevels(grl3, pruning.mode="tidy") <- c("chr2", "chr5")
grl3
library(TxDb.Dmelanogaster.UCSC.dm3.ensGene)
txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene
## Pruning mode "coarse" is particularly well suited on a GRangesList
## object that contains exons grouped by transcript:
ex_by_tx <- exonsBy(txdb, by="tx")
seqlevels(ex_by_tx)
seqlevels(ex_by_tx, pruning.mode="coarse") <- "chr2L"
seqlevels(ex_by_tx)
## Pruning mode "tidy" is particularly well suited on a GRangesList
## object that contains transcripts grouped by gene:
tx_by_gene <- transcriptsBy(txdb, by="gene")
seqlevels(tx_by_gene)
seqlevels(tx_by_gene, pruning.mode="tidy") <- "chr2L"
seqlevels(tx_by_gene)
## ---------------------------------------------------------------------
## C. RENAME THE SEQLEVELS OF A TxDb OBJECT
## ---------------------------------------------------------------------
library(TxDb.Dmelanogaster.UCSC.dm3.ensGene)
txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene
seqlevels(txdb)
seqlevels(txdb) <- sub("chr", "", seqlevels(txdb))
seqlevels(txdb)
seqlevels(txdb) <- paste0("CH", seqlevels(txdb))
seqlevels(txdb)
seqlevels(txdb)[seqlevels(txdb) == "CHM"] <- "M"
seqlevels(txdb)
## Restore original seqlevels:
seqlevels(txdb) <- seqlevels0(txdb)
seqlevels(txdb)
## ---------------------------------------------------------------------
## D. SORT SEQLEVELS IN "NATURAL" ORDER
## ---------------------------------------------------------------------
sortSeqlevels(c("11", "Y", "1", "10", "9", "M", "2"))
seqlevels <- c("chrXI", "chrY", "chrI", "chrX", "chrIX", "chrM", "chrII")
sortSeqlevels(seqlevels)
sortSeqlevels(seqlevels, X.is.sexchrom=TRUE)
sortSeqlevels(seqlevels, X.is.sexchrom=FALSE)
seqlevels <- c("chr2RHet", "chr4", "chrUextra", "chrYHet",
"chrM", "chrXHet", "chr2LHet", "chrU",
"chr3L", "chr3R", "chr2R", "chrX")
sortSeqlevels(seqlevels)
gr <- GRanges()
seqlevels(gr) <- seqlevels
sortSeqlevels(gr)
## ---------------------------------------------------------------------
## E. SUBSET OBJECTS BY SEQLEVELS
## ---------------------------------------------------------------------
tx <- transcripts(txdb)
seqlevels(tx)
## Drop 'M', keep all others.
seqlevels(tx, pruning.mode="coarse") <- seqlevels(tx)[seqlevels(tx) != "M"]
seqlevels(tx)
## Drop all except 'ch3L' and 'ch3R'.
seqlevels(tx, pruning.mode="coarse") <- c("ch3L", "ch3R")
seqlevels(tx)
## ---------------------------------------------------------------------
## F. FINDING METHODS
## ---------------------------------------------------------------------
showMethods("seqinfo")
showMethods("seqinfo<-")
showMethods("seqnames")
showMethods("seqnames<-")
showMethods("seqlevels")
showMethods("seqlevels<-")
if (interactive()) {
library(GenomicRanges)
?`GRanges-class`
}