injectSNPs {BSgenome} | R Documentation |
SNP injection
Description
Inject SNPs from a SNPlocs data package into a genome.
Usage
injectSNPs(x, snps)
SNPlocs_pkgname(x)
## S4 method for signature 'BSgenome'
snpcount(x)
## S4 method for signature 'BSgenome'
snplocs(x, seqname, ...)
## Related utilities
available.SNPs(type=getOption("pkgType"))
installed.SNPs()
Arguments
x |
A BSgenome object. |
snps |
A SNPlocs object or the name of a SNPlocs data package. This object or package must contain SNP information for the single sequences contained in x. |
seqname |
The name of a single sequence in x. |
type |
Character string indicating the type of package ( |
... |
Further arguments to be passed to |
Value
injectSNPs returns a copy of the original genome x where some or all of the single sequences from x are altered by injecting the SNPs stored in snps. The SNPs in the altered genome are represented by an IUPAC ambiguity code at each SNP location.
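To make the idea of "an IUPAC ambiguity code at each SNP location" concrete, here is a minimal base-R sketch on a toy string. It is not how injectSNPs is implemented; the names iupac_map and inject_snps_sketch are hypothetical, and the sketch assumes the ambiguity code for each SNP is already known.

```r
# Illustrative only: base-R sketch of IUPAC-coded SNP injection.
# `iupac_map` and `inject_snps_sketch` are hypothetical names, not BSgenome API.

# Standard IUPAC nucleotide codes and the bases each one stands for.
iupac_map <- c(
  A = "A", C = "C", G = "G", T = "T",
  M = "AC", R = "AG", W = "AT", S = "CG", Y = "CT", K = "GT",
  V = "ACG", H = "ACT", D = "AGT", B = "CGT", N = "ACGT"
)

# Overwrite the letters of `seq` at positions `pos` (1-based) with `codes`.
inject_snps_sketch <- function(seq, pos, codes) {
  stopifnot(length(pos) == length(codes),
            all(codes %in% names(iupac_map)))
  letters <- strsplit(seq, "", fixed = TRUE)[[1]]
  stopifnot(all(pos >= 1L), all(pos <= length(letters)))
  letters[pos] <- codes
  paste(letters, collapse = "")
}

ref <- "ACGTACGT"
# Position 2 is a C/T SNP (code Y), position 5 is an A/G SNP (code R).
alt <- inject_snps_sketch(ref, pos = c(2L, 5L), codes = c("Y", "R"))
alt
```

The altered string has the same length as the reference; only the letters at the SNP positions change, which is why functions such as alphabetFrequency show a shift from A/C/G/T counts toward ambiguity letters after injection.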
SNPlocs_pkgname, snpcount and snplocs return NULL if no SNPs were injected in x (i.e. if x is not a BSgenome object returned by a previous call to injectSNPs). Otherwise SNPlocs_pkgname returns the name of the package from which the SNPs were injected, snpcount the number of SNPs for each altered sequence in x, and snplocs their locations in the sequence whose name is specified by seqname.
available.SNPs returns a character vector containing the names of the SNPlocs and XtraSNPlocs data packages that are currently available on the Bioconductor repositories for your version of R/Bioconductor.
A SNPlocs data package contains basic information (location and alleles)
about the known molecular variations of class snp for a given
organism.
An XtraSNPlocs data package contains information about the known molecular
variations of other classes (in-del, heterozygous,
microsatellite, named-locus, no-variation, mixed,
multinucleotide-polymorphism) for a given organism.
Only SNPlocs data packages can be used for SNP injection for now.
installed.SNPs returns a character vector containing the names of the SNPlocs and XtraSNPlocs data packages that are already installed.
Note
injectSNPs, SNPlocs_pkgname, snpcount and snplocs have the side effect of trying to load the SNPlocs data package that was specified through the snps argument if it is not already loaded.
Author(s)
H. Pagès
See Also
BSgenome-class, IUPAC_CODE_MAP, injectHardMask, letterFrequencyInSlidingView, .inplaceReplaceLetterAt
Examples
## What SNPlocs data packages are already installed:
installed.SNPs()
## What SNPlocs data packages are available:
available.SNPs()
if (interactive()) {
    ## Make your choice and install with:
    if (!requireNamespace("BiocManager", quietly=TRUE))
        install.packages("BiocManager")
    BiocManager::install("SNPlocs.Hsapiens.dbSNP144.GRCh38")
}
## Inject SNPs from dbSNP into the Human genome:
library(BSgenome.Hsapiens.UCSC.hg38.masked)
genome <- BSgenome.Hsapiens.UCSC.hg38.masked
SNPlocs_pkgname(genome)
genome2 <- injectSNPs(genome, "SNPlocs.Hsapiens.dbSNP144.GRCh38")
genome2 # note the extra "with SNPs injected from ..." line
SNPlocs_pkgname(genome2)
snpcount(genome2)
head(snplocs(genome2, "chr1"))
alphabetFrequency(genome$chr1)
alphabetFrequency(genome2$chr1)
## Find runs of SNPs of length at least 25 in chr1. Might require
## more memory than some platforms can handle (e.g. 32-bit Windows
## and maybe some Mac OS X machines with little memory):
is_32bit_windows <- .Platform$OS.type == "windows" &&
                    .Platform$r_arch == "i386"
is_macosx <- substr(R.version$os, start=1, stop=6) == "darwin"
if (!is_32bit_windows && !is_macosx) {
    chr1 <- injectHardMask(genome2$chr1)
    ambiguous_letters <- paste(DNA_ALPHABET[5:15], collapse="")
    lf <- letterFrequencyInSlidingView(chr1, 25, ambiguous_letters)
    sl <- slice(as.integer(lf), lower=25)
    v1 <- Views(chr1, start(sl), end(sl)+24)
    v1
    max(width(v1))  # length of longest SNP run
}
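
The SNP-run search above can be hard to follow without a genome loaded. The following base-R sketch shows the same idea on a toy string, using run-length encoding instead of a sliding window. It is an illustration under simplifying assumptions, not a replacement for the Biostrings-based code: find_ambiguous_runs is a hypothetical helper, and the ambiguity letters are spelled out by hand rather than taken from DNA_ALPHABET.

```r
# Illustrative only: find runs of IUPAC ambiguity letters in a plain string.
# `find_ambiguous_runs` is a hypothetical helper, not part of BSgenome or Biostrings.
find_ambiguous_runs <- function(seq, min_len) {
  ambiguity_letters <- strsplit("MRWSYKVHDBN", "", fixed = TRUE)[[1]]
  letters <- strsplit(seq, "", fixed = TRUE)[[1]]
  is_ambig <- letters %in% ambiguity_letters

  # Run-length encode the TRUE/FALSE vector, then recover run boundaries.
  runs <- rle(is_ambig)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1L

  # Keep only runs of ambiguity letters that are long enough.
  keep <- runs$values & runs$lengths >= min_len
  data.frame(start = starts[keep],
             end   = ends[keep],
             width = runs$lengths[keep])
}

toy <- "ACGYRKACMMGT"
snp_runs <- find_ambiguous_runs(toy, min_len = 3)
snp_runs
```

On the toy string only the run at positions 4 to 6 is reported; the two-letter run at positions 9 to 10 falls below min_len. On a full chromosome the sliding-view approach shown in the example is preferable, because it works on the compact DNAString representation rather than on a character vector with one element per base.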