HetseqDoubleML {HetSeq}    R Documentation

Heterogeneity-seq: Classifying cellular response by gene expression values including causal inference by DoubleML

Description

Classifies the cellular response of control cells from the expression of single genes (plus informative base features) to identify the features with the strongest predictive power, and applies causal inference using a DoubleML approach.

Usage

HetseqDoubleML(
  object,
  trajectories,
  score.group = NULL,
  score.name = NULL,
  quantiles = c(0.25, 0.75),
  compareGroups = c("Low", "High"),
  posClass = NULL,
  basefeatures = NULL,
  genes = NULL,
  background = NULL,
  assay = NULL,
  split = NULL,
  cross = 10,
  num_cores = 1
)

Arguments

object

Seurat object

trajectories

Matrix of cell-cell trajectories. Columns represent time points, rows represent trajectories of connected cells over time points.
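
For orientation, a toy trajectory matrix might look like the sketch below. The cell names, the time point labels, and the assumption that each entry is the name of a cell are hypothetical; only the row and column layout is taken from the description above.

```r
# Hypothetical toy input: 3 trajectories across 2 time points.
# Assumption: each entry is the name of a cell in the Seurat object.
trajectories <- matrix(
  c("ctrl_cell1", "treated_cell7",
    "ctrl_cell2", "treated_cell3",
    "ctrl_cell3", "treated_cell9"),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("0h", "2h"))
)

# Rows are trajectories, columns are time points
dim(trajectories)
```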

score.group

A named vector of response groups. Names represent cells, values represent the score groups. If score.group is not set, the score.name and quantiles parameters must be set to define the score groups.
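
A named vector of this shape can be built with base R. The sketch below uses made-up cell names and the labels "Weak" and "Strong" purely for illustration; in practice the names would have to match the cells in the Seurat object.

```r
# Hypothetical cell names and group labels, for illustration only
cells <- c("cell1", "cell2", "cell3", "cell4")
group_vector <- setNames(c("Weak", "Strong", "Strong", "Weak"), cells)

# Names are cells, values are score groups
table(group_vector)
```

With custom labels like these, compareGroups would need to name the two groups to test, for example c("Weak", "Strong").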

score.name

The name of a numeric Seurat meta data column, which will be used to calculate score groups. Only used if no score.group is given.

quantiles

Quantile thresholds applied to the score.name meta data to define three response groups: Low, Middle and High.
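
The following base R sketch shows one way two quantile thresholds can divide a numeric score into three groups. It is not taken from the package source: the simulated scores, the cell names, and the handling of values that fall exactly on a threshold are assumptions.

```r
set.seed(1)

# Hypothetical per-cell scores standing in for a numeric meta data column
score <- setNames(runif(100), paste0("cell", 1:100))

quantiles <- c(0.25, 0.75)
thresholds <- quantile(score, probs = quantiles)

# Assumption: values at or below the lower threshold are "Low",
# values above the upper threshold are "High", the rest are "Middle"
groups <- cut(score,
              breaks = c(-Inf, thresholds, Inf),
              labels = c("Low", "Middle", "High"))
names(groups) <- names(score)

table(groups)
```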

compareGroups

Which score groups to test. Default: Low vs. High

posClass

Defines the positive class for classification.

basefeatures

Additional informative features to include in the classification. Must be meta data available in the Seurat object.

genes

Vector of genes to test.

background

A set of genes that will be considered as potential confounding factors in the DoubleML analysis. Must contain all genes set in the genes parameter. By default, all genes are used.
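
Because background must contain every gene in genes, it can be worth checking this before a long run. A minimal sketch with hypothetical gene names:

```r
# Hypothetical gene sets, for illustration only
genes <- c("GeneA", "GeneB")
background <- c("GeneA", "GeneB", "GeneC", "GeneD")

# Genes to test that are absent from the background set
missing_genes <- setdiff(genes, background)

if (length(missing_genes) > 0) {
  stop("Genes missing from background: ",
       paste(missing_genes, collapse = ", "))
}
```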

assay

The name of the Seurat assay to perform Heterogeneity-seq on. If NULL, the default assay will be used.

split

Training-test data split. Must be in [0,1].

cross

Number of cross-validations.
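
As a generic illustration of these two settings, the sketch below splits a set of cells into training and test data and assigns the training cells to folds. It assumes that split is the fraction of cells used for training and that cross is a number of folds; the package may interpret either parameter differently.

```r
set.seed(42)

# Hypothetical cell names
cells <- paste0("cell", 1:50)

split <- 0.8  # assumption: fraction of cells used for training
cross <- 10   # assumption: number of cross-validation folds

# Random training-test split
n_train <- floor(split * length(cells))
train_cells <- sample(cells, n_train)
test_cells <- setdiff(cells, train_cells)

# Assign each training cell to one of `cross` folds of near-equal size
folds <- sample(rep(seq_len(cross), length.out = length(train_cells)))
names(folds) <- train_cells

table(folds)
```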

num_cores

The number of cores used in parallel processing.

Value

Table of log2FC and AUC values for each gene and an additional AUC value for the baseline features.

Examples



# Full vignette available on https://grandr.erhard-lab.de/articles/web/hetseq.html

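  # Define score groups from the numeric meta data column "score" (default quantiles)
  res <- HetseqDoubleML(data, trajectories, score.name = "score")

  # Supply custom score groups and compare "Weak" vs. "Strong"
  res <- HetseqDoubleML(data, trajectories, score.group = group_vector,
        compareGroups = c("Weak", "Strong"))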


[Package HetSeq version 0.1.0 Index]