meta_cells {inferCSN} | R Documentation |
Detection of metacells from single-cell gene expression matrix
Description
This function detects metacells from a single-cell gene expression matrix
using dimensionality reduction and clustering techniques.
Usage
meta_cells(
  matrix,
  genes_use = NULL,
  genes_exclude = NULL,
  var_genes_num = min(1000, nrow(matrix)),
  gamma = 10,
  knn_k = 5,
  do_scale = TRUE,
  pc_num = 25,
  fast_pca = FALSE,
  do_approx = FALSE,
  approx_num = 20000,
  directed = FALSE,
  use_nn2 = TRUE,
  seed = 1,
  cluster_method = "walktrap",
  block_size = 10000,
  weights = NULL,
  do_median_norm = FALSE,
  ...
)
Arguments
matrix |
A gene expression matrix where rows represent genes and columns represent cells.
|
genes_use |
Default is NULL .
A character vector specifying genes to be used for PCA dimensionality reduction.
|
genes_exclude |
Default is NULL . A character vector specifying genes to be excluded
from PCA computation.
|
var_genes_num |
Default is min(1000, nrow(matrix)) . Number of most variable genes
to select when genes_use is not provided.
|
gamma |
Default is 10 . Coarse-graining parameter defining the target ratio of input
cells to output metacells (e.g., gamma=10 yields approximately n/10 metacells for n input cells).
|
knn_k |
Default is 5 . Number of nearest neighbors for constructing the cell-cell
similarity network.
|
do_scale |
Default is TRUE . Whether to standardize (center and scale) gene expression
values before PCA.
|
pc_num |
Default is 25 . Number of principal components to retain for downstream analysis.
|
fast_pca |
Default is FALSE . Whether to use the faster irlba algorithm
instead of standard PCA. Recommended for large datasets.
|
do_approx |
Default is FALSE . Whether to use approximate nearest neighbor search, which
is intended for large datasets (more than 50000 cells) to improve computational efficiency.
|
approx_num |
Default is 20000 . Number of cells to randomly sample for approximate
nearest neighbor computation when do_approx = TRUE .
|
directed |
Default is FALSE . Whether to construct a directed or undirected nearest
neighbor graph.
|
use_nn2 |
Default is TRUE . Whether to use the faster RANN::nn2 algorithm for nearest
neighbor search (only applicable with Euclidean distance).
|
seed |
Default is 1 . Random seed for reproducibility when subsampling cells in
approximate mode.
|
cluster_method |
Default is "walktrap" . Algorithm for community detection in the cell
similarity network. Options: "walktrap" (recommended) or "louvain" (the gamma parameter is ignored).
|
block_size |
Default is 10000 . Number of cells to process in each batch when mapping
cells to metacells in approximate mode. Adjust based on available memory.
|
weights |
Default is NULL . Numeric vector of cell-specific weights for weighted
averaging when computing metacell expression profiles. Its length must match the number of cells.
|
do_median_norm |
Default is FALSE . Whether to perform median-based normalization of
the final metacell expression matrix.
|
... |
Additional arguments passed to internal functions.
|
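The preprocessing arguments above (gamma, var_genes_num, do_scale, pc_num, knn_k) can be illustrated with a minimal base R sketch. This is not the package's internal code: the toy matrix, the variance-based gene ranking, and the use of prcomp and dist are assumptions chosen only to show what each argument controls.

```r
set.seed(1)

# Toy genes x cells count matrix (hypothetical data, not example_matrix)
n_genes <- 200
n_cells <- 300
expr <- matrix(
  rpois(n_genes * n_cells, lambda = 5),
  nrow = n_genes,
  dimnames = list(
    paste0("g", seq_len(n_genes)),
    paste0("c", seq_len(n_cells))
  )
)

# gamma: target ratio of input cells to output metacells
gamma <- 10
n_metacells <- round(n_cells / gamma)

# var_genes_num: keep the genes with the highest variance across cells
# (assumed ranking criterion for this sketch)
var_genes_num <- min(50, nrow(expr))
gene_var <- apply(expr, 1, var)
genes_use <- names(sort(gene_var, decreasing = TRUE))[seq_len(var_genes_num)]

# do_scale and pc_num: PCA on a cells x genes matrix, centered and scaled,
# keeping only the first pc_num components
pc_num <- 10
pca <- prcomp(t(expr[genes_use, ]), center = TRUE, scale. = TRUE, rank. = pc_num)
pcs <- pca$x

# knn_k: for each cell, the indices of its k nearest cells in PC space
# (Euclidean distance; a cell is excluded from its own neighbor list)
knn_k <- 5
d <- as.matrix(dist(pcs))
diag(d) <- Inf
knn_idx <- t(apply(d, 1, function(row) order(row)[seq_len(knn_k)]))

dim(pcs)
dim(knn_idx)
```

The full distance matrix used here grows quadratically with the number of cells, which is the cost that options such as use_nn2 and do_approx are meant to reduce on large datasets.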
Value
A matrix where rows represent metacells and columns represent genes.
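As a rough illustration of how per-cell weights could enter the metacell expression profiles, the sketch below computes a weighted mean per metacell in base R and returns it in the metacells x genes orientation stated above. The toy data, the membership vector, and the normalization by total weight are assumptions made for this example; the package may aggregate differently.

```r
# Toy genes x cells matrix and a hypothetical cell-to-metacell assignment
expr <- matrix(
  c(1, 3, 5, 7,
    2, 2, 4, 8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), paste0("cell", 1:4))
)
membership <- c(1, 1, 2, 2)  # cells 1-2 form metacell 1, cells 3-4 metacell 2
weights <- c(1, 3, 1, 1)     # one weight per cell

# Cells x metacells matrix holding each cell's weight in its own metacell
n_meta <- max(membership)
assign_w <- matrix(0, nrow = ncol(expr), ncol = n_meta)
assign_w[cbind(seq_along(membership), membership)] <- weights

# Weighted mean: weighted sums per metacell divided by the total weight
weighted_sum <- expr %*% assign_w
total_weight <- colSums(assign_w)
meta_expr <- sweep(weighted_sum, 2, total_weight, "/")

# Transpose so that rows are metacells and columns are genes
meta_expr <- t(meta_expr)
meta_expr
```

With all weights equal, this reduces to the plain average of the cells assigned to each metacell.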
References
https://github.com/GfellerLab/SuperCell
https://github.com/kuijjerlab/SCORPION
Examples
data("example_matrix")
meta_cells_matrix <- meta_cells(
  example_matrix
)
dim(meta_cells_matrix)
meta_cells_matrix[1:6, 1:6]
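
The cluster_method argument notes that gamma is ignored for louvain. One plausible reason, sketched here with the igraph package on a stand-in graph, is that walktrap produces a merge hierarchy that can be cut at a chosen number of groups, whereas louvain selects the number of groups itself. The lattice graph and the variable names are assumptions for illustration; this is not the package's internal code.

```r
library(igraph)

# Stand-in for a cell similarity network: a connected 10 x 10 lattice
g <- make_lattice(c(10, 10))

# Target number of groups implied by gamma
gamma <- 10
n_metacells <- round(vcount(g) / gamma)

# Walktrap returns a hierarchy of merges, so the result can be cut
# at exactly the requested number of groups
wt <- cluster_walktrap(g)
wt_membership <- cut_at(wt, no = n_metacells)
length(unique(wt_membership))

# Louvain optimizes modularity and determines the group count on its own,
# so there is no step at which a gamma-derived target could be applied
set.seed(1)
lv <- cluster_louvain(g)
length(unique(membership(lv)))
```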
[Package inferCSN version 1.1.7]