SIngle-cell eMBedding Along with features

API

Import simba as:

import simba as si

Configuration for SIMBA

settings.set_figure_params([context, style, ...])

Set global parameters for figures.

settings.set_pbg_params([config])

Set PBG parameters

settings.set_workdir([workdir])

Set working directory.

Reading

read_csv(filename[, delimiter, ...])

Read .csv file.

read_h5ad(filename[, backed, as_sparse, ...])

Read .h5ad-formatted hdf5 file.

read_10x_h5(filename[, genome, gex_only])

Read 10x-Genomics-formatted hdf5 file.

read_mtx(filename[, dtype])

Read .mtx file.

read_embedding([path_emb, path_entity, ...])

Read in entity embeddings from PBG training

load_pbg_config([path])

Load PBG configuration into global setting

load_graph_stats([path])

Load graph statistics into global setting

See more at anndata

Preprocessing

pp.log_transform(adata)

Return the natural logarithm of one plus the input array, element-wise.
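To make the transform concrete, here is a minimal plain-Python sketch of an element-wise log(1 + x). It is not SIMBA's implementation; the helper name `log1p_matrix` and the toy count matrix are hypothetical and only illustrate the operation described above.

```python
import math

def log1p_matrix(rows):
    """Apply log(1 + x) to every entry of a list-of-lists matrix."""
    return [[math.log1p(x) for x in row] for row in rows]

# Toy count matrix (hypothetical values): 2 cells x 3 features.
counts = [[0, 1, 9], [3, 0, 99]]
logged = log1p_matrix(counts)
```

Zero counts stay at zero, which is why this transform is convenient for sparse count data.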

pp.normalize(adata[, method, scale_factor, ...])

Normalize count matrix.
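One common way to normalize a count matrix is library-size scaling: divide each cell's counts by the cell total and multiply by a scale factor. The sketch below assumes that scheme; which methods `pp.normalize` actually offers is not stated here, and the helper name `normalize_lib_size` and the default `scale_factor` are hypothetical.

```python
def normalize_lib_size(rows, scale_factor=1e4):
    """Scale each row (cell) so its entries sum to scale_factor.

    Rows with a total of zero are returned as all zeros.
    """
    normalized = []
    for row in rows:
        total = sum(row)
        if total == 0:
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([x / total * scale_factor for x in row])
    return normalized

# Toy count matrix (hypothetical values): 3 cells x 2 features.
counts = [[1, 3], [0, 0], [5, 5]]
scaled = normalize_lib_size(counts)
```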

pp.binarize(adata[, threshold])

Binarize an array.
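A minimal sketch of thresholded binarization, written in plain Python rather than with SIMBA itself. It assumes that values strictly greater than the threshold map to 1 and everything else to 0; that convention, the helper name `binarize_matrix`, and the toy data are assumptions for illustration.

```python
def binarize_matrix(rows, threshold=0.0):
    """Map entries > threshold to 1 and all other entries to 0."""
    return [[1 if x > threshold else 0 for x in row] for row in rows]

# Toy accessibility counts (hypothetical values).
counts = [[0, 2, 1], [5, 0, 0]]
binary = binarize_matrix(counts)
binary_strict = binarize_matrix(counts, threshold=1)
```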

pp.cal_qc(adata[, expr_cutoff])

Calculate quality control metrics.

pp.cal_qc_rna(adata[, expr_cutoff])

Calculate quality control metrics for RNA-seq data.

pp.cal_qc_atac(adata[, expr_cutoff])

Calculate quality control metrics for ATAC-seq data.

pp.filter_samples(adata[, min_n_features, ...])

Filter out samples based on different metrics.

pp.filter_cells_rna(adata[, min_n_genes, ...])

Filter out cells for RNA-seq based on different metrics.

pp.filter_cells_atac(adata[, min_n_peaks, ...])

Filter out cells for ATAC-seq based on different metrics.

pp.filter_features(adata[, min_n_samples, ...])

Filter out features based on different metrics.

pp.filter_genes(adata[, min_n_cells, ...])

Filter out genes based on different metrics.

pp.filter_peaks(adata[, min_n_cells, ...])

Filter out peaks based on different metrics.

pp.pca(adata[, n_components, algorithm, ...])

Perform Principal Component Analysis (PCA).

pp.select_pcs(adata[, n_pcs, S, curve, ...])

Select top PCs based on variance_ratio.

pp.select_pcs_features(adata[, S, curve, ...])

Select features that contribute to the top PCs.

pp.select_variable_genes(adata[, layer, ...])

Select highly variable genes.

Tools

tl.discretize(adata[, layer, n_bins, max_bins])

Discretize continuous values
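As an illustration of what discretizing continuous values means, the sketch below uses simple equal-width binning. This is a swapped-in technique chosen for clarity; the binning strategy used by `tl.discretize` may differ. The helper name `discretize_equal_width` and the choice of levels starting at 1 are assumptions.

```python
def discretize_equal_width(values, n_bins=5):
    """Assign each value an integer level in 1..n_bins using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: place everything in the first level.
        return [1 for _ in values]
    width = (hi - lo) / n_bins
    levels = []
    for v in values:
        idx = int((v - lo) / width)
        # The maximum value falls on the upper edge; clamp it into the last bin.
        levels.append(min(idx, n_bins - 1) + 1)
    return levels

# Toy expression values (hypothetical).
values = [0.0, 1.0, 2.5, 4.9, 5.0]
levels = discretize_equal_width(values, n_bins=5)
```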

tl.umap(adata[, n_neighbors, n_components, ...])

Perform UMAP.

tl.gene_scores(adata, genome[, gene_anno, ...])

Calculate gene scores

tl.infer_edges(adata_ref, adata_query[, ...])

Infer edges between reference and query observations

tl.trim_edges(adata_ref_query[, cutoff, n_edges])

Trim edges based on the similarity scores
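The idea of trimming by similarity score can be sketched in plain Python as follows. This is only an illustration under stated assumptions: edges are represented as hypothetical `(ref, query, score)` tuples, `cutoff` is treated as a minimum score, and `n_edges` as a cap on the number of top-scoring edges kept. How `tl.trim_edges` interprets these parameters is not specified here.

```python
def trim_by_score(edges, cutoff=None, n_edges=None):
    """Keep the highest-scoring edges.

    edges   : list of (ref, query, score) tuples
    cutoff  : if given, drop edges with score below this value
    n_edges : if given, keep at most this many top-scoring edges
    """
    kept = sorted(edges, key=lambda e: e[2], reverse=True)
    if cutoff is not None:
        kept = [e for e in kept if e[2] >= cutoff]
    if n_edges is not None:
        kept = kept[:n_edges]
    return kept

# Toy edges between reference and query observations (hypothetical).
edges = [("a", "x", 0.9), ("b", "y", 0.2), ("c", "z", 0.5)]
by_cutoff = trim_by_score(edges, cutoff=0.4)
top_one = trim_by_score(edges, n_edges=1)
```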

tl.gen_graph([list_CP, list_PM, list_PK, ...])

Generate graph for PBG training.

tl.pbg_train([dirname, pbg_params, output, ...])

PBG training

tl.softmax(adata_ref, adata_query[, T, ...])

Softmax-based transformation
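For readers unfamiliar with a temperature-scaled softmax, here is a minimal plain-Python sketch. It assumes the common form exp(s / T) / sum(exp(s / T)); whether `tl.softmax` applies the temperature `T` in exactly this way is an assumption, and the helper name `softmax_with_temperature` is hypothetical.

```python
import math

def softmax_with_temperature(scores, T=1.0):
    """Convert raw scores into weights that sum to 1.

    Smaller T sharpens the distribution; larger T flattens it.
    """
    scaled = [s / T for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy similarity scores (hypothetical).
scores = [1.0, 2.0, 3.0]
weights = softmax_with_temperature(scores, T=1.0)
sharp = softmax_with_temperature(scores, T=0.5)
```

In this sketch, lowering `T` concentrates more of the weight on the highest-scoring entry.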

tl.embed(adata_ref, list_adata_query[, T, ...])

Embed a list of query datasets along with reference dataset into the same space

tl.compare_entities(adata_ref, adata_query)

Compare the embeddings of two entities by calculating

tl.query(adata[, obsm, layer, metric, ...])

Query the "database" of entities
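Conceptually, querying a set of embedded entities amounts to a nearest-neighbor search in the embedding space. The sketch below shows that idea with Euclidean distance in plain Python; it does not use SIMBA, and the helper name `nearest_entities`, the dictionary layout, and the choice of metric are assumptions for illustration.

```python
import math

def nearest_entities(embeddings, query_vec, k=2):
    """Return the names of the k entities closest to query_vec.

    embeddings : dict mapping entity name -> coordinate list
    """
    ranked = sorted(
        embeddings.items(),
        key=lambda item: math.dist(item[1], query_vec),
    )
    return [name for name, _ in ranked[:k]]

# Toy 2-D embeddings of cells and genes (hypothetical values).
embeddings = {
    "cell_1": [0.0, 0.0],
    "gene_A": [0.1, 0.0],
    "gene_B": [2.0, 2.0],
}
hits = nearest_entities(embeddings, [0.0, 0.1], k=2)
```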

tl.find_master_regulators(adata_all[, ...])

Find all the master regulators

tl.find_target_genes(adata_all, adata_PM[, ...])

For a given TF, infer its target genes

Plotting

pl.pca_variance_ratio(adata[, log, ...])

Plot the variance ratio.

pl.pcs_features(adata[, log, size, ...])

Plot features that contribute to the top PCs.

pl.variable_genes(adata[, show_texts, ...])

Plot highly variable genes.

pl.violin(adata[, list_obs, list_var, ...])

Violin plot

pl.hist(adata[, list_obs, list_var, kde, ...])

Histogram plot

pl.umap(adata[, color, dict_palette, ...])

Plot coordinates in UMAP

pl.discretize(adata[, kde, fig_size, pad, ...])

Plot original data VS discretized data

pl.node_similarity(adata[, bins, log, ...])

Plot similarity scores of nodes

pl.svd_nodes(adata[, comp1, comp2, color, ...])

Plot SVD coordinates

pl.pbg_metrics([metrics, path_emb, ...])

Plot PBG training metrics

pl.entity_metrics(adata_cmp, x, y[, ...])

Plot entity metrics

pl.entity_barcode(adata_cmp, entities[, ...])

Plot query entity barcode

pl.query(adata[, comp1, comp2, obsm, layer, ...])

Plot query output

Datasets

datasets.rna_10xpmbc3k()

10X human peripheral blood mononuclear cells (PBMCs) scRNA-seq data

datasets.rna_han2018()

single-cell microwell-seq mouse cell atlas data

datasets.rna_tmc2018()

single-cell Smart-Seq2 mouse cell atlas data

datasets.rna_baron2016()

single-cell RNA-seq human pancreas data

datasets.rna_muraro2016()

single-cell RNA-seq human pancreas data

datasets.rna_segerstolpe2016()

single-cell RNA-seq human pancreas data

datasets.rna_wang2016()

single-cell RNA-seq human pancreas data

datasets.rna_xin2016()

single-cell RNA-seq human pancreas data

datasets.atac_buenrostro2018()

single-cell ATAC-seq human blood data

datasets.atac_10xpbmc5k()

10X human peripheral blood mononuclear cells (PBMCs) scATAC-seq data

datasets.atac_chen2019()

simulated scATAC-seq bone marrow data with a noise level of 0.4 and a coverage of 2500 fragments

datasets.atac_cusanovich2018_subset()

downsampled sci-ATAC-seq mouse tissue data

datasets.multiome_ma2020_fig4()

single-cell multiome mouse skin data (SHARE-seq)

datasets.multiome_chen2019()

single-cell multiome neonatal mouse cerebral cortex data (SNARE-seq)

datasets.multiome_10xpbmc10k()

single-cell 10X human peripheral blood mononuclear cells (PBMCs) multiome data