Preprocessing: `pp`#

Preprocessing tools that do not produce output, but modify the data to prepare it for downstream analysis.

Basic Preprocessing#

`drop_na`(adata[, feature_threshold, ...])	Drop features with many NAs, then drop cells with any NAs (or infinite values)
`scale`(adata[, treatment_key, control, chunked])	Scale data to unit variance per feature while maintaining a low memory footprint (operates in-place).
`scale_by_batch`(adata, batch_key[, ...])	Scale data to zero-center and unit variance per batch in-place.
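
To make the steps above concrete, the following is a minimal NumPy sketch of the two ideas: dropping sparse features and incomplete cells, and per-batch standardization. It is not the library's implementation. The function names, the `min_finite_fraction` parameter, and its meaning are assumptions made for illustration, and the sketch works on plain arrays rather than an `adata` object.

```python
import numpy as np

def drop_na_sketch(X, min_finite_fraction=0.9):
    """Hypothetical helper: drop sparse features, then incomplete cells."""
    finite = np.isfinite(X)  # False for NaN and +/-inf
    # Keep features where enough cells have a finite value (assumed semantics)
    keep_features = finite.mean(axis=0) >= min_finite_fraction
    X = X[:, keep_features]
    # Then drop every cell that still contains a non-finite value
    keep_cells = np.isfinite(X).all(axis=1)
    return X[keep_cells], keep_features, keep_cells

def scale_by_batch_sketch(X, batches):
    """Hypothetical helper: zero-center and unit-scale features within each batch."""
    X = X.astype(float, copy=True)
    for b in np.unique(batches):
        mask = batches == b
        mu = X[mask].mean(axis=0)
        sd = X[mask].std(axis=0)
        sd[sd == 0] = 1.0  # leave constant features unscaled instead of dividing by zero
        X[mask] = (X[mask] - mu) / sd
    return X
```

Unlike the in-place functions listed above, this sketch returns new arrays, which keeps it easy to test in isolation.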

Batch Effects#

Tools to remove batch effects from single-cell morphological data.

`remove_batch_effects`(adata[, bio_key, ...])	Remove batch effects

Feature Selection#

Tools to reduce the number of features based on correlation or confounder association.

`select_features`(adata[, method, cor_cutoff, ...])	Feature selection
`corr`(X[, Y, method, M])	Compute pairwise correlations
`kruskal_test`(adata[, test_column])	Perform Kruskal-Wallis H-test for each feature across batches.
`kruskal_filter`(adata[, test_column, sigma, ...])	Filter features based on Kruskal-Wallis H-test statistics.
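
As an illustration of correlation-based feature reduction, here is a minimal NumPy sketch that greedily keeps a feature only if it is not strongly correlated with any feature kept so far. The function name is hypothetical. The cutoff semantics are an assumption and may differ from how `select_features` interprets `cor_cutoff`.

```python
import numpy as np

def drop_correlated_features(X, cor_cutoff=0.9):
    """Hypothetical helper: return indices of features to keep.

    A feature is kept if its absolute Pearson correlation with every
    previously kept feature is below `cor_cutoff`.
    """
    # Feature-by-feature correlation matrix (columns are features)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < cor_cutoff for k in kept):
            kept.append(j)
    return kept
```

The greedy pass depends on column order: of two highly correlated features, the one that appears first is retained.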

Aggregation#

Tools to aggregate single-cell data into profiles and to compare those profiles using several distance metrics. For a simple aggregation, use `aggregate`. For a statistically robust distance metric, use `aggregate_mahalanobis`.

`aggregate`(adata[, well_key, group_keys, ...])	Aggregate single-cell measurements into well-level profiles
`aggregate_ttest`(adata[, treatment_key, ...])	Measure per-feature distance between groups using t-statistics.
`tstat_distance`(tstats)	Summarize t-statistics per group.
`aggregate_pc`(adata[, treatment_key, ...])	Measure distance between groups using principal components weighted by variance explained
`aggregate_mahalanobis`(adata[, ...])	Measure distance between groups using Mahalanobis distance
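
The difference between a simple aggregate and a distance-based comparison can be sketched in plain NumPy. Both helpers below are hypothetical and are not the library's implementation: the first collapses cells into one profile per group (the median is an assumed choice), and the second computes the Mahalanobis distance of a treated centroid from the control distribution.

```python
import numpy as np

def aggregate_by_group(X, groups, func=np.median):
    """Hypothetical helper: one aggregated profile per group label (e.g. per well)."""
    labels = np.unique(groups)
    profiles = np.vstack([func(X[groups == g], axis=0) for g in labels])
    return labels, profiles

def mahalanobis_to_control(X_treated, X_control):
    """Hypothetical helper: distance of the treated centroid from the control cloud."""
    diff = X_treated.mean(axis=0) - X_control.mean(axis=0)
    cov = np.cov(X_control, rowvar=False)  # feature covariance of the controls
    # The pseudo-inverse tolerates singular covariance matrices
    return float(np.sqrt(diff @ np.linalg.pinv(cov) @ diff))
```

Because the Mahalanobis distance rescales each direction by the control covariance, a shift along a low-variance feature counts for more than the same shift along a noisy one.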

Dimensionality-reduction#

Tools to perform dimensionality reduction.

`pca`(adata[, n_comps, whiten, copy, ...])	Principal component analysis [Pedregosa et al., 2011].
`neighbors`(adata[, n_neighbors, n_pcs, ...])	Compute a neighborhood graph of observations using the PCA representation.
`umap`(adata, **kwargs)	Embed the neighborhood graph using UMAP [McInnes et al., 2018].
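
For readers who want to see what the PCA step computes, here is a minimal sketch using a singular value decomposition in NumPy. It is an illustration only: the `pca` function above cites scikit-learn [Pedregosa et al., 2011], whereas this hypothetical helper uses `np.linalg.svd` directly and ignores options such as `whiten`.

```python
import numpy as np

def pca_sketch(X, n_comps=2):
    """Hypothetical helper: PCA scores, components and explained-variance ratio."""
    Xc = X - X.mean(axis=0)  # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_comps] * S[:n_comps]  # cell coordinates in PC space
    explained_var = S**2 / (X.shape[0] - 1)
    ratio = explained_var / explained_var.sum()
    return scores, Vt[:n_comps], ratio[:n_comps]
```

The resulting scores are the kind of low-dimensional representation that a neighborhood graph, and then a UMAP embedding, would be built on.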
