Preprocessing: pp#
Preprocessing tools that do not produce output, but modify the data to prepare it for downstream analysis.
Basic Preprocessing#
|
Drop features with many NAs, then drop cells with any NAs (or infinite values) |
|
Scale data to unit variance per feature while maintaining a low memory footprint (operates in-place). |
|
Scale data to zero-center and unit variance per batch in-place. |
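The NA filtering and in-place scaling described above can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the function names `drop_na_sketch` and `scale_inplace_sketch`, the `feature_threshold` parameter, and the 0.5 default are assumptions made for illustration only.

```python
import numpy as np


def drop_na_sketch(X, feature_threshold=0.5):
    """Drop features with many non-finite values, then cells with any.

    `feature_threshold` is a hypothetical parameter: the maximum fraction
    of NaN/inf entries a feature may have before it is dropped.
    """
    X = np.asarray(X, dtype=float)
    bad = ~np.isfinite(X)  # True for NaN and +/-inf
    keep_features = bad.mean(axis=0) <= feature_threshold
    X = X[:, keep_features]
    keep_cells = np.isfinite(X).all(axis=1)
    return X[keep_cells], keep_features, keep_cells


def scale_inplace_sketch(X):
    """Scale each feature to unit variance by modifying X in place."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant features unchanged
    X /= std  # in-place division, no second copy of the matrix
    return X


# Rows are cells, columns are features.
X = np.array([
    [1.0, np.nan, 10.0],
    [2.0, np.nan, 20.0],
    [3.0, 5.0, np.inf],
    [4.0, np.nan, 40.0],
])
clean, keep_features, keep_cells = drop_na_sketch(X, feature_threshold=0.5)
scale_inplace_sketch(clean)
```

Filtering features first matters here: the second feature is mostly missing, so removing it before the per-cell check keeps three of the four cells instead of one.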
Batch Effects#
Tools to remove batch effects from single-cell morphological data.
|
Remove batch effects |
Feature Selection#
Tools to reduce the number of features based on correlation or confounder association.
|
Feature selection |
|
Compute pairwise correlations |
|
Perform Kruskal-Wallis H-test for each feature across batches. |
|
Filter features based on Kruskal-Wallis H-test statistics. |
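As a rough illustration of the two ideas in this section, the sketch below shows a greedy correlation filter and a per-feature Kruskal-Wallis H statistic across batches. Both functions are hypothetical helpers written in plain NumPy, not the library's API; the 0.9 threshold is an assumed value, and the H statistic omits the tie correction (SciPy's `scipy.stats.kruskal` provides the full test with p-values).

```python
import numpy as np


def correlation_filter_sketch(X, threshold=0.9):
    """Greedily keep features whose |correlation| with every
    already-kept feature is below `threshold` (assumed value)."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return kept


def kruskal_h_sketch(values, batches):
    """Kruskal-Wallis H statistic for one feature, without tie correction."""
    values = np.asarray(values, dtype=float)
    batches = np.asarray(batches)
    n = values.size
    ranks = np.empty(n)
    ranks[np.argsort(values)] = np.arange(1, n + 1)  # ranks 1..n
    total = 0.0
    for b in np.unique(batches):
        r = ranks[batches == b]
        total += r.sum() ** 2 / r.size
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
# The second feature is a noisy copy of the first; the third is independent.
X = np.column_stack([a, 2.0 * a + 0.01 * rng.normal(size=200), b])
kept = correlation_filter_sketch(X, threshold=0.9)

# A feature that separates two batches completely yields a large H.
h_separated = kruskal_h_sketch([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
# A feature whose ranks are balanced across batches yields H = 0.
h_balanced = kruskal_h_sketch([1, 2, 3, 4], [0, 1, 1, 0])
```

A large H for a feature suggests that its distribution differs between batches, which is one way such a feature could be flagged as confounded and filtered out.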
Aggregation#
Tools to aggregate profiles and compare them using several distance metrics.
For a simple aggregation, use aggregate. For a statistically robust distance
metric, use aggregate_mahalanobis.
|
Aggregate single-cell measurements into well-level profiles |
|
Measure per-feature distance between groups using t-statistics. |
|
Summarize t-statistics per group. |
|
Measure distance between groups using principal components weighted by variance explained. |
|
Measure distance between groups using Mahalanobis distance. |
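To make the difference between a simple aggregation and a Mahalanobis-based distance concrete, here is a minimal NumPy sketch. The helper names, the use of the median as the aggregate, and the choice of the control group's covariance are all assumptions for illustration; the library's own functions may differ in each of these respects.

```python
import numpy as np


def aggregate_median_sketch(X, wells):
    """Collapse single-cell rows into one median profile per well."""
    wells = np.asarray(wells)
    labels = np.unique(wells)
    profiles = np.vstack([np.median(X[wells == w], axis=0) for w in labels])
    return labels, profiles


def mahalanobis_sketch(treated, control):
    """Mahalanobis distance of the treated centroid from the control
    centroid, using the control covariance (an assumed choice)."""
    diff = treated.mean(axis=0) - control.mean(axis=0)
    cov = np.cov(control, rowvar=False)
    # The pseudo-inverse tolerates a singular covariance matrix.
    return float(np.sqrt(diff @ np.linalg.pinv(cov) @ diff))


control = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
treated = np.array([[3.0, 1.0], [5.0, 1.0]])

X = np.vstack([control, treated])
wells = ["A1"] * 4 + ["B1"] * 2
labels, profiles = aggregate_median_sketch(X, wells)
distance = mahalanobis_sketch(treated, control)
```

Unlike a plain Euclidean distance between aggregate profiles, the Mahalanobis distance rescales each direction by the control group's variability, so a shift along a noisy feature counts for less than the same shift along a stable one.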
Dimensionality reduction#
Tools to perform dimensionality reduction.
|
Principal component analysis [Pedregosa et al., 2011]. |
|
Compute a neighborhood graph of observations using the PCA representation. |
|
Embed the neighborhood graph using UMAP [McInnes et al., 2018]. |
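The first two steps of this workflow, PCA followed by a neighborhood search in PCA space, can be sketched in plain NumPy as below. This is an illustrative stand-in rather than the library's implementation: `pca_sketch` and `knn_sketch` are hypothetical names, the brute-force neighbor search is only practical for small data, and the UMAP embedding step is not reproduced here.

```python
import numpy as np


def pca_sketch(X, n_components=2):
    """PCA via SVD of the centered data matrix."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)  # variance ratio per component
    return scores, explained[:n_components]


def knn_sketch(scores, k=3):
    """Indices of the k nearest neighbors of each observation,
    using Euclidean distance in PCA space (brute force)."""
    d = np.linalg.norm(scores[:, None, :] - scores[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # exclude each point from its own neighbors
    return np.argsort(d, axis=1)[:, :k]


rng = np.random.default_rng(0)
# Five features; the first has a much larger variance than the rest.
X = rng.normal(size=(50, 5)) * np.array([10.0, 1.0, 1.0, 1.0, 1.0])
scores, explained = pca_sketch(X, n_components=2)
neighbors = knn_sketch(scores, k=3)
```

Building the neighborhood graph on the PCA scores rather than the raw features reduces noise and computation, which is why the neighbor step is described as operating on the PCA representation.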