`spateo.preprocessing.normalize`

Functions to scale single-cell data or to normalize it so that all row-wise sums are identical.

Module Contents

Functions

`_normalize_data`(X, counts[, after, copy, rows, round])	Row-wise or column-wise normalization of sparse data array.
`normalize_total`(→ Union[anndata.AnnData, Dict[str, ...)	Normalize counts per cell.

spateo.preprocessing.normalize._normalize_data(X, counts, after=None, copy=False, rows=True, round=False)

Row-wise or column-wise normalization of sparse data array.

Parameters

X: Sparse data array to modify.
counts: Array of shape [1, n], where n is the number of buckets or number of genes, containing the total counts in each cell or for each gene, respectively.
after: Target total count for each cell or each gene. Defaults to None, in which case each observation (cell) will have a total count equal to the median of the total counts for observations (cells) before normalization.
copy: Whether to operate on a copy of X.
rows: Whether to perform normalization over rows (normalize each cell to have the same total count number) or over columns (normalize each gene to have the same total count number).
round: Whether to round to three decimal places to more exactly match the desired number of total counts.
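The row-wise case described by these parameters can be sketched with SciPy sparse matrices. This is a minimal illustration and not spateo's implementation: the helper name `normalize_rows_sketch` is hypothetical, it always works on a copy, and it covers only the `rows=True` path, with `after=None` falling back to the median row total.

```python
import numpy as np
from scipy import sparse


def normalize_rows_sketch(X, after=None):
    """Scale each row of a sparse matrix so that its sum equals `after`.

    Hypothetical helper for illustration; not spateo's `_normalize_data`.
    """
    # Work on a float CSR copy so the caller's matrix is left untouched.
    X = sparse.csr_matrix(X).astype(np.float64)
    counts = np.asarray(X.sum(axis=1)).ravel()  # total counts per row (cell)
    if after is None:
        # Default target: median of the row totals before normalization.
        after = np.median(counts)
    # Per-row scale factor; all-zero rows get a factor of 0 and stay zero.
    scale = np.divide(after, counts, out=np.zeros_like(counts), where=counts > 0)
    # Left-multiplying by a diagonal matrix rescales each row.
    return sparse.diags(scale) @ X


X = sparse.csr_matrix(np.array([[1, 1, 0], [2, 0, 2], [0, 3, 3]]))
normalized = normalize_rows_sketch(X)
row_sums = np.asarray(normalized.sum(axis=1)).ravel()
```

Here the row totals before normalization are 2, 4 and 6, so with `after=None` every row is rescaled to sum to the median, 4.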

spateo.preprocessing.normalize.normalize_total(adata: anndata.AnnData, target_sum: Optional[float] = 10000.0, exclude_highly_expressed: bool = False, max_fraction: float = 0.05, key_added: Optional[str] = None, layer: Optional[str] = None, inplace: bool = True, copy: bool = False) → Union[anndata.AnnData, Dict[str, numpy.ndarray]]

Normalize counts per cell. Normalize each cell by total counts over all genes, so that every cell has the same total count after normalization.

If exclude_highly_expressed=True, very highly expressed genes are excluded from the computation of the normalization factor (size factor) for each cell. This is useful because such genes can strongly influence the normalized values of all other genes.
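One plausible reading of this exclusion rule, written with plain NumPy on a dense array, is sketched below. The function `size_factors_sketch` is hypothetical and is not taken from spateo; it assumes every cell has at least one count among the retained genes, and that a gene is flagged when it exceeds `max_fraction` of a cell's total counts in at least one cell.

```python
import numpy as np


def size_factors_sketch(X, target_sum=1e4, exclude_highly_expressed=False,
                        max_fraction=0.05):
    """Return per-cell size factors and a mask of excluded genes.

    Hypothetical helper for illustration; not spateo's `normalize_total`.
    """
    X = np.asarray(X, dtype=float)
    totals = X.sum(axis=1)  # total counts per cell
    highly = np.zeros(X.shape[1], dtype=bool)
    if exclude_highly_expressed:
        # Flag genes exceeding max_fraction of a cell's total in any cell.
        highly = (X > totals[:, None] * max_fraction).any(axis=0)
        # Recompute the per-cell totals without the flagged genes.
        totals = X[:, ~highly].sum(axis=1)
    if target_sum is None:
        target_sum = np.median(totals)
    return totals / target_sum, highly


X = np.array([[96.0, 2.0, 2.0],
              [1.0, 1.0, 2.0]])
factors, highly = size_factors_sketch(
    X, target_sum=1.0, exclude_highly_expressed=True, max_fraction=0.5
)
# Every gene, including the excluded one, is divided by the size factor.
normalized = X / factors[:, None]
```

In this toy input only the first gene is flagged (96 of 100 counts in the first cell), so the size factors are computed from the remaining two genes. As a consequence, under this reading the retained genes sum to `target_sum` in each cell, while the full row sums can be larger.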

Parameters

adata: The annotated data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.
target_sum: Desired sum of counts for each cell post-normalization. If None, each observation (cell) will have a total count after normalization equal to the median of the total counts for observations (cells) before normalization.
exclude_highly_expressed: Exclude (very) highly expressed genes for the computation of the normalization factor for each cell. A gene is considered highly expressed if it has more than max_fraction of the total counts in at least one cell.
max_fraction: If exclude_highly_expressed=True, this is the cutoff threshold for excluding genes.
key_added: Name of the field in adata.obs where the normalization factor is stored.
layer: Layer to normalize instead of X. If None, X is normalized.
inplace: Whether to update adata or return a dictionary with normalized copies of adata.X and adata.layers.
copy: Whether to modify a copy of the input object instead of the input itself. Not compatible with inplace=False.

Returns

Depending on inplace, either returns a dictionary with normalized copies of adata.X and adata.layers, or updates adata with the normalized versions of the original adata.X and adata.layers.
