spateo.tools.cluster.spagcn_utils#

Module Contents#

Classes#

GraphConvolution

Simple GCN layer, similar to https://arxiv.org/abs/1609.02907

simple_GC_DEC

Simple NN model constructed with a GraphConvolution layer followed by a DeepEmbeddingClustering layer.

simple_GC_DEC_PyG

NN model like simple_GC_DEC, but using torch_geometric.GCNConv as the GCN layer.

SpaGCN

Implementation of the SpaGCN algorithm; see https://doi.org/10.1038/s41592-021-01255-8

Functions#

calculate_adj_matrix(x, y[, x_pixel, y_pixel, image, ...])

(Part of the SpaGCN algorithm) Calculate the adjacency matrix from spatial coordinates and image pixels.

calculate_p(adj, l)

search_l(p, adj[, start, end, tol, max_run])

Search for a suitable l value for the SpaGCN algorithm.

get_cluster_num(adata, adj, res, tol, lr, max_epochs, l)

Get the initial number of clusters corresponding to a given Louvain resolution.

search_res(adata, adj, l, target_num[, start, step, ...])

Search for an initial Louvain resolution that yields the desired number of clusters in the SpaGCN algorithm.

refine(sample_id, pred, dis[, shape])

Refine (smooth) the boundaries of spatial domains (clusters).

spateo.tools.cluster.spagcn_utils.calculate_adj_matrix(x, y, x_pixel=None, y_pixel=None, image=None, beta=49, alpha=1, histology=True)[source]#

(Part of the SpaGCN algorithm) Calculate the adjacency matrix from spatial coordinates and image pixels.

Parameters
x : list

a list of the spatial x-coordinates of the spots.

y : list

a list of the spatial y-coordinates of the spots.

x_pixel : list, optional

a list of the x-pixel positions of the spots in the histology image. Defaults to None.

y_pixel : list, optional

a list of the y-pixel positions of the spots in the histology image. Defaults to None.

image : numpy.ndarray, optional

the image (typically a histology image) as a numpy.ndarray (can be obtained with cv2.imread). Defaults to None.

beta : int, optional

controls the size of the neighbourhood used when calculating the grey value for a spot. Defaults to 49.

alpha : int, optional

controls the color scale. Defaults to 1.

histology : bool, optional

whether the image is a histology image. Defaults to True.

Returns

the calculated adjacency matrix.

Return type

numpy.ndarray
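
As a rough illustration of the spatial-only case, the sketch below builds a pairwise Euclidean distance matrix from the x and y lists. It assumes that with histology=False the returned matrix holds plain pairwise distances between spots, as in the original SpaGCN code; this page does not confirm that. pairwise_distance_matrix is a hypothetical helper written in plain Python, not part of spateo.

```python
import math


def pairwise_distance_matrix(x, y):
    """Euclidean distance between every pair of spots (hypothetical helper)."""
    n = len(x)
    return [
        [math.hypot(x[i] - x[j], y[i] - y[j]) for j in range(n)]
        for i in range(n)
    ]


# Three toy spots; coordinates are made up for illustration.
x = [0.0, 3.0, 0.0]
y = [0.0, 4.0, 1.0]
adj = pairwise_distance_matrix(x, y)
```

The result is symmetric with a zero diagonal, which is the shape of input the later sketches on this page take as their distance matrix.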

spateo.tools.cluster.spagcn_utils.calculate_p(adj, l)[source]#
spateo.tools.cluster.spagcn_utils.search_l(p, adj, start=0.01, end=1000, tol=0.01, max_run=100)[source]#

Search for a suitable l value for the SpaGCN algorithm.

Parameters
p : float

parameter p in the SpaGCN algorithm. See SpaGCN for details.

adj : numpy.ndarray

the calculated adjacency matrix in the SpaGCN algorithm.

start : float, optional

lower boundary of search. Defaults to 0.01.

end : int, optional

upper boundary of search. Defaults to 1000.

tol : float, optional

step length for search. Defaults to 0.01.

max_run : int, optional

maximum number of search iterations. Defaults to 100.

Returns

the l value

Return type

float
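
This page gives no description for calculate_p, so the following is only a sketch of how p and l might relate. It assumes p is the mean over spots of the Gaussian-weighted contribution of all other spots, mean_i(sum_j exp(-d_ij^2 / (2 l^2))) - 1, which is the form used in the original SpaGCN code and is not confirmed here. The search is plain bisection, which may differ from what search_l does internally. gaussian_p and bisect_l are hypothetical names.

```python
import math


def gaussian_p(dist, l):
    """Assumed definition of p: mean row sum of Gaussian weights, minus self."""
    row_sums = [
        sum(math.exp(-(d * d) / (2 * l * l)) for d in row) for row in dist
    ]
    return sum(row_sums) / len(dist) - 1.0


def bisect_l(p, dist, start=0.01, end=1000.0, tol=0.01, max_run=100):
    """Bisection on l; relies on gaussian_p increasing with l."""
    lo, hi = start, end
    for _ in range(max_run):
        mid = (lo + hi) / 2
        p_mid = gaussian_p(dist, mid)
        if abs(p_mid - p) <= tol:
            return mid
        if p_mid < p:
            lo = mid  # weights too narrow, widen the kernel
        else:
            hi = mid  # weights too wide, narrow the kernel
    return None  # no l found within max_run iterations


# Toy distance matrix for three spots on a line, one unit apart.
dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
l_found = bisect_l(0.5, dist)
```

In this sketch tol is treated as the accepted deviation between the achieved and the requested p, which is an assumption about its meaning.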

spateo.tools.cluster.spagcn_utils.get_cluster_num(adata, adj, res, tol, lr, max_epochs, l, r_seed=100, t_seed=100, n_seed=100)[source]#

Get the initial number of clusters corresponding to a given Louvain resolution.

Parameters
adata

further passed to SpaGCN.train(), see SpaGCN.train.

adj

further passed to SpaGCN.train(), see SpaGCN.train.

res

further passed to SpaGCN.train(), see SpaGCN.train.

tol

further passed to SpaGCN.train(), see SpaGCN.train.

lr

further passed to SpaGCN.train(), see SpaGCN.train.

max_epochs

further passed to SpaGCN.train(), see SpaGCN.train.

l : float

parameter l in spagcn algorithm, see SpaGCN for details.

r_seed : int, optional

Global seed for random, torch, numpy. Defaults to 100.

t_seed : int, optional

Global seed for random, torch, numpy. Defaults to 100.

n_seed : int, optional

Global seed for random, torch, numpy. Defaults to 100.

Returns

number of clusters

Return type

int

spateo.tools.cluster.spagcn_utils.search_res(adata, adj, l, target_num, start=0.4, step=0.1, tol=0.005, lr=0.05, max_epochs=10, r_seed=100, t_seed=100, n_seed=100, max_run=10)[source]#

Search for an initial Louvain resolution that yields the desired number of clusters in the SpaGCN algorithm.

Parameters
adata : anndata.AnnData

an AnnData object.

adj : numpy.ndarray

the calculated adjacency matrix in the SpaGCN algorithm.

l : float

parameter l in spagcn algorithm, see SpaGCN for details.

target_num : int

desired number of clusters.

start : float, optional

the lower boundary of search for resolution. Defaults to 0.4.

step : float, optional

search step length. Defaults to 0.1.

tol

further passed to SpaGCN.train(), see SpaGCN.train.

lr

further passed to SpaGCN.train(), see SpaGCN.train.

max_epochs

further passed to SpaGCN.train(), see SpaGCN.train.

r_seed : int, optional

Global seed for random, torch, numpy. Defaults to 100.

t_seed : int, optional

Global seed for random, torch, numpy. Defaults to 100.

n_seed : int, optional

Global seed for random, torch, numpy. Defaults to 100.

max_run : int, optional

maximum number of iterations. Defaults to 10.

Returns

calculated initial louvain resolution.

Return type

float
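
One way to picture a resolution search driven by start, step and max_run is a stepwise walk that halves the step whenever it overshoots the target. The sketch below follows that idea; it is an assumption about the general strategy, not a copy of search_res. search_resolution is a hypothetical name, and toy_count is a made-up stand-in for get_cluster_num, which in practice would train a model to count clusters.

```python
def search_resolution(count_clusters, target_num, start=0.4, step=0.1, max_run=10):
    """Walk the resolution toward target_num, halving the step on overshoot."""
    res = start
    old_num = count_clusters(res)
    for _ in range(max_run):
        if old_num == target_num:
            return res
        sign = 1 if old_num < target_num else -1
        new_res = res + step * sign
        new_num = count_clusters(new_res)
        if new_num == target_num:
            return new_res
        new_sign = 1 if new_num < target_num else -1
        if new_sign == sign:
            # Still on the same side of the target: accept the move.
            res, old_num = new_res, new_num
        else:
            # Overshot the target: stay put and try a smaller step.
            step /= 2
    return res  # best effort after max_run iterations


def toy_count(res):
    """Stand-in for get_cluster_num: more clusters at higher resolution."""
    return round(res * 10)


res_found = search_resolution(toy_count, target_num=7)
```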

spateo.tools.cluster.spagcn_utils.refine(sample_id, pred, dis, shape='square')[source]#

Refine (smooth) the boundaries of spatial domains (clusters).

Parameters
sample_id : list

list of sample (cell, spot or bin) names.

pred : list

list of spatial domains corresponding to the sample_id list.

dis : numpy.ndarray

the calculated adjacency matrix in the SpaGCN algorithm.

shape : str, optional

the spatial topology used for smoothing: “hexagon” for Visium data, “square” for ST data. Defaults to “square”.

Returns

list of refined spatial domains corresponding to the sample_id list.

Return type

list
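
Boundary smoothing of this kind can be sketched as a majority vote among each spot's nearest neighbours. The version below assumes 4 neighbours for “square” and 6 for “hexagon”, and relabels a spot only when its own label is in the minority and another label holds a clear majority, as in the original SpaGCN code; none of this is confirmed by this page. refine_labels is a hypothetical name, and dis is taken to hold pairwise distances.

```python
from collections import Counter


def refine_labels(sample_id, pred, dis, shape="square"):
    """Majority-vote smoothing over each spot's nearest neighbours (sketch)."""
    num_nbs = {"hexagon": 6, "square": 4}[shape]
    n = len(sample_id)
    refined = []
    for i in range(n):
        # The spot itself (distance 0) plus its num_nbs nearest spots.
        nearest = sorted(range(n), key=lambda j: dis[i][j])[: num_nbs + 1]
        counts = Counter(pred[j] for j in nearest)
        top_label, top_count = counts.most_common(1)[0]
        if counts[pred[i]] < num_nbs / 2 and top_count > num_nbs / 2:
            refined.append(top_label)
        else:
            refined.append(pred[i])
    return refined


# Five toy spots on a line; the middle one carries an isolated label.
sample_id = ["s0", "s1", "s2", "s3", "s4"]
pred = [0, 0, 1, 0, 0]
dis = [[abs(i - j) for j in range(5)] for i in range(5)]
refined = refine_labels(sample_id, pred, dis, shape="square")
```

In the toy data the isolated label on the middle spot is replaced by the label of its neighbours, while the other spots keep their labels.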

class spateo.tools.cluster.spagcn_utils.GraphConvolution(in_features, out_features, bias=True)[source]#

Bases: torch.nn.Module

Simple GCN layer, similar to https://arxiv.org/abs/1609.02907

reset_parameters()[source]#
forward(input, adj)[source]#
__repr__()[source]#

Return repr(self).
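
For orientation, a layer in the style of the cited paper computes adj · (input · W) + bias. The plain-Python sketch below shows that arithmetic on nested lists. It assumes forward(input, adj) follows this form, which this page does not spell out; the actual class works on torch tensors, and matmul and graph_convolution are hypothetical names.

```python
def matmul(a, b):
    """Naive matrix product of two nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def graph_convolution(x, adj, weight, bias=None):
    """Assumed GCN layer arithmetic: adj @ (x @ weight) + bias."""
    support = matmul(x, weight)   # project node features
    out = matmul(adj, support)    # aggregate over the graph
    if bias is not None:
        out = [[v + b for v, b in zip(row, bias)] for row in out]
    return out


# Two nodes with 2 input features each, projected to 1 output feature.
x = [[1, 0], [0, 1]]
adj = [[1, 1], [0, 1]]
weight = [[2], [3]]
out = graph_convolution(x, adj, weight, bias=[1])
```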

class spateo.tools.cluster.spagcn_utils.simple_GC_DEC(nfeat, nhid, alpha=0.2)[source]#

Bases: torch.nn.Module

Simple NN model constructed with a GraphConvolution layer followed by a DeepEmbeddingClustering layer. For DEC, see https://arxiv.org/abs/1511.06335v2

forward(x, adj)[source]#
loss_function(p, q)[source]#
target_distribution(q)[source]#
fit(X, adj, lr=0.001, max_epochs=5000, update_interval=3, trajectory_interval=50, weight_decay=0.0005, opt='sgd', init='louvain', n_neighbors=10, res=0.4, n_clusters=10, init_spa=True, tol=0.001)[source]#
predict(X, adj)[source]#
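
The roles of target_distribution(q) and loss_function(p, q) can be illustrated with the formulas from the DEC paper linked above: the target p sharpens the soft assignments q by squaring them and normalising by cluster frequency, and the loss is the KL divergence between p and q. The sketch below is plain Python on nested lists; averaging the loss over samples is an assumption, and the class's own methods may differ in detail.

```python
import math


def target_distribution(q):
    """DEC target: p_ij proportional to q_ij**2 / sum_i q_ij, rows normalised."""
    col_sums = [sum(row[j] for row in q) for j in range(len(q[0]))]
    p = [[(v * v) / col_sums[j] for j, v in enumerate(row)] for row in q]
    return [[v / sum(row) for v in row] for row in p]


def kl_loss(p, q):
    """KL(p || q), averaged over samples (the averaging is an assumption)."""
    total = sum(
        pv * math.log(pv / qv)
        for p_row, q_row in zip(p, q)
        for pv, qv in zip(p_row, q_row)
        if pv > 0
    )
    return total / len(p)


# Soft assignments of two samples to two clusters (made-up values).
q = [[0.7, 0.3], [0.4, 0.6]]
p = target_distribution(q)
loss = kl_loss(p, q)
```

Each row of p still sums to one, but the larger entry in every row grows, which is what pushes assignments toward higher confidence during training.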
class spateo.tools.cluster.spagcn_utils.simple_GC_DEC_PyG(nfeat, nhid, alpha=0.2)[source]#

Bases: simple_GC_DEC

NN model like simple_GC_DEC, but using torch_geometric.GCNConv as the GCN layer.

forward(x, edge_index, edge_attr)[source]#
fit(X, adj, lr=0.001, max_epochs=5000, update_interval=3, trajectory_interval=50, weight_decay=0.0005, opt='sgd', init='louvain', n_neighbors=10, res=0.4, n_clusters=10, init_spa=True, tol=0.001)[source]#
predict(X, adj)[source]#
class spateo.tools.cluster.spagcn_utils.SpaGCN[source]#

Bases: object

Implementation of the SpaGCN algorithm; see https://doi.org/10.1038/s41592-021-01255-8

set_l(l)[source]#
train(adata, adj, num_pcs=50, lr=0.005, max_epochs=2000, weight_decay=0, opt='adam', init_spa=True, init='louvain', n_neighbors=10, n_clusters=None, res=0.4, tol=0.001)[source]#

Train the SpaGCN model.

Parameters
adata : anndata.AnnData

an AnnData object.

adj : numpy.ndarray

the calculated adjacency matrix in the SpaGCN algorithm.

num_pcs : int, optional

number of principal components (output dimension of PCA) to use. Defaults to 50.

lr : float, optional

learning rate in neural network. Defaults to 0.005.

max_epochs : int, optional

max epochs to train in neural network. Defaults to 2000.

weight_decay : float, optional

weight decay (L2 penalty) passed to the optimizer. Defaults to 0.

opt : str, optional

the optimizer to use. Defaults to “adam”.

init_spa : bool, optional

make initial clusters with louvain or kmeans. Defaults to True.

init : str, optional

algorithm to use for initial clustering. Supports “louvain” and “kmeans”. Defaults to “louvain”.

predict()[source]#