Note

This page was generated from 5_ap&dv_diff_analysis.ipynb. Interactive online version: Colab badge. Some tutorial content may look better in light mode.

5.AP&DV axis analysis#

Packages#

[1]:
import os
import sys

import numpy as np
import gseapy as gp
from sklearn.decomposition import PCA

import dynamo as dyn
import spateo as st
/home/yao/anaconda3/envs/BGIpy38_tf2/lib/python3.8/site-packages/requests/__init__.py:102: RequestsDependencyWarning: urllib3 (1.26.9) or chardet (5.0.0)/charset_normalizer (2.0.12) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({})/charset_normalizer ({}) doesn't match a supported "
|-----> setting visualization default mode in dynamo. Your customized matplotlib settings might be overritten.
/home/yao/.local/lib/python3.8/site-packages/nxviz/__init__.py:18: UserWarning:
nxviz has a new API! Version 0.7.3 onwards, the old class-based API is being
deprecated in favour of a new API focused on advancing a grammar of network
graphics. If your plotting code depends on the old API, please consider
pinning nxviz at version 0.7.3, as the new API will break your old code.

To check out the new API, please head over to the docs at
https://ericmjl.github.io/nxviz/ to learn more. We hope you enjoy using it!

(This deprecation message will go away in version 1.0.)

  warnings.warn(
/home/yao/anaconda3/envs/BGIpy38_tf2/lib/python3.8/site-packages/spaghetti/network.py:36: FutureWarning:

The next major release of pysal/spaghetti (2.0.0) will drop support for all ``libpysal.cg`` geometries. This change is a first step in refactoring ``spaghetti`` that is expected to result in dramatically reduced runtimes for network instantiation and operations. Users currently requiring network and point pattern input as ``libpysal.cg`` geometries should prepare for this simply by converting to ``shapely`` geometries.

Data source#

[2]:
embryo_adata = st.sample_data.drosophila(filename="E7-9h_cellbin_tdr_v2.h5ad")
embryo_adata.X = embryo_adata.layers["counts_X"]
embryo_adata.layers["log1p_X"] = np.log1p(embryo_adata.layers["counts_X"])
embryo_adata.uns["pp"] = {}
dyn.pp.normalize_cell_expr_by_size_factors(adata=embryo_adata, layers="X", X_total_layers=True,  skip_log=True)

# The embryo pc model and mesh model are reconstructed by ``2.three dims models reconstruction``
embryo_pc = st.tdr.read_model(filename="embryo_pc_model.vtk")
embryo_mesh = st.tdr.read_model(filename="embryo_mesh_model.vtk")

# cpo
cpo = [(553.2878243418567, 1098.4674808068507, 277.4399476053088),
 (1.9670869138005287, -6.902875264241757, -2.2120172004343885),
 (-0.16299443079060863, -0.16480753930466982, 0.9727647662819544)]
|-----> rounding expression data of layer: X during size factor calculation
|-----> size factor normalize following layers: ['X']
|-----? `total_szfactor` is not `None` and it is not in adata object.
|-----> skipping log transformation as input requires...
|-----> applying None to layer<X>
|-----> set adata <X> to normalized data.
|-----> <insert> pp.norm_method to uns in AnnData Object.

Calculation of AP&DV axis#

[4]:
pca = PCA(n_components=3)
pca_spatial = pca.fit_transform(np.asarray(embryo_adata.obsm["tdr_spatial"])).astype(int)
embryo_adata.obs["ap_axis"] = pca_spatial[:, [0]]
embryo_adata.obs["dv_axis"] = pca_spatial[:, [2]]
embryo_adata.obsm["pca_spatial"] = pca_spatial
[15]:
empty_array = np.zeros(shape=[2, 1])
ap_line = st.tdr.construct_axis_line(
    axis_points=np.concatenate(
        [np.asarray([pca_spatial[:, [0]].min() - 30, pca_spatial[:, [0]].max() + 30]).reshape(-1, 1), empty_array, empty_array],
        axis=1
    ),
    key_added="axis",
    label="AP axis",
    color="orangered",
)
ap_line.points = st.tl.rigid_transform_3D(
    coords=np.asarray(ap_line.points),
    coords_refA=np.asarray(embryo_adata.obsm["tdr_spatial"]),
    coords_refB=np.asarray(embryo_adata.obsm["3d_align_spatial"])
)

dv_line = st.tdr.construct_axis_line(
    axis_points=np.concatenate(
        [empty_array, empty_array,  np.asarray([pca_spatial[:, [2]].min() - 30, pca_spatial[:, [2]].max() + 30]).reshape(-1, 1),],
        axis=1
    ),
    key_added="axis",
    label="DV axis",
    color="skyblue",
)
dv_line.points = st.tl.rigid_transform_3D(
    coords=np.asarray(dv_line.points),
    coords_refA=np.asarray(embryo_adata.obsm["tdr_spatial"]),
    coords_refB=np.asarray(embryo_adata.obsm["3d_align_spatial"])
)
[17]:
st.pl.three_d_plot(
    model=st.tdr.collect_models([embryo_mesh, embryo_pc, ap_line, dv_line]),
    model_style=["surface", "points", "wireframe", "wireframe"],
    colormap=["gainsboro", "gainsboro", "orangered", "skyblue"],
    model_size=[3, 3, 8, 8],
    opacity=[0.2, 0.3, 1, 1],
    cpo=cpo,
    jupyter="static",
    show_legend=False,
    window_size=(512, 512),
)
../../../_images/tutorials_notebooks_5_3d_reconstruction_5_ap%26dv_diff_analysis_8_0.png

Differential genes expression tests along AP/DV axis using generalized linear regressions#

[18]:
axis="dv_axis"
glm_key = f"glm_degs_{axis}"
st.tl.glm_degs(adata=embryo_adata, fullModelFormulaStr=f"~cr({axis}, df=3)", key_added=glm_key, qval_threshold=0.01, llf_threshold=-2500)
embryo_glm_data = embryo_adata.uns[glm_key]["glm_result"]
embryo_glm_data
|-----? Gene expression matrix must be normalized by the size factor, please check if the input gene expression matrix is correct.If you don't have the size factor normalized gene expression matrix, please run `dynamo.pp.normalize_cell_expr_by_size_factors(skip_log = True)`.
|-----> [Detecting genes via Generalized Additive Models (GAMs)] in progress: 100.0000%
|-----> [Detecting genes via Generalized Additive Models (GAMs)] finished [2202.5779s]
[18]:
status family log-likelihood pval qval
mt:lrRNA ok NB2 -68868.195312 0.000000 0.000000
mt:ND5 ok NB2 -24257.716797 0.000000 0.000000
mt:srRNA ok NB2 -23115.775391 0.000000 0.000000
Nplp2 ok NB2 -23097.445312 0.000000 0.000000
lncRNA:CR30009 ok NB2 -22438.078125 0.000000 0.000000
... ... ... ... ... ...
LBR ok NB2 -7330.010254 0.003961 0.009829
CG14478 ok NB2 -10290.786133 0.003970 0.009844
CG13220 ok NB2 -7559.080078 0.003995 0.009900
RpL34a ok NB2 -28962.595703 0.004013 0.009940
RanGAP ok NB2 -4263.407227 0.004015 0.009941

1760 rows × 5 columns

[19]:
st.pl.glm_fit(
    adata=embryo_adata,
    gene=embryo_glm_data.index.tolist()[:10],
    ncols=5,
    feature_x=axis,
    feature_y="expression",
    feature_fit="mu",
    glm_key=glm_key,
    save_show_or_return="show",
)
../../../_images/tutorials_notebooks_5_3d_reconstruction_5_ap%26dv_diff_analysis_11_0.png

GO enrichment based on enrichr#

[20]:
gp.get_library_name(organism="fly")
[20]:
['Allele_LoF_Phenotypes_from_FlyBase_2017',
 'Allele_Phenotypes_from_FlyBase_2017',
 'Anatomy_AutoRIF',
 'Anatomy_AutoRIF_Predicted_zscore',
 'Anatomy_GeneRIF',
 'Anatomy_GeneRIF_Predicted_zscore',
 'Coexpression_Predicted_GO_Biological_Process_2018',
 'Coexpression_Predicted_GO_Cellular_Component_2018',
 'Coexpression_Predicted_GO_Molecular_Function_2018',
 'GO_Biological_Process_2018',
 'GO_Biological_Process_AutoRIF',
 'GO_Biological_Process_AutoRIF_Predicted_zscore',
 'GO_Biological_Process_GeneRIF',
 'GO_Biological_Process_GeneRIF_Predicted_zscore',
 'GO_Cellular_Component_2018',
 'GO_Cellular_Component_AutoRIF',
 'GO_Cellular_Component_AutoRIF_Predicted_zscore',
 'GO_Cellular_Component_GeneRIF',
 'GO_Cellular_Component_GeneRIF_Predicted_zscore',
 'GO_Molecular_Function_2018',
 'GO_Molecular_Function_AutoRIF',
 'GO_Molecular_Function_AutoRIF_Predicted_zscore',
 'GO_Molecular_Function_GeneRIF',
 'GO_Molecular_Function_GeneRIF_Predicted_zscore',
 'Human_Disease_from_FlyBase_2017',
 'InterPro_Domains_2019',
 'KEGG_2019',
 'PPI_Network_Hubs_from_DroID_2017',
 'Pfam_Domains_2019',
 'Phenotype_AutoRIF',
 'Phenotype_AutoRIF_Predicted_zscore',
 'Phenotype_GeneRIF',
 'Phenotype_GeneRIF_Predicted_zscore',
 'Putative_Regulatory_miRNAs_from_DroID_2015',
 'RNAi_Screens_from_GenomeRNAi_2017',
 'TF2DNA_2018',
 'Transcription_Factors_from_DroID_2015',
 'WikiPathways_2018']
[13]:
embryo_glm_go = gp.enrichr(
    gene_list=embryo_glm_data.index.tolist(),
    gene_sets="GO_Biological_Process_2018",
    organism="fly",
    outdir="go_bp_result",
    no_plot=True,
    verbose=True
)
embryo_glm_go_results = embryo_glm_go.results
embryo_glm_go_results.head(10)
2022-10-14 01:40:26,460 Connecting to Enrichr Server to get latest library names
2022-10-14 01:40:27,968 Analysis name: , Enrichr Library: GO_Biological_Process_2018
2022-10-14 01:40:41,966 Save file of enrichment results: Job Id:d458d70c1d419fc377755eeda25e566f
2022-10-14 01:40:41,980 Done.

[13]:
Gene_set Term Overlap P-value Adjusted P-value Old P-value Old Adjusted P-value Z-score Combined Score Genes
0 GO_Biological_Process_2018 regulation of transcription, DNA-templated (GO... 161/623 0.000000e+00 0.000000e+00 9.892965e-08 0.000055 -1.047207 88.525079 jigr1;fs(1)h;lmd;REPTOR-BP;pum;lola;peb;skd;Ol...
1 GO_Biological_Process_2018 negative regulation of cellular macromolecule ... 67/179 0.000000e+00 0.000000e+00 2.596111e-08 0.000021 -1.416952 82.517928 fs(1)h;bru3;msi;bru2;Kr;Fmr1;pum;Optix;Rbfox1;...
2 GO_Biological_Process_2018 regulation of transcription from RNA polymeras... 154/630 0.000000e+00 0.000000e+00 3.217498e-06 0.000560 -1.040632 76.870319 jigr1;lmd;Tet;Axud1;lola;peb;skd;asf1;gcm;exd;...
3 GO_Biological_Process_2018 negative regulation of gene expression (GO:001... 88/245 0.000000e+00 0.000000e+00 8.512170e-10 0.000002 -1.053588 76.396113 fs(1)h;bru3;msi;bru2;Ge-1;Kr;Fmr1;pum;put;Opti...
4 GO_Biological_Process_2018 positive regulation of transcription, DNA-temp... 109/379 0.000000e+00 0.000000e+00 2.056014e-07 0.000092 -1.006687 67.492905 lmd;Tet;REPTOR-BP;Axud1;lola;Rbfox1;gcm;exd;mo...
5 GO_Biological_Process_2018 axon guidance (GO:0007411) 82/242 0.000000e+00 0.000000e+00 2.779852e-08 0.000021 -1.056971 66.548159 Nrg;Kr;Fmr1;egh;ago;lola;peb;Oli;put;Ptp69D;gc...
6 GO_Biological_Process_2018 negative regulation of nucleic acid-templated ... 51/132 0.000000e+00 0.000000e+00 5.656474e-07 0.000141 -1.398374 64.769205 fs(1)h;Kr;pum;Optix;Dsp1;lolal;E(z);MBD-like;p...
7 GO_Biological_Process_2018 nervous system development (GO:0007399) 79/246 0.000000e+00 0.000000e+00 3.110317e-07 0.000103 -1.088587 61.710948 Fmr1;Rbfox1;exd;barr;Nrt;Adf1;E(z);pim;Myc;Doa...
8 GO_Biological_Process_2018 eye morphogenesis (GO:0048592) 38/114 1.765000e-13 9.207400e-12 1.981325e-04 0.009456 -2.018161 59.264058 hyx;bi;dome;put;Optix;Hrb98DE;sd;ave;p120ctn;t...
9 GO_Biological_Process_2018 cytoskeleton-dependent cytokinesis (GO:0061640) 31/69 2.100000e-15 1.540000e-13 9.151327e-06 0.000892 -1.745305 58.958696 Act42A;scra;sqh;Drp1;chic;feo;tsr;Sep1;pim;Zw1...
[14]:
gp.barplot(embryo_glm_go_results, column='Combined Score', title='GO_Biological_Process_2018', cutoff=0.05, top_term=100,
           figsize=(10, 25), ofname=f"go_bp_barplot.pdf")
[15]:
gp.dotplot(embryo_glm_go_results, column='Combined Score', title='GO_Biological_Process_2018', cmap='viridis_r', cutoff=0.05, top_term=100,
           figsize=(10, 25), ofname=f"go_bp_dotplot.pdf")