EnrichMap tutorial for one sample
Using one gene signature
This tutorial shows how to use EnrichMap on a single slide, starting with how a single gene set is supplied to the framework.
import os
os.environ["PYTHONWARNINGS"] = "ignore"
import warnings
warnings.filterwarnings("ignore")
Import the packages required for a minimal example.
import scanpy as sc
import squidpy as sq
import enrichmap as em
sc.set_figure_params(frameon=False)
Load the dataset built into the Squidpy package:
adata = sq.datasets.visium_hne_adata()
Downloading data from 'https://exampledata.scverse.org/squidpy/figshare/visium_hne_adata.h5ad' to file '/Users/cenkcelik/Downloads/files/data/anndata/visium_hne_adata.h5ad'.
100%|████████████████████████████████████████| 329M/329M [00:00<00:00, 679GB/s]
As the data are shared as raw counts, we normalise the total counts per spot and then log-transform them.
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
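To make the two preprocessing calls above concrete, here is a minimal pure-Python sketch of the arithmetic on a single toy spot. The helper names and the toy counts are illustrative assumptions, not part of scanpy or EnrichMap; scanpy applies the same idea to every spot in the matrix.

```python
import math


def normalise_total(counts, target_sum=1e4):
    """Scale one spot's counts so that they sum to target_sum."""
    total = sum(counts)
    if total == 0:
        # An empty spot stays empty rather than dividing by zero.
        return [0.0 for _ in counts]
    return [c * target_sum / total for c in counts]


def log1p_transform(values):
    """Apply the natural-log transform log(1 + x) to each value."""
    return [math.log1p(v) for v in values]


toy_spot = [10, 30, 60]  # hypothetical raw counts for three genes
normalised = normalise_total(toy_spot)
logged = log1p_transform(normalised)

print(normalised)  # counts rescaled to a common library size
print(logged)      # compressed dynamic range, gene ordering preserved
```

After this step, spots with different sequencing depths are on a comparable scale, which matters because the signature score combines expression values across genes.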
Now, let’s have a look at the cell type annotations.
Here, we aim to recover pyramidal layer cells using a 10-gene signature.
gene_set = [
"Lefty1",
"6330420H09Rik",
"B230110G15Rik",
"Fgf16",
"Fibcd1",
"4921539H07Rik",
"Spink8",
"Efcab6",
"Tmprss6",
"Gm2115",
]
em.tl.score(adata, gene_set=gene_set)
Scoring enrichmap: 10/10 genes found: 100%|██████████| 1/1 [00:00<00:00, 1.45it/s]
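The progress line reports how many genes of the signature were found in the dataset. If you want to check this before scoring, a small helper along the following lines can separate present from missing genes. This is a sketch, not EnrichMap's own code: the function name and the toy gene lists are assumptions, and in practice the available genes would come from `adata.var_names`.

```python
def split_by_availability(gene_set, available_genes):
    """Return (found, missing) genes, preserving the order of gene_set."""
    available = set(available_genes)
    found = [g for g in gene_set if g in available]
    missing = [g for g in gene_set if g not in available]
    return found, missing


# Toy stand-ins; with real data use list(adata.var_names) instead.
toy_var_names = ["Lefty1", "Fibcd1", "Spink8", "Penk"]
toy_signature = ["Lefty1", "Fibcd1", "Spink8", "NotAGene"]

found, missing = split_by_availability(toy_signature, toy_var_names)
print(f"{len(found)}/{len(toy_signature)} genes found; missing: {missing}")
```

Matching is exact and case-sensitive, so mouse symbols such as `Lefty1` will not match a human-style `LEFTY1`.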
Let’s now investigate which genes influenced the EnrichMap score the most. Here, we show the top three genes.
Here is the individual gene expression for the top three genes.
sq.pl.spatial_scatter(adata, color=["Fibcd1", "Lefty1", "Spink8"], size=20, shape=None, use_raw=False)
As can be seen from the spatial enrichment map, the genes in the set are enriched in specific regions of the tissue. The gene contributions plot shows the top contributing genes to the enrichment score, indicating their relative importance in the spatial context.
Evaluate spatial autocorrelation and smoothness
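Moran’s I indicates high spatial autocorrelation of the EnrichMap scores: spots with high scores tend to neighbour other high-scoring spots, and low-scoring spots tend to neighbour other low-scoring spots. In other words, each score agrees with its spatial lag (the average score of its neighbours), which indicates spatially coherent scoring.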
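For readers unfamiliar with the statistic, the following pure-Python sketch computes global Moran's I from its textbook definition on a toy line of six spots. It is an illustration under stated assumptions (binary neighbour weights, hypothetical helper names), not the implementation used by EnrichMap or Squidpy.

```python
def morans_i(values, weights):
    """Global Moran's I: (n / W) * sum_ij w_ij * d_i * d_j / sum_i d_i^2,
    where d_i is the deviation of value i from the mean and W is the
    sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


def chain_weights(n):
    """Binary adjacency for n spots on a line: neighbours are i-1 and i+1."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = chain_weights(6)
clustered = [1, 1, 1, 0, 0, 0]    # similar values sit next to each other
alternating = [1, 0, 1, 0, 1, 0]  # every neighbour differs

print(morans_i(clustered, w))    # positive: spatially coherent pattern
print(morans_i(alternating, w))  # negative: checkerboard-like pattern
```

Values near +1 correspond to the clustered situation described above, values near 0 to a spatially random pattern, and negative values to neighbouring spots that systematically differ.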
We can also explore how the EnrichMap scores change across the tissue. A variogram tells the user whether the scores form a spatially continuous pattern: a sharp increase in the semivariance at short distances, followed by a plateau, is expected.
The variogram above shows that scores are similar at small spatial lags, which indicates spatial coherence. As the distance between two spots (or cells) increases, the semivariance also increases, meaning that spots (or cells) that are farther apart are less similar. The plateau marks the range beyond which spatial dependence is no longer present.
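As a rough illustration of what a variogram measures, the sketch below computes an empirical semivariogram, i.e. half the mean squared difference between pairs of values grouped by their separation distance. The function name, the binning scheme and the toy transect are assumptions made for this example; they are not taken from EnrichMap.

```python
import math


def empirical_variogram(coords, values, bin_width, n_bins):
    """Return (mean pair distance, semivariance) for each non-empty lag bin.

    Semivariance for a bin is sum((z_i - z_j)^2) / (2 * number of pairs),
    taken over all pairs whose distance falls into that bin.
    """
    sq_sums = [0.0] * n_bins
    dist_sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            b = int(d // bin_width)
            if b < n_bins:
                sq_sums[b] += (values[i] - values[j]) ** 2
                dist_sums[b] += d
                counts[b] += 1
    return [
        (dist_sums[b] / counts[b], sq_sums[b] / (2 * counts[b]))
        for b in range(n_bins)
        if counts[b] > 0
    ]


# Toy transect: six spots on a line carrying a smooth gradient of scores.
coords = [(float(x), 0.0) for x in range(6)]
values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

for lag, gamma in empirical_variogram(coords, values, bin_width=1.0, n_bins=6):
    print(f"lag {lag:.1f}: semivariance {gamma:.2f}")
```

On this toy gradient the semivariance keeps rising with distance because the trend never levels off; in a tissue with bounded spatial domains it would flatten into the plateau described above.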
Using more than one signature
To explore more than one signature of interest, EnrichMap accepts a dictionary of signatures: each key is used to name the scores stored in adata.obs, and the corresponding value is the gene set.
Let’s now try to score for striatum, alongside pyramidal cells.
signature_dict = {
"pyramidal": gene_set,
"striatum": [
"Ucn3",
"Cyp19a1",
"Gm48485",
"Chrna10",
"Sstr5",
"6430628N08Rik",
"Ido1",
"Adora2a",
"Ccdc42",
"Penk",
],
}
em.tl.score(adata, gene_set=signature_dict)
Scoring striatum: 10/10 genes found: 100%|██████████| 2/2 [00:01<00:00, 1.59it/s]
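One way to picture how a single list and a dictionary can be handled uniformly is the sketch below, which wraps a plain list under a default name and leaves a dictionary as is. This is a hypothetical helper, not EnrichMap's internal code; the default key "enrichmap" is only inferred from the "Scoring enrichmap" progress message shown earlier.

```python
def as_signature_dict(gene_set, default_key="enrichmap"):
    """Return a {name: [genes]} mapping for either a list or a dict input.

    default_key is an assumption based on the progress message printed
    when a single list is scored.
    """
    if isinstance(gene_set, dict):
        return {name: list(genes) for name, genes in gene_set.items()}
    return {default_key: list(gene_set)}


single = as_signature_dict(["Lefty1", "Fibcd1"])
multiple = as_signature_dict({"pyramidal": ["Lefty1"], "striatum": ["Penk"]})

for name, genes in {**single, **multiple}.items():
    print(f"{name}: {len(genes)} gene(s)")
```

Choosing short, descriptive keys is worthwhile, since according to the description above they determine how the scores are named in `adata.obs`.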