enrichmap.pl.compare_morans_i

enrichmap.pl.compare_morans_i()

Compare spatial autocorrelation of EnrichMap scores across patients using permutation-standardised Moran’s I.

Moran’s I measures the degree of spatial autocorrelation in a score field: values near +1 indicate strong positive autocorrelation (similar scores cluster together), values near 0 indicate spatial randomness, and values near −1 indicate a chequerboard-like pattern where neighbours tend to have dissimilar scores.
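
The two extremes described above can be reproduced by hand on a toy grid. The following is a minimal, dependency-free sketch, assuming binary rook-adjacency weights on a small square grid. The helpers `grid_neighbors` and `morans_i` are illustrative names only and are not part of EnrichMap, which builds its spatial graph with squidpy.

```python
def grid_neighbors(n_rows, n_cols):
    """Rook (4-neighbour) adjacency on a grid: {spot index: [neighbour indices]}."""
    nbrs = {}
    for r in range(n_rows):
        for c in range(n_cols):
            nbrs[r * n_cols + c] = [
                rr * n_cols + cc
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < n_rows and 0 <= cc < n_cols
            ]
    return nbrs


def morans_i(values, nbrs):
    """Moran's I with binary weights: (N / W) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    w_total = sum(len(js) for js in nbrs.values())
    cross = sum(z[i] * z[j] for i, js in nbrs.items() for j in js)
    return (n / w_total) * cross / sum(d * d for d in z)


nbrs = grid_neighbors(6, 6)
# Left half low, right half high: similar scores sit next to each other.
clustered = [0.0 if c < 3 else 1.0 for r in range(6) for c in range(6)]
# Alternating scores: every neighbour differs (chequerboard).
checker = [float((r + c) % 2) for r in range(6) for c in range(6)]

print(round(morans_i(clustered, nbrs), 3))  # 0.8
print(round(morans_i(checker, nbrs), 3))    # -1.0
```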

However, raw Moran’s I values depend on the spatial weights matrix (i.e. the tissue architecture), so they are not directly comparable across slides with different geometries, spot densities or tissue shapes. This function addresses that by computing a permutation z-score for each patient: the observed I is standardised against a null distribution built by randomly permuting scores on that patient’s own spatial graph. A z-score of 25 means “the score field is 25 standard deviations more spatially clustered than expected under random placement on this particular tissue”, and that statement is valid regardless of graph topology.
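
The standardisation described above can be sketched as follows. This is an illustrative, dependency-free version, not the function's actual implementation: the toy line graph, the helper names `morans_i` and `permutation_z_score`, and the use of the sample standard deviation are all assumptions made for the example.

```python
import random
import statistics


def morans_i(values, nbrs):
    """Moran's I with binary weights on an adjacency dict {i: [j, ...]}."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    w_total = sum(len(js) for js in nbrs.values())
    cross = sum(z[i] * z[j] for i, js in nbrs.items() for j in js)
    return (n / w_total) * cross / sum(d * d for d in z)


def permutation_z_score(values, nbrs, n_permutations=199, seed=0):
    """Standardise the observed I against shuffles of the scores on the same graph."""
    rng = random.Random(seed)
    observed = morans_i(values, nbrs)
    shuffled = list(values)
    null = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # the graph stays fixed; only the scores move
        null.append(morans_i(shuffled, nbrs))
    expected = statistics.fmean(null)
    std = statistics.stdev(null)  # sample SD; an assumption of this sketch
    return observed, expected, std, (observed - expected) / std


# Toy graph: 40 spots on a line, each linked to its immediate neighbours.
n_spots = 40
nbrs = {i: [j for j in (i - 1, i + 1) if 0 <= j < n_spots] for i in range(n_spots)}
gradient = [float(i) for i in range(n_spots)]  # smooth trend: strongly autocorrelated

obs, exp_i, std_i, z = permutation_z_score(gradient, nbrs)
print(f"I={obs:.3f}  E[I]={exp_i:.3f}  sd={std_i:.3f}  z={z:.1f}")
```

Because the null distribution is built on the same graph as the observed value, the resulting z-score already accounts for that graph's size and connectivity.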

The function reports both the raw Moran’s I (useful for understanding the absolute autocorrelation) and the z-score (useful for cross-patient comparison). The per-patient permutation p-value indicates whether each individual patient’s scores are significantly spatially autocorrelated.

Statistical testing

When exactly two patients are present, a spot-label permutation test is run automatically on the difference in raw Moran’s I values. All spots from both patients are pooled and randomly reassigned to two groups of the original sizes; spatial graphs are rebuilt and Moran’s I is recomputed for each pseudo-patient. This tests whether the observed difference in spatial autocorrelation is larger than expected if score values were randomly distributed across the two tissue architectures.
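
One possible reading of the two-patient procedure is sketched below. It is a hedged illustration under explicit assumptions, not the function's actual code: spots are modelled as `(x, y, score)` tuples that share one coordinate frame, graphs are rebuilt with a brute-force k-nearest-neighbour search instead of squidpy, and the two-sided `(hits + 1) / (n_permutations + 1)` p-value convention is assumed. All helper names are hypothetical.

```python
import math
import random


def knn_graph(coords, k=4):
    """Brute-force k-nearest-neighbour adjacency {i: [j, ...]} from (x, y) pairs."""
    graph = {}
    for i, p in enumerate(coords):
        others = [j for j in range(len(coords)) if j != i]
        others.sort(key=lambda j: math.dist(p, coords[j]))
        graph[i] = others[:k]
    return graph


def morans_i(values, graph):
    """Moran's I with binary weights on an adjacency dict."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    w_total = sum(len(js) for js in graph.values())
    cross = sum(z[i] * z[j] for i, js in graph.items() for j in js)
    return (n / w_total) * cross / sum(d * d for d in z)


def spot_morans_i(spots, k=4):
    """Moran's I of (x, y, score) spots on a freshly built kNN graph."""
    coords = [(x, y) for x, y, _ in spots]
    scores = [s for _, _, s in spots]
    return morans_i(scores, knn_graph(coords, k))


def pairwise_spot_test(spots_a, spots_b, n_permutations=99, seed=0):
    """Permute spot-to-patient labels and compare the difference in Moran's I."""
    rng = random.Random(seed)
    observed = spot_morans_i(spots_a) - spot_morans_i(spots_b)
    pooled = list(spots_a) + list(spots_b)
    n_a = len(spots_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign spots to two pseudo-patients of the original sizes
        diff = spot_morans_i(pooled[:n_a]) - spot_morans_i(pooled[n_a:])
        hits += abs(diff) >= abs(observed)
    return observed, (hits + 1) / (n_permutations + 1)


rng = random.Random(1)
# Toy patient A: 6 x 6 grid whose score rises smoothly from left to right.
patient_a = [(x, y, x / 5) for y in range(6) for x in range(6)]
# Toy patient B: 4 x 9 grid with spatially unstructured scores.
patient_b = [(x, y, rng.random()) for y in range(4) for x in range(9)]

observed_diff, p_value = pairwise_spot_test(patient_a, patient_b)
print(f"observed_diff={observed_diff:.3f}  p_value={p_value:.3f}")
```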

When multiple patients are present and group_key is provided, a permutation test is run on the difference in group-mean z-scores. Patient-level z-scores are pooled, group labels are permuted, and the difference in means is recomputed to build a null distribution. This is analogous to a two-sample t-test but makes no distributional assumptions and runs with any number of patients per group, including n=1. With very few patients per group, however, only a handful of distinct label assignments exist, which limits how small the p-value can be.
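
The group-level test can be sketched in a few lines. The version below is a minimal illustration, assuming exactly two groups and a two-sided `(hits + 1) / (n_permutations + 1)` p-value. The function name `group_permutation_test`, the z-scores and the labels `"A"` and `"B"` are invented for the example and do not come from real data.

```python
import random
import statistics


def group_permutation_test(z_scores, labels, n_permutations=999, seed=0):
    """Permutation test on the difference in group-mean z-scores (two groups)."""
    groups = sorted(set(labels))
    if len(groups) != 2:
        raise ValueError("this sketch expects exactly two groups")
    group_a, group_b = groups

    def mean_diff(current_labels):
        mean_a = statistics.fmean(z for z, g in zip(z_scores, current_labels) if g == group_a)
        mean_b = statistics.fmean(z for z, g in zip(z_scores, current_labels) if g == group_b)
        return mean_a - mean_b

    rng = random.Random(seed)
    observed = mean_diff(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # group sizes are preserved; only the assignment changes
        hits += abs(mean_diff(shuffled)) >= abs(observed)
    return observed, (hits + 1) / (n_permutations + 1)


# Hypothetical patient-level z-scores, three patients per group.
z_scores = [24.1, 27.3, 25.8, 0.4, -1.2, 1.1]
labels = ["A", "A", "A", "B", "B", "B"]

observed_diff, p_value = group_permutation_test(z_scores, labels)
print(f"observed_diff={observed_diff:.2f}  p_value={p_value:.3f}")
```

With three patients per group there are only 20 distinct label assignments, and two of them (the original and its mirror image) reach the observed absolute difference, so the p-value here cannot fall much below 2/20 = 0.1 however large the separation is.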

All test results are stored in the attrs dictionary of the returned DataFrame (df.attrs) and shown in the plot title when plot=True.

param adata:

Annotated data matrix. Must contain the EnrichMap score column in adata.obs and spatial coordinates in adata.obsm (used internally by squidpy to build the spatial neighbours graph).

type adata:

AnnData

param score_key:

Column name in adata.obs holding the EnrichMap score to analyse, e.g. "enrichmap_score" or "EMT_score".

type score_key:

str

param batch_key:

Column name in adata.obs identifying individual patients or slides, e.g. "patient_id" or "library_id". Each unique value is treated as a separate sample with its own spatial graph.

type batch_key:

str

param n_neighbors:

Number of nearest spatial neighbours used to construct the connectivity graph for each patient. For Visium data, 6 corresponds to the immediate hexagonal ring. Increasing this value smooths the autocorrelation estimate over a larger neighbourhood but may dilute local signal.
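
The choice of 6 can be checked on an idealised hexagonal lattice. The sketch below assumes unit centre-to-centre spacing and uses a hypothetical helper, `hex_lattice`. It shows that an interior spot has exactly six equidistant nearest neighbours, with the next ring further away.

```python
import math


def hex_lattice(n_rows, n_cols):
    """Spot centres on a hexagonal lattice with unit centre-to-centre spacing."""
    return [
        (c + 0.5 * (r % 2), r * math.sqrt(3) / 2)  # odd rows are shifted by half a spot
        for r in range(n_rows)
        for c in range(n_cols)
    ]


coords = hex_lattice(7, 7)
centre = coords[3 * 7 + 3]  # an interior spot, away from the lattice edge
dists = sorted(math.dist(centre, p) for p in coords if p != centre)

print([round(d, 3) for d in dists[:8]])
# [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.732, 1.732]
```

Spots on the tissue edge have fewer than six immediate neighbours, so a fixed-k graph will reach into the second ring there.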

type n_neighbors:

int, default 6

param n_permutations:

Number of random permutations for constructing the per-patient null distribution and for the between-patient/group tests. Higher values give more precise p-values. For exploratory analysis 499 is sufficient; for publication-ready results use 999 or higher. The per-patient p-value resolution is bounded by 1 / (n_permutations + 1).
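
As a quick arithmetic check of the resolution bound, the snippet below evaluates 1 / (n_permutations + 1) for a few settings. The helper name `min_pseudo_p` is illustrative only.

```python
def min_pseudo_p(n_permutations):
    """Resolution of a permutation p-value: 1 / (n_permutations + 1)."""
    return 1 / (n_permutations + 1)


for n in (99, 499, 999, 9999):
    print(n, min_pseudo_p(n))
# 99 0.01
# 499 0.002
# 999 0.001
# 9999 0.0001
```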

type n_permutations:

int, default 999

param random_state:

Seed for the random number generator. Set for reproducibility.

type random_state:

int, default 0

param group_key:

Column name in adata.obs for a higher-level clinical grouping, e.g. "subtype", "treatment_arm" or "response". When provided, the plot is coloured by group and a group-level permutation test is run on the z-scores. When None, each patient is plotted individually and no group test is performed.

type group_key:

str or None, optional

param plot:

Whether to produce a strip-over-boxplot of permutation z-scored Moran’s I values, optionally grouped by group_key.

type plot:

bool, default True

param figsize:

Figure size in inches (width, height). Ignored if ax is provided.

type figsize:

tuple of float, default (5, 4)

param palette:

Colour palette for the plot. Can be a seaborn palette name (e.g. "Set2"), a dictionary mapping group/patient names to colours, or None for the default palette.

type palette:

str, dict or None, optional

param ax:

Pre-existing axes to plot on. If provided, figsize is ignored and the plot is drawn on the given axes, which is useful for embedding in multi-panel figures.

type ax:

matplotlib.axes.Axes or None, optional

returns:

One row per patient with columns:

  • patient: patient/sample identifier (from batch_key).

  • group: clinical group label (only if group_key is set).

  • morans_i: observed Moran’s I on the patient’s spatial graph.

  • expected_i: mean Moran’s I under the permutation null.

  • std_i: standard deviation of Moran’s I under the null.

  • z_score: (morans_i - expected_i) / std_i. This is the value used for cross-patient comparison.

  • p_value: two-sided pseudo p-value from the permutation null, testing whether the patient’s score field is significantly spatially autocorrelated.

Statistical test results, when applicable, are stored as dictionaries in df.attrs:

  • df.attrs["pairwise_test"]: spot-label permutation test result (two-patient case), containing keys "morans_i_a", "morans_i_b", "observed_diff", "p_value" and "n_permutations".

  • df.attrs["group_test"]: group-level permutation test result (multi-patient with group_key), containing keys "test", "group_a", "group_b", "mean_a", "mean_b", "observed_diff", "p_value" and "n_permutations".

rtype:

pd.DataFrame

Examples

Compare Moran’s I across two patients (pairwise test):

>>> from enrichmap.pl import compare_morans_i
>>> result = compare_morans_i(
...     adata,
...     score_key="enrichmap_score",
...     batch_key="patient_id",
... )
>>> print(result[["patient", "morans_i", "z_score", "p_value"]])
   patient  morans_i   z_score  p_value
patient_01     0.666    26.798    0.002
patient_02    -0.017    -0.636    0.504
>>> print(result.attrs["pairwise_test"]["p_value"])
0.002

Compare Moran’s I across clinical groups:

>>> result = compare_morans_i(
...     adata,
...     score_key="enrichmap_score",
...     batch_key="patient_id",
...     group_key="subtype",
... )
>>> print(result.attrs["group_test"])
{'test': 'permutation test (difference in group means)',
 'group_a': 'luminal', 'group_b': 'basal',
 'mean_a': 26.0, 'mean_b': -0.16,
 'observed_diff': 26.16, 'p_value': 0.062, ...}

Plot on a pre-existing axes (e.g. for a multi-panel figure):

>>> import matplotlib.pyplot as plt
>>> fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
>>> compare_morans_i(adata, "EMT_score", "patient_id", ax=ax1)
>>> compare_morans_i(adata, "hypoxia_score", "patient_id", ax=ax2)

See also

compare_wasserstein

Pairwise optimal transport distance between spatially embedded score fields.

compare_variograms

Semivariogram-based comparison of spatial scale and structure.