STutils#
STutils is a Python package providing extra functions for analyzing STOmics (Spatial Transcriptomics) data, built on top of scanpy and anndata.
Features#
STutils provides three main modules for spatial transcriptomics analysis:
pp (Preprocessing): Data preprocessing and I/O functions
    Read Stereo-seq formatted datasets
    Merge cellbin data to metacells
    Data format conversion
tl (Tools): Analysis tools and algorithms
    Fast UCell pathway scoring
    Fast gene signature scoring
    Neighborhood enrichment analysis
    Differential expression analysis
    Cell density calculation
    Bias gene removal
pl (Plotting): Visualization functions
    Cellbin spatial plots (discrete and gradient)
    Neighborhood heatmaps
    Custom color palettes
Installation#
You need Python 3.10 or newer installed on your system. If you don’t have Python installed, we recommend installing Miniforge, which supersedes the now-discontinued Mambaforge.
Install from GitHub#
Install the latest development version:
pip install git+https://github.com/ChanghaoGong/STutils.git@main
Dependencies#
STutils requires:
Python >= 3.10
anndata
scanpy
session-info2
Optional dependencies for development, testing, and documentation are available. See pyproject.toml for details.
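To confirm that an environment meets the requirements listed above before installing, a small check such as the following can help. It is only a sketch: the helper name check_environment is hypothetical, and the import names are assumptions (in particular, the session-info2 package is assumed to be importable as session_info2).

```python
import sys
from importlib.util import find_spec

# Assumed import names for the listed dependencies; "session-info2" is
# assumed to be importable as "session_info2".
REQUIRED_MODULES = ("anndata", "scanpy", "session_info2")


def check_environment(min_version=(3, 10), modules=REQUIRED_MODULES):
    """Report whether Python is new enough and which modules are missing."""
    python_ok = sys.version_info[:2] >= min_version
    # find_spec returns None when a top-level module cannot be located.
    missing = [name for name in modules if find_spec(name) is None]
    return {"python_ok": python_ok, "missing": missing}


report = check_environment()
print(report)
```

An empty "missing" list together with "python_ok" set to True suggests the core requirements are in place.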
Quick Start#
Reading Stereo-seq Data#
import STutils
# Read a Stereo-seq formatted dataset
sdata = STutils.pp.stereoseq(
    path="path/to/stereo-seq/data",
    read_square_bin=True,
    optional_tif=False,
)
Spatial Visualization#
import STutils.pl as stpl
# `adata` is assumed to be an AnnData object that has already been loaded
# Plot discrete cellbin image
stpl.plot_cellbin_discrete(
    adata=adata,
    mask="path/to/mask.tif",
    tag="cell_type",
    prefix="sample1",
    colors=10,
    save=True,
)
# Plot gradient cellbin image
stpl.plot_cellbin_gradient(
    adata=adata,
    mask="path/to/mask.tif",
    tag="gene_expression",
    prefix="sample1",
    colors="RdYlBu_r",
    save=True,
)
Fast UCell Pathway Scoring#
import STutils.tl as sttl
# Step 1: Generate rank matrix
rank_matrix = sttl.fast_ucell_rank(
    adata=adata,
    n_cores_rank=4,
    maxRank=1500,
    rank_batch_size=100000,
)
# Step 2: Calculate UCell scores
# `pathway_dict` must be defined beforehand; it is assumed here to map
# pathway names to lists of gene names
pathway_input = sttl.generate_pathway_input(adata=adata, pathway_dict=pathway_dict)
ucell_scores = sttl.fast_ucell_score(
    cell_index=list(adata.obs.index),
    rankmatrix=rank_matrix,
    n_cores_score=4,
    input_dict=pathway_input,
    maxRank=1500,
)
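For intuition about what the rank and score steps compute, the sketch below works through the published UCell formula for a single cell in plain Python. It is not the STutils implementation: the function ucell_score, the toy expression values, and the contents of pathway_dict are all hypothetical, and ties are broken by input order rather than averaged.

```python
def ucell_score(expression, signature, max_rank=1500):
    """UCell-style score for one cell (illustrative, not the STutils code)."""
    # Rank genes by decreasing expression: rank 1 is the most expressed gene.
    # Ties keep their input order here, which is a simplification.
    ordered = sorted(expression, key=lambda gene: -expression[gene])
    rank = {gene: position + 1 for position, gene in enumerate(ordered)}

    genes = [gene for gene in signature if gene in rank]
    if not genes:
        return 0.0
    n = len(genes)
    # Ranks beyond max_rank are capped at max_rank + 1.
    capped = [min(rank[gene], max_rank + 1) for gene in genes]
    # Mann-Whitney U statistic of the signature genes.
    u = sum(capped) - n * (n + 1) / 2
    return 1.0 - u / (n * max_rank)


# Hypothetical toy data: one cell and two made-up pathways.
cell = {"A": 9.0, "B": 7.0, "C": 5.0, "D": 3.0, "E": 1.0}
pathway_dict = {"high": ["A", "B"], "low": ["D", "E"]}
scores = {
    name: ucell_score(cell, genes, max_rank=4)
    for name, genes in pathway_dict.items()
}
print(scores)
```

With these toy values the "high" signature scores 1.0 and the "low" signature scores 0.25, showing how signatures made of top-ranked genes approach 1 while low-ranked signatures approach 0.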
Neighborhood Enrichment Analysis#
import STutils.pl as stpl
import STutils.tl as sttl
# Calculate neighborhood enrichment
nhood_percents = sttl.nhood_enrichment(
    adata=adata,
    coord_type="generic",
    library_key="batch",
    radius=30,
    cluster_key="cell_type",
)
# Plot neighborhood heatmap
stpl.nhood_heatmap(adata=adata, cluster_key="cell_type", radius=30, save=True)
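As a conceptual illustration of radius-based neighborhood composition, the following sketch counts, for each cell type, which cell types appear within a fixed radius and converts the counts to percentages. It is an assumption about the general idea, not a description of what nhood_enrichment() does internally; the function neighbor_percentages and the toy coordinates are hypothetical, and the quadratic loop is only suitable for small inputs.

```python
from collections import Counter, defaultdict
from math import dist


def neighbor_percentages(coords, labels, radius):
    """Percentage of each neighbor cell type within `radius`, per cell type."""
    counts = defaultdict(Counter)
    for i, (p, label_i) in enumerate(zip(coords, labels)):
        for j, (q, label_j) in enumerate(zip(coords, labels)):
            # Skip the cell itself; keep neighbors at distance <= radius.
            if i != j and dist(p, q) <= radius:
                counts[label_i][label_j] += 1
    result = {}
    for label, neighbor_counts in counts.items():
        total = sum(neighbor_counts.values())
        result[label] = {
            other: 100.0 * n / total for other, n in neighbor_counts.items()
        }
    return result


# Hypothetical toy data: five cells on a line.
coords = [(0, 0), (10, 0), (20, 0), (25, 0), (100, 0)]
labels = ["T", "B", "T", "T", "B"]
percents = neighbor_percentages(coords, labels, radius=15)
print(percents)
```

Here the neighbors of T cells are 60% B cells and 40% T cells, while every neighbor of a B cell is a T cell. The isolated cell at x=100 contributes nothing because it has no neighbor within the radius.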
Differential Expression Analysis#
import scanpy as sc
import STutils.tl as sttl
# First, run rank_genes_groups
sc.tl.rank_genes_groups(adata, "cell_type", method="wilcoxon")
# Get DEGs with volcano plot
deg_dict = sttl.getDEG(
    adata=adata,
    cluster="cell_type",
    qval_cutoff=0.1,
    mean_expr_cutoff=0.2,
    top_genes=200,
    save="volcano_plot.pdf",
)
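The cutoff parameters above suggest a threshold-then-rank selection. The sketch below shows one plausible reading of that pattern on plain Python records. It is an assumption for illustration only: the function filter_degs, the record fields (gene, qval, mean_expr, logfc), and the exact comparison rules are hypothetical and may differ from how getDEG() applies its cutoffs.

```python
def filter_degs(rows, qval_cutoff=0.1, mean_expr_cutoff=0.2, top_genes=200):
    """Keep significant, sufficiently expressed genes, ranked by log fold change."""
    kept = [
        row
        for row in rows
        if row["qval"] < qval_cutoff and row["mean_expr"] > mean_expr_cutoff
    ]
    # Highest log fold change first, then truncate to the requested number.
    kept.sort(key=lambda row: row["logfc"], reverse=True)
    return [row["gene"] for row in kept[:top_genes]]


# Hypothetical toy results for one cluster.
rows = [
    {"gene": "A", "qval": 0.01, "mean_expr": 1.0, "logfc": 2.0},
    {"gene": "B", "qval": 0.50, "mean_expr": 1.0, "logfc": 4.0},  # not significant
    {"gene": "C", "qval": 0.05, "mean_expr": 0.1, "logfc": 5.0},  # too lowly expressed
    {"gene": "D", "qval": 0.09, "mean_expr": 0.5, "logfc": 3.0},
]
print(filter_degs(rows))
```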
Cell Density Calculation#
import STutils.tl as sttl
# Calculate cell type density
sttl.calculate_celltype_density(
    adata=adata,
    celltype_of_interest="T_cell",
    obs_key="cell_type",
    coord_keys=["x", "y"],
    plot_result=True,
)
# Calculate co-culture probability
sttl.calculate_coculture_probability(
    adata=adata,
    celltype1="T_cell",
    celltype2="B_cell",
    obs_key="cell_type",
    coord_keys=["x", "y"],
)
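One common way to turn cell coordinates into a spatial density is a Gaussian kernel density estimate. The sketch below evaluates such an estimate at a single query location. Whether calculate_celltype_density() uses this estimator is not stated here, so treat it purely as an illustration; the function gaussian_density, the bandwidth value, and the toy coordinates are hypothetical.

```python
from math import exp, pi


def gaussian_density(points, query, bandwidth):
    """2D Gaussian kernel density estimate at `query` from `points`."""
    if not points:
        return 0.0
    total = 0.0
    for x, y in points:
        squared_distance = (x - query[0]) ** 2 + (y - query[1]) ** 2
        total += exp(-squared_distance / (2 * bandwidth**2))
    # Normalize by the number of points and the 2D Gaussian constant.
    return total / (len(points) * 2 * pi * bandwidth**2)


# Hypothetical coordinates of cells of one type, clustered near the origin.
t_cells = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
near = gaussian_density(t_cells, query=(0.5, 0.5), bandwidth=2.0)
far = gaussian_density(t_cells, query=(50.0, 50.0), bandwidth=2.0)
print(near > far)
```

The estimate is high close to the cluster and falls toward zero far away from it; the bandwidth controls how quickly it decays.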
Main Modules#
Preprocessing (pp)#
stereoseq(): Read Stereo-seq formatted datasets into SpatialData objects
merge_big_cell(): Merge STOmics cellbin data into metacells by axis and cell type
Tools (tl)#
fast_ucell_rank(): Fast ranking step for UCell pathway scoring
fast_ucell_score(): Fast UCell pathway scoring
fast_sctl_score(): Fast multi-core version of scanpy.tl.score_genes()
nhood_enrichment(): Calculate neighborhood enrichment between cell types
test_nhood(): Analyze cell type neighborhood interactions with spatial shuffling test
getDEG(): Extract differentially expressed genes with volcano plots
calculate_celltype_density(): Calculate spatial density distribution of cell types
calculate_coculture_probability(): Calculate co-occurrence probability of two cell types
generate_pathway_input(): Prepare pathway input dictionary for scoring functions
removeBiasGenes(): Remove bias genes from analysis
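To illustrate what a spatial shuffling test means in general, the sketch below permutes cell type labels across fixed coordinates and compares the observed number of neighboring pairs with the shuffled counts. It is a generic label-permutation test written under stated assumptions, not the procedure implemented by test_nhood(); the functions count_pairs and shuffle_test and the toy data are hypothetical.

```python
import random
from math import dist


def count_pairs(coords, labels, type_a, type_b, radius):
    """Count ordered pairs (a-cell, b-cell) lying within `radius`."""
    pairs = 0
    for i, p in enumerate(coords):
        if labels[i] != type_a:
            continue
        for j, q in enumerate(coords):
            if i != j and labels[j] == type_b and dist(p, q) <= radius:
                pairs += 1
    return pairs


def shuffle_test(coords, labels, type_a, type_b, radius, n_perm=200, seed=0):
    """Return the observed pair count and a one-sided permutation p-value."""
    rng = random.Random(seed)
    observed = count_pairs(coords, labels, type_a, type_b, radius)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        # Shuffle labels while keeping coordinates fixed.
        rng.shuffle(shuffled)
        if count_pairs(coords, shuffled, type_a, type_b, radius) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: two tight T/B pairs far apart from each other.
coords = [(0, 0), (1, 0), (50, 0), (51, 0)]
labels = ["T", "B", "T", "B"]
observed, p_value = shuffle_test(coords, labels, "T", "B", radius=2)
print(observed, p_value)
```

A small p-value would indicate that the two cell types sit next to each other more often than expected if labels were assigned at random.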
Plotting (pl)#
plot_cellbin_discrete(): Plot discrete cellbin images with categorical colors
plot_cellbin_gradient(): Plot gradient cellbin images with continuous color scales
plot_zoom_cellbin(): Plot zoomed-in regions of cellbin images
nhood_heatmap(): Plot neighborhood enrichment heatmaps
getDefaultColors(): Get beautiful color palettes for scientific plotting
Documentation#
Please refer to the documentation for the detailed API reference and tutorials.
Citation#
If you use STutils in your research, please cite:
t.b.a
Contributing#
Contributions are welcome! Please feel free to submit a Pull Request.
Contact#
For questions and help requests, you can reach out on the scverse Discourse forum. If you find a bug, please report it on the issue tracker.
License#
See LICENSE file for details.