A curated collection of gene sets relevant to neuroblastoma research, covering cell states, developmental programs, metabolic signatures, and hallmark pathways from ~16 published sources and custom curation (~68 gene sets, ~8200 entries total).
gene_sets.tsv: main gene set table. Columns:
- source_id: identifier for the originating study or resource (links to set_meta.tsv)
- gene_set: name of the gene set (e.g. MES, ADRN, Hypoxia, EMT_I)
- gene_name: HGNC gene symbol (e.g. ACTB)
- score: optional numeric score (enrichment, log2FC, etc.); can be empty
set_meta.tsv: metadata for each source. Columns:
- source_id: matches source_id in gene_sets.tsv
- url: link to the publication or resource
- description: brief description of the study and what the gene sets represent
- citation: full citation string
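Because the two tables share the source_id key, they can be combined with plain pandas. The sketch below is only an illustration: the rows, gene names (GENE_A etc.), source ID, URL, and citation are invented stand-ins held in memory, not contents of the real files, and the variable names are hypothetical.

```python
import io

import pandas as pd

# Toy stand-ins for gene_sets.tsv and set_meta.tsv (all values invented for illustration)
gene_sets_tsv = (
    "source_id\tgene_set\tgene_name\tscore\n"
    "toy_source\tMES\tGENE_A\t1.5\n"
    "toy_source\tMES\tGENE_B\t\n"
    "toy_source\tADRN\tGENE_C\t0.7\n"
)
set_meta_tsv = (
    "source_id\turl\tdescription\tcitation\n"
    "toy_source\thttps://example.org\tToy study\tToy et al.\n"
)

gene_sets = pd.read_csv(io.StringIO(gene_sets_tsv), sep="\t")
set_meta = pd.read_csv(io.StringIO(set_meta_tsv), sep="\t")

# Attach source metadata to every gene-set row; validate= raises an error
# if a source_id appears more than once in the metadata table
annotated = gene_sets.merge(set_meta, on="source_id", how="left", validate="many_to_one")

# Long format (one row per gene per set) groups naturally by gene_set
sizes = annotated.groupby("gene_set")["gene_name"].nunique()
signatures = annotated.groupby("gene_set")["gene_name"].apply(list).to_dict()

# The score column is optional, so empty cells are read as NaN
n_missing_scores = int(annotated["score"].isna().sum())
```

With the real files, replacing the two `io.StringIO(...)` arguments with the file paths should be the only change needed, assuming the columns are as listed above.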
Install the repo as a package (or add it to your path), then:
import neuroblastoma_gene_sets as nb_sets
import pandas as pd
df = pd.read_csv("gene_sets.tsv", sep="\t")
signatures = nb_sets.long_to_dict(df)
# -> {"MES": ["A2M", "ABRACL", ...], "ADRN": [...], ...}

score_signatures_z scores each signature using z-scaled expression and writes results to ad.obs:
# ad is an AnnData object
# Uses built-in gene sets automatically; z-scales ad.X if no "z" layer is present
ad = nb_sets.score_signatures_z(ad)
# Results appear in ad.obs as e.g. "z-MES", "z-ADRN", ...

Or pass custom signatures and control filtering:
ad = nb_sets.score_signatures_z(
    ad,
    signatures=my_dict,  # {name: [gene, ...]}
    min_genes=10,  # skip signature if fewer genes overlap with ad.var_names
    min_z=2,  # z-score threshold to count a cell as "active"
    min_cells=5,  # skip signature if fewer than this many cells exceed min_z
)

prepare_z_layer can also be called directly to z-scale before scoring:
ad = nb_sets.prepare_z_layer(ad) # adds ad.layers["z"]
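
To make the three filters more concrete, here is a rough sketch of z-score signature scoring written with plain pandas and NumPy. It is not the package's implementation. It assumes that a signature score is the mean z-score over the signature's genes and that min_z is applied to that per-cell score; the function name `score_signatures_sketch` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def score_signatures_sketch(expr, signatures, min_genes=10, min_z=2.0, min_cells=5):
    """expr: cells x genes DataFrame. Returns one "z-<name>" column per kept signature."""
    # z-scale each gene across cells; constant genes get z = 0 instead of NaN
    std = expr.std(axis=0, ddof=0).replace(0, np.nan)
    z = ((expr - expr.mean(axis=0)) / std).fillna(0.0)

    scores = {}
    for name, genes in signatures.items():
        # Keep only genes that are measured, dropping duplicates
        present = [g for g in dict.fromkeys(genes) if g in z.columns]
        if len(present) < min_genes:
            continue  # too few overlapping genes
        score = z[present].mean(axis=1)  # assumption: mean z over signature genes
        if (score > min_z).sum() < min_cells:
            continue  # too few "active" cells
        scores[f"z-{name}"] = score
    return pd.DataFrame(scores, index=expr.index)


# Toy data: 50 cells, 5 genes; the first 10 cells express g0..g2, g3 and g4 are flat
expr = pd.DataFrame(np.zeros((50, 5)), columns=["g0", "g1", "g2", "g3", "g4"])
expr.iloc[:10, :3] = 5.0

toy_signatures = {
    "toy_on": ["g0", "g1", "g2"],             # kept
    "toy_few_genes": ["g3", "not_measured"],  # dropped by min_genes
    "toy_inactive": ["g3", "g4"],             # dropped by min_cells
}
result = score_signatures_sketch(expr, toy_signatures, min_genes=2, min_z=1.0, min_cells=5)
```

In this toy run only the first signature survives, which shows how min_genes and min_cells each remove a signature for a different reason. The real score_signatures_z may aggregate or threshold differently.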