DOC: Execute 1_loading_datasets notebook to populate cell outputs #1707

Open
romanlutz wants to merge 1 commit into microsoft:main from romanlutz:fix/loading-datasets-notebook-output

@romanlutz
Contributor

Description

The published docs page at https://microsoft.github.io/PyRIT/code/datasets/loading-datasets/ renders all code cells with no output because the committed doc/code/datasets/1_loading_datasets.ipynb was checked in unexecuted (execution_count: null and empty outputs on every code cell). The site renders the .ipynb as-is — it doesn't execute notebooks at build time — so readers see only the source.

Sibling notebooks in the same folder (e.g., 2_seed_programming.ipynb) were committed with their outputs and render fine.
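
The symptom described above (every code cell with `execution_count: null` and empty `outputs`) can be checked mechanically. The following is a minimal sketch, not part of this PR; the helper names `is_unexecuted` and `find_unexecuted` are hypothetical, and it relies only on the standard `.ipynb` JSON layout (`cells`, `cell_type`, `execution_count`, `outputs`).

```python
from __future__ import annotations

import json
from pathlib import Path


def is_unexecuted(notebook: dict) -> bool:
    """Return True if the notebook has code cells and none shows signs of execution.

    A code cell counts as unexecuted when its execution_count is null
    and its outputs list is empty or missing.
    """
    code_cells = [
        cell for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    if not code_cells:
        # Nothing to execute, so there is nothing to flag.
        return False
    return all(
        cell.get("execution_count") is None and not cell.get("outputs")
        for cell in code_cells
    )


def find_unexecuted(folder: str) -> list[str]:
    """List the .ipynb files directly under `folder` that look unexecuted."""
    flagged = []
    for path in sorted(Path(folder).glob("*.ipynb")):
        with path.open(encoding="utf-8") as handle:
            notebook = json.load(handle)
        if is_unexecuted(notebook):
            flagged.append(path.name)
    return flagged
```

Calling `find_unexecuted("doc/code/datasets")` from a repository checkout would list notebooks in that folder that appear to have been committed without outputs. The check flags a notebook only when every code cell is blank, so cells that legitimately print nothing do not trigger it.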

Change

Re-executed the notebook with jupytext --to notebook --execute and ran the standard pre-commit hooks (sanitize_notebook_paths, strip_notebook_progress_bars, nbstripout with --keep-output). The .py and .ipynb remain in sync.
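
For anyone reproducing the re-execution step, here is a small sketch that wraps the `jupytext --to notebook --execute` invocation mentioned above. The function names are hypothetical, and the path of the paired `.py` file (`doc/code/datasets/1_loading_datasets.py`) is an assumption based on the notebook path; the pre-commit hooks are not run by this sketch.

```python
from __future__ import annotations

import subprocess


def build_execute_command(py_source: str) -> list[str]:
    """Build the jupytext command that converts a .py file to an executed notebook."""
    return ["jupytext", "--to", "notebook", "--execute", py_source]


def reexecute(py_source: str, dry_run: bool = True) -> list[str]:
    """Return the command, and run it unless dry_run is set.

    Running requires jupytext and a working Jupyter kernel to be installed.
    """
    command = build_execute_command(py_source)
    if not dry_run:
        # check=True raises CalledProcessError if jupytext exits non-zero.
        subprocess.run(command, check=True)
    return command
```

With `dry_run=True` (the default) the function only returns the argument list, which makes it easy to inspect or log before actually executing the notebook with `reexecute(path, dry_run=False)`.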

Readers will now see:

  • Cell 1: full list of built-in datasets (~61 names)
  • Cell 3: sample seed values from airt_illegal and airt_malware
  • Cell 5: pyrit init log + memory.get_seeds(...) results

No source code changes.

The notebook was committed without execution, so the published docs page at https://microsoft.github.io/PyRIT/code/datasets/loading-datasets/ rendered all code cells without output. Re-execute the notebook so readers see the dataset list, sample seed values, and memory query results.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
eeee2345 added a commit to eeee2345/PyRIT that referenced this pull request May 11, 2026
@romanlutz pointed out the manual entry in 0_dataset.md is a small
hardcoded subset; the canonical list is generated by re-executing
1_loading_datasets.ipynb (which his microsoft#1707 handles). Dropping the
manual line; auto-registration via SeedDatasetProvider already
ensures agent_threat_rules appears in the regenerated notebook
output once microsoft#1707 lands.