A collection of useful Python utilities for common tasks.
## Utilities

| Utility | Description |
|---|---|
| `ipynb_to_py` | Convert Jupyter notebooks to Python scripts (code only) |
| `collect_code` | Collect code from a directory into a single Markdown file |

## Notebooks

| Notebook | Description |
|---|---|
| `WhisperTranscribe` | Audio transcription with OpenAI Whisper (3 presets, auto language detection) |
## Settings

Each script supports an optional JSON settings file, `settings_<script_name>.json`, in the same directory. If the file doesn't exist, defaults are used. CLI arguments always override settings.
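The precedence described above (defaults, then settings file, then CLI) could be implemented along these lines. This is an illustrative sketch; `resolve_settings` and its parameters are made-up names, not the scripts' actual API:

```python
import json
from pathlib import Path

def resolve_settings(defaults, settings_path, cli_args):
    """Merge config layers: defaults < settings file < CLI arguments.

    cli_args holds parsed argparse values; None means "flag not given".
    """
    merged = dict(defaults)
    path = Path(settings_path)
    if path.exists():  # the settings file is optional
        merged.update(json.loads(path.read_text()))
    # CLI flags that were actually passed win over everything
    merged.update({k: v for k, v in cli_args.items() if v is not None})
    return merged
```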
## ipynb_to_py

Convert Jupyter notebooks (`.ipynb`) to Python scripts, extracting only the code without outputs, images, or metadata.
- Extracts code from all code cells
- Removes cell outputs, images, and execution counts
- Optionally includes markdown cells as comments
- Preserves blank lines between cells for readability
- Simple CLI interface
```bash
python ipynb_to_py.py input.ipynb
python ipynb_to_py.py input.ipynb -o output.py
python ipynb_to_py.py input.ipynb --include-markdown
```

| Flag | Description |
|---|---|
| `input` | Input `.ipynb` file (required) |
| `-o, --output` | Output `.py` file (default: same name with `.py`) |
| `--include-markdown` | Include markdown cells as Python comments |
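Under the hood, a `.ipynb` file is plain JSON, so the core extraction can be sketched in a few lines. This is an illustrative minimal version (the function name is hypothetical), not the script's exact implementation:

```python
import json

def extract_code(nb_path, include_markdown=False):
    """Return a notebook's code cells as a single Python source string."""
    with open(nb_path, encoding="utf-8") as f:
        nb = json.load(f)
    chunks = []
    for cell in nb.get("cells", []):
        source = "".join(cell.get("source", []))
        if cell["cell_type"] == "code":
            chunks.append(source)  # outputs and execution counts are simply ignored
        elif include_markdown and cell["cell_type"] == "markdown":
            # turn markdown cells into comment blocks
            chunks.append("\n".join("# " + line for line in source.splitlines()))
    return "\n\n".join(chunks) + "\n"  # blank line between cells
```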
```json
{
  "include_markdown": false
}
```

## collect_code

Collect all source code from a directory into a single Markdown file, useful for feeding code to LLMs.
- Recursively scans a directory
- Ignores configurable directories, files, and extensions
- Outputs Markdown with syntax-highlighted code blocks
- Auto-detects language from file extension
```bash
python collect_code.py                        # scan current dir
python collect_code.py /path/to/project       # scan specific dir
python collect_code.py /path -o context.md    # custom output file
python collect_code.py . --ignore-dirs venv dist
python collect_code.py . --ignore-ext .log .csv
```

| Flag | Description |
|---|---|
| `root_dir` | Directory to scan (default: `.`) |
| `-o, --output` | Output `.md` file (default: `collected_code.md`) |
| `--ignore-dirs` | Directories to skip |
| `--ignore-files` | Files to skip |
| `--ignore-ext` | Extensions to skip (e.g. `.pyc .log`) |
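The collection logic amounts to a filtered recursive walk that wraps each file in a fenced code block. A minimal sketch, assuming a small hypothetical extension-to-language map (the real script's map and function names may differ):

```python
from pathlib import Path

# Illustrative extension-to-language map for fence info strings
LANG_BY_EXT = {".py": "python", ".js": "javascript", ".sh": "bash", ".json": "json"}

def collect_code(root=".", ignore_dirs=("venv", ".git"), ignore_ext=(".pyc", ".log")):
    """Concatenate all source files under root into one Markdown string."""
    root = Path(root)
    sections = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        if any(part in ignore_dirs for part in path.parts):
            continue  # skip anything inside an ignored directory
        if path.suffix in ignore_ext:
            continue
        lang = LANG_BY_EXT.get(path.suffix, "")  # empty string = plain fence
        sections.append(f"## {path.relative_to(root)}\n\n```{lang}\n{path.read_text()}\n```\n")
    return "\n".join(sections)
```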
```json
{
  "root_dir": ".",
  "output_file": "collected_code.md",
  "ignore_dirs": [".git", "node_modules", "venv", "__pycache__", ".vscode", "dist", "build", ".idea"],
  "ignore_files": [".DS_Store", "package-lock.json", ".env"],
  "ignore_extensions": [".pyc", ".log", ".svg", ".png", ".jpg", ".jpeg", ".gif", ".ico", ".lock", ".zip", ".gz"]
}
```

## WhisperTranscribe

Google Colab notebook for audio-to-text transcription using OpenAI's Whisper model (via the faster-whisper engine).
- Three presets: small (fast), medium (balanced), high (best quality)
- 99 languages with automatic detection
- Output: plain text (.txt), markdown (.md), SRT subtitles (.srt)
- Export audio with silence removed (.mp3)
- Runs on free Colab T4 GPU (all presets)
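As an example of what the `.srt` output involves: SRT cue times use an `HH:MM:SS,mmm` format, so the float second offsets Whisper reports need converting. A helper along these lines would do it (the name is hypothetical, not taken from the notebook):

```python
def srt_timestamp(seconds: float) -> str:
    """Format a second offset as an SRT timestamp, e.g. 125.3 -> '00:02:05,300'."""
    ms = round(seconds * 1000)            # work in integer milliseconds
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"
```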
| Preset | Model | Beam size | VRAM | Speed |
|---|---|---|---|---|
| small | small | 1 | ~1 GB | ~30x realtime |
| medium | medium | 3 | ~2.5 GB | ~15x realtime |
| high | large-v3 | 5 | ~5 GB | ~5x realtime |
1. Open the notebook in Google Colab (click the badge above)
2. Set runtime to GPU: Runtime → Change runtime type → GPU (T4)
3. Run all setup cells (Section 1)
4. Select preset and options (Section 2)
5. Upload audio file (Section 3)
6. Run transcription (Section 4)
7. Download results (Section 5)

Location: `WhisperTranscribe_v1.ipynb`
## Tests

```bash
python dev-main-test/run_tests.py
```

Tests use mock data in `dev-main-test/mock_data/` and require no external dependencies.
## Contributing

- Create a new Python script in `dev-main/`
- Add optional `settings_<name>.json` support (defaults must work without it)
- Update this README
- Add tests in `dev-main-test/`
## License

MIT