Python utilities for reading and managing Blacklite SQLite databases with zstandard compression support.
This project uses uv for dependency management. Install uv first:
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
# Or via pip
pip install uv
Then sync dependencies:
cd scripts/python
uv sync
For development (includes testing tools):
uv sync --dev
scripts/python/
├── blacklite_tools/ # Core package
│ ├── codecs.py # Codec classes (Identity, Zstandard)
│ ├── database.py # Database utilities
│ ├── cli_reader.py # Reader CLI
│ ├── cli_compress.py # Compress CLI
│ └── cli_decompress.py # Decompress CLI
├── reader.py # Reader script wrapper
├── zstd-compress.py # Compress script wrapper
├── zstd-decompress.py # Decompress script wrapper
└── tests/ # Unit and integration tests
All commands can be run with uv run:
Read and display entries from a blacklite database. Automatically detects and handles compression:
# Read from default path
uv run python reader.py
# Read from specific database
uv run python reader.py /path/to/blacklite.db
# Or use the installed command
uv run blacklite-reader /path/to/blacklite.db
The reader automatically detects:
- Uncompressed databases
- Zstd-compressed databases
- Dictionary-compressed databases
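One way such detection could work is sketched below. This is a minimal illustration, not the logic in database.py: it assumes a non-empty zstd_dicts table signals dictionary compression and that compressed content begins with the standard zstd frame magic number. The function name detect_format and its return labels are hypothetical.

```python
import sqlite3

# Standard zstd frame magic number (0xFD2FB528 stored little-endian).
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"


def detect_format(conn: sqlite3.Connection) -> str:
    """Classify a database as 'uncompressed', 'zstd', or 'zstd+dict'.

    Hypothetical helper: the real tools may use different rules.
    """
    # Assumption: a zstd_dicts table with at least one row means
    # the entries were compressed with a dictionary.
    has_dict_table = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'zstd_dicts'"
    ).fetchone()
    if has_dict_table and conn.execute("SELECT 1 FROM zstd_dicts LIMIT 1").fetchone():
        return "zstd+dict"

    # Otherwise peek at one entry and look for the zstd frame magic.
    row = conn.execute("SELECT content FROM entries LIMIT 1").fetchone()
    if row is not None and isinstance(row[0], bytes) and row[0].startswith(ZSTD_MAGIC):
        return "zstd"
    return "uncompressed"
```

Checking only the first entry keeps the probe cheap, at the cost of assuming every row in a database uses the same encoding.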
Convert an uncompressed database to zstandard-compressed format:
# Basic compression
uv run python zstd-compress.py source.db compressed.db
# With dictionary training (better compression)
uv run python zstd-compress.py --dict source.db compressed_dict.db
# Or use the installed command
uv run blacklite-compress --dict source.db compressed.db
Dictionary compression typically achieves 4x better compression ratios for repetitive log data.
Convert a compressed database back to uncompressed format:
# Decompress (auto-detects dictionary)
uv run python zstd-decompress.py compressed.db uncompressed.db
# Or use the installed command
uv run blacklite-decompress compressed.db uncompressed.db
The decompressor automatically detects and uses dictionaries if present in the source database.
Run all tests:
cd scripts/python
uv run pytest
Run specific test suites:
# Unit tests only
uv run pytest tests/unit/ -v
# Integration tests only
uv run pytest tests/integration/ -v
# With coverage
uv run pytest --cov=blacklite_tools --cov-report=html
Test fixtures are pre-generated and committed to the repository. To regenerate:
cd scripts/python
uv run python tests/fixtures/generate_fixtures.py
This creates three test databases:
- uncompressed.db: Plain SQLite database
- compressed.db: Zstd-compressed content
- compressed_dict.db: Dictionary-compressed content
The codebase follows a layered architecture:
- Codec Layer (codecs.py): Handles compression/decompression
  - IdentityCodec: No-op codec for uncompressed data
  - ZstandardCodec: Zstandard compression with optional dictionary
- Database Layer (database.py): SQLite interactions and codec selection
  - create_codec(): Auto-detect and create appropriate codec
  - create_dictionary(): Train compression dictionaries from data
  - Helper functions for database introspection
- CLI Layer (cli_*.py): Command-line interfaces
  - Click-based argument parsing
  - High-level workflow orchestration
  - User-facing error handling
- Script Wrappers: Thin wrappers for backward compatibility
  - Original script names remain executable
  - Import and invoke CLI modules
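The codec layer's shape can be illustrated with a small sketch of two interchangeable codecs behind one interface. This is an assumption about the design, not a copy of codecs.py: the Codec protocol, the roundtrip helper, and ZlibStandInCodec are hypothetical names, and stdlib zlib stands in for Zstandard so the example runs without third-party packages.

```python
import zlib
from typing import Protocol


class Codec(Protocol):
    """Hypothetical interface shared by all codecs."""

    def compress(self, data: bytes) -> bytes: ...
    def decompress(self, data: bytes) -> bytes: ...


class IdentityCodec:
    """No-op codec: returns the data unchanged."""

    def compress(self, data: bytes) -> bytes:
        return data

    def decompress(self, data: bytes) -> bytes:
        return data


class ZlibStandInCodec:
    """Stand-in for a Zstandard codec, using zlib for illustration only."""

    def __init__(self, level: int = 6) -> None:
        self.level = level

    def compress(self, data: bytes) -> bytes:
        return zlib.compress(data, self.level)

    def decompress(self, data: bytes) -> bytes:
        return zlib.decompress(data)


def roundtrip(codec: Codec, payload: bytes) -> bytes:
    """Compress then decompress; callers never need to know which codec is used."""
    return codec.decompress(codec.compress(payload))
```

Because both classes expose the same two methods, the layers above can select a codec once and use it uniformly for every row.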
- Uses python-zstandard library
- Dictionary training samples up to 10,000 entries by default
- Target dictionary size: 10 MB
- Compression happens via SQLite UDFs for efficient batch processing
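The sampling and training steps might look roughly like the sketch below. It assumes samples are taken with a simple LIMIT query and that 10 MB means 10 * 1024 * 1024 bytes; the actual create_dictionary() implementation may differ on both points. The helper names are hypothetical. zstandard.train_dictionary(dict_size, samples) is the python-zstandard training call, imported lazily so the sampling helper works without the package installed.

```python
import sqlite3

MAX_SAMPLES = 10_000                 # default sample count stated above
TARGET_DICT_SIZE = 10 * 1024 * 1024  # assumption: "10 MB" interpreted as 10 MiB


def sample_contents(conn: sqlite3.Connection, limit: int = MAX_SAMPLES) -> list:
    """Collect up to `limit` content blobs to use as training samples."""
    rows = conn.execute(
        "SELECT content FROM entries WHERE typeof(content) = 'blob' LIMIT ?",
        (limit,),
    )
    return [row[0] for row in rows]


def train_dictionary(samples: list, dict_size: int = TARGET_DICT_SIZE):
    """Train a zstd dictionary from the samples (requires python-zstandard)."""
    import zstandard  # third-party; imported here so sampling has no hard dependency

    return zstandard.train_dictionary(dict_size, samples)
```

Training generally needs a reasonably large and varied sample set, so calling train_dictionary on a handful of tiny rows may fail.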
Entries table:
CREATE TABLE entries (
epoch_secs LONG,
nanos INTEGER,
level INTEGER,
content BLOB
)
Dictionary table (when using --dict):
CREATE TABLE zstd_dicts (
dict_id LONG NOT NULL PRIMARY KEY,
dict_bytes BLOB NOT NULL
)
- UDF Approach: Compression/decompression uses SQLite User-Defined Functions
  - Registered via db.register_function()
  - Allows efficient batch processing in SQL queries
  - Better performance than row-by-row Python loops
- Single-pass Processing: Compress/decompress operations use INSERT INTO ... SELECT with UDFs
  - Leverages SQLite's query optimizer
  - Minimal memory footprint
  - Transactional integrity
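The UDF and single-pass ideas can be sketched with the standard library alone. This illustration swaps in sqlite3's Connection.create_function for sqlite-utils' db.register_function(), and zlib for Zstandard, so it runs without extra packages. The compressed_entries table and the helper names are hypothetical.

```python
import sqlite3
import zlib


def _compress(blob):
    # SQLite passes NULL as None; leave it untouched.
    return None if blob is None else zlib.compress(blob)


def _decompress(blob):
    return None if blob is None else zlib.decompress(blob)


def compress_entries(conn: sqlite3.Connection) -> None:
    """Copy entries into compressed_entries, compressing content via a SQL UDF."""
    # Register the Python functions so SQL statements can call them by name.
    conn.create_function("compress", 1, _compress)
    conn.create_function("decompress", 1, _decompress)

    # `with conn` commits on success and rolls back on error.
    with conn:
        conn.execute(
            "CREATE TABLE compressed_entries "
            "(epoch_secs LONG, nanos INTEGER, level INTEGER, content BLOB)"
        )
        # Single pass: SQLite streams rows through the UDF, so Python never
        # holds the whole table in memory.
        conn.execute(
            "INSERT INTO compressed_entries "
            "SELECT epoch_secs, nanos, level, compress(content) FROM entries"
        )
```

Once registered, the same functions work in queries too, for example `SELECT decompress(content) FROM compressed_entries`.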
- sqlite-utils (≥3.0): High-level SQLite library
- zstandard (≥0.22.0): Zstandard compression
- click (≥8.0): CLI framework
- pytest (≥7.0, dev): Testing framework
- pytest-cov (≥4.0, dev): Coverage reporting
Same as parent Blacklite project.