Predictive threat detection for Bluesky. Charcoal identifies accounts likely to engage with your content in a toxic or bad-faith manner — before that engagement happens.
When someone quotes or reposts your content on Bluesky, their followers are suddenly exposed to your posts. Charcoal monitors these amplification events, then scores the amplifier's followers on two axes:
- Toxicity — does this account have a pattern of hostile language?
- Topic overlap — does this account post about the same subjects you do?
Neither signal alone is a threat. An account that's hostile but posts about unrelated topics is unlikely to find you. An account that shares your topics but isn't hostile is probably an ally. The combination of high toxicity and high topic overlap is what Charcoal flags.
The output is a ranked threat list with evidence (the toxic posts that drove each score), so you can review and decide what action to take.
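The two-axis idea can be sketched in a few lines. This is a minimal illustration, not Charcoal's actual scoring code: the `Signals` struct, the 0.0 to 1.0 normalization, and the multiplicative rule in `threat_score` are all assumptions chosen so that a low value on either axis keeps the combined score low.

```rust
/// Hypothetical per-account signals, each normalized to 0.0..=1.0.
#[derive(Debug, Clone)]
struct Signals {
    handle: String,
    toxicity: f64,
    topic_overlap: f64,
}

/// Assumed combination rule: multiply the two signals and scale to 0-100,
/// so neither signal alone can produce a high score.
fn threat_score(s: &Signals) -> f64 {
    (s.toxicity * s.topic_overlap * 100.0).clamp(0.0, 100.0)
}

/// Sort accounts from highest to lowest combined score.
fn rank(mut accounts: Vec<Signals>) -> Vec<Signals> {
    accounts.sort_by(|a, b| threat_score(b).total_cmp(&threat_score(a)));
    accounts
}

fn main() {
    let accounts = vec![
        Signals { handle: "hostile-but-unrelated.example".into(), toxicity: 0.9, topic_overlap: 0.05 },
        Signals { handle: "friendly-and-related.example".into(), toxicity: 0.05, topic_overlap: 0.9 },
        Signals { handle: "hostile-and-related.example".into(), toxicity: 0.8, topic_overlap: 0.7 },
    ];
    for a in rank(accounts) {
        println!("{:<32} {:5.1}", a.handle, threat_score(&a));
    }
}
```

Under this assumed rule, only the account that is both hostile and topically close rises to the top of the list.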
git clone https://github.com/musicjunkieg/charcoal.git
cd charcoal
cargo build --release
Copy the example environment file and fill in your credentials:
cp .env.example .env
You need:
- BLUESKY_HANDLE: your Bluesky handle (e.g. yourname.bsky.social)
No app password or authentication is needed — Charcoal uses the public AT Protocol API for all read operations.
Optional settings (see .env.example for details):
- PUBLIC_API_URL: custom public API endpoint (default: https://public.api.bsky.app)
- CONSTELLATION_URL: Constellation backlink index URL
- CHARCOAL_SCORER: toxicity backend, either onnx (default) or perspective
- CHARCOAL_MODEL_DIR: custom path for ONNX model files
- CHARCOAL_DB_PATH: custom path for the SQLite database
- RUST_LOG: log level (default: charcoal=info)
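As a rough sketch of how optional settings with defaults can be resolved, the snippet below reads three of the variables listed above and falls back to the documented defaults. The `Settings` struct and `load_settings` function are hypothetical names for illustration; the real logic lives in config.rs and may differ.

```rust
use std::env;

/// Hypothetical subset of the settings described above.
#[derive(Debug)]
struct Settings {
    public_api_url: String,
    scorer: String,
    rust_log: String,
}

/// Resolve each setting through `lookup`, using the documented default
/// when the variable is unset. Taking a closure keeps this testable.
fn load_settings(lookup: impl Fn(&str) -> Option<String>) -> Settings {
    Settings {
        public_api_url: lookup("PUBLIC_API_URL")
            .unwrap_or_else(|| "https://public.api.bsky.app".to_string()),
        scorer: lookup("CHARCOAL_SCORER").unwrap_or_else(|| "onnx".to_string()),
        rust_log: lookup("RUST_LOG").unwrap_or_else(|| "charcoal=info".to_string()),
    }
}

fn main() {
    // Read from the real process environment.
    let settings = load_settings(|key| env::var(key).ok());
    println!("{settings:#?}");
}
```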
cargo run -- init
Creates the SQLite database and tables.
cargo run -- download-model
Downloads two ONNX models (~216 MB total) to your local machine:
- Toxicity model — Detoxify unbiased-toxic-roberta (~126 MB) for toxicity scoring
- Embedding model — all-MiniLM-L6-v2 (~90 MB) for semantic topic overlap
This is a one-time step. Both models run entirely locally — no API key needed,
no rate limits. Files are stored in ~/.local/share/charcoal/models/ (macOS:
~/Library/Application Support/charcoal/models/).
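Sentence embeddings are usually compared with cosine similarity, so semantic topic overlap between two accounts could be computed roughly as below. This is a sketch under that assumption; the README does not state which similarity measure Charcoal uses, and `cosine_similarity` is an illustrative name.

```rust
/// Cosine similarity between two embedding vectors.
/// Returns None for mismatched lengths, empty input, or zero-length vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

fn main() {
    // Tiny made-up vectors; real embeddings have many more dimensions.
    let yours = [0.2_f32, 0.7, 0.1];
    let theirs = [0.25_f32, 0.6, 0.05];
    println!("overlap: {:?}", cosine_similarity(&yours, &theirs));
}
```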
cargo run -- fingerprint
Fetches your recent posts and extracts a topic fingerprint using TF-IDF
analysis. The fingerprint shows what subjects you post about and how much.
Review the output to confirm it looks accurate. Rebuild anytime with
--refresh.
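For readers unfamiliar with TF-IDF, here is a small self-contained sketch of how a ranked term list can be derived from a set of posts. The tokenizer, the smoothed IDF formula, and the `fingerprint` function name are assumptions for illustration; Charcoal's own extraction in topics/ may weight and filter terms differently.

```rust
use std::collections::{HashMap, HashSet};

/// Lowercase and split on anything that is not a letter or digit.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Returns terms sorted by descending TF-IDF weight summed over all posts.
fn fingerprint(posts: &[&str]) -> Vec<(String, f64)> {
    let docs: Vec<Vec<String>> = posts.iter().map(|p| tokenize(p)).collect();
    let n_docs = docs.len() as f64;

    // Document frequency: in how many posts does each term appear?
    let mut df: HashMap<&str, f64> = HashMap::new();
    for doc in &docs {
        let unique: HashSet<&str> = doc.iter().map(String::as_str).collect();
        for term in unique {
            *df.entry(term).or_insert(0.0) += 1.0;
        }
    }

    let mut weights: HashMap<&str, f64> = HashMap::new();
    for doc in &docs {
        if doc.is_empty() {
            continue;
        }
        let len = doc.len() as f64;
        let mut counts: HashMap<&str, f64> = HashMap::new();
        for term in doc {
            *counts.entry(term.as_str()).or_insert(0.0) += 1.0;
        }
        for (term, count) in counts {
            // Smoothed IDF (assumed variant): ln(N / df) + 1.
            let idf = (n_docs / df[term]).ln() + 1.0;
            *weights.entry(term).or_insert(0.0) += (count / len) * idf;
        }
    }

    let mut ranked: Vec<(String, f64)> =
        weights.into_iter().map(|(t, w)| (t.to_string(), w)).collect();
    // Highest weight first; ties broken alphabetically for stable output.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked
}

fn main() {
    let posts = ["rust rust compiler", "rust borrow checker", "gardening tomatoes"];
    for (term, weight) in fingerprint(&posts).iter().take(3) {
        println!("{term:<12} {weight:.3}");
    }
}
```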
cargo run -- scan --analyze
This is the main pipeline:
- Queries the Constellation backlink index for quote/repost events on your posts
- Fetches the follower list of each amplifier
- Scores each follower for toxicity and topic overlap
- Stores results in the database
Options:
- --analyze: actually score followers (without this, only events are recorded)
- --max-followers N: limit followers analyzed per amplifier (default: 50)
- --concurrency N: parallel scoring workers (default: 8)
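The interaction between --max-followers and --concurrency can be pictured as "cap the list, then split it across workers". The sketch below uses standard-library scoped threads to stay dependency-free; the real pipeline may well use an async runtime instead, and `score_followers` and its closure parameter are hypothetical.

```rust
use std::thread;

/// Score at most `max_followers` handles using up to `concurrency` worker threads.
/// `score` stands in for the real per-account scoring step.
fn score_followers<F>(
    followers: &[String],
    max_followers: usize,
    concurrency: usize,
    score: F,
) -> Vec<(String, f64)>
where
    F: Fn(&str) -> f64 + Sync,
{
    let capped = &followers[..followers.len().min(max_followers)];
    if capped.is_empty() {
        return Vec::new();
    }
    let workers = concurrency.max(1);
    // Ceiling division so every follower lands in some chunk.
    let chunk_size = (capped.len() + workers - 1) / workers;
    let score = &score;

    thread::scope(|scope| {
        let handles: Vec<_> = capped
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|h| (h.clone(), score(h.as_str())))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        // Join in spawn order so results keep the input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("scoring worker panicked"))
            .collect()
    })
}

fn main() {
    let followers: Vec<String> = (0..120).map(|i| format!("user{i}.example")).collect();
    // Placeholder scorer: handle length, purely for demonstration.
    let scored = score_followers(&followers, 50, 8, |handle| handle.len() as f64);
    println!("scored {} of {} followers", scored.len(), followers.len());
}
```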
cargo run -- sweep
Scans your followers-of-followers, the accounts one hop removed from your direct audience. These are people who haven't encountered your content yet but may if an amplification event occurs.
Options:
- --max-followers N: first-degree followers to scan (default: 200)
- --depth N: second-degree followers per first-degree (default: 50)
- --concurrency N: parallel scoring workers (default: 8)
This is slower than scan (potentially thousands of API calls) and is
designed for periodic use rather than continuous monitoring.
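To see why a sweep is expensive, it helps to multiply the two limits: with the defaults, 200 first-degree followers times 50 second-degree followers each gives up to 10,000 accounts to score. The helper below is only a back-of-the-envelope estimate with a hypothetical name; the actual number of API calls per account is not specified here.

```rust
/// Upper bound on second-degree accounts a sweep could score.
fn sweep_upper_bound(max_followers: u64, depth: u64) -> u64 {
    max_followers.saturating_mul(depth)
}

fn main() {
    let accounts = sweep_upper_bound(200, 50);
    println!("default sweep: up to {accounts} second-degree accounts");
}
```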
Score a single account:
cargo run -- score @someone.bsky.social
Generate a threat report:
cargo run -- report
Outputs a ranked threat list to the terminal and saves a markdown report to
output/charcoal-report.md. Use --min-score N to filter by minimum threat score.
Check system status:
cargo run -- status
Shows last scan time, database stats, fingerprint age, and scorer config.
Charcoal assigns each scored account a threat tier based on their combined toxicity + topic overlap score (0-100):
| Tier | Score | Meaning |
|---|---|---|
| Low | 0-7 | No significant threat signal |
| Watch | 8-14 | Some overlap or toxicity — worth monitoring |
| Elevated | 15-24 | Notable combination of hostility and topic proximity |
| High | 25+ | Strong threat signal — both toxic and topically close |
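The tier boundaries in the table translate directly into a lookup. The sketch below assumes each tier's lower bound acts as a threshold, so a fractional score such as 7.5 falls into Low; the `ThreatTier` enum and `tier_for` function are illustrative names rather than Charcoal's actual types.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ThreatTier {
    Low,
    Watch,
    Elevated,
    High,
}

/// Map a combined 0-100 score to a tier using the table's lower bounds.
fn tier_for(score: f64) -> ThreatTier {
    if score >= 25.0 {
        ThreatTier::High
    } else if score >= 15.0 {
        ThreatTier::Elevated
    } else if score >= 8.0 {
        ThreatTier::Watch
    } else {
        ThreatTier::Low
    }
}

fn main() {
    for score in [3.0, 8.0, 19.5, 42.0] {
        println!("{score:>5.1} -> {:?}", tier_for(score));
    }
}
```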
Charcoal uses a local ONNX model (Detoxify unbiased-toxic-roberta) by default. This model:
- Runs on CPU with no API calls or rate limits
- Returns scores across 7 toxicity categories
- Was trained to reduce bias around identity mentions (important when your topics include things like fat liberation, queer identity, or trans rights)
Google's Perspective API is available as a fallback by setting
CHARCOAL_SCORER=perspective in your .env file (requires a
PERSPECTIVE_API_KEY). Note: Perspective API is sunsetting December 2026.
Charcoal uses SQLite by default. For server deployments you can switch to PostgreSQL:
# Build with Postgres support
cargo build --release --features postgres
# Point at your database
export DATABASE_URL=postgres://user:pass@host/dbname
Prerequisite (pgvector extension): Charcoal's first migration runs
CREATE EXTENSION IF NOT EXISTS vector, which requires superuser privileges
(or the extension to be pre-installed by your database provider). On managed
Postgres (Railway, Fly.io, Supabase, etc.) the vector extension is usually
available but you may need to enable it through their dashboard or with a
superuser connection before running Charcoal for the first time. On
self-hosted Postgres, run CREATE EXTENSION vector as a superuser once:
-- Connect as a superuser, then:
CREATE EXTENSION IF NOT EXISTS vector;
To transfer existing SQLite data to Postgres:
cargo run --features postgres -- migrate --database-url postgres://user:pass@host/dbname
src/
main.rs CLI entry point (clap)
config.rs Environment-based configuration
lib.rs Library root
bluesky/ Public AT Protocol client, post fetching, amplification types
topics/ TF-IDF topic extraction and fingerprinting
toxicity/ Scorer trait + ONNX and Perspective backends
scoring/ Profile building and threat score computation
pipeline/ Amplification detection pipeline
output/ Terminal display and markdown report generation
db/ SQLite/PostgreSQL backends, schema, queries, and data models
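The layout above mentions a scorer trait with swappable backends. As a rough picture of that pattern, the sketch below defines a hypothetical `ToxicityScorer` trait, a deliberately naive keyword backend standing in for a real model, and a `profile` helper that returns the mean score together with the highest-scoring post as evidence. None of these names or signatures are taken from the Charcoal source.

```rust
/// Hypothetical scorer interface; the real trait in toxicity/ may differ.
trait ToxicityScorer {
    /// Returns a toxicity estimate in 0.0..=1.0 for one post.
    fn score(&self, text: &str) -> f64;
}

/// Stand-in backend for illustration only: flags posts containing listed words.
struct KeywordScorer {
    hostile_words: Vec<String>,
}

impl ToxicityScorer for KeywordScorer {
    fn score(&self, text: &str) -> f64 {
        let lower = text.to_lowercase();
        if self.hostile_words.iter().any(|w| lower.contains(w.as_str())) {
            1.0
        } else {
            0.0
        }
    }
}

/// Score every post; return the mean score and the single worst post as evidence.
/// Returns None when there are no posts.
fn profile<'a>(scorer: &dyn ToxicityScorer, posts: &[&'a str]) -> Option<(f64, &'a str)> {
    let mut best: Option<(f64, &'a str)> = None;
    let mut total = 0.0;
    for &post in posts {
        let s = scorer.score(post);
        total += s;
        if best.map_or(true, |(b, _)| s > b) {
            best = Some((s, post));
        }
    }
    best.map(|(_, post)| (total / posts.len() as f64, post))
}

fn main() {
    let scorer = KeywordScorer { hostile_words: vec!["idiot".to_string()] };
    let posts = ["lovely weather today", "you absolute idiot", "new blog post is up"];
    if let Some((mean, evidence)) = profile(&scorer, &posts) {
        println!("mean toxicity {mean:.2}, evidence: {evidence:?}");
    }
}
```

Because `profile` only depends on the trait object, a local model or a remote API backend could be substituted without touching the calling code.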
Charcoal includes a web-based dashboard for browsing scored accounts and triggering scans from a browser. It uses AT Protocol OAuth for authentication.
cd web && npm ci && npm run build && cd ..
cargo build --release --features web
You need three additional environment variables (see .env.example):
- CHARCOAL_ALLOWED_DID: your Bluesky DID (only this account can sign in)
- CHARCOAL_OAUTH_CLIENT_ID: URL of your OAuth client metadata document
- CHARCOAL_SESSION_SECRET: HMAC signing key for session cookies (generate with openssl rand -hex 32)
For local development, use Tailscale Funnel to get a public HTTPS URL:
tailscale funnel 3000
Then register that URL as your OAuth client ID.
cargo run --features web -- serve
The dashboard is available at http://localhost:3000 (or your Tailscale Funnel URL).
# First-time setup: install git hooks (enforces fmt + clippy + tests)
./scripts/install-hooks.sh
cargo test --features web # Run all 225 tests (unit + integration + OAuth)
cargo clippy # Lint
cargo run -- status # Quick smoke test
On macOS Tahoe with Xcode Beta installed, the linker may fail to find
clang_rt.osx. Fix by setting the library path before build/test commands:
export LIBRARY_PATH="/Library/Developer/CommandLineTools/usr/lib/clang/17/lib/darwin:$LIBRARY_PATH"
Add this to your shell profile (~/.zshrc) to make it permanent.
PostgreSQL integration tests require a live database:
DATABASE_URL=postgres://charcoal:charcoal@localhost/charcoal_test \
  cargo test --all-targets --features postgres
All rights reserved.