✨ feat: allow different judge models for same judge type and show stats in dashboard by marcorusso97 · Pull Request #420 · AISecurityLab/hackagent

Marco Russo (marcorusso97) · 2026-06-04T14:01:06Z

Summary

This PR introduces full support for running multiple judges of the same type with different models, and ensures their outputs are correctly tracked, aggregated, and rendered across the dashboard.

It also fixes consistency issues between summary panels and expanded detail views, so judge counts, names, metrics, and verdicts stay aligned.

Why

When two or more judges shared the same type, judge vote keys could collide and overwrite each other.
This caused missing judges, incorrect counts, incomplete strictness/ASR values, and absent verdict blocks in detail cards.

What Changed

Multi-judge key stability

Added deterministic suffixing for duplicate judge types.
Preserved distinct per-judge votes using canonical keys such as:
- eval_hb
- eval_hbv_1
- eval_hbv_2
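
A minimal sketch of how the suffixing could work, not the PR's actual implementation. It assumes a key that occurs once keeps its base name, and that repeated keys are numbered from 1 in configuration order, which matches the example keys above. The function name `assign_judge_keys` is hypothetical.

```python
from collections import Counter


def assign_judge_keys(base_keys):
    """Return one unique key per judge, in input order.

    Assumption: a base key that occurs once is kept unchanged; a base key
    that occurs several times gets a 1-based suffix (_1, _2, ...).
    """
    totals = Counter(base_keys)  # how often each base key appears overall
    seen = Counter()             # how many of each repeated key were emitted
    keys = []
    for base in base_keys:
        if totals[base] == 1:
            keys.append(base)
        else:
            seen[base] += 1
            keys.append(f"{base}_{seen[base]}")
    return keys


example_keys = assign_judge_keys(["eval_hb", "eval_hbv", "eval_hbv"])
```

Because the suffix depends only on the order of the configured judges, the same configuration always yields the same keys. This sketch does not guard against a base key that already ends in a numeric suffix.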

Evaluation and metrics pipeline

Updated evaluation handling to avoid overwriting votes from repeated judge types.
Improved aggregation and sync logic to preserve per-judge outputs end to end.
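
To illustrate why distinct keys matter for aggregation, here is a hedged sketch of a per-judge success rate. It assumes each goal's result is a dict mapping a judge key to a 0/1 vote, and that ASR is the fraction of goals a judge marked as successful. The function name `per_judge_asr` and the data shape are assumptions, not the project's API.

```python
def per_judge_asr(results):
    """Compute the fraction of positive votes for each judge key.

    Assumption: `results` is a list with one dict per goal, mapping a
    judge key (e.g. "eval_hbv_1") to a 0/1 vote. Judges missing from a
    goal are simply not counted for that goal.
    """
    totals = {}
    successes = {}
    for votes in results:
        for key, vote in votes.items():
            totals[key] = totals.get(key, 0) + 1
            successes[key] = successes.get(key, 0) + int(bool(vote))
    return {key: successes[key] / totals[key] for key in totals}


example_results = [
    {"eval_hb": 1, "eval_hbv_1": 0, "eval_hbv_2": 1},
    {"eval_hb": 0, "eval_hbv_1": 0, "eval_hbv_2": 1},
]
example_asr = per_judge_asr(example_results)
```

If both `eval_hbv` judges wrote to the same key, the second vote would replace the first and only one rate would be reported for the pair.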

Dashboard enrichment and rendering

Standardized propagation of:
- judge votes
- judge metadata (name/type)
- per-goal multi-judge metrics
Added robust fallbacks for legacy runs and sparse trace payloads.
Unified multi-judge verdict styling and behavior across attack cards.

Attack card updates

Improved verdict rendering in:
- AdvPrefix
- PAP
- Baseline
- BoN
- Generic card paths used by FlipAttack, CipherChat, and H4RM3L
These rendering changes do not apply to scorer-based attacks (AutoDan-Turbo, PAIR, TAP).
Ensured verdicts also appear in mitigated scenarios when judge votes are available.

Tests

Updated unit tests for:
- evaluation step
- sync behavior
- metrics behavior
Added coverage for repeated judge-type scenarios and key-collision prevention.

Impact

Backward compatible for existing runs.
Eliminates duplicate-judge key collisions.
Improves reliability and transparency of multi-judge analytics in the dashboard.

codecov · 2026-06-04T14:12:09Z

Codecov Report

❌ Patch coverage is 15.78947% with 576 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
hackagent/server/dashboard/_page.py	0.51%	386 Missing ⚠️
...kagent/server/dashboard/attack_cards/_advprefix.py	0.00%	40 Missing ⚠️
hackagent/server/dashboard/attack_cards/_shared.py	10.00%	36 Missing ⚠️
...ckagent/server/dashboard/attack_cards/_baseline.py	0.00%	30 Missing ⚠️
hackagent/attacks/evaluator/evaluation_step.py	78.09%	23 Missing ⚠️
...ackagent/server/dashboard/attack_cards/_generic.py	0.00%	22 Missing ⚠️
hackagent/server/dashboard/attack_cards/_bon.py	0.00%	11 Missing ⚠️
hackagent/attacks/techniques/bon/generation.py	0.00%	9 Missing ⚠️
hackagent/attacks/techniques/pap/generation.py	0.00%	9 Missing ⚠️
hackagent/server/dashboard/attack_cards/_pap.py	0.00%	7 Missing ⚠️
... and 2 more

Commit 96bd08b: ✨ feat: allow different judge models for same judge type and show stats in dashboard

Marco Russo (marcorusso97) requested a review from Raffaele Paolino (RPaolino) June 4, 2026 14:01

Marco Russo (marcorusso97) linked an issue Jun 4, 2026 that may be closed by this pull request: Allow usage of multiple judges of same type #414 (Open)

Marco Russo (marcorusso97) wants to merge 1 commit into main from 414-allow-usage-of-multiple-judges-of-same-type
