Independent research by i-314 Security Research. Not an OWASP project; not endorsed by or affiliated with the OWASP Foundation or the OWASP GenAI Security Project. The OWASP FinBot CTF is referenced solely as the validation target — see Acknowledgments.
Reference implementation of Tool Output Mimicry — a primitive that bypasses multi-layer agentic AI defenses by impersonating an upstream agent's structured task summary in a user-controllable field that a downstream agent reads.
Paper: doi.org/10.5281/zenodo.19794072 (Zenodo, CC BY 4.0) · concept-DOI 10.5281/zenodo.19794071
sequenceDiagram
autonumber
actor V as Vendor (attacker)
participant T as Tool registry<br/>(create_transfer)
participant D as FinDrive<br/>(file storage)
participant I as invoice_agent
participant P as payments_agent
participant Det as Detector
V->>T: Layer 1 — poison create_transfer<br/>description (Prereq B)
V->>D: Layer 2 — upload stego HTML<br/>(visible $5K, hidden $8K) (Prereq A)
V->>I: Layer 3 — submit invoice with<br/>[FRAUD AGENT TASK_SUMMARY] block in description
I->>P: forward task with stego attachment<br/>+ impersonation block in context
Note over P: payments_agent reads the impersonated<br/>"fraud_agent decision" as authoritative
P->>P: create_transfer(amount=$8000)<br/>(NOT invoice.amount=$5000)
Det-->>Det: Gate 1 — overpayment detected ✓<br/>Gate 2 — stego markers in attachment ✓
Det-->>V: status = completed
The invariant the primitive exploits: multi-agent orchestrators that forward plain-text task summaries between agents establish a trust boundary inside a user-controllable channel. Any field the downstream agent reads — an invoice description, a vendor profile note, document content — can be crafted to look like the upstream agent's structured output, and the downstream agent will treat it as authoritative.
Live capture re-validated 2026-04-27 against a freshly-registered OWASP FinBot CTF account: see evidence/finbot_capture_20260427.log. The detector emitted a completed state after a single attempt with the exact $8,000 overpayment against a $5,000 invoice the paper describes. The discovery narrative — including two material schema changes the FinBot deployment underwent between the original capture and this re-validation — is in docs/discovery_log.md.
A second-target reproducer is planned for paper v1.1 (target TBD; candidates listed in the paper §VII).
Install directly from this Git repository:
pip install git+https://github.com/314-ia/tool-output-mimicryOr clone for hacking / contributing:
git clone https://github.com/314-ia/tool-output-mimicry
cd tool-output-mimicry
pip install -e ".[test]"The importable Python package is tom_repro (intentional short alias for Tool Output Mimicry reproducer, same pattern as import sklearn from scikit-learn).
Note on PyPI: This package is intentionally not published on PyPI. The PyPI Acceptable Use Policy excludes "dual-use content, including content that is used for research into vulnerabilities, malware, or exploits, including bug bounties." This reproducer is exactly that kind of dual-use research artefact, so we distribute via Git + Zenodo + Software Heritage instead — the canonical academic publication path. Distribution channels other than PyPI (apt, npm, Docker, etc.) may be added later if community demand materialises and they have a more permissive policy for security research.
tom-repro-finbot --dry-runPrints the rendered impersonation block, the stego HTML attachment, and confirms each Gate-2 detector regex matches. Exits 0 on PASS, 2 on structural failure. This is the canonical CI smoke test.
You need an account at https://owasp-finbot-ctf.org. Read your finbot_session cookie value from the browser; the CSRF token and a vendor will be auto-bootstrapped from your session.
export FINBOT_COOKIE='<your finbot_session value>'
tom-repro-finbotYou can also pass --csrf and --vendor-id explicitly to skip the bootstrap step. Successful capture exits 0 and prints the FinBot detector's evidence dict.
src/tom_repro/primitive.py—UpstreamAgentImpersonation,ToolOutputMimicrysrc/tom_repro/stego.py—generate_stego_html,STEGO_CSS_MARKERSsrc/tom_repro/targets/finbot.py— OWASP FinBot CTF target drivertests/— offline structural tests (24 cases, no network)evidence/— verbatim capture transcripts from live target runsdocs/— discovery log
Modern multi-agent AI orchestrators forward each agent's task summary as authoritative context to the next agent in the pipeline. A user-controllable field that the downstream agent reads — an invoice description, a vendor profile field, document content — can be crafted to impersonate the structured output of an upstream agent. The downstream agent then issues redirected tool calls without violating its prompt-level guardrails.
Tool Output Mimicry is a refinement of indirect prompt injection (Greshake et al., 2023) specialised for the inter-agent trust boundary. The technique was empirically validated against the OWASP FinBot CTF, where it was the only primitive among twenty-plus attempts in two engagement sessions to capture the fine-print challenge — causing a payment-processor agent to issue a US$8,000 transfer against a US$5,000 invoice.
If you cite the technique (the primitive itself, the threat model, the empirical claim), cite the paper:
@misc{brana2026toolmimicry,
author = {Brana, Juan Pablo},
title = {Tool Output Mimicry: Bypassing Multi-Layer Agentic AI Defenses via Upstream-Agent Impersonation in User-Controlled Fields},
year = {2026},
institution = {i-314 Research Lab},
publisher = {Zenodo},
doi = {10.5281/zenodo.19794072},
url = {https://doi.org/10.5281/zenodo.19794072},
note = {Lab homepage: \url{https://i-314.com}}
}If you cite the software (this reproducer, a specific version of the code), cite the software record:
@software{brana2026toolmimicry_software,
author = {Brana, Juan Pablo},
title = {Tool Output Mimicry — Reference Reproducer},
year = {2026},
institution = {i-314 Research Lab},
publisher = {Zenodo},
version = {v0.1.0},
doi = {10.5281/zenodo.19826743},
url = {https://doi.org/10.5281/zenodo.19826743},
note = {Concept-DOI; resolves to the latest software version. Version-DOI for v0.1.0 is 10.5281/zenodo.19826744.}
}- New target adapters are the most-wanted contribution — see CONTRIBUTING.md for the adapter contract and conventions.
- Vulnerability disclosure (in this code, or vulnerabilities discovered using the primitive) — see SECURITY.md for scope, authorised-use checklist, and the dual-track disclosure process.
This work would not have been possible without the OWASP FinBot CTF — the public agentic-AI security training platform that served as the validation target for this primitive. Particular thanks to Helen Oakley (creator; Co-lead, OWASP Agentic Security Initiative) and John Sotiropoulos (Co-lead, OWASP Agentic Security Initiative) for designing and operating the platform on which the original capture (April 2026) and the post-publication re-validation (2026-04-27) were both performed, and to the broader OWASP GenAI Security Project community for the public discussion that refined the framing of the primitive.
The primitive itself is a refinement of indirect prompt injection (Greshake et al., 2023) for the inter-agent trust boundary — see the paper's references for the full intellectual lineage.
MIT — see LICENSE.
i-314 Security Research — juan.brana@i-314.com