FEAT: Runtime capability discovery for prompt targets #1699

Open
hannahwestra25 wants to merge 21 commits into microsoft:main from hannahwestra25:hawestra/query_target_capabilities

@hannahwestra25 hannahwestra25 commented May 8, 2026

Description

Adds query_target_capabilities.py, which probes a PromptTarget at runtime to determine what the underlying endpoint actually accepts. Useful for custom OpenAI-compatible endpoints, gateways that strip features, or any deployment where declared capabilities may not match real behavior.

New public API (exported from pyrit.prompt_target)

  • query_target_capabilities_async — probes boolean capability flags (SYSTEM_PROMPT, MULTI_MESSAGE_PIECES, MULTI_TURN, JSON_OUTPUT, JSON_SCHEMA).
  • verify_target_modalities_async — probes which input-modality combinations are accepted.
  • verify_target_async — runs both and returns a populated TargetCapabilities.

Each probe is bounded by per_probe_timeout_s (default 30s) and retried once on transient errors. The target's configuration is temporarily replaced with a permissive one so _validate_request doesn't short-circuit. Probe-written memory rows are tagged with prompt_metadata["capability_probe"] == "1".
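The timeout-plus-single-retry behavior described above can be sketched roughly as follows. This is not the code from the PR: `run_probe_with_retry` and the choice of which exceptions count as transient are assumptions made for illustration; only `per_probe_timeout_s` and its 30 s default come from the description.

```python
import asyncio
from typing import Awaitable, Callable

# Assumption: which errors count as "transient" is not stated in the PR text.
TRANSIENT_ERRORS = (asyncio.TimeoutError, ConnectionError)


async def run_probe_with_retry(
    probe: Callable[[], Awaitable[bool]],
    per_probe_timeout_s: float = 30.0,
) -> bool:
    """Run a probe under a timeout, retrying once immediately on a transient error.

    Hypothetical helper: returns False when the probe is rejected or when the
    retry also fails.
    """
    for attempt in range(2):  # first attempt plus one immediate retry
        try:
            return await asyncio.wait_for(probe(), timeout=per_probe_timeout_s)
        except TRANSIENT_ERRORS:
            if attempt == 1:
                return False  # still failing after the retry
        except Exception:
            return False  # deterministic rejection: retrying would not help
    return False
```

A probe that fails once with a connection error and then succeeds would be reported as supported, while a probe that raises a non-transient error is reported as unsupported without a second call.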

Caveats (in docstrings)

  • "Supported" means the request was accepted — silent ignores aren't detected.
  • Output modality probing is intentionally not provided.
  • Not safe to call concurrently against the same target instance.
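One plausible reason for the concurrency caveat is the temporary configuration swap mentioned in the description: while a probe runs, the target instance carries state that is not its own. A minimal sketch of that pattern, using a hypothetical `FakeTarget` and `permissive_configuration` helper rather than PyRIT's actual classes:

```python
from contextlib import contextmanager
from typing import Any, Iterator


class FakeTarget:
    """Hypothetical stand-in for a target; not PyRIT's PromptTarget."""

    def __init__(self, configuration: Any) -> None:
        self.configuration = configuration


@contextmanager
def permissive_configuration(target: FakeTarget, permissive: Any) -> Iterator[FakeTarget]:
    """Swap in a permissive configuration and always restore the original."""
    original = target.configuration
    target.configuration = permissive
    try:
        yield target
    finally:
        # Restore even if the probe raised, so the target is left unchanged.
        target.configuration = original
```

Because the swap mutates the shared instance, two overlapping callers could each capture and restore the wrong "original" value, which is consistent with the advice to probe one target instance at a time.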

Tests & docs

  • 32 unit tests, 98% coverage of the new module.
  • Notebook: 6_1_target_capabilities.ipynb.
  • New subsection in 0_prompt_targets.md.

Comment thread doc/code/targets/0_prompt_targets.md Outdated
)

# Probe a single dimension:
verified_caps = await query_target_capabilities_async(target=target)
Contributor

It would be more intuitive IMO if it was target.get_capabilities() or (even better) target.capabilities (and similarly target.input_modalities / target.output_modalities) since these are static after instantiation (right?).

Having to import a function makes it a bit more obscure.

Contributor Author

Hmm, I get your point about importing, but the idea is to query the capabilities when they are not known, and I think target.get_capabilities and target.capabilities don't convey that we're making API calls (potentially several, if we're querying for multiple capabilities).

Also, this query inspects a target rather than performing something a target is responsible for, so I think putting it in the target class bloats the class and blurs what the target actually does (a target doesn't query itself, it sends prompts). I would also be concerned that users would read the query as a getter for the already declared capabilities rather than something that actually makes API calls.

(Also, if it was unclear to you that this function makes API calls to determine the capabilities, I could rename it. Something like discover_target_capabilities_async might be better?)

Comment thread pyrit/prompt_target/common/query_target_capabilities.py
Comment thread doc/code/targets/6_1_target_capabilities.py
)


async def _send_and_check_async(
Contributor

should we add a backoff here?

Contributor Author

Good question! I don't think backoff is needed here: these probes are trying to distinguish supported from unsupported, and most failures are deterministic rather than transient, so waiting longer usually does not change the answer and would just add time when retrying a capability that isn't supported. I think a single immediate retry is enough to cover brief network noise or a one-off timeout without making capability detection noticeably slower. wdyt?

original_value='Respond with a JSON object: {"ok": true}.',
original_value_data_type="text",
conversation_id=conversation_id,
prompt_metadata=_probe_metadata({"response_format": "json"}),
Contributor

I understand that the point of this probe is not to check whether the capabilities actually take effect in the response. But for _probe_json_output_async and _probe_json_schema_async, if I'm understanding correctly, the metadata is only ever parsed by Responses and Chat targets and is silently ignored by other PromptTarget subclasses, which feels different from being silently ignored by the endpoint at inference time (as with the other probes). Could we check whether the returned value is JSON-formatted? Or at least make it clearer in the docstring that this only applies to specific targets.

Contributor Author

good point! I updated the docstring to try to convey this restriction, but yes that is the limitation here. For some targets the JSON hint is converted into a real provider parameter; for others it is only metadata on the PyRIT side and never becomes a structured-output request. Parsing the returned text as JSON would test output compliance, but not native JSON-mode support. lmk if that isn't clear in the comments!
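For reference, the output-compliance check discussed in this thread could look roughly like the sketch below. `looks_like_json_object` is a hypothetical helper, not part of the PR; as noted above, it would only show that the returned text happens to parse as JSON, not that the endpoint supports a native JSON mode.

```python
import json


def looks_like_json_object(text: str) -> bool:
    """Return True if the text parses as a JSON object (hypothetical helper)."""
    try:
        return isinstance(json.loads(text), dict)
    except (json.JSONDecodeError, TypeError):
        # Not valid JSON, or not a string at all.
        return False
```

A model can pass this check just by following the prompt's instruction, and can fail it by wrapping valid JSON in extra prose, so the result says little about whether the provider accepted a structured-output parameter.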

Comment on lines +79 to +80
CapabilityName.MULTI_TURN: UnsupportedCapabilityBehavior.RAISE,
CapabilityName.SYSTEM_PROMPT: UnsupportedCapabilityBehavior.RAISE,
Contributor

q: why only RAISE on these two?

Contributor Author

These are the only two capabilities that can be adapted:

behaviors: Mapping[CapabilityName, UnsupportedCapabilityBehavior] = field(

We don't want to allow adaptation here because we just want to know what is supported, not what can be adapted.

Comment thread pyrit/prompt_target/common/query_target_capabilities.py Outdated
Comment thread doc/code/targets/6_1_target_capabilities.py
3 participants