[Python] Add agent-framework-azure-ai-contentunderstanding package #4829

Merged
TaoChenOSU merged 95 commits into microsoft:main from yungshinlintw:yslin/contentunderstanding-context-provider
Apr 28, 2026
yungshinlintw commented Mar 22, 2026

Reviewer's Guide

Closes #4942

This package adds a BaseContextProvider implementation that bridges Azure Content Understanding (CU) with the Agent Framework. When a user sends file attachments (PDF, images, audio, video), the provider intercepts them in before_run(), sends them to CU for analysis, and injects the structured results (markdown + extracted fields) back into the LLM context — so the agent can answer questions about the files without the developer writing any extraction code.

Quick usage:

cu = ContentUnderstandingContextProvider(
    endpoint="https://my-resource.services.ai.azure.com/",
    credential=AzureCliCredential(),
)
agent = Agent(
    client=client,
    name="DocQA",
    instructions="You are a document analyst.",
    context_providers=[cu],
)
# Files in Message.contents are auto-analyzed; results injected into LLM context
response = await agent.run(
    Message(role="user", contents=[
        Content.from_text("What's on this invoice?"),
        Content.from_uri("https://example.com/invoice.pdf", media_type="application/pdf",
                         additional_properties={"filename": "invoice.pdf"}),
    ]),
    session=session,
)

Suggested review order

1. Start with samples — they show the feature set and usage patterns end-to-end:

| Sample | What it demonstrates |
| --- | --- |
| 01_document_qa.py | Simplest flow: upload a PDF via URL, ask a question about it. Shows Content.from_uri(), context_providers=[cu], and how CU results appear in the agent's response. |
| 02_multi_turn_session.py | AgentSession persistence: upload a file on turn 1, ask follow-up questions on turns 2–3 without re-uploading. Shows how state["documents"] carries across turns. |
| 03_multimodal_chat.py | PDF + audio + video in a single session (5 turns). Shows auto-detection of media types, parallel analysis, and multi-segment video output with per-segment fields. |
| 04_invoice_processing.py | Per-file analyzer override: uses additional_properties={"analyzer_id": "prebuilt-invoice"} to extract structured invoice fields (vendor, total, line items) instead of generic markdown. |
| 05_background_analysis.py | Non-blocking analysis with max_wait=0.5: the file starts analyzing in the background while the agent responds immediately. The next turn resolves the pending result. Shows the analyzing → ready status flow. |
| 06_large_doc_file_search.py | CU extraction + OpenAI vector store for RAG: large documents are analyzed by CU, uploaded to a vector store, and retrieved via the file_search tool instead of injecting full content into context. |
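
The non-blocking flow described for 05_background_analysis.py can be sketched with plain asyncio. This is only an illustration of the timeout-then-background pattern, not the package's code: `Status`, `fake_analyze`, `analyze_with_deadline`, and `resolve_pending` are hypothetical names, and the sleep stands in for a real CU analysis call.

```python
import asyncio
from enum import Enum


class Status(str, Enum):  # hypothetical stand-in for the package's status enum
    ANALYZING = "analyzing"
    READY = "ready"


async def fake_analyze(name: str, seconds: float) -> str:
    """Stand-in for a slow CU analysis call (assumption: returns markdown)."""
    await asyncio.sleep(seconds)
    return f"markdown for {name}"


async def analyze_with_deadline(state: dict, name: str, seconds: float, max_wait: float) -> None:
    """Wait up to max_wait seconds; if not finished, leave the task running."""
    task = asyncio.create_task(fake_analyze(name, seconds))
    done, _ = await asyncio.wait({task}, timeout=max_wait)
    if task in done:
        state[name] = {"status": Status.READY, "result": task.result()}
    else:
        # asyncio.wait does not cancel on timeout, so the analysis continues.
        state[name] = {"status": Status.ANALYZING, "task": task}


async def resolve_pending(state: dict) -> None:
    """On a later turn, promote finished background tasks to ready."""
    for entry in state.values():
        task = entry.get("task")
        if task is not None and task.done():
            entry.update(status=Status.READY, result=task.result())
            del entry["task"]


async def main() -> dict:
    state: dict = {}
    # Turn 1: analysis takes 0.2 s but we only wait 0.05 s.
    await analyze_with_deadline(state, "report.pdf", seconds=0.2, max_wait=0.05)
    first = state["report.pdf"]["status"]
    await asyncio.sleep(0.3)  # time passes before the next turn
    # Turn 2: the pending result is resolved.
    await resolve_pending(state)
    entry = state["report.pdf"]
    return {"first": first, "second": entry["status"], "result": entry["result"]}


summary = asyncio.run(main())
print(summary["first"].value, "->", summary["second"].value)
```

Using `asyncio.wait` with a timeout rather than `asyncio.wait_for` matters here: `wait_for` cancels the awaited task when the timeout expires, which would discard the in-flight analysis.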

2. Then review the core implementation:

| Priority | File | Why |
| --- | --- | --- |
| 🔴 High | _context_provider.py (1087 lines) | Core logic: before_run() hook, file detection/stripping, CU analysis with timeout + background fallback, output formatting, tool registration. Most important file to review. |
| 🔴 High | _models.py | Public API surface: DocumentEntry, DocumentStatus, AnalysisSection, FileSearchConfig TypedDicts and enums exposed to users |
| 🟡 Medium | _file_search.py | FileSearchBackend protocol + OpenAI/Foundry factory methods for vector store integration |
| 🟡 Medium | __init__.py | Public exports: verify the right symbols are exposed |
| 🟡 Medium | pyproject.toml | Package metadata, dependencies, version constraints |
| 🟢 Low | tests/ | 78 unit tests + 5 live integration tests |

MAF API usage (needs team alignment)

This package uses the following internal/private MAF APIs — if any of these are changing or not intended for external use, this package may need updates:

  • BaseContextProvider and its before_run() hook
  • SessionContext.extend_instructions(), extend_messages(), extend_tools()
  • Content.from_data(), Content.from_uri(), Content.type, Content.media_type, Content.additional_properties
  • FunctionTool for registering list_documents()
  • agent_framework._sessions.AgentSession
  • agent_framework._settings.load_settings()
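
To make the list_documents() tool and the session state it reads more concrete, here is a minimal sketch of a per-session document registry. It does not use the Agent Framework at all: `DocStatus`, `DocEntry`, `register_document`, and the plain `state` dict are hypothetical stand-ins, the status values are the ones named in this PR description, and raising `ValueError` on a duplicate filename is an assumption about how rejection might surface.

```python
from enum import Enum
from typing import TypedDict


class DocStatus(str, Enum):  # hypothetical; values taken from the PR description
    ANALYZING = "analyzing"
    UPLOADING = "uploading"
    READY = "ready"
    FAILED = "failed"


class DocEntry(TypedDict):  # hypothetical shape of one tracked document
    status: DocStatus
    media_type: str


def register_document(state: dict, filename: str, media_type: str) -> DocEntry:
    """Track a new file in the session; filenames must be unique."""
    documents = state.setdefault("documents", {})
    if filename in documents:
        raise ValueError(f"Duplicate filename in session: {filename!r}")
    entry: DocEntry = {"status": DocStatus.ANALYZING, "media_type": media_type}
    documents[filename] = entry
    return entry


def list_documents(state: dict) -> str:
    """Plain function that a framework tool wrapper could expose to the LLM."""
    documents = state.get("documents", {})
    if not documents:
        return "No documents in this session."
    return "\n".join(f"{name}: {entry['status'].value}" for name, entry in documents.items())


state: dict = {}
register_document(state, "invoice.pdf", "application/pdf")
state["documents"]["invoice.pdf"]["status"] = DocStatus.READY
register_document(state, "call.wav", "audio/wav")
print(list_documents(state))
```

Keeping the registry in a plain dictionary keyed by filename is what makes the uniqueness rule cheap to enforce and lets the state carry across turns with the session.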

This PR adds agent-framework-azure-ai-contentunderstanding, an optional connector package that integrates Azure Content Understanding (CU) into the Agent Framework as a context provider.

What's Included

Core (_context_provider.py, _models.py, _file_search.py)

  • ContentUnderstandingContextProvider -- auto-analyzes file attachments (PDF, images, audio, video) via Azure CU and injects structured results (markdown, fields) into LLM context
  • Auto-detects media type and selects the right CU analyzer (prebuilt-documentSearch, prebuilt-audioSearch, prebuilt-videoSearch)
  • Multi-document session state with status tracking (analyzing/uploading/ready/failed)
  • Configurable timeout (max_wait) with async background fallback
  • Output filtering (>90% token reduction) via AnalysisSection enum
  • Auto-registered list_documents() tool for status queries
  • Document content injected into conversation history for follow-up turns
  • Multi-segment video/audio: per-segment fields with time ranges
  • MIME sniffing for misidentified files (application/octet-stream)
  • Per-file analyzer ID override via Content.additional_properties["analyzer_id"] -- mix different analyzers in the same turn (e.g., prebuilt-invoice for invoices alongside prebuilt-documentSearch for general docs)
  • Duplicate filename rejection (filenames must be unique within a session)
  • Optional FileSearchConfig for vector store integration (OpenAI/Foundry backends)
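
The MIME sniffing and analyzer selection bullets above can be illustrated with a small standard-library sketch. The analyzer IDs come from this description; everything else is an assumption made for illustration: the function names, the short magic-byte table, routing images to prebuilt-documentSearch, and treating any `ftyp` box as video/mp4 (which is a simplification). The lock file in this PR lists the `filetype` library as a dependency, so the real detection logic is likely different.

```python
from typing import Optional

# Analyzer IDs named in the PR description.
DEFAULT_ANALYZERS = {
    "audio": "prebuilt-audioSearch",
    "video": "prebuilt-videoSearch",
}
# Assumption: documents and images both fall back to the document analyzer.
FALLBACK_ANALYZER = "prebuilt-documentSearch"

# A few well-known magic-byte prefixes (illustrative, far from exhaustive).
MAGIC_PREFIXES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"ID3", "audio/mpeg"),
]


def sniff_media_type(data: bytes, declared: str) -> str:
    """Re-detect the type only when the declared one is the generic fallback."""
    if declared != "application/octet-stream":
        return declared
    for prefix, media_type in MAGIC_PREFIXES:
        if data.startswith(prefix):
            return media_type
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "audio/wav"
    if data[4:8] == b"ftyp":  # ISO base media container; simplified to MP4 video
        return "video/mp4"
    return declared


def select_analyzer(media_type: str, additional_properties: Optional[dict] = None) -> str:
    """A per-file analyzer_id override wins; otherwise pick by media family."""
    override = (additional_properties or {}).get("analyzer_id")
    if override:
        return override
    family = media_type.split("/", 1)[0]
    return DEFAULT_ANALYZERS.get(family, FALLBACK_ANALYZER)


sniffed = sniff_media_type(b"%PDF-1.7 sample", "application/octet-stream")
print(sniffed, select_analyzer(sniffed))
print(select_analyzer(sniffed, {"analyzer_id": "prebuilt-invoice"}))
```

Checking the override first is what allows different analyzers to be mixed within one turn, since each attachment carries its own additional_properties.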

Samples (6 scripts + 3 DevUI)

  • 01_document_qa.py -- Single PDF upload + Q&A
  • 02_multi_turn_session.py -- AgentSession persistence across turns
  • 03_multimodal_chat.py -- PDF + audio + video parallel analysis (5 turns)
  • 04_invoice_processing.py -- Structured field extraction with prebuilt-invoice
  • 05_background_analysis.py -- Non-blocking analysis with max_wait + status tracking
  • 06_large_doc_file_search.py -- CU extraction + vector store RAG
  • 02-devui/01-multimodal_agent -- Interactive web UI for uploading and chatting with documents/audio/video
  • 02-devui/02-file_search_agent/azure_openai_backend -- DevUI with CU + Azure OpenAI file_search RAG
  • 02-devui/02-file_search_agent/foundry_backend -- DevUI with CU + Foundry file_search RAG

Tests

  • 66 unit tests covering all major flows
  • 5 live integration tests (CU endpoint required)
  • Test fixtures for PDF, audio, video, image, invoice modalities

markwallace-microsoft added the documentation and python labels Mar 22, 2026
github-actions bot changed the title from "[WIP] [Python] Add agent-framework-azure-contentunderstanding package (DO NOT REVIEW)" to "Python: [WIP] [Python] Add agent-framework-azure-contentunderstanding package (DO NOT REVIEW)" Mar 22, 2026
markwallace-microsoft commented Mar 23, 2026

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| packages/azure-contentunderstanding/agent_framework_azure_contentunderstanding | | | | |
| _context_provider.py | 251 | 42 | 83% | 177–180, 283–284, 286, 290–291, 294, 298, 381, 383, 399, 521, 525, 579, 629–630, 653, 655–658, 798, 802, 808, 828–833, 835–840, 848, 857–858 |
| _detection.py | 80 | 11 | 86% | 156, 162, 167–172, 231–233 |
| _extraction.py | 174 | 12 | 93% | 40, 122, 125, 146, 224, 256, 277–281, 292 |
| _file_search.py | 23 | 4 | 82% | 58, 65, 69, 72 |
| _models.py | 37 | 1 | 97% | 111 |
| TOTAL | 30448 | 3549 | 88% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 6113 | 30 💤 | 0 ❌ | 0 🔥 | 1m 40s ⏱️ |

yungshinlintw changed the title from "Python: [WIP] [Python] Add agent-framework-azure-contentunderstanding package (DO NOT REVIEW)" to "Python: [WIP] [Python] Add agent-framework-azure-ai-contentunderstanding package (DO NOT REVIEW)" Mar 26, 2026
yungshinlintw changed the title from "Python: [WIP] [Python] Add agent-framework-azure-ai-contentunderstanding package (DO NOT REVIEW)" to "[Python] Add agent-framework-azure-ai-contentunderstanding package" Mar 27, 2026
yungshinlintw marked this pull request as ready for review March 27, 2026 05:14
Copilot AI review requested due to automatic review settings March 27, 2026 05:14
Copilot AI left a comment

Pull request overview

Adds a new optional Python connector package, agent-framework-azure-ai-contentunderstanding, integrating Azure Content Understanding (CU) into the Agent Framework as a BaseContextProvider for automatic attachment analysis and optional vector-store (file_search) indexing.

Changes:

  • Introduces ContentUnderstandingContextProvider plus supporting models and vector-store upload abstraction (FileSearchBackend / FileSearchConfig).
  • Adds extensive unit + integration tests and CU result fixtures, along with script + DevUI samples.
  • Wires the new workspace package into python/pyproject.toml and python/uv.lock.

Reviewed changes

Copilot reviewed 36 out of 38 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| python/uv.lock | Adds the new workspace member and locks new deps (azure-ai-contentunderstanding, filetype). |
| python/pyproject.toml | Registers the package in workspace deps and adds pyright test env config. |
| python/packages/azure-ai-contentunderstanding/pyproject.toml | New package metadata, deps, and tooling config (pytest/ruff/mypy/pyright). |
| python/packages/azure-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/__init__.py | Public exports for provider/models/backends. |
| python/packages/azure-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/_models.py | Defines DocumentStatus, AnalysisSection, DocumentEntry, FileSearchConfig. |
| python/packages/azure-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/_file_search.py | Adds backend abstraction for vector store upload/delete across OpenAI/Foundry clients. |
| python/packages/azure-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/_context_provider.py | Implements CU analysis, session tracking, background analysis, MIME sniffing, and optional vector-store upload. |
| python/packages/azure-ai-contentunderstanding/tests/cu/conftest.py | Adds fixtures and mock CU client factory. |
| python/packages/azure-ai-contentunderstanding/tests/cu/test_models.py | Unit tests for enums/typed models and FileSearchConfig factories. |
| python/packages/azure-ai-contentunderstanding/tests/cu/test_context_provider.py | Comprehensive unit tests for provider flows (analysis, background, sniffing, file_search). |
| python/packages/azure-ai-contentunderstanding/tests/cu/test_integration.py | Live CU integration tests (skipped unless env var is set). |
| python/packages/azure-ai-contentunderstanding/tests/cu/fixtures/analyze_pdf_result.json | CU PDF fixture for unit tests. |
| python/packages/azure-ai-contentunderstanding/tests/cu/fixtures/analyze_invoice_result.json | CU invoice fixture for unit tests. |
| python/packages/azure-ai-contentunderstanding/tests/cu/fixtures/analyze_image_result.json | CU image fixture for unit tests. |
| python/packages/azure-ai-contentunderstanding/tests/cu/fixtures/analyze_audio_result.json | CU audio fixture for unit tests. |
| python/packages/azure-ai-contentunderstanding/tests/cu/fixtures/analyze_video_result.json | CU video fixture for unit tests. |
| python/packages/azure-ai-contentunderstanding/README.md | Package README with setup guidance and usage examples. |
| python/packages/azure-ai-contentunderstanding/LICENSE | Adds MIT license file for the new package. |
| python/packages/azure-ai-contentunderstanding/AGENTS.md | Package-specific agent/dev notes and architecture description. |
| python/packages/azure-ai-contentunderstanding/.gitignore | Ignores local-only artifacts under the package. |
| python/packages/azure-ai-contentunderstanding/samples/README.md | Top-level samples index for scripts and DevUI examples. |
| python/packages/azure-ai-contentunderstanding/samples/01-get-started/01_document_qa.py | Script sample: single PDF upload + Q&A. |
| python/packages/azure-ai-contentunderstanding/samples/01-get-started/02_multi_turn_session.py | Script sample: session persistence across turns. |
| python/packages/azure-ai-contentunderstanding/samples/01-get-started/03_multimodal_chat.py | Script sample: PDF+audio+video parallel CU analysis. |
| python/packages/azure-ai-contentunderstanding/samples/01-get-started/04_invoice_processing.py | Script sample: per-file analyzer override for invoice extraction. |
| python/packages/azure-ai-contentunderstanding/samples/01-get-started/05_background_analysis.py | Script sample: short max_wait triggers background analysis + status. |
| python/packages/azure-ai-contentunderstanding/samples/01-get-started/06_large_doc_file_search.py | Script sample: CU extraction + vector-store indexing for file_search. |
| python/packages/azure-ai-contentunderstanding/samples/02-devui/01-multimodal_agent/agent.py | DevUI agent: CU-powered upload + chat. |
| python/packages/azure-ai-contentunderstanding/samples/02-devui/01-multimodal_agent/__init__.py | DevUI agent module export. |
| python/packages/azure-ai-contentunderstanding/samples/02-devui/01-multimodal_agent/README.md | DevUI setup/usage doc for multimodal agent. |
| python/packages/azure-ai-contentunderstanding/samples/02-devui/02-file_search_agent/azure_openai_backend/agent.py | DevUI agent: CU + file_search (Azure OpenAI backend). |
| python/packages/azure-ai-contentunderstanding/samples/02-devui/02-file_search_agent/azure_openai_backend/__init__.py | DevUI agent module export. |
| python/packages/azure-ai-contentunderstanding/samples/02-devui/02-file_search_agent/azure_openai_backend/README.md | DevUI setup/usage doc for Azure OpenAI file_search agent. |
| python/packages/azure-ai-contentunderstanding/samples/02-devui/02-file_search_agent/foundry_backend/agent.py | DevUI agent: CU + file_search (Foundry backend). |
| python/packages/azure-ai-contentunderstanding/samples/02-devui/02-file_search_agent/foundry_backend/__init__.py | Foundry backend sample package init. |
| python/packages/azure-ai-contentunderstanding/samples/02-devui/02-file_search_agent/foundry_backend/README.md | DevUI setup/usage doc for Foundry file_search agent. |
| python/AGENTS.md | Adds the new package to the Python “Azure Integrations” index. |

Comment thread python/packages/azure-contentunderstanding/AGENTS.md
Copilot AI left a comment

Pull request overview

Copilot reviewed 36 out of 38 changed files in this pull request and generated 3 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 36 out of 38 changed files in this pull request and generated 3 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 36 out of 38 changed files in this pull request and generated 1 comment.

Comment thread python/packages/azure-contentunderstanding/README.md
yungshinlintw force-pushed the yslin/contentunderstanding-context-provider branch from 5677c0b to 9f31124 March 27, 2026 21:52
@eavanvalkenburg eavanvalkenburg left a comment

Left a couple of comments. The most important one is in 05_background_analysis, because it raises a core question about how this should be used. Let's discuss options for that; we will most likely need some updates afterward.

Add Azure Content Understanding integration as a context provider for the
Agent Framework. The package automatically analyzes file attachments
(documents, images, audio, video) using Azure CU and injects structured
results (markdown, fields) into the LLM context.

Key features:
- Multi-document session state with status tracking (pending/ready/failed)
- Configurable timeout with async background fallback for large files
- Output filtering via AnalysisSection enum
- Auto-registered list_documents() and get_analyzed_document() tools
- Supports all CU modalities: documents, images, audio, video
- Content limits enforcement (pages, file size, duration)
- Binary stripping of supported files from input messages

Public API:
- ContentUnderstandingContextProvider (main class)
- AnalysisSection (output section selector enum)
- ContentLimits (configurable limits dataclass)

Tests: 46 unit tests, 91% coverage, all linting and type checks pass.

- Replace synthetic fixtures with real CU API responses (sanitized)
- Update test assertions to match real data (Contoso vs CONTOSO,
  TotalAmount vs InvoiceTotal, field values from real analysis)
- Add --pre install note in README (preview package)
- Document unenforced ContentLimits fields (max_pages, duration)
yungshinlin and others added 4 commits April 14, 2026 21:10
…remove wrappers

- All __init__ args now keyword-only (matches FoundryChatClient pattern)
- New 'client' param accepts pre-built ContentUnderstandingClient
- core dep bound: >=1.0.0rc5 → >=1.0.0,<2
- Self import moved after local imports
- Removed 9 static method wrappers; callsites use module functions directly
- Tests updated to import derive_doc_key and format_result directly

The client was being created twice: once inside the if/else block and
again unconditionally after it. The second instantiation overwrote the
pre-built client path and failed type checking when credential was None.
yungshinlin and others added 4 commits April 22, 2026 16:16
Package: agent-framework-azure-ai-contentunderstanding → agent-framework-azure-contentunderstanding
Module: agent_framework_azure_ai_contentunderstanding → agent_framework_azure_contentunderstanding
Directory: packages/azure-ai-contentunderstanding → packages/azure-contentunderstanding

Per agreement with PM and MAF team to drop 'AI' from the package name.
…amespace

Enables: from agent_framework.foundry import ContentUnderstandingContextProvider

Exports: ContentUnderstandingContextProvider, FileSearchConfig,
FileSearchBackend, AnalysisSection, DocumentStatus

Updates all samples and README to use the foundry namespace import.
@eavanvalkenburg eavanvalkenburg left a comment

just some cleanup to do

Comment thread .vscode/settings.json Outdated
yungshinlin and others added 3 commits April 28, 2026 12:15
…_search sample

Address review feedback from TaoChenOSU:
- 05_large_doc_file_search.py: use client.client instead of manually
  constructing AsyncAzureOpenAI; remove openai dependency
- azure_openai_backend/agent.py: import reorder only (AIProjectClient
  kept because it is required for sync vector store creation in DevUI)

When a ContentUnderstandingClient is passed via client=, the caller
owns its lifecycle. Added _owns_client flag so close() only closes
the client when we created it internally.
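
The ownership rule from this commit, together with the earlier fix for the client being constructed twice, can be sketched as follows. `FakeClient` and `Provider` are hypothetical stand-ins rather than the package's classes; only the `client=` keyword, the keyword-only constructor, and the `_owns_client` flag are taken from the commit messages above.

```python
import asyncio
from typing import Optional


class FakeClient:
    """Hypothetical stand-in for ContentUnderstandingClient."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class Provider:
    """Hypothetical provider that shows the ownership rule only."""

    def __init__(self, *, endpoint: Optional[str] = None, client: Optional[FakeClient] = None) -> None:
        # Exactly one construction path runs, so a pre-built client is never overwritten.
        if client is not None:
            self._client = client
            self._owns_client = False  # the caller owns the lifecycle
        else:
            if endpoint is None:
                raise ValueError("Pass either client= or endpoint=.")
            self._client = FakeClient()  # a real provider would use endpoint and credential here
            self._owns_client = True

    async def close(self) -> None:
        # Only close what this object created.
        if self._owns_client:
            await self._client.close()


async def demo() -> tuple:
    shared = FakeClient()
    borrowed = Provider(client=shared)
    owned = Provider(endpoint="https://example.invalid/")
    await borrowed.close()
    await owned.close()
    return shared.closed, owned._client.closed


result = asyncio.run(demo())
print(result)  # (False, True)
```

With this rule a caller can share one pre-built client across several providers and close it once, without any provider closing it early.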
TaoChenOSU added this pull request to the merge queue Apr 28, 2026
Merged via the queue into microsoft:main with commit 1e1eda6 Apr 28, 2026
34 checks passed
github-project-automation bot moved this from Community PR to Done in Agent Framework Apr 28, 2026
moonbox3 added a commit that referenced this pull request Apr 29, 2026
* Python: bump package versions for 1.2.2 release

PATCH bump (1.2.1 -> 1.2.2) for the released cohort. Five PRs land in this
window:

- agent-framework-openai: fix file_search citations breaking the assistant-
  message history roundtrip (#5557) — drives the released-tier PATCH
- agent-framework-orchestrations: [BREAKING] standardize orchestration
  terminal outputs as AgentResponse (#5301)
- agent-framework-core, agent-framework-declarative: preserve Workflow.run()
  shared state across calls, accept list[Message] in declarative start
  executor, and coerce Enum values when serializing PowerFx symbols (#5531)
- agent-framework-foundry-hosting: add hosted Durable Workflow support
  (#5531)
- agent-framework-azure-contentunderstanding: new alpha package — Azure AI
  Content Understanding context provider (#4829)
- dependencies: workspace package dependency refresh (#5555)

Per lockstep convention, all 21 beta packages stamp 1.0.0b260429 and all 4
alpha packages (now including the new contentunderstanding) stamp
1.0.0a260429. Date stamp reflects 2026-04-29 Pacific. Every non-core package
floor on agent-framework-core is raised to >=1.2.2; the new
contentunderstanding package's stale >=1.0.0 floor is brought into line.

Two follow-on fixes bundled to keep validate-dependency-bounds-test green
at lowest-direct resolution:
- Bump agent-framework-azure-contentunderstanding's
  azure-ai-contentunderstanding lower bound from >=1.0.0 to >=1.0.1 (1.0.0
  ships without proper typing; pyright reports 65 unknown-type errors)
- Add pyright ignore comments to core/foundry/__init__.pyi for the new
  alpha package's type-stub imports, since alpha packages are not in
  core's [all] extra and therefore aren't installed at lowest-direct

* Python: add #5552 to 1.2.2 CHANGELOG

Add the streaming-span observability fix to the Fixed section. PR is on
upstream/main but not yet pulled into origin/main; the code itself will
land via the PR merge.

* Python: address PR #5561 review feedback on dependency bounds

Two packaging fixes flagged in review:

1. agent-framework-azure-contentunderstanding: add agent-framework-foundry
   as a runtime dependency. The package's README directs users to
   `pip install agent-framework-azure-contentunderstanding --pre` and the
   basic example imports `FoundryChatClient` from `agent_framework.foundry`,
   so the documented install path was failing with ImportError. Pulling
   agent-framework-foundry into deps makes the advertised entry path
   self-contained.

2. agent-framework-foundry: bump agent-framework-openai lower bound from
   >=1.1.0 to >=1.2.2,<2. Foundry imports private modules from
   agent_framework_openai (`_chat_client.py:22`, `_agent.py:34`), so
   resolvers were free to pair foundry==1.2.2 with older OpenAI versions
   that lack this release's coordinated Responses/history fix. Lockstep the
   floor with the released cohort to prevent mismatched installs.

Both changes pass `validate-dependency-bounds-test` lower + upper at
their respective packages.
Labels

documentation, python

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Python: [Feature]: Azure Content Understanding context provider for multimodal document analysis

8 participants