[KYUUBI #7379][2a/4] Data Agent Engine: tool system, data source, and prompt templates #7400

Open
wangzhigang1999 wants to merge 2 commits into apache:master from wangzhigang1999:pr2a/data-agent-tools

@wangzhigang1999
Contributor

Why are the changes needed?

Part 2a of 4 for the Data Agent Engine (umbrella, KPIP-7373).

This PR adds the tool system, data source abstraction, and composable prompt builder — the infrastructure that the agent runtime (PR 2b) will use to execute SQL and interact with the LLM.

Changes include:

  • AgentTool interface with ToolRiskLevel, JSON schema generation for LLM function calling
  • ToolRegistry — thread-safe tool registration, dispatch, timeout enforcement, and OpenAI-compatible tool definition export
  • RunSelectQueryTool / RunMutationQueryTool — read-only vs. mutation SQL execution with maxRows enforcement, output truncation, and SqlReadOnlyChecker
  • SqlExecutor — shared JDBC execution logic with statement timeout and result formatting
  • DataSourceFactory — HikariCP connection pool creation with optional user/password
  • JdbcDialect — auto-detection from JDBC URL with dialect-specific identifier quoting (Spark, MySQL, SQLite, Trino, generic fallback)
  • TableRef — catalog/schema/table reference with JSON deserialization support
  • SystemPromptBuilder — composable Markdown prompt assembly with date injection, per-dialect datasource sections, and free-form text sections
  • Prompt templates: base.md, datasource-{mysql,spark,sqlite,trino}.md
  • New kyuubi.engine.data.agent.tool.* and kyuubi.engine.data.agent.datasource.* configuration entries
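
As a rough illustration of the registration, dispatch, and timeout enforcement listed for ToolRegistry above, here is a minimal sketch. `SketchTool` and `SketchToolRegistry` are hypothetical stand-ins, not the PR's `AgentTool` or `ToolRegistry`, and returning error strings rather than throwing is an assumption made for brevity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical, simplified tool contract; the PR's AgentTool may look different.
interface SketchTool {
  String name();

  String execute(String argsJson) throws Exception;
}

// Hypothetical registry: concurrent map for registration, executor for timeouts.
final class SketchToolRegistry implements AutoCloseable {
  private final Map<String, SketchTool> tools = new ConcurrentHashMap<>();
  private final ExecutorService pool = Executors.newCachedThreadPool();
  private final long timeoutMillis;

  SketchToolRegistry(long timeoutMillis) {
    this.timeoutMillis = timeoutMillis;
  }

  // putIfAbsent makes concurrent registration of the same name fail atomically.
  void register(SketchTool tool) {
    if (tools.putIfAbsent(tool.name(), tool) != null) {
      throw new IllegalArgumentException("Duplicate tool: " + tool.name());
    }
  }

  // Runs the tool on a worker thread and gives up after timeoutMillis.
  String dispatch(String name, String argsJson) {
    SketchTool tool = tools.get(name);
    if (tool == null) {
      return "Error: unknown tool '" + name + "'";
    }
    Future<String> future = pool.submit(() -> tool.execute(argsJson));
    try {
      return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true); // interrupt the worker so it can stop early
      return "Error: tool '" + name + "' timed out after " + timeoutMillis + " ms";
    } catch (ExecutionException e) {
      return "Error: " + e.getCause().getMessage();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return "Error: interrupted";
    }
  }

  @Override
  public void close() {
    pool.shutdownNow();
  }
}
```

Cancelling the future only interrupts the worker thread; a tool blocked inside a JDBC call would presumably still rely on the statement timeout mentioned for SqlExecutor.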

How was this patch tested?

  • Unit tests (Java): JdbcDialectTest, TableRefTest, DataSourceFactoryAuthTest, SqlReadOnlyCheckerTest, RunSelectQueryToolTest, RunMutationQueryToolTest, ToolTest, ToolRegistryThreadSafetyTest, ToolSchemaGeneratorTest, SystemPromptBuilderTest
  • MySQL integration tests (Testcontainers): DataSourceFactoryTest, DialectTest, RunSelectQueryTest, RunMutationQueryTest, ToolExecutionTest — all run against a real MySQL container
  • Total: 164 JUnit tests + 21 ScalaTest tests, all passing

Was this patch authored or co-authored using generative AI tooling?

Partially assisted by Claude Code (Claude Opus 4.6) for test generation, code review, and PR formatting. Core design and implementation are human-authored.

wangzhigang1999 and others added 2 commits April 13, 2026 16:52
…urce, and prompt templates

Tool system with risk-based separation (RunSelectQueryTool / RunMutationQueryTool),
ToolRegistry with JSON schema generation, and SqlReadOnlyChecker keyword whitelist.
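
To make the keyword-whitelist idea concrete, a check of this kind might look roughly like the sketch below. `ReadOnlySketch` and its allowed-keyword set are assumptions for illustration; the PR's `SqlReadOnlyChecker` may use a different list and different parsing.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical first-keyword whitelist check, not the PR's SqlReadOnlyChecker.
final class ReadOnlySketch {
  // Assumed whitelist; the real checker's keywords are not shown in this PR text.
  private static final Set<String> ALLOWED =
      new HashSet<>(Arrays.asList("SELECT", "WITH", "SHOW", "DESCRIBE", "DESC", "EXPLAIN"));

  // Returns true only when the first keyword after leading comments is whitelisted.
  // Limitation: some dialects allow WITH ... INSERT, which this sketch would accept.
  static boolean isReadOnly(String sql) {
    if (sql == null) {
      return false;
    }
    String s = stripLeadingComments(sql);
    int end = 0;
    while (end < s.length() && Character.isLetter(s.charAt(end))) {
      end++;
    }
    return ALLOWED.contains(s.substring(0, end).toUpperCase(Locale.ROOT));
  }

  // Drops leading "-- line" and "/* block */" comments plus surrounding whitespace.
  static String stripLeadingComments(String sql) {
    String s = sql.trim();
    while (true) {
      if (s.startsWith("--")) {
        int newline = s.indexOf('\n');
        s = newline < 0 ? "" : s.substring(newline + 1).trim();
      } else if (s.startsWith("/*")) {
        int close = s.indexOf("*/", 2);
        s = close < 0 ? "" : s.substring(close + 2).trim();
      } else {
        return s;
      }
    }
  }
}
```

Anything that does not start with a whitelisted keyword, including an empty or unparsable statement, is treated as not read-only, so the sketch fails closed.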

Data source abstraction with JdbcDialect auto-detection (Spark/Trino/MySQL/SQLite),
GenericDialect fallback for unknown JDBC subprotocols, TableRef value object for
structured table references with Jackson deserialization support, and HikariCP-backed
DataSourceFactory with credential isolation.
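
The dialect auto-detection and quoting described above could be sketched as follows. `DialectSketch` is a hypothetical enum; the subprotocol-to-dialect mapping (in particular treating `hive2` and `kyuubi` URLs as Spark) and the double-quote fallback for the generic dialect are assumptions, not taken from the PR's `JdbcDialect`.

```java
import java.util.Locale;

// Hypothetical dialect enum; the quote character is the identifier delimiter.
enum DialectSketch {
  SPARK('`'),
  MYSQL('`'),
  SQLITE('"'),
  TRINO('"'),
  GENERIC('"'); // assumed ANSI-style fallback for unknown subprotocols

  private final char quote;

  DialectSketch(char quote) {
    this.quote = quote;
  }

  // Picks a dialect from the subprotocol in "jdbc:<subprotocol>:...".
  static DialectSketch fromUrl(String jdbcUrl) {
    String url = jdbcUrl.toLowerCase(Locale.ROOT);
    if (!url.startsWith("jdbc:")) {
      throw new IllegalArgumentException("Not a JDBC URL: " + jdbcUrl);
    }
    int end = url.indexOf(':', 5);
    String subprotocol = end < 0 ? url.substring(5) : url.substring(5, end);
    switch (subprotocol) {
      case "hive2":
      case "kyuubi":
      case "spark":
        return SPARK; // assumed mapping for Kyuubi/Hive-style URLs
      case "mysql":
        return MYSQL;
      case "sqlite":
        return SQLITE;
      case "trino":
        return TRINO;
      default:
        return GENERIC;
    }
  }

  // Wraps the identifier in the dialect's quote, doubling any embedded quote.
  String quoteIdentifier(String identifier) {
    String q = String.valueOf(quote);
    return q + identifier.replace(q, q + q) + q;
  }
}
```

For example, `DialectSketch.fromUrl("jdbc:mysql://localhost:3306/db").quoteIdentifier("order")` wraps the name in backticks, while a Trino URL yields a double-quoted identifier.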

Composable SystemPromptBuilder with per-dialect prompt templates (base.md +
datasource-{name}.md), SQL workflow guidance, and query risk classification.
… remove jdbcUrl shortcut

- SystemPromptBuilder.datasource() now replaces the previous datasource section
  instead of appending, matching the single-datasource-per-session model
- Remove jdbcUrl() convenience method; callers use JdbcDialect.fromUrl() directly
- Remove redundant tests (pool config defaults, toString, tool metadata, fast timeout)
- Clean up todo comment in JdbcDialect

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
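
The replace-rather-than-append behaviour of `SystemPromptBuilder.datasource()` described in the second commit can be illustrated with a small sketch. `PromptBuilderSketch`, its method names other than `datasource()`, and the section wording are hypothetical; the real builder loads `base.md` and `datasource-{name}.md` templates, which this sketch replaces with inline strings.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical composable prompt builder, not the PR's SystemPromptBuilder.
final class PromptBuilderSketch {
  private final String base;
  private String datasourceSection; // at most one; each datasource() call overwrites it
  private final List<String> extraSections = new ArrayList<>();

  PromptBuilderSketch(String base) {
    this.base = base;
  }

  // Replaces any previously set datasource section (single datasource per session).
  PromptBuilderSketch datasource(String dialectName) {
    this.datasourceSection = "## Data source\n\nSQL dialect: " + dialectName;
    return this;
  }

  // Appends a free-form Markdown section; these accumulate in call order.
  PromptBuilderSketch section(String markdown) {
    extraSections.add(markdown);
    return this;
  }

  // Assembles base text, the injected date, the datasource section, then extras.
  String build(LocalDate today) {
    StringBuilder sb = new StringBuilder(base.trim());
    sb.append("\n\nToday's date: ").append(today);
    if (datasourceSection != null) {
      sb.append("\n\n").append(datasourceSection);
    }
    for (String section : extraSections) {
      sb.append("\n\n").append(section.trim());
    }
    return sb.toString();
  }
}
```

Passing the date into `build` instead of calling `LocalDate.now()` inside it is a design choice of this sketch that keeps the output deterministic in tests.
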
wangzhigang1999 marked this pull request as ready for review April 13, 2026 11:01
@wangzhigang1999
Contributor Author

Hi @pan3793, could you help review this PR when you have time? This is Part 2a of the Data Agent Engine series — it adds the tool system, data source abstraction, and prompt builder. Thanks! 🙏
