The suggested approach focuses on the ongoing flow of an app, using git diff to generate tests in real time. For git diff to be meaningful, however, the agent needs initial context or pre-existing test coverage. Without it, the agent can't tell what the changes affect, or what is expected behavior versus a genuine issue.
For example, if we assigned this task to a human QA, they would first need to understand what the app does, establish a baseline of test cases (with an instance of the app before the merge/commit), and then analyze what the diff changes.
For AI to generate relevant tests, it needs the same thing: a baseline, meaning a snapshot of the app's expected behavior. This means:
Pre-existing tests or context
Even without a fixed test suite, the system needs a way to understand:
This could come from:
Understanding changes in context
Generating targeted test cases
Once the AI understands the context of the change, it can dynamically:
To summarize, a first-time setup flow is important for helping the agent succeed in ongoing flows. This step doesn't have to be manual, as it is currently; it can be automated. I outlined this in a scoping doc last week, summarizing:
This way, every subsequent git diff is relative to something, not just an isolated patch of code.
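As a rough illustration of why the baseline matters, the sketch below maps the files touched by a diff onto baseline test cases. The baseline format (a dict from source path to test names), the example paths, and the function names are all hypothetical; the only real command used is `git diff --name-only`. Changed files with no baseline entry are reported separately, since those are exactly the changes the agent has no context for.

```python
import subprocess
from typing import Dict, List, Tuple


def changed_files(base_ref: str, head_ref: str = "HEAD") -> List[str]:
    """Return the paths changed between two refs via `git diff --name-only`."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base_ref, head_ref],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


def split_by_baseline(
    changed: List[str], baseline: Dict[str, List[str]]
) -> Tuple[List[str], List[str]]:
    """Split a diff into (baseline tests to revisit, files with no baseline context)."""
    affected: List[str] = []
    unknown: List[str] = []
    for path in changed:
        if path in baseline:
            for test_name in baseline[path]:
                if test_name not in affected:  # keep order, skip duplicates
                    affected.append(test_name)
        else:
            unknown.append(path)
    return affected, unknown


# Hypothetical baseline captured during the first-time setup flow.
EXAMPLE_BASELINE = {
    "app/cart.tsx": ["adds item to cart", "removes item from cart"],
    "app/login.tsx": ["logs in with valid credentials"],
}
```

In use, `split_by_baseline(changed_files("main"), EXAMPLE_BASELINE)` would give the agent both the existing expectations to re-check and the files it still needs context for before it can write meaningful tests.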
A first MVP towards this is a set of tools that can detect the framework used (e.g. Next.js), analyze the code, write a test plan, write tests, and then execute the tests.
High-level steps (actual commands may differ)
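The framework-detection tool mentioned in this comment could look roughly like the sketch below, which inspects the dependencies listed in a project's package.json. The marker table and function names are assumptions made for illustration, not the actual tool; the package names themselves (`next`, `nuxt`, `react`, and so on) are the real npm packages.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical marker table: npm package name -> framework label.
# Order matters: Next.js apps also depend on "react" and Nuxt apps on "vue",
# so the more specific packages are checked first.
FRAMEWORK_MARKERS = [
    ("next", "Next.js"),
    ("nuxt", "Nuxt"),
    ("@angular/core", "Angular"),
    ("svelte", "Svelte"),
    ("vue", "Vue"),
    ("react", "React"),
]


def detect_framework(package_json: dict) -> Optional[str]:
    """Guess the framework from a parsed package.json, or return None."""
    deps = {
        **package_json.get("dependencies", {}),
        **package_json.get("devDependencies", {}),
    }
    for package, label in FRAMEWORK_MARKERS:
        if package in deps:
            return label
    return None


def detect_framework_in(project_dir: str) -> Optional[str]:
    """Read package.json from a project directory and guess the framework."""
    manifest = Path(project_dir) / "package.json"
    if not manifest.is_file():
        return None
    return detect_framework(json.loads(manifest.read_text(encoding="utf-8")))
```

Calling `detect_framework_in(".")` at the repository root would then feed the later steps (analysis, test plan, test writing) with a framework label, or `None` when nothing recognizable is found.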
Following some conversations last week, I'd like to explore further the idea of not having QA as part of the code base / build step at all. Instead, QA would exist only to maintain a high-quality production experience; business logic would still be tested by unit tests.
This is how a QA agent could work for production deployments.
main
For feature deployments, bugs are left as comments on the PR instead.
Shortest generator prompt:
Shortest test runner prompt:
The resulting issues would be recorded with a single tool, note issue, which takes a text note and screenshots of the UX (ideally, we would also support something like instant replay).
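A minimal sketch of what such a tool could look like, assuming screenshots are passed as file paths and issues are collected in memory. The names `Issue`, `IssueLog`, and `note_issue` are hypothetical, and where the recorded issues end up (a tracker, PR comments) is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Issue:
    """One problem observed by the QA agent."""
    note: str
    screenshots: List[str] = field(default_factory=list)  # assumed: image file paths


class IssueLog:
    """Collects issues reported through the single `note_issue` tool."""

    def __init__(self) -> None:
        self.issues: List[Issue] = []

    def note_issue(self, note: str, screenshots: Optional[List[str]] = None) -> Issue:
        """Record a text note plus any screenshots of the UX."""
        cleaned = note.strip()
        if not cleaned:
            raise ValueError("note must not be empty")
        issue = Issue(note=cleaned, screenshots=list(screenshots or []))
        self.issues.append(issue)
        return issue
```

Keeping the agent's surface to one tool means the test runner prompt only has to explain a single action: for example, `log.note_issue("Checkout button overlaps the footer", ["checkout.png"])`.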