Prototype BigQuery CDC -> Anomaly detection python flex template. #3479

Closed
claudevdm wants to merge 17 commits into GoogleCloudPlatform:main from claudevdm:bqmonitor

@claudevdm
Contributor

No description provided.

@claudevdm force-pushed the bqmonitor branch 2 times, most recently from 7229c10 to 393de61 on March 11, 2026 17:44
@gemini-code-assist

Warning

Gemini is experiencing higher than usual traffic and was unable to create the summary. Please try again in a few hours by commenting /gemini summary.

@claudevdm
Contributor Author

/gemini summary

@codecov

codecov Bot commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 12.90323% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.20%. Comparing base (7264b3a) to head (379947f).
⚠️ Report is 53 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...loud/teleport/plugin/maven/TemplatesStageMojo.java | 5.00% | 19 Missing ⚠️ |
| ...gle/cloud/teleport/plugin/DockerfileGenerator.java | 27.27% | 8 Missing ⚠️ |
Additional details and impacted files
@@              Coverage Diff              @@
##               main    #3479       +/-   ##
=============================================
+ Coverage     33.37%   52.20%   +18.82%     
- Complexity      481     6050     +5569     
=============================================
  Files           215     1040      +825     
  Lines         12816    62977    +50161     
  Branches       1249     6899     +5650     
=============================================
+ Hits           4277    32874    +28597     
- Misses         8203    27876    +19673     
- Partials        336     2227     +1891     
| Components | Coverage Δ |
|---|---|
| spanner-templates | 72.19% <ø> (∅) |
| spanner-import-export | 68.92% <ø> (∅) |
| spanner-live-forward-migration | 80.45% <ø> (∅) |
| spanner-live-reverse-replication | 77.85% <ø> (∅) |
| spanner-bulk-migration | 89.17% <ø> (∅) |
| gcs-spanner-dv | 85.30% <ø> (∅) |

| Files with missing lines | Coverage Δ |
|---|---|
| ...gle/cloud/teleport/plugin/DockerfileGenerator.java | 84.29% <27.27%> (-5.71%) ⬇️ |
| ...loud/teleport/plugin/maven/TemplatesStageMojo.java | 17.31% <5.00%> (-0.59%) ⬇️ |

... and 848 files with indirect coverage changes

@gemini-code-assist

Warning

Gemini is experiencing higher than usual traffic and was unable to create the summary. Please try again in a few hours by commenting /gemini summary.

@claudevdm
Contributor Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new BigQuery Anomaly Detection template, which is a significant and valuable addition. The changes are well-structured, covering both Java and Python components, and include comprehensive unit and integration tests. The safe_eval module is particularly well-designed for secure expression evaluation, and the updates to Dockerfile generation enhance the flexibility of Python template deployment. The new README provides excellent documentation for users. Overall, this is a high-quality contribution.
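The review mentions a safe_eval module for secure expression evaluation, but its code is not shown in this conversation. As a rough illustration of the general technique only (not the PR's implementation), an expression evaluator can parse input with Python's `ast` module and accept just a whitelist of node types. The function name `safe_eval_sketch` and the chosen operator set are assumptions made for this sketch.

```python
import ast
import operator

# Whitelisted operators; anything not listed here is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_CMP_OPS = {
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
}


def safe_eval_sketch(expression, variables):
    """Evaluate a small arithmetic/comparison expression without eval().

    Hypothetical helper: only numeric constants, known variable names,
    whitelisted binary operators, unary minus, and single comparisons
    are accepted. Everything else raises ValueError.
    """

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -visit(node.operand)
        if (isinstance(node, ast.Compare) and len(node.ops) == 1
                and type(node.ops[0]) in _CMP_OPS):
            compare = _CMP_OPS[type(node.ops[0])]
            return compare(visit(node.left), visit(node.comparators[0]))
        # Calls, attribute access, subscripts, unknown names, etc. end up here.
        raise ValueError(f"disallowed expression element: {type(node).__name__}")

    return visit(ast.parse(expression, mode="eval"))
```

With this sketch, `safe_eval_sketch("amount * 2 > 10", {"amount": 6})` evaluates the comparison, while an input such as `__import__('os')` is rejected because function calls are not on the whitelist.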

@claudevdm changed the title from "initial" to "Prototype BigQuery CDC -> Anomaly detection python flex template." on Mar 11, 2026
@claudevdm marked this pull request as ready for review on March 12, 2026 17:27
@claudevdm
Contributor Author

@shunping can you please take a look at the anomaly detection parts?

@tvalentyn
Contributor

@claudevdm would it make sense to break up this contribution into several commits that can be reviewed one-by-one?

File dockerfile = new File(dockerfilePath);
if (!dockerfile.exists()) {
List<String> filesToCopy = List.of(definition.getTemplateAnnotation().filesToCopy());
List<String> allFilesToCopy = List.of(definition.getTemplateAnnotation().filesToCopy());
Contributor

Regarding the idea of splitting the PR: changes to existing files and infrastructure can go in first while review of the new templates is still in progress.

I see some fixes to stageFlexPythonTemplate (e.g. honoring directoriesToCopy). As you may have noticed, we currently do not build or release any Python templates. The existing ones have been commented out; as suggested in a comment, they might never have worked before. I wonder whether we can just get rid of this dead code (or revive it after staging is fixed, though that is not in scope for this PR).

@claudevdm
Contributor Author

I will think about how to split it. The CDC/IO source has already been reviewed in apache/beam#37724, so reviewers can ignore that part.

I forked it here until it rolls out in a Beam release.

…correct pip args

- DockerfileGenerator: add setSetupFile() for FLEX_TEMPLATE_PYTHON_SETUP_FILE
  env and pip install of setup.py packages
- Dockerfile-template-python: use ARG REQUIREMENTS_FILE instead of ENV
  FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE to avoid launcher re-resolution;
  add directoriesToCopy and setupInstall placeholders; fix pip arg order
- TemplatesStageMojo: separate files from directories in filesToCopy,
  auto-detect setup.py, fix empty entryPoint default, use
  outputClassesDirectory for Dockerfile generation
- DockerfileGeneratorTest: update assertions for new pip command format
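The commit notes above describe separating files from directories in filesToCopy and auto-detecting setup.py. The actual change lives in the Java Maven plugin (TemplatesStageMojo), whose code is not shown here; the Python sketch below only illustrates the partitioning idea. The function name `split_copy_entries` and its return shape are assumptions made for illustration.

```python
from pathlib import Path


def split_copy_entries(base_dir, entries):
    """Partition configured copy entries into files and directories.

    Hypothetical helper. Also reports whether a setup.py exists directly
    under base_dir, mirroring the "auto-detect setup.py" note.
    """
    base = Path(base_dir)
    files, directories = [], []
    for entry in entries:
        if (base / entry).is_dir():
            directories.append(entry)
        else:
            # Missing paths are treated as files in this sketch; a real
            # build step would probably fail fast instead.
            files.append(entry)
    has_setup_py = (base / "setup.py").is_file()
    return files, directories, has_setup_py
```

A Dockerfile generator could then emit one COPY instruction per file and a separate one per directory, and add a pip install step for the package only when `has_setup_py` is true.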

STORAGE_WRITE_API requires Java (xlang expansion service) which is not
available in the Python Flex Template container. STREAMING_INSERTS is
pure Python and sufficient for the low-volume aggregated results sink.
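The sink choice described in this commit message can be sketched as follows. This is a minimal illustration, not the template's code: the helper names, the use of a `java` binary on PATH as a proxy for expansion-service availability, and the table and schema arguments are all assumptions. `apache_beam` is imported lazily so the method selection can be exercised without Beam installed.

```python
import shutil


def choose_bigquery_write_method(java_available=None):
    """Return the WriteToBigQuery method name suited to the container."""
    if java_available is None:
        # Assumption: a `java` binary on PATH is a good-enough proxy for
        # being able to start the cross-language expansion service.
        java_available = shutil.which("java") is not None
    return "STORAGE_WRITE_API" if java_available else "STREAMING_INSERTS"


def build_results_sink(table, schema, java_available=False):
    """Build the sink transform for the aggregated results.

    Defaults to the no-Java case, matching the Python Flex Template
    container described above.
    """
    import apache_beam as beam  # lazy import; requires apache-beam[gcp]

    return beam.io.WriteToBigQuery(
        table=table,
        schema=schema,
        method=choose_bigquery_write_method(java_available),
    )
```

In a pipeline this would be applied as `results | build_results_sink("project:dataset.table", schema)`, where the table reference and schema are placeholders.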
@claudevdm closed this on Mar 23, 2026
3 participants