Add query tags: DatabricksSqlSensor, DatabricksPartitionSensor #68704

Draft
cruseakshay wants to merge 1 commit into apache:main from cruseakshay:feature/68582-databricks-query-tags

@cruseakshay

Extends the session-level query-tags instrumentation added in #66895 to the remaining Databricks operators and sensors that send queries to Databricks:

  • DatabricksSqlSensor
  • DatabricksPartitionSensor
  • DatabricksSQLStatementsOperator
  • DatabricksSQLStatementsSensor

Closes: #68582

What changed

Two mechanisms are used because these components talk to Databricks in two different ways:

  • DatabricksSqlSensor and DatabricksPartitionSensor go through DatabricksSqlHook, so they reuse the existing QUERY_TAGS session-parameter plumbing from #66895 (Add session-level query tags to Databricks SQL operators). The merged tags are set on the hook before the query runs.
  • DatabricksSQLStatementsOperator and DatabricksSQLStatementsSensor use the REST Statement Execution API (/api/2.0/sql/statements/), which does not accept session_configuration. The API exposes a native query_tags field ([{"key": ..., "value": ...}]), so tags are injected directly into the request body. For the sensor, tags are only applied when it submits a new statement; if a statement_id is passed in, nothing is submitted and no tags are attached.

Each component gains two parameters mirroring the existing SQL operators:

  • query_tags: dict[str, str | None] | None — user-supplied tags, templated

  • include_airflow_query_tags: bool = True — merge in Airflow context tags:

    • dag_id
    • task_id
    • run_id
    • try_number
    • map_index

User-supplied tags win on key collision.

The Airflow-context tag logic that previously lived in operators/databricks_sql.py is extracted to a shared utils/query_tags.py module:

  • get_airflow_query_tags
  • build_query_tags
  • dict_to_query_tag_list

This lets the existing SQL operators and the four components listed above share one implementation.

This is a pure relocation: DatabricksSqlOperator and DatabricksCopyIntoOperator behavior is unchanged.
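
As a rough illustration of what collecting the Airflow context tags involves, the sketch below builds a string-valued tag dict from a task-instance-like object. `StubTaskInstance` and `airflow_context_tags` are hypothetical stand-ins, not the `get_airflow_query_tags` implementation, and the tag key names are assumed to match the list given earlier.

```python
from dataclasses import dataclass


@dataclass
class StubTaskInstance:
    """Hypothetical stand-in exposing the task instance attributes used here."""

    dag_id: str
    task_id: str
    run_id: str
    try_number: int
    map_index: int = -1  # Airflow uses -1 for tasks that are not mapped


def airflow_context_tags(ti: StubTaskInstance) -> dict[str, str]:
    """Return the five context tags as strings (key names are an assumption)."""
    return {
        "dag_id": ti.dag_id,
        "task_id": ti.task_id,
        "run_id": ti.run_id,
        "try_number": str(ti.try_number),
        "map_index": str(ti.map_index),
    }
```

Keeping this logic in one module means each operator or sensor only has to pass in its context rather than duplicate the attribute lookups.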

Was generative AI tooling used to co-author this PR?
  • Yes, Claude Code

Generated-by: Claude Code, following the [guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)

