Fix partitioned backfill widening a sub-day window to the whole day #68718

Open
Lee-W wants to merge 1 commit into apache:main from astronomer:partition-backfill-window

Lee-W commented Jun 18, 2026

Member

Why

Fix the buggy partition-backfill-by-range path introduced in #67537

  • A partitioned timetable (e.g. CronPartitionTimetable("0 * * * *")) backfilled over a window inside a single day produced one Dag run per cron tick of the entire day. A user backfilling the hourly Dag ingest_team_a_player_stats for 08:00–09:00 got ~24 runs instead of only the runs inside the requested hour.
  • Root cause: SerializedDAG.iter_dagrun_infos_between truncated its bounds to calendar dates (earliest.date()), then CronPartitionTimetable.iter_partition_dagrun_infos snapped those to whole local days via resolve_day_bound — discarding the time-of-day.
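The widening described above can be reproduced with the standard library alone. This is a minimal sketch, not the Airflow code: `hourly_ticks_whole_days` is a hypothetical stand-in for the old date-based path, and it assumes an hourly cadence and UTC bounds for simplicity.

```python
from datetime import date, datetime, time, timedelta, timezone


def hourly_ticks_whole_days(earliest_date: date, latest_date: date) -> list[datetime]:
    """Hypothetical stand-in for the old behaviour: date bounds expand to whole UTC days."""
    # Start of the first day and start of the day after the last day.
    start = datetime.combine(earliest_date, time.min, tzinfo=timezone.utc)
    end = datetime.combine(latest_date + timedelta(days=1), time.min, tzinfo=timezone.utc)
    ticks = []
    current = start
    while current < end:
        ticks.append(current)
        current += timedelta(hours=1)
    return ticks


# The user asked for 08:00 to 09:00 only.
earliest = datetime(2026, 6, 18, 8, 0, tzinfo=timezone.utc)
latest = datetime(2026, 6, 18, 9, 0, tzinfo=timezone.utc)

# Calling .date() drops the time-of-day, so both bounds collapse to the same calendar day.
widened = hourly_ticks_whole_days(earliest.date(), latest.date())
print(len(widened))  # 24
```

Once the bounds are reduced to dates, nothing downstream can recover the requested hour, which is why the fix has to change the argument types rather than only the iteration.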

What

  • Partition iteration now honours the actual datetime window instead of rounding to whole calendar days; the backfill range follows the timetable's own partition cadence.
  • iter_partition_dagrun_infos contract changed from earliest_date/latest_date: datetime.date to earliest/latest: datetime.datetime.
  • iter_dagrun_infos_between dispatch passes the full datetimes (no more .date() truncation).
  • CronPartitionTimetable.iter_partition_dagrun_infos starts at _align_to_next(earliest) and steps forward at the cron cadence while current <= latest, instead of expanding to whole days.
  • Bounds are honoured as instants, both ends inclusive — matching the non-partitioned iter_dagrun_infos_between path. Callers pass tz-aware datetimes (the backfill API and CLI already store from_date/to_date as datetime). The CLI clear-by-date commands using resolve_day_bound are unchanged (separate concern).
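The new iteration contract can be sketched as follows. This is an illustration under stated assumptions rather than the PR's implementation: `align_to_next_hour` and `iter_hourly_partitions` are hypothetical names, the cadence is fixed to hourly instead of a real cron expression, and the alignment helper is assumed to return an already-aligned instant unchanged (which the inclusive lower bound implies).

```python
from collections.abc import Iterator
from datetime import datetime, timedelta, timezone


def align_to_next_hour(moment: datetime) -> datetime:
    """Hypothetical stand-in for _align_to_next: keep an on-the-hour instant, else round up."""
    floored = moment.replace(minute=0, second=0, microsecond=0)
    return floored if floored == moment else floored + timedelta(hours=1)


def iter_hourly_partitions(earliest: datetime, latest: datetime) -> Iterator[datetime]:
    """Yield hourly ticks inside [earliest, latest], both ends inclusive."""
    if earliest.tzinfo is None or latest.tzinfo is None:
        raise ValueError("bounds must be tz-aware datetimes")
    current = align_to_next_hour(earliest)
    while current <= latest:
        yield current
        current += timedelta(hours=1)


utc = timezone.utc
on_the_hour = list(
    iter_hourly_partitions(
        datetime(2026, 6, 18, 8, 0, tzinfo=utc),
        datetime(2026, 6, 18, 9, 0, tzinfo=utc),
    )
)
off_the_hour = list(
    iter_hourly_partitions(
        datetime(2026, 6, 18, 8, 30, tzinfo=utc),
        datetime(2026, 6, 18, 10, 15, tzinfo=utc),
    )
)
print([t.strftime("%H:%M") for t in on_the_hour])   # ['08:00', '09:00']
print([t.strftime("%H:%M") for t in off_the_hour])  # ['09:00', '10:00']
```

Because both ends are inclusive, an 08:00–09:00 window on an hourly cadence yields two ticks in this sketch, not one; a window whose bounds fall between ticks yields only the ticks strictly inside it.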

Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

Generated-by: [Claude] following the guidelines



Lee-W added the backport-to-v3-3-test label Jun 18, 2026
Backfilling a partitioned Dag for a window inside a single day (e.g. an hourly
timetable for 08:00–09:00) created one run per cron tick of the entire day.
iter_partition_dagrun_infos now honours the datetime window directly — both ends
inclusive, at the timetable's own partition cadence — instead of rounding to whole
calendar days; callers pass tz-aware datetimes.
Lee-W force-pushed the partition-backfill-window branch from a2bef10 to 0d94a54 on June 18, 2026 15:35
Lee-W added this to the Airflow 3.3.0 milestone Jun 19, 2026