Fix partitioned backfill widening a sub-day window to the whole day #68718
Open
Lee-W wants to merge 1 commit into
Backfilling a partitioned Dag for a window inside a single day (e.g. an hourly timetable for 08:00–09:00) created one run per cron tick of the entire day. iter_partition_dagrun_infos now honours the datetime window directly — both ends inclusive, at the timetable's own partition cadence — instead of rounding to whole calendar days; callers pass tz-aware datetimes.
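As a rough illustration of the difference described above, the sketch below contrasts widening a window to whole days with walking the window directly. It uses only the standard library and does not call Airflow; `hourly_ticks`, `widened_to_whole_days`, and the 2024-01-01 dates are hypothetical stand-ins, not the PR's code.

```python
from datetime import datetime, time, timedelta, timezone

HOUR = timedelta(hours=1)


def hourly_ticks(start: datetime, end: datetime) -> list[datetime]:
    """Hourly ticks in [start, end], both ends inclusive.

    Assumes `start` already sits on an hour boundary.
    """
    ticks = []
    current = start
    while current <= end:
        ticks.append(current)
        current += HOUR
    return ticks


def widened_to_whole_days(earliest: datetime, latest: datetime) -> list[datetime]:
    """Hypothetical stand-in for the old behaviour.

    Keeps only the calendar dates of the bounds, then spans whole days,
    which discards the time of day the caller asked for.
    """
    tz = earliest.tzinfo
    day_start = datetime.combine(earliest.date(), time.min, tzinfo=tz)
    day_end = datetime.combine(latest.date(), time(23, 0), tzinfo=tz)
    return hourly_ticks(day_start, day_end)


# Requested backfill window: 08:00 to 09:00 on a single (made-up) day.
earliest = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
latest = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)

old_runs = widened_to_whole_days(earliest, latest)
new_runs = hourly_ticks(earliest, latest)
print(len(old_runs), len(new_runs))  # 24 2
```

With both ends inclusive, the 08:00 and 09:00 ticks fall inside the window, so the direct walk yields two partitions where the widened version yields one per hour of the day.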
Why
Fix the buggy partition-backfill-by-range path introduced in #67537
A partitioned Dag (e.g. one using CronPartitionTimetable("0 * * * *")) backfilled over a window inside a single day produced one Dag run per cron tick of the entire day. A user backfilling the hourly Dag ingest_team_a_player_stats for 08:00–09:00 got ~24 runs instead of the requested hour. SerializedDAG.iter_dagrun_infos_between truncated its bounds to calendar dates (earliest.date()), then CronPartitionTimetable.iter_partition_dagrun_infos snapped those to whole local days via resolve_day_bound, discarding the time of day.
What
The iter_partition_dagrun_infos contract changed from earliest_date/latest_date: datetime.date to earliest/latest: datetime.datetime.
The iter_dagrun_infos_between dispatch passes the full datetimes (no more .date() truncation).
CronPartitionTimetable.iter_partition_dagrun_infos walks from _align_to_next(earliest) while current <= latest at the cron cadence, instead of expanding to whole days.
iter_dagrun_infos_between path. Callers pass tz-aware datetimes (the backfill API and CLI already store from_date/to_date as datetime). The CLI clear-by-date commands using resolve_day_bound are unchanged (separate concern).
Was generative AI tooling used to co-author this PR?
Generated-by: [Claude] following the guidelines
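For reviewers, the walk described under What (align the lower bound to the next tick, then step while current <= latest) can be sketched for an hourly cadence as follows. This is a simplified illustration and not the PR's implementation: `align_to_next_hour` and `iter_hourly_partitions` are hypothetical names, the hourly step is hard-coded rather than derived from a cron expression, and returning an already-aligned datetime unchanged is an assumption inferred from the "both ends inclusive" wording.

```python
from collections.abc import Iterator
from datetime import datetime, timedelta, timezone

HOUR = timedelta(hours=1)


def align_to_next_hour(moment: datetime) -> datetime:
    """Return `moment` if it is on an hour boundary, else the next boundary.

    Hypothetical stand-in for an align-to-next-tick helper on "0 * * * *".
    Naive datetimes are rejected because the window is expected to be tz-aware.
    """
    if moment.tzinfo is None or moment.utcoffset() is None:
        raise ValueError("expected a tz-aware datetime")
    floored = moment.replace(minute=0, second=0, microsecond=0)
    return floored if floored == moment else floored + HOUR


def iter_hourly_partitions(earliest: datetime, latest: datetime) -> Iterator[datetime]:
    """Yield hourly ticks from the first tick at or after `earliest` up to `latest`."""
    current = align_to_next_hour(earliest)
    while current <= latest:  # upper bound is inclusive
        yield current
        current += HOUR


utc = timezone.utc
ticks = list(
    iter_hourly_partitions(
        datetime(2024, 1, 1, 8, 30, tzinfo=utc),  # not on a tick: first run is 09:00
        datetime(2024, 1, 1, 11, 0, tzinfo=utc),  # on a tick: included
    )
)
print([t.strftime("%H:%M") for t in ticks])  # ['09:00', '10:00', '11:00']
```

Using UTC keeps the fixed one-hour step exact; a timetable in a zone with daylight-saving transitions would need the cron-aware stepping that this sketch leaves out.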