Skip to content

dogwrap: emit start signals for cron job monitoring  #933

@pythonicrahul

Description

@pythonicrahul

Is your feature request related to a problem? Please describe.

dogwrap today only emits signals after a command finishes. This means if a cron job never starts — due to scheduler failure, OOM, bad crontab, or host going down — there is zero signal in Datadog. You can monitor for failures, but you cannot monitor for missing executions.

This is a well-known gap in cron monitoring. The standard solution is a "dead man's snitch" / Cronitor-style pattern: emit a heartbeat when the job starts, then alert if that heartbeat goes missing.

Describe the solution you'd like

Add two start signals to dogwrap, gated on existing flags (no new CLI surface):

  1. dogwrap.started metric (gauge, value=1) — fires before execute() when --send_metric is already set. Uses the same tag construction as the existing dogwrap.duration metric.

  2. Start event (alert_type=info) — fires before execute() when --submit_mode all is already set. Title: "[host] job-name started".

Together with the existing dogwrap.duration metric, this enables monitoring for cron jobs that were never triggered or failed to run:

Monitor: "alert if dogwrap.started{job:nightly-etl} has no data in last 25h"

Behavior matrix:

--submit_mode --send_metric start metric start event
errors False no no
errors True yes no
all False no yes
all True yes yes

Describe alternatives you've considered

  • External wrapper script that calls the Datadog API before invoking dogwrap. Works but adds complexity and defeats the purpose of having dogwrap as a single wrapper.
  • Datadog Agent cron integration — only works if cron runs on a host with the Agent, doesn't cover containerized or serverless cron.
  • Custom monitors on missing dogwrap.duration — fragile because duration only fires on --submit_mode all or on failure, and a missing execution produces no data point at all (indistinguishable from "no job scheduled").

Additional context

  • No new CLI flags needed — reuses --send_metric and --submit_mode all.
  • No behavior change for existing users — purely additive.
  • Requires a minor refactor: moving initialize(), host resolution, and tag construction before execute() (they don't depend on execution results).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions