Circuit breaker state resets on server restart #38

@fenilsonani

Description

Circuit breakers are entirely in-memory (`internal/resilience/circuitbreaker.go`). When the server restarts, every breaker resets to closed. If a domain's MX was broken (breaker open), the server immediately starts hammering it again with delivery attempts until it hits the failure threshold (5 failures) and re-opens.

This creates a burst of unnecessary connections to broken servers on every restart, and delays delivery to other domains while workers are tied up on known-bad destinations.

Current behavior

```
restart → all breakers closed → 5 failed attempts per broken domain → breaker opens again
```

With N broken domains queued, that's 5×N wasted delivery attempts before things stabilize.

Suggestion

Persist breaker state (domain, state, failure count, last failure time) to Redis alongside the queue data. On startup, restore state for domains whose breakers were recently open. A simple hash per domain with a TTL matching the breaker timeout would work; there is no need for a full state machine in Redis.

Alternatively, a lighter approach: on startup, check `delivery_log` for domains with recent consecutive failures and pre-open their breakers.

Metadata

    Labels

    bug, priority: medium, smtp
