fix: detect service unavailability and fail fast with clear error (#118)
* fix: detect service unavailability and fail fast with clear error
When the Claude API returns persistent 500s, the SDK exhausts its retries
and returns a result message with subtype 'success' but is_error: true.
Our code checked only subtype, so it treated the error as a success and
proceeded with validation retries, burning ~9 minutes on 30 hopeless
API calls before showing a raw JSON error.
Now:
- handleSDKMessage checks is_error on result messages
- 500/server_error/internal_error classified as SERVICE_UNAVAILABLE
- abortRetries flag skips validation retries on fatal SDK errors
- CLI adapter shows "AI service temporarily unavailable" instead of raw JSON
- Headless adapter emits service_unavailable error code
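A minimal sketch of the check described above, not the actual implementation. The `ResultMessage` and `RunState` shapes, the `runWithValidationRetries` helper, and the fallback error text are assumptions made for illustration. Only `handleSDKMessage`, `is_error`, `subtype`, and `abortRetries` are names taken from this commit, and the retry loop is simplified to a synchronous one.

```typescript
// Hypothetical message shape; the real SDK type may carry more fields.
interface ResultMessage {
  type: "result";
  subtype: string;
  is_error?: boolean;
  result?: string;
}

// Hypothetical per-run state shared between the handler and the retry loop.
interface RunState {
  succeeded: boolean;
  abortRetries: boolean;
  error?: string;
}

function handleSDKMessage(msg: ResultMessage, state: RunState): void {
  // Checking subtype alone is not enough: a result produced after the SDK
  // exhausted its retries can still have subtype "success" with is_error set.
  if (msg.subtype === "success" && !msg.is_error) {
    state.succeeded = true;
    return;
  }
  const detail = msg.result ?? "unknown SDK error";
  state.succeeded = false;
  state.error = detail;
  // Server-side failures are treated as fatal: retrying cannot help.
  if (/\b500\b|server_error|internal_error/i.test(detail)) {
    state.abortRetries = true;
  }
}

// Simplified retry loop: stops on success or as soon as abortRetries is set.
function runWithValidationRetries(
  attempt: () => ResultMessage,
  maxRetries: number,
): { state: RunState; calls: number } {
  const state: RunState = { succeeded: false, abortRetries: false };
  let calls = 0;
  for (let i = 0; i <= maxRetries; i++) {
    calls++;
    handleSDKMessage(attempt(), state);
    if (state.succeeded || state.abortRetries) break;
  }
  return { state, calls };
}
```

With an `attempt` that always returns a 500-style result, this loop stops after one call instead of running through every retry.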
* chore: formatting
* fix: handle rate limit, network, and process exit errors with clear messages
Extend error classification to cover additional failure modes:
- 429/rate limit: "AI service is currently rate-limited"
- ECONNREFUSED/ETIMEDOUT/ENOTFOUND: "Could not connect to the AI service"
- Process exit: "AI agent process exited unexpectedly"
Rate limits also abort validation retries (same as 500s).
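The classification above could look roughly like the following sketch. The `classifyError` function, the `ErrorKind` names, and the exact match patterns (including the "process exited" trigger text) are assumptions. The user-facing messages are the ones quoted in this commit. Whether network and process-exit errors abort validation retries is not stated here, so the sketch leaves `abortRetries` false for them.

```typescript
// Hypothetical category names used only in this sketch.
type ErrorKind =
  | "service_unavailable"
  | "rate_limited"
  | "network"
  | "process_exit"
  | "unknown";

interface ClassifiedError {
  kind: ErrorKind;
  message: string;
  abortRetries: boolean;
}

function classifyError(raw: string): ClassifiedError {
  // 429 / rate limit: aborts validation retries, same as 500s.
  if (/\b429\b|rate.?limit/i.test(raw)) {
    return {
      kind: "rate_limited",
      message: "AI service is currently rate-limited",
      abortRetries: true,
    };
  }
  // Persistent server-side failure.
  if (/\b500\b|server_error|internal_error/i.test(raw)) {
    return {
      kind: "service_unavailable",
      message: "AI service temporarily unavailable",
      abortRetries: true,
    };
  }
  // Node.js network error codes named in the commit message.
  if (/ECONNREFUSED|ETIMEDOUT|ENOTFOUND/.test(raw)) {
    return {
      kind: "network",
      message: "Could not connect to the AI service",
      abortRetries: false, // assumption: not specified in the commit
    };
  }
  // Assumed trigger text for an unexpected agent process exit.
  if (/process exited/i.test(raw)) {
    return {
      kind: "process_exit",
      message: "AI agent process exited unexpectedly",
      abortRetries: false, // assumption: not specified in the commit
    };
  }
  // Anything else passes through unchanged.
  return { kind: "unknown", message: raw, abortRetries: false };
}
```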
* fix: correct service-error regex and separate rate-limit handling
P1: The adapter regex /service.unavailable/ only matched a single char
between "service" and "unavailable", so it missed our own friendly
message "The AI service is temporarily unavailable". Fixed to
/service.*unavailable/. Also removed the "Agent SDK error:" prefix
from all framework integrations so user-friendly messages pass through
cleanly.
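The difference between the two patterns can be checked directly. The variable names below are illustrative. The patterns and the friendly message are the ones quoted above.

```typescript
// "." matches exactly one character, so the old pattern needs "service" and
// "unavailable" to be separated by a single character.
const oldPattern = /service.unavailable/;
// ".*" matches any run of characters (except newlines), including none.
const newPattern = /service.*unavailable/;

const friendly = "The AI service is temporarily unavailable";
const code = "service_unavailable";

const oldMatchesFriendly = oldPattern.test(friendly); // false
const newMatchesFriendly = newPattern.test(friendly); // true
const oldMatchesCode = oldPattern.test(code); // true
const newMatchesCode = newPattern.test(code); // true
```

The new pattern still matches the underscore-separated code, so it is a strict widening of the old one.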
P2: 429 rate limits were folded into SERVICE_UNAVAILABLE_PREFIX, which
rewrote them to "temporarily unavailable" before adapters could see the
rate-limit signal. Now 429s get a separate RATE_LIMITED_PREFIX with
distinct messaging ("currently rate-limited"), while still aborting
validation retries.
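One way to picture the separation is the sketch below. The prefix string values, the `tagError`, `shouldAbortRetries`, and `headlessErrorCode` helpers, and the `rate_limited` and `unknown` codes are all assumptions. The commit names only the two prefix constants, the two messages, and the `service_unavailable` code.

```typescript
// Hypothetical prefix values; the commit names the constants, not their contents.
const SERVICE_UNAVAILABLE_PREFIX = "[service_unavailable]";
const RATE_LIMITED_PREFIX = "[rate_limited]";

// Tag the raw error so that adapters can still tell the two cases apart.
function tagError(raw: string): string {
  // Check 429 first so rate limits are never rewritten as "unavailable".
  if (/\b429\b|rate.?limit/i.test(raw)) {
    return `${RATE_LIMITED_PREFIX} The AI service is currently rate-limited`;
  }
  if (/\b500\b|server_error|internal_error/i.test(raw)) {
    return `${SERVICE_UNAVAILABLE_PREFIX} The AI service is temporarily unavailable`;
  }
  return raw;
}

// Both categories are fatal for validation retries.
function shouldAbortRetries(tagged: string): boolean {
  return (
    tagged.startsWith(RATE_LIMITED_PREFIX) ||
    tagged.startsWith(SERVICE_UNAVAILABLE_PREFIX)
  );
}

// Adapter-side mapping from prefix to a machine-readable code.
function headlessErrorCode(tagged: string): string {
  if (tagged.startsWith(RATE_LIMITED_PREFIX)) return "rate_limited"; // assumed code
  if (tagged.startsWith(SERVICE_UNAVAILABLE_PREFIX)) return "service_unavailable";
  return "unknown"; // assumed fallback
}
```

Keeping the prefix on the message until it reaches the adapter is what lets each adapter choose its own wording or code while the retry logic treats both cases the same way.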