Improve fault tolerance: handle parse errors, fix logging, and minor cleanup #1931

Merged
jnioche merged 1 commit into main from fault-tolerance-and-cleanup on Jun 5, 2026

dpol1 (Member) commented Jun 4, 2026

A collection of small but concrete correctness and robustness improvements.

  • FetcherBolt: catch NumberFormatException when parsing crawl delay and
    max thread values from metadata; log a warning and fall back to defaults
    instead of crashing the bolt
  • ConfigurableTopology, URLFilters: replace e.printStackTrace() with
    proper SLF4J LOG.error() calls
  • CloudSearchUtils: fix misleading error message ("must be score" →
    "must NOT be score"); replace manual MessageDigest boilerplate with
    DigestUtils.sha512Hex()
  • S3CacheChecker, S3Cacher: use the StandardCharsets.UTF_8 overload of
    URLEncoder.encode() to remove an unnecessary checked exception
  • JsRenderingDetector: use CharsetIdentification.getCharsetFast() to
    detect the actual document charset instead of hardcoding UTF-8
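
The FetcherBolt change boils down to a parse-with-fallback pattern. The following is a minimal sketch of that pattern, not the code from this PR: the class `MetadataNumbers`, the method `parseLongOrDefault`, and the key name `crawl.delay` are hypothetical, and `java.util.logging` stands in for SLF4J so the example compiles without extra dependencies.

```java
import java.util.logging.Logger;

// Hypothetical helper illustrating the fallback pattern; not part of the PR.
final class MetadataNumbers {
    // java.util.logging is used here only to keep the sketch dependency-free.
    private static final Logger LOG = Logger.getLogger(MetadataNumbers.class.getName());

    private MetadataNumbers() {}

    /**
     * Parses a metadata value as a long. Returns defaultValue when the value
     * is missing, blank, or not a valid number, logging a warning in the
     * malformed case instead of letting the exception propagate.
     */
    static long parseLongOrDefault(String key, String value, long defaultValue) {
        if (value == null || value.isBlank()) {
            return defaultValue;
        }
        try {
            return Long.parseLong(value.trim());
        } catch (NumberFormatException e) {
            LOG.warning("Invalid value '" + value + "' for " + key
                    + "; falling back to default " + defaultValue);
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        // Well-formed value is used as-is.
        System.out.println(parseLongOrDefault("crawl.delay", "1500", 1000L)); // prints 1500
        // Malformed value triggers a warning and the default is returned.
        System.out.println(parseLongOrDefault("crawl.delay", "soon", 1000L)); // prints 1000
    }
}
```

Catching only `NumberFormatException` keeps the fallback narrow: a bad metadata value is tolerated, while unrelated failures still surface.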

- FetcherBolt: catch NumberFormatException for crawl delay and max thread
  metadata values; log warning and fall back to defaults
- ConfigurableTopology, URLFilters: replace e.printStackTrace() with
  SLF4J LOG.error()
- CloudSearchUtils: fix error message ("must be" -> "must NOT be");
  replace manual MessageDigest with DigestUtils.sha512Hex()
- S3CacheChecker, S3Cacher: use StandardCharsets.UTF_8 overload of
  URLEncoder.encode() to remove unnecessary checked exception
- JsRenderingDetector: use CharsetIdentification.getCharsetFast() instead
  of hardcoded UTF-8
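
For context on the two encoding-related items, here is a rough sketch using only the JDK. The class `EncodingSketch` and its method names are hypothetical. `cacheKey` shows the `Charset` overload of `URLEncoder.encode()` (available since Java 10), which does not declare `UnsupportedEncodingException`. `sha512Hex` shows the kind of manual `MessageDigest` boilerplate that a single call to commons-codec's `DigestUtils.sha512Hex()` replaces; it is not the code removed by this PR.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical class for illustration; not part of the PR.
final class EncodingSketch {
    private EncodingSketch() {}

    /** URL-encodes a string as UTF-8 without a checked exception. */
    static String cacheKey(String url) {
        // The String-charset-name overload forces callers to handle
        // UnsupportedEncodingException; the Charset overload does not.
        return URLEncoder.encode(url, StandardCharsets.UTF_8);
    }

    /** Hand-rolled SHA-512 hex digest, i.e. what DigestUtils.sha512Hex() wraps up. */
    static String sha512Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // %02x renders each byte as two lowercase hex digits.
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Assumption: the running JDK ships a SHA-512 provider.
            throw new IllegalStateException("SHA-512 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(cacheKey("https://example.com/a b?x=1"));
        System.out.println(sha512Hex("hello").length()); // prints 128
    }
}
```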
dpol1 added this to the 3.6.1 milestone Jun 4, 2026
dpol1 requested a review from jnioche June 4, 2026 14:12
jnioche merged commit 6a3785f into main Jun 5, 2026
2 checks passed
jnioche deleted the fault-tolerance-and-cleanup branch June 5, 2026 06:59
jnioche (Contributor) commented Jun 5, 2026

thanks @dpol1
