Code Review
This pull request reduces memory usage by clearing dataset and evaluator references after use, updates NLTK tokenizer loading to support punkt_tab, and introduces a ReopenFileHandler to improve log visibility on FUSE-mounted filesystems. The review feedback flags a hardcoded local path in the test suite that should be removed, points out redundant code in the evaluation loop, suggests refactoring duplicated logic, and cautions that the new file handler may degrade performance on standard filesystems.
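The reference-clearing idea summarized above can be illustrated with a small, self-contained sketch. This is not the PR's code: `Evaluator`, `evaluate_all`, and the `on_created` hook are hypothetical stand-ins, and the only assumption about the real change is that each evaluator is dropped as soon as its results have been collected.

```python
import gc
from typing import Callable, Dict, Iterable, Optional


class Evaluator:
    """Hypothetical stand-in for a per-benchmark evaluator that holds data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data = [0] * 1000  # pretend this is a large dataset

    def run(self) -> dict:
        # Return plain values only, so the result does not keep the evaluator alive.
        return {"benchmark": self.name, "n": len(self.data)}


def evaluate_all(
    names: Iterable[str],
    on_created: Optional[Callable[[Evaluator], None]] = None,
) -> Dict[str, dict]:
    results: Dict[str, dict] = {}
    for name in names:
        evaluator = Evaluator(name)
        if on_created is not None:
            on_created(evaluator)  # hook used only to observe object lifetime
        results[name] = evaluator.run()
        # Drop the last strong reference before the next iteration starts,
        # so at most one evaluator's data is resident at a time.
        del evaluator
        gc.collect()  # also reclaims reference cycles, if any exist
    return results
```

The design point is that the results dictionary stores only small summary values, so deleting the evaluator actually frees its dataset instead of leaving it reachable through the results.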
Pull request overview
This PR primarily aims to improve log visibility on OSS/FUSE-mounted filesystems by changing how file logging is handled, and it also includes a few runtime memory-reduction tweaks plus an NLTK tokenizer fallback update.
Changes:
- Introduce `ReopenFileHandler` and use it for non-DEBUG file logging to force close/reopen per log record (better OSS/FUSE “near real-time” log visibility).
- Reduce peak memory by releasing per-benchmark evaluator objects during `evaluate_model()` and clearing adapter dataset references in `finalize()`.
- Update iFEval sentence tokenizer loading to prefer `punkt_tab` (with fallback to classic `punkt`).
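As a rough illustration of the close/reopen-per-record behavior described in the first item, here is a minimal sketch built on the standard library's `logging.FileHandler`. The class name `ReopenFileHandlerSketch` and its constructor are assumptions made for this example; the PR's actual `ReopenFileHandler` may be implemented differently.

```python
import logging


class ReopenFileHandlerSketch(logging.FileHandler):
    """Hypothetical sketch: close the file after every record.

    Closing flushes buffered data and releases the file, which is what lets
    some FUSE-backed mounts expose new log lines promptly.
    """

    def __init__(self, filename: str, encoding: str = "utf-8") -> None:
        # Append mode is forced so that reopening never truncates the log;
        # delay=True postpones opening the file until the first record.
        super().__init__(filename, mode="a", encoding=encoding, delay=True)

    def emit(self, record: logging.LogRecord) -> None:
        try:
            # FileHandler.emit reopens the file when self.stream is None.
            super().emit(record)
        finally:
            if self.stream is not None:
                self.stream.close()
                self.stream = None
```

Every record costs an open and a close, which is the source of the performance concern raised in the review for ordinary local filesystems.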
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/benchmark/test_eval.py | Updates GSM8K test invocation (now sets debug=False and adds a fixed use_cache path). |
| evalscope/utils/logger.py | Adds ReopenFileHandler and switches file handler selection logic based on log_level. |
| evalscope/run.py | Frees evaluator objects during evaluation loop to reduce memory accumulation. |
| evalscope/benchmarks/ifeval/instructions_util.py | Adds check_nltk_data('punkt_tab') + tokenizer loading fallback logic. |
| evalscope/api/benchmark/adapters/default_data_adapter.py | Clears dataset references in finalize() to release memory post-eval. |
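The "prefer `punkt_tab`, fall back to `punkt`" behavior listed for `instructions_util.py` follows a general try-in-order pattern, sketched below. The helper `first_available` and its `find` parameter are hypothetical names for this example, and the code does not reproduce the PR's `check_nltk_data` function.

```python
from typing import Callable, Iterable, List


def first_available(candidates: Iterable[str], find: Callable[[str], object]) -> str:
    """Return the first resource name for which `find` does not raise LookupError."""
    tried: List[str] = []
    for name in candidates:
        try:
            find(name)
            return name
        except LookupError:
            tried.append(name)  # remember what failed for the error message
    raise LookupError(f"none of these resources were found: {tried}")


# With NLTK installed, one plausible wiring (an assumption, not the PR's code):
#   import nltk
#   chosen = first_available(
#       ["tokenizers/punkt_tab", "tokenizers/punkt"], nltk.data.find
#   )
```

Passing the lookup function in as a parameter keeps the fallback order testable without NLTK data being present on the machine.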
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
No description provided.