What:
Add integration with DataChef (arXiv:2602.11089, Feb 2026) — a 32B LLM trained via RL to generate complete end-to-end NeMo Curator pipeline specifications (synthesis strategy, filter chain, mixing ratios) given a target benchmark and base model. Exposes a DataChefRecipeGenerator that outputs a valid NeMo Curator config YAML.
Why:
DataChef achieves 66.7 on AIME'25 for a Qwen3-1.7B math-adapted model, surpassing the official Qwen3 post-training checkpoint for the same base model. Its generated recipes match human expert curation across 6 held-out tasks. The RL-trained recipe generator eliminates the manual trial-and-error of pipeline design, which is the primary bottleneck in practice.
Definition of Done:
- DataChefRecipeGenerator under nemo_curator/recipe/
- Interface: accepts target_benchmark: str, base_model_id: str, available_data_sources: List[str], compute_budget_tokens: int
- Calls DataChef API (hosted or local) with structured prompt encoding the above
- Parses DataChef output into a valid NeMo Curator pipeline config YAML
- Config validation: performs a dry run of the generated pipeline on a 1M-token sample before full execution
- Proxy reward integration: evaluates generated recipe quality on a fast proxy before committing to full run
- Fallback: if DataChef is unavailable, outputs a best-practice template config for the domain
- Tutorial: generate and execute a math-specialization recipe using DataChef → NeMo Curator pipeline
- Integration test: generated YAML is parseable and passes NeMo Curator config validation
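
A minimal sketch of how the interface, fallback, and validation items above could fit together. Everything beyond the class name and the four input fields is an assumption for illustration: the config keys (`sources`, `mixing_ratio`, `filters`, `generated_by`), the `client` callable standing in for the DataChef API, and the uniform-mixing fallback are hypothetical and do not reflect the actual NeMo Curator config schema or DataChef's output format. The dry run and proxy reward steps are left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

Config = Dict[str, Any]


@dataclass(frozen=True)
class RecipeRequest:
    """The four inputs named in the Interface item."""
    target_benchmark: str
    base_model_id: str
    available_data_sources: List[str]
    compute_budget_tokens: int


def fallback_config(req: RecipeRequest) -> Config:
    """Template config with uniform mixing (hypothetical schema, not NeMo Curator's)."""
    if not req.available_data_sources:
        raise ValueError("at least one data source is required")
    ratio = 1.0 / len(req.available_data_sources)
    return {
        "generated_by": "fallback",
        "target_benchmark": req.target_benchmark,
        "base_model_id": req.base_model_id,
        "compute_budget_tokens": req.compute_budget_tokens,
        "sources": [
            {"name": name, "mixing_ratio": ratio}
            for name in req.available_data_sources
        ],
        "filters": [],  # placeholder: a real template would list domain filters
    }


def validate_config(config: Config, req: RecipeRequest) -> None:
    """Cheap structural checks; raises ValueError on the first problem found."""
    sources = config.get("sources")
    if not sources:
        raise ValueError("config must list at least one source")
    unknown = [s["name"] for s in sources
               if s["name"] not in req.available_data_sources]
    if unknown:
        raise ValueError(f"unknown data sources: {unknown}")
    total = sum(s["mixing_ratio"] for s in sources)
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"mixing ratios sum to {total}, expected 1.0")


class DataChefRecipeGenerator:
    """Wraps a DataChef client and falls back to a template if it is unreachable."""

    def __init__(self, client: Optional[Callable[[RecipeRequest], Config]] = None):
        # `client` is a hypothetical stand-in for a hosted or local DataChef call.
        self._client = client

    def generate(self, req: RecipeRequest) -> Config:
        config: Optional[Config] = None
        if self._client is not None:
            try:
                config = self._client(req)
            except (ConnectionError, TimeoutError):
                config = None  # DataChef unavailable: use the template instead
        if config is None:
            config = fallback_config(req)
        validate_config(config, req)
        return config
```

The generator returns a plain dict so that validation stays independent of serialization; writing it out as YAML could be done with a YAML library such as PyYAML's `yaml.safe_dump`. In this sketch a client response that fails validation raises rather than silently falling back, so a malformed recipe is surfaced to the caller.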