Using: https://github.com/NVIDIA/Model-Optimizer/blob/main/examples/puzzletron/README.md
Following the initial puzzletron setup up to this sanity check fails with a number of issues:
python -m pytest tests/gpu/torch/puzzletron/test_puzzletron.py -k "Qwen3-8B"
- Why does the NeMo 26.02 container ship with nvidia-modelopt 0.43.0rc1 and not 0.44?
To reproduce:
enroot import --output ./docker/nemo_26_02.sqsh docker://nvcr.io/nvidia/nemo:26.02
export EXPERIMENT_DIR=.../dkorzekwa/experiments/6_5_qwen_35_moments_lab
submit_job (srun wrapper) --partition interactive --time 4 --image $EXPERIMENT_DIR/docker/nemo_26_02.sqsh --mounts $EXPERIMENT_DIR:/workspace --interactive --gpu 8
python -m pip list | grep modelopt
After calling python -m pip install -e ".[hf,puzzletron,dev-test]", the same command reports:
nvidia-modelopt 0.45.0.dev164+g115cae258
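The version check above can also be done from Python, which is handy inside a test session. This is only a minimal sketch using the standard library; the helper name installed_version and the list of packages printed are my own choices for illustration, not something taken from the README.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names chosen for illustration; they appear in the pip output of this report.
    for name in ("nvidia-modelopt", "peft", "transformers"):
        print(f"{name}: {installed_version(name)}")
```

Running it before and after the editable install should make the version switch visible without grepping pip output.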
- “...Once inside the container with the repo available, install dependencies from the repo root: …”: it is unclear what “repo root” refers to. I assume it is the ModelOpt source repo; can we clarify this?
- Why is it required to install modelopt from source, given that it is already installed in the NeMo container? Is a similar approach needed for other compression algorithms in modelopt?
python -m pip install -e ".[hf,puzzletron,dev-test]"
- python -m pytest tests/gpu/torch/puzzletron/test_puzzletron.py -k "Qwen3-8B" fails; adding -o addopts="" makes it work.
- Why are both needed?
python -m pip install -e ".[hf,puzzletron,dev-test]"
python -m pip install -r examples/puzzletron/requirements.txt
Can we simplify it?
- python -m pip install -e ".[hf,puzzletron,dev-test]" shows an error:
Uninstalling nvidia-modelopt-0.43.0rc1:
Successfully uninstalled nvidia-modelopt-0.43.0rc1
Attempting uninstall: peft
Found existing installation: peft 0.13.2
Uninstalling peft-0.13.2:
Successfully uninstalled peft-0.13.2
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
nemo-export-deploy 0.4.0rc0 requires peft<0.14.0, but you have peft 0.19.1 which is incompatible.
tensorrt-llm 1.1.0 requires fastapi<=0.121.3,>=0.120.1, but you have fastapi 0.135.1 which is incompatible.
tensorrt-llm 1.1.0 requires nvidia-cutlass-dsl==4.2.1; python_version >= "3.10", but you have nvidia-cutlass-dsl 4.4.2 which is incompatible.
tensorrt-llm 1.1.0 requires setuptools<80, but you have setuptools 81.0.0 which is incompatible.
tensorrt-llm 1.1.0 requires transformers==4.56.0, but you have transformers 4.57.6 which is incompatible.
tensorrt-llm 1.1.0 requires wheel<=0.45.1, but you have wheel 0.46.3 which is incompatible.
Successfully installed deepspeed-0.19.1 dependency-groups-1.3.1 fire-0.7.1 hjson-3.1.0 humanize-4.15.0 lru-dict-1.4.1 nox-2026.4.10 nvidia-modelopt-0.45.0.dev164+g115cae258 peft-0.19.1 pytest-cov-7.1.0 pytest-instafail-0.5.0 termcolor-3.3.0 torch-geometric-2.7.0 wonderwords-3.0.1
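To see at a glance which preinstalled packages are affected, the conflict lines can be grouped by the package that declares the requirement. The following is a rough sketch that parses the log text pasted above; the regular expression is my own guess at the line format based on these six lines only, not a documented pip output format.

```python
import re
from collections import Counter

# Pattern inferred from the log lines in this report (an assumption, not a pip API).
CONFLICT_RE = re.compile(
    r"^(?P<owner>\S+) (?P<owner_version>\S+) requires (?P<requirement>.+?), "
    r"but you have (?P<package>\S+) (?P<installed>\S+) which is incompatible\.$"
)

LOG = """\
nemo-export-deploy 0.4.0rc0 requires peft<0.14.0, but you have peft 0.19.1 which is incompatible.
tensorrt-llm 1.1.0 requires fastapi<=0.121.3,>=0.120.1, but you have fastapi 0.135.1 which is incompatible.
tensorrt-llm 1.1.0 requires nvidia-cutlass-dsl==4.2.1; python_version >= "3.10", but you have nvidia-cutlass-dsl 4.4.2 which is incompatible.
tensorrt-llm 1.1.0 requires setuptools<80, but you have setuptools 81.0.0 which is incompatible.
tensorrt-llm 1.1.0 requires transformers==4.56.0, but you have transformers 4.57.6 which is incompatible.
tensorrt-llm 1.1.0 requires wheel<=0.45.1, but you have wheel 0.46.3 which is incompatible.
"""


def parse_conflicts(log_text):
    """Return one dict per conflict line; lines that do not match are skipped."""
    conflicts = []
    for line in log_text.splitlines():
        match = CONFLICT_RE.match(line.strip())
        if match:
            conflicts.append(match.groupdict())
    return conflicts


conflicts = parse_conflicts(LOG)
by_owner = Counter(c["owner"] for c in conflicts)
for owner, count in by_owner.items():
    print(f"{owner}: {count} conflicting requirement(s)")
```

In this log, most of the conflicts come from tensorrt-llm pins and one from nemo-export-deploy, which may help decide whether they matter for the puzzletron workflow.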
- python -m pytest tests/gpu/torch/puzzletron/test_puzzletron.py -o addopts="" -k "Qwen3-8B" fails with:
File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/workspace/Model-Optimizer/tests/_test_utils/torch/distributed/utils.py", line 53, in init_process
job(rank, size)
File "/workspace/Model-Optimizer/tests/gpu/torch/puzzletron/test_puzzletron.py", line 202, in _test_puzzletron_multiprocess_job
pytest.fail(
File "/opt/venv/lib/python3.12/site-packages/_pytest/outcomes.py", line 163, in __call__
raise Failed(msg=reason, pytrace=pytrace)
Failed: 2 assertion(s) failed for Qwen/Qwen3-8B:
- Teacher memory mismatch for Qwen/Qwen3-8B: expected 395.63, got 1582.13720703125
- Teacher num_params mismatch for Qwen/Qwen3-8B: expected 6096640, got 24189184
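As a quick check on the two mismatches, the observed values can be divided by the expected ones. This is just arithmetic on the numbers from the failure message above; it does not identify the cause, and the dictionary keys are my own labels.

```python
# Values copied from the assertion messages in the traceback above.
expected = {"memory": 395.63, "num_params": 6096640}
observed = {"memory": 1582.13720703125, "num_params": 24189184}


def ratios(expected_values, observed_values):
    """Return observed/expected for every key present in expected_values."""
    return {key: observed_values[key] / expected_values[key] for key in expected_values}


for key, value in ratios(expected, observed).items():
    print(f"{key}: observed/expected = {value:.3f}")
```

Both values come out close to four times the expected ones, which suggests the test is building a consistently larger teacher model than the reference values assume rather than hitting two unrelated problems.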
- To use puzzletron on a Slurm-based cluster, I had to figure out which enroot command to use to download the image and then learn how to use Slurm. Is there a wiki in modelopt that shows how to use modelopt on different types of infrastructure, e.g. in my case a Slurm-based on-prem cluster?