You are working on a Python project for analyzing image annotations and bounding box data.
- Short and concise code is preferred
- The project uses "uv" for running scripts
- NO EMOJIS - emojis are unprofessional and should be avoided completely
- Provide a concise action plan before implementing any ideas - outline the approach first
- Be direct, professional and, if necessary, skeptical.
- DO NOT INVENT HARDCODED INFORMATION - Never make up specific paths, filenames, directory structures, or other concrete details unless they are directly found in the codebase
- Code comments should be brief and used sparingly
- Environment variables are stored in .env (not accessible to you)
- Environment is set in scripts using dotenv, but is not set in bash typically
- Dataset is stored on an external volume with root directory specified by the TIF_ROOT_DIR environment variable; assume this is fine.
- `pretraining_annotations.csv` is a symlinked 2.3GB file with image paths and bounding boxes
- SLURM scripts for GPU run on the `hpg-turin` partition with access to L4 GPUs. Normal CPU-only scripts do not need a partition specified.
- Use `test_minimal.csv` and `test_minimal_out` for real data testing.
- Follow PEP 8 Python style guidelines
- When writing code, consider the runtime and cyclomatic complexity of the implementation
- Run `pre-commit` when you're finished.
- DO NOT include complexity information in docstrings - keep docstrings focused on functionality
- Use `uv run pytest` to run tests with appropriate flags.
- Instructions in the tests folder.
- ALWAYS use `uv` for ALL dependency management - never edit pyproject.toml directly
- Use `uv add <package>` to add dependencies
- Use `uv run <command>` to execute scripts
- Use `uv sync` to install dependencies
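
The environment notes above (variables live in .env, scripts load them with dotenv, and the dataset root comes from TIF_ROOT_DIR) could be handled along these lines. This is a minimal sketch, not code from the project: the helper name `dataset_root` and its error message are hypothetical, and the guarded `python-dotenv` import is an assumption about how scripts load `.env`.

```python
import os
from pathlib import Path


def dataset_root(env=None) -> Path:
    """Return the dataset root from TIF_ROOT_DIR (hypothetical helper)."""
    if env is None:
        try:
            # Assumption: scripts load .env via python-dotenv.
            from dotenv import load_dotenv
            load_dotenv()
        except ImportError:
            pass
        env = os.environ
    root = env.get("TIF_ROOT_DIR")
    if not root:
        raise RuntimeError("TIF_ROOT_DIR is not set; load .env before use")
    return Path(root)
```

Passing a mapping as `env` keeps the helper usable in tests without touching the real environment.
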
Follow Conventional Commits specification for all commit messages.
- `feat`: new feature
- `fix`: bug fix
- `docs`: documentation changes
- `style`: formatting, missing semicolons, etc (no code change)
- `refactor`: code change that neither fixes a bug nor adds a feature
- `test`: adding missing tests or correcting existing tests
- `chore`: changes to build process or auxiliary tools
- NEVER use `git add .` or `git add -A` or other commands that blindly include files.
- Always add files individually using specific file paths: `git add file1.py file2.py`
- Show the user what will be staged before adding files
- Ask for confirmation on which files should be included in the commit
- Use lowercase for type and description
- Keep description under 50 characters
- Use imperative mood (e.g., "add" not "added" or "adds")
- No period at the end of the description
- Stay objective and factual - avoid subjective words like "improve", "enhance", "better", "clean up", "simplify"
- Be specific about what changed - describe the actual modification, not its perceived benefit
- Use concrete technical terms rather than qualitative assessments
- Avoid unnecessary justifications - state what changed, not why it's good
✅ Good: feat: add new validation for image paths
✅ Good: fix: handle null pointer in image validation
✅ Good: refactor: extract tile size calculation into function
❌ Bad: fix: improve error handling
❌ Bad: refactor: clean up tile algorithm
❌ Bad: feat: better image loading performance
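
The message rules above can be checked mechanically. The following is a minimal sketch, not part of the project: `check_commit_header`, the `TYPES` and `SUBJECTIVE` tuples, and the error strings are hypothetical names, and the word list covers only the subjective words named in the guidelines.

```python
import re

# Commit types listed in the guidelines.
TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")
# Subjective words the guidelines ask to avoid; extend as needed.
SUBJECTIVE = ("improve", "enhance", "better", "clean up", "simplify")
# A lowercase type, a colon and space, then the description.
HEADER = re.compile(r"^(?P<type>[a-z]+): (?P<desc>.+)$")


def check_commit_header(header: str) -> list[str]:
    """Return the guideline violations found in a commit header line."""
    match = HEADER.match(header)
    if not match:
        return ["header must look like '<type>: <description>'"]
    problems = []
    if match["type"] not in TYPES:
        problems.append(f"unknown type: {match['type']}")
    desc = match["desc"]
    if desc != desc.lower():
        problems.append("description must be lowercase")
    if len(desc) >= 50:
        problems.append("description must be under 50 characters")
    if desc.endswith("."):
        problems.append("description must not end with a period")
    lowered = desc.lower()
    problems += [
        f"subjective word: {word}"
        for word in SUBJECTIVE
        if re.search(rf"\b{re.escape(word)}", lowered)
    ]
    return problems
```

For instance, `check_commit_header("fix: improve error handling")` reports the subjective word, while `check_commit_header("feat: add new validation for image paths")` returns an empty list. Imperative mood is not checked here, since that needs more than pattern matching.
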