fix: support full-image masks in instance segmentation postprocessing #588

Merged: kprokofi merged 4 commits into master from fix/instance-segmentation-full-image-masks on May 20, 2026
Conversation

@kprokofi
Contributor

Summary

  • Adds support for full-image masks (e.g., RF-DETR-Seg) in MaskRCNNModel postprocessing
  • Auto-detects mask type by comparing mask spatial dimensions to model input dimensions
  • Adds full_image_masks parameter (auto/yes/no) for explicit control

Problem

Models like RF-DETR-Seg output full-image masks at reduced resolution (e.g. input_size / 4 = 96x96) rather than per-box crop masks (e.g. Mask R-CNN's 28x28). The existing _segm_postprocess incorrectly treats these as box-region crops, causing masks to be shifted and wrongly scaled during inference.

Root cause: _segm_postprocess pads the mask, resizes it to the bounding box dimensions, and places it at the box position — correct for 28x28 per-box crops, but wrong for 96x96 full-image masks that already cover the entire input.
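
The misplacement described above can be illustrated with a one-dimensional worked example. This is a simplified sketch, not the actual _segm_postprocess code: it ignores the padding step, and the function names and all numbers (a 384-px-wide input, a 96-px-wide mask, a box spanning columns 96 to 192) are hypothetical values chosen for illustration.

```python
def box_crop_placement(obj_span, mask_w, box):
    """Map an object's span from mask coordinates to image coordinates the way
    per-box-crop logic does: stretch the whole mask to the box width and
    paste it at the box origin. The padding step is ignored for simplicity."""
    x0, x1 = box
    scale = (x1 - x0) / mask_w
    return (x0 + obj_span[0] * scale, x0 + obj_span[1] * scale)


def full_image_placement(obj_span, mask_w, image_w):
    """Map the same span assuming the mask already covers the entire image."""
    scale = image_w / mask_w
    return (obj_span[0] * scale, obj_span[1] * scale)


# Hypothetical numbers: the object occupies image columns 96..192,
# so in a 96-px-wide full-image mask it occupies columns 24..48.
image_w, mask_w = 384, 96
box = (96, 192)
obj_span = (24, 48)

wrong = box_crop_placement(obj_span, mask_w, box)        # shifted and shrunk
right = full_image_placement(obj_span, mask_w, image_w)  # matches the box
```

With these numbers the box-crop mapping puts the object at columns 120 to 144, while the full-image mapping recovers 96 to 192, which is the kind of shift and rescale the problem statement reports.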

Resolves: open-edge-platform/training_extensions#6488

Fix

  • Auto-detection heuristic: if mask_dim / model_input_dim > 0.15, the mask is full-image (e.g., 96/384 = 0.25 for RF-DETR-Seg vs 28/800 = 0.035 for Mask R-CNN)
  • Full-image postprocess: simple cv2.resize to original image dimensions + threshold at 0.5
  • Backward compatible: per-box crop masks (Mask R-CNN, RTMDet-Inst, etc.) continue using the existing _segm_postprocess logic unchanged
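
The two steps listed above can be sketched as follows. This is a dependency-free illustration under stated assumptions, not the code from the PR: the function names are hypothetical, masks are plain nested lists, and a nearest-neighbour lookup stands in for cv2.resize (which the description names as the actual resize call). Only the 0.15 ratio and the 0.5 threshold come from the description.

```python
FULL_IMAGE_RATIO_THRESHOLD = 0.15  # ratio quoted in the PR description


def looks_like_full_image_mask(mask_dim: int, model_input_dim: int) -> bool:
    """Heuristic: a mask that is large relative to the model input
    is assumed to cover the whole image rather than one box."""
    return mask_dim / model_input_dim > FULL_IMAGE_RATIO_THRESHOLD


def full_image_mask_postprocess(mask, out_h, out_w, threshold=0.5):
    """Resize a full-image mask to the original image size, then binarize.

    Nearest-neighbour sampling is used here as a stand-in for cv2.resize.
    """
    in_h, in_w = len(mask), len(mask[0])
    return [
        [
            1 if mask[y * in_h // out_h][x * in_w // out_w] > threshold else 0
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


# Ratios from the description: RF-DETR-Seg (96/384) vs Mask R-CNN (28/800).
rf_detr_is_full = looks_like_full_image_mask(96, 384)
mask_rcnn_is_full = looks_like_full_image_mask(28, 800)

# Tiny hypothetical 2x2 probability mask upscaled to 4x4.
binary = full_image_mask_postprocess([[0.9, 0.1], [0.2, 0.8]], 4, 4)
```

Note that no box coordinates enter the full-image path at all, which is why the resulting mask cannot be shifted to the box position.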

Verification

Tested end-to-end with RF-DETR-Seg-Small exported to OpenVINO:

  • Before fix: masks shifted by ~20px and scaled 0.56x (misaligned with bounding boxes)
  • After fix: mask-to-box overlap = 0.997 to 1.000 (masks aligned with bounding boxes)

Changes

| File | Change |
| --- | --- |
| models/instance_segmentation.py | Add _should_use_full_image_masks() method, _full_image_mask_postprocess() function, conditional routing in postprocess() |
| models/parameters.py | Add full_image_masks parameter to INSTANCE_SEGMENTATION registry |

@kprokofi kprokofi requested a review from a team as a code owner May 19, 2026 12:33
@github-actions github-actions Bot added the python label (python related changes) May 19, 2026
@kprokofi kprokofi marked this pull request as draft May 19, 2026 12:35
DETR-family instance segmentation models (e.g. RF-DETR-Seg) output
full-image masks at reduced resolution (input_size/4) rather than
per-box crop masks (28x28) like Mask R-CNN.

Add DETRInstanceSegmentation class (__model__ = "DETRInstSeg") that
inherits from MaskRCNNModel and overrides postprocess() to resize
masks to original image dimensions directly, instead of the box-crop
placement logic used by MaskRCNNModel.

This follows the same pattern as SSD vs YOLO for detection -- different
architectures get different model wrappers, selected via model_type in
the exported model's rt_info.

MaskRCNNModel remains unchanged for backward compatibility.

Resolves: open-edge-platform/training_extensions#6488
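
The wrapper-per-architecture pattern in this commit message might look roughly like the sketch below. It is an illustration only: the lookup function wrapper_for and the mask_strategy method are hypothetical, the classes are empty stand-ins, and the "MaskRCNN" identifier is an assumption. Only the "DETRInstSeg" identifier and the inheritance from MaskRCNNModel are taken from the message.

```python
class MaskRCNNModel:
    """Stand-in for the existing per-box-crop wrapper."""

    __model__ = "MaskRCNN"  # assumed identifier, not confirmed by this PR

    def mask_strategy(self) -> str:
        return "per-box-crop"


class DETRInstanceSegmentation(MaskRCNNModel):
    """Stand-in for the new wrapper that resizes full-image masks directly."""

    __model__ = "DETRInstSeg"  # identifier quoted in the commit message

    def mask_strategy(self) -> str:
        return "full-image"


def wrapper_for(model_type: str) -> type:
    """Hypothetical lookup: match model_type against each class's __model__."""
    candidates = [MaskRCNNModel, *MaskRCNNModel.__subclasses__()]
    for cls in candidates:
        if cls.__model__ == model_type:
            return cls
    raise ValueError(f"unknown model_type: {model_type!r}")


# The model_type string would come from the exported model's rt_info.
detr_wrapper = wrapper_for("DETRInstSeg")()
legacy_wrapper = wrapper_for("MaskRCNN")()
```

Selecting the wrapper from metadata keeps the choice explicit in the exported model instead of relying on a shape heuristic at inference time.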
@kprokofi kprokofi force-pushed the fix/instance-segmentation-full-image-masks branch from 54ed03f to 4134322 Compare May 19, 2026 12:49
Comment thread model_api/src/model_api/models/instance_segmentation.py Outdated
Tests cover:
- _full_image_mask_postprocess: resize, threshold, dtype, spatial pattern preservation
- Comparison between full-image and per-box-crop postprocessing approaches
- DETRInstanceSegmentation.postprocess: basic flow, batch dim squeezing,
  confidence filtering, empty results, label increment, label names,
  mask positioning (verifies masks are NOT shifted to box position),
  multiple detections, class attributes, and inheritance
@github-actions github-actions Bot added the tests label (Related to tests) May 19, 2026
kprokofi added 2 commits May 19, 2026 22:59
Introduce InstanceSegmentationModel as the common base for both
MaskRCNNModel and DETRInstanceSegmentation. The base class contains
all shared logic: initialization, output detection, preprocessing,
box rescaling, confidence/area filtering, and NMS.

Subclasses only need to implement _postprocess_single_mask():
- MaskRCNNModel: per-box-crop postprocess (_segm_postprocess)
- DETRInstanceSegmentation: full-image resize (_full_image_mask_postprocess)

This eliminates the duplicated postprocess code and makes the hierarchy
cleanly express the architectural difference between the two approaches.

Also updates the tiler to use InstanceSegmentationModel for isinstance
checks, and adds tests verifying the new hierarchy.
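
The hierarchy described in this commit message follows a template-method shape, sketched below. This is a minimal illustration, not the merged code: the method signatures, the tuple-based detection format, and the placeholder return values are assumptions. Only the class names and the _postprocess_single_mask hook come from the message.

```python
from abc import ABC, abstractmethod


class InstanceSegmentationModel(ABC):
    """Shared flow: filter detections, then delegate mask handling."""

    def __init__(self, confidence_threshold: float = 0.5):
        self.confidence_threshold = confidence_threshold

    def postprocess(self, detections, image_shape):
        # detections: iterable of (score, box, raw_mask); format is hypothetical.
        results = []
        for score, box, raw_mask in detections:
            if score < self.confidence_threshold:
                continue  # shared confidence filtering
            mask = self._postprocess_single_mask(raw_mask, box, image_shape)
            results.append((box, mask))
        return results

    @abstractmethod
    def _postprocess_single_mask(self, raw_mask, box, image_shape):
        """Turn one raw mask into an image-sized mask."""


class MaskRCNNModel(InstanceSegmentationModel):
    def _postprocess_single_mask(self, raw_mask, box, image_shape):
        # Placeholder for the per-box-crop path (_segm_postprocess).
        return ("box-crop", box)


class DETRInstanceSegmentation(InstanceSegmentationModel):
    def _postprocess_single_mask(self, raw_mask, box, image_shape):
        # Placeholder for the full-image path (_full_image_mask_postprocess).
        return ("full-image", image_shape)


detections = [(0.9, (10, 10, 50, 50), None), (0.2, (0, 0, 5, 5), None)]
detr_out = DETRInstanceSegmentation().postprocess(detections, (384, 384))
rcnn_out = MaskRCNNModel().postprocess(detections, (384, 384))
```

Because both subclasses share the base type, an isinstance check against InstanceSegmentationModel (as the tiler change describes) covers either architecture.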
Contributor

@tybulewicz tybulewicz left a comment

Great improvement, thanks a lot!

@kprokofi kprokofi marked this pull request as ready for review May 20, 2026 13:09
@kprokofi kprokofi added this pull request to the merge queue May 20, 2026
Merged via the queue into master with commit 1951d64 May 20, 2026
33 checks passed
@kprokofi kprokofi deleted the fix/instance-segmentation-full-image-masks branch May 20, 2026 13:16
kprokofi added a commit to open-edge-platform/training_extensions that referenced this pull request May 21, 2026
Point openvino-model-api to the model_api fix/instance-segmentation-full-image-masks
branch commit that introduces DETRInstanceSegmentation class with proper
full-image mask postprocessing for RF-DETR-Seg.

Also add pyarrow override to resolve datumaro>=24 vs mlflow<24 conflict
that blocks uv lock resolution for Python 3.13+.

Ref: open-edge-platform/model_api#588
Development

Successfully merging this pull request may close these issues.

RF-DETR-Seg predictions are unexpectedly shifted