fix: support full-image masks in instance segmentation postprocessing #588
Merged
DETR-family instance segmentation models (e.g. RF-DETR-Seg) output full-image masks at reduced resolution (input_size/4) rather than per-box crop masks (28x28) like Mask R-CNN.

Add DETRInstanceSegmentation class (__model__ = "DETRInstSeg") that inherits from MaskRCNNModel and overrides postprocess() to resize masks to original image dimensions directly, instead of the box-crop placement logic used by MaskRCNNModel.

This follows the same pattern as SSD vs YOLO for detection: different architectures get different model wrappers, selected via model_type in the exported model's rt_info. MaskRCNNModel remains unchanged for backward compatibility.

Resolves: open-edge-platform/training_extensions#6488
54ed03f to
4134322
Compare
leoll2
reviewed
May 19, 2026
Tests cover:
- _full_image_mask_postprocess: resize, threshold, dtype, spatial pattern preservation
- Comparison between full-image and per-box-crop postprocessing approaches
- DETRInstanceSegmentation.postprocess: basic flow, batch dim squeezing, confidence filtering, empty results, label increment, label names, mask positioning (verifies masks are NOT shifted to box position), multiple detections, class attributes, and inheritance
Introduce InstanceSegmentationModel as the common base for both MaskRCNNModel and DETRInstanceSegmentation. The base class contains all shared logic: initialization, output detection, preprocessing, box rescaling, confidence/area filtering, and NMS.

Subclasses only need to implement _postprocess_single_mask():
- MaskRCNNModel: per-box-crop postprocess (_segm_postprocess)
- DETRInstanceSegmentation: full-image resize (_full_image_mask_postprocess)

This eliminates the duplicated postprocess code and makes the hierarchy cleanly express the architectural difference between the two approaches.

Also updates the tiler to use InstanceSegmentationModel for isinstance checks, and adds tests verifying the new hierarchy.
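The hierarchy described in this commit message can be illustrated with a minimal, dependency-free sketch. The class names and _postprocess_single_mask() come from the commit message; everything else is an assumption made for illustration: the method signatures, the (score, box, raw_mask) tuple format, and the resize_nearest helper, which stands in for cv2.resize. The mask padding that _segm_postprocess performs is omitted, so this is not the library's actual implementation.

```python
from abc import ABC, abstractmethod


def resize_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize of a 2D list (hypothetical stand-in for cv2.resize)."""
    in_h, in_w = len(mask), len(mask[0])
    return [
        [mask[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


class InstanceSegmentationModel(ABC):
    """Shared base: subclasses only decide how one raw mask becomes an image-size mask."""

    def postprocess(self, detections, image_h, image_w, confidence_threshold=0.5):
        # detections: iterable of (score, (x0, y0, x1, y1), raw_mask); assumed format.
        results = []
        for score, box, raw_mask in detections:
            if score < confidence_threshold:  # shared confidence filtering
                continue
            mask = self._postprocess_single_mask(raw_mask, box, image_h, image_w)
            results.append((score, box, mask))
        return results

    @abstractmethod
    def _postprocess_single_mask(self, raw_mask, box, image_h, image_w):
        ...


class MaskRCNNModel(InstanceSegmentationModel):
    def _postprocess_single_mask(self, raw_mask, box, image_h, image_w):
        # Per-box crop: resize the mask to the box size, then paste it at the box position.
        x0, y0, x1, y1 = box  # assumed to be integers inside the image
        crop = resize_nearest(raw_mask, y1 - y0, x1 - x0)
        full = [[0] * image_w for _ in range(image_h)]
        for dy, row in enumerate(crop):
            for dx, value in enumerate(row):
                full[y0 + dy][x0 + dx] = int(value > 0.5)
        return full


class DETRInstanceSegmentation(InstanceSegmentationModel):
    def _postprocess_single_mask(self, raw_mask, box, image_h, image_w):
        # Full-image mask: resize straight to the image size; the box is not used.
        resized = resize_nearest(raw_mask, image_h, image_w)
        return [[int(value > 0.5) for value in row] for row in resized]
```

Feeding the same raw mask to both subclasses shows the difference: the Mask R-CNN path confines the mask to the box region, while the DETR path spreads it over the whole image, which is why full-image masks end up shifted when routed through the per-box logic.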
tybulewicz
approved these changes
May 19, 2026
Contributor
tybulewicz
left a comment
Great improvement, thanks a lot!
leoll2
approved these changes
May 19, 2026
kprokofi
added a commit
to open-edge-platform/training_extensions
that referenced
this pull request
May 21, 2026
Point openvino-model-api to the model_api fix/instance-segmentation-full-image-masks branch commit that introduces DETRInstanceSegmentation class with proper full-image mask postprocessing for RF-DETR-Seg. Also add pyarrow override to resolve datumaro>=24 vs mlflow<24 conflict that blocks uv lock resolution for Python 3.13+. Ref: open-edge-platform/model_api#588
Summary
- MaskRCNNModel postprocessing
- full_image_masks parameter (auto/yes/no) for explicit control

Problem
Models like RF-DETR-Seg output full-image masks at reduced resolution (e.g. input_size / 4 = 96x96) rather than per-box crop masks (e.g. Mask R-CNN's 28x28). The existing _segm_postprocess incorrectly treats these as box-region crops, causing masks to be shifted and wrongly scaled during inference.

Root cause: _segm_postprocess pads the mask, resizes it to the bounding box dimensions, and places it at the box position. This is correct for 28x28 per-box crops, but wrong for 96x96 full-image masks that already cover the entire input.

Resolves: open-edge-platform/training_extensions#6488
Fix
- If mask_dim / model_input_dim > 0.15, the mask is full-image (e.g., 96/384 = 0.25 for RF-DETR-Seg vs 28/800 = 0.035 for Mask R-CNN)
- cv2.resize to original image dimensions + threshold at 0.5
- _segm_postprocess logic unchanged

Verification
Tested end-to-end with RF-DETR-Seg-Small exported to OpenVINO:
Changes
- models/instance_segmentation.py: _should_use_full_image_masks() method, _full_image_mask_postprocess() function, conditional routing in postprocess()
- models/parameters.py: full_image_masks parameter to INSTANCE_SEGMENTATION registry
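The ratio heuristic from the Fix section can be sketched as a small standalone function. The 0.15 threshold and the auto/yes/no values come from the description above; the function signature and the exact meaning given to each parameter value (yes forces full-image handling, no forces per-box handling, auto applies the ratio test) are assumptions, not the signature of the real _should_use_full_image_masks() method.

```python
def should_use_full_image_masks(mask_dim, model_input_dim,
                                full_image_masks="auto", ratio_threshold=0.15):
    """Decide whether a mask output should be treated as a full-image mask.

    Hypothetical sketch: "yes"/"no" override the decision, "auto" compares the
    mask side length with the model input side length.
    """
    if full_image_masks == "yes":
        return True
    if full_image_masks == "no":
        return False
    if full_image_masks != "auto":
        raise ValueError(f"unexpected full_image_masks value: {full_image_masks!r}")
    return mask_dim / model_input_dim > ratio_threshold


# Values quoted in the description above.
print(should_use_full_image_masks(96, 384))   # ratio 0.25, RF-DETR-Seg
print(should_use_full_image_masks(28, 800))   # ratio 0.035, Mask R-CNN
```

With the numbers quoted in the description, the 96/384 case lands above the threshold and the 28/800 case well below it, so the two architectures are separated by a wide margin in auto mode.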