Skip to content

fix (nonmaxsuppression): fix output when use_dyn_output is false (make it onnx compliant) - #4679#4773

Open
adityas-amd wants to merge 9 commits intoROCm:developfrom
adityas-amd:adityas-amd/fix/nonmaxsuppression-output-padding
Open

fix (nonmaxsuppression): fix output when use_dyn_output is false (make it onnx compliant) - #4679#4773
adityas-amd wants to merge 9 commits intoROCm:developfrom
adityas-amd:adityas-amd/fix/nonmaxsuppression-output-padding

Conversation

@adityas-amd
Copy link
Copy Markdown
Member

Motivation

Fixes #4679

The nonmaxsuppression operator was producing incorrect outputs when use_dyn_output=False. The output was padded with [0, 0, 0] zero triplets instead of being trimmed to only the actually selected boxes.

In reality, [0, 0, 0] is a valid box index (batch 0, class 0, box 0), making it appear as if box 0 was selected multiple times. This silently corrupted downstream operations like Top-K filtering, breaking ONNX spec compliance.

Additionally:

  • use_dyn_output was never exposed in Python bindings, meaning all Python users were always hitting the code path with no way to opt out
  • use_dyn_output=True failed with static input shapes with error: parse: PARSE_EXPAND: dynamic input tensor with fixed dims input not supported

I tested the changes on a small test script

Before fix:

Building NMS ONNX model...
ONNX model saved!

Loading model with MIGraphX...
 Model compiled!

Running inference...

MIGraphX NMS Output:
[[0 0 3]
 [0 0 0]
 [0 0 5]
 [0 0 0]
 [0 0 0]
 [0 0 0]]
Output shape: (6, 3)

RESULT ANALYSIS:
   Number of output boxes: 6
   Expected (ONNX spec):   3 boxes

BUG CONFIRMED!
   Output is padded with [0,0,0] zeros
   0 appears multiple times!

Expected output per ONNX spec:
[[0, 0, 3],   <- batch 0, class 0, box 3 (score 0.95)
 [0, 0, 0],   <- batch 0, class 0, box 0 (score 0.90)
 [0, 0, 5]]   <- batch 0, class 0, box 5 (score 0.30)

After fix:

Building NMS ONNX model... 
ONNX model saved!

Loading model with MIGraphX... 
Model compiled!

Running inference...
MIGraphX NMS Output:
[[0 0 3] 
[0 0 0] 
[0 0 5]] 
Output shape: (3, 3)

RESULT ANALYSIS: 
 Number of output boxes: 3
 Expected (ONNX spec): 3 boxes

Output is CORRECT! Only selected boxes returned — ONNX spec compliant!

Expected output per ONNX spec: 
[[0, 0, 3], <- batch 0, class 0, box 3 (score 0.95)
 [0, 0, 0], <- batch 0, class 0, box 0 (score 0.90)
 [0, 0, 5]] <- batch 0, class 0, box 5 (score 0.30)

Technical Details

Root Cause

In compute(), the output was only trimmed to num_selected boxes when use_dyn_output=True. The False path returned the full pre-allocated buffer zero-filled to max_num_boxes rows.

Changes

src/include/migraphx/op/nonmaxsuppression.hpp

  • Removed std::fill zero padding in compute_nms() - no longer needed since output is always trimmed
  • Changed compute() to always trim output to num_selected boxes regardless of use_dyn_output flag value — both True and False now produce correct ONNX spec compliant output
  • Fixed compute_shape() else branch to return dynamic output shape for static inputs so use_dyn_output=True works correctly with static input shapes without PARSE_EXPAND error

src/py/migraphx_py.cpp

  • Added use_dyn_output parameter to parse_onnx() Python binding
    so Python users can now explicitly control this option
    (defaults to false for backward compatibility)

test/ref/nonmaxsuppression.cpp (Test suite passes)

  • Fixed gold values in 4 existing tests to match ONNX spec —
    removed 0 padding from expected outputs:
    • nms_test
    • nms_not_center_test
    • nms_transpose1_test
    • nms_transpose2_test
  • Added 4 new tests covering all combinations reported in issue: test coverage for reported scenarios | Added 4 targeted tests
    • nms_static_input_dyn_out_false_test — main bug fix validation
    • nms_static_input_dyn_out_true_test — static shapes + True works
    • nms_dyn_out_false_and_true_same_result_test — both flags produce identical correct output
    • nms_dynamic_input_dyn_out_false_throws_test — dynamic inputs with use_dyn_output=False correctly throws error

Files Changed

File Change
src/include/migraphx/op/nonmaxsuppression.hpp Main bug fix
src/py/migraphx_py.cpp Expose use_dyn_output in Python
test/ref/nonmaxsuppression.cpp Fix gold values + new tests

Changelog Category

  • Resolved Issues: Known issues from a previous version that
    have been resolved.

@adityas-amd adityas-amd requested a review from causten as a code owner April 10, 2026 03:59
Copilot AI review requested due to automatic review settings April 10, 2026 03:59
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes ONNX NonMaxSuppression output correctness in MIGraphX when use_dyn_output is disabled (previously returned a fixed-size buffer padded with valid [0,0,0] triplets), exposes use_dyn_output in Python parsing, and updates/adds reference tests to match ONNX spec behavior.

Changes:

  • Make nonmaxsuppression always return only the selected indices (trim to num_selected) and adjust shape inference accordingly.
  • Expose use_dyn_output in the Python parse_onnx() binding.
  • Update existing NMS reference tests to remove padding and add targeted tests for static/dynamic input + use_dyn_output combinations.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
src/include/migraphx/op/nonmaxsuppression.hpp Updates shape inference and runtime output shaping to avoid padded outputs and improve spec compliance.
src/py/migraphx_py.cpp Adds use_dyn_output argument to Python parse_onnx() and wires it to onnx_options.
test/ref/nonmaxsuppression.cpp Adjusts gold outputs to match trimmed ONNX behavior and adds new coverage for reported scenarios.
Comments suppressed due to low confidence (1)

src/py/migraphx_py.cpp:674

  • use_dyn_output is now exposed on parse_onnx(), but parse_onnx_buffer() still can’t set onnx_options.use_dyn_output, so Python users parsing from an in-memory buffer remain unable to control this behavior. Consider adding the same use_dyn_output argument (defaulting to false) to parse_onnx_buffer() and wiring it into the options for API parity.
        py::arg("max_loop_iterations")    = 10,
        py::arg("limit_max_iterations")   = std::numeric_limits<uint16_t>::max(),
        py::arg("use_dyn_output") = false); //https://github.com/ROCm/AMDMIGraphX/issues/4679

    m.def(
        "parse_onnx_buffer",
        [](const std::string& onnx_buffer,

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 187 to 191
// actual selected boxes. Previously returned fixed max_num_boxes which caused zero padding bug. https://github.com/ROCm/AMDMIGraphX/issues/4679
std::vector<shape::dynamic_dimension> out_lens = {};
out_lens.push_back({0, max_num_boxes});
out_lens.push_back({3, 3});
return {shape::int64_type, out_lens};
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, compute_shape() returns the same dynamic output shape in both the use_dyn_output and !use_dyn_output branches, and compute() also reshapes unconditionally. That makes the use_dyn_output flag behavior misleading (it now mainly controls whether dynamic inputs are allowed). Consider simplifying the branching / documenting this explicitly (or deprecating the flag) so callers don’t assume it controls whether the output shape is dynamic.

Copilot uses AI. Check for mistakes.
@adityas-amd adityas-amd requested a review from Copilot April 10, 2026 20:47
@adityas-amd adityas-amd changed the title fix (nonmaxsuppression): fix incorrect output when use_dyn_output is false - #4679 fix (nonmaxsuppression): fix output when use_dyn_output is false (make it onnx compliant) - #4679 Apr 10, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +187 to 192
// actual selected boxes. Previously returned fixed max_num_boxes which caused zero
// padding bug. https://github.com/ROCm/AMDMIGraphX/issues/4679
std::vector<shape::dynamic_dimension> out_lens = {};
out_lens.push_back({0, max_num_boxes});
out_lens.push_back({3, 3});
return {shape::int64_type, out_lens};
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output max length used for shape inference/allocation is derived from max_num_boxes = num_batches * num_boxes, but compute_nms() iterates over num_classes and can select up to num_batches * num_classes * num_boxes indices (or * min(num_boxes, max_output_boxes_per_class)), so the current {0, max_num_boxes} upper bound can under-allocate result and make std::copy(selected_indices..., output.begin()) write past the end when num_classes > 1. Consider including scores' max num_classes in the bound (e.g., boxes_max_lens[0] * inputs.at(1).max_lens()[1] * boxes_max_lens[1]) so output_shape.max_lens() is always a safe upper bound.

Copilot uses AI. Check for mistakes.
Comment on lines 382 to 392
// make buffer of maximum size
shape max_output_shape = {output_shape.type(), output_shape.max_lens()};
argument result{max_output_shape};
std::size_t num_selected = 0;

std::size_t max_output_boxes_per_class =
(args.size() > 2) ? (args.at(2).at<std::size_t>()) : 0;
if(max_output_boxes_per_class == 0)
{
return result;
return result.reshape({output_shape.type(), {num_selected, 3}});
}
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compute() always allocates result with output_shape.max_lens() before reading max_output_boxes_per_class; when max_output_boxes_per_class == 0 this returns an empty {0,3} view but still pays the cost of allocating the max-size buffer. Consider checking max_output_boxes_per_class first and returning an empty argument directly (or moving the max-buffer allocation after the early-return) to avoid unnecessary allocations.

Copilot uses AI. Check for mistakes.
Comment on lines 700 to +704
std::unordered_map<std::string, std::vector<migraphx::shape::dynamic_dimension>>(),
py::arg("skip_unknown_operators") = false,
py::arg("print_program_on_error") = false,
py::arg("external_data_path") = "");
py::arg("external_data_path") = "",
py::arg("use_dyn_output") = false);
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the parse_onnx_buffer binding, the first argument is the ONNX buffer (onnx_buffer in the lambda), but the exposed keyword name is still py::arg("filename") (see line 695). Since this function signature is being updated, consider renaming the argument keyword to something like onnx_buffer (or otherwise documenting why it differs) so Python call sites using keywords aren’t misled.

Copilot uses AI. Check for mistakes.
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 10, 2026

Codecov Report

❌ Patch coverage is 83.33333% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/include/migraphx/op/nonmaxsuppression.hpp 83.33% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4773      +/-   ##
===========================================
+ Coverage    92.32%   92.45%   +0.12%     
===========================================
  Files          583      583              
  Lines        29334    29297      -37     
===========================================
+ Hits         27082    27084       +2     
+ Misses        2252     2213      -39     
Files with missing lines Coverage Δ
src/include/migraphx/op/nonmaxsuppression.hpp 95.89% <83.33%> (-0.03%) ⬇️

... and 23 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Issue]: nonmaxsuppression node produces incorrect outputs when nonmaxsupression::use_dyn_output == False

2 participants