Skip to content

Update rocMLIR integration API to version 5 (perfConfig, cluster size)#4761

Open
dhernandez0 wants to merge 6 commits intoadd_rock_prefixfrom
rocmlir-integration-cluster-and-triton
Open

Update rocMLIR integration API to version 5 (perfConfig, cluster size)#4761
dhernandez0 wants to merge 6 commits intoadd_rock_prefixfrom
rocmlir-integration-cluster-and-triton

Conversation

@dhernandez0
Copy link
Copy Markdown

@dhernandez0 dhernandez0 commented Apr 9, 2026

MERGE AFTER #4748

Motivation

Update MIGraphX integration to support rocMLIR API version 5, which passes perfConfig through the backend pipeline (for triton backend compatibility) and returns cluster size from kernel attributes (for gfx1250+).

rocMLIR PR: ROCm/rocMLIR#2341 (I'll update the commit hash when merged).

Technical Details

  • Bump MLIR_MIGRAPHX_DIALECT_API_VERSION check from 4 to 5
  • mlirMIGraphXAddBackendPipeline now takes a MlirMIGraphXBackendOptions struct (arch, perfConfig, optLevel) instead of a separate arch string; the tuning solution is extracted and passed as perfConfig
  • Update get_launch_params to return cluster size from mlirGetKernelAttrs (currently ignored with a TODO)

Changelog Category

Add a CHANGELOG.md entry for any option other than Not Applicable

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

@dhernandez0 dhernandez0 self-assigned this Apr 9, 2026
@dhernandez0 dhernandez0 requested a review from causten as a code owner April 9, 2026 09:48
}
// 2nd pipeline to call
run_backend_pipeline();
run_backend_pipeline(solution_str);
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I undertsand solution is never null when we call run_backend_pipeline, is that correct?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that should be the case. Probably should throw an exception if its null. Then you can just do: run_backend_pipeline(solution.to<std::string>())

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@dhernandez0 dhernandez0 changed the title Update API with rocMLIR to support cluster launch and triton backend Update rocMLIR integration API to version 5 (perfConfig, cluster size) Apr 9, 2026
MlirMIGraphXBackendOptions opts{};
opts.arch = target_arch.c_str();
opts.perfConfig = solution.c_str();
opts.optLevel = 3;
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added this because it could be useful for debugging on your side as well. It's often the case that LLVM bugs go away if you pass optLevel=0.

@dhernandez0
Copy link
Copy Markdown
Author

dhernandez0 commented Apr 9, 2026

I currently have this split into two PRs: #4748 (rock. prefix for attributes) and this one (API v5: perfConfig + cluster size). Happy to squash them into a single PR if you'd prefer, please let me know. @pfultz2

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates MIGraphX’s rocMLIR integration to match rocMLIR MIGraphX dialect API v5 by switching the backend pipeline hook to the new options-based API and by reading the newly returned kernel cluster size attribute.

Changes:

  • Bump MLIR_MIGRAPHX_DIALECT_API_VERSION compatibility check from 4 → 5.
  • Update backend pipeline invocation to use MlirMIGraphXBackendOptions (arch, perfConfig, optLevel).
  • Extend get_launch_params() to return cluster size (currently ignored by code_object_op).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

mlirMIGraphXAddBackendPipeline(pm_back.get(), target_arch.c_str());
MlirMIGraphXBackendOptions opts{};
opts.arch = target_arch.c_str();
opts.perfConfig = solution.c_str();
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opts.perfConfig is always set to solution.c_str(). When solution is empty (common path when no explicit tuning solution is provided), passing an empty string may be interpreted as an explicit perfConfig by rocMLIR and could cause parse/validation failures. Prefer setting perfConfig to nullptr when solution.empty() (or only setting it when a solution exists).

Suggested change
opts.perfConfig = solution.c_str();
opts.perfConfig = solution.empty() ? nullptr : solution.c_str();

Copilot uses AI. Check for mistakes.
Comment on lines +919 to +923
MlirMIGraphXBackendOptions opts{};
opts.arch = target_arch.c_str();
opts.perfConfig = solution.c_str();
opts.optLevel = 3;
mlirMIGraphXAddBackendPipeline(pm_back.get(), &opts);
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opts.optLevel is hard-coded to 3 without explanation or configuration. If MIGraphX supports varying optimization levels elsewhere, consider threading that setting through (env var / compile option) or at least defining a named constant to avoid an unexplained magic number here.

Copilot uses AI. Check for mistakes.
Comment on lines +955 to +956
set_tuning(solution);
solution_str = *solution.if_string();
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After set_tuning(solution) validates that solution is a string, the code re-fetches it via solution.if_string() and dereferences again to populate solution_str. Consider reusing the validated pointer (or have set_tuning return the string) to avoid redundant lookups and make the control flow clearer.

Suggested change
set_tuning(solution);
solution_str = *solution.if_string();
const auto solution_ptr = solution.if_string();
set_tuning(solution);
solution_str = *solution_ptr;

Copilot uses AI. Check for mistakes.
Comment on lines +52 to 53
#if !defined(MLIR_MIGRAPHX_DIALECT_API_VERSION) || MLIR_MIGRAPHX_DIALECT_API_VERSION != 5
#warning "Incompatible version of rocMLIR library used, disabling"
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR metadata indicates this change should be included in CHANGELOG.md (category marked "Added"), but this PR does not include a changelog entry. Please add an appropriate CHANGELOG.md entry (or update the PR category to "Not Applicable" if that was intended).

Copilot uses AI. Check for mistakes.
@dhernandez0 dhernandez0 requested a review from a team as a code owner April 11, 2026 04:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants