[AIROCMLIR-551] Completely support as_underlying_shape and as_logical_shape#2265
Conversation
Also, note that the code structure is very similar to the one found in the TOSA pipeline, if that makes review easier.
// CHECK-DAG: %[[expanded:.*]] = tensor.expand_shape %[[arg1]]
// CHECK-DAG: %[[expanded_0:.*]] = tensor.expand_shape %[[arg0]]
// CHECK-DAG: linalg.sub ins(%[[expanded_0]], %[[expanded]] {{.*}})
// CHECK-DAG: linalg.sub ins(%[[arg0]], %[[arg1]] {{.*}})
This actually improves the previous IR. Notice that the input shape !migraphx.shaped<16xf32, 1> gets converted into tensor<16xf32>, which is the memory-layout shape. In that case, we don't emit tensor.expand_shape.
Before:
func.func @func_sub(%arg0: tensor<16xf32>, %arg1: tensor<16xf32>) -> tensor<16xf32> {
%expanded = tensor.expand_shape %arg1 [[0]] output_shape [16] : tensor<16xf32> into tensor<16xf32>
%expanded_0 = tensor.expand_shape %arg0 [[0]] output_shape [16] : tensor<16xf32> into tensor<16xf32>
%0 = tensor.empty() : tensor<16xf32>
%1 = linalg.sub ins(%expanded_0, %expanded : tensor<16xf32>, tensor<16xf32>) outs(%0 : tensor<16xf32>) -> tensor<16xf32>
%collapsed = tensor.collapse_shape %1 [[0]] : tensor<16xf32> into tensor<16xf32>
return %collapsed : tensor<16xf32>
}

After:
func.func @func_sub(%arg0: tensor<16xf32>, %arg1: tensor<16xf32>) -> tensor<16xf32> {
%0 = tensor.empty() : tensor<16xf32>
%1 = linalg.sub ins(%arg0, %arg1 : tensor<16xf32>, tensor<16xf32>) outs(%0 : tensor<16xf32>) -> tensor<16xf32>
return %1 : tensor<16xf32>
}

/// %empty = tensor.empty() : ....
/// %inserted_slice = tensor.insert_slice %actual_data into %empty ...
Is the expand-strides case the only one that will produce an empty tensor + insert_slice? This seems quite broad, and potentially susceptible to pattern-matching unwanted scenarios?
This is indeed true, but my understanding is that empty tensor + insert_slice is the same as a rock.expand_strides. rock.expand_strides is the same as extract_slice (of the result) + memcpy, which I think has the same semantics as above.

I think we have to emit an attribute during migraphx-to-linalg if we only want some cases of tensor.insert_slice to be lowered into rock.

Also, I think Copilot gave a pretty good comment on this one as well: I forgot to check the stride, offset, and sizes attributes of tensor.insert_slice.
I think in the TosaToRock version we used a tosa.custom_op so that we weren't just matching empty tensor + insert_slice
I think it is hard to emit one single op for this operation in linalg, because my understanding is that we only want to write to half of the memory; we don't want to load/write the other half.

A similar problem was discussed on LLVM Discourse.
I don't have a problem with using multiple ops to represent this; my main concern is whether there will ever be other situations that emit empty tensor + insert_slice that we don't want to lower to rock.expand_strides.
Here are a few examples:
%empty = tensor.empty() : tensor<8x8xf16>
%result = tensor.insert_slice %src into %empty[2, 0][4, 8][1, 1]
: tensor<4x8xf16> into tensor<8x8xf16>

In this case the data should go at row offset 2, but rock.expand_strides would place it at row 0.
%empty = tensor.empty() : tensor<8x8xf16>
%result = tensor.insert_slice %src into %empty[0, 0][4, 8][2, 1]
: tensor<4x8xf16> into tensor<8x8xf16>

This interleaves source rows with gaps (stride-2 insertion). I believe that rock.expand_strides would do a contiguous copy instead?
I think you have some handling right now for these cases, but this still seems fragile to me. Does it make sense to add some kind of metadata to the insert_slice (or tensor.empty) upon creation?
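For what it's worth, the two counterexamples can be simulated outside MLIR. Below is a small pure-Python sketch (the `insert_slice` helper is hypothetical, written only to mimic `tensor.insert_slice` offset/stride semantics on smaller tensors); it shows why a plain contiguous copy into the destination, roughly what the discussion says `rock.expand_strides` does, would give a different result in both cases.

```python
# Hypothetical helper mimicking tensor.insert_slice on 2-D Python lists:
# writes src into dest at the given offsets/strides, leaving every other
# element of dest untouched.
def insert_slice(src, dest, offsets, strides):
    rows, cols = len(src), len(src[0])
    out = [row[:] for row in dest]
    for i in range(rows):
        for j in range(cols):
            out[offsets[0] + i * strides[0]][offsets[1] + j * strides[1]] = src[i][j]
    return out

src = [[1, 2], [3, 4]]
empty = [[0] * 4 for _ in range(4)]

# Offset case: data lands at row 2, not row 0.
shifted = insert_slice(src, empty, offsets=(2, 0), strides=(1, 1))
assert shifted[2][:2] == [1, 2] and shifted[0][:2] == [0, 0]

# Stride-2 case: source rows are interleaved with untouched rows.
gapped = insert_slice(src, empty, offsets=(0, 0), strides=(2, 1))
assert gapped[0][:2] == [1, 2] and gapped[1] == [0, 0, 0, 0] and gapped[2][:2] == [3, 4]
```

A contiguous copy would instead place `src` starting at element (0, 0) with unit strides, so both `shifted` and `gapped` differ from it.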
I have added an attribute/metadata to the tensor.insert_slice.
mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-non-contiguous-stride.mlir
return op.emitOpError("unsupported conversion to underlying shape");

// Trivial case, the input tensor is the target memory layout tensor
if (inTensorType == resultTensorType){
Are we missing clang-format here?
// Verify that memoryLayoutType is >= transposedType in all dimensions.
RankedTensorType transposedType =
    cast<RankedTensorType>(transposed.getType());
if (llvm::any_of(llvm::enumerate(memoryLayoutType.getShape(),
This will print an error for every failed dimension, right? I think it would be better to print a message just on the first error.

In that case, I am not sure llvm::any_of is the right tool here. There is a side effect dependent on hasErroredOut, which is not great. Do you think a for loop would be better?

llvm::any_of should do an early exit, I think, so it shouldn't print the error message multiple times.
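Short-circuiting predicates behave the same way in most languages; here is a small Python analogy (names are illustrative, not from the patch) showing that `any` stops at the first failing dimension, so a side-effecting predicate reports only one error:

```python
def check_dims(memory_shape, transposed_shape, report):
    # Like llvm::any_of, Python's any() short-circuits: the predicate
    # (and its error-reporting side effect) stops at the first failure.
    def too_big(i, mem, tr):
        if mem < tr:
            report(f"dim {i}: memory {mem} < transposed {tr}")
            return True
        return False
    return any(too_big(i, m, t)
               for i, (m, t) in enumerate(zip(memory_shape, transposed_shape)))

errors = []
assert check_dims([4, 2, 2], [4, 8, 8], errors.append)
assert errors == ["dim 1: memory 2 < transposed 8"]  # only the first failure
```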
Pull request overview
This PR extends MIGraphX shape materialization to fully support transposed layouts, long/expanded strides, and (for reads) broadcasted strides via improved as_logical_shape / as_underlying_shape lowering, and adds a Rock-side lowering path for expanded-stride materializations.
Changes:
- Update `migraphx.mlir.as.logical.shape` lowering to reshape into memory layout, invert stride permutation via transpose, slice for long strides, and broadcast as needed.
- Update `migraphx.mlir.as.underlying.shape` lowering to transpose into memory layout and materialize long strides via `tensor.insert_slice` into a larger `tensor.empty` (broadcast writes remain unsupported).
- Add a Linalg→Rock conversion to rewrite eligible `tensor.insert_slice`-into-`tensor.empty` patterns into `rock.expand_strides`, and add/adjust lit + e2e tests for non-contiguous / long-stride cases.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp | Implements the new as_logical_shape / as_underlying_shape lowering logic for transpose, long strides, and broadcasting (read-side). |
| mlir/lib/Conversion/LinalgToRock/LinalgToRockPass.cpp | Marks certain tensor.insert_slice patterns dynamically illegal to force conversion into Rock ops. |
| mlir/lib/Conversion/LinalgToRock/LinalgToRock.cpp | Adds a conversion pattern rewriting tensor.insert_slice (into a single-use tensor.empty) into rock.expand_strides. |
| mlir/test/fusion/pr-e2e/mixr-non-contiguous-strides-sub.mlir | Adds an end-to-end regression covering non-contiguous output strides. |
| mlir/test/Conversion/MIGraphXToLinalg/mixr-to-linalg-ops.mlir | Updates existing checks and adds new coverage for transposed, broadcasted, sliced, and combined stride scenarios. |
| mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-not-implemented.mlir | Removes a no-longer-relevant “broadcast not implemented” negative test. |
| mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-non-contiguous-stride.mlir | Adds new lit tests for long-stride legalization + error cases. |
| mlir/test/Conversion/LinalgToRock/linalg-to-rock-expand-strides.mlir | Adds lit tests validating tensor.insert_slice → rock.expand_strides lowering. |
Comments suppressed due to low confidence (1)
mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp:83
`hasTranspose |= (from != static_cast<int32_t>(to))` mixes `int64_t`/`size_t` with an `int32_t` cast. This should compare using the same width (e.g., cast `to` to `int64_t`) to avoid truncation and keep the intent clear.
for (auto [to, from] : llvm::enumerate(inversePermutation)) {
permutation[from] = to;
transposedShape[from] = memoryType.getShape()[to];
hasTranspose |= (from != static_cast<int32_t>(to));
}
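The truncation hazard Copilot flags can be demonstrated in isolation; this Python sketch (the `to_int32` helper is made up, emulating a C++ `static_cast<int32_t>` on a 64-bit value) shows a large index whose low 32 bits are zero comparing equal after truncation:

```python
def to_int32(x):
    # Emulate C++ static_cast<int32_t>: keep the low 32 bits, two's complement.
    x &= 0xFFFFFFFF
    return x - 2**32 if x >= 2**31 else x

to = 2**32   # a large 64-bit index (e.g., a size_t from enumerate)
frm = 0      # "from" is a Python keyword, hence the rename

assert frm != to                  # correct 64-bit comparison
assert frm == to_int32(to)        # after truncation, the difference vanishes
assert to_int32(2**31) == -2**31  # truncation can also flip the sign
```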
func.func @mlir_dot_sigmoid(%arg0: !migraphx.shaped<4x5x16xf16, 80x16x1>, %arg1: !migraphx.shaped<4x16x24xf16, 384x24x1>) -> !migraphx.shaped<4x5x24xf16, 288x24x1> attributes {arch = "gfx1201", kernel = "mixr", num_cu = 32 : i64} {
  %0 = migraphx.dot %arg0, %arg1 : <4x5x16xf16, 80x16x1>, <4x16x24xf16, 384x24x1> -> <4x5x24xf16, 120x24x1>
  %1 = migraphx.sigmoid %0 : <4x5x24xf16, 120x24x1> -> <4x5x24xf16, 288x24x1>
  return %1 : !migraphx.shaped<4x5x24xf16, 288x24x1>
}
auto expandOp = rock::ExpandStridesOp::create(rewriter, loc, op.getType(),
                                              adaptor.getSource(), alloc);
rewriter.replaceOp(op, expandOp);
rewriter.eraseOp(tensorEmpty);
Better to assert that tensor::EmptyOp doesn't have any other use before erasing
Done. Using a similar approach to the linalg.generic convolution lowering, we can let dead-code elimination clean this up.
justinrosner left a comment
Overall the changes look good to me now. Just need some resolution on #2265 (comment)
In this case, I made a change in AsUnderlyingShapeConverter to emit rock.is_expand_strides to tell linalg -> rock whether it is an expand_strides.
Co-authored-by: Justin Rosner <justin.rosner@amd.com>
Motivation
Support transposed memory layouts and long strides in both as_underlying_shape and as_logical_shape.
Furthermore, in as_logical_shape, broadcasting is supported.

Technical Details
This PR essentially implements these two commits: #2198 and this one.
as_logical_shape is implemented as the following:
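Conceptually (a sketch of the semantics only, not the actual MLIR lowering; the helper name is made up): reading a !migraphx.shaped value is a strided gather from the flat underlying buffer, where permuted strides model transpose, over-long strides skip padding, and stride 0 models broadcast.

```python
# Sketch of the read-side (as_logical_shape) semantics: gather the logical
# tensor from the flat underlying buffer using the declared shape/strides.
def as_logical(buffer, shape, strides):
    def gather(dim, base):
        if dim == len(shape):
            return buffer[base]
        return [gather(dim + 1, base + i * strides[dim]) for i in range(shape[dim])]
    return gather(0, 0)

buf = list(range(6))  # underlying memory: [0, 1, 2, 3, 4, 5]
assert as_logical(buf, [2, 3], [3, 1]) == [[0, 1, 2], [3, 4, 5]]  # row-major
assert as_logical(buf, [3, 2], [1, 3]) == [[0, 3], [1, 4], [2, 5]]  # transposed
assert as_logical(buf, [2, 2], [3, 0]) == [[0, 0], [3, 3]]          # broadcast dim
assert as_logical(list(range(8)), [2, 2], [4, 1]) == [[0, 1], [4, 5]]  # long stride
```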
as_underlying_shape is implemented as the following:
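The write side can be sketched the same way (again illustrative Python under the PR description's assumptions, not the actual lowering): long strides mean the underlying buffer is larger than the logical element count, so the logical values are scattered into a fresh buffer, matching the tensor.insert_slice-into-tensor.empty pattern discussed above.

```python
# Sketch of the write-side (as_underlying_shape) semantics: scatter the
# logical tensor into a fresh flat buffer at stride-determined positions.
def as_underlying(logical, shape, strides, buffer_size):
    buf = [0] * buffer_size  # stands in for tensor.empty()
    def scatter(dim, base, vals):
        if dim == len(shape):
            buf[base] = vals
            return
        for i in range(shape[dim]):
            scatter(dim + 1, base + i * strides[dim], vals[i])
    scatter(0, 0, logical)
    return buf

# Logical 2x2 written with long row stride 4: rows land at offsets 0 and 4,
# and positions 2-3 of each row stay untouched (padding).
assert as_underlying([[1, 2], [3, 4]], [2, 2], [4, 1], 8) == [1, 2, 0, 0, 3, 4, 0, 0]
```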
Test Plan
Added e2e and lit tests.
Test Result
Both e2e and lit tests pass.
Submission Checklist