Skip to content

fix: allow deleting cross-subscription snapshots#2515

Merged
andyzhangx merged 2 commits intokubernetes-sigs:masterfrom
RomanBednar:OCPBUGS-54897
Apr 29, 2025
Merged

fix: allow deleting cross-subscription snapshots#2515
andyzhangx merged 2 commits intokubernetes-sigs:masterfrom
RomanBednar:OCPBUGS-54897

Conversation

@RomanBednar
Copy link
Copy Markdown
Contributor

@RomanBednar RomanBednar commented Apr 15, 2025

DeleteSnapshot should try to get subscription ID from snapshot ID before proceeding with deletion to ensure we can delete a snapshot that could have been created under different subscription ID.

If a cross-subscription snapshot deletion is attempted without it, the driver will use the default subscription ID and fail with ResourceGroupNotFound when Delete is executed with file share client.

This is relevant only for ARM/Control Plane client authentication.

What type of PR is this?
/kind bug

What this PR does / why we need it:

See #2514 for details.

This requires source volume id contains subscription id, which means you need to set subscriptionID: xxx in your storage class to create pvc/pv, by that way, the snapshot id would also contain subscription id:

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azurefile-csi-test-16-sub
parameters:
  resourceGroup: rg-cross-subs
  subscriptionID: a6665e84-xxxxxx
provisioner: file.csi.azure.com
reclaimPolicy: Delete
volumeBindingMode: Immediate

Which issue(s) this PR fixes:

Fixes #2514

Requirements:

Special notes for your reviewer:

With this patch driver can delete the snapshot from different subscription correctly:

I0415 11:25:01.035952       1 utils.go:105] GRPC call: /csi.v1.Controller/DeleteSnapshot
I0415 11:25:01.035983       1 utils.go:106] GRPC request: {"snapshot_id":"rg-cross-subs#f5742e4fc484944eda71ed0#pvc-3a3cb10f-8d92-4363-9c30-aa7c28144502###default#a6665e84-xxxxxx#2025-04-15T11:06:21.0000000Z"}
I0415 11:25:01.384978       1 controllerserver.go:1073] delete snapshot(2025-04-15T11:06:21.0000000Z) successfully
I0415 11:25:01.385022       1 utils.go:112] GRPC response: {}

Tested on OpenShift only.

Release note:

none

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 15, 2025
@k8s-ci-robot k8s-ci-robot requested review from ZeroMagic and cvvz April 15, 2025 11:22
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Apr 15, 2025
DeleteSnapshot should try to get subscription ID from snapshot ID
before proceeding with deletion to ensure we can delete a snapshot
that could have been created under different subscription ID.

If a cross-subscription snapshot deletion is attempted without it, the
driver will use the default subscription ID and fail with
`ResourceGroupNotFound` when `Delete` is executed with file share
client.

This is relevant only for ARM/Control Plane client authentication.
Copy link
Copy Markdown
Member

@andyzhangx andyzhangx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/retest

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Apr 16, 2025
Copy link
Copy Markdown
Member

@andyzhangx andyzhangx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold
/approve cancel
/lgtm cancel

return nil, status.Error(codes.InvalidArgument, "Snapshot ID must be provided")
}
rgName, accountName, fileShareName, _, _, _, err := GetFileShareInfo(req.SnapshotId) //nolint:dogsled
rgName, accountName, fileShareName, _, _, subsID, err := GetFileShareInfo(req.SnapshotId) //nolint:dogsled
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately the snapshot id does not container subs id, e.g.

azurefile-csi-driver-test-201921e8-8897-4f15-9199-d480a5c899fb#f84252b3b42624bd1a4f423#deletesnapshot-volume-1-4da0a823-4cb36f05###default#2025-04-16T13:08:25.0000000Z

I think supporting cross subscription for snapshot is a new feature, need to append subs id in creating snapshot

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now...the subs id is there only when the origin pv had it. But that means just adding the subs id to every new snapshot is not sufficient, because there can be existing snapshots without the subs id.

Tried a few ideas today, but not sure how to fix this safely yet.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no worry, I will fix it

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andyzhangx Do you have some estimate on when you'll be able to merge the fix? We're nearing OpenShift 4.19 release and need to decide whether we document a workaround (manual removal) or cherry-pick the fix.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changing volumeID format is risky, I will only make such change in csi driver v1.33.0, so I think you need to document a workaround (manual removal)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thank you.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, if sourceVolumeID already contains the subs id, I think my new commit in this PR would fix the issue, it just adds isValidSubscriptionID func to validate whether it's subs id or not, that's quite safe change.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if test pass, I will cherrypick this change to v1.32 release

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Apr 16, 2025
@andyzhangx
Copy link
Copy Markdown
Member

I0416 10:20:12.248342       1 utils.go:105] GRPC call: /csi.v1.Controller/DeleteSnapshot
I0416 10:20:12.248359       1 utils.go:106] GRPC request: {"snapshot_id":"capz-u95c6w#fcd3f2df074414678add6cb#pvc-f24ba0f1-a2f9-49e9-a200-6d0febf4ef87###azurefile-9494#2025-04-16T10:15:50.0000000Z"}
I0416 10:20:12.329341       1 azure_metrics.go:105] "Observed Request Latency" latency_seconds=0.080748305 request="azurefile_csi_driver_controller_delete_snapshot" resource_group="capz-u95c6w" subscription_id="2025-04-16T10:15:50.0000000Z" source="file.csi.azure.com" snapshot_id="capz-u95c6w#fcd3f2df074414678add6cb#pvc-f24ba0f1-a2f9-49e9-a200-6d0febf4ef87###azurefile-9494#2025-04-16T10:15:50.0000000Z" result_code="failed_csi_driver_controller_delete_snapshot"
E0416 10:20:12.329367       1 utils.go:110] GRPC error: rpc error: code = Internal desc = failed to delete snapshot(2025-04-16T10:15:50.0000000Z): DELETE https://management.azure.com/subscriptions/2025-04-16T10:15:50.0000000Z/resourceGroups/capz-u95c6w/providers/Microsoft.Storage/storageAccounts/fcd3f2df074414678add6cb/fileServices/default/shares/pvc-f24ba0f1-a2f9-49e9-a200-6d0febf4ef87
--------------------------------------------------------------------------------
RESPONSE 400: 400 Bad Request
ERROR CODE: InvalidSubscriptionId
--------------------------------------------------------------------------------
{

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Apr 29, 2025
@andyzhangx andyzhangx changed the title allow deleting cross-subscription snapshots fix: allow deleting cross-subscription snapshots Apr 29, 2025
Copy link
Copy Markdown
Member

@andyzhangx andyzhangx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 29, 2025
Copy link
Copy Markdown
Member

@andyzhangx andyzhangx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 29, 2025
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andyzhangx, RomanBednar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 29, 2025
@andyzhangx andyzhangx merged commit fb0637b into kubernetes-sigs:master Apr 29, 2025
28 checks passed
@andyzhangx
Copy link
Copy Markdown
Member

/cherrypick release-1.32

@k8s-infra-cherrypick-robot
Copy link
Copy Markdown

@andyzhangx: new pull request created: #2551

Details

In response to this:

/cherrypick release-1.32

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@andyzhangx
Copy link
Copy Markdown
Member

/cherrypick release-1.31

@k8s-infra-cherrypick-robot
Copy link
Copy Markdown

@andyzhangx: #2515 failed to apply on top of branch "release-1.31":

Applying: allow deleting cross-subscription snapshots
Applying: fix: use isValidSubscriptionID to check
Using index info to reconstruct a base tree...
M	pkg/azurefile/controllerserver.go
M	pkg/azurefile/utils.go
M	pkg/azurefile/utils_test.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/azurefile/utils_test.go
CONFLICT (content): Merge conflict in pkg/azurefile/utils_test.go
Auto-merging pkg/azurefile/utils.go
Auto-merging pkg/azurefile/controllerserver.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0002 fix: use isValidSubscriptionID to check

Details

In response to this:

/cherrypick release-1.31

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

snapshot fails to delete when using cross subscriptions

4 participants