chore: add CSI-specific Prometheus metrics #3464

Merged
k8s-ci-robot merged 2 commits into kubernetes-sigs:master from hccheng72:add-csi-metrics
Jan 21, 2026

Conversation

@hccheng72
Contributor

@hccheng72 hccheng72 commented Jan 12, 2026

What type of PR is this?
/kind feature

What this PR does / why we need it:
This PR introduces dedicated CSI operation metrics for the AzureDisk CSI driver, separate from the existing Azure cloud provider API metrics, enabling better observability of CSI operations.

Which issue(s) this PR fixes:

Fixes #

Requirements:

Special notes for your reviewer:

Release note:

none

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 12, 2026
@k8s-ci-robot
Contributor

Hi @hccheng72. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 12, 2026
@landreasyan landreasyan requested a review from Copilot January 12, 2026 21:13

Copilot AI left a comment

Pull request overview

This pull request adds CSI-specific Prometheus metrics to the Azure Disk CSI Driver, replacing the previous cloud provider metrics implementation. The changes aim to provide better observability for CSI controller and node operations with dedicated metric types and labels.

Changes:

  • Introduces a new pkg/metrics package with controller and node operation metrics (counters, histograms, and gauges)
  • Replaces cloud provider metrics with CSI-specific metrics across controller and node server operations
  • Updates documentation to describe the new metrics and provide usage examples
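The counter part of the change list above can be illustrated with a standard-library-only sketch. It is not the PR's `pkg/metrics` code, which is not shown in this conversation and presumably builds on a Prometheus client library. `OperationRecorder`, `Observe`, the metric name `example_csi_operations_total`, and the `failed` label value are all hypothetical; only the `succeeded` value appears in the logs quoted in this thread. Histograms and gauges are omitted.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// seriesKey identifies one counter series by its label values.
type seriesKey struct {
	operation string
	result    string
}

// OperationRecorder is a hypothetical stand-in for a CSI metrics package.
// It counts finished operations per (operation, result) pair.
type OperationRecorder struct {
	mu     sync.Mutex
	counts map[seriesKey]uint64
}

// NewOperationRecorder returns an empty recorder.
func NewOperationRecorder() *OperationRecorder {
	return &OperationRecorder{counts: make(map[seriesKey]uint64)}
}

// Observe records one finished operation and derives the result label
// from err.
func (r *OperationRecorder) Observe(operation string, err error) {
	result := "succeeded"
	if err != nil {
		result = "failed" // assumed label value, not taken from the PR
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.counts[seriesKey{operation: operation, result: result}]++
}

// Count returns the current value of one series (0 if never observed).
func (r *OperationRecorder) Count(operation, result string) uint64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.counts[seriesKey{operation: operation, result: result}]
}

// Lines renders every series in a Prometheus-like text form, sorted so
// the output is stable across runs.
func (r *OperationRecorder) Lines() []string {
	r.mu.Lock()
	defer r.mu.Unlock()
	lines := make([]string, 0, len(r.counts))
	for k, n := range r.counts {
		lines = append(lines, fmt.Sprintf(
			"example_csi_operations_total{operation=%q,result=%q} %d",
			k.operation, k.result, n))
	}
	sort.Strings(lines)
	return lines
}

// errExample is a placeholder failure used only in this sketch.
var errExample = errors.New("example failure")

func main() {
	rec := NewOperationRecorder()
	rec.Observe("controller_create_volume", nil)
	rec.Observe("controller_create_volume", nil)
	rec.Observe("controller_create_volume", errExample)
	for _, line := range rec.Lines() {
		fmt.Println(line)
	}
}
```

Keying each series by operation and result keeps success and failure counts separate, which is the usual reason to prefer labels over one metric name per outcome.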

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 16 comments.

| File | Description |
| --- | --- |
| pkg/metrics/metrics.go | New metrics package defining controller and node operation metrics with registration and helper functions |
| pkg/azurediskplugin/main.go | Adds metrics registration at driver startup |
| pkg/azuredisk/nodeserver.go | Replaces cloud provider metrics with CSI metrics for node operations (stage, unstage, publish, unpublish, expand) |
| pkg/azuredisk/controllerserver.go | Replaces cloud provider metrics with CSI metrics for controller operations (create, delete, modify, publish, unpublish, expand, snapshot) |
| deploy/example/metrics/README.md | Comprehensive update with new metric descriptions and usage examples |

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 13, 2026
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 13, 2026
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 13, 2026
@landreasyan landreasyan requested a review from Copilot January 13, 2026 20:44

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@landreasyan
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 13, 2026
@hccheng72 hccheng72 marked this pull request as ready for review January 14, 2026 00:14
@hccheng72 hccheng72 changed the title from "[WIP]chore: add CSI-specific Prometheus" to "chore: add CSI-specific Prometheus" Jan 14, 2026
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 14, 2026

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.


@andyzhangx
Member

/hold
see comments here: kubernetes-sigs/azurefile-csi-driver#2943 (comment)

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 16, 2026
@landreasyan landreasyan requested a review from Copilot January 17, 2026 00:01

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.


Member

@andyzhangx andyzhangx left a comment

The latency logs are still missing some key columns; could you add them back? Existing Kusto queries rely on these columns:

  • origin
    I0116 17:11:51.778446 1 azure_metrics.go:106] "Observed Request Latency" logger="logLatency" latency_seconds=11.006767806 request="azuredisk_csi_driver_controller_create_volume" resource_group="capz-iequs8" subscription_id="46678f10-4bbb-447e-98e8-d2829589f2d8" source="disk.csi.azure.com" volumeid="/subscriptions/46678f10-4bbb-447e-98e8-d2829589f2d8/resourceGroups/capz-iequs8/providers/Microsoft.Compute/disks/pvc-91448383-b866-4255-9ea5-e54f9681e12a" result_code="succeeded"

  • with this PR:
    I0116 23:59:07.521827 1 metrics.go:130] "Observed Request Latency" logger="logLatency" request="azuredisk_csi_driver_controller_create_volume" latency_seconds=5.980431424 result_code="succeeded"
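Comparing the two lines above, the new log drops `resource_group`, `subscription_id`, `source`, and `volumeid`. A minimal sketch of a formatter that emits every column of the original line, in the same order, might look like the following. It uses only the Go standard library rather than the driver's logging package, and `LatencyFields` and `FormatLatencyLog` are hypothetical names, not identifiers from the PR. The values in `main` are placeholders.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// LatencyFields holds the columns seen in the original log line quoted in
// the review comment. The type is hypothetical, not the driver's API.
type LatencyFields struct {
	LatencySeconds float64
	Request        string
	ResourceGroup  string
	SubscriptionID string
	Source         string
	VolumeID       string
	ResultCode     string
}

// FormatLatencyLog renders the fields as key=value pairs in the same key
// order as the original "Observed Request Latency" line, so queries that
// depend on those columns keep matching.
func FormatLatencyLog(f LatencyFields) string {
	var b strings.Builder
	b.WriteString(`"Observed Request Latency" logger="logLatency"`)
	b.WriteString(" latency_seconds=")
	b.WriteString(strconv.FormatFloat(f.LatencySeconds, 'f', -1, 64))
	pairs := [][2]string{
		{"request", f.Request},
		{"resource_group", f.ResourceGroup},
		{"subscription_id", f.SubscriptionID},
		{"source", f.Source},
		{"volumeid", f.VolumeID},
		{"result_code", f.ResultCode},
	}
	for _, kv := range pairs {
		// %q double-quotes the value, matching the quoted style of the log.
		fmt.Fprintf(&b, " %s=%q", kv[0], kv[1])
	}
	return b.String()
}

func main() {
	fmt.Println(FormatLatencyLog(LatencyFields{
		LatencySeconds: 1.5,
		Request:        "azuredisk_csi_driver_controller_create_volume",
		ResourceGroup:  "example-rg",           // placeholder value
		SubscriptionID: "example-subscription", // placeholder value
		Source:         "disk.csi.azure.com",
		VolumeID:       "/subscriptions/example-subscription/disks/example-disk", // placeholder value
		ResultCode:     "succeeded",
	}))
}
```

Fixing the key order in one place makes it harder for a later refactor to drop a column silently, which is the regression the reviewer flagged.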

Member

@andyzhangx andyzhangx left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 21, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andyzhangx, hccheng72

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 21, 2026
@andyzhangx
Member

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 21, 2026
@k8s-ci-robot k8s-ci-robot merged commit 24bb225 into kubernetes-sigs:master Jan 21, 2026
23 checks passed
@andyzhangx
Member

/cherrypick release-1.34

@andyzhangx
Member

/cherrypick release-1.33

@k8s-infra-cherrypick-robot

@andyzhangx: cannot checkout release-1.34: error checking out "release-1.34": exit status 1 error: pathspec 'release-1.34' did not match any file(s) known to git

@k8s-infra-cherrypick-robot

@andyzhangx: #3464 failed to apply on top of branch "release-1.33":

Applying: chore: add csi specific metrics
Using index info to reconstruct a base tree...
M	pkg/azuredisk/controllerserver.go
M	pkg/azuredisk/nodeserver.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/azuredisk/nodeserver.go
Auto-merging pkg/azuredisk/controllerserver.go
CONFLICT (content): Merge conflict in pkg/azuredisk/controllerserver.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Patch failed at 0001 chore: add csi specific metrics

@andyzhangx
Member

/cherrypick release-1.34

@k8s-infra-cherrypick-robot

@andyzhangx: cannot checkout release-1.34: error checking out "release-1.34": exit status 1 error: pathspec 'release-1.34' did not match any file(s) known to git

@andyzhangx
Member

@hccheng72 can you also add the metrics log logic back in blob csi driver? thx

@andyzhangx
Member

/cherrypick release-1.34

@k8s-infra-cherrypick-robot

@andyzhangx: new pull request created: #3475

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

6 participants