chore: add CSI-specific Prometheus metrics #3464
k8s-ci-robot merged 2 commits into kubernetes-sigs:master
Hi @hccheng72. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test.
Pull request overview
This pull request adds CSI-specific Prometheus metrics to the Azure Disk CSI Driver, replacing the previous cloud provider metrics implementation. The changes aim to provide better observability for CSI controller and node operations with dedicated metric types and labels.
Changes:
- Introduces a new `pkg/metrics` package with controller and node operation metrics (counters, histograms, and gauges)
- Replaces cloud provider metrics with CSI-specific metrics across controller and node server operations
- Updates documentation to describe the new metrics and provide usage examples
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 16 comments.
| File | Description |
|---|---|
| pkg/metrics/metrics.go | New metrics package defining controller and node operation metrics with registration and helper functions |
| pkg/azurediskplugin/main.go | Adds metrics registration at driver startup |
| pkg/azuredisk/nodeserver.go | Replaces cloud provider metrics with CSI metrics for node operations (stage, unstage, publish, unpublish, expand) |
| pkg/azuredisk/controllerserver.go | Replaces cloud provider metrics with CSI metrics for controller operations (create, delete, modify, publish, unpublish, expand, snapshot) |
| deploy/example/metrics/README.md | Comprehensive update with new metric descriptions and usage examples |
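
The pattern summarized above (wrap each CSI operation, then record a count and a latency per operation and result) can be sketched as follows. This is a minimal illustration, not the PR's code: `OpRecorder`, `Track`, `Observe`, and `Count` are hypothetical names, and a plain in-memory map stands in for the Prometheus counter and histogram so that the sketch has no external dependencies.

```go
package main

import (
	"sync"
	"time"
)

// opKey is one label combination: the operation name plus its outcome.
type opKey struct {
	Operation string
	Success   bool
}

// OpRecorder is a hypothetical in-memory stand-in for a counter and a
// latency sum keyed by operation and result. It is not the pkg/metrics API.
type OpRecorder struct {
	mu       sync.Mutex
	counts   map[opKey]int
	totalSec map[opKey]float64
}

// NewOpRecorder returns an empty recorder.
func NewOpRecorder() *OpRecorder {
	return &OpRecorder{
		counts:   map[opKey]int{},
		totalSec: map[opKey]float64{},
	}
}

// Observe records one finished operation and how long it took.
func (r *OpRecorder) Observe(operation string, success bool, d time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	k := opKey{Operation: operation, Success: success}
	r.counts[k]++
	r.totalSec[k] += d.Seconds()
}

// Count returns how many operations were recorded for the given labels.
func (r *OpRecorder) Count(operation string, success bool) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.counts[opKey{Operation: operation, Success: success}]
}

// Track runs fn, measures its duration, and records success when fn
// returns a nil error and failure otherwise.
func (r *OpRecorder) Track(operation string, fn func() error) error {
	start := time.Now()
	err := fn()
	r.Observe(operation, err == nil, time.Since(start))
	return err
}
```

A server method would call something like `rec.Track("azuredisk_csi_driver_controller_create_volume", func() error { ... })` around its body; in a real driver the map updates would be replaced by calls on registered Prometheus collectors.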
4b9338e to 58ac070
58ac070 to 6de0dc3
/ok-to-test
6de0dc3 to 2941d32
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.
/hold
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
The latency logs are still missing some key columns. Could you add them back? Existing Kusto queries rely on these columns:
- origin:
  I0116 17:11:51.778446 1 azure_metrics.go:106] "Observed Request Latency" logger="logLatency" latency_seconds=11.006767806 request="azuredisk_csi_driver_controller_create_volume" resource_group="capz-iequs8" subscription_id="46678f10-4bbb-447e-98e8-d2829589f2d8" source="disk.csi.azure.com" volumeid="/subscriptions/46678f10-4bbb-447e-98e8-d2829589f2d8/resourceGroups/capz-iequs8/providers/Microsoft.Compute/disks/pvc-91448383-b866-4255-9ea5-e54f9681e12a" result_code="succeeded"
- with this PR:
  I0116 23:59:07.521827 1 metrics.go:130] "Observed Request Latency" logger="logLatency" request="azuredisk_csi_driver_controller_create_volume" latency_seconds=5.980431424 result_code="succeeded"
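
Comparing the two lines, the new output drops `resource_group`, `subscription_id`, `source`, and `volumeid`. One way to keep the column set stable is to build the key/value pairs in a single place, as in the sketch below. `LatencyEvent`, `KeysAndValues`, and `formatPairs` are hypothetical names used only for illustration; the alternating key/value slice is the shape that structured loggers such as klog's `InfoS` accept, but whether the driver builds its log this way is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// LatencyEvent holds the columns that appear in the "origin" log line.
type LatencyEvent struct {
	LatencySeconds float64
	Request        string
	ResourceGroup  string
	SubscriptionID string
	Source         string
	VolumeID       string
	ResultCode     string
}

// KeysAndValues returns alternating keys and values in the same column
// order as the origin log line.
func (e LatencyEvent) KeysAndValues() []interface{} {
	return []interface{}{
		"latency_seconds", e.LatencySeconds,
		"request", e.Request,
		"resource_group", e.ResourceGroup,
		"subscription_id", e.SubscriptionID,
		"source", e.Source,
		"volumeid", e.VolumeID,
		"result_code", e.ResultCode,
	}
}

// formatPairs renders the pairs as space-separated key=value text,
// quoting string values. It exists only to make the sketch inspectable.
func formatPairs(kv []interface{}) string {
	parts := make([]string, 0, len(kv)/2)
	for i := 0; i+1 < len(kv); i += 2 {
		switch v := kv[i+1].(type) {
		case string:
			parts = append(parts, fmt.Sprintf("%v=%q", kv[i], v))
		default:
			parts = append(parts, fmt.Sprintf("%v=%v", kv[i], v))
		}
	}
	return strings.Join(parts, " ")
}
```

Because every caller goes through `KeysAndValues`, a query that filters on `resource_group` or `volumeid` keeps working when the logging call site moves to a different file.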
84055e9 to 619aaec
619aaec to db9e4fe
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: andyzhangx, hccheng72
/hold cancel
/cherrypick release-1.34
/cherrypick release-1.33
@andyzhangx: cannot checkout
@andyzhangx: #3464 failed to apply on top of branch "release-1.33":
/cherrypick release-1.34
@andyzhangx: cannot checkout
@hccheng72 can you also add the metrics log logic back in the blob CSI driver? Thanks.
/cherrypick release-1.34
@andyzhangx: new pull request created: #3475
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR introduces dedicated CSI operation metrics for the AzureDisk CSI driver, separate from the existing Azure cloud provider API metrics, enabling better observability of CSI operations.
Which issue(s) this PR fixes:
Fixes #
Requirements:
Special notes for your reviewer:
Release note: