Feature Request
If this is a feature request, please fill out the following form in full:
Describe the problem the feature is intended to solve
Currently, TensorFlow Serving exports per-model metrics like the following:
...
:tensorflow:serving:request_count{model_name="test_model",status="OK"} 6
...
:tensorflow:serving:request_latency_bucket{model_name="test_model",API="predict",entrypoint="REST",le="10"} 0
:tensorflow:serving:request_latency_bucket{model_name="test_model",API="predict",entrypoint="REST",le="18"} 0
...
:tensorflow:serving:runtime_latency_bucket{model_name="test_model",API="Predict",runtime="TF1",le="10"} 0
:tensorflow:serving:runtime_latency_bucket{model_name="test_model",API="Predict",runtime="TF1",le="18"} 0
:tensorflow:serving:runtime_latency_bucket{model_name="test_model",API="Predict",runtime="TF1",le="32.4"} 0
...
We cannot collect metrics per signature, even though the latencies of different signatures can be very different.
Related code (serving/tensorflow_serving/servables/tensorflow/util.h, lines 118 to 123 in 21360c7):

void RecordRuntimeLatency(const string& model_name, const string& api,
                          const string& runtime, int64_t latency_usec);

void RecordRequestLatency(const string& model_name, const string& api,
                          const string& entrypoint, int64_t latency_usec);
Describe the solution
It would be better if runtime latency and request latency were recorded together with the signature name, e.g. by adding a signature_name label to the exported metrics.
Describe alternatives you've considered
Additional context