stats: support custom histogram bucket #2539
message ProxyStatsHistogramBucketSetting {
  // Specifies a matcher for stats and the buckets that matching stats should use.
  // The match is applied to the original stat name before tag-extraction,
  // for example `cluster.exampleclustername.upstream_cx_length_ms`.
Does this apply for istio_request_total type metrics? Or only envoy ones?
istio/istio#41441 changes the implementation of istio.stats (with the Telemetry API); after that, this works for all stats.
ok
Telemetry already has a way of selecting metrics. I am a bit worried about using a different mechanism..
How can it be applied to a counter metric?
It only works for Histogram metrics.
I want to re-iterate my concern that we are creating a new mechanism for selecting metrics when we already have one.
Unlike the Telemetry API, the match is applied to the original stat name before tag-extraction.
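To make the "before tag-extraction" point concrete, here is a minimal Go sketch. It is not the code from this PR: the `StringMatch` fields (`Exact`, `Prefix`, `Regex`), the `BucketSetting` type, the `bucketsFor` helper, the first-match-wins ordering, and the bucket values are all assumptions made for illustration. It shows a prefix matcher hitting the raw stat name but missing the tag-extracted form, where the cluster name has moved into a tag.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// StringMatch is a hypothetical stand-in for the matcher discussed in this PR.
// Exactly one field is expected to be set; matching is case-sensitive.
type StringMatch struct {
	Exact  string
	Prefix string
	Regex  string
}

// Matches reports whether the given stat name satisfies the matcher.
func (m StringMatch) Matches(name string) bool {
	switch {
	case m.Exact != "":
		return name == m.Exact
	case m.Prefix != "":
		return strings.HasPrefix(name, m.Prefix)
	case m.Regex != "":
		// Anchor the pattern so that it has to match the whole name.
		re, err := regexp.Compile("^(?:" + m.Regex + ")$")
		return err == nil && re.MatchString(name)
	}
	return false
}

// BucketSetting pairs a matcher with histogram bucket upper bounds.
type BucketSetting struct {
	Match   StringMatch
	Buckets []float64
}

// bucketsFor returns the buckets of the first setting whose matcher accepts
// the stat name, or the defaults if none does (ordering is an assumption).
func bucketsFor(statName string, settings []BucketSetting, defaults []float64) []float64 {
	for _, s := range settings {
		if s.Match.Matches(statName) {
			return s.Buckets
		}
	}
	return defaults
}

func main() {
	settings := []BucketSetting{
		{Match: StringMatch{Prefix: "cluster.exampleclustername."}, Buckets: []float64{1, 5, 10}},
	}
	defaults := []float64{0.5, 1, 5, 10, 25} // illustrative values only

	// Raw name: the cluster name is still part of the stat name, so the prefix matches.
	fmt.Println(bucketsFor("cluster.exampleclustername.upstream_cx_length_ms", settings, defaults))
	// Tag-extracted form: the cluster name is no longer in the name, so the defaults apply.
	fmt.Println(bucketsFor("cluster.upstream_cx_length_ms", settings, defaults))
}
```

A selector that works on tag-extracted metric names and tags would therefore need a different matcher to target the same histogram.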
@hzxuzhonghu @howardjohn should support this on |
ProxyConfig was meant to include only xDS client metadata, but it was later extended to include some other data plane config, which is not covered by API. In this case, the Telemetry API can apply to a specific proxy, so I feel proxy-related metric settings should be there. |
this's actually part of statsconfig in bootstrap, similar to |
option go_package="istio.io/api/type/v1alpha1";

// Describes how to match a given string. Match is case-sensitive.
message StringMatch {
I think there is already such an API, which is used by VS.
There are two StringMatch messages in this repo; could they share the same API, just like WorkloadSelector?
IC, |
47b37e1
to
c3637a1
Compare
LGTM. I think it's useful for customizing standard Envoy metrics, too, so it can't be in just the stats filter. Looks like it's mostly verbatim copied into bootstrap, so it should be noted that it requires a proxy restart. |
update notes |
This is a bit confusing - we were trying to move away from ProxyConfig options for telemetry in favor of the Telemetry API. Why is the new API not used? Secondly, it's unclear which metrics it affects. What happens if you update a histogram metric's buckets? Can you make sense of old and new data? Does it use the same name? |
This is actually part of bootstrap, which the Telemetry API cannot configure.
It will not change metric names; it only changes the default buckets of histogram metrics. |
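As a rough illustration of what changing buckets means for the data, the Go sketch below counts the same observations under two sets of upper bounds, in the cumulative style used by Prometheus-like histograms. The observations and both bound sets are made up, `cumulativeCounts` is a hypothetical helper, and this is not how Envoy computes histograms internally.

```go
package main

import "fmt"

// cumulativeCounts returns, for each upper bound, how many observations are
// less than or equal to that bound.
func cumulativeCounts(obs, bounds []float64) []int {
	counts := make([]int, len(bounds))
	for _, v := range obs {
		for i, b := range bounds {
			if v <= b {
				counts[i]++
			}
		}
	}
	return counts
}

func main() {
	obs := []float64{3, 8, 40, 120, 700} // hypothetical latencies in ms

	oldBounds := []float64{5, 10, 25, 50, 100, 250, 500, 1000} // illustrative "before"
	newBounds := []float64{10, 100, 1000}                      // illustrative "after"

	fmt.Println(oldBounds, cumulativeCounts(obs, oldBounds))
	fmt.Println(newBounds, cumulativeCounts(obs, newBounds))
}
```

In this sketch the metric name stays the same and only the set of bounds differs. Bounds present in both sets (10, 100, 1000) yield the same counts for the same observations, while bounds that were dropped simply stop being reported.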
886f4b0
to
dbb67de
Compare
kindly ping @nrjpoddar |
Telemetry API cannot change configuration in bootstrap, BDS(bootstrap discovery request) also will not help in this scenario. |
Overall LGTM but would need Telemetry WG experts to give their approval.
  // Specifies a matcher for stats and the buckets that matching stats should use.
  // The match is applied to the original stat name before tag-extraction,
  // for example `cluster.exampleclustername.upstream_cx_length_ms`.
  istio.type.v1alpha1.StringMatch match = 1;
Why a new type?
There are two StringMatch protos in this repo; could they share the same API, just like WorkloadSelector?
I can reuse the one with VS.
I have a few concerns here:
- ProxyConfig vs Telemetry - not sure there is a clear answer on why it's kept in ProxyConfig
- the regex matching - I don't think it's a good idea in general, but for metrics in particular.
- the shape of the API - I think it's reasonable to customize metrics, but I would expect an API where for each metric (by name - we don't have that many) we would have some settings associated to that metric - buckets, etc.
But the most important concern: we are in the middle of a few major changes, with OpenTelemetry maturing and Ambient and K8S Gateway happening. We also know that Istio telemetry 'cardinality' and its mismatch with other telemetry standards is a problem.
One of the options to improve and better integrate with OTel is, for example, to generate the high-cardinality metrics from OTel access logs (historically Istio metrics were designed with Mixer, which was access log based, so this was less of a problem).
I would really like to see a doc or have a discussion about the future of telemetry and how it fits with the evolving world around us - and which APIs belong in Istio and which in OpenTelemetry.
Regarding the bootstrap - I think this is an implementation detail of Envoy that leaks into the API without any good reason. We can use any of the Istio APIs in bootstrap - we just need to inject them into the pod like we do for ProxyConfig, or get them over XDS before starting Envoy (the agent already gets the root CAs from Istiod). We can get all the config in the agent. Ambient, on the other hand, is likely to not have any injection at all - and a lot of the changes that depend on injecting |
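AFAIK, Telemetry used to generate configuration for stats filter, and as a part of bootstrap config, that's why I put this on ProxyConfig.
I agree that regex is not a good idea, but it's widely used in Envoy's statistics system.
That's another topic, which is too large for this PR (IMO, this is a minor improvement). |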
On Fri, Dec 9, 2022 at 11:31 PM zirain ***@***.***> wrote:
>> ProxyConfig vs Telemetry - not sure there is a clear answer on why it's kept in ProxyConfig
> AFAIK, Telemetry used to generate configuration for stats filter, and as a part of bootstrap config, that's why I put this on ProxyConfig.
We keep confusing implementation details in Envoy with the Istio API. The fact that Envoy happens to configure something in bootstrap should not have any impact on the API surface. The agent can get telemetry configs and generate whatever bootstrap it needs - and when we move to ztunnel in ambient it may generate whatever Rust needs.
>> the regex matching - I don't think it's a good idea in general, but for metrics in particular.
> I agree that regex is not a good idea, but it's widely used in Envoy's statistics system.
Envoy configs are huge and usually just a reflection of its internal implementation details. There are a lot of features in Envoy we choose not to expose in the Istio API - and even more that won't be exposed in K8S Gateway or OTel. The Istio API should be implementable with ztunnel or proxyless gRPC. If users absolutely need an Envoy config, we have EnvoyFilter, and I believe it is now possible to also have the bootstrap generated dynamically by Istiod, for users who want to take strong dependencies on Envoy's internal implementation.
>> I would really like to see a doc or have a discussion about the future of telemetry and how it fits with the evolving world around us - and which APIs belong in Istio and which in open telemetry.
> That's another topic, which is too large for this PR (IMO, this is a minor improvement).
> Before v1.17, this won't work on a wasm filter; it's not a bad idea to provide an API from the beginning of switching to the native filter.
Any API we add is going to make the transition to ambient and OTel a bit harder. I have nothing against the feature - it seems quite useful - but we must take into account what happens outside of Istio and where we want to go long term.
For example, if this histogram setting will be used for TCP metrics - how would they be implemented by ztunnel (where there is no ProxyConfig, since it's shared per node, and no Envoy)? And if we switch to OTel - does the collector or instrumentation have a similar config?
|
To resolve some ambiguities:
|
LGTM. |
@zirain: PR needs rebase. |
We cannot reach an agreement, so closing this for now. |
After istio/istio#41441, add a first-class API to support custom histogram buckets.
cc @kyessenov