Commit 9000b84

bmiguel-teixeira authored and cparkins committed
[exporter/prometheusremotewrite] fix/sanitize exponential retry defaults (open-telemetry#30286)
**Description:** This PR sanitizes the default retry settings by calling the default constructor and overriding only `InitialInterval`, instead of initializing the entire structure manually.

The current default settings never become exponential because `MaxInterval: 200 * time.Millisecond` puts an upper bound of 200ms on the backoff. This caps the wait between requests at a very low interval, so the behavior is closer to a linear retry than an exponential one. Because the OTel Collector is distributed and each metric export is handled by its own goroutine, this caused a massive churn of requests whenever the remote endpoint was unavailable.

*Before/Current*
```
first backoff 673.174543ms
elapsed 83ns next backoff 25.476545ms
elapsed 44.95825ms next backoff 134.374813ms
elapsed 217.756375ms next backoff 120.396777ms
elapsed 475.136125ms next backoff 130.842218ms
elapsed 607.402958ms next backoff 170.769874ms
elapsed 836.674125ms next backoff 201.503411ms
elapsed 938.644667ms next backoff 166.184906ms
elapsed 1.186788375s next backoff 143.509999ms
elapsed 1.296595125s next backoff 190.37078ms
. . .
elapsed 59.34433225s next backoff 298.715693ms
elapsed 59.533727458s next backoff 136.844597ms
total requests made: 299
```

*After this change*
```
first backoff 343.109367ms
elapsed 166ns next backoff 55.833932ms
elapsed 74.545375ms next backoff 151.276125ms
elapsed 192.138833ms next backoff 364.656003ms
elapsed 744.091666ms next backoff 564.927763ms
elapsed 1.75722475s next backoff 1.248209052s
elapsed 4.077209583s next backoff 3.374196626s
elapsed 8.769952708s next backoff 8.449184533s
elapsed 23.303548708s next backoff 7.341400254s
elapsed 43.388838083s next backoff 22.327511322s
elapsed 1m14.147837833s next backoff 43.127363668s
elapsed 1m55.730420666s next backoff 34.674349715s
elapsed 2m16.703792375s next backoff 40.140627107s
elapsed 2m31.856713458s next backoff 27.45308936s
elapsed 2m49.376459416s next backoff 29.853555518s
elapsed 3m14.625332666s next backoff 17.209739491s
elapsed 3m37.914321833s next backoff 15.614405319s
elapsed 4m14.614882458s next backoff 35.458886397s
elapsed 4m33.424847s next backoff -1ns
total requests made: 18
```

**Link to tracking Issue:** No issues opened from my side.

**Testing:** Unit tests already in place.

**Documentation:** N/A
1 parent 6fbc8c2 commit 9000b84

2 files changed: +31 -9 lines changed

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+# Use this changelog template to create an entry for release notes.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: bug_fix
+
+# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
+component: prometheusremotewriteexporter
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: sanitize retry default settings
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+issues: [30286]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext:
+
+# If your change doesn't affect end users or the exported elements of any package,
+# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
+# Optional: The change log or logs in which this entry should be included.
+# e.g. '[user]' or '[user, api]'
+# Include 'user' if the change is relevant to end users.
+# Include 'api' if there is a change to a library API.
+# Default: '[user]'
+change_logs: [user]

exporter/prometheusremotewriteexporter/factory.go

Lines changed: 4 additions & 9 deletions
@@ -8,7 +8,6 @@ import (
 	"errors"
 	"time"
 
-	"github.com/cenkalti/backoff/v4"
 	"go.opentelemetry.io/collector/component"
 	"go.opentelemetry.io/collector/config/confighttp"
 	"go.opentelemetry.io/collector/config/configopaque"
@@ -67,19 +66,15 @@ func createMetricsExporter(ctx context.Context, set exporter.CreateSettings,
 }
 
 func createDefaultConfig() component.Config {
+	retrySettings := exporterhelper.NewDefaultRetrySettings()
+	retrySettings.InitialInterval = 50 * time.Millisecond
+
 	return &Config{
 		Namespace: "",
 		ExternalLabels: map[string]string{},
 		MaxBatchSizeBytes: 3000000,
 		TimeoutSettings: exporterhelper.NewDefaultTimeoutSettings(),
-		RetrySettings: exporterhelper.RetrySettings{
-			Enabled: true,
-			InitialInterval: 50 * time.Millisecond,
-			MaxInterval: 200 * time.Millisecond,
-			MaxElapsedTime: 1 * time.Minute,
-			RandomizationFactor: backoff.DefaultRandomizationFactor,
-			Multiplier: backoff.DefaultMultiplier,
-		},
+		RetrySettings: retrySettings,
 		AddMetricSuffixes: true,
 		SendMetadata: false,
 		HTTPClientSettings: confighttp.HTTPClientSettings{
