Skip to content

Conversation

anderseknert
Copy link
Member

Since we don't use OPA's HTTP API in Regal, I hadn't looked at this from a performance POV before, but it was a fun side quest :)

A pretty decent reduction of the baseline cost for the most common request type, v1/data POST. Changes:

  • Add NoOp implementation of Metrics for when metrics aren't needed
  • Cheaper metrics.New instantiation avoiding unnecessary lock
  • Avoid cost of decision logging if decision logging isn't enabled
  • Only call request.URL.Query() if URL.RawQuery isn't empty, avoiding allocating an empty map with each request

While the target was v1/data POST handling, some of the changes above have a positive impact on all or most handlers. All changes have been run through the existing tests, and a new benchmark to measure the impact of the fixes have been added, showing:

13063 ns/op	   15162 B/op	     195 allocs/op - main
12796 ns/op	   14856 B/op	     189 allocs/op - avoid r.URL.Query() when no query provided
12541 ns/op	   14483 B/op	     187 allocs/op - decisionLogger.Log early exit if not enabled
12133 ns/op	   14235 B/op	     180 allocs/op - get revisions and init logger only if needed
11098 ns/op	   14207 B/op	     180 allocs/op - more efficient metrics.New() (without locking)
10683 ns/op	   13171 B/op	     169 allocs/op - use no-op metrics implementation when metrics aren't requested

Even if the impact is pretty good, it's worth noting that most of the improvements above are only seen when decision logging is turned off. While this is the common case for development, it's not in production. Getting the baseline cost down is important still as there should be no cost paid for features unused.

@anderseknert anderseknert force-pushed the v1-data-post-perf branch 2 times, most recently from f00473a to 5d5aba0 Compare June 9, 2025 06:02
Copy link
Contributor

@srenatus srenatus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks good! I'm afraid API-level refactors like this are a tad risky, because our test coverage might be less than great. So let's keep an eye out on reports when this is released, but I think we should be OK here. 😃

)

func (*noOpMetrics) Info() Info { return Info{Name: "<built-in no-op>"} }
func (*noOpMetrics) Timer(name string) Timer { return noOpTimerInstance }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, we could also do this

Suggested change
func (*noOpMetrics) Timer(name string) Timer { return noOpTimerInstance }
func (x *noOpMetrics) Timer(name string) Timer { return x }
func (*noOpMetrics) Start() {}
func (*noOpMetrics) Stop() int64 { return 0 }
func (*noOpMetrics) Value() any { return 0 }
func (*noOpMetrics) Int64() int64 { return 0 }

etc...but it would make little difference.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But x is not a timer, is it? 🤔

@anderseknert
Copy link
Member Author

@srenatus — agreed... although most of the changes feel pretty safe, there's always some risk. The no-op metrics is what I feel most uncertain about, as the metrics are in so many places and it's hard to know how they're used everywhere. I was thinking that perhaps this would break Prometheus metrics, but looking at that code it seems like it collects its own metrics. And if no decisions are being logged and the user doesn't ask for metrics, I think we should be able to skip collecting them 😅

Since we don't use the HTTP API in Regal, I hadn't looked at this from
a performance POV before, and it was a fun side quest :)

A pretty decent reduction of the baseline cost for the most common
request type, v1/data POST. Changes:

- Add NoOp implementation of `Metrics` for when metrics aren't needed
- Cheaper `metrics.New` instantiation avoiding unnecessary lock
- Avoid cost of decision logging if decision logging isn't enabled
- Only call request.URL.Query() if URL.RawQuery isn't empty, avoiding
  allocating an empty map with each request

While the target was v1/data POST handling, some of the changes above
have a positive impact on all or most handlers. All changes have been
run through the existing tests, and a new benchmark to measure the impact
of the fixes have been added, showing:

```
13063 ns/op	   15162 B/op	     195 allocs/op - main
12796 ns/op	   14856 B/op	     189 allocs/op - avoid r.URL.Query() when no query provided
12541 ns/op	   14483 B/op	     187 allocs/op - decisionLogger.Log early exit if not enabled
12133 ns/op	   14235 B/op	     180 allocs/op - get revisions and init logger only if needed
11098 ns/op	   14207 B/op	     180 allocs/op - more efficient metrics.New() (without locking)
10683 ns/op	   13171 B/op	     169 allocs/op - use no-op metrics implementation when metrics aren't requested
```

Even if the impact is pretty good, it's worth noting that most of the improvements
above are only seen when decision logging is turned off. While this is the common case
for development, it's not in production. Getting the baseline cost down is important still
as there should be no cost paid for features unused.

Signed-off-by: Anders Eknert <[email protected]>
@anderseknert anderseknert merged commit b1ce92c into open-policy-agent:main Jun 9, 2025
28 checks passed
@anderseknert anderseknert deleted the v1-data-post-perf branch June 9, 2025 10:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants