Fix running benchmarks (Issue 1386) #1393

itingliu · 2025-06-30T21:53:37Z

Fix the issues I encountered when trying to run benchmarks from the package.

With these changes, I can now run benchmarks inside the Benchmarks folder with the following command:

USE_PACKAGE=1 BENCHMARK_DISABLE_JEMALLOC=true swift package --disable-sandbox -c release benchmark run --filter "URL-Template-expansion"

Specifically:

Enable MemberImportVisibility so that the system Foundation doesn't get pulled in automatically when any of the dependencies import Foundation in their source code.
Create a shared benchmark file, InternationalizationBenchmark.swift, for all benchmark files inside the /Internationalization/ folder. Previously I was seeing "invalid redeclaration of 'benchmarks'" because BenchmarkCalendar.swift and BenchmarkLocale.swift are placed inside the same folder and both declare let benchmarks, which the Benchmark package doesn't support.
Replace NSTemporaryDirectory with the one from the swift-foundation package.

itingliu · 2025-06-30T21:54:36Z

@swift-ci please test

itingliu · 2025-06-30T21:56:00Z

Benchmarks/Package.swift

@@ -54,7 +54,7 @@ print("swift-foundation benchmarks: \(usePackage.description)")
 var packageDependency : [Package.Dependency] = [.package(url: "https://github.com/ordo-one/package-benchmark.git", from: "1.11.1")]
 var targetDependency : [Target.Dependency] = [.product(name: "Benchmark", package: "package-benchmark")]
 var i18nTargetDependencies : [Target.Dependency] = []
-var swiftSettings : [SwiftSetting] = []
+var swiftSettings : [SwiftSetting] = [.unsafeFlags(["-Rmodule-loading"]), .enableUpcomingFeature("MemberImportVisibility")]


This .unsafeFlags(["-Rmodule-loading"]) doesn't contribute to this fix, but I don't think it hurts to leave it here. I can definitely remove it if it's too loud.

parkera · Jun 30, 2025

Seems fine to me, since this patch is trying to keep that from happening again.

vanvoorden · 2025-06-30T22:08:45Z

@itingliu What happens if we try:

$ BENCHMARK_DISABLE_JEMALLOC=true swift package --disable-sandbox -c release benchmark run --filter "URL-Template-expansion"

I think this diff would still have the same availability error from URL.Template. Would we ship another diff that can fix that?

itingliu · 2025-06-30T22:19:44Z

> @itingliu What happens if we try:
> $ BENCHMARK_DISABLE_JEMALLOC=true swift package --disable-sandbox -c release benchmark run --filter "URL-Template-expansion"
> I think this diff would still have the same availability error from URL.Template. Would we ship another diff that can fix that?

Without USE_PACKAGE=1 we'd still be using the system foundation. To allow that to be the default, we'd need to modify Benchmark's package file. I'm not familiar with all of the various settings there, so I'm hesitant to change that. Regardless of what we do there, this PR is needed to solve these obvious issues, and it at least unblocks local workflow for all of us.
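As a rough illustration of how an environment variable such as USE_PACKAGE can gate a choice like this, here is a minimal sketch. It is not the logic in Benchmarks/Package.swift, which is not shown in this thread; the `isEnabled` helper and the set of accepted spellings are assumptions made up for the example.

```swift
import Foundation

// Hypothetical helper (not from the PR): treat a few common spellings as "enabled".
func isEnabled(_ value: String?) -> Bool {
    guard let value = value else { return false }
    return ["1", "true", "yes"].contains(value.lowercased())
}

// Read the variable from the process environment; an unset variable means "disabled".
let usePackage = isEnabled(ProcessInfo.processInfo.environment["USE_PACKAGE"])

// A manifest could branch on this flag to pick the package build or the toolchain's Foundation.
print("swift-foundation benchmarks use package: \(usePackage)")
```

With a gate shaped like this, leaving USE_PACKAGE unset falls through to the default branch, which matches the behavior described above where the system Foundation is used.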

vanvoorden · 2025-07-01T02:35:21Z

Benchmarks/Benchmarks/Internationalization/InternationalizationBenchmark.swift

+let benchmarks = {
+    calendarBenchmarks()
+    localeBenchmarks()
+}


@itingliu FWIW… it's possible this might lead to some unexpected behaviors. Our calendarBenchmarks function sets some global state:

Benchmark.defaultConfiguration.metrics = [.cpuTotal, .mallocCountTotal, .throughput]

And our localeBenchmarks also sets some global state:

Benchmark.defaultConfiguration.metrics = [.cpuTotal, .wallClock, .throughput, .peakMemoryResident, .peakMemoryResidentDelta]

At this point… I'm not sure we know exactly which function "wins".

I believe our expectation is that the benchmarks defined in localeBenchmarks run with peakMemoryResident and peakMemoryResidentDelta as default metrics. Is that correct? And our expectation is that the benchmarks defined in calendarBenchmarks do not run with peakMemoryResident and peakMemoryResidentDelta as default metrics. Is that correct?

itingliu · Jul 1, 2025

They're executed sequentially, so I think mutating global state is fine. The configurations will just be updated right before the tests run, or am I missing anything?

vanvoorden · Jul 1, 2025

> The configurations will just be updated right before the tests run, or am I missing anything?

This is where my question was going… to what extent is registering a set of benchmarks decoupled from running a subset of those benchmarks?

Suppose we run our internationalization benchmarks package target… but we focus specifically on a benchmark defined in calendarBenchmarks:

$ swift package --disable-sandbox -c release benchmark run --target "InternationalizationBenchmarks" --filter "nextThousandThursdaysInTheFourthWeekOfNovember"

When we launch our benchmarks… I assume we have to execute both calendarBenchmarks and localeBenchmarks. The next step then is to respect the filter that was specified and only run nextThousandThursdaysInTheFourthWeekOfNovember. At that point… I believe we can see how our benchmark tests are potentially out of sync with the global state we set when those tests were defined.

In general I believe the pattern I typically see is splitting benchmark targets apart and this question about the global Benchmark.defaultConfiguration state is not very important. It is global state… but we tear it down and rebuild it for every target so it is fresh and clean.

I do not believe this is a major issue… and this is currently a very impactful diff that unblocks engineers from running benchmarks and that is good! But I do believe there might be some unexpected issues currently from running the benchmarks and what specific metrics might be reported. I could maybe think of three possible ideas to work around that:

Move BenchmarkCalendar and BenchmarkLocale to separate and independent benchmark targets.

Keep BenchmarkCalendar and BenchmarkLocale as one benchmark target but update the way that configurations are passed to benchmarks so we do not depend on global state, similar to what we do in other places.¹

Keep BenchmarkCalendar and BenchmarkLocale as one benchmark target and update some header documentation comments where the benchmarks are defined to notify engineers that we might see unexpected behavior when configurations might not be respected.

But I don't think this discussion has to block this specific diff from landing. I think you can use your best judgement and decide whether to ship more code on this right now or ship a follow-up diff with a potential workaround later.
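As a rough sketch of the second option, the snippet below passes a configuration into each group explicitly instead of relying on a shared default. `SketchConfiguration`, `SketchBenchmark`, the two functions, and the "localeSketch" name are hypothetical stand-ins written for this example; they are not the package-benchmark API or the code in this repository.

```swift
// Hypothetical stand-ins for illustration only; not the package-benchmark types.
struct SketchConfiguration: Equatable {
    var metrics: [String]
}

struct SketchBenchmark {
    let name: String
    let configuration: SketchConfiguration
}

// Each group receives its configuration as a parameter, so no shared state is mutated.
func calendarSketchBenchmarks(_ configuration: SketchConfiguration) -> [SketchBenchmark] {
    [SketchBenchmark(name: "nextThousandThursdaysInTheFourthWeekOfNovember", configuration: configuration)]
}

func localeSketchBenchmarks(_ configuration: SketchConfiguration) -> [SketchBenchmark] {
    [SketchBenchmark(name: "localeSketch", configuration: configuration)]
}

let calendarConfiguration = SketchConfiguration(
    metrics: ["cpuTotal", "mallocCountTotal", "throughput"])
let localeConfiguration = SketchConfiguration(
    metrics: ["cpuTotal", "wallClock", "throughput", "peakMemoryResident", "peakMemoryResidentDelta"])

// Registration order no longer matters: every benchmark carries the configuration it was given.
let allSketchBenchmarks =
    localeSketchBenchmarks(localeConfiguration) + calendarSketchBenchmarks(calendarConfiguration)

for benchmark in allSketchBenchmarks {
    print(benchmark.name, benchmark.configuration.metrics)
}
```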

Footnotes

¹ https://github.com/swiftlang/swift-foundation/blob/swift-6.1.2-RELEASE/Benchmarks/Benchmarks/Internationalization/BenchmarkCalendar.swift#L135-L148

itingliu · Jul 1, 2025

> I assume we have to execute both calendarBenchmarks and localeBenchmarks. The next step then is to respect the filter that was specified and only run nextThousandThursdaysInTheFourthWeekOfNovember. At that point… I believe we can see how our benchmark tests are running out of sync from the global state we set when those tests were defined.

Since nextThousandThursdaysInTheFourthWeekOfNovember is defined inside calendarBenchmarks, calendarBenchmarks will need to be run first, so I would expect the configuration set in that function to be used.

vanvoorden · Jul 1, 2025

Ahh… I think you are correct! It looks like the default Benchmark constructor does use shared global state:

https://github.com/ordo-one/package-benchmark/blob/1.29.3/Sources/Benchmark/Benchmark.swift#L229

But the configuration itself is a value type:

https://github.com/ordo-one/package-benchmark/blob/1.29.3/Sources/Benchmark/Benchmark.swift#L432

Which is then copied by value in the new benchmark:

https://github.com/ordo-one/package-benchmark/blob/1.29.3/Sources/Benchmark/Benchmark.swift#L238

So it looks like defining benchmarks does capture shared mutable state… but it captures that shared mutable state by value and mutating that shared state in the future does not affect the benchmark after it was defined. My mistake! Sorry for any confusion about that.
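The copy-by-value behavior described above can be reproduced with a small self-contained sketch. `DemoConfiguration`, `DemoBenchmark`, the two registration functions, and the "localeDemo" name are hypothetical stand-ins written for this example, not the package-benchmark types; only the shape (a shared static default, a struct configuration, a default argument read at construction time) mirrors what the linked lines describe.

```swift
// Hypothetical stand-ins for illustration only; not the package-benchmark types.
struct DemoConfiguration {
    var metrics: [String]
}

final class DemoBenchmark {
    // Shared mutable default, similar in spirit to a global default configuration.
    static var defaultConfiguration = DemoConfiguration(metrics: [])

    let name: String
    let configuration: DemoConfiguration

    // The default argument is evaluated on each call, and the struct is copied into the instance.
    init(_ name: String, configuration: DemoConfiguration = DemoBenchmark.defaultConfiguration) {
        self.name = name
        self.configuration = configuration
    }
}

var registered: [DemoBenchmark] = []

func calendarDemoBenchmarks() {
    DemoBenchmark.defaultConfiguration.metrics = ["cpuTotal", "mallocCountTotal", "throughput"]
    registered.append(DemoBenchmark("nextThousandThursdaysInTheFourthWeekOfNovember"))
}

func localeDemoBenchmarks() {
    DemoBenchmark.defaultConfiguration.metrics =
        ["cpuTotal", "wallClock", "throughput", "peakMemoryResident", "peakMemoryResidentDelta"]
    registered.append(DemoBenchmark("localeDemo"))
}

// Register everything first, then apply a filter, as in the scenario discussed above.
calendarDemoBenchmarks()
localeDemoBenchmarks()

let selected = registered.filter { $0.name == "nextThousandThursdaysInTheFourthWeekOfNovember" }
for benchmark in selected {
    // The calendar benchmark still carries the metrics that were the default when it was created.
    print(benchmark.name, benchmark.configuration.metrics)
}
```

In this sketch the later mutation made by `localeDemoBenchmarks` changes only the shared default, not the copy already stored in the calendar benchmark.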

vanvoorden · 2025-07-02T18:18:54Z

itingliu added 3 commits June 30, 2025 14:23

954ca9a Attempted fix
6b166c1 Fix URL temporary directory
92d2258 Fix invalid redeclaration of 'benchmarks'

itingliu mentioned this pull request Jun 30, 2025

benchmarks failing from main #1386

Closed

itingliu linked an issue Jun 30, 2025 that may be closed by this pull request

benchmarks failing from main #1386

Closed

parkera approved these changes Jun 30, 2025

vanvoorden reviewed Jul 1, 2025

itingliu merged commit 6d87788 into swiftlang:main Jul 2, 2025
16 checks passed

itingliu deleted the fix-benchmark branch July 2, 2025 15:44
