Fix running benchmarks (Issue 1386) #1393

Merged: 3 commits, Jul 2, 2025

itingliu
Contributor

This PR fixes the issues I encountered when trying to run benchmarks from the package.

With these changes, I can now run benchmarks inside the Benchmarks folder with the following command:

USE_PACKAGE=1 BENCHMARK_DISABLE_JEMALLOC=true swift package --disable-sandbox -c release benchmark run --filter "URL-Template-expansion"

Specifically:

  1. Enable MemberImportVisibility so that the system Foundation doesn't get pulled in automatically when any of the dependencies import Foundation in their source code.
  2. Create a shared benchmark file, InternationalizationBenchmark.swift, for all benchmark files inside the /Internationalization/ folder. Previously I was seeing "invalid redeclaration of 'benchmarks'" because BenchmarkCalendar.swift and BenchmarkLocale.swift are in the same folder and both declare let benchmarks, which the Benchmark package doesn't support.
  3. Replace NSTemporaryDirectory with the one from the swift-foundation package.
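
The second point can be sketched in plain Swift. This is a minimal model and not the Benchmark package API: `Registry` and the benchmark names are hypothetical stand-ins, and the real entry point in this PR is a `benchmarks` closure that takes no arguments. The idea is that each file exposes a uniquely named function, and only one file in the target declares `benchmarks`.

```swift
// Hypothetical stand-in for the registration that the Benchmark package performs.
final class Registry {
    private(set) var names: [String] = []
    func add(_ name: String) { names.append(name) }
}

// BenchmarkCalendar.swift: exposes a function instead of its own `let benchmarks`.
func calendarBenchmarks(_ registry: Registry) {
    registry.add("calendar-example")  // hypothetical benchmark name
}

// BenchmarkLocale.swift: same pattern under a different function name.
func localeBenchmarks(_ registry: Registry) {
    registry.add("locale-example")  // hypothetical benchmark name
}

// InternationalizationBenchmark.swift: the only declaration of `benchmarks`
// in the target, so there is nothing left to collide with.
let benchmarks: (Registry) -> Void = { registry in
    calendarBenchmarks(registry)
    localeBenchmarks(registry)
}
```

Calling `benchmarks` with a fresh `Registry` registers the calendar entry first and the locale entry second, mirroring the order in the shared file.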

@itingliu
Contributor Author

@swift-ci please test

@@ -54,7 +54,7 @@ print("swift-foundation benchmarks: \(usePackage.description)")
 var packageDependency : [Package.Dependency] = [.package(url: "https://github.com/ordo-one/package-benchmark.git", from: "1.11.1")]
 var targetDependency : [Target.Dependency] = [.product(name: "Benchmark", package: "package-benchmark")]
 var i18nTargetDependencies : [Target.Dependency] = []
-var swiftSettings : [SwiftSetting] = []
+var swiftSettings : [SwiftSetting] = [.unsafeFlags(["-Rmodule-loading"]), .enableUpcomingFeature("MemberImportVisibility")]
Contributor Author

This .unsafeFlags(["-Rmodule-loading"]) doesn't contribute to this fix, but I don't think it hurts to leave it here. I can definitely remove it if it's too loud.

Contributor

Seems fine to me, since this patch is trying to keep that from happening again.

@itingliu itingliu linked an issue Jun 30, 2025 that may be closed by this pull request
@vanvoorden
Contributor

@itingliu What happens if we try:

$ BENCHMARK_DISABLE_JEMALLOC=true swift package --disable-sandbox -c release benchmark run --filter "URL-Template-expansion"

I think this diff would still have the same availability error from URL.Template. Would we ship another diff that can fix that?

@itingliu
Contributor Author

> @itingliu What happens if we try:
>
> $ BENCHMARK_DISABLE_JEMALLOC=true swift package --disable-sandbox -c release benchmark run --filter "URL-Template-expansion"
>
> I think this diff would still have the same availability error from URL.Template. Would we ship another diff that can fix that?

Without USE_PACKAGE=1 we'd still be using the system Foundation. To make USE_PACKAGE=1 the default, we'd need to modify Benchmark's package file. I'm not familiar with all of the various settings there, so I'm hesitant to change that. Regardless of what we do there, this PR is needed to solve these obvious issues, and it at least unblocks the local workflow for all of us.

Comment on lines +3 to +6
let benchmarks = {
    calendarBenchmarks()
    localeBenchmarks()
}
Contributor

@itingliu FWIW… it's possible this might lead to some unexpected behaviors. Our calendarBenchmarks function sets some global state:

Benchmark.defaultConfiguration.metrics = [.cpuTotal, .mallocCountTotal, .throughput]

And our localeBenchmarks also sets some global state:

Benchmark.defaultConfiguration.metrics = [.cpuTotal, .wallClock, .throughput, .peakMemoryResident, .peakMemoryResidentDelta]

At this point… I'm not sure we know exactly which function "wins".

I believe our expectation is that the benchmarks defined in localeBenchmarks run with peakMemoryResident and peakMemoryResidentDelta as default metrics. Is that correct? And our expectation is that the benchmarks defined in calendarBenchmarks do not run with peakMemoryResident and peakMemoryResidentDelta as default metrics. Is that correct?

Contributor Author

@itingliu itingliu Jul 1, 2025

They're executed sequentially, so I think mutating global state is fine. The configurations will just be updated right before the tests run, or am I missing anything?

Contributor

@vanvoorden vanvoorden Jul 1, 2025

> The configurations will just be updated right before the tests run, or am I missing anything?

This is where my question was going… to what extent is registering a set of benchmarks decoupled from running a subset of those benchmarks?

Suppose we run our Internationalization benchmarks package target… but we focus specifically on a benchmark defined in calendarBenchmarks:

$ swift package --disable-sandbox -c release benchmark run --target "InternationalizationBenchmarks" --filter "nextThousandThursdaysInTheFourthWeekOfNovember"

When we launch our benchmarks… I assume we have to execute both calendarBenchmarks and localeBenchmarks. The next step then is to respect the filter that was specified and only run nextThousandThursdaysInTheFourthWeekOfNovember. At that point… I believe we can see how our benchmark tests are potentially running out of sync from the global state we set when those tests were defined.

In general I believe the pattern I typically see is splitting benchmark targets apart and this question about the global Benchmark.defaultConfiguration state is not very important. It is global state… but we tear it down and rebuild it for every target so it is fresh and clean.

I do not believe this is a major issue… and this is currently a very impactful diff that unblocks engineers from running benchmarks and that is good! But I do believe there might be some unexpected issues currently from running the benchmarks and what specific metrics might be reported. I could maybe think of three possible ideas to work around that:

  • Move BenchmarkCalendar and BenchmarkLocale to separate and independent benchmark targets.
  • Keep BenchmarkCalendar and BenchmarkLocale as one benchmark target but update the way that configurations are passed to benchmarks so we do not depend on global state. Similar to what we do in other places.[1]
  • Keep BenchmarkCalendar and BenchmarkLocale as one benchmark target and update some header documentation comments where the benchmarks are defined to notify engineers that we might see unexpected behavior when configurations might not be respected.
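
As a rough illustration of the second idea, here is a minimal sketch in plain Swift. `Configuration`, `RegisteredBenchmark`, `allBenchmarks`, and the benchmark names are hypothetical stand-ins rather than Benchmark package types; the metric names are written as strings copied from the two `defaultConfiguration.metrics` assignments quoted above. Each group builds its own configuration and hands it to every benchmark it defines, so nothing depends on a shared default.

```swift
// Hypothetical stand-ins; the real types live in the Benchmark package.
struct Configuration: Equatable {
    var metrics: [String]
}

struct RegisteredBenchmark {
    let name: String
    let configuration: Configuration
}

// Each group owns its configuration and passes it explicitly.
func calendarBenchmarks() -> [RegisteredBenchmark] {
    let configuration = Configuration(
        metrics: ["cpuTotal", "mallocCountTotal", "throughput"])
    return [RegisteredBenchmark(name: "calendar-example", configuration: configuration)]
}

func localeBenchmarks() -> [RegisteredBenchmark] {
    let configuration = Configuration(
        metrics: ["cpuTotal", "wallClock", "throughput",
                  "peakMemoryResident", "peakMemoryResidentDelta"])
    return [RegisteredBenchmark(name: "locale-example", configuration: configuration)]
}

// Combining the groups cannot change either group's metrics,
// because no shared state is read or written.
func allBenchmarks() -> [RegisteredBenchmark] {
    calendarBenchmarks() + localeBenchmarks()
}
```

With this shape, the order in which the two groups are registered has no effect on which metrics a given benchmark reports.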

But I don't think this discussion has to block this specific diff from landing. I think you can use your best judgement on whether to ship more code for this right now or ship a potential workaround in a future diff.

Footnotes

  1. https://github.com/swiftlang/swift-foundation/blob/swift-6.1.2-RELEASE/Benchmarks/Benchmarks/Internationalization/BenchmarkCalendar.swift#L135-L148

Contributor Author

> I assume we have to execute both calendarBenchmarks and localeBenchmarks. The next step then is to respect the filter that was specified and only run nextThousandThursdaysInTheFourthWeekOfNovember. At that point… I believe we can see how our benchmark tests are running out of sync from the global state we set when those tests were defined.

Since nextThousandThursdaysInTheFourthWeekOfNovember is defined inside calendarBenchmarks, calendarBenchmarks will need to be run first, so I would expect the configuration set in that function to be used.

Contributor

Ahh… I think you are correct! It looks like the default Benchmark constructor does use shared global state:

https://github.com/ordo-one/package-benchmark/blob/1.29.3/Sources/Benchmark/Benchmark.swift#L229

But the configuration itself is a value type:

https://github.com/ordo-one/package-benchmark/blob/1.29.3/Sources/Benchmark/Benchmark.swift#L432

Which is then copied by value in the new benchmark:

https://github.com/ordo-one/package-benchmark/blob/1.29.3/Sources/Benchmark/Benchmark.swift#L238

So it looks like defining benchmarks does capture shared mutable state… but it captures that shared mutable state by value and mutating that shared state in the future does not affect the benchmark after it was defined. My mistake! Sorry for any confusion about that.
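
The copy-by-value behavior described here can be modeled in a few lines of plain Swift. This is a sketch under stated assumptions, not the package's implementation: `Suite`, `Configuration`, and the benchmark names are hypothetical, with `Suite.defaultConfiguration` standing in for the shared `Benchmark.defaultConfiguration`.

```swift
// Hypothetical value-type configuration, mirroring the struct semantics discussed above.
struct Configuration {
    var metrics: [String]
}

// Hypothetical stand-in for the shared default plus the list of defined benchmarks.
final class Suite {
    var defaultConfiguration = Configuration(metrics: [])
    private(set) var registered: [(name: String, configuration: Configuration)] = []

    func register(_ name: String) {
        // Storing a struct copies it, so later changes to the default
        // do not reach benchmarks that were already defined.
        registered.append((name: name, configuration: defaultConfiguration))
    }
}

func calendarBenchmarks(in suite: Suite) {
    suite.defaultConfiguration.metrics = ["cpuTotal", "mallocCountTotal", "throughput"]
    suite.register("calendar-example")
}

func localeBenchmarks(in suite: Suite) {
    suite.defaultConfiguration.metrics = ["cpuTotal", "wallClock", "throughput",
                                          "peakMemoryResident", "peakMemoryResidentDelta"]
    suite.register("locale-example")
}
```

After both functions run against the same `Suite`, the calendar entry still holds the three metrics that were the default when it was defined, even though the default has since been overwritten by the locale group.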

@itingliu itingliu merged commit 6d87788 into swiftlang:main Jul 2, 2025
16 checks passed
@itingliu itingliu deleted the fix-benchmark branch July 2, 2025 15:44
@vanvoorden
Contributor

pikahappy

Development

Successfully merging this pull request may close these issues.

benchmarks failing from main
3 participants