Skip to content

Conversation

@PureWhiteWu
Copy link
Contributor

On Linux environment, calling num_cpus::get_physical() costs about 1-2ms per time, and in our scenario, we need to create hundreds to thousands of Pools dynamically(our services normally have hundreds to thousands of instances).

Thus, it will cost a really long time to create the pools, which lead to service hang or request timeout.

This PR caches the result of num_cpus::get_physical to avoid calling it every time, which can greatly improve the new performance of Pool.

Also, in container environment, we should call num_cpus::get() instead of num_cpus::get_physical(), but since it may be a breaking change, I only added an TODO for it.

PTAL, thanks!

Copy link
Collaborator

@bikeshedder bikeshedder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add an entry to the CHANGELOG.md under the ## [Unreleased] describing the change.

@PureWhiteWu PureWhiteWu force-pushed the feat/cache_num_cpus branch 2 times, most recently from b8857cc to 044303b Compare August 19, 2025 14:04
@PureWhiteWu
Copy link
Contributor Author

CHANGELOG updated.

@PureWhiteWu
Copy link
Contributor Author

Oh, LazyLock requires Rust 1.80, but the MSRV is 1.75 now.

Should I switch it to lazy_static or bump the MSRV?

@PureWhiteWu
Copy link
Contributor Author

I've switched to lazy_static for MSRV compatibility.

@bikeshedder
Copy link
Collaborator

Sticking with the current MSRV of 1.75 is probably the better choice.

btw. regarding the TODO refering to num_cpus::get() instead of num_cpus::get_physical() I feel like this is a non-breaking change but rather fixes wrong behavior, no?

@bikeshedder
Copy link
Collaborator

When running a container with 4 CPUs on an Ampere system with 192 cores I would not want to end up with a default pool size of 768 (192*4), would I? 🤔

@bikeshedder
Copy link
Collaborator

Ah, I remember reading the documentation and went with get_physical as logical cores aren't very useful in a lot of workloads. I ended up using physical CPUs for that very reason. tbh. I didn't think of cgroups and containers back then. So yes, it should probably be changed to something like num_cpus::get() * 2 so the default stays the same for most of the users and does the right thing in containerized environments, too.

@bikeshedder
Copy link
Collaborator

I'm happy to merge this. I just would like to make a decision on that TODO now. It has the potential of staying in the source code for the next years to come.

My vote would be to go with num_cpus::get() * 2 and call it the day.

@bikeshedder bikeshedder merged commit 207e428 into deadpool-rs:main Aug 19, 2025
11 checks passed
@bikeshedder
Copy link
Collaborator

@PureWhiteWu I just merged this and created a separate PR for the default max pool size:

@PureWhiteWu PureWhiteWu deleted the feat/cache_num_cpus branch August 19, 2025 19:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants