Hi,
I saw the news that regex got refactored and optimized, and decided to re-run my old benchmark. I was very surprised to find that it now takes twice as long!
How to reproduce (using multirust to switch toolchain versions, as older regex doesn't compile with newer nightly Rust):
git clone https://github.com/mkpankov/parse-rust.git
cd parse-rust
multirust override nightly-2015-06-24
git checkout 4076c404caf1560a466e9f0799817035089fe841
cargo build --release
time zcat mp3-logs-with-fake-ips.log.gz | ./target/release/parse-rust
// outputs around 4s on my machine
multirust override nightly-2015-05-25
git checkout e33d410291fa7f134eef628b5591d605cd68b218
cargo clean
cargo build --release
time zcat mp3-logs-with-fake-ips.log.gz | ./target/release/parse-rust
// outputs around 2s on my machine
I'm sorry I can't pinpoint it more accurately (maybe it's caused by Rust changes, not regex), but the recent major changes to regex might be the cause. A twofold slowdown is severe in my opinion, and needs action.
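One possible way to narrow this down is to time the input phase and the parsing phase separately inside the program, so that `zcat` and pipe overhead are not mixed into the measurement. The following is only a minimal sketch under assumptions: `time_it` and `process_line` are hypothetical names, and `process_line` is a placeholder for the real regex matching, not code from parse-rust. It relies on `std::time::Instant`, which exists on current stable Rust but may not be available on the 2015 nightlies used above.

```rust
use std::io::{self, BufRead};
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the elapsed wall-clock time.
fn time_it<T, F: FnOnce() -> T>(f: F) -> (T, Duration) {
    let start = Instant::now();
    let result = f();
    (result, start.elapsed())
}

/// Hypothetical stand-in for the per-line work (the real program would run
/// its regex here). It just counts whitespace-separated fields.
fn process_line(line: &str) -> usize {
    line.split_whitespace().count()
}

fn main() {
    let stdin = io::stdin();

    // Phase 1: read all lines up front, so decompression and I/O are timed alone.
    // Reading stops at the first line that fails to decode.
    let (lines, read_time) = time_it(|| {
        stdin
            .lock()
            .lines()
            .map_while(Result::ok)
            .collect::<Vec<String>>()
    });

    // Phase 2: time only the per-line processing.
    let (fields, parse_time) =
        time_it(|| lines.iter().map(|l| process_line(l)).sum::<usize>());

    eprintln!(
        "lines: {}, fields: {}, read: {:?}, parse: {:?}",
        lines.len(),
        fields,
        read_time,
        parse_time
    );
}
```

If the parse phase alone doubles between the two builds, the slowdown is in the matching code (regex or the compiler's code generation) rather than in I/O. Telling regex apart from the compiler would still need the same regex version built with both toolchains, which may not be possible given the compile issue mentioned above.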
regex versions:
- new, degraded:
"regex 0.1.38 (registry+https://github.com/rust-lang/crates.io-index)",
"regex_macros 0.1.20 (registry+https://github.com/rust-lang/crates.io-index)",
- old, fast:
"regex 0.1.30 (registry+https://github.com/rust-lang/crates.io-index)",
"regex_macros 0.1.18 (registry+https://github.com/rust-lang/crates.io-index)",
Some background: back when I first ran this benchmark, I compared the Rust version to a C++ version (an almost line-by-line, naive translation), and Rust beat C++ by about 40% without using compile-time regex. This kind of degradation puts it back behind C++ 😞