Extremely slow optimizer performance when including large array of strings #39352

Open
@joshtriplett

Consider the following test program:

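fn main() {
    // num_array.rs holds a single array literal: ["1", "2", ..., "250000"].
    let words = include!("num_array.rs");
    let mut l = 0;
    // iter() borrows each element, so this behaves the same on every edition.
    for word in words.iter() {
        l += word.len();
    }
    println!("{}", l);
}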

num_array.rs contains an array of 250k strings:

$ (echo '['; seq 1 250000 | sed 's/.*/"&",/' ; echo ']') > num_array.rs
$ head -n 5 num_array.rs 
[
"1",
"2",
"3",
"4",
$ tail -n 5 num_array.rs 
"249997",
"249998",
"249999",
"250000",
]
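For anyone without seq and sed handy, or who wants to vary the array size to see how compile time scales, here is a rough Rust equivalent of the shell pipeline above. It is only a sketch: the function name render_array and the optional command-line size argument are my own additions, not part of the test case.

```rust
use std::env;
use std::fmt::Write;

/// Builds the same text as the shell pipeline: an opening bracket,
/// one quoted number per line with a trailing comma, a closing bracket.
/// `render_array` is a hypothetical helper name used for illustration.
fn render_array(n: u32) -> String {
    let mut out = String::from("[\n");
    for i in 1..=n {
        // Writing into a String cannot fail, so unwrap is safe here.
        writeln!(out, "\"{}\",", i).unwrap();
    }
    out.push_str("]\n");
    out
}

fn main() {
    // Optional first argument: number of strings (defaults to 250000).
    let n = env::args()
        .nth(1)
        .and_then(|s| s.parse().ok())
        .unwrap_or(250_000);
    // Print to stdout; redirect into num_array.rs when running it.
    print!("{}", render_array(n));
}
```

Redirecting the output to num_array.rs should give the same file as the shell command, and passing a smaller count (say 1000 or 10000) makes it easier to check whether compile time grows linearly or worse.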

Compiling this in debug mode took about 45 seconds; long, and potentially a good test case for compiler profiling, but not completely ridiculous.

Compiling this in release mode showed no signs of finishing after 45 minutes. Running strace on rustc showed two threads: one blocked in a futex, and the other repeatedly allocating and freeing a memory buffer:

mmap(0x7fb665a00000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb637400000
munmap(0x7fb637400000, 2097152)         = 0
mmap(0x7fb665a00000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb637400000
munmap(0x7fb637400000, 2097152)         = 0
madvise(0x7fb66531a000, 659456, MADV_DONTNEED) = 0
madvise(0x7fb6653da000, 139264, MADV_DONTNEED) = 0
madvise(0x7fb66540c000, 1576960, MADV_DONTNEED) = 0
madvise(0x7fb665a0c000, 2043904, MADV_DONTNEED) = 0
madvise(0x7fb665600000, 4194304, MADV_DONTNEED) = 0
madvise(0x7fb63e600000, 12582912, MADV_DONTNEED) = 0
madvise(0x7fb661000000, 25165824, MADV_DONTNEED) = 0
madvise(0x7fb66531a000, 659456, MADV_DONTNEED) = 0
madvise(0x7fb6653da000, 139264, MADV_DONTNEED) = 0
madvise(0x7fb66540c000, 1576960, MADV_DONTNEED) = 0
madvise(0x7fb665a0c000, 2043904, MADV_DONTNEED) = 0
madvise(0x7fb667a00000, 10485760, MADV_DONTNEED) = 0
madvise(0x7fb646000000, 100663296, MADV_DONTNEED) = 0
mmap(0x7fb665a00000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb637400000
munmap(0x7fb637400000, 2097152)         = 0
mmap(0x7fb665a00000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb637400000
munmap(0x7fb637400000, 2097152)         = 0
madvise(0x7fb66531a000, 659456, MADV_DONTNEED) = 0
madvise(0x7fb6653da000, 139264, MADV_DONTNEED) = 0
madvise(0x7fb66540c000, 1576960, MADV_DONTNEED) = 0
madvise(0x7fb665a0c000, 2043904, MADV_DONTNEED) = 0
madvise(0x7fb665600000, 4194304, MADV_DONTNEED) = 0
madvise(0x7fb63e600000, 12582912, MADV_DONTNEED) = 0
madvise(0x7fb661000000, 25165824, MADV_DONTNEED) = 0
madvise(0x7fb66531a000, 659456, MADV_DONTNEED) = 0
madvise(0x7fb6653da000, 139264, MADV_DONTNEED) = 0
madvise(0x7fb66540c000, 1576960, MADV_DONTNEED) = 0
madvise(0x7fb665a0c000, 2043904, MADV_DONTNEED) = 0
madvise(0x7fb667a00000, 10485760, MADV_DONTNEED) = 0
madvise(0x7fb646000000, 100663296, MADV_DONTNEED) = 0

By way of comparison, an analogous C program compiled with GCC takes 4.6s to compile without optimization, or 5.6s with optimization. Python parses and runs an analogous program in 1.2s. So, 45s seems excessive for an unoptimized compile, and 45m+ seems wildly excessive for an optimized compile.
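Since the release build may never finish, it helps to time builds with an upper bound. The following is a minimal sketch of a timing wrapper, not something I used for the numbers above. The name run_with_timeout, the 50 ms polling interval, and the command-line shape are all assumptions for illustration. It polls with try_wait because the standard library has no built-in wait-with-timeout.

```rust
use std::env;
use std::io;
use std::process::Command;
use std::thread;
use std::time::{Duration, Instant};

/// Runs a command and returns how long it took, or None if it was still
/// running when `limit` expired (in which case the child is killed).
/// Note: this reports elapsed time whether or not the command succeeded.
fn run_with_timeout(program: &str, args: &[&str], limit: Duration) -> io::Result<Option<Duration>> {
    let start = Instant::now();
    let mut child = Command::new(program).args(args).spawn()?;
    loop {
        // try_wait returns Ok(Some(status)) once the child has exited.
        if child.try_wait()?.is_some() {
            return Ok(Some(start.elapsed()));
        }
        if start.elapsed() >= limit {
            // Only the direct child is killed; a cargo child would leave
            // any rustc process it spawned to be cleaned up separately.
            child.kill()?;
            child.wait()?;
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(50));
    }
}

fn main() {
    let argv: Vec<String> = env::args().skip(1).collect();
    if argv.len() < 2 {
        eprintln!("usage: timed <limit-seconds> <program> [args...]");
        return;
    }
    let limit = match argv[0].parse::<u64>() {
        Ok(secs) => Duration::from_secs(secs),
        Err(_) => {
            eprintln!("limit must be a whole number of seconds");
            return;
        }
    };
    let args: Vec<&str> = argv[2..].iter().map(String::as_str).collect();
    match run_with_timeout(&argv[1], &args, limit) {
        Ok(Some(elapsed)) => println!("finished in {:.1?}", elapsed),
        Ok(None) => println!("gave up after {:?}", limit),
        Err(e) => eprintln!("could not run {}: {}", argv[1], e),
    }
}
```

If the binary is built as timed, then something like "timed 600 cargo build --release" would report either the build time or that the ten-minute limit was hit. Killing cargo does not necessarily stop the rustc it launched, so check for a leftover rustc process afterwards.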

Complete test case (ready to cargo run or cargo run --release): testcase.tar.gz

Metadata

    Labels

    A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
    A-codegen: Area: Code generation
    C-enhancement: Category: An issue proposing an enhancement or a PR with one.
    C-optimization: Category: An issue highlighting optimization opportunities or PRs implementing such
    I-compiletime: Issue: Problems and improvements with respect to compile times.
    T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
