regex is less efficient than it could be #14029

Closed
@comex

Consider this code:

#![feature(phase)]
extern crate regex;
#[phase(syntax)]
extern crate regex_macros;

pub fn is_all_a(s: &str) -> bool {
    return regex!("^a+$").is_match(s);
}

Ideally this would optimize away to a small function that just iterates over the string and checks for characters other than 'a'.

Instead, it:

  • calls malloc several times to start out;
  • goes through an indirect call to the 'exec' function unless LTO is enabled. This might not usually be a big deal, but I would like to eventually be able to efficiently match a regex on a single character in lieu of writing out all the possibilities manually;
  • inside 'exec' itself, even with LTO (and -O) enabled, makes many non-inlined calls, including to malloc, char_range_at, char_range_at_reverse, etc.

Without LTO, it generates about 7 KB of code for one regex, or 34 KB if I put 8 regexes in that function. Not the end of the world, but it adds up.
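
For comparison, the small function that the regex would ideally reduce to might look roughly like the sketch below. It is written in current Rust, not the 0.11-pre syntax above. The name is_all_a_manual is hypothetical, and the sketch assumes that `$` matches only at the very end of the input (no special handling of a trailing newline).

```rust
/// Hand-written stand-in for matching `^a+$` (hypothetical helper, not part
/// of the regex crate): the input must be non-empty and consist only of 'a'.
pub fn is_all_a_manual(s: &str) -> bool {
    // 'a' is ASCII, so comparing raw bytes is safe: every byte of a
    // multi-byte UTF-8 sequence is >= 0x80 and can never equal b'a'.
    !s.is_empty() && s.bytes().all(|b| b == b'a')
}
```

A function like this needs no allocation and no indirect calls, which is the baseline the points above are measured against.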

I recognize the regex implementation is new, but I thought this was worth filing anyway as room for improvement.

rustc 0.11-pre-nightly (2dcbad5 2014-05-06 22:01:43 -0700)
