Surprising optimization differences between variants of the same code #48200

Closed
@glandium

I was looking at the assembly output of the following code, to see whether Rust would short-circuit or not:

pub fn foo(a: Option<usize>, b: Option<usize>) -> usize {
    if let (Some(a), Some(b)) = (a, b) {
        a + b
    } else {
        0
    }
}

And got the following result:

example::foo:
  push rbp
  mov rbp, rsp
  cmp qword ptr [rdi], 1
  jne .LBB0_1
  mov rcx, qword ptr [rdi + 8]
  add rcx, qword ptr [rsi + 8]
  xor eax, eax
  cmp qword ptr [rsi], 1
  cmove rax, rcx
  pop rbp
  ret
.LBB0_1:
  xor eax, eax
  pop rbp
  ret

(thanks godbolt)

Which, come to think of it, might make sense, although I'm not sure that reading the data and doing the addition before checking the second Option tag is worth the branch it avoids. But let's assume this is actually better than doing two compare/branch pairs at the beginning. At the very least, the compiled code should then be the same for either of the following:

pub fn bar(a: Option<usize>, b: Option<usize>) -> usize {
    if a.is_some() && b.is_some() {
        a.unwrap() + b.unwrap()
    } else {
        0
    }
}

pub fn baz(a: Option<usize>, b: Option<usize>) -> usize {
    if let Some(a) = a {
        if let Some(b) = b {
            return a + b;
        }
    }
    0
}

Both compile to the code below, which is different from the original one:

  push rbp
  mov rbp, rsp
  cmp qword ptr [rdi], 1
  jne .LBB1_1
  cmp qword ptr [rsi], 1
  jne .LBB1_3
  mov rax, qword ptr [rsi + 8]
  add rax, qword ptr [rdi + 8]
  pop rbp
  ret
.LBB1_1:
  xor eax, eax
  pop rbp
  ret
.LBB1_3:
  xor eax, eax
  pop rbp
  ret

BTW, note how it generates two branch targets with the same return code instead of reusing the first one for the second branch.
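
For reference, here is a minimal sketch that checks the three variants really are interchangeable at the source level, so any difference in the generated assembly would be purely an optimizer matter. The helper name `variants_agree` and the sample values are assumptions made for illustration; they are not part of the original report.

```rust
pub fn foo(a: Option<usize>, b: Option<usize>) -> usize {
    if let (Some(a), Some(b)) = (a, b) {
        a + b
    } else {
        0
    }
}

pub fn bar(a: Option<usize>, b: Option<usize>) -> usize {
    if a.is_some() && b.is_some() {
        a.unwrap() + b.unwrap()
    } else {
        0
    }
}

pub fn baz(a: Option<usize>, b: Option<usize>) -> usize {
    if let Some(a) = a {
        if let Some(b) = b {
            return a + b;
        }
    }
    0
}

/// Hypothetical helper: returns true if `foo`, `bar` and `baz` return the
/// same value for every pair drawn from a small set of sample inputs.
pub fn variants_agree() -> bool {
    let samples = [None, Some(0usize), Some(1), Some(41)];
    for &a in &samples {
        for &b in &samples {
            let expected = foo(a, b);
            if bar(a, b) != expected || baz(a, b) != expected {
                return false;
            }
        }
    }
    true
}

fn main() {
    assert!(variants_agree());
    println!("foo, bar and baz agree on all sampled inputs");
}
```

This only samples a handful of inputs, so it illustrates the equivalence rather than proving it.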

    Labels

    A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
    A-codegen: Area: Code generation
    C-enhancement: Category: An issue proposing an enhancement or a PR with one.
    I-slow: Issue: Problems and improvements with respect to performance of generated code.
    T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
