19 changes: 10 additions & 9 deletions src/libraries/System.Private.CoreLib/src/System/Convert.cs
@@ -2901,15 +2901,16 @@ private static unsafe int FromBase64_ComputeResultLength(char* inputPtr, int inp

// Perf: reuse the variable that stored the number of '=' to store the number of bytes encoded by the
// last group that contains the '=':
- if (padding != 0)
- {
-     if (padding == 1)
-         padding = 2;
-     else if (padding == 2)
-         padding = 1;
-     else
-         throw new FormatException(SR.Format_BadBase64Char);
- }
+ if ((uint)padding >= 3)
+     throw new FormatException(SR.Format_BadBase64Char);
+
+ // padding map
+ // -----------
+ // 0 => 0
+ // 1 => 2
+ // 2 => 1
+ ReadOnlySpan<byte> map = [0, 2, 1];
+ padding = map[padding];
Member

This trades branches for a memory access -- I don't know which approach will be faster, and it will be hard to measure in benchmarks, because the branch predictor will do a good job and, for the memory access, the data will be in the CPU caches, so retrieval is fast.

Or put differently: it trades possible branch mispredicts against potential cache eviction.

Maybe it's better to write the code as idiomatically as possible and let the compiler optimize it. With a switch it actually looks quite good and produces code similar to what native compilers emit.
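
For illustration, a minimal sketch of the switch shape being suggested (a hypothetical rewrite of the same fragment, not code from this PR; names follow the diff above):

    switch (padding)
    {
        case 0:
            break;          // 0 => 0: no trailing '=', the last group is a full 3-byte group
        case 1:
            padding = 2;    // 1 => 2: one '=', the last group decodes to 2 bytes
            break;
        case 2:
            padding = 1;    // 2 => 1: two '=', the last group decodes to 1 byte
            break;
        default:
            throw new FormatException(SR.Format_BadBase64Char);
    }

Written this way, the invalid-padding check and the remap collapse into one construct, which is the kind of pattern native compilers can turn into a small lookup table (see the C/C++ output quoted below).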

Contributor Author

You're right, writing the code in a more idiomatic style is preferable. However, at the moment, the JIT doesn't seem to generate a value-to-result mapping (like C/C++ does) for simple cases like this. Even for a switch, it just falls back to a jump table, which still involves a memory access.

I agree - it's better to either leave it as-is, or rewrite it using a switch and help the JIT recognize and optimize such patterns. I think I even saw an issue about this a while ago, but couldn't find it right away.

Just for reference, here's what C/C++ produces:

mov     edi, edi
mov     eax, DWORD PTR CSWTCH.1[0+rdi*4]

CSWTCH.1:
        .long   0
        .long   2
        .long   1

As we can see, the C/C++ compiler prefers a memory access here rather than relying on the branch predictor.

And here's what the JIT produces in my case - basically the same thing as the C++ version:

mov      rdi, 0x72BEDC82CB80      ; static handle
movzx    rax, byte  ptr [rax+rdi]

And here's what a typical switch ends up looking like:

lea      rdi, [reloc @RWD00]
mov      edi, dword ptr [rdi+4*rax]
lea      rcx, G_M26941_IG02
add      rdi, rcx
jmp      rdi
mov      eax, 1
jmp      SHORT G_M26941_IG07
mov      eax, 2
jmp      SHORT G_M26941_IG07
xor      eax, eax
G_M26941_IG07:  ;; offset=0x0035
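
As a side note, for anyone who wants to reproduce this comparison locally, a rough sketch along these lines (assuming BenchmarkDotNet; the class and method names here are made up) would report timings for both shapes and, via [DisassemblyDiagnoser], dump the generated assembly:

    using System;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;

    [DisassemblyDiagnoser]
    public class PaddingRemapBenchmarks
    {
        // The padding value is fixed per run, so the branch predictor sees a constant --
        // exactly the caveat raised above about trying to benchmark this.
        [Params(0, 1, 2)]
        public int Padding;

        [Benchmark(Baseline = true)]
        public int Branches()
        {
            int padding = Padding;
            if (padding == 1) padding = 2;
            else if (padding == 2) padding = 1;
            return padding;
        }

        [Benchmark]
        public int SpanLookup()
        {
            // Same mapping as the PR: 0 => 0, 1 => 2, 2 => 1
            ReadOnlySpan<byte> map = [0, 2, 1];
            return map[Padding];
        }
    }

    public class Program
    {
        public static void Main() => BenchmarkRunner.Run<PaddingRemapBenchmarks>();
    }

As noted above, the timings themselves probably won't separate the two approaches; the disassembly output is the more interesting part.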

Contributor Author

I’ve found the issue related to this: #114041

Member

So I think it would still be good to change to the switch approach here.
Once the JIT gets improved, the benefit here comes for free.


// Done:
return (usefulInputLength / 4) * 3 + padding;
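
For context on this return statement, here is a small standalone sketch (a toy illustration only, not the CoreLib implementation; it assumes the input contains no whitespace, and the helper name ComputeResultLength is made up) showing how the remapped padding feeds into the result length:

    using System;

    static int ComputeResultLength(string base64)
    {
        // Count trailing '=' characters (0, 1, or 2 for valid input).
        int padding = 0;
        for (int i = base64.Length - 1; i >= 0 && base64[i] == '='; i--)
            padding++;

        // Characters that actually encode data.
        int usefulInputLength = base64.Length - padding;

        if ((uint)padding >= 3)
            throw new FormatException("Invalid base64 padding.");

        // Same remap as the diff:
        // 0 '=' => the last group is full and already counted in the full-group total,
        // 1 '=' => the trailing partial group decodes to 2 extra bytes,
        // 2 '=' => the trailing partial group decodes to 1 extra byte.
        ReadOnlySpan<byte> map = [0, 2, 1];
        padding = map[padding];

        return (usefulInputLength / 4) * 3 + padding;
    }

    Console.WriteLine(ComputeResultLength("QQ=="));  // 1
    Console.WriteLine(ComputeResultLength("QUI=")); // 2
    Console.WriteLine(ComputeResultLength("QUJD")); // 3

Running it prints 1, 2 and 3, matching the decoded lengths Convert.FromBase64String produces for "QQ==", "QUI=" and "QUJD".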