Don't inline >8 bytes of IL calls in BBJ_THROW #78386
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
Fixes: #78300
public static void Test(int i)
{
if (i == 0) throw Create();
}
static FormatException Create() => new FormatException(SR.Test);
class SR
{
public static string Test => "";
}
Codegen for `Test`, was:
; Method P:Test(int)
G_M000_IG01:
56 push rsi
4883EC20 sub rsp, 32
G_M000_IG02:
85C9 test ecx, ecx
7406 je SHORT G_M000_IG04
G_M000_IG03:
4883C420 add rsp, 32
5E pop rsi
C3 ret
G_M000_IG04:
48B9F8A9FF0DFA7F0000 mov rcx, 0x7FFA0DFFA9F8
E8A244AE5F call CORINFO_HELP_NEWSFAST
488BF0 mov rsi, rax
B901000000 mov ecx, 1
48BA18B2EF0DFA7F0000 mov rdx, 0x7FFA0DEFB218
E84B74AB5F call CORINFO_HELP_STRCNS
488BD0 mov rdx, rax
488BCE mov rcx, rsi
FF15070B2300 call [System.FormatException:.ctor(System.String):this]
488BCE mov rcx, rsi
E88750A15F call CORINFO_HELP_THROW
CC int3
; Total bytes of code: 74
Now:
; Method P:Test(int)
G_M11252_IG01:
4883EC28 sub rsp, 40
G_M11252_IG02:
85C9 test ecx, ecx
7405 je SHORT G_M11252_IG04
G_M11252_IG03: ;; offset=0008H
4883C428 add rsp, 40
C3 ret
G_M11252_IG04:
FF153D1C4500 call [P:Create():System.FormatException]
488BC8 mov rcx, rax
E835E84E5F call CORINFO_HELP_THROW
CC int3
; Total bytes of code: 28
Currently, we almost never inline anything above 16 bytes of IL in throw blocks, but it seems like a good idea not to inline small methods either:
Pros:
Cons:
An alternative fix is to reduce the ALWAYS_INLINE IL threshold from 16 to 8 for THROW sections, but that might regress the JIT's throughput. Let's see the diffs.
|
|
Just for reference, I noticed it while looking at the codegen for the not-yet-merged https://github.com/dotnet/runtime/pull/71590/files#diff-f7c1029305b8ff70a95e36e675c08e17c8f816c5a1be8a204dc42d0595129e18R1419. |
|
(And I suspect the diffs may be influenced by such NoInlining in use elsewhere to work around this, e.g. https://github.com/dotnet/runtime/blob/main/src/libraries/System.IO.Pipelines/src/System/IO/Pipelines/ThrowHelper.cs) |
|
@dotnet/jit-contrib @jakobbotsch PTAL. Diffs: https://dev.azure.com/dnceng-public/public/_build/results?buildId=85004&view=ms.vss-build-web.run-extensions-tab
I made the PR less aggressive, so the number of size regressions is now minimal. The current logic is to inline everything below 8 bytes of IL (e.g. properties) and immediately give up on everything larger, where previously we had an IL prescan. The number of regressions seems reasonable to me now.
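To make the rule described above concrete, here is a minimal sketch of the decision in C++. It is not the code from this PR or from `ExtendedDefaultPolicy`: the names `ThrowBlockInlineDecision`, `kThrowBlockIlLimit` and `DecideThrowBlockInline` are hypothetical, and treating a callee of exactly 8 bytes as inlineable is an assumption based on the PR title (">8 bytes").

```cpp
// Hypothetical three-way outcome; the real inliner tracks far more state.
enum class ThrowBlockInlineDecision
{
    Inline,               // tiny callee (e.g. a property getter): still inline it
    DoNotInline,          // give up immediately, without an IL prescan
    UseRegularHeuristics  // call site is not in a throw block: nothing changes
};

// Assumed cut-off in bytes of IL for call sites inside a throw block.
// The value 8 comes from the discussion; the inclusive boundary is an assumption.
constexpr unsigned kThrowBlockIlLimit = 8;

// Hypothetical helper that models only the rule described in the comment above.
ThrowBlockInlineDecision DecideThrowBlockInline(unsigned calleeIlSize, bool callSiteInThrowBlock)
{
    if (!callSiteInThrowBlock)
    {
        return ThrowBlockInlineDecision::UseRegularHeuristics;
    }

    return (calleeIlSize <= kThrowBlockIlLimit) ? ThrowBlockInlineDecision::Inline
                                                : ThrowBlockInlineDecision::DoNotInline;
}
```

Under the usual IL encodings, the `SR.Test` getter from the issue should be about 6 bytes (`ldstr` + `ret`) and `Create()` about 11 bytes (`call` + `newobj` + `ret`), so in this sketch `DecideThrowBlockInline(6, true)` inlines the getter while `DecideThrowBlockInline(11, true)` leaves `Create()` as a call.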
jakobbotsch
left a comment
LGTM in principle.
Any examples of the regressions? I can imagine that not exposing some facts, even in the throwing BBs, can hurt codegen in the rest of the function. For example, if we no longer inline a function with an `in S someStruct` argument, that could result in new address exposure.
|
Maybe one common case would be instance functions on structs? e.g.: where |
|
@jakobbotsch I was not able to find any case like that in the diffs. What I did find are cases like these:
I tried to adjust the heuristic to not do that for methods returning structs by value, but it hid many previous improvements. Overall, the number of regressions is "big" only in PMI collections:
From the same methods with different generic types (pmi)
|
How come there are no diffs in the crossgen2 collections?
crossgen2 uses a more conservative inliner, while I made my changes in `ExtendedDefaultPolicy`.
|
Seems like the JIT shouldn't inline any function that is straight-line code leading to a [Edit] I slightly misread the issue, as the |
The initial change in this PR did exactly that, but it produced a lot of size regressions, and many examples actually affected the non-cold path too (e.g. more registers were saved). So I changed it to "OK, inline only under 8 bytes of IL", mostly for various properties (getters/setters). I tried playing with that threshold, and 8 (actually even 9) was the best value.
|
I guess the size regression was because the call overhead was larger than the generated code for very small functions. |



Fixes: #78300
Codegen for `Test`: was 74 bytes of code, now 28 bytes.
Currently, we almost never inline anything above 16 bytes of IL in throw blocks, but it seems like a good idea not to inline small methods either:
Pros:
`throw`
Cons:
`call` overhead, likely not noticeable. We can handle this via Dynamic PGO (if a throw block is not cold according to the profile).