
@LukeSerne
Contributor

Fixes #3587

This PR fixes a correctness issue in the decompiler output. Previously, CastStrategy::markExplicitLongSize only applied to the first input of shift operations. "L" suffixes are more widely useful, and sometimes even required for correct decompilation output: "-0x80000000" and "-0x80000000L" compile to very different constants. I'm not sure why CastStrategy::markExplicitLongSize was restricted to the first input of a shift operation, so I just removed the constraint. This might cause regressions elsewhere. If it does, please let me know.

The example below (copied from the linked issue) demonstrates a case where the missing L suffix caused the decompiler output to compile to different code than the original.

Before:

int main(void)

{
  long lVar1;
  undefined4 local_14;
  
  lVar1 = fake_function();
  if (lVar1 < -0x80000000)
  {
    local_14 = 0x7b;
  }
  else
  {
    local_14 = 0x457;
  }
  return local_14;
}

After:

int main(void)

{
  long lVar1;
  undefined4 local_14;
  
  lVar1 = fake_function();
  if (lVar1 < -0x80000000L)
  {
    local_14 = 0x7b;
  }
  else
  {
    local_14 = 0x457;
  }
  return local_14;
}

@Wall-AF

Wall-AF commented Dec 26, 2025

@LukeSerne this kind of issue happens elsewhere, e.g. in 16-bit x86 code, where the sizes of datatypes seem to be ignored! In this case, I'm assuming the long is 64-bit, but Ghidra is internally using 32!

Maybe there's some way the correct sizes could be passed to the decompiler for it to act accordingly? Here, I'm also thinking of sized pointers, i.e. near/far (16-/32-bit)!

@LukeSerne
Contributor Author

LukeSerne commented Dec 28, 2025

@Wall-AF Do you have an example of a function where that happens? Ghidra does support sized pointers (for example char *32 for a 32-bit pointer to a char), but maybe this is a 16-bit specific issue?

Does this PR improve the situation in that example, or is it unaffected?

@Wall-AF

Wall-AF commented Dec 28, 2025

@LukeSerne This example (produced from the decompiler menu Debug Function Decompiler) isn't using longs but pointer-to-pointer (maybe -to-pointer, it's hard to work out currently), where the 32-bitness is spread over two 16-bit memory assignments (MOV ES:[BX+2],0; MOV ES:[BX],0).

This one is showing that a pointer to an array of objects (sized 12) isn't indexing properly either! (NB, I've seen situations where if a member (never the 1st one) of the structure is referenced, then the index is nicely applied! [1])

Footnotes

  1. `typedef struct tagA {int n; Obj* pObj;} A; A* aA = new(sizeof(A)*20); int x = 5; Obj* pObj = aA[x].pObj;`

@LukeSerne
Contributor Author

@Wall-AF thanks for the examples. Those seem like a different issue (maybe multiple?) though, so I think it's better if you create a new issue for those (or, if an issue already exists, post them there). This PR only affects whether L is printed after an integer constant in the decompiler, so it won't help with those examples...

@Wall-AF

Wall-AF commented Dec 29, 2025

Soz, thought the principle was the same, making it one that might get resolved! Re new/alternate issue, there's so many that aren't going anywhere!


Labels

Feature: Decompiler
Status: Triage (Information is being gathered)

Development

Successfully merging this pull request may close these issues.

INT_MIN(-2147483648) is decompiled to hex form?
