tpwrules (Contributor) commented Jul 28, 2025

Alternative to #30754 to close #30722. I have been severely nerd-sniped, but here goes... Please see the commit messages for further detail.

This applies to anything that uses %f with hal.util->snprintf, or with the functions that we point at it using linker wraps (augh).

Resolves two problems. Out of the ~4.2 billion floats:

  • Subnormal numbers are printed as half their true value, e.g. 0.<sufficient zeros>2 was printed as 0.<same zeros>1. This affects the ~16.7 million non-zero values with fabsf(v) < 1.175494e-38f (FLT_MIN).
  • The 26 smallest non-zero floats (fabsf(v) <= 2e-44f) send the formatter into an infinite loop, causing a watchdog reboot, even in flight, if executed on the main thread!

Notices, but does not resolve, two additional problems:

  • The requested number of digits is not always produced: the tests request 99 and get fewer, depending on the magnitude of the number.
  • The precision isn't enough to distinguish every float.
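
On the second point, a general fact about IEEE 754 binary32 (not something specific to this implementation) is that up to 9 significant decimal digits are needed to tell any two floats apart. A minimal Python sketch, in which `f32_from_bits` and `distinguishable` are illustrative helper names rather than anything from the codebase, shows two adjacent floats that collide at 7 significant digits and separate at 9:

```python
import struct

def f32_from_bits(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def distinguishable(a, b, sig_digits):
    """True if a and b format differently with sig_digits significant digits."""
    fmt = "%%.%dg" % sig_digits
    return (fmt % a) != (fmt % b)

one = f32_from_bits(0x3F800000)      # exactly 1.0
next_up = f32_from_bits(0x3F800001)  # the next float32 above 1.0

print(distinguishable(one, next_up, 7))  # the two values collide
print(distinguishable(one, next_up, 9))  # the two values separate
```

Python's own `%g` is used here only as a reference formatter; it says nothing about how many digits the ported engine actually carries.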

This study of %f problems is not exhaustive. I did dig out an OG Genuino Arduino Uno R3 and avr-gcc 7.3.0 to test the (nominally) original assembly this function was ported from, and it has none of these problems. Fixing the latter two concerns me slightly because of the risk of overflows, and none of these problems has been obvious for a decade or more.

We should perhaps consider modernizing our implementation (or digging deeper into the corresponding assembly to fix the other port problems) but this is a surprisingly rich and subtle field. We are also unfortunately not fully consistent in using this implementation.

Subnormal values were formatted as 1/2 of their true value because it
was assumed the mantissa's significance corresponded to an exponent of
0, not the correct exponent of 1 (which represents 2x the value). Fix by
setting the correct exponent in this case.

This affects the ~16.7 million floats closest to zero.
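
The exponent mix-up can be reproduced outside the firmware. The sketch below decodes a binary32 bit pattern by hand in Python; `decode_f32` and its `subnormal_exponent` parameter are hypothetical names for illustration and are not taken from ftoa_engine. Passing 0 models the assumption described above, and passing 1 models the fix.

```python
import struct

def f32_from_bits(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def decode_f32(bits, subnormal_exponent=1):
    """Decode a finite binary32 bit pattern by hand (inf/NaN not handled)."""
    sign = -1.0 if bits >> 31 else 1.0
    exp_field = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exp_field == 0:
        # Subnormal: no implicit leading 1, and the exponent is pinned
        # to the same scale as exponent field 1.
        exponent = subnormal_exponent
    else:
        mantissa |= 0x800000  # restore the implicit leading 1
        exponent = exp_field
    # 127 is the exponent bias, 23 the number of stored mantissa bits.
    return sign * mantissa * 2.0 ** (exponent - 127 - 23)

bits = 0x00000002                               # a subnormal with mantissa 2
correct = decode_f32(bits)                      # exponent 1
buggy = decode_f32(bits, subnormal_exponent=0)  # exponent 0
print(correct == f32_from_bits(bits), buggy * 2 == correct)

# Non-zero subnormals: 2**23 - 1 mantissa values, times two signs.
subnormal_count = 2 * (2 ** 23 - 1)
print(subnormal_count)
```

The count works out to 16,777,214, which matches the "~16.7 million" figure.
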

tpwrules (Contributor, Author) commented Jul 28, 2025

Empirically, %.4f is completely safe; it's %.5f (and larger precisions, and plain %f) that hangs when formatting 1e-45f (the smallest positive float). So fortunately I don't think this is actually hittable in flight code (although Lua scripts can easily hit it); I can't find such an instance.

tpwrules and others added 2 commits July 28, 2025 18:47
Subnormal floats with mantissas of 7 or less (14 before the previous
fix) have fewer digits to format than the conversion loop expects,
causing it to never meet the condition to exit and looping forever. This
cannot occur with normal floats because the implicit leading 1 always
produces a sufficiently large value to format.

This causes a hang with the 12 floats closest to zero (26 before the
previous fix) if a suitably large precision is passed (which it is by
default using e.g. `%f`). Fix by exiting the conversion loop if we
reach the case where we can't add any more digits.

Co-authored-by: JasonMDavey <[email protected]>
Ensures subnormals aren't formatted grossly wrong nor cause hangs.

Reveals two additional bugs:

* The requested number of digits is not always produced.

* The precision isn't enough to distinguish every float.

Outputs from the (nominally) original assembly implementation running on
a real AVR (avr-gcc 7.3.0) are included. That implementation exhibits
none of these flaws so it's likely they have been introduced during the
C port.

Co-authored-by: Peter Barker <[email protected]>

tpwrules (Contributor, Author) commented Jul 28, 2025

Changed to check for the problematic case inside the loop and bail out no matter how it arises, instead of checking for the numbers that cause it before entering the loop. This should be more robust, but might be slightly slower.
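
The general pattern can be sketched with a toy digit loop. This is not the ftoa_engine code: `leading_digits` and its long-division loop are hypothetical, chosen only because they can hang for a loosely similar reason (an input that never yields the digits the exit condition is waiting for).

```python
def leading_digits(numerator, denominator, want):
    """Return up to `want` decimal digits of numerator/denominator (< 1),
    starting at the first non-zero digit."""
    digits = []
    while len(digits) < want:
        numerator *= 10
        digit, numerator = divmod(numerator, denominator)
        if digit or digits:
            # Leading zeros are skipped; everything after the first
            # non-zero digit is kept.
            digits.append(digit)
        elif numerator == 0:
            # In-loop guard: no non-zero digit can ever appear, so the
            # exit condition above would never be met. Bail out however
            # we got here.
            break
    return digits

print(leading_digits(1, 8, 3))  # digits of 0.125
print(leading_digits(0, 8, 3))  # would spin forever without the guard
```

A pre-check such as `if numerator == 0: return []` would also fix this toy, but it only covers inputs someone thought of in advance. The in-loop guard costs one extra comparison per skipped leading zero and catches the stuck state regardless of which input produced it.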

rmackay9 merged commit 529bf32 into ArduPilot:master Jul 29, 2025
109 of 111 checks passed
tpwrules deleted the pr/ftoa-fix branch July 29, 2025 13:44
robertlong13 moved this from Pending to 4.6.3-beta1 in 4.6 Backports Sep 30, 2025
Linked issue: Infinite loop in AP_HAL/utility/ftoa_engine when formatting floats with very small exponents