Potential use-after-free around QSTRs? #12667
Hello! I'm trying to debug a peculiar issue where I'm occasionally seeing a cryptic exception being thrown at boot, coming from
The issue is quite hard to debug - adding a bunch of printfs or making some random changes in the listed Python files can make it disappear. I did, however, manage to gather some insight into what's going on. The exception gets thrown from
Turns out that the QSTR that gets passed into When placing a breakpoint on mp_raise_ValueError, I've managed to find out that
At that point the pool size is 320. As you can see,
After looking a bit closer it appears that the last pool allocation (the one that creates the pool of size 320) triggers GC. What caught my attention is this comment in
Haven't caught it red-handed yet (as mentioned, the problem is pretty hard to trigger at will), but my current suspicion is that:
I'm not familiar with MicroPython's code, so I wanted to ask: does that make any sense? Am I missing something? There's a lot of stuff running besides MicroPython in the project I'm working on, so it could very well be caused by something else. However, my theory of the GC freeing the last qstr chunk and leaving a dangling pointer to it appears plausible to me based on what I've read in the code so far, so I figured it wouldn't hurt to ask for a second look. It feels like someone already familiar with this code should be able to validate this theory much quicker than me :) In case it's relevant, this is all happening on an ESP32-S3 with PSRAM and auto heap split enabled.
Replies: 1 comment 2 replies
Thanks for the extremely detailed report, @dos1. I will have to investigate in more detail, but your theory that this is a use-after-free of sorts does at least sound plausible given what you're seeing.
FWIW the GC also treats the entire C stack, as well as all registers, as potential root pointers. So even if We have seen this mechanism go wrong, where e.g. the only thing in a register/stack is an offset from the pointer (and it's quite hard to guard against that), but I'm not sure that's likely to be the issue here.
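To illustrate the "offset from the pointer" failure mode, here is a toy sketch in C. It is not MicroPython's actual collector: `toy_alloc_t` and `toy_is_reachable` are hypothetical names, and the rule "a root word only counts if it equals the start address of the allocation" is a simplifying assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of a single heap allocation. */
typedef struct {
    uintptr_t start; /* address of the first byte of the allocation */
    size_t len;      /* size in bytes (unused by the scan below) */
} toy_alloc_t;

/*
 * Conservative scan over a snapshot of "root words" (stand-ins for the
 * C stack and saved registers). Assumption for this sketch: a word keeps
 * the allocation alive only if it is exactly the start address.
 */
static bool toy_is_reachable(const toy_alloc_t *a,
                             const uintptr_t *roots, size_t n_roots) {
    assert(a != NULL);
    for (size_t i = 0; i < n_roots; i++) {
        if (roots[i] == a->start) {
            return true; /* a root holds the base pointer */
        }
    }
    return false; /* only offsets (or nothing) found: block looks free */
}
```

Under this assumption, a root set containing the base address keeps the block alive, while a root set that only holds `start + 16` (for example, because the compiler kept just a derived pointer in a register) does not, and a sweep would then free memory that is still in use.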
Turns out it wasn't detailed enough 😆 I forgot to mention that I'm seeing this on a somewhat older version of mpy (from somewhere around 1.20) with backported auto heap split support. Which leads to...
Thanks for this hint! Turns out the patch from #12229 was missing. While it's hard to be 100% sure that applying it fixed the issue, so far I have been unable to reproduce it with the patch applied, and judging from the code it does seem very plausible that thi…