@silentbicycle (Collaborator) commented Aug 26, 2025

PR #509 introduced a bug: it didn't distinguish between an unexpected end of input and an end of input in a zone that matches but ignores its input. As a result, several lxpos tests failed, getting TOK_UNKNOWN rather than TOK_EOF when the input has trailing whitespace. I didn't notice until after merging because the normal build doesn't regenerate the code for src/lx/lexer.lx or src/libfsm/lexer.lx. (I had ensured all the libre dialect lexers and parsers were regenerated, but missed those.)

Only src/lx/print/c.c has code changes; the other files are all generated code updates.

Instead of always generating TOK_UNKNOWN, this inspects the zone mappings to determine whether the current end ID represents a dead end for the zone. If not, it generates TOK_EOF instead.
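
As a rough illustration of the decision described above, here is a minimal sketch. It is not the code in src/lx/print/c.c: the `sketch_zone` struct, its fields, and the `SKETCH_TOK_*` names are hypothetical stand-ins, and it assumes each zone can report the end IDs that are dead ends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token kinds; generated lexers define their own. */
enum sketch_token { SKETCH_TOK_EOF, SKETCH_TOK_UNKNOWN };

/* Hypothetical zone record, NOT lx's struct ast_zone: a linked list of
 * zones, each listing the end IDs that are dead ends in that zone. */
struct sketch_zone {
    const unsigned *dead_end_ids;
    size_t dead_end_count;
    const struct sketch_zone *next;
};

/* Linear scan across all zones: true if end_id is a dead end in any. */
static bool
sketch_is_dead_end(const struct sketch_zone *zones, unsigned end_id)
{
    for (const struct sketch_zone *z = zones; z != NULL; z = z->next) {
        assert(z->dead_end_count == 0 || z->dead_end_ids != NULL);
        for (size_t i = 0; i < z->dead_end_count; i++) {
            if (z->dead_end_ids[i] == end_id) {
                return true;
            }
        }
    }
    return false;
}

/* Choose the token to report when input ends at the given end ID:
 * a dead end means the input stopped unexpectedly, anything else is
 * treated as a clean end of input. */
static enum sketch_token
sketch_token_at_end_of_input(const struct sketch_zone *zones, unsigned end_id)
{
    return sketch_is_dead_end(zones, end_id)
        ? SKETCH_TOK_UNKNOWN
        : SKETCH_TOK_EOF;
}
```

With this shape, an end ID that no zone marks as a dead end (for example, one reached after trailing whitespace in a zone that ignores its input) maps to the EOF token rather than the unknown token.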

@silentbicycle requested a review from katef August 26, 2025 19:08
* should stay small enough that linear search is fine. If this becomes
* prohibitively expensive, then build a bitset of dead-end IDs upfront
* in one pass. */
for (struct ast_zone *z = ast->zl; z != NULL; z = z->next) {
@silentbicycle (Collaborator, Author) commented:

I set up a variety of scenarios and they all behaved consistently with this being how lx identifies dead ends internally, but please let me know if I'm misunderstanding something or if there's a more direct way to check this. I didn't see a way to tell which zone the accept_c callback is running inside of, but the number of zones should be small in practice, so a linear scan across all of them should be fine.
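
For reference, the fallback mentioned in the code comment (building a bitset of dead-end IDs upfront in one pass) could look roughly like the sketch below. All names here are hypothetical, and it assumes the dead-end IDs can be collected into a flat array first; it does not reflect existing lx or libfsm APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed-size bitset indexed by end ID. */
struct sketch_bitset {
    uint64_t *words;
    size_t nbits;
};

static void
sketch_bitset_set(struct sketch_bitset *bs, size_t id)
{
    assert(id < bs->nbits);
    bs->words[id / 64] |= (uint64_t)1 << (id % 64);
}

/* O(1) membership test; IDs outside the bitset are never dead ends. */
static bool
sketch_bitset_test(const struct sketch_bitset *bs, size_t id)
{
    if (id >= bs->nbits) {
        return false;
    }
    return ((bs->words[id / 64] >> (id % 64)) & 1u) != 0;
}

/* One pass over the collected dead-end IDs. Every ID must be below
 * nbits. Returns false if allocation fails. */
static bool
sketch_build_dead_ends(struct sketch_bitset *bs, size_t nbits,
    const unsigned *ids, size_t count)
{
    size_t nwords = (nbits + 63) / 64;
    bs->words = calloc(nwords > 0 ? nwords : 1, sizeof *bs->words);
    bs->nbits = nbits;
    if (bs->words == NULL) {
        return false;
    }
    for (size_t i = 0; i < count; i++) {
        sketch_bitset_set(bs, ids[i]);
    }
    return true;
}

static void
sketch_bitset_free(struct sketch_bitset *bs)
{
    free(bs->words);
    bs->words = NULL;
    bs->nbits = 0;
}
```

This trades a one-time allocation for constant-time lookups, which would only pay off if the number of zones or end IDs grew enough for the linear scan to become noticeable.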

@katef merged commit 6c66234 into main Aug 29, 2025
346 checks passed
@katef deleted the sv/fix-lx-handling-for-EOF-broken-by-509 branch August 29, 2025 00:32