Description
Reproduction
Code:
require 'bundler/inline'

gemfile do
  source 'https://rubygems.org'
  gem 'parslet', '2.0.0'
  gem 'minitest', '5.25.4'
end

require 'minitest'
require "parslet"
require "parslet/convenience"

module Problem
  class PegParser < Parslet::Parser
    rule(:code_fence) {
      match(/\A`/) >> str("``")
    }

    rule(:anything_but_code_fence) {
      (code_fence.absent? >> any).repeat(1, nil)
    }
  end
end

class ProblemDemo < Minitest::Test
  def test_parses_code_fence_as_expected
    parsed = Problem::PegParser.new.code_fence.parse_with_debug("```")
    assert_equal("```", parsed)
  end

  def test_slash_a_works_as_expected
    assert_raises(Parslet::ParseFailed) {
      _ = Problem::PegParser.new.code_fence.parse("NOT A CODE FENCE ```")
    }
  end

  def test_slash_a_any_repeat_does_not_work_as_expected
    input = "NOT A CODE FENCE ```"
    parsed = Problem::PegParser.new.anything_but_code_fence.parse_with_debug(input)
    assert_equal(input, parsed)
  end
end

Run:
$ gem install m
$ m parselet_problem_test.rb
Run options: -n "/^(test_parses_code_fence_as_expected|test_slash_a_works_as_expected|test_slash_a_any_repeat_does_not_work_as_expected)$/" --seed 16544
# Running:
Extra input after last repetition at line 1 char 18.
`- Failed to match sequence (!CODE_FENCE .) at line 1 char 18.
   `- Input should not start with CODE_FENCE at line 1 char 18.
F..
Finished in 0.000818s, 3667.4819 runs/s, 3667.4819 assertions/s.
1) Failure:
ProblemDemo#test_slash_a_any_repeat_does_not_work_as_expected [parselet_problem_test.rb:40]:
Expected: "NOT A CODE FENCE ```"
Actual: nil
3 runs, 3 assertions, 1 failures, 0 errors, 0 skips
Expected
I expect to be able to write a parser that captures a pattern only when it begins at the start of a line, and to reuse that same parser to capture anything EXCEPT that exact match via absent? and repeat. I expect all three tests to pass.
Actual
The last test above fails:
Extra input after last repetition at line 1 char 18.
`- Failed to match sequence (!CODE_FENCE .) at line 1 char 18.
   `- Input should not start with CODE_FENCE at line 1 char 18.
F
Finished in 0.040330s, 570.2951 runs/s, 1785.2715 assertions/s.
1) Failure:
ProblemDemo#test_slash_a_any_does_not_work_as_expected [test/rundoc/peg_parser_test.rb:33]:
Expected: "NOT A CODE FENCE ```"
Actual: nil
This happens because each iteration of the repeat consumes one character via any, so "NOT A CODE FENCE ```" gets pared down to "OT A CODE FENCE ```" and the parser continues. However, when it reaches "```" (char 18), \A appears to match at the current parse position, so the parser incorrectly concludes that the backticks sit at the beginning of a line, when they do not in the original document.
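A minimal sketch of what may be going on, using only Ruby's standard library. It assumes that Parslet scans its input with StringScanner, which is an assumption and has not been verified against Parslet's source here. By default, StringScanner treats \A as matching at the scan pointer rather than at the start of the string; the fixed_anchor: true option (Ruby 2.7+) changes that.

```ruby
require 'strscan'

input = "NOT A CODE FENCE ```"
fence_at = input.index("`") # 0-based offset of the first backtick

# Default mode: \A is anchored to the current scan pointer,
# so it matches even though we are in the middle of the string.
default_scanner = StringScanner.new(input)
default_scanner.pos = fence_at
default_result = default_scanner.match?(/\A`/) # length of the match, or nil

# fixed_anchor mode: \A only matches at the true start of the string.
fixed_scanner = StringScanner.new(input, fixed_anchor: true)
fixed_scanner.pos = fence_at
fixed_result = fixed_scanner.match?(/\A`/)

puts "default:      #{default_result.inspect}"
puts "fixed_anchor: #{fixed_result.inspect}"
```

If the assumption holds, this would explain why match(/\A`/) succeeds at char 18: from the scanner's point of view, every position it is asked to match at looks like the start of input.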
Considerations
I'm unsure whether this is actually a bug (it's unexpected to me, but perhaps it's by design). Is this behavior expected? If so, is there a workaround or pattern I can use to capture everything up to a parser that must start on a new line?