Question
TLDR: I want to match anything, but /.+?/ doesn't seem to work. Why?
I have the following super simple grammar and code:
from lark import Lark, Tree
parser: Lark = Lark(r"""
rterm: "(___hole 0" anything ")"
anything: /.+?/
%import common.ESCAPED_STRING
%import common.SIGNED_NUMBER
%import common.WS
%ignore WS
""", start='rterm')
test_strings: list[str] = ["(___hole 0 (fun n : nat => ___hole 1 (___hole 2 eq_refl : 0 + n = n)))"]
for test_string in test_strings:
    print(f'{test_string=}')
    tree: Tree = parser.parse(test_string)
    print(tree.pretty())
When I try to parse the only test string I have, it gives me an error:
Traceback (most recent call last):
  File "/Users/brandomiranda/miniconda/envs/iit_term_synthesis/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3398, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-18-352bf581b4ee>", line 19, in <cell line: 17>
    tree: Tree = parser.parse(test_string)
  File "/Users/brandomiranda/miniconda/envs/iit_term_synthesis/lib/python3.9/site-packages/lark/lark.py", line 581, in parse
    return self.parser.parse(text, start=start, on_error=on_error)
  File "/Users/brandomiranda/miniconda/envs/iit_term_synthesis/lib/python3.9/site-packages/lark/parser_frontends.py", line 106, in parse
    return self.parser.parse(stream, chosen_start, **kw)
  File "/Users/brandomiranda/miniconda/envs/iit_term_synthesis/lib/python3.9/site-packages/lark/parsers/earley.py", line 297, in parse
    to_scan = self._parse(lexer, columns, to_scan, start_symbol)
  File "/Users/brandomiranda/miniconda/envs/iit_term_synthesis/lib/python3.9/site-packages/lark/parsers/xearley.py", line 144, in _parse
    to_scan = scan(i, to_scan)
  File "/Users/brandomiranda/miniconda/envs/iit_term_synthesis/lib/python3.9/site-packages/lark/parsers/xearley.py", line 118, in scan
    raise UnexpectedCharacters(stream, i, text_line, text_column, {item.expect.name for item in to_scan},
lark.exceptions.UnexpectedCharacters: No terminal matches 'f' in the current parser context, at line 1 col 13
(___hole 0 (fun n : nat => ___hole 1 (___hole 2 eq_r
            ^
Expected one of:
    * RPAR
Focus on the last line:
lark.exceptions.UnexpectedCharacters: No terminal matches 'f' in the current parser context, at line 1 col 13
(___hole 0 (fun n : nat => ___hole 1 (___hole 2 eq_r
            ^
Expected one of:
    * RPAR
This surprises me because I would have expected .+? to match any character, but it claims that it can't match the f. Does anyone know why?
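For what it's worth, here is a quick check of how the pattern behaves in plain Python `re`, outside of Lark. This is only a sketch: the names `prefix`, `rest`, `lazy`, and `greedy` are mine, and it assumes (I have not verified this) that Lark hands the terminal's regex to `re` more or less unchanged. With nothing after it in the pattern, the lazy `.+?` stops after a single character, which would at least be consistent with the parser wanting `)` at column 13.

```python
import re

text = "(___hole 0 (fun n : nat => ___hole 1 (___hole 2 eq_refl : 0 + n = n)))"

# Skip the literal prefix and the whitespace that the grammar ignores.
prefix = "(___hole 0 "
rest = text[len(prefix):]

# Lazy quantifier: matches as few characters as possible (at least one).
lazy = re.match(r".+?", rest)
# Greedy quantifier: matches as many characters as possible on the line.
greedy = re.match(r".+", rest)

print(repr(lazy.group(0)))
print(repr(greedy.group(0)))

# 1-based column of the first character left over after the lazy match.
next_col = len(prefix) + len(lazy.group(0)) + 1
print(next_col, repr(text[next_col - 1]))
```

If that assumption holds, `anything` would consume only the `(` at column 12, and the next thing the `rterm` rule allows is the closing `")"`, not `f`.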
Research
I've searched and found these two relevant questions, but their contents didn't help:
- https://github.com/lark-parser/lark/issues/257
- Lark parser can't parse characters, even though they are defined in regex of rule: this one seems helpful due to the type of error. It's not matching the char f for some reason, but the dot should have captured that, no?
(Nearly) cross-posted here: https://github.com/lark-parser/lark/discussions/1163