I'm writing a parser which has some tokens that are concatenated from multiple smaller rules, using yymore().
If the lexer reaches EOF before the end of such a composite token, I need it to return a special error token to the parser. This is the same problem as in this question.
The answer there suggests converting the parser to a "push parser" to solve this.
The Bison manual makes it pretty clear how to set up the parser side of a push parser, but I cannot find similar instructions on what the lexer should look like.
Let's take the following lexer:
%option noyywrap
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// Stub of the parser header file:
#define GOOD_STRING 1000
#define BAD_STRING 1001
char *yylval;
%}
%x STRING
%%
\" { BEGIN(STRING); yymore(); }
<STRING>{
\" { BEGIN(INITIAL); yylval = strdup(yytext); return GOOD_STRING; }
.|\n { yymore(); }
<<EOF>> { BEGIN(INITIAL); yylval = strdup(yytext); return BAD_STRING; }
}
.|\n { return yytext[0]; }
%%
void parser_stub(void)
{
    int token;
    while ((token = yylex()) > 0) {
        if (token < 1000) {
            printf("%i '%c'\n", token, (char)token);
        } else {
            printf("%i \"%s\"\n", token, yylval);
            free(yylval);
        }
    }
}

int main(void)
{
    parser_stub();
}
It doesn't work as a pull parser: after the <<EOF>> rule returns BAD_STRING, yylex() is called again and the scanner continues past EOF, which ends in the error "fatal flex scanner internal error--end of buffer missed".
(It works if yymore() is not used, but that is still technically undefined behavior.)
The <<EOF>> rule needs to emit two tokens: BAD_STRING and then 0 (end of input).
How do you convert a lexer into one suitable for a push-parser?
I'm guessing it involves replacing the return statements with something that pushes a token to the parser without ending yylex(), but I haven't found any mention of such a function or macro.
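To illustrate the control flow I have in mind, here is a minimal sketch in plain C (not Flex). It hand-rolls the same rules as the lexer above, but pushes tokens to a consumer instead of returning them, so the end-of-input branch can emit both BAD_STRING and 0. The names push_token(), scan_all() and the pushed[] array are made up for this sketch; as far as I understand the Bison manual, the real call in place of push_token() would be yypush_parse(), and the scanning would of course stay in Flex actions.

```c
#include <stdio.h>
#include <string.h>

/* Token codes copied from the lexer stub in the question. */
#define GOOD_STRING 1000
#define BAD_STRING  1001

/* Hypothetical stand-in for the parser: it only records and prints
 * each pushed token. A real push parser would consume the token here. */
#define MAX_PUSHED 16
static int pushed[MAX_PUSHED];
static int pushed_count = 0;

static void push_token(int token, const char *text)
{
    if (pushed_count < MAX_PUSHED)
        pushed[pushed_count++] = token;
    printf("push %d \"%s\"\n", token, text ? text : "");
}

/* Hand-written imitation of the Flex rules: single characters become
 * their own tokens, quoted strings are accumulated (the yymore() part),
 * and nothing is returned per token; everything is pushed instead.
 * Returns the number of tokens pushed for this input. */
static int scan_all(const char *input)
{
    char buf[256];
    size_t len = 0;
    int in_string = 0;

    pushed_count = 0;
    for (const char *p = input; *p != '\0'; ++p) {
        if (!in_string) {
            if (*p == '"') {              /* like BEGIN(STRING); yymore(); */
                in_string = 1;
                len = 0;
                buf[len++] = *p;
            } else {                      /* like the final .|\n rule */
                char one[2] = { *p, '\0' };
                push_token((unsigned char)*p, one);
            }
        } else {
            if (len + 1 < sizeof buf)     /* accumulate, truncating if too long */
                buf[len++] = *p;
            if (*p == '"') {              /* closing quote: complete string */
                buf[len] = '\0';
                push_token(GOOD_STRING, buf);
                in_string = 0;
            }
        }
    }

    /* End of input: this is the part a returning <<EOF>> rule cannot do. */
    if (in_string) {
        buf[len] = '\0';
        push_token(BAD_STRING, buf);      /* first token: the error token */
    }
    push_token(0, NULL);                  /* second token: end of input */
    return pushed_count;
}
```

Calling scan_all("a\"bc") pushes three tokens in one pass: 'a', BAD_STRING for the unterminated string, and finally 0. What I can't figure out is how to get the equivalent of push_token() inside real Flex actions.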
Is this just a case of having to implement it manually, without any built-in support in Flex?