I am writing a grammar for multi-line strings (text blocks) in Java. A text block starts and ends with triple quotes. I can successfully parse and build the AST for text blocks and their content, except for one issue: the TEXT_BLOCK_START token is returned after the tokens from the second lexer instead of before them. I am using this flow diagram as a guide. According to the ANTLR2 documentation, my implementation should produce the following token stream:
TEXT_BLOCK_START -> content from second lexer, etc... -> TEXT_BLOCK_END
I have tried changing the order of the action and the delimiter, the order of the rules, and using select() instead of selector.push().
Here are the important parts of the main class:
// Main lexer, reading from the input.
final Lexer lexer = new Lexer(reader);
lexer.setCommentListener(contents);
// Second lexer shares the main lexer's input state, so both read the same characters.
final Lexer secondLexer = new Lexer(lexer.getInputState());
lexer.setTokenObjectClass("antlr.CommonHiddenStreamToken");
secondLexer.setTokenObjectClass("antlr.CommonHiddenStreamToken");
// Only the main lexer is wrapped by the hidden-token filter.
final TokenStreamHiddenTokenFilter filter = new TokenStreamHiddenTokenFilter(lexer);
final TokenStreamSelector selector = new TokenStreamSelector();
lexer.selector = selector;
secondLexer.selector = selector;
// Register both streams by name and start with the filtered main lexer.
selector.addInputStream(filter, "filter");
selector.addInputStream(secondLexer, "secondLexer");
selector.select(filter);
The lexer (main lexer) rule:
TEXT_BLOCK_START
    : "\"\"\"" {selector.push("secondLexer");}
    ;
The secondary lexer rule:
TEXT_BLOCK_END
    : "\"\"\"" {selector.pop();}
    ;
As stated above, everything parses as expected, except that the token stream looks like this:
content from second lexer, etc... -> TEXT_BLOCK_END -> TEXT_BLOCK_START
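To make the difference between the two orders concrete, here is a minimal, self-contained Java sketch. It is a toy model and not ANTLR code: the class SelectorOrderModel, its methods, and the token names IDENT, ASSIGN, SEMI and TEXT_BLOCK_CONTENT are all hypothetical. With readAhead set to false it models what I expect from push() and pop(). With readAhead set to true it models an assumption I have not verified: that the stream registered with the selector (the filter) reads one token ahead of the token it returns, so the TEXT_BLOCK_START action would fire before that token is handed out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model only: hypothetical stand-ins, not ANTLR types or ANTLR behaviour.
class SelectorOrderModel {
    private final Iterator<String> mainLexer;
    private final Iterator<String> secondLexer;
    private final boolean readAhead;     // true: main stream buffers one token (assumption)
    private boolean inTextBlock = false; // true while the second lexer is selected
    private String buffered = null;      // token already read but not yet returned

    SelectorOrderModel(List<String> mainTokens, List<String> blockTokens, boolean readAhead) {
        this.mainLexer = mainTokens.iterator();
        this.secondLexer = blockTokens.iterator();
        this.readAhead = readAhead;
    }

    // Reads from the main lexer; the TEXT_BLOCK_START action runs as a side effect.
    private String readMain() {
        String token = mainLexer.hasNext() ? mainLexer.next() : "EOF";
        if (token.equals("TEXT_BLOCK_START")) {
            inTextBlock = true;          // models selector.push("secondLexer")
        }
        return token;
    }

    // Reads from the second lexer; the TEXT_BLOCK_END action runs as a side effect.
    private String readSecond() {
        String token = secondLexer.next();
        if (token.equals("TEXT_BLOCK_END")) {
            inTextBlock = false;         // models selector.pop()
        }
        return token;
    }

    // Models asking the selector for the next token.
    String nextToken() {
        if (inTextBlock) {
            return readSecond();
        }
        if (!readAhead) {
            return readMain();
        }
        if (buffered == null) {
            buffered = readMain();       // prime the one-token buffer on first use
        }
        String result = buffered;
        buffered = readMain();           // the action for the NEXT token fires here
        return result;
    }

    // Drains a small scripted input and returns the order in which tokens come out.
    static List<String> run(boolean readAhead) {
        SelectorOrderModel model = new SelectorOrderModel(
            List.of("IDENT", "ASSIGN", "TEXT_BLOCK_START", "SEMI", "EOF"),
            List.of("TEXT_BLOCK_CONTENT", "TEXT_BLOCK_END"),
            readAhead);
        List<String> out = new ArrayList<>();
        String token;
        do {
            token = model.nextToken();
            out.add(token);
        } while (!token.equals("EOF"));
        return out;
    }

    public static void main(String[] args) {
        System.out.println("no read-ahead: " + String.join(" -> ", run(false)));
        System.out.println("read-ahead:    " + String.join(" -> ", run(true)));
    }
}
```

In this model the run without read-ahead yields TEXT_BLOCK_START before the block content, while the run with read-ahead yields it after TEXT_BLOCK_END, which is the order I am seeing. If the real filter does not buffer a token this way, the model does not apply.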
What am I missing here?