I'm concentrating on checking for error conditions in a parser design using Spirit X3. One group of checks is the character category checks, like isalpha or ispunct. According to the X3 documentation on Character Parsers, they should match what C++ provides as std::isalpha and std::ispunct. However, the demonstration below, which compares ascii::cntrl against std::iscntrl, gives me different results.
#include <cstddef>
#include <cstdio>
#include <cstdint>
#include <cctype>
#include <iostream>
#include <boost/spirit/home/x3/version.hpp>
#include <boost/spirit/home/x3.hpp>

namespace client::parser
{
    namespace x3 = boost::spirit::x3;
    namespace ascii = boost::spirit::x3::ascii;
    using ascii::char_;
    using ascii::space;
    using x3::skip;

    // Rule under test: matches a single character accepted by ascii::cntrl.
    x3::rule<class main_rule_id, char> const main_rule_ = "main_rule";
    const auto main_rule__def = ascii::cntrl;
    BOOST_SPIRIT_DEFINE( main_rule_ )

    // The rule is wrapped in a skip directive that uses ascii::space as the skipper.
    const auto entry_point = skip(space) [ main_rule_ ];
}

int main()
{
    printf( "Spirit X3 version: %4.4x\n", SPIRIT_X3_VERSION );
    char output;
    bool r = false;  // answer according to X3
    bool r2 = false; // answer according to default "C" locale
    char input[2];
    input[1] = 0;
    printf( "ascii::cntrl\n" );
    uint8_t i = 0;
next_char:
    // Parse a one-character input and compare against std::iscntrl.
    input[0] = (char)i;
    r = parse( (char*)input, input+1, client::parser::entry_point, output );
    r2 = (bool)std::iscntrl( (unsigned char)i );
    printf( "%2.2x:%d%d", i, r, r2 );
    if ( i == 0x7f ) { goto exit_loop; }
    ++i;
    // Eight entries per output line.
    if ( i % 8 ) { putchar( ' ' ); } else { putchar( '\n' ); }
    goto next_char;
exit_loop:
    return 0;
}
The output is:
Spirit X3 version: 3004
ascii::cntrl
00:11 01:11 02:11 03:11 04:11 05:11 06:11 07:11
08:11 09:01 0a:01 0b:01 0c:01 0d:01 0e:11 0f:11
10:11 11:11 12:11 13:11 14:11 15:11 16:11 17:11
18:11 19:11 1a:11 1b:11 1c:11 1d:11 1e:11 1f:11
20:00 21:00 22:00 23:00 24:00 25:00 26:00 27:00
28:00 29:00 2a:00 2b:00 2c:00 2d:00 2e:00 2f:00
30:00 31:00 32:00 33:00 34:00 35:00 36:00 37:00
38:00 39:00 3a:00 3b:00 3c:00 3d:00 3e:00 3f:00
40:00 41:00 42:00 43:00 44:00 45:00 46:00 47:00
48:00 49:00 4a:00 4b:00 4c:00 4d:00 4e:00 4f:00
50:00 51:00 52:00 53:00 54:00 55:00 56:00 57:00
58:00 59:00 5a:00 5b:00 5c:00 5d:00 5e:00 5f:00
60:00 61:00 62:00 63:00 64:00 65:00 66:00 67:00
68:00 69:00 6a:00 6b:00 6c:00 6d:00 6e:00 6f:00
70:00 71:00 72:00 73:00 74:00 75:00 76:00 77:00
78:00 79:00 7a:00 7b:00 7c:00 7d:00 7e:00 7f:11
The first digit after the colon is the answer according to X3, and the second digit is the answer according to C++. The mismatch happens on the characters 0x09 through 0x0d, which also fall into the isspace category. I have been digging through the library headers lately, but I still haven't found the part that explains this behavior.
Why the disparity? Have I missed something?
Oh yeah, I love my goto statements. And my retro C style. I hope you do too! Even for an X3 parser.