I haven't been able to find documentation of which characters make up the punctuation set "%p" in Lua.
3 Answers
The answer is locale-dependent: "%p" is a direct interface to the C function ispunct.
In general, if a standard C function does something similar to a Lua function, it is near-certain that the Lua function just wraps the C function, warts and all, even without looking at the specific case.
(This is part of the reason file:read() still has trouble reading text with embedded zeroes in 5.2, and may still have it in 5.3.)
While Amaden gave a good answer for the "C" locale, and ColonelThirtyTwo gave the right way to check for the current locale, the C standard only says:
ispunct(): The ispunct function tests for any printing character that is one of a locale-specific set of punctuation characters for which neither isspace nor isalnum is true. In the "C" locale, ispunct returns true for every printing character for which neither isspace nor isalnum is true.
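The "C"-locale wording quoted above can be checked from Lua itself. The following is a minimal sketch, assuming the process is in the "C" locale (the default unless os.setlocale has been called) and that characters are single bytes; is_c_punct is a hypothetical helper, and the byte range 0x20 to 0x7E stands in for "printing character" in ASCII.

```lua
-- Assumption: the "C" locale is active and characters are single bytes.
-- is_c_punct is a hypothetical helper, not part of Lua.
local function is_c_punct(c)
  local b = c:byte()
  -- ASCII printing characters, space included
  local printing = b >= 0x20 and b <= 0x7E
  -- "neither isspace nor isalnum is true"
  return printing and not c:find("%s") and not c:find("%w")
end

-- Compare the hand-written rule with what "%p" reports for every byte value.
local mismatches = 0
for i = 0, 255 do
  local c = string.char(i)
  local by_class = c:find("%p") ~= nil
  if by_class ~= is_c_punct(c) then
    mismatches = mismatches + 1
  end
end
print("mismatches: " .. mismatches)
```

In the "C" locale this should report no mismatches. In another locale the count may differ, since the set behind "%p" is then whatever that locale defines.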
A small script to find them:
    for i=0,255 do
        if string.match(string.char(i), "%p") then
            io.write(string.char(i))
        end
    end
    io.write("\n")
    -- $ luajit test.lua
    -- !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
%p is matched using the C function ispunct (see the Lua 5.2 C source), which in the "C" locale matches the following characters (listed with their octal codes):
041 !    042 "    043 #    044 $    045 %
046 &    047 '    050 (    051 )    052 *
053 +    054 ,    055 -    056 .    057 /
072 :    073 ;    074 <    075 =    076 >
077 ?    100 @    133 [    134 \    135 ]
136 ^    137 _    140 `    173 {    174 |
175 }    176 ~
(From man ispunct)
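A table in this layout can also be generated from Lua for whatever locale is active, which is a quick way to compare a system against the man page. This is only a sketch, assuming single-byte characters; the name cells is illustrative.

```lua
-- Collect "octal-code character" cells for every byte that "%p" matches.
local cells = {}
for i = 0, 255 do
  local c = string.char(i)
  if c:find("%p") then
    cells[#cells + 1] = string.format("%03o %s", i, c)
  end
end

-- Print the cells five per row.
for first = 1, #cells, 5 do
  print(table.concat(cells, "    ", first, math.min(first + 4, #cells)))
end
```

In the "C" locale this should list the same 32 characters as the table; a different locale may add entries above 0177.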