What regular expression should I use to search for a header that starts with a number, such as "1. Humility"?
Here's a screenshot of the sample data: http://www.knowledgenotebook.com/issue/sampleData.html
Thanks.
 
    
I don't know which regex flavor you're using, so I assume it's Perl compatible.
You should always post some example data in case your understanding of the regex is unclear.
Breaking down what your 'stop signs' are:
## left out of regex, this could be anything up here
##
(?:              # Start of non-capture group         START sign
     \d+\.           # 1 or more digits followed by '.'
   |              # or
     \(\d+\)         # '(' followed by 1 or more digits followed by ')'
                     # note that \( could be the start of capture group 1 in bizarro world
)                # End group
\s?              # 0 or 1 whitespace (includes \n)
[^\n<]+          # 1 or more of not \n AND not '<'    STOP signs
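The annotated layout above can be run more or less as written in any flavor with a free-spacing mode. Here is a minimal sketch in Python, assuming its `re` module is close enough to the Perl-compatible flavor for this pattern; the `sample` string is made up for illustration and is not taken from the linked page.

```python
import re

# The annotated pattern, compiled with re.VERBOSE so the layout and the
# comments are ignored by the regex engine.
HEADER = re.compile(r"""
    (?:              # start of non-capture group
         \d+\.       # 1 or more digits followed by '.'
       |             # or
         \(\d+\)     # '(' followed by 1 or more digits followed by ')'
    )                # end group
    \s?              # 0 or 1 whitespace (includes newline)
    [^\n<]+          # 1 or more characters that are neither newline nor '<'
""", re.VERBOSE)

# Hypothetical sample text (an assumption, not the real data).
sample = "1. Humility<br>\n(2) Patience\nplain line\n"

headers = HEADER.findall(sample)
print(headers)  # ['1. Humility', '(2) Patience']
```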
It seems you want all characters after the group, up to but not including the
very next \n or the very next '<'. In that case you should get rid of the \s?,
because \s includes newline. If it matches a newline here, the match spills onto the
next line and continues until [^\n<]+ is satisfied.
(?:\d+\.|\(\d+\))[^\n<]+
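To see why the `\s?` matters, here is a small comparison of the two patterns, sketched in Python's `re` module (an assumption; any Perl-compatible engine should behave the same way here). The input string is hypothetical: a bare "1." at the end of a line followed by unrelated text.

```python
import re

# Pattern with the optional whitespace, and the trimmed pattern without it.
WITH_S = re.compile(r"(?:\d+\.|\(\d+\))\s?[^\n<]+")
WITHOUT_S = re.compile(r"(?:\d+\.|\(\d+\))[^\n<]+")

# Hypothetical input: "1." ends the first line, so it is not a header.
text = "See item 1.\nNext paragraph"

with_s = WITH_S.findall(text)
without_s = WITHOUT_S.findall(text)

print(with_s)     # ['1.\nNext paragraph']  (\s? swallowed the newline)
print(without_s)  # []                      (nothing follows "1." on its line)
```

The trade-off is that the trimmed pattern requires at least one character other than a newline or '<' directly after the number, all on the same line.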
Edit - After viewing your sample, it appears that you are searching unrendered HTML
pasted into HTML content. In that case the header appears to be:
'1. Self-Knowledge&lt;br&gt;', which, when the entities are converted, would be
1. Self-Knowledge<br>
You can add the entity to the mix so that all your bases are covered (i.e. entity, \n, <):
((?:\d+\.|\(\d+\)))[^\S\n]+((?:(?!&lt;|[\n<]).)+)
Where:
Capture group1 = '1.'
Capture group2 = 'Self-Knowledge'
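A quick way to check the two capture groups is sketched below in Python's `re` module (an assumption about the flavor). It treats the escaped entity `&lt;` as one of the stop signs alongside `\n` and `<`; the `raw` string is a made-up sample, not the actual page source.

```python
import re

# Group 1: the number marker. Group 2: the header text, which stops at the
# entity "&lt;", a newline, or a literal '<'. [^\S\n]+ means whitespace
# other than a newline.
HEADER = re.compile(r"((?:\d+\.|\(\d+\)))[^\S\n]+((?:(?!&lt;|[\n<]).)+)")

# Hypothetical unrendered HTML: one header with entities, one with a real tag.
raw = "1. Self-Knowledge&lt;br&gt;\n(2) Patience<br>\n"

pairs = HEADER.findall(raw)
print(pairs)  # [('1.', 'Self-Knowledge'), ('(2)', 'Patience')]
```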
Other than that, I don't know what it could be.