I have a plugin tag [crayon ...] that may or may not be rendered in a <p></p> block like so:
<p>This is a <b>sentence</b> [crayon ...] The Crayon [/crayon] of words. </p>
Since my tag is replaced by a <div> tag, which cannot sit inside a <p>, the opening <p> is cut off from its </p> and the browser closes it for me, leaving a blank paragraph above my plugin. In any case, the markup is invalid and has weird outcomes. My problem is that I need to detect whether [crayon lies inside a <p></p> block. I have found two ways so far:
- Use
<p(?:\s+[^>]*)?>(.*?)</p(?:\s+[^>]*)?>
and search for [crayon in the capture.
- Use
<p[^>]*>(?:[^<]*<(?!/?p(\s+[^>]*)?>)[^>]+(\s+[^>]*)?>)*[^<]*\[crayon
for the case of <p>...[crayon where ... doesn't contain a </p> or <p>, and a similar method for a </p> after the [crayon] tag.
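To make the first method concrete, here is a minimal sketch in Python (chosen only for illustration; the plugin itself runs in WordPress). The function name `paragraphs_with_crayon`, the two sample strings, and the `re.DOTALL | re.IGNORECASE` flags are assumptions and not part of the original patterns.

```python
import re

# Method 1: capture each paragraph body non-greedily, then look for the tag.
# The flags are an assumption: DOTALL lets "." span newlines, IGNORECASE accepts <P>.
PARAGRAPH = re.compile(r'<p(?:\s+[^>]*)?>(.*?)</p(?:\s+[^>]*)?>',
                       re.DOTALL | re.IGNORECASE)

def paragraphs_with_crayon(html):
    """Return the inner text of every <p>...</p> block that contains '[crayon'."""
    return [m.group(1) for m in PARAGRAPH.finditer(html)
            if '[crayon' in m.group(1)]

# Hypothetical inputs: one tag inside a paragraph, one between two paragraphs.
inside = '<p>This is a <b>sentence</b> [crayon ...] The Crayon [/crayon] of words. </p>'
outside = '<p>Intro</p>[crayon ...] code [/crayon]<p>Outro</p>'

print(paragraphs_with_crayon(inside))
print(paragraphs_with_crayon(outside))
```

The second step (the substring check on the capture) is the "further processing" this method needs; the regex alone only finds paragraphs.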
The second method is harder to read, but it fails to match if a </p> appears before my tag, so unlike the first it doesn't require any further processing to find my tag within the <p></p>. However, the first regex is much simpler and will execute more quickly. Which should I use, and is there a better way?
EDIT:
For method 2, this beast works:
<p[^<]*>(?:[^<]*<(?!/?p(\s+[^>]*)?>)[^>]+(\s+[^>]*)?>)*[^<]*((?:\[crayon[^\]]*\].*?\[/crayon\])|(?:\[crayon[^\]]*/\]))(?:[^<]*<(?!/?p(\s+[^>]*)?>)[^>]+(\s+[^>]*)?>)*[^<]*</p[^<]*>
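As a quick sanity check, the pattern above can be exercised as follows. This is a sketch in Python for illustration only; the names `CRAYON_IN_P` and `crayon_inside_paragraph`, the sample strings, and the `re.DOTALL` flag are assumptions. The pattern contains two capturing groups before the tag alternation, so the tag itself lands in group 3.

```python
import re

# The method 2 pattern, copied verbatim from the edit above.
# re.DOTALL is an assumption so that ".*?" can span multi-line tag bodies.
CRAYON_IN_P = re.compile(
    r'<p[^<]*>(?:[^<]*<(?!/?p(\s+[^>]*)?>)[^>]+(\s+[^>]*)?>)*[^<]*'
    r'((?:\[crayon[^\]]*\].*?\[/crayon\])|(?:\[crayon[^\]]*/\]))'
    r'(?:[^<]*<(?!/?p(\s+[^>]*)?>)[^>]+(\s+[^>]*)?>)*[^<]*</p[^<]*>',
    re.DOTALL)

def crayon_inside_paragraph(html):
    """Return the [crayon ...] tag found inside a <p></p> block, or None."""
    match = CRAYON_IN_P.search(html)
    return match.group(3) if match else None

# Hypothetical inputs: one tag inside a paragraph, one between two paragraphs.
inside = '<p>This is a <b>sentence</b> [crayon ...] The Crayon [/crayon] of words. </p>'
outside = '<p>Intro</p>[crayon ...] code [/crayon]<p>Outro</p>'

print(crayon_inside_paragraph(inside))
print(crayon_inside_paragraph(outside))
```

Note that the nested quantifiers could backtrack heavily on long inputs with many tags, so it may be worth timing this against real post content before relying on it.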
, why are you using a
, you'll need a proper HTML parser.
with the wpautop() function.
– Aram Kocharyan Jan 21 '12 at 04:59
or ` rather than go the regex route.
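The comments above point toward a parser instead of a regex. A rough sketch of that idea, assuming Python's standard `html.parser` module purely for illustration (the class `CrayonInParagraph`, the helper `crayon_in_paragraph`, and the sample strings are hypothetical, and a WordPress plugin would need the equivalent in its own environment):

```python
from html.parser import HTMLParser

class CrayonInParagraph(HTMLParser):
    """Records whether '[crayon' appears in text while a <p> is open."""

    def __init__(self):
        super().__init__()
        self.open_paragraphs = 0   # number of <p> tags currently open
        self.found_inside = False  # set once '[crayon' is seen inside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == 'p':
            self.open_paragraphs += 1

    def handle_endtag(self, tag):
        if tag == 'p' and self.open_paragraphs > 0:
            self.open_paragraphs -= 1

    def handle_data(self, data):
        # Text nested in inline tags such as <b> still counts as inside the <p>.
        if self.open_paragraphs > 0 and '[crayon' in data:
            self.found_inside = True

def crayon_in_paragraph(html):
    parser = CrayonInParagraph()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.found_inside

# Hypothetical inputs: one tag inside a paragraph, one between two paragraphs.
inside = '<p>This is a <b>sentence</b> [crayon ...] The Crayon [/crayon] of words. </p>'
outside = '<p>Intro</p>[crayon ...] code [/crayon]<p>Outro</p>'

print(crayon_in_paragraph(inside))
print(crayon_in_paragraph(outside))
```

The parser tracks paragraph state explicitly, so attributes on <p>, nested inline tags, and mixed case are handled without extra pattern complexity.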