How do I write a regular expression in PHP that matches the HTML <p> elements that come AFTER the first <h1> tag?
For example, the following tests whether the pattern fails to match:
if (!preg_match_all('#<p(.*?)<\/p>#', $page_content, $matches))
In properly written HTML (i.e., HTML that isn't designed to break all sorts of parsers by abusing loopholes in the SGML specification), every <h1> has a corresponding closing tag. That means you can simply look for a <p> preceded by a </h1>.
<\/h1>[\s\S]*?<p>([\s\S]*?)<\/p>
Here's how the above regex works, and a proof of concept:
<\/h1> matches </h1> literally
[\s\S]*? matches all characters until the next <p>
<p> matches <p> literally
([\s\S]*?) matches all characters until the next </p> (note the capturing group - this group contains what you want)
<\/p> matches </p> literally
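To see the pattern in use from PHP, here is a minimal sketch. The sample HTML, the variable names $first_paragraph and $paragraphs, and the i (case-insensitive) flag are assumptions added for illustration; only $page_content comes from the question. Note that the pattern finds just the first <p> after a </h1>, and that a bare <p> will not match a tag with attributes such as <p class="x">. The second half shows one possible way to collect every <p> after the first </h1>; that part is not from the regex above, just a plain string split followed by preg_match_all.

```php
<?php
// Hypothetical sample input; $page_content is the variable name used in the question.
$page_content = <<<HTML
<p>Intro before the heading</p>
<h1>Title</h1>
<p>First paragraph</p>
<p>Second paragraph</p>
HTML;

// The pattern from the answer, with '#' as the delimiter so '/' needs no escaping.
// The 'i' flag is an addition so that </H1> and <P> are matched as well.
$pattern = '#</h1>[\s\S]*?<p>([\s\S]*?)</p>#i';

// First <p> that follows a </h1>.
$first_paragraph = null;
if (preg_match($pattern, $page_content, $match)) {
    // Capture group 1 holds the text between <p> and </p>.
    $first_paragraph = $match[1];
}

// Every <p> after the first </h1>: cut the document at the first </h1>
// and run preg_match_all on the remainder only.
$paragraphs = [];
$pos = stripos($page_content, '</h1>');
if ($pos !== false) {
    $rest = substr($page_content, $pos + strlen('</h1>'));
    preg_match_all('#<p>([\s\S]*?)</p>#i', $rest, $matches);
    $paragraphs = $matches[1];
}
```

With the sample input, the paragraph before the heading is skipped in both cases. For anything beyond well-formed, predictable markup, an HTML parser such as DOMDocument is likely to be more robust than a regex.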