I am working with HTML and want to match every tag that contains a given string. For example, I want to match all hyperlinks that contain the string "click here", with one separate match per complete <a>...</a> element.
Example source (each of these should be a separate match):
<a href="/somepage">click here</a>
<a href="/somepage">please <b>click here</b> now</a>
<a href="/somepage"><img src="/someimage" alt="click here"/></a>
So I need to start with the opening tag (e.g. <a\s+[^>]+>) and then match "click here", but only if it appears before the nearest closing </a> tag. For example, the following are not suitable:
<a\s+[^>]+>.*?click here.*?</a> starts at any link and then runs through all the HTML that follows until the first "click here", so a single match can span several links.
<a\s+[^>]+>[^<]*click here.*?</a> only matches if no other tag appears inside the <a> before "click here".
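To make the two failure modes concrete, here is a minimal sketch that reproduces them. It assumes Python's re module with re.DOTALL so that "." also crosses newlines (the engine is not specified above), and it adds one hypothetical extra link, <a href="/other">home</a>, in front of the three examples so that the over-matching becomes visible.

```python
import re

# Hypothetical test input: one unrelated link followed by the three example links.
html = (
    '<a href="/other">home</a>\n'
    '<a href="/somepage">click here</a>\n'
    '<a href="/somepage">please <b>click here</b> now</a>\n'
    '<a href="/somepage"><img src="/someimage" alt="click here"/></a>\n'
)

# Attempt 1: the lazy ".*?" is free to run past </a> into the next link.
loose = re.findall(r'<a\s+[^>]+>.*?click here.*?</a>', html, flags=re.DOTALL)

# Attempt 2: "[^<]*" stops at the first "<", so any inner tag
# before "click here" prevents a match.
strict = re.findall(r'<a\s+[^>]+>[^<]*click here.*?</a>', html, flags=re.DOTALL)

for m in loose:
    print('loose :', repr(m))
for m in strict:
    print('strict:', repr(m))
```

With this input the first pattern swallows the unrelated "home" link together with the first "click here" link, and the second pattern finds only the link whose text has no inner tags.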
My only idea so far:
<a\s+[^>]+>(?:.*?(?=</a>)) matches everything inside a single <a> element, but I don't know how to then "back-check" for the text within the (?:) group. Is that possible?
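One possible way to express that "back-check", sketched here under assumptions rather than as a definitive answer, is to move the lookahead inside the repetition: a negative lookahead (?!</a>) in front of each consumed character stops the scan from ever crossing a closing </a>. The sketch assumes Python's re module with re.DOTALL, links that are not nested, and the same hypothetical extra "home" link as a negative case. Regular expressions remain fragile on real-world HTML, so this is only meant for simple, well-formed input like the examples.

```python
import re

# Hypothetical test input: one unrelated link followed by the three example links.
html = (
    '<a href="/other">home</a>\n'
    '<a href="/somepage">click here</a>\n'
    '<a href="/somepage">please <b>click here</b> now</a>\n'
    '<a href="/somepage"><img src="/someimage" alt="click here"/></a>\n'
)

# (?:(?!</a>).)*? consumes one character at a time, but only while the
# text at the current position is not the start of "</a>".
pattern = re.compile(
    r'<a\s+[^>]+>'        # opening tag, as in the original attempts
    r'(?:(?!</a>).)*?'    # anything that stays inside this link
    r'click here'         # the required text
    r'(?:(?!</a>).)*?'    # rest of the link content
    r'</a>',              # the nearest closing tag
    flags=re.DOTALL,
)

matches = pattern.findall(html)
for m in matches:
    print(repr(m))
```

On this input the "home" link is skipped and each of the three example links comes back as its own match, including the one where "click here" only appears in the alt attribute.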