I have some website source stream I am trying to parse. My current Regex is this:
Regex pattern = new Regex (
@"<a\b             # Begin start tag
    [^>]+?             # Lazily consume up to id attribute
    id\s*=\s*['""]?thread_title_([^>\s'""]+)['""]?  # $1: id
    [^>]+?             # Lazily consume up to href attribute
    href\s*=\s*['""]?([^>\s'""]+)['""]?             # $2: href
    [^>]*              # Consume up to end of open tag
    >                  # End start tag
    (.*?)                                           # $3: name
    </a\s*>            # Closing tag",
RegexOptions.Singleline | RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace );
But it doesn't match the links anymore. I included a sample string here.
Basically I am trying to match these:
<a href="http://visitingspain.com/forum/f89/how-to-get-a-travel-visa-3046631/" id="thread_title_3046631">How to Get a Travel Visa</a>
"http://visitingspain.com/forum/f89/how-to-get-a-travel-visa-3046631/" is the **Link**
304663` is the **TopicId**
"How to Get a Travel Visa" is the **Title**
In the sample I posted, there are at least 3, I didn't count the other ones.
Also I use RegexHero (online and free) to see my matching interactively before adding it to code.
 
     
     
     
    