This could be a duplicate question, but I have no idea what search terms to look up, so don't be hard on me if it has been asked before (and I'm pretty sure it was).
So I am getting a web page's source code using the WebClient class and saving the entire string in the source variable:
var client = new WebClient();
client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
string source;
// using blocks dispose the stream and reader even if reading throws
using (var data = client.OpenRead(urlAddress))
using (var reader = new StreamReader(data))
{
    source = reader.ReadToEnd();
}
Now I want to process certain text ranges from the source variable, especially user-posted messages. The problem is that in the web page's source, "&" is actually &amp;amp;, "'" is &amp;#8217;, and quotes (") are either &amp;#8211;, &amp;#8220;, &amp;#8221; and who knows what else.
Well, I could replace those codes with the actual symbols using the Replace string method, but I would like to know if there is a way to convert all those codes to the actual (expected) symbols. Is there a method that can do that, or maybe a library or some utility class on the Internet?
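For illustration, here is a minimal sketch of the kind of one-call conversion I mean, assuming System.Net.WebUtility.HtmlDecode (in the framework since .NET 4.0) covers these codes. The DecodeEntities helper name and the sample string are made up for the example, not taken from the real page:

```csharp
using System;
using System.Net;

static class EntitySketch
{
    // Hypothetical helper: wraps the framework call so it can be applied
    // to any text range cut out of the downloaded source string.
    public static string DecodeEntities(string html)
    {
        // WebUtility.HtmlDecode handles named entities (&amp;) and
        // numeric ones (&#8217;) in a single pass.
        return WebUtility.HtmlDecode(html);
    }

    static void Main()
    {
        // Made-up sample standing in for a user-posted message.
        string encoded = "Tom &amp; Jerry&#8217;s &#8220;show&#8221;";
        string decoded = DecodeEntities(encoded);

        // Compare against the expected characters by code point so the
        // check does not depend on the console's encoding.
        Console.WriteLine(decoded == "Tom & Jerry\u2019s \u201Cshow\u201D");
    }
}
```

Compared with chaining Replace calls, this would not need a list of every code that might turn up. It only decodes entities, though; it does not strip tags.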