$string = file_get_contents('http://example.com');
if ('UTF-8' === mb_detect_encoding($string)) {
    $dom = new DOMDocument();
    // hack to preserve UTF-8 characters
    $dom->loadHTML('<?xml encoding="UTF-8">' . $string);
    $dom->preserveWhiteSpace = false;
    $dom->encoding = 'UTF-8';
    $body = $dom->getElementsByTagName('body');
    echo htmlspecialchars($body->item(0)->nodeValue);
}
This turns all non-ASCII UTF-8 characters into Å, ¾, ¤ and other rubbish. Is there any other way to preserve UTF-8 characters?
Please don't post answers telling me to make sure I am outputting it as UTF-8; I have made sure I am.
Thanks in advance :)