I am fetching a webpage like this:
import requests
# requests.get() returns a Response object; .text holds the page source as a string
html = requests.get("http://www.google.com/").text
This returns a lot of markup in the html variable. I want only the text that a web browser would display, without tags such as head, link, meta and script, and without the contents of those tags. I tried the HTMLParser module, but it only strips the tags themselves and leaves their contents (for example, script code) in the output. Any idea how I should achieve this?
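
For reference, the filtering described above could be sketched roughly as follows. It uses only the Python 3 standard library (`html.parser`), and the names `SKIPPED_TAGS`, `VisibleTextParser` and `visible_text` are hypothetical, made up for this illustration. The set of skipped tags is an assumption about what counts as non-displayed content, and the sketch assumes those tags are explicitly closed in the markup, so it is not a definitive solution.

```python
from html.parser import HTMLParser

# Assumption: tags whose contents a browser does not show as page text.
SKIPPED_TAGS = {"head", "script", "style", "noscript", "template"}


class VisibleTextParser(HTMLParser):
    """Collect text nodes, ignoring anything nested inside SKIPPED_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []      # visible text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text that is outside every skipped element.
        if text and self._skip_depth == 0:
            self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def visible_text(html):
    """Return the text of `html` with skipped elements and all tags removed."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.text()
```

With the page source in a string, it would be called as `visible_text(html)`. Pages that omit closing tags such as `</head>`, or that build their content with JavaScript, would need a more robust approach than this.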
