I am working on a project in which I need to crawl several websites and gather different kinds of information from them, such as text, links, and images.
I am using Python for this. I have tried BeautifulSoup on plain HTML pages and it works, but I am stuck when parsing sites that contain a lot of JavaScript, as most of the information on these pages is stored in <script> tags.
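To make the problem concrete, here is a minimal sketch of the situation. The HTML snippet and the names in it (`products`, `render`, the `app` container) are made up for illustration and are not taken from any real site. It assumes the `bs4` package is installed and only shows where BeautifulSoup stops being useful, not a solution.

```python
from bs4 import BeautifulSoup

# Hypothetical page: the interesting data lives inside a <script> tag,
# not in regular HTML elements.
HTML = """
<html>
  <body>
    <h1>Products</h1>
    <div id="app"></div>
    <script>
      var products = [{"name": "Widget", "price": 9.99}];
      render(products);
    </script>
  </body>
</html>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Ordinary elements are easy to reach.
heading = soup.find("h1").get_text(strip=True)

# The container that the JavaScript would fill is empty in the raw HTML.
app_text = soup.find("div", id="app").get_text(strip=True)

# The <script> tag is found, but its content comes back as one string of
# JavaScript source; BeautifulSoup neither executes nor parses it.
script_source = soup.find("script").string

print(heading)
print(repr(app_text))
print(script_source.strip())
```

The heading is extracted as expected, the `app` container is empty, and the data I actually want is buried in the raw JavaScript text of the script.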
Any ideas how to do this?
