Questions tagged [screen-scraping]

Screen scraping, also known as web scraping or data scraping, is a software technique used to collect and parse information from user interfaces. Questions about using programming languages to do screen scraping are off-topic here and should be asked on Stack Overflow.

Because information on webpages is usually organized in structured HTML, basic screen scraping can be a simple task. In most cases, the goal is not only to parse the data on the webpage but also to collect it, either by reproducing it on a different webpage or by storing it in a file or database.
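As a rough illustration of the parse-then-store idea described above, here is a minimal sketch using only Python's standard library. The sample markup, the `company` class name, and the helper names `CompanyParser`, `scrape_companies`, and `to_csv` are all assumptions made up for this example. A real scraper would first download the page and would need to match that site's actual markup.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this markup first.
SAMPLE_HTML = """
<html><body>
  <ul>
    <li class="company">Acme Corp</li>
    <li class="company">Globex</li>
    <li class="other">Not a company</li>
  </ul>
</body></html>
"""


class CompanyParser(HTMLParser):
    """Collect the text of every <li class="company"> element."""

    def __init__(self):
        super().__init__()
        self._inside = False   # True while we are inside a matching <li>
        self.companies = []    # one string per matching element

    def handle_starttag(self, tag, attrs):
        if tag == "li" and dict(attrs).get("class") == "company":
            self._inside = True
            self.companies.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self._inside = False

    def handle_data(self, data):
        # Append text only while inside a matching element.
        if self._inside:
            self.companies[-1] += data


def scrape_companies(html):
    """Parse step: return the stripped text of each matching element."""
    parser = CompanyParser()
    parser.feed(html)
    parser.close()
    return [name.strip() for name in parser.companies]


def to_csv(names):
    """Store step: serialize the names as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["company"])
    for name in names:
        writer.writerow([name])
    return buffer.getvalue()


names = scrape_companies(SAMPLE_HTML)
csv_text = to_csv(names)
print(csv_text)
```

Writing `csv_text` to a file, or inserting `names` into a database, would complete the "collect" half of the task. Pages whose markup is malformed or generated by scripts generally need a more tolerant parser or a real browser.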

41 questions
32
votes
7 answers

How do I copy text from a dialog box?

In Windows, I sometimes get an error dialog with long text. Manually typing out the exact details of an error message can be an annoying and lengthy process. Is there a way to copy text from a Windows dialog box?
Tony_Henrich
  • 12,156
12
votes
4 answers

How "legal" is site-scraping using cURL?

Recently I was experimenting with cURL, and I found that a lot is possible with it. I built a small script that crawls a music site that plays songs online. In the course of my experiment, I found that it is possible to crawl the song source also…
9
votes
2 answers

Get Current HTML Of Page Built With AJAX Requests

So, I'm using the Chrome/Chromium browser (but could use Firefox, if need be). I'm viewing webpages which are constructed "on the fly" with (presumably) AJAX (think of how you scroll down on Facebook and things just keep appearing and…
Richard
  • 3,501
4
votes
4 answers

Save report from Windows checking removable disk?

Is there any way to save a report of the errors that Windows found and fixed on a USB key? Windows presented me with this dialog, which, by the way, is not resizable: "Some problems were found and fixed." When I open the details there is a long list of files with…
JohnC
  • 669
4
votes
1 answer

Extract data from an online atlas

There is an online atlas that I would like to extract values from. The atlas provides a tool ('Query') to extract values when you click a location or enclose a region on the map, or you can specify the latitude/longitude of a point where you want…
KAE
  • 1,919
4
votes
2 answers

How to automatically copy text from different websites

I want to know how to automatically copy text from different websites. I am building a database of companies which belong to certain associations. The website has a list of companies with the description of each of them which I am manually copying.…
BDstat
  • 41
3
votes
1 answer

What is the name for Google's information summary box (pic inside)? Is there an API to access it from a Google search?

Example of what I'd like to access: Just wondering if I could write something that'll query Google with a search string and, if the summary box is returned, fetch only that, otherwise fetch the first few links. I think this is called "screen…
Daz C
  • 201
3
votes
0 answers

wget put all prerequisites in flat subdirectory, but not root page?

I'm trying to get wget to save a page + prerequisites in a format resembling that of a web browser: article.html article_files/img.jpg article_files/script.js I am able to get almost this behavior, but article.html is inside article_files. Is this…
3
votes
4 answers

I am seeing animated PNG files on some sites in lieu of GIFs. How can I save/download them?

I know how to save a GIF; it's super easy. Just right-click and save. Voila! But with these new animated PNGs, I haven't the foggiest. Without using some sort of screen scrape where I'm grabbing the whole desktop, I am not even sure it's possible.…
2
votes
2 answers

Web scraping / crawling a particular Google book

For my work, I need to scrape the text from a large book on Google Books. The book in question is a very old book and is out of copyright. The book is a Gazetteer of the World. We will be putting the text into a database, so we need the raw text…
2
votes
0 answers

How to paste HTML headings into Excel

In a previous answer (vba - html table to excel worksheet) about parsing/pasting HTML table contents into an Excel sheet, wbeard2 shared this very helpful, illustrative piece of code. He/she notes that it implants the table data into Excel but not…
1
vote
0 answers

Recommendation on Web scraping and data flow

I have a solar panel unit, and the company that set it up (Fronius) has a website where I can live stream the data collected from the solar panel (Current power, Energy today, monthly, and yearly data). I'd like to display the data from the…
1
vote
3 answers

Save parts of a website as pure text

I hope I may ask this here. I need to extract the contents of an existing website (on behalf of the website owner) to Word (or text) documents. For this, I only need the content from one DIV with a given ID. Is there any tool for Windows that can do…
Martin
  • 4,012
1
vote
1 answer

How do I use AutoHotKey to read the text at the mouse location?

I can read the entire window text using WinGetText() but I am trying to get the text at the current mouse location. I've found several examples on the AutoHotKey forums but they are all very old (from 2007-2009) and the samples no longer work and…
1
vote
1 answer

Is it legal to screen scrape your own bank statements in the US?

I want to automatically download my bank statements. My bank charges monthly for OFX access, so I have considered gathering the data points by other means (scripting, screen scraping), but I want to know if it's legal in the US. Does anyone have any…
Matt
  • 121