I'm trying to scrape data from a website for research.
The URLs are neatly organized in an example.com/x format, where x is an ascending number, and all of the pages are structured the same way. I just need to grab certain headings and a few numbers, which are always in the same locations. I'll then need to get this data into a structured form for analysis in Excel.
I have used wget before to download pages, but I can't figure out how to grab specific lines of text.
Excel has a feature to grab data from the web (Data->From Web) but from what I can see it only allows me to download tables. Unfortunately, the data I need is not in tables.
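For illustration, here is a minimal sketch of the kind of extraction described above, using only the Python standard library. It assumes the headings are `<h1>`/`<h2>` elements and that each number sits in a `<span class="value">` element; that class name, the sample HTML, and the helper names (`FieldParser`, `extract_row`, `write_csv`) are hypothetical and would need to be adapted to the real pages.

```python
import csv
import sys
from html.parser import HTMLParser


class FieldParser(HTMLParser):
    """Collects heading text and the text of <span class="value"> elements.

    The tag names and the "value" class are assumptions for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.headings = []
        self.values = []
        self._target = None  # list that receives the current element's text
        self._tag = None     # tag name that closes the current element
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if self._target is not None:
            return  # already inside an element of interest
        classes = (dict(attrs).get("class") or "").split()
        if tag in ("h1", "h2"):
            self._target, self._tag, self._parts = self.headings, tag, []
        elif tag == "span" and "value" in classes:
            self._target, self._tag, self._parts = self.values, tag, []

    def handle_data(self, data):
        if self._target is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._target is not None and tag == self._tag:
            self._target.append("".join(self._parts).strip())
            self._target = None


def extract_row(page_id, html):
    """Return one spreadsheet row: page id, then headings, then values."""
    parser = FieldParser()
    parser.feed(html)
    parser.close()
    return [page_id] + parser.headings + parser.values


def write_csv(rows, out):
    """Write rows as CSV, a format Excel can open directly."""
    csv.writer(out).writerows(rows)


if __name__ == "__main__":
    # Hypothetical page content standing in for one downloaded file.
    sample = (
        "<html><body><h1>Site A</h1>"
        '<p>Count: <span class="value">42</span></p>'
        "<h2>Notes</h2></body></html>"
    )
    write_csv([extract_row(1, sample)], sys.stdout)
```

With pages already downloaded by wget, the same `extract_row` function could be called once per saved file, with all rows written to a single CSV file.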
In the end, I created a .txt file with a list of all the URLs I needed and had wget download all the pages in that file. I then used iMacros, as suggested by @Lamb, to extract the data locally. I found the trial of the full version easiest to start out with. Some useful beginner features are not available in the Firefox plugin GUI, even though the same code will work there.
– Stoney – 2012-11-15T11:02:07.167
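The URL list mentioned in the comment above could be generated rather than typed by hand. This is a rough sketch only: the `http://` scheme, the page range, the `urls.txt` file name, and the helper names are all assumptions, not details from the original post.

```python
def build_url_list(base, first, last):
    """Return base/first ... base/last as a list of URL strings."""
    return [f"{base}/{n}" for n in range(first, last + 1)]


def write_url_list(urls, path):
    """Write one URL per line, the layout wget expects for an input file."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(urls) + "\n")


if __name__ == "__main__":
    # Hypothetical range; replace with the real first and last page numbers.
    write_url_list(build_url_list("http://example.com", 1, 100), "urls.txt")
```

The resulting file can then be passed to wget with `wget -i urls.txt`.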