Extracting information from a web page for a given interval of values

1

I have a problem. I need to extract something like 800 records and it would take days to do that manually.

The records can be retrieved by filling in a few form fields (always with the same values) plus one field whose value runs over a given interval. The page doesn't pass the parameters in plain text, so I cannot simply use a basic tool that increments a parameter value.

Is there any good tool that could help me with this?

Jarek

Posted 2011-03-21T15:31:30.667

Reputation: 1 361

An example would be very helpful. Where do you enter the variable? – LaLeX – 2011-03-25T12:16:20.697

From the sound of it, I would use curl in a bash script. What do you mean by the data not being passed in plain text? Is it sent with a POST request? Also, what is your OS? – 0x90 – 2011-03-25T12:16:22.180

Link: http://nahlizenidokn.cuzk.cz/VyberParcelu.aspx (sorry for the language). You fill in, for example, Brno in the first field and Zidenice in the second; the third field is the one that varies over the interval, so for example I want to get the data for 6200 to 6500. – Jarek – 2011-03-25T12:26:23.457

Your form doesn't work with these values. Better show it as a picture. – harrymc – 2011-03-26T10:40:47.270

Can you post the HTML that the site generates, using your browser's view-source function? Otherwise, contact the site administrator and ask whether they can export the data for you or give you read access to the database. – nelaaro – 2011-03-31T07:00:29.053

Answers

3

If you are not afraid of programming, there is the brilliant Selenium framework. In short, it lets you automate a real browser and script its behaviour on a page. http://code.google.com/p/selenium/?redir=1
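
To make the idea concrete, here is a minimal Python sketch of such a loop. It only relies on the driver methods get, find_element, and page_source, and on the element methods clear, send_keys, and click, which Selenium's Python bindings provide. The function name scrape_interval and every element id passed to it are hypothetical placeholders; the real ids have to be read from the page's HTML source.

```python
import time


def scrape_interval(driver, url, fixed_fields, variable_field_id, submit_id,
                    start, stop, delay=1.0):
    """Submit the form once per value in [start, stop]; return {value: page HTML}.

    `fixed_fields` maps element ids (hypothetical, taken from the page source)
    to the text that is the same on every submission.
    """
    pages = {}
    for number in range(start, stop + 1):
        driver.get(url)  # reload the form so each submission starts clean
        for element_id, text in fixed_fields.items():
            # "id" is the locator strategy string behind selenium's By.ID
            field = driver.find_element("id", element_id)
            field.clear()
            field.send_keys(text)
        variable = driver.find_element("id", variable_field_id)
        variable.clear()
        variable.send_keys(str(number))
        driver.find_element("id", submit_id).click()
        # Assumption: the result page has loaded once click() returns;
        # a real script may need an explicit wait here.
        pages[number] = driver.page_source
        time.sleep(delay)  # be polite to the server
    return pages
```

With Selenium installed, you would create a driver with webdriver.Firefox() (after from selenium import webdriver), pass it in together with the form URL, the real element ids, and the interval 6200 to 6500, and call driver.quit() when done. Extracting the wanted values from each saved page is a separate step.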

Darek

Posted 2011-03-21T15:31:30.667

Reputation: 886

1

Good day,

Hey, it seems they have some SOAP services, so you could probably use a SOAP client library from any programming language to access the data. I found this WSDL: https://katastr.cuzk.cz/static/wsdl/sestavy.wsdl, but they might have others...
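
A first step before calling any such service is to see which operations the WSDL actually declares. The sketch below does that with only the Python standard library. It assumes the file is a WSDL 1.1 document (namespace http://schemas.xmlsoap.org/wsdl/), which I have not verified for this particular URL, and the function names are my own.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Namespace used by WSDL 1.1 documents (assumed to apply to the file above).
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}


def fetch_wsdl(url):
    """Download the WSDL document and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


def list_operations(wsdl_xml):
    """Return {portType name: [operation names]} declared in a WSDL 1.1 document."""
    root = ET.fromstring(wsdl_xml)
    operations = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        names = [op.get("name")
                 for op in port_type.findall("wsdl:operation", WSDL_NS)]
        operations[port_type.get("name")] = names
    return operations
```

Calling list_operations(fetch_wsdl(url)) with the WSDL URL should print a small dictionary of operation names, which tells you whether the service offers the lookup you need before you invest in a full SOAP client.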

celebdor

Posted 2011-03-21T15:31:30.667

Reputation: 826

0

Look into writing a script using Perl's WWW::Mechanize or Ruby's Mechanize gem. Maybe even AutoHotkey, a program for creating macros, would work for you.
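
For illustration, here is roughly what a Mechanize-style script does, sketched with only the Python standard library rather than Mechanize itself: fetch the form, copy its hidden inputs, add the visible values, and POST the result. Pages ending in .aspx commonly carry hidden fields such as __VIEWSTATE that must be sent back, which may be why the parameters are not visible in the URL; that is an assumption about this site. All field names are hypothetical and must be taken from the page source.

```python
import time
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def hidden_fields(html):
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


def build_post_body(html, visible_fields):
    """Merge the form's hidden inputs with the values a user would type."""
    data = hidden_fields(html)
    data.update(visible_fields)
    return urlencode(data).encode("utf-8")


def fetch_interval(url, fixed_fields, variable_name, start, stop, delay=1.0):
    """POST the form once per value in [start, stop]; return {value: page HTML}."""
    opener = build_opener(HTTPCookieProcessor())  # keeps session cookies
    results = {}
    for number in range(start, stop + 1):
        with opener.open(url) as response:  # fresh form, fresh hidden fields
            form_html = response.read().decode("utf-8", errors="replace")
        fields = dict(fixed_fields)
        fields[variable_name] = str(number)
        request = Request(url, data=build_post_body(form_html, fields))
        with opener.open(request) as response:
            results[number] = response.read().decode("utf-8", errors="replace")
        time.sleep(delay)  # be polite to the server
    return results
```

If the form relies on JavaScript to build its request, this plain-HTTP approach will not be enough and a browser-driving tool is the safer choice.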

pbsmind

Posted 2011-03-21T15:31:30.667

Reputation: 131