How can I automatically list and copy all images that a webpage uses?

2

I have used the Photoshop slicing tool to create some HTML pages, but it has generated dozens of images, many of which are not used by the final pages. To get rid of the unused files, I need a way to identify and copy all the images that my webpages actually use (through IMG tags and CSS styles).

I think that Teleport VLX, Firefox and Chrome can't do this. How can I do this?

Dims

Posted 2011-12-27T17:53:43.257

Reputation: 8 464

Answers

2

So, are you trying to go through different websites and download all of the images from that website? Basically a targeted web crawler?

http://www.webreaper.net/

WebReaper is a program I used back in the late 90s that would download assorted data from websites. You can target it to just download images.

kobaltz

Posted 2011-12-27T17:53:43.257

Reputation: 14 361

No, I try to extract valuable files from generated by myself – Dims – 2011-12-27T20:12:59.773

From generated who? I'm not sure if your translation is coming across well. – kobaltz – 2011-12-27T20:15:20.987

So it does not work either, same as Teleport. No images were copied. I use images in styles, not in <IMG> tags. The image paths are enclosed in quotes, so WebReaper requests the files with those quotes included and is unable to copy any image. – Dims – 2011-12-27T20:25:06.267

I created multiple HTML files. These are my own files, not from the web. – Dims – 2011-12-27T20:25:57.250

What is your end goal? If they are your files, can't you SSH/FTP/Remote into your server and grab the files? – kobaltz – 2011-12-27T20:28:50.683

Also, you may look into disabling certain slices in your photoshop if you're worried about unused images being created when exporting. – kobaltz – 2011-12-27T20:29:46.737

I generated multiple pages, but in the end I combined some of them into one. So I have hundreds of images in my "images" folder, while my web page uses only a dozen. I want to extract only those files that are actually used by the webpage. – Dims – 2011-12-27T20:32:11.870

Are you using a certain language? For example, in RoR, I would solve something like this by removing all of the images. I would then get a list of errors in my console and would be able to see which files are still being requested. – kobaltz – 2011-12-27T20:34:38.460

Thanks, I have already grepped for them. I was just wondering whether a utility exists, and I was also too lazy to write the regexps. – Dims – 2011-12-27T20:40:01.200
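
For anyone who would rather not write the regexps by hand, the grep approach from the comments can be sketched as a small Python script. This is only a rough sketch under some assumptions: the pages are local files, image references are relative paths, and the function names (`find_used_images`, `copy_used_images`) and the extension list are made up for illustration. It reads `<img src>` attributes, inline `style` attributes, `<style>` blocks and `.css` files, and it accepts `url(...)` values with or without quotes, which is the case that tripped up WebReaper above.

```python
import re
import shutil
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote

# Matches CSS url(...) values, with or without surrounding quotes.
CSS_URL = re.compile(r"""url\(\s*['"]?([^'")]+?)['"]?\s*\)""", re.IGNORECASE)
# Assumed list of image extensions; adjust to what the slicing tool exports.
IMAGE_EXTS = {".png", ".gif", ".jpg", ".jpeg"}
# References with a scheme (http:, data:, ...) or protocol-relative ones are not local files.
REMOTE = re.compile(r"^(?:[a-z][a-z0-9+.-]*:|//)", re.IGNORECASE)


class ImageRefParser(HTMLParser):
    """Collects references from <img src>, style attributes and <style> blocks."""

    def __init__(self):
        super().__init__()
        self.refs = set()
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            self.refs.add(attrs["src"])
        if attrs.get("style"):
            self.refs.update(CSS_URL.findall(attrs["style"]))
        if tag == "style":
            self._in_style = True

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        if self._in_style:
            self.refs.update(CSS_URL.findall(data))


def find_used_images(site_dir):
    """Return the set of existing image files referenced by HTML/CSS under site_dir."""
    site_dir = Path(site_dir).resolve()
    used = set()
    for path in site_dir.rglob("*"):
        suffix = path.suffix.lower()
        if suffix in {".html", ".htm"}:
            parser = ImageRefParser()
            parser.feed(path.read_text(encoding="utf-8", errors="replace"))
            refs = parser.refs
        elif suffix == ".css":
            refs = set(CSS_URL.findall(path.read_text(encoding="utf-8", errors="replace")))
        else:
            continue
        for ref in refs:
            if REMOTE.match(ref):
                continue
            # Drop any query string or fragment, then resolve relative to the referencing file.
            local = unquote(ref.split("?")[0].split("#")[0])
            target = (path.parent / local).resolve()
            if target.is_file() and target.suffix.lower() in IMAGE_EXTS:
                used.add(target)
    return used


def copy_used_images(site_dir, dest_dir):
    """Copy every used image into dest_dir, keeping its path relative to site_dir."""
    site_dir = Path(site_dir).resolve()
    dest_dir = Path(dest_dir)
    copied = []
    for image in sorted(find_used_images(site_dir)):
        try:
            rel = image.relative_to(site_dir)
        except ValueError:
            continue  # referenced file lives outside the site folder; skipped in this sketch
        out = dest_dir / rel
        out.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image, out)
        copied.append(rel)
    return copied
```

Calling `copy_used_images("site", "site_used")` (both folder names are placeholders) would leave only the referenced images in the second folder. Images referenced only from JavaScript, or through absolute paths starting with `/`, would not be found by this sketch.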

0

You can use the Firefox Page Info window:

The Page Info window gives you technical details about the page you're on. To open it: Click the Site Identity Button (the website’s icon to the left of its address) and click the More information... button in the prompt.

[...]

The Media panel displays the URL and type of all the backgrounds, images, and embedded content (including audio and video) that loads with the page.

user33758

Posted 2011-12-27T17:53:43.257

Reputation: