Bulk Download Images with Organization

I have a list of URLs (almost all image URLs, though some are PDFs) that I need to download. I have found a variety of options for bulk downloading, and these would work, but I need the files organized by the directory they are listed under in the URL. For example:

samplesite.com/sample1/image1.jpg
samplesite.com/sample1/image2.jpg
samplesite.com/sample1/image3.jpg
samplesite.com/sample2/image1.jpg
samplesite.com/sample2/image2.jpg
samplesite.com/sample2/image3.jpg

I would need them to be organized like this:

Folder Sample1
    image1.jpg
    image2.jpg
    image3.jpg
Folder Sample2
    image1.jpg
    image2.jpg
    image3.jpg

I do have SFTP access, but each directory is terribly organized, with image files mixed in with other irrelevant files. Additionally, most of the batch scripts I have tried to write have had issues: with xcopy there was no way to figure out which files failed, and with robocopy speed was compromised. Any suggestions on how I should move forward? Existing software is preferred, but I am fine with advice on how to script this. I would prefer not to install anything to access SFTP via the command line, but if that's the only option, it is what it is.

Nick

Answers

I think wget can do that with a couple of options. Please try wget --input-file=urls.txt --force-directories

From the wget manual:

--input-file=file
Read URLs from a local or external file. If ‘-’ is specified as file, URLs are read from the standard input. (Use ‘./-’ to read from a file literally named ‘-’.)

and

--force-directories
The opposite of ‘-nd’—create a hierarchy of directories, even if one would not have been created otherwise. E.g. ‘wget -x http://fly.srk.fer.hr/robots.txt’ will save the downloaded file to fly.srk.fer.hr/robots.txt.
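Putting the two together: assuming your list is saved one URL per line in a file called urls.txt (the filename is just a placeholder), something like the line below should recreate the sample1/sample2 layout. The extra -nH (--no-host-directories) flag is my own addition: without it, wget nests everything under a samplesite.com directory first.

# Download every URL in urls.txt, recreating each URL's directory
# path locally but skipping the samplesite.com host directory.
wget --input-file=urls.txt --force-directories --no-host-directories

That should leave you with sample1/image1.jpg, sample2/image1.jpg, and so on; wget also reports any URL it fails to fetch, so failed downloads are easy to spot in the output.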

agtoever

Worked like a charm for URLs. Do you happen to know if it supports SFTP? – Nick

I think in that case you need to switch to curl instead of wget. – agtoever
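For SFTP specifically, a minimal sketch of the curl route could look like this. Assumptions: your curl build has SFTP support (it needs libssh2), urls.txt holds sftp:// URLs one per line, and samplesite.com plus the user:password credentials are placeholders for your own. curl's --create-dirs option builds the local directory tree for each output file:

# Fetch each sftp:// URL into a local path that mirrors the remote
# directory layout, minus the host prefix; log failures to failed.txt.
while IFS= read -r url; do
    path="${url#sftp://samplesite.com/}"   # e.g. sample1/image1.jpg
    curl --user user:password --create-dirs -o "$path" "$url" \
        || echo "$url" >> failed.txt
done < urls.txt

The failed.txt log also addresses the "no way to figure out which files failed" complaint from the question: any URL curl cannot retrieve ends up listed there.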