How to use wget with an input file and filenames

2

1

I have a text file containing 10,000 URLs, each paired with a unique code that I want to use as the name of the saved file. Each line has a 10-character code, followed by the URL of the image to retrieve. How can I make wget use the first 10 characters of each line as the output filename?

This is an example of the input file, input.txt:

x100083590http://image.allmusic.com/13/adg/cov200/drt200/t291/t29123q8m19.jpg
b200149548http://ecx.images-amazon.com/images/I/41DoH%2BAWKEL.jpg
z100151855http://image.allmusic.com/13/amg/cov200/dri400/i450/i45035hxdrb.jpg
p400171646http://ecx.images-amazon.com/images/I/61cH4n34IhL.jpg

wget -i input.txt would fetch the files, but not save them under the preceding unique code.

I want t29123q8m19.jpg (the first line) to be saved as x100083590.jpg.

If there is a better way to write out the input file, say with the URL first, I can do that too, but then I will never know the length of the first field. Right now the first 10 characters are always the name I want wget to save the image under.

Edit: This is being done in a Windows environment.

Matt

Posted 2012-11-20T22:16:18.187

Reputation: 23

Answers

3

Use the following batch file:

@echo off
setlocal enabledelayedexpansion
for /f %%l in (Input.txt) do (
    set line=%%l
    wget -O !line:~0,10!.jpg !line:~10!
)

Karan

Posted 2012-11-20T22:16:18.187

Reputation: 51 857

6

On Linux:

 while IFS= read -r p; do
   newname=${p:0:10}   # first 10 chars
   url=${p:10}         # remaining chars after the 10th
   wget "$url" -O "$newname.jpg"   # get url and output to new filename
 done < input.txt

Under Windows, we could do:

 SETLOCAL ENABLEDELAYEDEXPANSION
 REM Inside the parenthesized block, use !var! so values are expanded on each iteration
 for /f %%p in (input.txt) do (
    SET p1=%%p
    SET newname=!p1:~0,10!
    SET url=!p1:~10!
    wget !url! -O !newname!.jpg
 )

Paul

Posted 2012-11-20T22:16:18.187

Reputation: 52 173

Thank you for the prompt reply, Paul. How would one do it in Windows? – Matt – 2012-11-20T22:37:32.507

@Matt Added Windows, let me know if you find any bugs. – Paul – 2012-11-20T23:03:21.143

Here is the error I get upon running this in a batch file:

--2012-11-20 15:19:44-- http://~10/ Resolving ~10... failed: No data record of requested type. wget: unable to resolve host address `~10'

– Matt – 2012-11-20T23:24:20.423

@Matt OK, try this. Stupid Windows :) – Paul – 2012-11-20T23:56:16.273

Same error. Karan's solution above is almost the same, and is working. I still thank you for your assistance and contribution. – Matt – 2012-11-21T00:08:44.663

@Matt Great. Did you include the first line (setlocal)? Karan's solution is identical, other than doing the substrings on the same line. I don't want to leave this answer as is if there is a bug, but it works when I use it. – Paul – 2012-11-21T00:19:58.577

@Matt Actually, I think I found it - I had the filename quoted. – Paul – 2012-11-21T00:21:27.567

Sorry, this still isn't working.

C:\Wget\bin>( SET p1=$$p SET newname=!p1:~0,10! SET url=!p1:~10! wget -O .jpg ) SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc syswgetrc = C:\Wget/etc/wgetrc wgetrc_file_name = C:\Wget\etc\wgetrc wget: missing URL Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.

Edit: Sorry, I do not know how to format these comments. – Matt – 2012-11-21T16:07:15.200

0

Using awk and the shell (Cygwin or Git Bash):

file=/PATH/TO/INPUT_FILE.txt
awk '{print "wget \047" substr($0, 11) "\047 -O " substr($0, 1, 10) ".jpg"}' "$file" | sh

The same, as a multi-line version:

file=/PATH/TO/INPUT_FILE.txt
awk '
    {
        # characters 11 onward are the URL; characters 1-10 are the output name
        print "wget \047" substr($0, 11) "\047 -O " substr($0, 1, 10) ".jpg"
    }
' "$file" | sh

Gilles Quenot

Posted 2012-11-20T22:16:18.187

Reputation: 3 111