wget - Many URLs in .txt file - download and save as

1

0

I have 2000 URLs in an Excel file. The URLs are in the first column, and the second column holds the name under which each downloaded file should be saved. I can copy both columns into a .txt file if needed. The file names contain spaces. I need to do this on Windows 7. Could you help me?

@Edit: Sorry if my problem is unclear; I'm not a native English speaker. I have a URL in the first column, and I want to save the file downloaded from that URL under the name in the second column. I want to keep the spaces in the names. I want to download all the files with one command or batch file using the "wget" tool.

user194380

Posted 2013-01-31T19:31:58.227

Reputation: 21

wget -i will read a list of URLs from a file. I'm not sure how you'd get it to rename the files as it downloads them, though. – Rob – 2013-01-31T20:04:53.987

Answers

0

Steps

  1. Open your worksheet in Excel and click File → Save As.

  2. Choose CSV (comma separated values) as type and save your file as urls.csv.

  3. Close Excel to unlock the file.

  4. Open a command prompt, execute

    type urls.csv
    

    and identify the value separator (the character placed between URL and file name).

    If it's, e.g., a semicolon, execute the following command:

    for /f "delims=; tokens=1,2" %a in (urls.csv) do @wget -O "%b" "%a"
    

How it works

  • Excel saves the URLs and corresponding names as comma (or semicolon) separated values.

    Example:

    http://foo;bar
    http://foo bar;foobar
    
  • for /f ... %a in (urls.csv) goes through all lines and saves the first value in %a and the second in %b.

    Here, delims=; specifies the semicolon as value separator and tokens=1,2 selects the first and second value of each line.

  • wget -O "%b" "%a" downloads the URL in %a and saves it as %b. Since the URL is quoted, Wget will automatically take care of spaces and other special characters.

  • The @ in front of wget prevents the command from being printed.
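If type urls.csv shows commas rather than semicolons (as reported for Excel 2010 in the comments), the same loop should work with only the delimiter changed. A minimal sketch, assuming the file is still called urls.csv and that neither the URLs nor the names contain commas; the echo is an addition here and turns it into a dry run that only prints the commands:

```batch
for /f "tokens=1,2 delims=," %a in (urls.csv) do @echo wget -O "%b" "%a"
```

If the printed commands look right, remove echo to start the actual downloads.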

See also: For /f - Loop through text | SS64.com

Dennis

Posted 2013-01-31T19:31:58.227

Reputation: 42 934

could you give an example for a csv with urls only please? – Mike – 2014-12-02T19:14:42.897

@mugur: Would the URLs be separated by commas or newlines? – Dennis – 2014-12-02T19:25:58.770

newlines ... I got stuck in this answer and I did search for other options as wget has an option for this in the form of wget -i file.csv. Sorry for bothering. – Mike – 2014-12-02T20:22:13.433

Excel 2010 saves CSVs with commas for me. – Karan – 2013-01-31T20:19:10.523

Excel 2003 seems to use semicolons. – Dennis – 2013-01-31T20:21:27.340

Why call it "C"SV then and not SSV?! – Karan – 2013-01-31T20:22:09.567

Slightly misleading, yes. Other applications use tabs. Semicolons are usually a better choice, since it's less likely that they will naturally occur in the cells. Now that I think of it, it's probably because my Office is in Spanish. We use the comma as a decimal separator, so actual CSVs would be a poor choice... – Dennis – 2013-01-31T20:25:15.513

I don't think it's because of your Office language. Excel simply uses whatever character you've set as your OS' preferred List separator under Control Panel's Region and Language / Additional settings / Customize Format. – Karan – 2013-01-31T20:27:52.830

I checked and it's set to ,, but I found the responsible setting: You can adjust the decimal separator in Excel (in 2003, it's in Tools, Options, International). If it's set to ., Excel uses actual comma separated values. If it's ,, it uses semicolons instead. – Dennis – 2013-01-31T20:39:16.650

Thank you very much! It works. And thanks for the explanation! (LibreOffice Calc saves CSV with commas) – user194380 – 2013-01-31T21:17:01.153

0

Can we help you? Possibly, if you actually said what it is that you need to do. What do you mean by 'file names'?

Here is a general answer. 1) In a spreadsheet program copy the column that contains the data from which you want to remove spaces. 2) Save that to a .txt file. 3) Open that .txt file in any program with working search and replace. 4) Search for spaces and replace with _ 5) Save that .txt file 6) Open it in your spreadsheet program. 7) You should have a column with data_data_data. 8) Copy that column into your original file.

Would that solve the puzzle?

NoMonkey No

Posted 2013-01-31T19:31:58.227

Reputation: 1

could you give an example for a csv with urls only please? – Mike – 2014-12-02T19:14:57.987

Sorry if my problem is unclear; I'm not a native English speaker. I have a URL in the first column, and I want to save the file downloaded from that URL under the name in the second column. I want to keep the spaces in the names. I want to download all the files with one command or batch file using the "wget" tool. – user194380 – 2013-01-31T19:53:04.683

0

Say Input.txt looks like this:

http://cdn.sstatic.net/superuser/img/sprites.png sp ri te.png
http://www.google.com/images/srpr/logo3w.png go og le.png

A single command like the following:

for /f "tokens=1*" %i in (Input.txt) do wget -O "%j" "%i"

will save the files as sp ri te.png and go og le.png respectively.

To use in a batch file, just double the % signs.

Note: The URLs must not contain spaces, because the first space on each line separates the URL from the file name. Ensure any spaces in the URLs are encoded as %20.
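As a sketch of the batch-file variant, assuming a hypothetical file name download.bat saved in the same folder as Input.txt, and wget available on the PATH:

```batch
@echo off
rem download.bat (hypothetical name). Run it from the folder that contains Input.txt.
rem Each line of Input.txt: URL, one space, then the file name, which may contain spaces.
rem The first loop variable receives the URL, the second the rest of the line.
for /f "tokens=1*" %%i in (Input.txt) do wget -O "%%j" "%%i"
```

Only the loop variables in the script are doubled. Percent signs inside Input.txt (such as %20) are read as data, so they should not need doubling.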

Karan

Posted 2013-01-31T19:31:58.227

Reputation: 51 857

could you give an example for a csv with urls only please? – Mike – 2014-12-02T19:14:25.730

Here is the example of my URL, there is a redirection but wget has no problems with downloading. There are no spaces in the URLs. Will it work? http://www.dominiopublico.gov.br/pesquisa/DetalheObraDownload.do?select_action=&co_obra=1877&co_midia=2 – user194380 – 2013-01-31T20:11:41.287

I can (and did) try it, but would you care to try it yourself on some sample URLs and confirm? – Karan – 2013-01-31T20:16:36.177