Commenting in a wget list?

5

2

I need to download about 100 packages, so I'm using a wget-list to make it easier. Once I've made the list (I assume it's a plain .txt file), is there a way to insert comments into it that wget will ignore? Something like this:

#This is a comment
http://someurl.com
http://anotherurl.com

n0pe

Posted 2011-04-02T12:45:51.810

Reputation: 14 506

Answers

1

It doesn't look like it:

If --force-html is not specified, then file should consist of a series of URLs, one per line.

You could try HTML-style comments, such as <!-- Comment -->. Maybe those get interpreted as comments, although I wouldn't count on it.

You could also use the --force-html parameter and feed it HTML, a format in which you're free to comment as much as you like. The downside is that it adds a lot of clutter:

<!-- This is a comment -->
<a href="http://someurl.com"></a>
<a href="http://anotherurl.com"></a>
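If you'd rather keep the plain list as the source and generate the HTML form above from it, something like the following could work. This is only a sketch: list_to_html and the demo file are made-up names, it assumes comments are whole lines starting with #, and it does no HTML escaping of the URLs or comment text.

```shell
# Hypothetical helper: turn "# comment" lines and URL lines into HTML.
list_to_html() {
    awk '
        /^[ \t]*$/ { next }    # skip blank lines
        /^#/ { sub(/^#[ \t]*/, ""); print "<!-- " $0 " -->"; next }
        { print "<a href=\"" $0 "\"></a>" }
    ' "$1"
}

# Demo input, mirroring the list from the question.
list=$(mktemp)
printf '%s\n' '# This is a comment' 'http://someurl.com' 'http://anotherurl.com' > "$list"

list_to_html "$list"
```

The result would then be saved to a file and passed to wget with --force-html -i, for example: list_to_html list.txt > list.html followed by wget --force-html -i list.html.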

Pekka

Posted 2011-04-02T12:45:51.810

Reputation: 2 239

Yeah HTML makes it way too messy. Thanks for the clarification. – n0pe – 2011-04-02T13:01:27.590

4

You can pipe through grep or sed to remove comments:

grep -v '^#' ~/list.wget | wget -i- -c -B http://base.url.if_needed
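A sed version of the same idea, as a rough sketch. The strip_comments name and the demo file are just for illustration. It assumes comments are whole lines whose first non-blank character is #, so URLs that contain a # fragment are left alone, and it also drops blank lines.

```shell
# Hypothetical helper: print the list without comment lines and blank lines.
strip_comments() {
    sed -e '/^[[:space:]]*#/d' -e '/^[[:space:]]*$/d' "$1"
}

# Demo input with a comment, a blank line and an indented comment.
list=$(mktemp)
printf '%s\n' '# This is a comment' 'http://someurl.com' '' '  # indented comment' 'http://anotherurl.com' > "$list"

strip_comments "$list"
# To download instead of printing:
#   strip_comments "$list" | wget -i - -c
```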

bsquared

Posted 2011-04-02T12:45:51.810

Reputation: 41

4

Just put comments in your list without any notation; wget will simply pick them up as invalid URLs.

Anthony

Posted 2011-04-02T12:45:51.810

Reputation: 41

0

I tested wget with the inline comment markers listed at https://en.wikipedia.org/wiki/Comparison_of_programming_languages_%28syntax%29#Inline_comments

I discovered that wget does not support a comment character. However, the following lead-ins generate quick "Invalid URL" errors (list line on the left, wget output on the right):

:  Test comment 1   list: Invalid URL :  Test comment 1: Scheme missing
:: Test comment 2   list: Invalid URL :: Test comment 2: Scheme missing
#  Test comment 3   list: Invalid URL http://#  Test comment 3: Invalid host name
// Test comment 4   list: Invalid URL // Test comment 4: Scheme missing

These are not listed in the Wikipedia article, but they also cause quick "Invalid URL" errors:

/ Test comment 1    list: Invalid URL / Test comment 1: Scheme missing
[ Test comment 1    list: Invalid URL http://[ Test comment 1: Unterminated IPv6 numeric address
@ Test comment 1    list: Invalid URL http://@ Test comment 1: Invalid user name
? Test comment 1    list: Invalid URL http://? Test comment 1: Invalid host name

The remaining comment lead-in strings all caused wget to attempt to resolve a domain name using DNS, resulting in at least eight lines of error output.

I also discovered that wget scans the entire list file and builds a list of URLs to fetch before it starts fetching. For example, if you have a list file containing:

# test comment 1
# test comment 2
http://superuser.com/questions/265711/commenting-in-a-wget-list

# test comment 3
# test comment 4
# test comment 5
# test comment 6

The wget output is:

list: Invalid URL http://# test comment 1: Invalid host name
list: Invalid URL http://# test comment 2: Invalid host name
list: Invalid URL http://# test comment 3: Invalid host name
list: Invalid URL http://# test comment 4: Invalid host name
list: Invalid URL http://# test comment 5: Invalid host name
list: Invalid URL http://# test comment 6: Invalid host name
--2015-08-19 14:03:55--  http://superuser.com/questions/265711/commenting-in-a-wget-list
Resolving superuser.com (superuser.com)... 190.93.247.58, 190.93.244.58, 141.101.114.59, ...
Connecting to superuser.com (superuser.com)|190.93.247.58|:80... connected.
HTTP request sent, awaiting response... 200 OK
<snip>

Thus, while : :: # / // [ @ ? can all safely be used as comment lead-in characters, the resulting errors are all output first and do not appear in line with wget's attempts to fetch pages.
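If you want the comments to show up in order with the downloads, one option is to skip -i and walk the list yourself. A minimal sketch, assuming # comments at the start of a line; fetch_list and the FETCH override are hypothetical names that exist only so the loop can be tried without touching the network.

```shell
# Hypothetical helper: handle the list one line at a time so that
# comments are printed in place, between the fetches.
# FETCH is an assumed override hook; it defaults to "wget -c".
fetch_list() {
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '')   ;;                                      # skip blank lines
            '#'*) printf '%s\n' "$line" ;;                # echo the comment
            *)    ${FETCH:-wget -c} "$line" < /dev/null ;; # treat as a URL
        esac
    done < "$1"
}

# Dry run: echo the URLs instead of downloading them.
list=$(mktemp)
printf '%s\n' '# test comment 1' 'http://superuser.com/questions/265711/commenting-in-a-wget-list' '' '# test comment 3' > "$list"

FETCH='echo would fetch:' fetch_list "$list"
```

The trade-off is that this starts one wget process per URL, so it gives up whatever wget gains from handling the whole list in a single run.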

user3347790

Posted 2011-04-02T12:45:51.810

Reputation: 353