Checking for dead links in an HTML file?


I have an HTML file with several hundred links (a research database of mine, of sorts, collected over the years). What would be the easiest way to check which of them are still alive?

(Importing into Firefox is out of the question.)

Rook

Posted 2010-05-10T03:25:55.050


Dup of http://superuser.com/questions/38428/application-to-check-broken-links

– Charles Stewart – 2010-05-11T12:39:12.107

Answers


Use wget. It's simple, scriptable, command-line, and available on your favorite platform, whether that's Unix-ish, Windows, Cygwin, etc. (see Wikipedia for links to the various versions). From the manpage:

--spider
When invoked with this option, Wget will behave as a Web spider, which means that it will not download the pages, just check that they are there. For example, you can use Wget to check your bookmarks:

wget --spider --force-html -i bookmarks.html

This feature needs much more work for Wget to get close to the functionality of real web spiders.

You might want the --no-verbose and/or --output-document=file options too.
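As the comments note, curl can do a similar job. A minimal sketch, assuming the links are plain absolute `href="http…"` attributes in the file; `check_links` is a hypothetical helper name, not part of curl itself:

```shell
#!/bin/sh
# Report the HTTP status of every absolute link in an HTML file.
# check_links is an illustrative helper; pass your own filename.
check_links() {
  grep -Eo 'href="https?://[^"]+"' "$1" |
    sed 's/^href="//; s/"$//' |
    while read -r url; do
      # curl -I sends a HEAD request (like wget --spider);
      # -s silences progress, -o /dev/null discards the response,
      # -w prints just the numeric status code.
      printf '%s %s\n' "$(curl -s -o /dev/null -I -w '%{http_code}' "$url")" "$url"
    done
}

# Usage: check_links bookmarks.html
# 200 means alive; 404 means dead; 000 means the connection failed.
```

Note that some servers reject HEAD requests, so a 405 or 403 doesn't necessarily mean the page is gone.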

quack quixote


probably doable with curl too. – quack quixote – 2010-05-10T03:41:02.453

@quack why is the answer marked as CW? – Sathyajith Bhat – 2010-05-10T04:19:52.157

@sathya: why not? :) – quack quixote – 2010-05-10T04:29:13.430

That will do, yes. Btw, I don't think this should be CW. – Rook – 2010-05-10T05:12:44.027

@idigas: honestly, it's CW because i believe this question's a duplicate. haven't managed to search up a predecessor yet tho. no biggie; if someone wants to add a curl example they can. – quack quixote – 2010-05-10T05:46:55.130

@quack - interesting. I also thought that I saw this question before, but curiously, haven't been able to dig it up ... – Rook – 2010-05-10T12:37:16.893

@quack Hmm I see. Still, this answer warrants some rep love :-) – Sathyajith Bhat – 2010-05-10T14:01:53.490