Force wget to timeout

5

1

How can I force wget to stop after X seconds?

I have a script that downloads images, and from time to time it gets stuck and refuses to time out.

What I've tried:

--tries=3 --connect-timeout=30

From ps aux:

root     26543  0.0  0.0  38636  1656 ?        S    20:40   0:00 wget -nc --tries=3 --connect-timeout=30 --restrict-file-names=nocontrol -O 18112012/image.jpg http://site/image.jpg

teslasimus

Posted 2012-12-05T18:49:36.980

Reputation: 443

Have you tried the --timeout (or -T) option? – gniourf_gniourf – 2012-12-05T19:04:46.797

yes... everything from wget man – None – 2012-12-05T19:12:36.980

Are you sure you tried wget -nc --tries=3 -T30 --restrict-file-names=nocontrol -O 18112012/image.jpg http://site/image.jpg? I've never had problems like you're describing. – gniourf_gniourf – 2012-12-05T19:17:58.223

Answers

2

You can run the wget command as a background process and send a SIGKILL to forcibly kill it after sleeping for a certain amount of time.

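wget ... &
wget_pid=$!
counter=0
timeout=60
# Poll once per second until wget exits or the timeout is reached.
while [[ -n $(ps -e | grep "$wget_pid") && "$counter" -lt "$timeout" ]]
do
    sleep 1
    counter=$(($counter+1))
done
# If wget is still running after the timeout, kill it.
if [[ -n $(ps -e | grep "$wget_pid") ]]; then
    kill -s SIGKILL "$wget_pid"
fi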

Explanation:

  • wget ... & - the & notation at the end runs the command in the background as opposed to the foreground
  • wget_pid=$! - $! is a special shell variable that contains the process id of the most recently started background command. Here we save it to a variable called wget_pid.
  • while [[ -n $(ps -e | grep "$wget_pid") && "$counter" -lt "$timeout" ]] - Look for the process once per second; if it's still there, keep waiting until the timeout limit is reached.
  • kill -s SIGKILL "$wget_pid" - We use kill to forcibly kill the wget process running in the background by sending it a SIGKILL signal.
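The steps above can be wrapped in a function so that a download that finishes early is not followed by a stray kill. This is only a sketch: run_with_deadline is a made-up name, the liveness check uses kill -0 instead of ps | grep (grep can also match a PID that merely contains the same digits), and returning 124 on timeout is a convention borrowed from GNU timeout, not something wget does.

```shell
#!/usr/bin/env bash
# run_with_deadline is a hypothetical helper name, not part of wget or bash.
# Usage: run_with_deadline SECONDS COMMAND [ARGS...]
run_with_deadline() {
    local limit=$1 counter=0 pid
    shift
    "$@" &          # start the command in the background
    pid=$!          # PID of that background command
    # Poll once per second while the process exists and the limit is not reached.
    while kill -0 "$pid" 2>/dev/null && (( counter < limit )); do
        sleep 1
        counter=$(( counter + 1 ))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -s KILL "$pid"        # still running after the limit: force it to stop
        wait "$pid" 2>/dev/null    # collect its status so the shell forgets the job
        return 124                 # assumption: reuse GNU timeout's "timed out" status
    fi
    wait "$pid"                    # finished in time: return the command's own status
}
```

With the command from the question this would be called as run_with_deadline 60 wget -nc --tries=3 ... and the return value tells the caller whether wget finished or was killed.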

sampson-chen

Posted 2012-12-05T18:49:36.980

Reputation: 359

What if wget succeeds? Will you send a SIGKILL at random? – gniourf_gniourf – 2012-12-05T19:08:14.850

How can I skip the "sleep" if the file has already been downloaded? – None – 2012-12-05T19:09:25.173

Rewrote the script to make it a bit more robust and to address the issue of skipping the sleep time – sampson-chen – 2012-12-05T19:17:53.313

@gniourf_gniourf is there evidence that it's harmful to send a SIGKILL to nothing? – kojiro – 2012-12-05T19:25:17.023

@kojiro depending on implementation, technically pids can get reused; so it's theoretically possible for it to hit another process. – sampson-chen – 2012-12-05T19:34:24.497

@kojiro Why to nothing? – gniourf_gniourf – 2012-12-05T19:34:34.673

@sampson-chen the probability of that happening is near zero. Exactly zero if you use the job number instead of the pid. – kojiro – 2012-12-05T20:32:01.037

12

Easiest way is to use the timeout(1) command, part of GNU coreutils, so available pretty much anywhere bash is installed:

timeout 60 wget ..various wget args..

or, if you want to hard-kill wget when it runs too long:

timeout -s KILL 60 wget ..various wget args..
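In a script it is usually worth checking why the command stopped. GNU timeout exits with status 124 when the time limit was hit, and with 137 (128 + 9) when the command had to be stopped with KILL. A rough sketch, where fetch_with_limit is a made-up wrapper name and the 5-second grace period passed to -k is an arbitrary choice:

```shell
#!/usr/bin/env bash
# fetch_with_limit is a hypothetical wrapper name used only for illustration.
# Usage: fetch_with_limit SECONDS COMMAND [ARGS...]
fetch_with_limit() {
    local limit=$1 status=0
    shift
    # -k 5: if the command is still alive 5 seconds after the initial TERM, send KILL.
    timeout -k 5 "$limit" "$@" || status=$?
    case $status in
        0)   echo "finished" ;;
        124) echo "timed out" ;;                      # limit reached, TERM was enough
        137) echo "killed" ;;                         # 128 + 9: stopped with KILL
        *)   echo "failed with status $status" ;;     # the command's own error
    esac
    return "$status"
}
```

For example, fetch_with_limit 60 wget ..various wget args.. prints one of the messages above and returns the same status, so the calling script can decide whether to retry.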

Chris Dodd

Posted 2012-12-05T18:49:36.980

Reputation: 291

You do know bash is installed on every Mac, which do not have a GNU userland, right? – kojiro – 2012-12-05T19:22:08.480

@kojiro: The question is tagged ubuntu, which includes GNU coreutils, and coreutils can readily be installed on a Mac if you want it. – None – 2012-12-05T19:24:55.207

Hmm, maybe it should've been migrated to askubuntu.com. >;) – kojiro – 2012-12-05T21:24:00.760

This is definitely the easiest and safest way. man timeout will show other options. I would personally not use the -s KILL option directly, but rather the -k option, something like timeout -k 5 60 wget ..various wget args.. (to give wget 5 extra seconds to exit before it is KILLed). This answer deserves +1. – gniourf_gniourf – 2012-12-06T07:49:12.050

0

I recently noticed that wget 1.14 was silently ignoring the --timeout option; it worked fine after I updated to 1.19.
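If you want a script to warn about an old wget before relying on --timeout, something like the following might do. It is only a sketch: both function names are made up, the comparison relies on GNU sort -V, and 1.19 is simply the version that worked for me, not necessarily the first version where the behaviour changed.

```shell
#!/usr/bin/env bash
# Both helpers are hypothetical names used only for illustration.

# Prints the version from the first line of "wget --version",
# which looks like "GNU Wget 1.19 built on linux-gnu."
installed_wget_version() {
    wget --version | awk 'NR==1 {print $3}'
}

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on the version sort (-V) provided by GNU sort.
version_at_least() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}
```

Usage would be along the lines of: version_at_least "$(installed_wget_version)" 1.19 || echo "wget may ignore --timeout" >&2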

adrianTNT

Posted 2012-12-05T18:49:36.980

Reputation: 223

0

Community wiki because this is mostly a copy of sampson-chen's answer, but I wanted to point out a couple of things:

wget ... &
# Strictly speaking you can just use the job number,
# which is probably %1, but saving the pid is also fine.
wget_pid=$! 
counter=0
timeout=60
# use kill -0 to check if a pid is still running
while kill -0 "$wget_pid" && (( counter < timeout )); do
    sleep 1
    (( counter++ ))
done
# If killing nothing is distasteful, use kill -0 one more time.
# I also think SIGKILL is overkill, since the question doesn't imply wget needs it.
kill -0 "$wget_pid" && kill "$wget_pid"
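If you are worried that wget might ignore the default TERM signal, a middle ground is to send TERM first and escalate to KILL only after a grace period. A sketch of that idea; stop_gently is a made-up name and the 5-second default grace period is an arbitrary choice:

```shell
#!/usr/bin/env bash
# stop_gently is a hypothetical helper name used only for illustration.
# Usage: stop_gently PID [GRACE_SECONDS]
stop_gently() {
    local pid=$1 grace=${2:-5} waited=0
    kill -0 "$pid" 2>/dev/null || return 0   # already gone: nothing to do
    kill "$pid" 2>/dev/null || return 0      # default signal is TERM
    # Give the process up to $grace seconds to exit on its own.
    while kill -0 "$pid" 2>/dev/null && (( waited < grace )); do
        sleep 1
        waited=$(( waited + 1 ))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -s KILL "$pid" 2>/dev/null || true   # TERM was not enough: escalate
    fi
    return 0
}
```

In the script above, the last line would then become stop_gently "$wget_pid" 5.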

kojiro

Posted 2012-12-05T18:49:36.980

Reputation: 251