Why is wget's wall clock time much higher than the download time?


wget --page-requisites --span-hosts --convert-links --adjust-extension --execute robots=off --user-agent Mozilla --random-wait https://www.invisionapp.com/inside-design/essential-steps-designing-empathy/

The command above provides the following time stats:

Total wall clock time: 35s
Downloaded: 248 files, 39M in 4.2s (9.36 MB/s)

This website takes about 5 seconds to download and display all files on a hard refresh in the browser.

Why is the wall clock time significantly longer than the download time, and is there a way to make it faster?
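One way to narrow this down (a rough sketch, untested): time the same fetch without --random-wait, since that option varies the pause wget inserts between requests, and compare the reported wall clock time against the run above.

# Same command as above minus --random-wait; compare "Total wall clock time".
time wget --page-requisites --span-hosts --convert-links --adjust-extension --execute robots=off --user-agent Mozilla https://www.invisionapp.com/inside-design/essential-steps-designing-empathy/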

– JustCodin – 2019-06-17T22:17:34.983

Welcome to SuperUser. Please use the search function. You can find similar questions at https://stackoverflow.com/questions/3430810/wget-download-with-multiple-simultaneous-connections and several other places in the Stack Exchange network.

– Christopher Hostage – 2019-06-17T22:38:17.147

Thanks. I did search, but I believe this question is different. If total_download_time represents the total time spent downloading the files, then it matches the browser's speed, which would be ideal; however, wall_clock - total_download_time = 35s - 4.2s = 30.8s seems like a rather excessive amount of time to write 248 files with a cumulative size of 39M to disk.

I've also tried removing --convert-links and adding --no-clobber to the wget command, and running wget_command & wget_command in the terminal to spawn multiple processes and parallelize the download, but without success. – JustCodin – 2019-06-17T23:09:40.433

A better question might be "What tool should I use instead of wget?" Wget is single-threaded. – Christopher Hostage – 2019-06-17T23:11:09.853

@ChristopherHostage Thanks. I may ask a new question. But even though wget is single-threaded, shouldn't launching multiple instances with wget_command & wget_command (from the article you linked, which I had found and tried before), combined with --no-clobber, effectively achieve a parallel download (sketched below)? My understanding is that --no-clobber makes each instance skip any file that has already been downloaded, even partially, and move on to the next file that still needs downloading. But my results don't show this, which is why I asked this question. And 30.8s still seems like a long time to write 39M. – JustCodin – 2019-06-17T23:52:20.550
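For concreteness, the parallel attempt described above would look roughly like this (a sketch, untested; the option list is copied from the question, with --convert-links dropped and --no-clobber added as described in the comments):

URL=https://www.invisionapp.com/inside-design/essential-steps-designing-empathy/
OPTS="--page-requisites --span-hosts --adjust-extension --execute robots=off --user-agent Mozilla --random-wait --no-clobber"
# Two instances share one download directory; --no-clobber makes each skip
# files the other has already saved. $OPTS is left unquoted on purpose so
# the shell splits it into separate wget arguments.
wget $OPTS "$URL" & wget $OPTS "$URL" &
wait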

No answers