I just copied 200 GB from a USB HDD to my main drive.
There were about 130,000 files.
After the first 4-5 minutes I observed that:
- For the smallest files, the rate was about 100 files per second, at about 600 KB/s
- For big files, it was around 70 MB/s
At the beginning, Windows changed the estimate from about 1 hour to 5+ hours, then back to 1 hour, and so on.
Near the end, at around 95%, it was still swinging between 10 minutes and 10+ hours.
So instead of becoming more accurate, the estimate was getting less and less precise.
Simple math shows:
130,000 files at 100 files per sec = 22 minutes
200,000 MB at 70 MB per sec = 47 minutes
22 minutes: the time lost to seeking while copying files only a few kilobytes in size.
47 minutes: the time needed to transfer the actual data if there were no seek time.
The sum, 22 min + 47 min, is the absolute maximum time it could possibly take.
So obviously the estimate should be somewhere between 47 and 69 minutes.
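The arithmetic above can be written out as a small Python sketch. The function name `transfer_bounds` and its parameters are made up for illustration. It assumes, as the reasoning above does, that per-file overhead and raw transfer time simply add up in the worst case.

```python
def transfer_bounds(n_files, total_mb, files_per_sec, mb_per_sec):
    """Return (lower, upper) bounds in minutes for the whole copy.

    lower: pure streaming time, as if there were no per-file overhead.
    upper: streaming time plus the per-file (seek) overhead for every file.
    """
    overhead_min = n_files / files_per_sec / 60
    streaming_min = total_mb / mb_per_sec / 60
    return streaming_min, streaming_min + overhead_min


# Numbers observed during the copy described above.
low, high = transfer_bounds(130_000, 200_000, 100, 70)
print(f"between {low:.1f} and {high:.1f} minutes")
```

With the observed rates this lands in the same 47 to 69 minute window, give or take rounding.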
What the dialog shows at about 90%:
"I am copying some small files at 1 MB/s, there is 20 GB more data, it will take 5:30 hours to complete."
A few seconds later:
"I am copying a big file here, at 70 MB/s it will take 4 minutes to complete."
What a human actually sees from the same dialog:
120,000 files and 180 GB have been copied in 40 minutes. The remaining 10,000 files and 20 GB should take about 5 minutes.
The dialog has enough information to make a calculation that gets more and more accurate each second. It knows the rate at which small files are copied. It knows the speed at which big files are copied. It also knows how many files and how many bytes are left.
It is simple to make a reasonably accurate estimate just by setting an upper and a lower limit.
The dialog shows somewhat more correct data only when the big files come before the small files. In that case it starts at 40 minutes, and after 30 minutes it starts copying small files and says "well, I need 20 more minutes".
But when the small files are at the beginning and the big files are at the end, the dialog does not actually care at what "files per second" rate it transfers the small files. It calculates as if the number of small files were infinite, and as if the files would stay small forever.
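As a rough sketch of that idea, and not how the Windows dialog actually works, an estimator could track the two rates separately and bound the remaining time with them. Everything here is hypothetical: the class name `CopyEstimator`, the 1 MB cutoff between "small" and "big" files, and the simplification that small files cost only per-file overhead while big files cost only transfer time.

```python
class CopyEstimator:
    """Tracks files/s for small files and MB/s for big files separately."""

    def __init__(self, small_limit_mb=1.0):
        self.small_limit_mb = small_limit_mb  # hypothetical cutoff, an assumption
        self.small_files = 0
        self.small_secs = 0.0
        self.big_mb = 0.0
        self.big_secs = 0.0

    def record(self, size_mb, seconds):
        """Record one finished file and how long it took to copy."""
        if size_mb < self.small_limit_mb:
            self.small_files += 1
            self.small_secs += seconds
        else:
            self.big_mb += size_mb
            self.big_secs += seconds

    def remaining_bounds(self, files_left, mb_left):
        """Return (lower, upper) seconds left, or None until both rates are known."""
        if self.small_secs == 0 or self.big_secs == 0:
            return None
        files_per_sec = self.small_files / self.small_secs
        mb_per_sec = self.big_mb / self.big_secs
        lower = mb_left / mb_per_sec                # no per-file overhead at all
        upper = lower + files_left / files_per_sec  # overhead paid for every file
        return lower, upper


est = CopyEstimator()
est.record(size_mb=0.006, seconds=0.01)  # one small file: about 100 files/s
est.record(size_mb=700, seconds=10)      # one big file: 70 MB/s
lower, upper = est.remaining_bounds(files_left=10_000, mb_left=20_000)
print(f"{lower / 60:.1f} to {upper / 60:.1f} minutes left")
```

With the rates from the copy above and 10,000 files plus 20 GB left, both bounds stay within a few minutes of each other, instead of jumping between 4 minutes and 5:30 hours.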
And the canonical answer: http://blogs.msdn.com/b/oldnewthing/archive/2004/01/06/47937.aspx – Cody Gray – 2011-04-20T12:01:18.767
http://imgs.xkcd.com/comics/estimation.png – Smudge – 2012-01-04T15:12:54.857
Also, this should apply to any OS, not just Windows, as I believe the constraints are universal. – Clockwork-Muse – 2012-01-04T18:21:02.897
Jeff Atwood also wrote a very interesting article about this once: http://www.codinghorror.com/blog/2008/03/actual-performance-perceived-performance.html It also contains links for further reading on this topic. – BennyInc – 2012-01-05T09:33:20.703
http://superuser.com/questions/21980/user-interface-annoyances/22006#22006 – Troggy – 2009-09-18T22:06:10.563
The progress bar shows the # of files completed, not the % time completed, fyi. – Factor Mystic – 2009-09-19T00:34:09.647
Also to note is Mark Russinovich's blog post: http://blogs.technet.com/b/markrussinovich/archive/2008/02/04/2826167.aspx – surfasb – 2012-09-13T20:45:42.313
Elaboration of xkcd 612. – Peter Mortensen – 2014-02-01T16:19:58.700