32

I have a free Dropbox account (2GB), and I was wondering how the versioning of large files works.

I have a full backup of all my web files that sits at just over 1GB. After the initial 1GB upload, will Dropbox work out the delta of the file every time it syncs, or will it have to upload the entire thing again to version it?

It would be cool to always have an up-to-date version of a large file, but I don't want to kill my bandwidth uploading 1GB every time.

Is this possible?

Thanks,

barfoon

3 Answers

38

Dropbox uses a binary diff algorithm to break all files down into blocks, and only uploads the blocks that it doesn't already have in the cloud. All of this is done locally on your computer.

Dropbox doesn't just use the files that you have already uploaded; it aggregates everyone's files into one database of blocks and checks each local block hash against that database.

This means that if someone else has uploaded the same file as you (say, for example, the latest Ubuntu ISO), the upload will seem instant because there is nothing to upload. If you are updating a file that changes regularly, like your backup file, only the changes are uploaded. If you upload a totally unique file, you have to wait for it all to upload.
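To make the block idea concrete, here is a minimal sketch in Python. It is not Dropbox's actual client code: the SHA-256 hash, the fixed-size split, and the names `block_hashes` and `blocks_to_upload` are all assumptions for illustration, and the 4MB block size is borrowed from another answer in this thread.

```python
import hashlib

# Assumed block size (4 MB); not confirmed by this answer.
BLOCK_SIZE = 4 * 1024 * 1024


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Split data into fixed-size blocks and return one hex digest per block."""
    return [
        hashlib.sha256(data[start:start + block_size]).hexdigest()
        for start in range(0, len(data), block_size)
    ]


def blocks_to_upload(data: bytes, known_hashes: set,
                     block_size: int = BLOCK_SIZE) -> list:
    """Return the indexes of blocks whose hash is not already known remotely."""
    return [
        index
        for index, digest in enumerate(block_hashes(data, block_size))
        if digest not in known_hashes
    ]


if __name__ == "__main__":
    # Tiny 4-byte blocks so the behaviour is easy to see.
    old_backup = b"aaaabbbbcccc"
    new_backup = b"aaaaXbbbcccc"  # one byte changed, inside the second block
    known = set(block_hashes(old_backup, 4))
    print(blocks_to_upload(new_backup, known, 4))  # only block 1 is new
```

With fixed-size blocks like these, an in-place change only affects the blocks it touches, but inserting bytes near the start of a file would shift every later block and change all of their hashes.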

Moo
  • 5
    Any references to this? It's pretty interesting – STW Aug 11 '09 at 17:46
  • Yeah, sounds fantastic. Just wondering how you know that though? Thanks, – barfoon Aug 11 '09 at 17:48
  • 1
    The Dropbox team talk about it every now and then in the forums (Arash F especially, although they are very busy these days). – Moo Aug 11 '09 at 17:48
  • Very cool - thanks for the information. Going to revamp the backup strategy here now! – barfoon Aug 11 '09 at 17:50
  • 3
    Does this mean it would only upload changed blocks of an encrypted file (e.g. a TrueCrypt volume) as well? – Will M Aug 11 '09 at 19:49
  • 1
    Will - yes, I believe quite a few people use TrueCrypt within their Dropbox folders with great success. – Moo Aug 11 '09 at 19:50
  • 2
    The last part of your answer is no longer true. After the 'Dropship' debacle, changes were made. It is likely that they still de-dupe internally, but if you put "windows8.iso" (which, odds are, at least someone has already uploaded) in your folder now, you will have to upload every byte. – DanO Dec 10 '12 at 20:10
  • 1
    This is not currently the right answer. Dropbox disabled global deduplication (skipping the upload of blocks that other people have already uploaded) for privacy reasons. – Ricardo Polo Jaramillo Oct 04 '14 at 18:28
12

For what it's worth, Dropbox claims to create a hash for every 4MB chunk of each file. That way, if you change a contiguous 2MB of a 100MB file, it will likely only need to upload 4MB (or 8MB if the change crosses into a second 4MB block) to re-sync the file.

The hashes we use are only for the 4MB file chunks

Source: https://blogs.dropbox.com/tech/2016/05/inside-the-magic-pocket/
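The 4MB versus 8MB arithmetic above can be checked with a short Python sketch. It assumes blocks are aligned to 4MB boundaries from the start of the file and that a changed block is re-uploaded in full; the names `touched_blocks` and `upload_bytes` are made up for this example.

```python
MB = 1024 * 1024
BLOCK_SIZE = 4 * MB  # chunk size quoted above


def touched_blocks(offset: int, length: int, block_size: int = BLOCK_SIZE) -> range:
    """Indexes of the fixed-size blocks overlapped by a changed byte range."""
    if length <= 0:
        return range(0)
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return range(first, last + 1)


def upload_bytes(offset: int, length: int, file_size: int,
                 block_size: int = BLOCK_SIZE) -> int:
    """Bytes to re-send if every touched block is uploaded whole."""
    total = 0
    for index in touched_blocks(offset, length, block_size):
        start = index * block_size
        # The final block of a file may be shorter than block_size.
        total += min(block_size, file_size - start)
    return total


if __name__ == "__main__":
    # 2 MB change that stays inside the first block of a 100 MB file.
    print(upload_bytes(1 * MB, 2 * MB, 100 * MB) // MB)  # 4
    # The same 2 MB change straddling the boundary between blocks 0 and 1.
    print(upload_bytes(3 * MB, 2 * MB, 100 * MB) // MB)  # 8
```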

mightytightywty
4

It's also important to highlight that it doesn't upload your whole file again when you change it. For example, say you have a unique file weighing 2GB, such as an encrypted disk image (like the ones you use with TrueCrypt or PGPdisk), and you change just a couple of files inside the encrypted disk: Dropbox will only upload the blocks that actually changed. So if you upload your 2GB PGPdisk file to Dropbox and then change, say, 100MB of it, Dropbox is intelligent enough to detect and upload only what has changed. That way you don't waste upload bandwidth on data that is already there.

Another feature that I saw the Dropbox team is working on is making Dropbox detect other instances of Dropbox running on your local network and sync the data between them directly. For example, you have a laptop and a desktop, both on the same Dropbox account, and you update your files on the desktop, which instantly syncs with the "cloud". When you plug your laptop in, instead of going to the cloud, Dropbox will download the diff directly from your desktop computer and won't waste your download bandwidth. This is still to come, but it will be a sweet feature!

Macaubas