26

Are there any version control systems out there that are particularly good (or bad) at dealing with large files? Nothing too crazy, but from several hundred megabytes to a gigabyte, let's say.

We currently have subversion in place, but there are some mutterings about it not being ideal for this purpose. I'm not a developer myself, and I don't know how objective they are being, so I thought I'd do a quick community survey for extra info.

I'm interested in the behaviour or suitability of these VCS solutions from a systems point of view as well as the user point of view.

TIA.

voretaq7
DictatorBob

12 Answers

7

You'll find that they are very much of a muchness when it comes to binary files.

The mutterings you have heard most likely stem from the notion that version-controlling binaries is somewhat at odds with the strengths of version control. Binary files generally can't be meaningfully diffed or merged, so they are treated as opaque copies: from the user's point of view, the whole file is replaced on every change, however small.

This isn't to say that you can't version control binary files, or indeed that it isn't useful for you to do so. If you need to roll back a file to the version that you committed yesterday, then version control has served its purpose.

That said, you might find that a storage solution with snapshots serves you better and more efficiently.

Dan Carley
  • 2
    A storage system with snapshots is what I would recommend as well. I use ZFS to version my virtual machine hard drives and it works well. The snapshots are almost instant and they only take up the space necessary to store the changed blocks. – Amok Oct 06 '09 at 17:03
  • Snapshots might be a good option. I guess it depends on whether or not I can set it up so they can be (mostly) independent. – DictatorBob Oct 06 '09 at 18:54
  • @Dan, Isn't this no better than simply copy-paste? – Pacerier Oct 22 '14 at 07:11
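
To make the difference between whole-file copies and block-level snapshots concrete, here is a minimal Python sketch. It is not how ZFS or Subversion work internally; the 128 KiB block size, the function names and the synthetic 8 MiB "file" are assumptions chosen purely for illustration.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # assumed block size, for illustration only


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and return one hash per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_bytes(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Bytes a block-level scheme would store for `new`, given `old` is already stored."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    total = 0
    for idx, digest in enumerate(new_h):
        # A block costs space only if it is new or differs from the old block.
        if idx >= len(old_h) or old_h[idx] != digest:
            start = idx * block_size
            total += len(new[start:start + block_size])
    return total


if __name__ == "__main__":
    original = bytes(8 * 1024 * 1024)            # 8 MiB of zeros stands in for a large binary
    edited = bytearray(original)
    edited[5_000_000:5_000_010] = b"\xff" * 10   # a small in-place edit
    edited = bytes(edited)

    print("whole-file copy stores:", len(edited), "bytes")
    print("changed blocks store:  ", changed_bytes(original, edited), "bytes")
```

Note that this only pays off for in-place edits. A change that inserts or removes bytes shifts everything after it, so every later fixed-size block would count as changed.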
6

It sounds like Boar would satisfy your requirements. It is version control for large binary files such as videos or pictures.

Mats Ekberg
  • This project appeals to me, because it doesn't have a working directory copy of the current repository state. Meaning only the backed up data (and revisions) and the original exist, not a 'backup' of the last updated state of the repo. Or at least that's how it seems to me. – MrSnowflake Mar 10 '11 at 14:24
4

Another option made for multimedia and creative workflows is AlienBrain, which is now owned by Avid. It's used by a lot of game studios to version control their game assets and code.

http://www.alienbrain.com/

It may not be the best solution though if you're not dealing with media assets.

3dinfluence
  • That's actually the one they used to use at a certain large game studio I worked at. Couldn't remember the name. Thanks. :) – DictatorBob Oct 06 '09 at 18:53
4

The vast majority of those who have to deal with loads of binary files (e.g. game developers) tend to use Perforce, sometimes with a layer over it.

3

git-annex "allows managing files with git, without checking the file contents into git. While that may seem paradoxical, it is useful when dealing with files larger than git can currently easily handle, whether due to limitations in memory, time, or disk space."
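
The idea in that quote, tracking a small reference in place of the file contents, can be sketched roughly as follows. This is not git-annex's implementation or on-disk format; the `store` directory layout, the `.ref` suffix and the function names are all hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files never sit fully in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def stash(path: Path, store: Path) -> Path:
    """Move a large file into `store` under its hash and leave a small reference file."""
    digest = file_sha256(path)
    store.mkdir(parents=True, exist_ok=True)
    target = store / digest
    if target.exists():
        path.unlink()                      # identical content is already stored
    else:
        shutil.move(str(path), str(target))
    ref = path.with_name(path.name + ".ref")
    ref.write_text(digest + "\n")          # this tiny file is what would be versioned
    return ref


def fetch(ref: Path, store: Path) -> Path:
    """Recreate the large file next to its reference by copying it out of `store`."""
    digest = ref.read_text().strip()
    out = ref.with_name(ref.name[: -len(".ref")])
    shutil.copyfile(store / digest, out)
    return out
```

Only the small reference files would go into version control in this sketch; the store itself could live on separate storage.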

sciurus
1

The Wikipedia page for Subversion also states that it supports binary files, although I don't have personal experience of this, so I can't comment on how well it works:

Native support for binary files, with space-efficient binary-diff storage.

1

If, and it's a big if, you're using Autodesk software (Maya, AutoCAD, Inventor, etc.), then there's Autodesk Vault.

I recently discovered this at work. It's not free, but it is seemingly the only VCS that works on Autodesk media asset files.

However, it's only really suitable for tracking changes in files it can inspect, so it's fine with drawing files, but it can't do the same for 'rendered assets'.

I'd probably go with git.

Tom O'Connor
1

An entirely lateral method is to use the union filesystem AUFS, which is used by Docker to allow users to create diffs against entire filesystem nodes and publish them. They talk about it on their blog.

This isn't version control with all of the tools of git but it does allow one to add and modify files in a large tree with no real size limit.

This would be a very robust solution just for the media files but I don't think it gives granular control so it would be best for projects where the need is similar to Docker's.

Adam Nelson
1

Adobe offers Version Cue CS4, which was made for multimedia projects. You might want to check that out.

Chris
1

I believe Bazaar handles binary files quite well; this seems to be documented here (4.1). I suppose it depends on whether you want to spend money, though, as the documentation does state that there are better tools out there (it doesn't name them, however).

PixelSmack
  • 3
    Quoting from that site: *That said, bzr is primarily a source code control system, not a media archive system. So it is not a priority to support enormous (hundred-megabyte) binaries or multi-gigabyte trees. There are other tools better suited to that.* – Cristian Ciupitu Oct 06 '09 at 20:13
0

Git will be able to deal with "several hundred megabytes to a gigabyte" binary files. It's very fast.

Aleksandr Levchuk
  • 1
    It is also not server based. Now, while I really, really like the idea of a distributed VCS where your local machine has a copy of the whole repository, this may be a LITTLE slow and cumbersome when your repository blows past 1000 GB, which is easy to reach when you deal with files in the gigabyte range. A central repository plus a local work folder keeps the local machines sane. – TomTom May 23 '12 at 13:00
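
A back-of-the-envelope sketch of that concern, assuming the worst case where every revision of a binary is stored in full (real systems may compress or delta some of this) and where a checkout holds only the latest revision of each file. The project figures are made up for illustration.

```python
def full_history_gb(file_gb: float, files: int, revisions_per_file: int) -> float:
    """Worst-case size of a clone that carries every revision of every file."""
    return file_gb * files * revisions_per_file


def working_copy_gb(file_gb: float, files: int) -> float:
    """Size of a checkout that holds only the latest revision of each file."""
    return file_gb * files


if __name__ == "__main__":
    # Hypothetical project: 50 assets of 1 GB each, 20 revisions apiece.
    print("full clone:  ", full_history_gb(1.0, 50, 20), "GB")
    print("working copy:", working_copy_gb(1.0, 50), "GB")
```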
0

Search for Digital Asset Management, usually abbreviated "DAM". It's a segment aimed at game developers, studios and scientists who have large files. There are quite a few commercial products, and the GUI may well be easy to use, since the segment is aimed at artists and non-engineer types. I am looking at resourcespace.org right now because it's open source and seems simple and flexible.

John P. Fisher
  • Git-lfs has been unveiled. YMMV – Deer Hunter Nov 11 '15 at 19:45
  • Yes, there is git-lfs, and also GitHub for Windows... I have not given up on them for my use. git-lfs + gitforwindows has a worse GUI but allows me to use local storage; GitHub for Windows requires you (I think) to store on GitHub, which is a non-starter for this topic, though it does have a nice GUI. Both give you Git Bash, which is great! – John P. Fisher Nov 12 '15 at 23:14
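
For context on the git-lfs comments: Git LFS keeps a small text pointer in the repository and stores the large file's content elsewhere. The sketch below builds a pointer in the layout described by the Git LFS pointer specification (a `version` line, an `oid` line and a `size` line). The helper name `lfs_pointer_text` is hypothetical, and this is an illustration rather than the Git LFS client's own code.

```python
import hashlib


def lfs_pointer_text(content: bytes) -> str:
    """Return pointer text for `content`: spec version, SHA-256 object id, size in bytes."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


if __name__ == "__main__":
    # A tiny payload stands in for a multi-hundred-megabyte asset.
    print(lfs_pointer_text(b"pretend this is a very large binary"), end="")
```

The pointer stays around a hundred-odd bytes regardless of the asset's size, which is why the repository itself remains small.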