
We are a small company that does video editing, among other things, and need a place to keep backup copies of large media files and make it easy to share them.

I've got a box set up with Ubuntu Server and 4 x 500 GB drives. They're currently set up with Samba as four shared folders that Mac/Windows workstations can see fine, but I want a better solution. There are two major reasons for this:

  1. 500 GB is not really big enough (some projects are larger)
  2. It is cumbersome to manage the current setup, because the individual hard drives have different amounts of free space and duplicated data (for backup). It is confusing now, and that will only get worse once there are multiple servers. ("the project is on server2 in share4", etc.)

So, I need a way to combine hard drives such that I avoid complete data loss from the failure of a single drive, and so that users see only a single share on each server. I've done Linux software RAID5 and had a bad experience with it, but would try it again. LVM looks OK, but it seems like no one uses it. ZFS seems interesting, but it is relatively "new".

What is the most efficient and least risky way to combine the hard drives that is convenient for my users?


Edit: The goal here is basically to create servers that contain an arbitrary number of hard drives while limiting complexity from an end-user perspective (i.e. they see one "folder" per server). Backing up data is not the issue here, but how each solution responds to hardware failure is a serious concern. That is why I lump RAID, LVM, ZFS, and who-knows-what together.

My prior experience with RAID5 was also on an Ubuntu Server box and there was a tricky and unlikely set of circumstances that led to complete data loss. I could avoid that again but was left with a feeling that I was adding an unnecessary additional point of failure to the system.

I haven't used RAID10, but we are on commodity hardware and the most data drives per box is pretty much fixed at 6. We've got a lot of 500 GB drives, and 1.5 TB (what RAID10 would yield from six of them) is pretty small. (Still an option for at least one server, however.)

I have no experience with LVM and have read conflicting reports on how it handles drive failure. If a (non-striped) LVM setup could survive a single drive failing and only lose whichever files had a portion stored on that drive (and it stored most files on a single drive only), we could even live with that.

But as long as I have to learn something totally new, I may as well go all the way to ZFS. Unlike LVM, though, I would also have to change my operating system(?), so that increases the distance between where I am and where I want to be. I used a version of Solaris at uni and wouldn't mind it terribly, though.

On the other end of the IT spectrum, I think I may also explore FreeNAS and/or Openfiler, but that doesn't really solve the how-to-combine-drives issue.

privatehuff
  • ZFS is really only considered a stable, production-ready option on Solaris/OpenSolaris (although a few people would argue even with its readiness there). – Christopher Cashell May 27 '09 at 20:51
  • Another note regarding LVM... stop thinking about it in terms of redundancy and disk failure. LVM should never know about disk failures, because they should be handled at a lower level (RAID). LVM gives you the ability to manage and partition your disks in a cleaner and more flexible way, but it doesn't add redundancy, and it handles a disk failure the same way a non-LVM partition does (it blows up). – Christopher Cashell May 27 '09 at 20:55

14 Answers


LVM is actually quite heavily used. Basically, LVM sits above the hardware (driver) layer. It doesn't add any redundancy or increased reliability (it relies on the underlying storage system to handle reliability). Instead, it provides a lot of added flexibility and additional features. LVM should never see a disk disappear or fail, because the disk failure should be handled by RAID (be it software or hardware). If you lose a disk and can't continue operating (rebuild the RAID, etc), then you should be going to backups. Trying to recover data from an incomplete array should never be needed (if it is, you need to reevaluate your entire design).

Among the things you get with LVM are the ability to easily grow and shrink partitions/filesystems, the ability to dynamically allocate new partitions, and the ability to snapshot existing partitions and mount the snapshots as read-only or writable volumes. Snapshots can be incredibly useful, particularly for things like backups.
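For instance, growing a filesystem or snapshotting it for a backup run becomes a one-liner once LVM is in place. A minimal sketch, assuming an existing volume group; the names (vg0, media) are placeholders:

    lvextend -L +100G /dev/vg0/media                  # grow the logical volume...
    resize2fs /dev/vg0/media                          # ...then grow the ext3 filesystem to match, online
    lvcreate -s -L 20G -n media_snap /dev/vg0/media   # point-in-time snapshot of the volume
    mount -o ro /dev/vg0/media_snap /mnt/backup       # mount it read-only for the backup run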

Personally, I use LVM for every partition (except /boot) on every box I build, and I've been doing so for the past 4 years. Dealing with non-LVM'ed boxes is a huge pain when you want to add or modify your disk layout. If you're using Linux, you definitely want to use LVM. [Note: This above stuff on LVM has been updated to better explain what it is and how it fits into the storage equation.]

As for RAID, I don't do servers without RAID. With disk prices as cheap as they are, I'd go with RAID1 or RAID10. Faster, simpler, and much more robust.
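With Linux software RAID, that's a single command. A sketch; the device names are placeholders for your four data drives:

    mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde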

Honestly though, unless you're wedded to Ubuntu (which I would normally recommend), or the box is performing other tasks, you might want to look into OpenFiler. It turns your box into a storage appliance with a web interface, will handle all of the RAID/LVM/etc. for you, and will allow you to export the storage as SMB, NFS, iSCSI, etc. Slick little setup.

Christopher Cashell
  • I second the OpenFiler suggestion. It seems one can never keep up with the amount of needed disk space, and separating the server from the data can make the end-user experience and the management so much easier and better. There is a reason why NetApp has been so successful. I would suggest following that model. – pcapademic May 29 '09 at 09:51
  • I had looked briefly into FreeNAS... any reason to choose OpenFiler instead? – privatehuff May 29 '09 at 14:35
  • When I first started playing with them a couple of years ago, OpenFiler looked more stable, more featureful, had better driver support, and was under more active development. I decided to give it a shot, and it worked out really well for me. I honestly haven't looked at FreeNAS since. It may be that FreeNAS has caught up to OpenFiler, but I don't know. – Christopher Cashell May 29 '09 at 16:51
  • I've been using FreeNAS for SMB and iSCSI target and it's been brilliant, never missed a beat. On the other hand, I haven't evaluated OpenFiler, so I've no idea how it compares – Mark Henderson Jun 25 '09 at 00:06

ZFS is really reliable, and it sure does make your storage management a hell of a lot easier. As a bonus: SMB is integrated with ZFS in OpenSolaris, and it handles RAID very well. Wait a few days, download the 2009.06 release when it's out, and give it a go on a test machine. I am sure you will love ZFS.
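As a sketch of how little administration that takes (the pool and dataset names here are made up, and the disk names depend on your hardware):

    zpool create tank raidz c1t1d0 c1t2d0 c1t3d0 c1t4d0   # single-parity RAID-Z over four disks
    zfs create tank/projects                              # one dataset for all the projects
    zfs set sharesmb=on tank/projects                     # shared over SMB with one command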

And about your comment on ZFS being new: it's not very new anymore!

Wijnand
  • "New" meant mostly to me, and just that I haven't been hearing about it and playing with it for years etc. But so, I need to be running OpenSolaris to use ZFS? – privatehuff May 27 '09 at 18:56
  • I have heard that there is some ZFS support on other Unixes. – Brad Gilbert May 27 '09 at 19:23
  • If you need ZFS, you need it on OpenSolaris. Implementations on Linux are done via userspace and as a result are subject to performance overhead and conflicting cache policies. – jldugger May 28 '09 at 02:10
  • ZFS support on other operating systems is severely lacking in functionality. And really, OpenSolaris is a really nice operating system, but please use the real OpenSolaris. zfs set sharesmb=on backup/share1 is really nice integration. – Wijnand May 28 '09 at 08:35

The central question is: "How important is this data?"

If the answer is "I can recreate it easily" you want RAID5, possibly with LVM on top of it for simplicity of management.

If the answer is "I can recreate it but it would take a while and people would complain" you want RAID 6 or more likely RAID 1/10.

If the answer is "Nobody does any work while I recreate it and make sure it's bit-perfect" you want ZFS/Raid-Z

Note that in every case you need to be able to recreate it: RAID isn't a backup.

user2108
  • I was going for the importance of the data more than the relative performance characteristics. The degradation during rebuild is also important of course but not what I was addressing. – user2108 May 27 '09 at 19:38
  • How does RAID-Z or RAIDZ2 provide better redundancy than RAID6? Both tolerate at most any 2 failed disks in an array. Also, RAID10 doesn't handle a second failing disk as well as RAID6, because the right disk has to fail, not just any of the remaining ones. Performance is way better with RAID10 than with RAID6. Side note: Linux 2.6.30 is out and allows you to migrate from RAID1 -> RAID5 <-> RAID6 == awesomeness! – Martin M. Jun 10 '09 at 06:40
  • I've heard RaidZ2 eliminates the write-hole. Basically what mdadm + write journaling do. However, I don't have personal experience with RaidZ, so I'm just repeating what I've heard. – Brain2000 Jun 21 '20 at 14:37

To connect a lot of drives in the same chassis, a hardware RAID controller is the best tool. It will provide lots of SATA connectors for your drives, redundancy via RAID-5 or preferably RAID-6, and may provide better performance too.

Software RAID performance is often better than hardware RAID in benchmarks; however, file serving and software RAID are both CPU-intensive and compete for your processors under load. My experience shows that unless you use dual quad-core systems, properly configured hardware RAID will beat software RAID hands down.

Good hardware controllers with good Linux support:

  • Areca
  • 3Ware
  • the new Adaptec series (the old are slooooow)
  • LSI MegaRAID
wazoox
  • I am gonna chime in on the 3ware cards; they are wicked, especially the 9650SE and the 9690SE. I've used both types of cards with up to 16 drives on the 9650SE: no real issues, and a good stable product. – Feb 01 '10 at 02:20

RAID is NOT like LVM. You use RAID to make fault-tolerant partitions, while LVM is used to ease disk partitioning and filesystem management. You can use LVM on top of RAID, or use ZFS (ZFS can play the role of both RAID and LVM). In my opinion, ZFS works better than LVM, but:

  • it runs on Solaris 10/11/OpenSolaris only; you can't use it from Linux
  • ZFS is disk management and filesystem combined, while LVM allows you to use any filesystem you need

On Ubuntu, I prefer to use MD RAID5 with LVM.

Paul Rudnitskiy
  • ZFS is available on FreeBSD 7.x as well as Nexenta. – jharley May 27 '09 at 18:41
  • Can you use ZFS as a disk management/raid, while skipping the filesystem portion? So it would be a direct replacement for mdadm and LVM? If so, having a combination of ZFS and SCST for iSCSI might be an interesting solution. However, if you are forced to use the ZFS file system, then that goes out the window. – Brain2000 Jun 21 '20 at 14:00

Take a look at what Nexenta and OpenSolaris are offering, and I think you'll be very pleased with what you can get for nothing. It's rumoured that the next releases of OpenFiler will use the FreeBSD ZFS port as well (though they're quite behind from a feature perspective).

That being said, I try to avoid doing RAID5, RAID6 or RAID50 in software and prefer to use hardware controllers to offload all the XOR work. RAID1 and RAID10 in Linux software work pretty well, and from there I put LVM on top of them to allow more flexibility in what's done with the blocks I have once the redundancy is in place. RAID+LVM+XFS is my favorite Linux config, but I'd take ZFS over it any day.

jharley
  • Traditionally, hardware RAID has not performed as well as software RAID -- typically it has an advantage only when there are multiple writes that the dedicated controller can duplicate (i.e. that don't have to pass over the PCI bus), namely RAID 1. There are other concerns with hardware RAID, for example the firmware's quality and inability to be (easily) updated. Finally, it adds another point of failure. For these reasons, I tend to avoid hardware RAID. That being said, for these same reasons, I haven't played with the new hardware RAID options! :) – Brian M. Hunt Nov 09 '10 at 17:27

RAID vs. LVM isn't really a good comparison; they perform separate roles and are frequently used together. RAID is used for drive redundancy; LVM can be used to break up your RAID device into logical volumes, for easy resizing, and for taking snapshots.

Zoredache

I ran the file server for a very similar company/situation: basically a 3-person graphics department with 30 TB of storage and the shoestring budget of a small company. Our typical projects ran from 0.5 TB to 6 TB, and the file server was serving a sizable rendering farm which could really pound on it.

In my setup I ran a 3U server running Linux with external hardware RAID6 arrays attached to it. I managed the physical and logical volumes with LVM and ran the XFS file system. What I would do is create a logical volume for each project and then expand it as the project grew. When the project was done I could archive the job to tape and blow away the logical volume. This would return that space back to the volume group where it would get reallocated to the next project(s).

This was a very clean way to utilize our storage, but there are two drawbacks to this approach. First, you end up having to micromanage the sizes of the logical volumes, trying to allocate enough space to each one to get the job done without over-allocating and wasting space. Our rendering farm was capable of generating many TBs of data a day, and if you didn't pay close attention to this you would run out of space in a hurry. I eventually set up some scripts that monitored the trends in the available space on the logical volumes and would auto-grow them (sketched below). Even with that in place, with 80 or so logical volumes there was a lot of unused space tied up in them. I've already hinted at the second problem: LVM doesn't really do thin provisioning, and XFS only allows you to grow a file system, so over-allocating space to a logical volume can add up to a lot of unusable space.
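A minimal sketch of that kind of auto-grow script; the threshold, increment, and paths here are made up, and the real scripts also tracked growth trends rather than a single snapshot of usage:

    #!/bin/sh
    # Grow any project filesystem that is more than 90% full.
    # Assumes one XFS logical volume mounted per project; XFS can
    # only grow, and must be grown while mounted.
    for mnt in /projects/*; do
        use=$(df -P "$mnt" | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
        lv=$(df -P "$mnt" | awk 'NR==2 { print $1 }')
        if [ "$use" -gt 90 ]; then
            lvextend -L +50G "$lv" && xfs_growfs "$mnt"
        fi
    done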

This was all set up about 5 years ago, and if I were setting it up today I would use OpenSolaris and ZFS. The main reason is that ZFS's pooled storage approach means less volume management. You can still separate each project into its own file system, but without having to micromanage the sizes of the individual volumes. ZFS has some other very nice features that make it a better choice, but there are other questions on Server Fault that go into that.
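In ZFS terms, the per-project workflow above collapses to something like this (the pool and dataset names are placeholders):

    zfs create tank/projects/jobname          # instant; no size to pick up front
    zfs set quota=6T tank/projects/jobname    # optional cap instead of a fixed volume size
    zfs destroy tank/projects/jobname         # space returns to the pool when the job ships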

In my opinion ZFS simply is the best free solution available today.

3dinfluence

Some things to consider if you stay with Linux:

  • Think about the filesystem, too. Your 4 x 500 GB example is about the largest capacity I would recommend in good conscience for ext3. I do not recommend creating much larger ext3 filesystems because, e.g., fsck time can be enormous. Create several smaller filesystems instead of one large one.
  • As you mentioned video data: ext3 handles large files inefficiently because it'll have to create indirect, double-indirect and triple-indirect metadata blocks to store the data of large files and you'll pay the price. Nowadays, ext4 supports extents and handles this much better. But then, it is rather new and e.g. Red Hat Enterprise Linux 5 does not support it yet. (Some Enterprise distros will support alternatives like XFS).
  • If there is data corruption on any data block, you'll have a hard time even noticing it with Linux filesystems. ZFS, on the other hand, checksums all metadata and data, and also verifies the checksum every time data is read from disk. (There is also background scrubbing.)
  • RAID rebuild time on Linux is proportional to disk size because the RAID layer does not know the contents of the filesystem (layer). ZFS's RAID-Z rebuild time depends on the amount of actual data on the failed disk because only used blocks will be copied/rebuilt on a replacement disk.
  • Do you want to snapshot your filesystems? LVM snapshots don't even compare with ZFS's instantaneous snapshots. The latter can also easily be exposed to the end-users, e.g. for easy restores (see the sketch after this list).
  • Use RAID6 (RAID-Z2) and not only RAID5 (RAID-Z) with large disks (>500 GB), because chances are that another disk will fail during the rebuild.
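A sketch of the scrubbing and snapshot workflow mentioned above; the pool and dataset names are made up:

    zpool scrub tank                          # verify every checksum in the pool, in the background
    zfs snapshot tank/projects@before-edit    # instantaneous; costs almost nothing until data changes
    ls /tank/projects/.zfs/snapshot/          # users can browse snapshots and restore files themselves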
knweiss
  • Excellent points I had not considered about ext3. Ext4 is still a bit new for my blood, but XFS should be supported by the Ubuntu Server I am using; do you recommend it for this? – privatehuff Jun 25 '09 at 15:24

Use the "mdadm" utility to create a RAID-5 array out of your drives.

This provides the redundancy you need, so if a drive dies you can replace it with no data loss, and it also gives you the usable capacity of 3 out of the 4 drives.

I'd also recommend you create an LVM volume on top of the RAID, so you can partition the space out as needed (a sketch of the whole stack is below).
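A minimal sketch of that stack; the device names and sizes are placeholders for the four 500 GB drives:

    mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
    pvcreate /dev/md0                # put LVM on top of the array
    vgcreate vg0 /dev/md0
    lvcreate -L 1T -n share vg0      # ~1.5 TB usable in total; carve out what you need now
    mkfs.ext3 /dev/vg0/share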

ph.

You may want to take a look at AFS. It will give you some measure of both availability (you can access these files both on and off your network) and redundancy (files can be replicated). For a workflow where you open a file, work on it for some time, and then save it, this would be better (from a network standpoint) than NFS, at least older NFS.

Chris

As the other reply says, LVM is widely used, and can combine several disks into a single "volume group" that looks like one huge block device. It's worth remembering, though, that doing this increases your probability of failure: it only takes one disk failing in a volume group to take out the entire filesystem, and if you have four disks joined together, that is roughly four times as likely. Best to use LVM on top of RAID1 or similar to mitigate this. Linux software RAID is quite adequate for this, although hardware RAID may be more convenient.

Dan
  • Your statement "it only takes one disk to fail in a volume group to take out the entire filesystem" is highly dependent on how you create and use the logical volumes within the volume group. If you only create a single logical volume filling the entire volume group, then what you say is true. It is entirely possible, and depending on filesystem requirements even recommended, to create multiple logical volumes in a single volume group. – pgs Jun 14 '09 at 16:22

Sorry this would be a comment but I don't have the rep...

How does RAID-Z or RAIDZ2 provide better redundancy than RAID6?

ZFS has checksumming everywhere

About the original question: whatever the data, I would use 2 active parity disks per 10 disks. I would use a good-quality RAID card; the 3ware ones are excellent. Personally, I use hardware RAID with battery backup, LVM just so you can migrate the data easily at the end of the hardware's life, and XFS as the file system.

James

Why don't you use a disk or SSD card for the server to boot from, and the 500 GB drives as storage only? Use one 500 GB disk, and when it becomes full, exchange it manually. You can do the backup later on another computer, at your leisure. While a hard disk is spinning it can be damaged, and if you connect all the disks at the same time, they are all spinning and can take damage whether you use them or not. The probability of failure rises the more disks you have powered on. Use one at a time and exchange it when it is full, or after a set period of time, anticipating a failure (use the SMART capability too, to get ahead of that). Use a disk caddy or some external SCSI/SATA disk adapter so you don't need to disassemble the server every time you exchange a disk. That is the most secure and reliable approach.

Using RAID is more expensive, and you just waste some disks (because you leave them powered on, at risk of being damaged simply by being powered on? stupid, or not?). If you want more data throughput, then I guess a RAID configuration is a good choice, but never trust a computer configuration. The backup must always be done manually by a person (the network or server administrator); that is one of the jobs of an administrator. You can use tapes, DVD, Blu-ray, or other disks to do the backup. But you will always need a reliable storage medium, and a running disk is not one. A powered-off disk, stored well (in a cool place free of humidity), is a reliable storage medium.