-2

I'm looking for some pointers on the best way to manage a Linux data server with 20 hard drives, where new files are always being added (around 30GB/day). Performance is not important; reliability is crucial. I don't like RAID (I've had many issues with RAID5 and broken disks!). For now, all drives are accessed individually, but this is a problem because I have to keep moving data around ...

I'm trying to understand if LVM/Hadoop/some-other-magic is the best for me.

I'm especially concerned about hardware failures, and about recovery plans to get the data back and/or not to lose the data on the other drives when some sort of middleware is involved.

I'm fluent in Linux, not so much in (large) data management.

  • Multiple failures across 2 RAID5 arrays (2 disks in one and 1 disk in another), almost simultaneously, left a bad taste in my mouth ... – user1770719 Jan 24 '13 at 12:55
  • The disk enclosure I use does not support RAID 6. – user1770719 Jan 24 '13 at 16:08
  • Use software RAID then, instead of the functionality on your enclosure. You could also skip the RAID and do it all with LVM. Just present all the individual disks to the OS. Put all the disks into an LVM volume group. Then create your logical volumes as needed. Individual logical volumes can be striped across PVs, or mirrored. They can be easily moved between various PVs. – Zoredache Jan 24 '13 at 17:37
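
To make the LVM suggestion in the last comment concrete, here is a minimal sketch that only prints the commands it would run; nothing is executed. The device names, the volume group name `datavg`, the logical volume name `archive` and the 500G size are all made-up placeholders, not values from this thread. `pvcreate`, `vgcreate` and `lvcreate` are the standard LVM2 tools, but check the man pages for the version shipped with your distribution before running anything.

```python
import shlex


def lvm_plan(disks, vg_name="datavg", lv_name="archive",
             lv_size="500G", mirrored=True):
    """Return the LVM commands as argument lists. Nothing is executed.

    All names and sizes are hypothetical placeholders for illustration.
    """
    if mirrored and len(disks) < 2:
        raise ValueError("a mirrored logical volume needs at least two PVs")

    plan = [
        ["pvcreate", *disks],            # label each disk as a physical volume
        ["vgcreate", vg_name, *disks],   # pool the PVs into one volume group
    ]
    lvcreate = ["lvcreate", "--name", lv_name, "--size", lv_size]
    if mirrored:
        lvcreate += ["--mirrors", "1"]   # one extra copy of every extent
    lvcreate.append(vg_name)
    plan.append(lvcreate)
    return plan


if __name__ == "__main__":
    # Dry run: print the commands for review instead of running them.
    for cmd in lvm_plan(["/dev/sdb", "/dev/sdc", "/dev/sdd"]):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The example uses three disks on purpose: older LVM releases may want an additional PV for the mirror log, so treat the two-disk minimum in the function as a lower bound rather than a recommendation.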

2 Answers

1

Erm, to channel Steve Ballmer: RAID, RAID, RAID.

RAID was designed to give you reliability. I would not be scared of it; it's kind of the industry standard. Server Fault is not really the place to start recommending products, but I would say have a look at a scale-out NAS file system, something like Gluster.

Sc0rian
0

But you are right that RAID 5 is often a headache: you should be using RAID6 instead, especially since you have so many drives.

What would be even better, considering the large number of drives you're using, is RAID60. This divides your disks into two RAID6 arrays, then stripes the data across the two arrays. Compared with a single large RAID6, this gives you a speed advantage, better resiliency against disk failure, and faster rebuilds when replacing drives.
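
To put rough numbers on that trade-off, here is a small sketch of the arithmetic for 20 disks. It counts capacity in whole disks because the drive size isn't stated in the question, and the function name and its parameters are just illustrative.

```python
def raid_layout(total_disks, groups, parity_per_group=2):
    """Capacity and fault tolerance for `groups` equal parity arrays
    striped together (groups=1 is plain RAID6, groups=2 is RAID60).

    Capacity is counted in disks, since the drive size is an unknown here.
    """
    if total_disks % groups != 0:
        raise ValueError("disks must divide evenly across the groups")
    per_group = total_disks // groups
    if per_group <= parity_per_group:
        raise ValueError("each group needs more disks than parity disks")

    return {
        "disks_per_group": per_group,
        "data_disks": groups * (per_group - parity_per_group),
        # Worst case: every failure lands in the same group.
        "guaranteed_failures": parity_per_group,
        # Best case: failures are spread evenly over all groups.
        "best_case_failures": groups * parity_per_group,
    }


if __name__ == "__main__":
    for name, groups in (("RAID6", 1), ("RAID60", 2)):
        print(name, raid_layout(20, groups))
```

With 20 disks, a single RAID6 leaves 18 disks' worth of data capacity and RAID60 leaves 16. Both are only guaranteed to survive any two failures, but RAID60 can survive up to four if they are split two per array, and each rebuild only has to read the 10 disks in the affected array.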

Another good alternative to look at is ZFS, which is available on FreeBSD (which shouldn't be much of a stretch for a Linux admin) and in purpose-built distros such as Nexenta. There are a bunch of people around here who swear by it, including me.

longneck
  • The disk enclosure I use does not support RAID 6. ZFS would be a neat solution, but I would prefer not to have to reinstall my base system (a now ancient CentOS 5.x) – user1770719 Jan 24 '13 at 13:18