It seems there are two general approaches (at least in the MooseFS and XtreemFS worlds):
The drive-at-a-time approach
For MooseFS, the best way is to use each HDD as a single XFS partition attached to the chunkserver.
We do not recommend using any RAID or LVM configuration.
Why?
The first issue is HDD errors. If one of your hard drives starts slowing down, it is hard to find which one it is under LVM. With MooseFS you can identify it very quickly, even from the MFS master web interface.
The second issue: adding or removing a hard drive in MooseFS is easier than adding it to or removing it from an LVM volume group. Just attach the HDD to the chunkserver, format it as XFS, add it to the chunkserver configuration and reload the chunkserver, and you have extra space on your instance (see the sketch after this list).
The third issue is that MooseFS has much smarter algorithms for placing chunks across many hard disks, so all drives receive balanced traffic; LVM does not do this.
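For illustration, here is a minimal sketch of that add-a-disk workflow, assuming a Linux chunkserver with the default configuration path /etc/mfs/mfshdd.cfg; the device name and mount point are placeholders:

    # Format the new disk as XFS and mount it (device and mount point are examples).
    mkfs.xfs /dev/sdX
    mkdir -p /mnt/mfschunk-new
    mount /dev/sdX /mnt/mfschunk-new
    chown mfs:mfs /mnt/mfschunk-new   # the chunkserver normally runs as the mfs user
    # Register the new directory with the chunkserver and make it reread its disk list.
    echo "/mnt/mfschunk-new" >> /etc/mfs/mfshdd.cfg
    mfschunkserver reload

After the reload, the new directory shows up as additional capacity and the master starts placing chunks on it.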
The volume-at-a-time approach
The XtreemFS OSD (and also the other services) relies on a local file system for data and metadata storage. Thus, on a machine with multiple disks, you have two possibilities. First, you can combine multiple disks on one machine into a single file system, e.g. by using RAID, LVM, or a ZFS pool. Second, each disk (including SSDs, etc.) holds its own local file system and is exported by its own XtreemFS OSD service.
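As a rough sketch of the second option (not an official recipe; the property names and the service start command should be checked against the osdconfig.properties and scripts shipped with your XtreemFS version), a second OSD on the same machine needs its own config with a distinct UUID, ports, and object directory:

    # Hypothetical second OSD instance for a second disk mounted at /mnt/disk2.
    cp /etc/xos/xtreemfs/osdconfig.properties /etc/xos/xtreemfs/osdconfig.disk2.properties
    # Edit the copy so the two OSDs do not collide, e.g.:
    #   uuid        = osd-disk2.example.local
    #   listen.port = 32641
    #   http_port   = 30641
    #   object_dir  = /mnt/disk2/xtreemfs/objs/
    xtreemfs-osd /etc/xos/xtreemfs/osdconfig.disk2.properties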
Both possibilities have their advantages and disadvantages, and I cannot make a general recommendation. The first option brings flexibility in terms of the RAID level used or possibly attached SSD caches. Furthermore, it might be easier to maintain and monitor one OSD process per machine than one process per disk.
Using one OSD server per local disk might result in better performance; when running a RAID of fast SSDs, the XtreemFS OSD might become a bottleneck. You could also spread the load of multiple OSDs on one machine over multiple network interfaces. For replicated files, you have to take care of replica placement and avoid putting multiple replicas of one file on OSDs running on the same hardware. You may have to write a custom OSD selection policy; XtreemFS offers an interface for this.
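For what it is worth, placement policies are attached to a volume with xtfsutil. The sketch below is only illustrative: the flags should be verified against your XtreemFS version, and 1101 is a made-up policy ID standing in for a custom "one replica per physical host" policy that you would have to implement and register yourself.

    # Default replication for new files on this volume: 3 replicas, WqRq policy.
    xtfsutil --set-drp --replication-policy WqRq --replication-factor 3 /mnt/xtreemfs
    # Chain the (hypothetical) custom placement policy into OSD selection.
    xtfsutil --set-osp 1101 /mnt/xtreemfs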
Which seems better?
Based on the response from XtreemFS, it would seem that MooseFS could benefit from the volume-at-a-time approach, but only if you mitigate potential drive failures very well.
Drive-at-a-time has the benefit that, in the event of a single drive failure (which seems to be the most concerning physical failure that can happen), MooseFS's chunk-placement and recovery mechanisms can re-replicate the now under-replicated data and "ignore" the failed drive.
Volume-at-a-time has the benefit of forcing replicated data onto different servers, but it doesn't guarantee even usage across the individual drives.
These answers come from the respective MooseFS and XtreemFS mailing lists; only grammar and readability have been improved, and links to the original threads are provided.