We've set up a mirrored pair of GlusterFS servers. No special tuning: we are running whatever came "out of the box" with GlusterFS 3.5.1 in the official RHEL6 RPM.
The cluster works, but the performance is pretty awful. For example, extracting a large tarball (firefox-31.0.source.tar.bz2) via GlusterFS on localhost takes a whopping 44 minutes here. Extracting the same file directly -- on the same disk -- takes less than 2 minutes. There is a similar disparity in removing the created trees (it takes 10 minutes via gluster)...
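To take tar and bzip2 out of the picture, the same many-small-files pattern could be reproduced with a short script. This is only a rough sketch: the function name `small_file_benchmark`, the file count, the file size, and the 50-subdirectory layout are arbitrary choices of mine, not anything GlusterFS-specific, and the numbers it prints are not a substitute for the tarball timings above.

```python
import os
import shutil
import sys
import tempfile
import time


def small_file_benchmark(base_dir, n_files=1000, size=4096):
    """Create, then remove, n_files small files under base_dir.

    Returns a dict with the number of files created and the wall-clock
    seconds spent on creation and on removal.
    """
    # Work in a fresh scratch directory so nothing existing is touched.
    work = tempfile.mkdtemp(prefix="smallfiles-", dir=base_dir)
    payload = b"x" * size

    start = time.perf_counter()
    for i in range(n_files):
        # Spread files over subdirectories, roughly like a source tree.
        sub = os.path.join(work, "d%03d" % (i % 50))
        os.makedirs(sub, exist_ok=True)
        with open(os.path.join(sub, "f%06d" % i), "wb") as fh:
            fh.write(payload)
    create_s = time.perf_counter() - start

    # Count what actually landed on disk before removing it.
    created = sum(len(files) for _, _, files in os.walk(work))

    start = time.perf_counter()
    shutil.rmtree(work)
    remove_s = time.perf_counter() - start

    return {"files": created, "create_s": create_s, "remove_s": remove_s}


if __name__ == "__main__":
    # Each argument is a directory to test, e.g. a Gluster mount and a local one.
    for path in sys.argv[1:]:
        r = small_file_benchmark(path)
        print("%s: %d files, create %.2fs, remove %.2fs"
              % (path, r["files"], r["create_s"], r["remove_s"]))
```

Running it with two directories as arguments, one on the Gluster mount and one directly on the underlying disk, should show whether the gap comes from per-file overhead rather than from the data volume.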
Of course, it is to be expected that the mirroring takes time, and that a network filesystem will be slower -- but 30 times slower? Simply copying the large file over is fast -- so it is not bandwidth we are lacking. While the untarring is running, I see both the glusterfs (client) and the glusterfsd (server) processes consuming a lot of CPU (about 10% each), but the system remains about 70% idle -- both gluster processes are a lot busier than the extracting bzip2 and tar are... What are they doing?
Is there some tuning I can do to dramatically improve performance here? Or should I try Ceph (or Gfarm?) instead of Gluster? Or are they all terrible with a large number of small files? Thank you!