How to lower Gluster FS down peer timeout / reduce down peer impact?

Question

The setup: two fresh CentOS 6.5 servers with the latest updates. Both have a fresh install of Gluster 3.5.2.

What I did (from the perspective of server2; shared1 and shared2 are logical volumes):

wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/LATEST/CentOS/glusterfs-epel.repo
yum -y install glusterfs glusterfs-fuse glusterfs-server
/etc/init.d/glusterd start
chkconfig --level 345 glusterd on

echo "1.2.3.4 server1" >> /etc/hosts
echo "4.3.2.1 server2" >> /etc/hosts

gluster peer probe server1
gluster volume create shared replica 2 transport tcp server2:/shared2 server1:/shared1 force
gluster volume start shared

mount.glusterfs server2:/shared /mnt/shared

gluster peer status

This worked perfectly, and I have a nice shared filesystem on /mnt/shared on both servers. I ran the same set of commands on each server, adjusted to match that server's perspective.

The testing:

If I press the reset button on server1, I get a horrible delay of roughly 45 seconds when using or accessing files on /mnt/shared.

I searched Google, the GlusterFS admin guide, and Server Fault for a solution, but no one seems to have this issue.

Any advice on how to lower the timeouts, or ignore a down peer temporarily? A read-only state during failover is fine as long as there are no delays. Or just tell me what I did wrong, or did not do.

Thanks,

geedoubleya · Accepted Answer · 2014-08-13T09:59:33.403


You may be running into the client ping timeout, which defaults to 42 seconds. Run the following to check:

gluster volume info shared

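The parameter you are looking for is "network.ping-timeout". If it does not appear in the output, it has most likely not been changed from the default. You can change it (the value is in seconds) by running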

gluster volume set shared network.ping-timeout "new timeout value"

See if that reduces the recovery period.
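
As a concrete illustration, here is a minimal sketch that lowers the timeout on the "shared" volume from the question. The 10-second value is only an assumed example, not a recommendation, and set_ping_timeout and DRY_RUN are hypothetical names introduced here for the sketch. Reconnecting after a ping timeout is reportedly an expensive operation, so a very low value may cause trouble on a flaky network; test before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: set network.ping-timeout on a Gluster volume.
# With DRY_RUN=1 it only prints the command instead of running it.
set_ping_timeout() {
    volume="$1"
    seconds="$2"

    # Accept only a whole number of seconds.
    case "$seconds" in
        ''|*[!0-9]*)
            echo "timeout must be a whole number of seconds" >&2
            return 1
            ;;
    esac

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "gluster volume set $volume network.ping-timeout $seconds"
    else
        gluster volume set "$volume" network.ping-timeout "$seconds"
    fi
}

# Example: preview the command for the "shared" volume (10 is an assumed value).
DRY_RUN=1 set_ping_timeout shared 10
```

Once the option has been set for real, running "gluster volume info shared" again should list the new value among the volume's reconfigured options.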

edited Aug 13 '14 at 09:59

answered Aug 12 '14 at 13:17



This was exactly my issue. Thanks for pointing out this not-so-easy-to-find gem. – Robert Mar 22 '15 at 02:05

The 42 seconds default can be found at https://gluster.readthedocs.org/en/latest/Administrator%20Guide/Managing%20Volumes/ – cherouvim Apr 22 '16 at 14:52
