I previously set up two GlusterFS volumes (called cyclorana0 and cyclorana1) across two nodes on my LAN: 10.0.2.4 (hostname alboguttata) and 10.0.2.5 (hostname verrucosa). The probe commands, etc. were successful, and the two nodes can see each other and the volumes correctly. Both nodes are running Armbian Linux 4.14.144-odroidxu4 with glusterfs-server version 3.13.2-1ubuntu1.

I've also set up a RHEL client (with glusterfs packages version 3.12.2) to mount the volumes via /etc/fstab, and it has been working fine for several months, even across several reboots. Here are the relevant entries:

10.0.2.5:/cyclorana0 /home/[username]/cyclorana0 glusterfs defaults,_netdev 0 0
10.0.2.5:/cyclorana1 /home/[username]/cyclorana1 glusterfs defaults,_netdev 0 0

However, after a reboot of the RHEL client yesterday, the mounts failed. When I then ran mount -a manually, all I got was this:

Mount failed. Please check the log file for more details.
Mount failed. Please check the log file for more details.

The only logs I can think of are those in /var/log/glusterfs/, and indeed I found log files corresponding to my attempts. The output for the two volumes is the same, so here is one of them as an example:

[2019-09-21 02:21:59.834507] I [MSGID: 100030] [glusterfsd.c:2646:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.12.2 (args: /usr/sbin/glusterfs --volfile-server=10.0.2.5 --volfile-id=/cyclorana1 /home/[username]/cyclorana1)
[2019-09-21 02:21:59.844494] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2019-09-21 02:21:59.863736] I [MSGID: 101190] [event-epoll.c:676:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
[2019-09-21 02:21:59.863931] I [MSGID: 101190] [event-epoll.c:676:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2019-09-21 02:21:59.873312] I [MSGID: 114020] [client.c:2361:notify] 0-cyclorana1-client-0: parent translators are ready, attempting connect on transport
[2019-09-21 02:21:59.883477] E [MSGID: 101075] [common-utils.c:482:gf_resolve_ip6] 0-resolver: getaddrinfo failed (family:2) (Name or service not known)
[2019-09-21 02:21:59.883561] E [name.c:267:af_inet_client_get_remote_sockaddr] 0-cyclorana1-client-0: DNS resolution failed on host verrucosa
Final graph:
+------------------------------------------------------------------------------+
  1: volume cyclorana1-client-0
  2:     type protocol/client
  3:     option ping-timeout 42
  4:     option remote-host verrucosa
  5:     option remote-subvolume /bricks/brick1
  6:     option transport-type socket
  7:     option transport.address-family inet
  8:     option transport.tcp-user-timeout 0
  9:     option transport.socket.keepalive-time 20
 10:     option transport.socket.keepalive-interval 2
 11:     option transport.socket.keepalive-count 9
 12:     option send-gids true
 13: end-volume
 14:  
 15: volume cyclorana1-dht
 16:     type cluster/distribute
 17:     option lock-migration off
 18:     subvolumes cyclorana1-client-0
 19: end-volume
 20:  
 21: volume cyclorana1-write-behind
 22:     type performance/write-behind
 23:     subvolumes cyclorana1-dht
 24: end-volume
 25:  
 26: volume cyclorana1-read-ahead
 27:     type performance/read-ahead
 28:     subvolumes cyclorana1-write-behind
 29: end-volume
 30:  
 31: volume cyclorana1-readdir-ahead
 32:     type performance/readdir-ahead
 33:     option parallel-readdir off
 34:     option rda-request-size 131072
 35:     option rda-cache-limit 10MB
 36:     subvolumes cyclorana1-read-ahead
 37: end-volume
 38:  
 39: volume cyclorana1-io-cache
 40:     type performance/io-cache
 41:     subvolumes cyclorana1-readdir-ahead
 42: end-volume
 43:  
 44: volume cyclorana1-quick-read
 45:     type performance/quick-read
 46:     subvolumes cyclorana1-io-cache
 47: end-volume
 48:  
 49: volume cyclorana1-open-behind
 50:     type performance/open-behind
 51:     subvolumes cyclorana1-quick-read
 52: end-volume
 53:  
 54: volume cyclorana1-md-cache
 55:     type performance/md-cache
 56:     subvolumes cyclorana1-open-behind
 57: end-volume
 58:  
 59: volume cyclorana1-io-threads
 60:     type performance/io-threads
 61:     subvolumes cyclorana1-md-cache
 62: end-volume
 63:  
 64: volume cyclorana1
 65:     type debug/io-stats
 66:     option log-level INFO
 67:     option latency-measurement off
 68:     option count-fop-hits off
 69:     subvolumes cyclorana1-io-threads
 70: end-volume
 71:  
 72: volume meta-autoload
 73:     type meta
 74:     subvolumes cyclorana1
 75: end-volume
 76:  
+------------------------------------------------------------------------------+
[2019-09-21 02:21:59.887159] I [fuse-bridge.c:4915:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23
[2019-09-21 02:21:59.887257] I [fuse-bridge.c:5548:fuse_graph_sync] 0-fuse: switched to graph 0
[2019-09-21 02:21:59.887755] E [fuse-bridge.c:4983:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected)
[2019-09-21 02:21:59.891025] W [fuse-bridge.c:1242:fuse_attr_cbk] 0-glusterfs-fuse: 2: LOOKUP() / => -1 (Transport endpoint is not connected)
[2019-09-21 02:21:59.896995] W [fuse-bridge.c:1242:fuse_attr_cbk] 0-glusterfs-fuse: 3: LOOKUP() / => -1 (Transport endpoint is not connected)
[2019-09-21 02:21:59.905103] I [fuse-bridge.c:5822:fuse_thread_proc] 0-fuse: initating unmount of /home/[username]/cyclorana1
[2019-09-21 02:21:59.905610] W [glusterfsd.c:1462:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7ea5) [0x7f2075213ea5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x557ca6064d05] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x557ca6064b6b] ) 0-: received signum (15), shutting down
[2019-09-21 02:21:59.905659] I [fuse-bridge.c:6611:fini] 0-fuse: Unmounting '/home/[username]/cyclorana1'.
[2019-09-21 02:21:59.905685] I [fuse-bridge.c:6616:fini] 0-fuse: Closing fuse connection to '/home/[username]/cyclorana1'.

I can barely understand these logs, but from what I can tell, my RHEL client can "see" the volume but somehow cannot mount it. The main problematic lines in the log seem to be these (tell me if I'm wrong):

[2019-09-21 02:21:59.883477] E [MSGID: 101075] [common-utils.c:482:gf_resolve_ip6] 0-resolver: getaddrinfo failed (family:2) (Name or service not known)
[2019-09-21 02:21:59.883561] E [name.c:267:af_inet_client_get_remote_sockaddr] 0-cyclorana1-client-0: DNS resolution failed on host verrucosa
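
For anyone who wants to pull the same lines out of a longer log, here is a minimal sketch in Python. It assumes every log entry starts with a bracketed timestamp followed by a single-letter severity (I, W, E), as in the excerpts above; the function name lines_at_level and the sample text are illustrative only and are not part of GlusterFS.

```python
import re

# Assumed layout of a log entry: "[timestamp] LEVEL rest-of-line"
LOG_LINE = re.compile(r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) (?P<rest>.*)$")


def lines_at_level(log_text, levels=("E",)):
    """Return (timestamp, level, rest) tuples for entries at the given levels."""
    hits = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        # Lines such as "Final graph:" or the volfile dump do not match and are skipped.
        if match and match.group("level") in levels:
            hits.append((match.group("ts"), match.group("level"), match.group("rest")))
    return hits


# A few entries copied from the failing log above, used only as sample input.
sample = """\
[2019-09-21 02:21:59.873312] I [MSGID: 114020] [client.c:2361:notify] 0-cyclorana1-client-0: parent translators are ready, attempting connect on transport
[2019-09-21 02:21:59.883477] E [MSGID: 101075] [common-utils.c:482:gf_resolve_ip6] 0-resolver: getaddrinfo failed (family:2) (Name or service not known)
[2019-09-21 02:21:59.883561] E [name.c:267:af_inet_client_get_remote_sockaddr] 0-cyclorana1-client-0: DNS resolution failed on host verrucosa
Final graph:
[2019-09-21 02:21:59.887755] E [fuse-bridge.c:4983:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected)
"""

for ts, level, rest in lines_at_level(sample):
    print(ts, level, rest)
```

Passing levels=("E", "W") would include the warnings as well.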

I'm baffled by the DNS resolution error, since I'm connecting directly via its LAN IP and the RHEL client is on the same LAN.
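
One thing the log does show is that the volume graph names the brick host as verrucosa (option remote-host) rather than by IP, and that is the name the failing lookup reports. The sketch below is a rough way to check this from a client: it collects the remote-host values from a pasted graph and asks the system resolver for an IPv4 address. The helper names remote_hosts and resolve_ipv4 are made up for illustration, and I am assuming the "family:2" in the log means IPv4 (AF_INET is 2 on Linux).

```python
import re
import socket

# "option remote-host <name>" as it appears in the "Final graph" dump
REMOTE_HOST = re.compile(r"option remote-host (\S+)")


def remote_hosts(graph_text):
    """Return the distinct remote-host values in the order they first appear."""
    hosts = []
    for host in REMOTE_HOST.findall(graph_text):
        if host not in hosts:
            hosts.append(host)
    return hosts


def resolve_ipv4(host):
    """Return the sorted IPv4 addresses the system resolver gives for host, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        # Same class of failure the log reports as "Name or service not known"
        return []
    return sorted({info[4][0] for info in infos})


# Excerpt of the graph from the log above, used only as sample input.
graph = """\
  1: volume cyclorana1-client-0
  2:     type protocol/client
  3:     option ping-timeout 42
  4:     option remote-host verrucosa
  5:     option remote-subvolume /bricks/brick1
"""

print(remote_hosts(graph))
```

Calling resolve_ipv4 on each name returned by remote_hosts, on the client itself, would show whether that machine can currently turn the name into an address; an empty list corresponds to the getaddrinfo failure in the log.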

I also tried mounting from a separate Manjaro Linux client on the same LAN, which produced the same error messages.

How do I troubleshoot and fix this so that I can mount these volumes? Thank you.

P.S. Here is a log from when the mount was successful:

[2019-08-27 06:07:08.806469] I [MSGID: 100030] [glusterfsd.c:2646:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.12.2 (args: /usr/sbin/glusterfs --volfile-server=10.0.2.5 --volfile-id=/cyclorana1 /home/[username]/cyclorana1)
[2019-08-27 06:07:09.225314] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2019-08-27 06:07:09.377519] I [MSGID: 101190] [event-epoll.c:676:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
[2019-08-27 06:07:09.379969] I [MSGID: 101190] [event-epoll.c:676:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2019-08-27 06:07:10.809517] I [MSGID: 114020] [client.c:2361:notify] 0-cyclorana1-client-0: parent translators are ready, attempting connect on transport
Final graph:
+------------------------------------------------------------------------------+
  1: volume cyclorana1-client-0
  2:     type protocol/client
  3:     option ping-timeout 42
  4:     option remote-host verrucosa
  5:     option remote-subvolume /bricks/brick1
  6:     option transport-type socket
  7:     option transport.address-family inet
  8:     option transport.tcp-user-timeout 0
  9:     option transport.socket.keepalive-time 20
 10:     option transport.socket.keepalive-interval 2
 11:     option transport.socket.keepalive-count 9
 12:     option send-gids true
 13: end-volume
 14:  
 15: volume cyclorana1-dht
 16:     type cluster/distribute
 17:     option lock-migration off
 18:     subvolumes cyclorana1-client-0
 19: end-volume
 20:  
 21: volume cyclorana1-write-behind
 22:     type performance/write-behind
 23:     subvolumes cyclorana1-dht
 24: end-volume
 25:  
 26: volume cyclorana1-read-ahead
 27:     type performance/read-ahead
 28:     subvolumes cyclorana1-write-behind
 29: end-volume
 30:  
 31: volume cyclorana1-readdir-ahead
 32:     type performance/readdir-ahead
 33:     option parallel-readdir off
 34:     option rda-request-size 131072
 35:     option rda-cache-limit 10MB
 36:     subvolumes cyclorana1-read-ahead
 37: end-volume
 38:  
 39: volume cyclorana1-io-cache
 40:     type performance/io-cache
 41:     subvolumes cyclorana1-readdir-ahead
 42: end-volume
 43:  
 44: volume cyclorana1-quick-read
 45:     type performance/quick-read
 46:     subvolumes cyclorana1-io-cache
 47: end-volume
 48:  
 49: volume cyclorana1-open-behind
 50:     type performance/open-behind
 51:     subvolumes cyclorana1-quick-read
 52: end-volume
 53:  
 54: volume cyclorana1-md-cache
 55:     type performance/md-cache
 56:     subvolumes cyclorana1-open-behind
 57: end-volume
 58:  
 59: volume cyclorana1-io-threads
 60:     type performance/io-threads
 61:     subvolumes cyclorana1-md-cache
 62: end-volume
 63:  
 64: volume cyclorana1
 65:     type debug/io-stats
 66:     option log-level INFO
 67:     option latency-measurement off
 68:     option count-fop-hits off
 69:     subvolumes cyclorana1-io-threads
 70: end-volume
 71:  
 72: volume meta-autoload
 73:     type meta
 74:     subvolumes cyclorana1
 75: end-volume
 76:  
+------------------------------------------------------------------------------+
[2019-08-27 06:07:10.825446] I [rpc-clnt.c:2013:rpc_clnt_reconfig] 0-cyclorana1-client-0: changing port to 49153 (from 0)
[2019-08-27 06:07:10.836134] I [MSGID: 114057] [client-handshake.c:1397:select_server_supported_programs] 0-cyclorana1-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2019-08-27 06:07:10.839068] I [MSGID: 114046] [client-handshake.c:1150:client_setvolume_cbk] 0-cyclorana1-client-0: Connected to cyclorana1-client-0, attached to remote volume '/bricks/brick1'.
[2019-08-27 06:07:10.839149] I [MSGID: 114047] [client-handshake.c:1161:client_setvolume_cbk] 0-cyclorana1-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2019-08-27 06:07:10.840019] I [MSGID: 114035] [client-handshake.c:121:client_set_lk_version_cbk] 0-cyclorana1-client-0: Server lk version = 1
[2019-08-27 06:07:10.863385] I [fuse-bridge.c:4915:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23
[2019-08-27 06:07:10.863513] I [fuse-bridge.c:5548:fuse_graph_sync] 0-fuse: switched to graph 0
hpy