<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p
        {mso-style-priority:99;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
span.EmailStyle17
        {mso-style-type:personal-compose;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-family:"Calibri","sans-serif";}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal">Hi all,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">We’re trying to use Gluster with a replicated volume. It works fine when both peers are up, but when one peer is down and the other reboots, the “surviving” peer does not automount the GlusterFS volume during boot. Once the boot sequence is complete, however, the volume can be mounted without issue, and it automounts fine when the other peer is up during startup. I searched for this and found some similar reports, but no solution to my problem. Any insight would be appreciated. Thanks.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">gluster volume info output (after startup):<o:p></o:p></p>
<p class="MsoNormal"><o:p></o:p></p>
<p class="MsoNormal">Volume Name: rel-vol<o:p></o:p></p>
<p class="MsoNormal">Type: Replicate<o:p></o:p></p>
<p class="MsoNormal">Volume ID: 90cbe313-e9f9-42d9-a947-802315ab72b0<o:p></o:p></p>
<p class="MsoNormal">Status: Started<o:p></o:p></p>
<p class="MsoNormal">Number of Bricks: 1 x 2 = 2<o:p></o:p></p>
<p class="MsoNormal">Transport-type: tcp<o:p></o:p></p>
<p class="MsoNormal">Bricks:<o:p></o:p></p>
<p class="MsoNormal">Brick1: 10.250.1.1:/export/brick1<o:p></o:p></p>
<p class="MsoNormal">Brick2: 10.250.1.2:/export/brick1<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">gluster peer status output (after startup):<o:p></o:p></p>
<p class="MsoNormal">Number of Peers: 1<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hostname: 10.250.1.2<o:p></o:p></p>
<p class="MsoNormal">Uuid: 8d49b929-4660-4b1e-821b-bfcd6291f516<o:p></o:p></p>
<p class="MsoNormal">State: Peer in Cluster (Disconnected)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Original volume create command: <o:p></o:p></p>
<p class="MsoNormal">gluster volume create rel-vol rep 2 transport tcp 10.250.1.1:/export/brick1 10.250.1.2:/export/brick1<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I am running Gluster 3.4.5 on OpenSuSE 12.2.<o:p></o:p></p>
<p class="MsoNormal">gluster --version:<o:p></o:p></p>
<p class="MsoNormal">glusterfs 3.4.5 built on Jul 25 2014 08:31:19<o:p></o:p></p>
<p class="MsoNormal">Repository revision: git://git.gluster.com/glusterfs.git<o:p></o:p></p>
<p class="MsoNormal">Copyright (c) 2006-2011 Gluster Inc. &lt;http://www.gluster.com&gt;<o:p></o:p></p>
<p class="MsoNormal">GlusterFS comes with ABSOLUTELY NO WARRANTY.<o:p></o:p></p>
<p class="MsoNormal">You may redistribute copies of GlusterFS under the terms of the GNU General Public License.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The fstab line is:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">localhost:/rel-vol /home glusterfs defaults,_netdev 0 0<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">lsof -i :24007-24100:<o:p></o:p></p>
<p class="MsoNormal">COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME<o:p></o:p></p>
<p class="MsoNormal">glusterd 4073 root 6u IPv4 82170 0t0 TCP s1:24007->s1:1023 (ESTABLISHED)<o:p></o:p></p>
<p class="MsoNormal">glusterd 4073 root 9u IPv4 13816 0t0 TCP *:24007 (LISTEN)<o:p></o:p></p>
<p class="MsoNormal">glusterd 4073 root 10u IPv4 88106 0t0 TCP s1:exp2->s2:24007 (SYN_SENT)<o:p></o:p></p>
<p class="MsoNormal">glusterfs 4097 root 8u IPv4 16751 0t0 TCP s1:1023->s1:24007 (ESTABLISHED)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">This list is shorter than the one I see when the mount works, but that may simply be because a successful mount spawns additional processes.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Some ports refuse connections (24007 accepts on 127.0.0.1, 24009 does not):<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif""> <o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">root@q50-s1:/root> telnet localhost 24007<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Trying ::1...<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">telnet: connect to address ::1: Connection refused<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Trying 127.0.0.1...<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Connected to localhost.<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Escape character is '^]'.<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif""> <o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">telnet> close<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Connection closed.<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">root@q50-s1:/root> telnet localhost 24009<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Trying ::1...<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">telnet: connect to address ::1: Connection refused<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Trying 127.0.0.1...<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">telnet: connect to address 127.0.0.1: Connection refused<o:p></o:p></span></p>
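<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">For anyone who wants to repeat the port check without telnet, here is a minimal Python sketch. The helper name port_open is my own invention, and 49152 is included only because it is the --brick-port shown in the ps output below; treat the host and port list as assumptions to adjust.<o:p></o:p></p>
<pre>
```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 24007 and 24009 are the ports tried with telnet above;
    # 49152 is the brick port reported by ps (an assumption that it matters here).
    for port in (24007, 24009, 49152):
        state = "open" if port_open("127.0.0.1", port) else "closed or unreachable"
        print(f"127.0.0.1:{port} {state}")
```
</pre>
<p class="MsoNormal">It uses 127.0.0.1 directly, since the telnet attempts to ::1 were refused in both cases.<o:p></o:p></p>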
<p><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">ps axww | fgrep glu:<o:p></o:p></span></p>
<p><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">4073 ? Ssl 0:10 /usr/sbin/glusterd -p /run/glusterd.pid<o:p></o:p></span></p>
<p><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">4097 ? Ssl 0:00 /usr/sbin/glusterfsd -s 10.250.1.1 --volfile-id rel-vol.10.250.1.1.export-brick1 -p /var/lib/glusterd/vols/rel-vol/run/10.250.1.1-export-brick1.pid -S /var/run/89ba432ed09e07e107723b4b266e18f9.socket
--brick-name /export/brick1 -l /var/log/glusterfs/bricks/export-brick1.log --xlator-option *-posix.glusterd-uuid=3b02a581-8fb9-4c6a-8323-9463262f23bc --brick-port 49152 --xlator-option rel-vol-server.listen-port=49152<o:p></o:p></span></p>
<p><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">5949 ttyS0 S+ 0:00 fgrep glu<o:p></o:p></span></p>
<p class="MsoNormal">These are the error messages I see in /var/log/glusterfs/home.log (/home is the mountpoint):<o:p></o:p></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">+------------------------------------------------------------------------------+<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">[2014-11-24 13:51:27.932285] E [client-handshake.c:1742:client_query_portmap_cbk] 0-rel-vol-client-0: failed to get the port number for remote subvolume.
Please run 'gluster volume status' on server to see if brick process is running.<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">[2014-11-24 13:51:27.932373] W [socket.c:514:__socket_rwv] 0-rel-vol-client-0: readv failed (No data available)<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">[2014-11-24 13:51:27.932405] I [client.c:2098:client_rpc_notify] 0-rel-vol-client-0: disconnected<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">[2014-11-24 13:51:30.818281] E [socket.c:2157:socket_connect_finish] 0-rel-vol-client-1: connection to 10.250.1.2:24007 failed (No route to host)<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">[2014-11-24 13:51:30.818313] E [afr-common.c:3735:afr_notify] 0-rel-vol-replicate-0: All subvolumes are down. Going offline until atleast one of them
comes back up.<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">[2014-11-24 13:51:30.822189] I [fuse-bridge.c:4771:fuse_graph_setup] 0-fuse: switched to graph 0<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">[2014-11-24 13:51:30.822245] W [socket.c:514:__socket_rwv] 0-rel-vol-client-1: readv failed (No data available)<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">[2014-11-24 13:51:30.822312] I [fuse-bridge.c:3726:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.18<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">[2014-11-24 13:51:30.822562] W [fuse-bridge.c:705:fuse_attr_cbk] 0-glusterfs-fuse: 2: LOOKUP() / => -1 (Transport endpoint is not connected)<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">[2014-11-24 13:51:30.835120] I [fuse-bridge.c:4630:fuse_thread_proc] 0-fuse: unmounting /home<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">[2014-11-24 13:51:30.835397] W [glusterfsd.c:1002:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x7f00f0f682bd] (-->/lib64/libpthread.so.0(+0x7e0e) [0x7f00f1603e0e] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xc5) [0x4075f5]))) 0-: received signum (15), shutting down<o:p></o:p></span></p>
<p style="margin:0in;margin-bottom:.0001pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">[2014-11-24 13:51:30.835416] I [fuse-bridge.c:5262:fini] 0-fuse: Unmounting '/home'.<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Relevant section from /var/log/glusterfs/etc-glusterfs-glusterd.vol.log:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.552371] I [glusterfsd.c:1910:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.4.5 (/usr/sbin/glusterd -p /run/glusterd.pid)<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.574553] I [glusterd.c:961:init] 0-management: Using /var/lib/glusterd as working directory<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.577734] I [socket.c:3480:socket_init] 0-socket.management: SSL support is NOT enabled<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.577756] I [socket.c:3495:socket_init] 0-socket.management: using system polling thread<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.577834] E [rpc-transport.c:253:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/3.4.5/rpc-transport/rdma.so: cannot open shared object file: No such file or directory<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.577849] W [rpc-transport.c:257:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.577858] W [rpcsvc.c:1389:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.578697] I [glusterd.c:354:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.598907] I [glusterd-store.c:1339:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 2<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.607802] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-0<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.607837] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-1<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.809027] I [glusterd-handler.c:2818:glusterd_friend_add] 0-management: connect returned 0<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.809098] I [rpc-clnt.c:962:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.809150] I [socket.c:3480:socket_init] 0-management: SSL support is NOT enabled<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.809162] I [socket.c:3495:socket_init] 0-management: using system polling thread<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:27.813801] I [glusterd.c:125:glusterd_uuid_init] 0-management: retrieved UUID: 3b02a581-8fb9-4c6a-8323-9463262f23bc<o:p></o:p></p>
<p class="MsoNormal">Given volfile:<o:p></o:p></p>
<p class="MsoNormal">+------------------------------------------------------------------------------+<o:p></o:p></p>
<p class="MsoNormal"> 1: volume management<o:p></o:p></p>
<p class="MsoNormal"> 2: type mgmt/glusterd<o:p></o:p></p>
<p class="MsoNormal"> 3: option working-directory /var/lib/glusterd<o:p></o:p></p>
<p class="MsoNormal"> 4: option transport-type socket,rdma<o:p></o:p></p>
<p class="MsoNormal"> 5: option transport.socket.keepalive-time 10<o:p></o:p></p>
<p class="MsoNormal"> 6: option transport.socket.keepalive-interval 2<o:p></o:p></p>
<p class="MsoNormal"> 7: option transport.socket.read-fail-log off<o:p></o:p></p>
<p class="MsoNormal"> 8: # option base-port 49152<o:p></o:p></p>
<p class="MsoNormal"> 9: end-volume<o:p></o:p></p>
<p class="MsoNormal">+------------------------------------------------------------------------------+<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:30.818283] E [socket.c:2157:socket_connect_finish] 0-management: connection to 10.250.1.2:24007 failed (No route to host)<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:30.820254] I [rpc-clnt.c:962:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:30.820316] I [socket.c:3480:socket_init] 0-management: SSL support is NOT enabled<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:30.820327] I [socket.c:3495:socket_init] 0-management: using system polling thread<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:30.820378] W [socket.c:514:__socket_rwv] 0-management: readv failed (No data available)<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:30.821243] I [glusterd-utils.c:1079:glusterd_volume_brickinfo_get] 0-management: Found brick<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:30.821268] I [socket.c:2236:socket_event_handler] 0-transport: disconnecting now<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:30.822036] I [glusterd-utils.c:1079:glusterd_volume_brickinfo_get] 0-management: Found brick<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:30.863454] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /export/brick1 on port 49152<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:33.824274] W [socket.c:514:__socket_rwv] 0-management: readv failed (No data available)<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:34.817560] I [glusterd-utils.c:1079:glusterd_volume_brickinfo_get] 0-management: Found brick<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:39.824281] W [socket.c:514:__socket_rwv] 0-management: readv failed (No data available)<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:42.830260] W [socket.c:514:__socket_rwv] 0-management: readv failed (No data available)<o:p></o:p></p>
<p class="MsoNormal">[2014-11-24 13:51:48.832276] W [socket.c:514:__socket_rwv] 0-management: readv failed (No data available)<o:p></o:p></p>
<p class="MsoNormal">[ad nauseam...]<o:p></o:p></p>
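<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">In case it helps to see the two logs as one timeline, here is a rough Python sketch that merges log lines by their leading timestamp. The function names are hypothetical, and it assumes every entry starts with a "[YYYY-MM-DD HH:MM:SS.ffffff]" prefix like the excerpts above; lines without one (for example the volfile dump) are skipped.<o:p></o:p></p>
<pre>
```python
import re
from datetime import datetime

# Matches the "[YYYY-MM-DD HH:MM:SS.ffffff]" prefix seen in the log excerpts.
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\]")


def stamped_lines(label, lines):
    """Yield (timestamp, label, text) for each line that starts with a timestamp."""
    for line in lines:
        match = STAMP.match(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            yield ts, label, line.rstrip("\n")


def merge_logs(logs):
    """Merge {label: iterable of lines} into one list sorted by timestamp."""
    merged = []
    for label, lines in logs.items():
        merged.extend(stamped_lines(label, lines))
    return sorted(merged, key=lambda entry: entry[0])


if __name__ == "__main__":
    # Sample lines copied from the two logs in this message.
    sample = {
        "mount": [
            "[2014-11-24 13:51:27.932285] E [client-handshake.c:1742:client_query_portmap_cbk] "
            "0-rel-vol-client-0: failed to get the port number for remote subvolume. "
            "Please run 'gluster volume status' on server to see if brick process is running.",
        ],
        "glusterd": [
            "[2014-11-24 13:51:27.552371] I [glusterfsd.c:1910:main] 0-/usr/sbin/glusterd: "
            "Started running /usr/sbin/glusterd version 3.4.5 (/usr/sbin/glusterd -p /run/glusterd.pid)",
            "[2014-11-24 13:51:30.863454] I [glusterd-pmap.c:227:pmap_registry_bind] "
            "0-pmap: adding brick /export/brick1 on port 49152",
        ],
    }
    for ts, label, text in merge_logs(sample):
        print(f"{ts.time()} {label:8} {text}")
```
</pre>
<p class="MsoNormal">With the real files, open each one and pass the file objects in place of the sample lists. In the excerpts above, the mount's port lookup fails at 13:51:27.93, while glusterd only logs adding the brick on port 49152 at 13:51:30.86. I don't know whether that ordering is the cause, but it may be a useful clue.<o:p></o:p></p>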
</div>
</body>
</html>