<div>Hi, </div><div> </div><div>    We have a GlusterFS 3.1.4 setup installed and configured in replica mode.</div><div> </div><div>The setup is configured as follows. The gluster volume info output looks like this: </div>
<div> </div><div># gluster volume info</div><div>Volume Name: gluster-fs1<br>Type: Replicate<br>Status: Started<br>Number of Bricks: 2<br>Transport-type: rdma<br>Bricks:<br>Brick1: jr4-1-ib:/data/gluster/brick-md2<br>Brick2: jr4-2-ib:/data/gluster/brick-md2<br>
</div><div>We are having a strange problem with gluster disconnections. We see the following errors on the server side.</div><div> </div><div>[2011-08-30 07:41:02.868432] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-0: tcp connect to <a href="http://172.31.100.227:24009">172.31.100.227:24009</a> failed (Connection refused)<br>
[2011-08-30 07:41:05.870965] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-1: tcp connect to <a href="http://172.31.100.228:24009">172.31.100.228:24009</a> failed (Connection refused)<br>[2011-08-30 07:41:05.872927] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-0: tcp connect to <a href="http://172.31.100.227:24009">172.31.100.227:24009</a> failed (Connection refused)<br>
[2011-08-30 07:41:08.875478] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-1: tcp connect to <a href="http://172.31.100.228:24009">172.31.100.228:24009</a> failed (Connection refused)<br>[2011-08-30 07:41:08.877490] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-0: tcp connect to <a href="http://172.31.100.227:24009">172.31.100.227:24009</a> failed (Connection refused)<br>
[2011-08-30 07:41:11.880046] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-1: tcp connect to <a href="http://172.31.100.228:24009">172.31.100.228:24009</a> failed (Connection refused)<br>[2011-08-30 07:41:11.882048] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-0: tcp connect to <a href="http://172.31.100.227:24009">172.31.100.227:24009</a> failed (Connection refused)<br>
[2011-08-30 07:41:14.884589] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-1: tcp connect to <a href="http://172.31.100.228:24009">172.31.100.228:24009</a> failed (Connection refused)<br>[2011-08-30 07:41:14.886616] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-0: tcp connect to <a href="http://172.31.100.227:24009">172.31.100.227:24009</a> failed (Connection refused)<br>
[2011-08-30 07:41:17.889162] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-1: tcp connect to <a href="http://172.31.100.228:24009">172.31.100.228:24009</a> failed (Connection refused)<br>[2011-08-30 07:41:17.891141] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-0: tcp connect to <a href="http://172.31.100.227:24009">172.31.100.227:24009</a> failed (Connection refused)<br>
[2011-08-30 07:41:20.893646] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-1: tcp connect to <a href="http://172.31.100.228:24009">172.31.100.228:24009</a> failed (Connection refused)<br>[2011-08-30 07:41:20.895656] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-0: tcp connect to <a href="http://172.31.100.227:24009">172.31.100.227:24009</a> failed (Connection refused)<br>
[2011-08-30 07:41:23.898240] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-1: tcp connect to <a href="http://172.31.100.228:24009">172.31.100.228:24009</a> failed (Connection refused)<br>[2011-08-30 07:41:23.900252] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-0: tcp connect to <a href="http://172.31.100.227:24009">172.31.100.227:24009</a> failed (Connection refused)<br>
</div><div>The brick logs show the following error messages:</div><div> </div><div>[2011-08-29 18:16:00.308719] E [rpcsvc.c:1554:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x13460x, Program: GlusterFS-3.1.0, ProgVers: 310, Proc: 34) to rpc-transport (rdma.gluster-fs1-server)<br>
[2011-08-29 18:16:00.308737] E [server.c:137:server_submit_reply] 0-: Reply submission failed<br>
[2011-08-29 18:16:00.308751] I [server-helpers.c:756:server_connection_destroy] 0-gluster-fs1-server: destroyed connection of n1710-1749-2011/08/26-15:43:13:560803-gluster-fs1-client-0<br>
[2011-08-29 18:16:00.308773] E [rpcsvc.c:1554:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x15372x, Program: GlusterFS-3.1.0, ProgVers: 310, Proc: 34) to rpc-transport (rdma.gluster-fs1-server)<br>
[2011-08-29 18:16:00.308800] E [server.c:137:server_submit_reply] 0-: Reply submission failed<br>
[2011-08-29 18:16:00.308819] I [server-helpers.c:756:server_connection_destroy] 0-gluster-fs1-server: destroyed connection of n1711-1788-2011/08/26-15:43:38:361566-gluster-fs1-client-0<br>
[2011-08-29 18:16:00.309051] E [rpcsvc.c:1554:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1463x, Program: GlusterFS-3.1.0, ProgVers: 310, Proc: 34) to rpc-transport (rdma.gluster-fs1-server)<br>
[2011-08-29 18:16:00.309070] E [server.c:137:server_submit_reply] 0-: Reply submission failed<br>
[2011-08-29 18:16:00.309143] I [server-helpers.c:756:server_connection_destroy] 0-gluster-fs1-server: destroyed connection of n1722-1765-2011/08/26-21:13:31:834472-gluster-fs1-client-0<br>
[2011-08-29 18:16:00.310517] E [rpcsvc.c:1554:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1599091x, Program: GlusterFS-3.1.0, ProgVers: 310, Proc: 34) to rpc-transport (rdma.gluster-fs1-server)<br>
[2011-08-29 18:16:00.310539] E [rpc-transport.c:976:rpc_transport_ref] 0-rpc_transport: invalid argument: this<br>[2011-08-29 18:16:00.310543] E [server.c:137:server_submit_reply] 0-: Reply submission failed<br>[2011-08-29 18:16:00.310564] E [rpc-transport.c:996:rpc_transport_unref] 0-rpc_transport: invalid argument: this<br>
[2011-08-29 18:16:00.310607] E [rpc-transport.c:976:rpc_transport_ref] 0-rpc_transport: invalid argument: this<br></div><div> </div><div>On the client side, we see the following error messages:</div><div> </div><div>[2011-08-30 10:15:52.149071] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-0: tcp connect to <a href="http://172.31.100.227:24009">172.31.100.227:24009</a> failed (Connection refused)<br>
[2011-08-30 10:15:53.152158] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-1: tcp connect to <a href="http://172.31.100.228:24009">172.31.100.228:24009</a> failed (Connection refused)<br>[2011-08-30 10:15:54.888495] W [fuse-bridge.c:413:fuse_attr_cbk] 0-glusterfs-fuse: 9817037: LOOKUP() / =&gt; -1 (Transport endpoint is not connected)<br>
[2011-08-30 10:15:55.155257] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-0: tcp connect to <a href="http://172.31.100.227:24009">172.31.100.227:24009</a> failed (Connection refused)<br>[2011-08-30 10:15:56.158282] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-1: tcp connect to <a href="http://172.31.100.228:24009">172.31.100.228:24009</a> failed (Connection refused)<br>
[2011-08-30 10:15:58.161525] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-0: tcp connect to <a href="http://172.31.100.227:24009">172.31.100.227:24009</a> failed (Connection refused)<br>[2011-08-30 10:15:59.164618] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-1: tcp connect to <a href="http://172.31.100.228:24009">172.31.100.228:24009</a> failed (Connection refused)<br>
[2011-08-30 10:16:01.167819] E [rdma.c:4428:tcp_connect_finish] 0-gluster-fs1-client-0: tcp connect to <a href="http://172.31.100.227:24009">172.31.100.227:24009</a> failed (Connection refused)<br></div><div> </div><div>Currently, the only option we see is to restart the gluster services on the gluster brick nodes, which allows the glusterfs clients to reconnect automatically.</div>
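<div> </div><div>As a cross-check that does not depend on the gluster client, the refused connections could be reproduced with a plain TCP probe run before and after a service restart. This is only a minimal sketch: it assumes the address and port pairs from the log lines above are the brick endpoints, and the names BRICKS and probe are illustrative, not part of GlusterFS.</div>

```python
import socket

# Endpoints copied from the "tcp connect to ... failed" log lines above.
BRICKS = [("172.31.100.227", 24009), ("172.31.100.228", 24009)]


def probe(host, port, timeout=2.0):
    """Return "open" if a TCP connect succeeds, else a short failure reason."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Same condition the gluster client reports as "Connection refused".
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return "error: {}".format(exc)


if __name__ == "__main__":
    for host, port in BRICKS:
        print("{}:{} {}".format(host, port, probe(host, port)))
```

<div>A result of "refused" would suggest that nothing is listening on that port on the brick node at that moment, while "open" would point to a problem above the TCP layer.</div>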
<div>Could you please suggest what the reason for this might be?</div><div> </div><div>Regards,</div><div>Ramana Kasaraneni.</div>