<br><br><div class="gmail_quote">On Tue, Jun 28, 2011 at 3:19 PM, Darren Austin <span dir="ltr">&lt;<a href="mailto:darren-lists@widgit.com">darren-lists@widgit.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div class="im">----- Original Message -----<br>

&gt; It looks like the disconnection happened in the middle of a write<br>

&gt; transaction (after the lock phase, before the unlock phase). And the<br>

<br>

</div>The server was deliberately disconnected after the write had begun, in order to test what would happen in that situation and to document a recovery procedure for it.<br>

<div class="im"><br>

&gt; server&#39;s detection of client disconnection (via TCP_KEEPALIVE) seems to have<br>

&gt; not happened before the client reconnected.<br>

<br>

</div>I&#39;ve not configured any special keep alive setting for the server or clients - the configuration was an out of the box glusterd.vol file, and a &quot;volume create&quot; sequence with standard params (no special settings or options applied).<br>


<br>

The disconnected server was also in that state for approx 10 minutes - not seconds.<br>

<br>

I assume the &quot;default&quot; set up is not to hold on to a locked file for over 10 minutes when in a disconnected state?<br>

Surely it shouldn&#39;t hold onto a lock *at all* once it&#39;s out of the cluster?</blockquote><div><br></div><div><br></div><div>The problem here is that the server hasn&#39;t even detected the disconnection. The client has ping-timeout logic that checks for inactivity while fops are pending and force-disconnects an unresponsive server. On the server side, the connection is torn down only when the server encounters a TCP RST or FIN, or when TCP keepalive kicks in. This is the behavior today. The default TCP keepalive can take over 10 minutes to detect a dead peer.</div>
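<div><br></div><div>As a rough illustration of why keepalive detection can be so slow, here is a minimal Python sketch of per-socket keepalive tuning on Linux. It is not GlusterFS code. The helper names enable_keepalive and worst_case_detection are hypothetical, and the 20/2/9 values are only picked to land near the 40 second figure; they are an assumption, not a statement of Gluster&#39;s actual settings. With the stock Linux defaults (7200 s idle, 75 s between probes, 9 probes) the same arithmetic gives well over two hours.</div>

```python
import socket


def worst_case_detection(idle, interval, count):
    """Seconds until a silent peer is declared dead: idle wait plus all failed probes."""
    return idle + interval * count


def enable_keepalive(sock, idle=20, interval=2, count=9):
    """Enable TCP keepalive on one socket and return the worst-case detection time.

    The default values are illustrative assumptions, not GlusterFS defaults.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These per-socket knobs are Linux-specific, so skip any the platform lacks.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return worst_case_detection(idle, interval, count)


# Stock Linux sysctl defaults versus the tuned values above.
linux_default_seconds = worst_case_detection(7200, 75, 9)
tuned_seconds = worst_case_detection(20, 2, 9)
```

<div>Only the socket that has these options set is affected, which is why tuning an application&#39;s own keepalive does not require touching the system-wide settings.</div>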

<div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im">

&gt; The client, having witnessed the reconnection has assumed the locks have been relinquished by the<br>

&gt; server. The server, however, having noticed the same client reconnection before<br>

&gt; breakage of the original connection has not released the held locks.<br>

<br>

</div>But why is the server still holding the locks WAY past the time it should be?<br></blockquote><div><br></div><div>Locks are associated with &quot;connections&quot;, not with a time. The server is holding on because it believes the client is still connected (it hasn&#39;t witnessed a socket error yet).</div>
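<div><br></div><div>A toy model may make the connection-scoped behavior easier to see. This is a simplified Python sketch, not how the GlusterFS locks code is actually written; the LockTable class, its methods, and the connection ids are all hypothetical. A lock is released only by an explicit unlock or when the server tears down the owning connection, so a reconnect that arrives before the old connection is torn down finds the lock still held.</div>

```python
class LockTable:
    """Toy lock table: each lock is owned by a connection, with no timeout at all."""

    def __init__(self):
        self.owner = {}  # path -> id of the connection holding the lock

    def lock(self, conn_id, path):
        holder = self.owner.get(path)
        if holder is not None and holder != conn_id:
            return False  # held by a different connection
        self.owner[path] = conn_id
        return True

    def unlock(self, conn_id, path):
        if self.owner.get(path) == conn_id:
            del self.owner[path]

    def connection_closed(self, conn_id):
        # The only place locks are reclaimed without an explicit unlock.
        for path in [p for p, c in self.owner.items() if c == conn_id]:
            del self.owner[path]


table = LockTable()
table.lock("conn-1", "/file")  # lock phase of a write over the first connection

# The network drops but the server has seen no socket error yet.
# The same client reconnects and is treated as a new connection.
blocked = not table.lock("conn-2", "/file")

# Keepalive (or a RST/FIN) finally reports the old socket as dead.
table.connection_closed("conn-1")
granted = table.lock("conn-2", "/file")
```

<div>In this model the stale lock survives for exactly as long as the server takes to notice the dead socket, however many minutes that is.</div>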

<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

We&#39;re not talking seconds here, we&#39;re talking minutes of disconnection.<br>

<br>

And why, when it is reconnected will it not sync that file back from the other servers that have a full copy of it?<br>

<div class="im"><br>

&gt; Tuning the server side tcp keepalive to a smaller value should fix<br>

&gt; this problem. Can you please verify?<br>

<br>

</div>Are you talking about the GlusterFS keep alive setting in the vol file, or changing the actual TCP keepalive settings for the *whole* server? Changing the server TCP keepalive is not an option, since it has ramifications on other things - and it shouldn&#39;t be necessary to solve what is, really, a GlusterFS bug...<br>

<br></blockquote><div><br></div><div>Of course not the system TCP keepalive. I was only talking about Gluster&#39;s TCP keepalive. It should have kicked in after about 40 secs of inactivity. Can you check the server (brick) logs to see the order in which the disconnection and the new connection/reconnection from the client were detected?</div>

<div><br></div><div>Avati</div></div>