<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <br>
    <div class="moz-cite-prefix">On 06/04/2014 12:02 PM, Pranith Kumar
      Karampuri wrote:<br>
    </div>
    <blockquote cite="mid:538EBD6A.8090204@redhat.com" type="cite">
      <meta content="text/html; charset=ISO-8859-1"
        http-equiv="Content-Type">
      <br>
      <div class="moz-cite-prefix">On 06/04/2014 11:37 AM, Krutika
        Dhananjay wrote:<br>
      </div>
      <blockquote
        cite="mid:2112728330.8298304.1401862049173.JavaMail.zimbra@redhat.com"
        type="cite">
        <div style="font-family: garamond,new york,times,serif;
          font-size: 12pt; color: #000000">
          <div>Hi,<br>
          </div>
          <div><br>
          </div>
          <div>Recently there was a crash in locks translator (BZ
            1103347, BZ 1097102) with the following backtrace:<br>
          </div>
          <div>(gdb) bt<br>
            #0&nbsp; uuid_unpack (in=0x8 &lt;Address 0x8 out of bounds&gt;,
            uu=0x7fffea6c6a60) at ../../contrib/uuid/unpack.c:44<br>
            #1&nbsp; 0x00007feeba9e19d6 in uuid_unparse_x (uu=&lt;value
            optimized out&gt;, out=0x2350fc0
            "081bbc7a-7551-44ac-85c7-aad5e2633db9", <br>
            &nbsp;&nbsp;&nbsp; fmt=0x7feebaa08e00
            "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at
            ../../contrib/uuid/unparse.c:55<br>
            #2&nbsp; 0x00007feeba9be837 in uuid_utoa (uuid=0x8 &lt;Address
            0x8 out of bounds&gt;) at common-utils.c:2138<br>
            #3&nbsp; 0x00007feeb06e8a58 in pl_inodelk_log_cleanup
            (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:396<br>
            #4&nbsp; pl_inodelk_client_cleanup (this=0x230d910,
            ctx=0x7fee700f0c60) at inodelk.c:428<br>
            #5&nbsp; 0x00007feeb06ddf3a in pl_client_disconnect_cbk
            (this=0x230d910, client=&lt;value optimized out&gt;) at
            posix.c:2550<br>
            #6&nbsp; 0x00007feeba9fa2dd in gf_client_disconnect
            (client=0x27724a0) at client_t.c:368<br>
            #7&nbsp; 0x00007feeab77ed48 in server_connection_cleanup
            (this=0x2316390, client=0x27724a0, flags=&lt;value optimized
            out&gt;) at server-helpers.c:354<br>
            #8&nbsp; 0x00007feeab77ae2c in server_rpc_notify (rpc=&lt;value
            optimized out&gt;, xl=0x2316390, event=&lt;value optimized
            out&gt;, data=0x2bf51c0) at server.c:527<br>
            #9&nbsp; 0x00007feeba775155 in rpcsvc_handle_disconnect
            (svc=0x2325980, trans=0x2bf51c0) at rpcsvc.c:720<br>
            #10 0x00007feeba776c30 in rpcsvc_notify (trans=0x2bf51c0,
            mydata=&lt;value optimized out&gt;, event=&lt;value
            optimized out&gt;, data=0x2bf51c0) at rpcsvc.c:758<br>
            #11 0x00007feeba778638 in rpc_transport_notify
            (this=&lt;value optimized out&gt;, event=&lt;value optimized
            out&gt;, data=&lt;value optimized out&gt;) at
            rpc-transport.c:512<br>
            #12 0x00007feeb115e971 in socket_event_poll_err
            (fd=&lt;value optimized out&gt;, idx=&lt;value optimized
            out&gt;, data=0x2bf51c0, poll_in=&lt;value optimized
            out&gt;, poll_out=0, <br>
            &nbsp;&nbsp;&nbsp; poll_err=0) at socket.c:1071<br>
            #13 socket_event_handler (fd=&lt;value optimized out&gt;,
            idx=&lt;value optimized out&gt;, data=0x2bf51c0,
            poll_in=&lt;value optimized out&gt;, poll_out=0, poll_err=0)
            at socket.c:2240<br>
            #14 0x00007feeba9fc6a7 in event_dispatch_epoll_handler
            (event_pool=0x22e2d00) at event-epoll.c:384<br>
            #15 event_dispatch_epoll (event_pool=0x22e2d00) at
            event-epoll.c:445<br>
            #16 0x0000000000407e93 in main (argc=19,
            argv=0x7fffea6c7f88) at glusterfsd.c:2023<br>
            (gdb) f 4<br>
            #4&nbsp; pl_inodelk_client_cleanup (this=0x230d910,
            ctx=0x7fee700f0c60) at inodelk.c:428<br>
            428&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; pl_inodelk_log_cleanup (l);<br>
            (gdb) p l-&gt;pl_inode-&gt;refkeeper <br>
            $1 = (inode_t *) 0x0<br>
            (gdb) <br>
            <div><br>
            </div>
          </div>
          <div>pl_inode-&gt;refkeeper was found to be NULL even when
            there were some blocked inodelks in a certain domain of the
            inode,</div>
          <div>which when dereferenced by the epoll thread in the
            cleanup codepath led to a crash.<br>
          </div>
          <div><br>
          </div>
          <div>On inspecting the code (for want of a consistent
            reproducer), three things were found:<br>
          </div>
          <div><br>
          </div>
          <div>1. The function where the crash happens
            (pl_inodelk_log_cleanup()), makes an attempt to resolve the
            inode to path as can be seen below. But the way inode_path()
            itself<br>
          </div>
          <div>&nbsp;&nbsp;&nbsp; works is to first construct the path based on the
            given inode's ancestry and place it in the buffer provided.
            And if all else fails, the gfid of the inode is placed in a
            certain format ("&lt;gfid:%s&gt;").<br>
          </div>
          <div>&nbsp;&nbsp;&nbsp; This eliminates the need for statements from line 4
            through 7 below, thereby "preventing" dereferencing of
            pl_inode-&gt;refkeeper.<br>
          </div>
          <div>&nbsp;&nbsp;&nbsp; Now, although this change prevents the crash
            altogether, it still does not fix the race that led to
            pl_inode-&gt;refkeeper becoming NULL, and comes at the cost
            of</div>
          <div>&nbsp;&nbsp;&nbsp; printing "(null)" in the log message on line 9 every
            time pl_inode-&gt;refkeeper is found to be NULL, rendering
            the logged messages somewhat useless.<br>
          </div>
          <div><br>
          </div>
          <div>&lt;code&gt;<br>
          </div>
          <div>&nbsp; 0&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; pl_inode = lock-&gt;pl_inode;<br>
            &nbsp; 1<br>
            &nbsp; 2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; inode_path (pl_inode-&gt;refkeeper, NULL,
            &amp;path);<br>
            &nbsp; 3<br>
            &nbsp; 4&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (path)<br>
            &nbsp; 5&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; file = path;<br>
            &nbsp; 6&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else<br>
            &nbsp; 7&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; file = uuid_utoa
            (pl_inode-&gt;refkeeper-&gt;gfid);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;<br>
            &nbsp;
            8&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

            &nbsp;<br>
            &nbsp; 9&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; gf_log (THIS-&gt;name,
            GF_LOG_WARNING,&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;<br>
            &nbsp;10&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "releasing lock on %s held by
            "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;<br>
            &nbsp;11&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "{client=%p, pid=%"PRId64"
            lk-owner=%s}",&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;<br>
            &nbsp;12&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; file, lock-&gt;client, (uint64_t)
            lock-&gt;client_pid,&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;<br>
            &nbsp;13&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; lkowner_utoa (&amp;lock-&gt;owner));</div>
          <div>&lt;\code&gt;<br>
          </div>
        </div>
      </blockquote>
      I think this logging code is from the days when gfid handle
      concept was not there. So it wasn't returning
      &lt;gfid:gfid-str&gt; in cases the path is not present in the
      dentries. I believe the else block can be deleted safely now.<br>
      <br>
      Pranith<br>
      <blockquote
        cite="mid:2112728330.8298304.1401862049173.JavaMail.zimbra@redhat.com"
        type="cite">
        <div style="font-family: garamond,new york,times,serif;
          font-size: 12pt; color: #000000">
          <div><br>
          </div>
          <div>2. There is at least one codepath found that can lead to
            this crash:</div>
          <div>&nbsp;&nbsp;&nbsp; Imagine an inode on which an inodelk operation is
            attempted by a client and is successfully granted too.</div>
          <div>&nbsp;&nbsp; Now, between the time the lock was granted and
            pl_update_refkeeper() was called by this thread, the client
            could send a DISCONNECT event,</div>
          <div>&nbsp;&nbsp; causing cleanup codepath to be executed, where the
            epoll thread crashes on dereferencing pl_inode-&gt;refkeeper
            which is STILL NULL at this point.<br>
          </div>
          <div><br>
          </div>
          <div>&nbsp;&nbsp; Besides, there are still places in locks xlator where
            the refkeeper is NOT updated whenever the lists are modified
            - for instance in the cleanup codepath from a DISCONNECT.<br>
          </div>
          <div>&nbsp; <br>
          </div>
          <div>3. Also, pl_update_refkeeper() seems to be not taking
            into account blocked locks on the inode in the
            __pl_inode_is_empty() check, when it should, as there could
            still be cases<br>
          </div>
          <div>&nbsp;&nbsp;&nbsp; where the granted list could be empty but not the
            blocked list at the time of udpating the refkeeper, in which
            case pl_inode must still take ref on the inode.<br>
          </div>
        </div>
      </blockquote>
      Avati,<br>
      &nbsp;&nbsp;&nbsp;&nbsp; There is one more issue, where pl_inode_forget is cleaning up
      blocked/granted locks which is not handling the new 'client_list'.
      But if we fix inode ref/unref like Kritika proposed, pl_forget
      should never come when there is a granted/blocked lock. So may be
      we should delete that as well. Please feel free to comment.<br>
    </blockquote>
    It is pl_forget not pl_inode_forget. typo :-(.<br>
    <br>
    Pranith<br>
    <blockquote cite="mid:538EBD6A.8090204@redhat.com" type="cite"> <br>
      Pranith<br>
      <blockquote
        cite="mid:2112728330.8298304.1401862049173.JavaMail.zimbra@redhat.com"
        type="cite">
        <div style="font-family: garamond,new york,times,serif;
          font-size: 12pt; color: #000000">
          <div><br>
          </div>
          <div><span style="text-decoration: underline;"
              data-mce-style="text-decoration: underline;">Proposed
              solution to 2/3:</span></div>
          <div><br>
          </div>
          <div>1. Do away with refkeeper in pl_inode_t altogether.<br>
          </div>
          <div>2. Let every lock object (inodelk/entryllk/posix_lock)
            have an inode_t * member to act as a placeholder for the
            associated inode object on which it is locking.<br>
          </div>
          <div>3. Let each lock object hold a ref on the inode at the
            time of its creation and unref the inode during its
            destruction.<br>
          </div>
          <div><br>
          </div>
          <div>&nbsp;Please let me know what you think of the above.<br>
          </div>
          <div><br>
          </div>
          <div>-Krutika<br>
          </div>
          <div><br>
          </div>
          <div><br>
          </div>
        </div>
        <br>
        <fieldset class="mimeAttachmentHeader"></fieldset>
        <br>
        <pre wrap="">_______________________________________________
Gluster-devel mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:Gluster-devel@gluster.org">Gluster-devel@gluster.org</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://supercolony.gluster.org/mailman/listinfo/gluster-devel">http://supercolony.gluster.org/mailman/listinfo/gluster-devel</a>
</pre>
      </blockquote>
      <br>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <br>
      <pre wrap="">_______________________________________________
Gluster-devel mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Gluster-devel@gluster.org">Gluster-devel@gluster.org</a>
<a class="moz-txt-link-freetext" href="http://supercolony.gluster.org/mailman/listinfo/gluster-devel">http://supercolony.gluster.org/mailman/listinfo/gluster-devel</a>
</pre>
    </blockquote>
    <br>
  </body>
</html>