<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <tt>Hi Jeff,</tt><tt><br>
    </tt><tt><br>
    </tt><tt>I forgot to add this:</tt><tt><br>
    </tt><tt>SSL_pending returned 0 before the call to SSL_read,</tt><tt> and
      hence </tt><tt>SSL_get_error</tt><tt> returned </tt><tt>'SSL_ERROR_WANT_READ'.<br>
      <br>
    </tt><tt>Thanks,</tt><tt><br>
    </tt><tt>Vijay</tt><br>
    <br>
    <br>
    <div class="moz-cite-prefix">On Tuesday 24 June 2014 05:15 PM,
      Vijaikumar M wrote:<br>
    </div>
    <blockquote cite="mid:53A964D5.4020604@redhat.com" type="cite">
      <tt>Hi Jeff,</tt><tt><br>
      </tt><tt><br>
      </tt><tt>This is regarding the patch <a moz-do-not-send="true"
          class="moz-txt-link-freetext"
          href="http://review.gluster.org/#/c/3842/">http://review.gluster.org/#/c/3842/</a>
        (epoll: edge triggered and multi-threaded epoll).</tt><tt><br>
      </tt><tt>The testcase './tests/bugs/bug-873367.t' hangs with this
        fix (Please find the stack trace below).</tt><tt><br>
      </tt><tt><br>
      </tt><tt>In the code snippet below we found that 'SSL_pending' was
        returning 0.</tt><tt><br>
      </tt><tt>I have added a condition here to return from the function
        when there is no data available.</tt><tt><br>
      </tt><tt>Please let me know whether it is OK to do it this way, or whether we need
        to restructure this function for multi-threaded epoll.</tt><tt><br>
      </tt><tt><br>
      </tt><tt>&lt;code: socket.c&gt;</tt><tt><br>
      </tt><tt> 178 static int</tt><tt><br>
      </tt><tt> 179 ssl_do (rpc_transport_t *this, void *buf, size_t
        len, SSL_trinary_func *func)</tt><tt><br>
      </tt><tt> 180 {</tt><tt><br>
      </tt><tt> ....</tt><tt><br>
      </tt><tt> </tt><tt><br>
      </tt><tt> 211                 switch
        (SSL_get_error(priv-&gt;ssl_ssl,r)) {</tt><tt><br>
      </tt><tt> 212                 case SSL_ERROR_NONE:</tt><tt><br>
      </tt><tt> 213                         return r;</tt><tt><br>
      </tt><tt> 214                 case SSL_ERROR_WANT_READ:</tt><tt><br>
      </tt><tt> 215                         if
        (SSL_pending(priv-&gt;ssl_ssl) ==
        0)                                          </tt><tt><br>
      </tt><tt> 216                                 return r;</tt><tt><br>
      </tt><tt> 217                         pfd.fd = priv-&gt;sock;</tt><tt><br>
      </tt><tt> 221                         if (poll(&amp;pfd,1,-1) &lt;
        0) {                                                    </tt><tt><br>
      </tt><tt>&lt;/code&gt;</tt><tt><br>
        <br>
        <br>
      </tt><tt><br>
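      </tt><tt>A minimal sketch of a non-blocking alternative to the poll call in the snippet above, assuming plain POSIX poll(). The helper name fd_readable_now is hypothetical and is not part of socket.c. With a timeout of 0, poll() only reports readiness and returns at once, so a caller that holds the socket lock could return to the event loop instead of sleeping:</tt><br>
      <pre wrap="">
```c
#include "poll.h"

/* Hypothetical helper, not taken from socket.c.
 * Reports whether a read on fd would complete without blocking.
 * Returns 1 if the descriptor is ready (data, EOF or an error is
 * pending), 0 if nothing is ready yet, and -1 if poll() itself failed. */
int
fd_readable_now (int fd)
{
        struct pollfd pfd[1];

        pfd[0].fd = fd;
        pfd[0].events = POLLIN;
        pfd[0].revents = 0;

        /* A timeout of 0 makes poll() return immediately, whereas a
         * timeout of -1 waits until the descriptor becomes ready.
         * With a single descriptor the return value is 0, 1 or -1. */
        return poll (pfd, 1, 0);
}
```
</pre>
      <tt>When this returns 0, the caller can give up the lock and wait for the next event notification. This only illustrates the idea and is not a tested change to the patch.</tt><br>
      <tt>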
      </tt><tt>Thanks,</tt><tt><br>
      </tt><tt>Vijay</tt><tt><br>
      </tt><br>
      <div class="moz-cite-prefix">On Tuesday 24 June 2014 03:55 PM,
        Vijaikumar M wrote:<br>
      </div>
      <blockquote cite="mid:53A95217.1010009@redhat.com" type="cite">
        <tt>From the stack trace we found that the function
          'socket_submit_request' is waiting on a mutex lock.<br>
          The lock is held by the function 'ssl_do', which is
          blocked in the poll syscall.<br>
          <br>
          <br>
        </tt><tt>(gdb) bt</tt><tt><br>
        </tt><tt>#0  0x0000003daa80822d in pthread_join () from
          /lib64/libpthread.so.0</tt><tt><br>
        </tt><tt>#1  0x00007f3b94eea9d0 in event_dispatch_epoll
          (event_pool=&lt;value optimized out&gt;) at event-epoll.c:632</tt><tt><br>
        </tt><tt>#2  0x0000000000407ecd in main (argc=4,
          argv=0x7fff160a4528) at glusterfsd.c:2023</tt><tt><br>
        </tt><tt><br>
        </tt><tt><br>
        </tt><tt>(gdb) info threads</tt><tt><br>
        </tt><tt>  10 Thread 0x7f3b8d483700 (LWP 26225) 
          0x0000003daa80e264 in __lll_lock_wait () from
          /lib64/libpthread.so.0</tt><tt><br>
        </tt><tt>  9 Thread 0x7f3b8ca82700 (LWP 26226) 
          0x0000003daa80f4b5 in sigwait () from /lib64/libpthread.so.0</tt><tt><br>
        </tt><tt>  8 Thread 0x7f3b8c081700 (LWP 26227) 
          0x0000003daa80b98e in pthread_cond_timedwait@@GLIBC_2.3.2
          ()</tt><tt><br>
        </tt><tt>   from /lib64/libpthread.so.0</tt><tt><br>
        </tt><tt>  7 Thread 0x7f3b8b680700 (LWP 26228) 
          0x0000003daa80b98e in pthread_cond_timedwait@@GLIBC_2.3.2
          ()</tt><tt><br>
        </tt><tt>   from /lib64/libpthread.so.0</tt><tt><br>
        </tt><tt>  6 Thread 0x7f3b8a854700 (LWP 26232) 
          0x0000003daa4e9163 in epoll_wait () from /lib64/libc.so.6</tt><tt><br>
        </tt><tt>  5 Thread 0x7f3b89e53700 (LWP 26233) 
          0x0000003daa4e9163 in epoll_wait () from /lib64/libc.so.6</tt><tt><br>
        </tt><tt>  4 Thread 0x7f3b833eb700 (LWP 26241) 
          0x0000003daa4df343 in poll () from /lib64/libc.so.6</tt><tt><br>
        </tt><tt>  3 Thread 0x7f3b82130700 (LWP 26245) 
          0x0000003daa80e264 in __lll_lock_wait () from
          /lib64/libpthread.so.0</tt><tt><br>
        </tt><tt>  2 Thread 0x7f3b8172f700 (LWP 26247) 
          0x0000003daa80e75d in read () from /lib64/libpthread.so.0</tt><tt><br>
        </tt><tt>* 1 Thread 0x7f3b94a38700 (LWP 26224) 
          0x0000003daa80822d in pthread_join () from
          /lib64/libpthread.so.0</tt><tt><br>
        </tt><tt><br>
        </tt><tt><br>
        </tt><font color="#ff0000"><b><tt>(gdb) thread 3</tt></b><b><tt><br>
            </tt></b><b><tt>[Switching to thread 3 (Thread
              0x7f3b82130700 (LWP 26245))]#0  0x0000003daa80e264 in
              __lll_lock_wait ()</tt></b><b><tt><br>
            </tt></b><b><tt>   from /lib64/libpthread.so.0</tt></b><b><tt><br>
            </tt></b><b><tt>(gdb) bt</tt><tt><br>
            </tt><tt>#0  0x0000003daa80e264 in __lll_lock_wait () from
              /lib64/libpthread.so.0</tt><tt><br>
            </tt><tt>#1  0x0000003daa809508 in _L_lock_854 () from
              /lib64/libpthread.so.0</tt><tt><br>
            </tt><tt>#2  0x0000003daa8093d7 in pthread_mutex_lock ()
              from /lib64/libpthread.so.0</tt><tt><br>
            </tt><tt>#3  0x00007f3b8aa74524 in socket_submit_request
              (this=0x7f3b7c0505c0, req=0x7f3b8212f0b0) at socket.c:3134</tt><tt><br>
            </tt></b></font><tt>#4  0x00007f3b94c6b7d5 in
          rpc_clnt_submit (rpc=0x7f3b7c029ce0, prog=&lt;value optimized
          out&gt;, </tt><tt><br>
        </tt><tt>    procnum=&lt;value optimized out&gt;,
          cbkfn=0x7f3b892364b0 &lt;client3_3_lookup_cbk&gt;,
          proghdr=0x7f3b8212f410, </tt><tt><br>
        </tt><tt>    proghdrcount=1, progpayload=0x0,
          progpayloadcount=0, iobref=&lt;value optimized out&gt;,
          frame=0x7f3b93d2a454, </tt><tt><br>
        </tt><tt>    rsphdr=0x7f3b8212f4c0, rsphdr_count=1,
          rsp_payload=0x0, rsp_payload_count=0,
          rsp_iobref=0x7f3b700010d0)</tt><tt><br>
        </tt><tt>    at rpc-clnt.c:1556</tt><tt><br>
        </tt><tt>#5  0x00007f3b892243b0 in client_submit_request
          (this=0x7f3b7c005ef0, req=&lt;value optimized out&gt;, </tt><tt><br>
        </tt><tt>    frame=0x7f3b93d2a454, prog=0x7f3b894525a0,
          procnum=27, cbkfn=0x7f3b892364b0 &lt;client3_3_lookup_cbk&gt;,
          iobref=0x0, </tt><tt><br>
        </tt><tt>    rsphdr=0x7f3b8212f4c0, rsphdr_count=1,
          rsp_payload=0x0, rsp_payload_count=0,
          rsp_iobref=0x7f3b700010d0, </tt><tt><br>
        </tt><tt>    xdrproc=0x7f3b94a4ede0 &lt;xdr_gfs3_lookup_req&gt;)
          at client.c:243</tt><tt><br>
        </tt><tt>#6  0x00007f3b8922fa42 in client3_3_lookup
          (frame=0x7f3b93d2a454, this=0x7f3b7c005ef0,
          data=0x7f3b8212f660)</tt><tt><br>
        </tt><tt>    at client-rpc-fops.c:3119</tt><tt><br>
        </tt><tt><br>
          <br>
          (gdb) p priv-&gt;lock<br>
          $1 = {__data = {__lock = 2, __count = 0, __owner = 26241,
          __nusers = 1, __kind = 0, __spins = 0, __list = {<br>
                __prev = 0x0, __next = 0x0}}, <br>
            __size =
          "\002\000\000\000\000\000\000\000\201f\000\000\001", '\000'
          &lt;repeats 26 times&gt;, __align = 2}<br>
          <br>
        </tt><tt><br>
        </tt><b><font color="#ff0000"><tt>(gdb) thread 4</tt><tt><br>
            </tt><tt>[Switching to thread 4 (Thread 0x7f3b833eb700 (LWP
              26241))]#0  0x0000003daa4df343 in poll () from
              /lib64/libc.so.6</tt><tt><br>
            </tt><tt>(gdb) bt</tt><tt><br>
            </tt><tt>#0  0x0000003daa4df343 in poll () from
              /lib64/libc.so.6<br>
              #1  0x00007f3b8aa71fff in ssl_do (this=0x7f3b7c0505c0,
              buf=0x7f3b7c051264, len=4, func=0x3db2441570
              &lt;SSL_read&gt;)<br>
                  at socket.c:216<br>
              #2  0x00007f3b8aa7277b in __socket_ssl_readv
              (this=&lt;value optimized out&gt;, opvector=&lt;value
              optimized out&gt;, <br>
                  opcount=&lt;value optimized out&gt;) at socket.c:335<br>
              #3  0x00007f3b8aa72c26 in __socket_cached_read
              (this=&lt;value optimized out&gt;, vector=&lt;value
              optimized out&gt;, <br>
                  count=&lt;value optimized out&gt;,
              pending_vector=0x7f3b7c051258,
              pending_count=0x7f3b7c051260, bytes=0x0, write=0)<br>
                  at socket.c:422<br>
              #4  __socket_rwv (this=&lt;value optimized out&gt;,
              vector=&lt;value optimized out&gt;, count=&lt;value
              optimized out&gt;, <br>
                  pending_vector=0x7f3b7c051258,
              pending_count=0x7f3b7c051260, bytes=0x0, write=0) at
              socket.c:496<br>
              #5  0x00007f3b8aa76040 in __socket_readv
              (this=0x7f3b7c0505c0) at socket.c:589<br>
              #6  __socket_proto_state_machine (this=0x7f3b7c0505c0) at
              socket.c:1966<br>
              #7  socket_proto_state_machine (this=0x7f3b7c0505c0) at
              socket.c:2106<br>
              #8  socket_event_poll_in (this=0x7f3b7c0505c0) at
              socket.c:2127<br>
              #9  0x00007f3b8aa77820 in socket_poller
              (ctx=0x7f3b7c0505c0) at socket.c:2338<br>
              #10 0x0000003daa8079d1 in start_thread () from
              /lib64/libpthread.so.0<br>
              #11 0x0000003daa4e8b6d in clone () from /lib64/libc.so.6<br>
            </tt></font></b><tt><br>
          <br>
        </tt><tt>Thanks,</tt><tt><br>
        </tt><tt>Vijay</tt><br>
        <br>
        <br>
        <div class="moz-cite-prefix">On Tuesday 24 June 2014 08:59 AM,
          Raghavendra Gowdappa wrote:<br>
        </div>
        <blockquote
          cite="mid:643428505.31205551.1403580594113.JavaMail.zimbra@redhat.com"
          type="cite">
          <pre wrap="">OK. Sorry, I didn't look at the change number. I'll sync up with Vijay.

----- Original Message -----
</pre>
          <blockquote type="cite">
            <pre wrap="">From: "Anand Avati" <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:avati@redhat.com">&lt;avati@redhat.com&gt;</a>
To: "Raghavendra Gowdappa" <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:rgowdapp@redhat.com">&lt;rgowdapp@redhat.com&gt;</a>
Cc: <a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:vmallika@redhat.com">vmallika@redhat.com</a>
Sent: Tuesday, June 24, 2014 8:55:34 AM
Subject: Re: Change in glusterfs[master]: epoll: Handle client and server FDs in a separate event pool

On 6/23/14, 8:00 PM, Raghavendra Gowdappa wrote:
</pre>
            <blockquote type="cite">
              <pre wrap="">----- Original Message -----
</pre>
              <blockquote type="cite">
                <pre wrap="">From: "Raghavendra Gowdappa" <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:rgowdapp@redhat.com">&lt;rgowdapp@redhat.com&gt;</a>
To: "Anand Avati" <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:avati@redhat.com">&lt;avati@redhat.com&gt;</a>
Cc: <a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:vmallika@redhat.com">vmallika@redhat.com</a>
Sent: Tuesday, June 24, 2014 8:28:41 AM
Subject: Re: Change in glusterfs[master]: epoll: Handle client and server
FDs in a separate event pool



----- Original Message -----
</pre>
                <blockquote type="cite">
                  <pre wrap="">From: "Anand Avati" <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:avati@redhat.com">&lt;avati@redhat.com&gt;</a>
To: <a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:vmallika@redhat.com">vmallika@redhat.com</a>
Cc: "Raghavendra G" <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:rgowdapp@redhat.com">&lt;rgowdapp@redhat.com&gt;</a>
Sent: Monday, June 23, 2014 10:07:19 PM
Subject: Re: Change in glusterfs[master]: epoll: Handle client and server
FDs in a separate event pool

On 6/22/14, 8:47 PM, Vijaikumar Mallikarjuna (Code Review) wrote:
</pre>
                  <blockquote type="cite">
                    <pre wrap="">Vijaikumar Mallikarjuna has posted comments on this change.

Change subject: epoll: Handle client and server FDs in a separate event
pool
......................................................................


Patch Set 9:

Hi Avati,

Actually, we started working on the fix for Bug #1096729, which was a
blocker issue.
We tried several ways to avoid changing the current epoll model for now;
however, we had to make some changes in the epoll code and ended up with
this patch.


MT patch #3842 looks good to me. It would be great if you could help us
get the patch in quickly.

Thanks,
Vijay

</pre>
                  </blockquote>
                  <pre wrap="">Copying Raghavendra as he's the RPC guy. Du - #3842 has been blocked in review
for a long time because of some incompatibility with RPC SSL mode. Very
likely some issue in our SSL multi-threading code. Can you help Vijai
debug this and move #3842 forward? Also there are new SSL patches from
Jeff upstream. Can you guys check if the new patches fix this problem?
</pre>
                </blockquote>
                <pre wrap="">Sure, I'll try to sync up with Vijay.
</pre>
              </blockquote>
              <pre wrap="">However, I have a question about the approach we should take. Doesn't your patch on
multithreaded epoll also fix this issue? Given that yours is a generic
solution, shouldn't it be favoured over this solution?

</pre>
            </blockquote>
            <pre wrap="">That's precisely what I meant. #3842 (the more generic MT epoll) is
having some issues with the SSL MT code (otherwise it is working fine).

</pre>
          </blockquote>
        </blockquote>
        <br>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>