<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">I'm not quite keen on trying HEAD on
      these servers yet, but I did grab the source package from
      <a class="moz-txt-link-freetext" href="http://repos.fedorapeople.org/repos/kkeithle/glusterfs/epel-6Server/SRPMS/">http://repos.fedorapeople.org/repos/kkeithle/glusterfs/epel-6Server/SRPMS/</a>
      and applied the patch manually.<br>
      <br>
      Much better! Looks like that did the trick.<br>
      <br>
      M.<br>
      <br>
      On 13-04-03 07:57 PM, Anand Avati wrote:<br>
    </div>
    <blockquote
cite="mid:CAFboF2zY2eDwm0Nj0==Q6pCOqDW9zQtO9FAOkh0tWU-XYanpXw@mail.gmail.com"
      type="cite">Here's a patch on top of today's git HEAD, if you can
      try - <a moz-do-not-send="true"
        href="http://review.gluster.org/4774/">http://review.gluster.org/4774/</a>
      <div><br>
      </div>
      <div>Thanks!</div>
      <div>Avati<br>
        <br>
        <div class="gmail_quote">On Wed, Apr 3, 2013 at 4:35 PM, Anand
          Avati <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:anand.avati@gmail.com" target="_blank">anand.avati@gmail.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">Hmm, I was
            tempted to suggest that you were bitten by the gluster/ext4
            readdir d_off incompatibility issue (which was recently fixed
            in <a moz-do-not-send="true"
              href="http://review.gluster.org/4711/" target="_blank">http://review.gluster.org/4711/</a>).
            But you say it works fine when you do the ls one at a time,
            sequentially.
            <div>
              <br>
            </div>
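            <div>(For context on what that d_off is: it is the per-entry
              directory offset that readdir()/telldir()/seekdir() use as a
              resume cursor. Below is a minimal plain-POSIX sketch of that
              cursor in use &#8212; not gluster code, and unrelated to the
              actual fix in 4711; it just lists the current directory as an
              example.)</div>
            <pre>
/* Plain POSIX illustration of the readdir d_off / directory cursor
 * (not glusterfs code; lists the current directory as an example). */
#include &lt;dirent.h&gt;
#include &lt;stdio.h&gt;

int main(void)
{
    DIR *dir = opendir(".");
    if (!dir)
        return 1;

    struct dirent *de = readdir(dir);      /* read the first entry             */
    long cursor = telldir(dir);            /* remember the cursor after it     */
    if (de)
        printf("first entry: %s (d_off=%ld)\n", de->d_name, (long)de->d_off);

    while ((de = readdir(dir)) != NULL)    /* drain the rest of the listing    */
        ;

    seekdir(dir, cursor);                  /* jump back to the saved cursor... */
    de = readdir(dir);                     /* ...and resume with entry two     */
    if (de)
        printf("resumed at: %s\n", de->d_name);

    closedir(dir);
    return 0;
}
</pre>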
            <div>I just realized after reading your email that, because
              glusterfs uses the same anonymous fd for multiple
              clients'/applications' readdir queries, we have a race in
              the posix translator where two threads advance the same
              backend directory cursor concurrently, resulting in
              duplicate or lost entries. This might be the issue you are
              seeing; just guessing.</div>
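            <div>(Not the actual posix translator code, just a minimal
              sketch of the kind of race meant here: two readers sharing a
              single DIR * / anonymous cursor and advancing it without any
              lock, so each reader's listing can lose or repeat entries.
              The use of plain pthreads and the current directory is
              illustrative only.)</div>
            <pre>
/* Sketch of the described race (illustrative, not the glusterfs posix
 * translator): two readers share one directory stream and pull entries
 * off the same cursor concurrently, with no locking. */
#include &lt;dirent.h&gt;
#include &lt;pthread.h&gt;
#include &lt;stdio.h&gt;

static DIR *shared_dir;                 /* one shared backend cursor */

static void *list_entries(void *tag)
{
    struct dirent *de;
    /* Each reader thinks it is producing a full listing, but both are
     * advancing the same cursor, so entries get split between them
     * (lost), and unsynchronized seeks can replay some (duplicates). */
    while ((de = readdir(shared_dir)) != NULL)
        printf("[%s] %s\n", (const char *)tag, de->d_name);
    return NULL;
}

int main(void)
{
    shared_dir = opendir(".");
    if (!shared_dir)
        return 1;

    pthread_t a, b;
    pthread_create(&amp;a, NULL, list_entries, "reader-1");
    pthread_create(&amp;b, NULL, list_entries, "reader-2");
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    closedir(shared_dir);
    return 0;
}
</pre>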
            <div><br>
            </div>
            <div>Would you be willing to try out a source code patch on
              top of the git HEAD, rebuild your glusterfs, and verify
              whether it fixes the issue? I would really appreciate it!</div>
            <div><br>
            </div>
            <div>Thanks,</div>
            <div>
              Avati<br>
              <br>
              <div class="gmail_quote">
                <div>
                  <div class="h5">On Wed, Apr 3, 2013 at 2:37 PM,
                    Michael Brown <span dir="ltr">&lt;<a
                        moz-do-not-send="true"
                        href="mailto:michael@netdirect.ca"
                        target="_blank">michael@netdirect.ca</a>&gt;</span>
                    wrote:<br>
                  </div>
                </div>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <div>
                    <div class="h5">
                      <div bgcolor="#FFFFFF" text="#000000"> I'm seeing
                        a problem on my fairly fresh RHEL gluster
                        install. Smells to me like a parallelism problem
                        on the server.<br>
                        <br>
                        If I mount a gluster volume via NFS (using
                        glusterd's internal NFS server, not
                        nfs-kernel-server) and read a directory from
                        multiple clients *in parallel*, I get
                        inconsistent results across servers. Some files
                        are missing from the directory listing, some may
                        be present twice!<br>
                        <br>
                        Exactly which files (or directories!) are
                        missing/duplicated varies each time. But I can
                        very consistently reproduce the behaviour.<br>
                        <br>
                        You can see a screenshot here: <a
                          moz-do-not-send="true"
                          href="http://imgur.com/JU8AFrt"
                          target="_blank">http://imgur.com/JU8AFrt</a><br>
                        <br>
                        The reproduction steps are:<br>
                        * clusterssh to each NFS client<br>
                        * <tt>umount /gv0</tt> (to clear the cache)<br>
                        * <tt>mount /gv0</tt> [1]<br>
                        * <tt>ls -al </tt><tt>/gv0/common/apache-jmeter-2.9/bin</tt>
                        (which is where I first noticed this)<br>
                        <br>
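                        (A rough single-host stand-in for the parallel
                        part, in C rather than clusterssh + ls; the real
                        repro of course needs separate clients, each with
                        its own NFS mount, so treat this only as an
                        approximation. The path is the one above; the
                        reader count is just an example.)<br>
                        <pre>
/* Rough stand-in for the clusterssh repro: several threads each do their
 * own opendir()/readdir() of the NFS-mounted directory at the same time
 * and report how many entries they saw.  NREADERS is an example value. */
#include &lt;dirent.h&gt;
#include &lt;pthread.h&gt;
#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;

#define NREADERS 4
static const char *path = "/gv0/common/apache-jmeter-2.9/bin";

static void *count_entries(void *arg)
{
    DIR *dir = opendir(path);            /* each reader opens its own handle */
    if (!dir)
        return NULL;

    long n = 0;
    while (readdir(dir) != NULL)
        n++;
    closedir(dir);

    printf("reader %ld saw %ld entries\n", (long)(intptr_t)arg, n);
    return NULL;
}

int main(void)
{
    pthread_t readers[NREADERS];
    for (long i = 0; i &lt; NREADERS; i++)
        pthread_create(&amp;readers[i], NULL, count_entries, (void *)(intptr_t)i);
    for (long i = 0; i &lt; NREADERS; i++)
        pthread_join(readers[i], NULL);
    return 0;   /* entry counts should all match; with the bug they may not */
}
</pre>
                        <br>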
                        Here's the rub: if, instead of doing the 'ls' in
                        parallel, I do it in series, it works just fine
                        (consistent correct results everywhere). But
                        hitting the gluster server from multiple clients
                        <b>at the same time</b> causes problems.<br>
                        <br>
                        I can still stat() and open() the files missing
                        from the directory listing; they just don't show
                        up in an enumeration.<br>
                        <br>
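                        (That mismatch is easy to check mechanically; a
                        small sketch that asks whether a name which
                        stat()s fine also shows up in the enumeration.
                        The filename below is only a placeholder.)<br>
                        <pre>
/* Checks whether a file that stat()s successfully also appears in the
 * readdir() enumeration (the filename is a placeholder). */
#include &lt;dirent.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/stat.h&gt;

int main(void)
{
    const char *dirpath = "/gv0/common/apache-jmeter-2.9/bin";
    const char *name    = "jmeter.sh";      /* substitute a "missing" file here */

    char full[4096];
    snprintf(full, sizeof(full), "%s/%s", dirpath, name);

    struct stat st;
    int stat_ok = (stat(full, &amp;st) == 0);

    int listed = 0;
    DIR *dir = opendir(dirpath);
    if (dir) {
        struct dirent *de;
        while ((de = readdir(dir)) != NULL)
            if (strcmp(de->d_name, name) == 0)
                listed = 1;
        closedir(dir);
    }

    printf("%s: stat() %s, readdir() %s\n", name,
           stat_ok ? "succeeds" : "fails",
           listed ? "lists it" : "does not list it");
    return 0;
}
</pre>
                        <br>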
                        Mounting gv0 as a gluster client filesystem
                        works just fine.<br>
                        <br>
                        Details of my setup:<br>
                        2 × gluster servers: 2×E5-2670, 128GB RAM, RHEL
                        6.4 64-bit, glusterfs-server-3.3.1-1.el6.x86_64
                        (from EPEL)<br>
                        4 × NFS clients: 2×E5-2660, 128GB RAM, RHEL 5.7
                        64-bit, glusterfs-3.3.1-11.el5 (from kkeithley's
                        repo, only used for testing)<br>
                        gv0 volume information is below<br>
                        bricks are 400GB SSDs with ext4[2]<br>
                        common network is 10GbE, replication between
                        servers happens over direct 10GbE link.<br>
                        <br>
                        I will be testing on xfs/btrfs/zfs eventually,
                        but for now I'm on ext4. <br>
                        <br>
                        Also attached is my chatlog from asking about
                        this in #gluster<br>
                        <br>
                        [1]: fstab line is: <tt>fearless1:/gv0 /gv0 nfs
                          defaults,sync,tcp,wsize=8192,rsize=8192 0 0</tt><br>
                        [2]: yes, I've turned off dir_index to avoid
                        That Bug. I've run the d_off test, results are
                        here: <a moz-do-not-send="true"
                          href="http://pastebin.com/zQt5gZnZ"
                          target="_blank">http://pastebin.com/zQt5gZnZ</a><br>
                        <br>
                        ----<br>
                        <tt>gluster&gt; volume info gv0</tt><tt><br>
                        </tt><tt> </tt><tt><br>
                        </tt><tt>Volume Name: gv0</tt><tt><br>
                        </tt><tt>Type: Distributed-Replicate</tt><tt><br>
                        </tt><tt>Volume ID:
                          20117b48-7f88-4f16-9490-a0349afacf71</tt><tt><br>
                        </tt><tt>Status: Started</tt><tt><br>
                        </tt><tt>Number of Bricks: 8 x 2 = 16</tt><tt><br>
                        </tt><tt>Transport-type: tcp</tt><tt><br>
                        </tt><tt>Bricks:</tt><tt><br>
                        </tt><tt>Brick1:
                          fearless1:/export/bricks/500117310007a6d8/glusterdata</tt><tt><br>
                        </tt><tt>Brick2:
                          fearless2:/export/bricks/500117310007a674/glusterdata</tt><tt><br>
                        </tt><tt>Brick3:
                          fearless1:/export/bricks/500117310007a714/glusterdata</tt><tt><br>
                        </tt><tt>Brick4:
                          fearless2:/export/bricks/500117310007a684/glusterdata</tt><tt><br>
                        </tt><tt>Brick5:
                          fearless1:/export/bricks/500117310007a7dc/glusterdata</tt><tt><br>
                        </tt><tt>Brick6:
                          fearless2:/export/bricks/500117310007a694/glusterdata</tt><tt><br>
                        </tt><tt>Brick7:
                          fearless1:/export/bricks/500117310007a7e4/glusterdata</tt><tt><br>
                        </tt><tt>Brick8:
                          fearless2:/export/bricks/500117310007a720/glusterdata</tt><tt><br>
                        </tt><tt>Brick9:
                          fearless1:/export/bricks/500117310007a7ec/glusterdata</tt><tt><br>
                        </tt><tt>Brick10:
                          fearless2:/export/bricks/500117310007a74c/glusterdata</tt><tt><br>
                        </tt><tt>Brick11:
                          fearless1:/export/bricks/500117310007a838/glusterdata</tt><tt><br>
                        </tt><tt>Brick12:
                          fearless2:/export/bricks/500117310007a814/glusterdata</tt><tt><br>
                        </tt><tt>Brick13:
                          fearless1:/export/bricks/500117310007a850/glusterdata</tt><tt><br>
                        </tt><tt>Brick14:
                          fearless2:/export/bricks/500117310007a84c/glusterdata</tt><tt><br>
                        </tt><tt>Brick15:
                          fearless1:/export/bricks/500117310007a858/glusterdata</tt><tt><br>
                        </tt><tt>Brick16:
                          fearless2:/export/bricks/500117310007a8f8/glusterdata</tt><tt><br>
                        </tt><tt>Options Reconfigured:</tt><tt><br>
                        </tt><tt>diagnostics.count-fop-hits: on</tt><tt><br>
                        </tt><tt>diagnostics.latency-measurement: on</tt><tt><br>
                        </tt><tt>nfs.disable: off</tt><tt><br>
                        </tt><tt>----</tt><span><font color="#888888"><br>
                            <br>
                            <pre cols="72">-- 
Michael Brown               | `One of the main causes of the fall of
Systems Consultant          | the Roman Empire was that, lacking zero,
Net Direct Inc.             | they had no way to indicate successful
☎: <a moz-do-not-send="true" href="tel:%2B1%20519%20883%201172%20x5106" value="+15198831172" target="_blank">+1 519 883 1172 x5106</a>    | termination of their C programs.' - Firth
</pre>
                          </font></span></div>
                      <br>
                    </div>
                  </div>
                  _______________________________________________<br>
                  Gluster-devel mailing list<br>
                  <a moz-do-not-send="true"
                    href="mailto:Gluster-devel@nongnu.org"
                    target="_blank">Gluster-devel@nongnu.org</a><br>
                  <a moz-do-not-send="true"
                    href="https://lists.nongnu.org/mailman/listinfo/gluster-devel"
                    target="_blank">https://lists.nongnu.org/mailman/listinfo/gluster-devel</a><br>
                  <br>
                </blockquote>
              </div>
              <br>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
    <br>
    <pre class="moz-signature" cols="72">-- 
Michael Brown               | `One of the main causes of the fall of
Systems Consultant          | the Roman Empire was that, lacking zero,
Net Direct Inc.             | they had no way to indicate successful
☎: +1 519 883 1172 x5106    | termination of their C programs.' - Firth
</pre>
  </body>
</html>