<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">I'm not quite keen on trying HEAD on
      these servers yet, but I did grab the source package from
      <a class="moz-txt-link-freetext" href="http://repos.fedorapeople.org/repos/kkeithle/glusterfs/epel-6Server/SRPMS/">http://repos.fedorapeople.org/repos/kkeithle/glusterfs/epel-6Server/SRPMS/</a>
      and applied the patch manually.<br>
      <br>
      Much better! Looks like that did the trick.<br>
      <br>
      M.<br>
      <br>
      On 13-04-03 07:57 PM, Anand Avati wrote:<br>
    </div>
    <blockquote
cite="mid:CAFboF2zY2eDwm0Nj0==Q6pCOqDW9zQtO9FAOkh0tWU-XYanpXw@mail.gmail.com"
      type="cite">Here's a patch on top of today's git HEAD, if you can
      try - <a moz-do-not-send="true"
        href="http://review.gluster.org/4774/">http://review.gluster.org/4774/</a>
      <div><br>
      </div>
      <div>Thanks!</div>
      <div>Avati<br>
        <br>
        <div class="gmail_quote">On Wed, Apr 3, 2013 at 4:35 PM, Anand
          Avati <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:anand.avati@gmail.com" target="_blank">anand.avati@gmail.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">Hmm, I was
            tempted to suggest that you were bitten by the gluster/ext4
            readdir d_off incompatibility issue (which was recently fixed
            in <a moz-do-not-send="true"
              href="http://review.gluster.org/4711/" target="_blank">http://review.gluster.org/4711/</a>).
            But you say it works fine when you do the ls one at a time,
            sequentially.
            <div>
              <br>
            </div>
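            <div>(For context on what that d_off is: it is the per-entry
              directory offset that readdir()/telldir()/seekdir() use as a
              resume cursor. Below is a minimal plain-POSIX sketch of that
              cursor in use &#8212; not gluster code, and unrelated to the
              actual fix in 4711; it just lists the current directory as an
              example.)</div>
            <pre>
/* Plain POSIX illustration of the readdir d_off / directory cursor
 * (not glusterfs code; lists the current directory as an example). */
#include &lt;dirent.h&gt;
#include &lt;stdio.h&gt;

int main(void)
{
    DIR *dir = opendir(".");
    if (!dir)
        return 1;

    struct dirent *de = readdir(dir);      /* read the first entry             */
    long cursor = telldir(dir);            /* remember the cursor after it     */
    if (de)
        printf("first entry: %s (d_off=%ld)\n", de->d_name, (long)de->d_off);

    while ((de = readdir(dir)) != NULL)    /* drain the rest of the listing    */
        ;

    seekdir(dir, cursor);                  /* jump back to the saved cursor... */
    de = readdir(dir);                     /* ...and resume with entry two     */
    if (de)
        printf("resumed at: %s\n", de->d_name);

    closedir(dir);
    return 0;
}
</pre>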
            <div>I just realized after reading your email that, because
              glusterfs uses the same anonymous fd for multiple
              clients'/applications' readdir queries, we have a race in
              the posix translator where two threads advance the same
              backend directory cursor concurrently, resulting in
              duplicate or lost entries. This might be the issue you are
              seeing; just guessing.</div>
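            <div>(Not the actual posix translator code, just a minimal
              sketch of the kind of race meant here: two readers sharing a
              single DIR * / anonymous cursor and advancing it without any
              lock, so each reader's listing can lose or repeat entries.
              The use of plain pthreads and the current directory is
              illustrative only.)</div>
            <pre>
/* Sketch of the described race (illustrative, not the glusterfs posix
 * translator): two readers share one directory stream and pull entries
 * off the same cursor concurrently, with no locking. */
#include &lt;dirent.h&gt;
#include &lt;pthread.h&gt;
#include &lt;stdio.h&gt;

static DIR *shared_dir;                 /* one shared backend cursor */

static void *list_entries(void *tag)
{
    struct dirent *de;
    /* Each reader thinks it is producing a full listing, but both are
     * advancing the same cursor, so entries get split between them
     * (lost), and unsynchronized seeks can replay some (duplicates). */
    while ((de = readdir(shared_dir)) != NULL)
        printf("[%s] %s\n", (const char *)tag, de->d_name);
    return NULL;
}

int main(void)
{
    shared_dir = opendir(".");
    if (!shared_dir)
        return 1;

    pthread_t a, b;
    pthread_create(&amp;a, NULL, list_entries, "reader-1");
    pthread_create(&amp;b, NULL, list_entries, "reader-2");
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    closedir(shared_dir);
    return 0;
}
</pre>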
            <div><br>
            </div>
            <div>Would you be willing to try out a source code patch on
              top of the git HEAD, rebuild your glusterfs, and verify
              whether it fixes the issue? I would really appreciate it!</div>
            <div><br>
            </div>
            <div>Thanks,</div>
            <div>
              Avati<br>
              <br>
              <div class="gmail_quote">
                <div>
                  <div class="h5">On Wed, Apr 3, 2013 at 2:37 PM,
                    Michael Brown <span dir="ltr">&lt;<a
                        moz-do-not-send="true"
                        href="mailto:michael@netdirect.ca"
                        target="_blank">michael@netdirect.ca</a>&gt;</span>
                    wrote:<br>
                  </div>
                </div>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <div>
                    <div class="h5">
                      <div bgcolor="#FFFFFF" text="#000000"> I'm seeing
                        a problem on my fairly fresh RHEL gluster
                        install. Smells to me like a parallelism problem
                        on the server.<br>
                        <br>
                        If I mount a gluster volume via NFS (using
                        glusterd's internal NFS server, not
                        nfs-kernel-server) and read a directory from
                        multiple clients *in parallel*, I get
                        inconsistent results across servers. Some files
                        are missing from the directory listing, some may
                        be present twice!<br>
                        <br>
                        Exactly which files (or directories!) are
                        missing/duplicated varies each time. But I can
                        very consistently reproduce the behaviour.<br>
                        <br>
                        You can see a screenshot here: <a
                          moz-do-not-send="true"
                          href="http://imgur.com/JU8AFrt"
                          target="_blank">http://imgur.com/JU8AFrt</a><br>
                        <br>
                        The reproduction steps are:<br>
                        * clusterssh to each NFS client<br>
                        * <tt>umount /gv0</tt> (to clear the cache)<br>
                        * <tt>mount /gv0</tt> [1]<br>
                        * <tt>ls -al </tt><tt>/gv0/common/apache-jmeter-2.9/bin</tt>
                        (which is where I first noticed this)<br>
                        <br>
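                        (A rough single-host stand-in for the parallel
                        part, in C rather than clusterssh + ls; the real
                        repro of course needs separate clients, each with
                        its own NFS mount, so treat this only as an
                        approximation. The path is the one above; the
                        reader count is just an example.)<br>
                        <pre>
/* Rough stand-in for the clusterssh repro: several threads each do their
 * own opendir()/readdir() of the NFS-mounted directory at the same time
 * and report how many entries they saw.  NREADERS is an example value. */
#include &lt;dirent.h&gt;
#include &lt;pthread.h&gt;
#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;

#define NREADERS 4
static const char *path = "/gv0/common/apache-jmeter-2.9/bin";

static void *count_entries(void *arg)
{
    DIR *dir = opendir(path);            /* each reader opens its own handle */
    if (!dir)
        return NULL;

    long n = 0;
    while (readdir(dir) != NULL)
        n++;
    closedir(dir);

    printf("reader %ld saw %ld entries\n", (long)(intptr_t)arg, n);
    return NULL;
}

int main(void)
{
    pthread_t readers[NREADERS];
    for (long i = 0; i &lt; NREADERS; i++)
        pthread_create(&amp;readers[i], NULL, count_entries, (void *)(intptr_t)i);
    for (long i = 0; i &lt; NREADERS; i++)
        pthread_join(readers[i], NULL);
    return 0;   /* entry counts should all match; with the bug they may not */
}
</pre>
                        <br>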
                        Here's the rub: if, instead of doing the 'ls' in
                        parallel, I do it in series, it works just fine
                        (consistent correct results everywhere). But
                        hitting the gluster server from multiple clients
                        <b>at the same time</b> causes problems.<br>
                        <br>
                        I can still stat() and open() the files missing
                        from the directory listing; they just don't show
                        up in an enumeration.<br>
                        <br>
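                        (That mismatch is easy to check mechanically; a
                        small sketch that asks whether a name which
                        stat()s fine also shows up in the enumeration.
                        The filename below is only a placeholder.)<br>
                        <pre>
/* Checks whether a file that stat()s successfully also appears in the
 * readdir() enumeration (the filename is a placeholder). */
#include &lt;dirent.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/stat.h&gt;

int main(void)
{
    const char *dirpath = "/gv0/common/apache-jmeter-2.9/bin";
    const char *name    = "jmeter.sh";      /* substitute a "missing" file here */

    char full[4096];
    snprintf(full, sizeof(full), "%s/%s", dirpath, name);

    struct stat st;
    int stat_ok = (stat(full, &amp;st) == 0);

    int listed = 0;
    DIR *dir = opendir(dirpath);
    if (dir) {
        struct dirent *de;
        while ((de = readdir(dir)) != NULL)
            if (strcmp(de->d_name, name) == 0)
                listed = 1;
        closedir(dir);
    }

    printf("%s: stat() %s, readdir() %s\n", name,
           stat_ok ? "succeeds" : "fails",
           listed ? "lists it" : "does not list it");
    return 0;
}
</pre>
                        <br>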
                        Mounting gv0 as a gluster client filesystem
                        works just fine.<br>
                        <br>
                        Details of my setup:<br>
                        2 × gluster servers: 2×E5-2670, 128GB RAM, RHEL
                        6.4 64-bit, glusterfs-server-3.3.1-1.el6.x86_64
                        (from EPEL)<br>
                        4 × NFS clients: 2×E5-2660, 128GB RAM, RHEL 5.7
                        64-bit, glusterfs-3.3.1-11.el5 (from kkeithley's
                        repo, only used for testing)<br>
                        gv0 volume information is below<br>
                        bricks are 400GB SSDs with ext4[2]<br>
                        common network is 10GbE, replication between
                        servers happens over direct 10GbE link.<br>
                        <br>
                        I will be testing on xfs/btrfs/zfs eventually,
                        but for now I'm on ext4. <br>
                        <br>
                        Also attached is my chatlog from asking about
                        this in #gluster<br>
                        <br>
                        [1]: fstab line is: <tt>fearless1:/gv0 /gv0 nfs
                          defaults,sync,tcp,wsize=8192,rsize=8192 0 0</tt><br>
                        [2]: yes, I've turned off dir_index to avoid
                        That Bug. I've run the d_off test, results are
                        here: <a moz-do-not-send="true"
                          href="http://pastebin.com/zQt5gZnZ"
                          target="_blank">http://pastebin.com/zQt5gZnZ</a><br>
                        <br>
                        ----<br>
                        <tt>gluster&gt; volume info gv0</tt><tt><br>
                        </tt><tt> </tt><tt><br>
                        </tt><tt>Volume Name: gv0</tt><tt><br>
                        </tt><tt>Type: Distributed-Replicate</tt><tt><br>
                        </tt><tt>Volume ID:
                          20117b48-7f88-4f16-9490-a0349afacf71</tt><tt><br>
                        </tt><tt>Status: Started</tt><tt><br>
                        </tt><tt>Number of Bricks: 8 x 2 = 16</tt><tt><br>
                        </tt><tt>Transport-type: tcp</tt><tt><br>
                        </tt><tt>Bricks:</tt><tt><br>
                        </tt><tt>Brick1:
                          fearless1:/export/bricks/500117310007a6d8/glusterdata</tt><tt><br>
                        </tt><tt>Brick2:
                          fearless2:/export/bricks/500117310007a674/glusterdata</tt><tt><br>
                        </tt><tt>Brick3:
                          fearless1:/export/bricks/500117310007a714/glusterdata</tt><tt><br>
                        </tt><tt>Brick4:
                          fearless2:/export/bricks/500117310007a684/glusterdata</tt><tt><br>
                        </tt><tt>Brick5:
                          fearless1:/export/bricks/500117310007a7dc/glusterdata</tt><tt><br>
                        </tt><tt>Brick6:
                          fearless2:/export/bricks/500117310007a694/glusterdata</tt><tt><br>
                        </tt><tt>Brick7:
                          fearless1:/export/bricks/500117310007a7e4/glusterdata</tt><tt><br>
                        </tt><tt>Brick8:
                          fearless2:/export/bricks/500117310007a720/glusterdata</tt><tt><br>
                        </tt><tt>Brick9:
                          fearless1:/export/bricks/500117310007a7ec/glusterdata</tt><tt><br>
                        </tt><tt>Brick10:
                          fearless2:/export/bricks/500117310007a74c/glusterdata</tt><tt><br>
                        </tt><tt>Brick11:
                          fearless1:/export/bricks/500117310007a838/glusterdata</tt><tt><br>
                        </tt><tt>Brick12:
                          fearless2:/export/bricks/500117310007a814/glusterdata</tt><tt><br>
                        </tt><tt>Brick13:
                          fearless1:/export/bricks/500117310007a850/glusterdata</tt><tt><br>
                        </tt><tt>Brick14:
                          fearless2:/export/bricks/500117310007a84c/glusterdata</tt><tt><br>
                        </tt><tt>Brick15:
                          fearless1:/export/bricks/500117310007a858/glusterdata</tt><tt><br>
                        </tt><tt>Brick16:
                          fearless2:/export/bricks/500117310007a8f8/glusterdata</tt><tt><br>
                        </tt><tt>Options Reconfigured:</tt><tt><br>
                        </tt><tt>diagnostics.count-fop-hits: on</tt><tt><br>
                        </tt><tt>diagnostics.latency-measurement: on</tt><tt><br>
                        </tt><tt>nfs.disable: off</tt><tt><br>
                        </tt><tt>----</tt><span><font color="#888888"><br>
                            <br>
                            <pre cols="72">-- 
Michael Brown               | `One of the main causes of the fall of
Systems Consultant          | the Roman Empire was that, lacking zero,
Net Direct Inc.             | they had no way to indicate successful
☎: <a moz-do-not-send="true" href="tel:%2B1%20519%20883%201172%20x5106" value="+15198831172" target="_blank">+1 519 883 1172 x5106</a>    | termination of their C programs.' - Firth
</pre>
                          </font></span></div>
                      <br>
                    </div>
                  </div>
                  _______________________________________________<br>
                  Gluster-devel mailing list<br>
                  <a moz-do-not-send="true"
                    href="mailto:Gluster-devel@nongnu.org"
                    target="_blank">Gluster-devel@nongnu.org</a><br>
                  <a moz-do-not-send="true"
                    href="https://lists.nongnu.org/mailman/listinfo/gluster-devel"
                    target="_blank">https://lists.nongnu.org/mailman/listinfo/gluster-devel</a><br>
                  <br>
                </blockquote>
              </div>
              <br>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
    <br>
    <pre class="moz-signature" cols="72">-- 
Michael Brown               | `One of the main causes of the fall of
Systems Consultant          | the Roman Empire was that, lacking zero,
Net Direct Inc.             | they had no way to indicate successful
☎: +1 519 883 1172 x5106    | termination of their C programs.' - Firth
</pre>
  </body>
</html>