Here's a patch on top of today's git HEAD, if you can try it: http://review.gluster.org/4774/

Thanks!
Avati

On Wed, Apr 3, 2013 at 4:35 PM, Anand Avati <anand.avati@gmail.com> wrote:

Hmm, I was tempted to suggest that you were bitten by the gluster/ext4 readdir d_off incompatibility issue (which was recently fixed: http://review.gluster.org/4711/). But you say it works fine when you do the ls one at a time, sequentially.

I just realized after reading your email that, in glusterfs, because we use the same anonymous fd for readdir queries from multiple clients/applications, we have a race in the posix translator where two threads attempt to push/pull the same backend cursor in a chaotic way, resulting in duplicate or lost entries. This might be the issue you are seeing; I'm just guessing.
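
To illustrate the suspected race, here is a minimal sketch (not the actual posix translator code; shared_dir and dir_read_at are made-up names): two threads that seekdir() and readdir() the same DIR* without serialization can interleave, so one thread reads entries from the other's offset. Holding a mutex across the reposition-and-read keeps each reader's window intact:

----
#include <sys/types.h>
#include <dirent.h>
#include <pthread.h>

/* One shared cursor per anonymous fd: roughly the situation in the
 * posix translator when several clients readdir through the same fd. */
struct shared_dir {
        DIR            *dirp;
        pthread_mutex_t lock;
};

/* Read up to 'count' entries starting at 'offset'.  Without the lock,
 * thread A's seekdir() can land between thread B's seekdir() and its
 * readdir() loop, so B returns entries from A's position: some entries
 * show up twice, others never show up at all. */
ssize_t dir_read_at(struct shared_dir *sd, long offset,
                    struct dirent *out, size_t count)
{
        size_t n = 0;
        struct dirent *e;

        pthread_mutex_lock(&sd->lock);
        seekdir(sd->dirp, offset);      /* reposition the shared cursor */
        while (n < count && (e = readdir(sd->dirp)) != NULL)
                out[n++] = *e;
        pthread_mutex_unlock(&sd->lock);

        return (ssize_t)n;
}
----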

Will you be willing to try out a source code patch on top of the git HEAD, rebuild your glusterfs, and verify whether it fixes the issue? I would really appreciate it!

Thanks,
Avati

On Wed, Apr 3, 2013 at 2:37 PM, Michael Brown <michael@netdirect.ca> wrote:

I'm seeing a problem on my fairly fresh RHEL gluster install. Smells to me like a parallelism problem on the server.

If I mount a gluster volume via NFS (using glusterd's internal NFS server, not nfs-kernel-server) and read a directory from multiple clients *in parallel*, I get inconsistent results across clients: some files are missing from the directory listing, and some may be present twice!

Exactly which files (or directories!) are missing or duplicated varies each time, but I can reproduce the behaviour very consistently.

You can see a screenshot here: http://imgur.com/JU8AFrt

The replication steps are:
* clusterssh to each NFS client
* umount /gv0 (to clear cache)
* mount /gv0 [1]
* ls -al /gv0/common/apache-jmeter-2.9/bin (which is where I first noticed this)
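
Something like the following sketch approximates the parallel step on a single machine, using threads against one mount instead of separate clusterssh'd clients (only an approximation, since client-side caching may mask the problem; the thread count and path handling here are made up). Each reader opens its own cursor and counts entries; runs that disagree with each other, or with a serial ls, match the symptom:

----
#include <dirent.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define NTHREADS 4

static const char *path;

static void *list_dir(void *arg)
{
        DIR *d = opendir(path);         /* each "client" gets its own cursor */
        struct dirent *e;
        long n = 0;

        if (!d) { perror("opendir"); return NULL; }
        while ((e = readdir(d)) != NULL)
                n++;
        closedir(d);
        printf("reader %ld saw %ld entries\n", (long)(intptr_t)arg, n);
        return NULL;
}

int main(int argc, char **argv)
{
        pthread_t t[NTHREADS];
        long i;

        path = argc > 1 ? argv[1] : ".";
        for (i = 0; i < NTHREADS; i++)
                pthread_create(&t[i], NULL, list_dir, (void *)(intptr_t)i);
        for (i = 0; i < NTHREADS; i++)
                pthread_join(t[i], NULL);
        return 0;
}
----

(Build with gcc -pthread.)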

Here's the rub: if, instead of doing the ls in parallel, I do it in series, it works just fine (consistent, correct results everywhere). But hitting the gluster server from multiple clients *at the same time* causes problems.

I can still stat() and open() the files missing from the directory listing; they just don't show up in an enumeration.

Mounting gv0 as a gluster client filesystem works just fine.

Details of my setup:
* 2 × gluster servers: 2 × E5-2670, 128GB RAM, RHEL 6.4 64-bit, glusterfs-server-3.3.1-1.el6.x86_64 (from EPEL)
* 4 × NFS clients: 2 × E5-2660, 128GB RAM, RHEL 5.7 64-bit, glusterfs-3.3.1-11.el5 (from kkeithley's repo, only used for testing)
* gv0 volume information is below
* bricks are 400GB SSDs with ext4 [2]
* common network is 10GbE; replication between servers happens over a direct 10GbE link

I will be testing on xfs/btrfs/zfs eventually, but for now I'm on ext4.

Also attached is my chatlog from asking about this in #gluster.

[1]: fstab line is: fearless1:/gv0 /gv0 nfs defaults,sync,tcp,wsize=8192,rsize=8192 0 0
[2]: yes, I've turned off dir_index to avoid That Bug. I've run the d_off test; results are here: http://pastebin.com/zQt5gZnZ
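
For reference, a d_off dump like that one can be produced with a few lines of C. This is a generic sketch assuming a Linux/glibc struct dirent that exposes d_off, not necessarily the exact test program used:

----
#include <dirent.h>
#include <stdio.h>

/* Print each entry with its d_off so listings taken on different
 * clients (or against different backend filesystems) can be diffed. */
int main(int argc, char **argv)
{
        DIR *d = opendir(argc > 1 ? argv[1] : ".");
        struct dirent *e;

        if (!d) { perror("opendir"); return 1; }
        while ((e = readdir(d)) != NULL)
                printf("%20lld  %s\n", (long long)e->d_off, e->d_name);
        closedir(d);
        return 0;
}
----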

----
gluster> volume info gv0

Volume Name: gv0
Type: Distributed-Replicate
Volume ID: 20117b48-7f88-4f16-9490-a0349afacf71
Status: Started
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: fearless1:/export/bricks/500117310007a6d8/glusterdata
Brick2: fearless2:/export/bricks/500117310007a674/glusterdata
Brick3: fearless1:/export/bricks/500117310007a714/glusterdata
Brick4: fearless2:/export/bricks/500117310007a684/glusterdata
Brick5: fearless1:/export/bricks/500117310007a7dc/glusterdata
Brick6: fearless2:/export/bricks/500117310007a694/glusterdata
Brick7: fearless1:/export/bricks/500117310007a7e4/glusterdata
Brick8: fearless2:/export/bricks/500117310007a720/glusterdata
Brick9: fearless1:/export/bricks/500117310007a7ec/glusterdata
Brick10: fearless2:/export/bricks/500117310007a74c/glusterdata
Brick11: fearless1:/export/bricks/500117310007a838/glusterdata
Brick12: fearless2:/export/bricks/500117310007a814/glusterdata
Brick13: fearless1:/export/bricks/500117310007a850/glusterdata
Brick14: fearless2:/export/bricks/500117310007a84c/glusterdata
Brick15: fearless1:/export/bricks/500117310007a858/glusterdata
Brick16: fearless2:/export/bricks/500117310007a8f8/glusterdata
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
nfs.disable: off
----

--
Michael Brown             | `One of the main causes of the fall of
Systems Consultant        | the Roman Empire was that, lacking zero,
Net Direct Inc.           | they had no way to indicate successful
☎: +1 519 883 1172 x5106  | termination of their C programs.' - Firth

_______________________________________________
Gluster-devel mailing list
Gluster-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/gluster-devel