<font size=2 face="sans-serif"><br>John,</font><br><br><font size=2 face="sans-serif">We've experienced the same problem too

it appears. &nbsp;I ran an strace on the file when it returns &quot;structure

needs cleaning&quot; after trying to do an 'ls'. &nbsp;I get wind of this

from my developers when they make modifications to their content and want

to see it served up from Apache.</font><br><br><font size=2 face="sans-serif">It seems to stay like that

indefinitely. &nbsp;The band-aid I found is to run 'ls -l' on the

directory the file lives in; the error then clears and I am able to run

'ls -l' on the file that had returned the &quot;structure needs cleaning&quot;

message. &nbsp;My setup is a 2x2 distributed-replicate volume.</font><br><br><font size=2 face="sans-serif">From the #gluster IRC room, it was suggested

that I run xfs_repair on the bricks, even though no issues are logged

for the underlying XFS filesystem. &nbsp;I ran it two days ago during off

hours, but this morning the issue appears to have come back to haunt me.</font><br><font size=2 face="sans-serif">Unfortunately my brick logs have rolled,

and I am not able to show them to you: the bricks log so much that logrotate rolls the entries

off.</font><br><br><br><font size=2 face="sans-serif">fuse client log file</font><br><a href=http://fpaste.org/61559/69524561/><font size=2 face="sans-serif">http://fpaste.org/61559/69524561/</font></a><br><br><font size=2 face="sans-serif">dirty strace of a file</font><br><a href=http://fpaste.org/61556/95199613/><font size=2 face="sans-serif">http://fpaste.org/61556/95199613/</font></a><br><br><font size=2 face="sans-serif">listing of directory where dirty file

was issued</font><br><a href=http://fpaste.org/61557/86952173/><font size=2 face="sans-serif">http://fpaste.org/61557/86952173/</font></a><br><br><font size=2 face="sans-serif">cleaned and stat-able file</font><br><a href=http://fpaste.org/61558/86952224/><font size=2 face="sans-serif">http://fpaste.org/61558/86952224/</font></a><font size=2 face="sans-serif"><br></font><br><br><br><br><font size=1 color=#5f5f5f face="sans-serif">From: &nbsp; &nbsp; &nbsp;

&nbsp;</font><font size=1 face="sans-serif">gluster-users-request@gluster.org</font><br><font size=1 color=#5f5f5f face="sans-serif">To: &nbsp; &nbsp; &nbsp;

&nbsp;</font><font size=1 face="sans-serif">gluster-users@gluster.org</font><br><font size=1 color=#5f5f5f face="sans-serif">Date: &nbsp; &nbsp; &nbsp;

&nbsp;</font><font size=1 face="sans-serif">12/13/2013 05:58 AM</font><br><font size=1 color=#5f5f5f face="sans-serif">Subject: &nbsp; &nbsp;

&nbsp; &nbsp;</font><font size=1 face="sans-serif">Gluster-users

Digest, Vol 68, Issue 14</font><br><font size=1 color=#5f5f5f face="sans-serif">Sent by: &nbsp; &nbsp;

&nbsp; &nbsp;</font><font size=1 face="sans-serif">gluster-users-bounces@gluster.org</font><br><hr noshade><br><br><br><tt><font size=2>Send Gluster-users mailing list submissions to<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

gluster-users@gluster.org<br><br>To subscribe or unsubscribe via the World Wide Web, visit<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

</font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>or, via email, send a message with subject or body 'help' to<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

gluster-users-request@gluster.org<br><br>You can reach the person managing the list at<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

gluster-users-owner@gluster.org<br><br>When replying, please edit your Subject line so it is more specific<br>than &quot;Re: Contents of Gluster-users digest...&quot;<br><br><br>Today's Topics:<br><br> &nbsp; 1. Re: Structure needs cleaning on some files (Johan Huysmans)<br> &nbsp; 2. Re: Structure needs cleaning on some files (Johan Huysmans)<br> &nbsp; 3. Re: Gluster Community Weekly Meeting (Vijay Bellur)<br> &nbsp; 4. Re: Gluster Community Weekly Meeting (James)<br> &nbsp; 5. Re: Gluster Community Weekly Meeting (Vijay Bellur)<br> &nbsp; 6. Re: Structure needs cleaning on some files (Maik Kulbe)<br> &nbsp; 7. Re: Structure needs cleaning on some files (Anand Avati)<br> &nbsp; 8. Re: Structure needs cleaning on some files (Anand Avati)<br> &nbsp; 9. Gerrit doesn't use HTTPS (James)<br> &nbsp;10. gluster fails under heavy array job load load (harry mangalam)<br> &nbsp;11. qemu remote insecure connections (Joe Topjian)<br> &nbsp;12. Documentation hackathon for 3.5 (Vijay Bellur)<br> &nbsp;13. Re: gluster fails under heavy array job load load (Anand Avati)<br> &nbsp;14. 
Re: Gluster Community Weekly Meeting (Niels de Vos)<br><br><br>----------------------------------------------------------------------<br><br>Message: 1<br>Date: Thu, 12 Dec 2013 14:40:37 +0100<br>From: Johan Huysmans &lt;johan.huysmans@inuits.be&gt;<br>To: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Structure needs cleaning on some files<br>Message-ID: &lt;52A9BCD5.9020905@inuits.be&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;; Format=&quot;flowed&quot;<br><br>I created a bug for this issue:<br><br></font></tt><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1041109"><tt><font size=2>https://bugzilla.redhat.com/show_bug.cgi?id=1041109</font></tt></a><tt><font size=2><br><br>gr.<br>Johan<br><br>On 10-12-13 12:52, Johan Huysmans wrote:<br>&gt; Hi All,<br>&gt;<br>&gt; It seems I can easily reproduce the problem.<br>&gt;<br>&gt; * on node 1 create a file (touch , cat , ...).<br>&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file)<br>&gt; * on node 1 move file to other name (mv file file1)<br>&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file), this

is <br>&gt; still working although the file is not really there<br>&gt; * on node 1 change file content<br>&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file), this

is <br>&gt; still working and has a changed md5sum<br>&gt;<br>&gt; This is really strange behaviour.<br>&gt; Is this normal, can this be altered with a a setting?<br>&gt;<br>&gt; Thanks for any info,<br>&gt; gr.<br>&gt; Johan<br>&gt;<br>&gt; On 10-12-13 10:02, Johan Huysmans wrote:<br>&gt;&gt; I could reproduce this problem with while my mount point is running

<br>&gt;&gt; in debug mode.<br>&gt;&gt; logfile is attached.<br>&gt;&gt;<br>&gt;&gt; gr.<br>&gt;&gt; Johan Huysmans<br>&gt;&gt;<br>&gt;&gt; On 10-12-13 09:30, Johan Huysmans wrote:<br>&gt;&gt;&gt; Hi All,<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; When reading some files we get this error:<br>&gt;&gt;&gt; md5sum: /path/to/file.xml: Structure needs cleaning<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; in /var/log/glusterfs/mnt-sharedfs.log we see these errors:<br>&gt;&gt;&gt; [2013-12-10 08:07:32.256910] W <br>&gt;&gt;&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-0:

<br>&gt;&gt;&gt; remote operation failed: No such file or directory<br>&gt;&gt;&gt; [2013-12-10 08:07:32.257436] W <br>&gt;&gt;&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-1:

<br>&gt;&gt;&gt; remote operation failed: No such file or directory<br>&gt;&gt;&gt; [2013-12-10 08:07:32.259356] W [fuse-bridge.c:705:fuse_attr_cbk]

<br>&gt;&gt;&gt; 0-glusterfs-fuse: 8230: STAT() /path/to/file.xml =&gt; -1

(Structure <br>&gt;&gt;&gt; needs cleaning)<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; We are using gluster 3.4.1-3 on CentOS6.<br>&gt;&gt;&gt; Our servers are 64-bit, our clients 32-bit (we are already

using <br>&gt;&gt;&gt; --enable-ino32 on the mountpoint)<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; This is my gluster configuration:<br>&gt;&gt;&gt; Volume Name: testvolume<br>&gt;&gt;&gt; Type: Replicate<br>&gt;&gt;&gt; Volume ID: ca9c2f87-5d5b-4439-ac32-b7c138916df7<br>&gt;&gt;&gt; Status: Started<br>&gt;&gt;&gt; Number of Bricks: 1 x 2 = 2<br>&gt;&gt;&gt; Transport-type: tcp<br>&gt;&gt;&gt; Bricks:<br>&gt;&gt;&gt; Brick1: SRV-1:/gluster/brick1<br>&gt;&gt;&gt; Brick2: SRV-2:/gluster/brick2<br>&gt;&gt;&gt; Options Reconfigured:<br>&gt;&gt;&gt; performance.force-readdirp: on<br>&gt;&gt;&gt; performance.stat-prefetch: off<br>&gt;&gt;&gt; network.ping-timeout: 5<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; And this is how the applications work:<br>&gt;&gt;&gt; We have 2 client nodes who both have a fuse.glusterfs mountpoint.<br>&gt;&gt;&gt; On 1 client node we have a application which writes files.<br>&gt;&gt;&gt; On the other client node we have a application which reads

these files.<br>&gt;&gt;&gt; On the node where the files are written we don't see any problem,

<br>&gt;&gt;&gt; and can read that file without problems.<br>&gt;&gt;&gt; On the other node we have problems (error messages above)

reading <br>&gt;&gt;&gt; that file.<br>&gt;&gt;&gt; The problem occurs when we perform a md5sum on the exact file,

when <br>&gt;&gt;&gt; perform a md5sum on all files in that directory there is no

problem.<br>&gt;&gt;&gt;<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; How can we solve this problem as this is annoying.<br>&gt;&gt;&gt; The problem occurs after some time (can be days), an umount

and <br>&gt;&gt;&gt; mount of the mountpoint solves it for some days.<br>&gt;&gt;&gt; Once it occurs (and we don't remount) it occurs every time.<br>&gt;&gt;&gt;<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; I hope someone can help me with this problems.<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; Thanks,<br>&gt;&gt;&gt; Johan Huysmans<br>&gt;&gt;&gt; _______________________________________________<br>&gt;&gt;&gt; Gluster-users mailing list<br>&gt;&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt; _______________________________________________<br>&gt;&gt; Gluster-users mailing list<br>&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;<br>&gt;<br>&gt;<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/4152cb13/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/4152cb13/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 2<br>Date: Thu, 12 Dec 2013 14:51:35 +0100<br>From: Johan Huysmans &lt;johan.huysmans@inuits.be&gt;<br>To: &quot;gluster-users@gluster.org&quot; 
&lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Structure needs cleaning on some files<br>Message-ID: &lt;52A9BF67.4070403@inuits.be&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;; Format=&quot;flowed&quot;<br><br>I created a bug for this issue:<br><br></font></tt><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1041109"><tt><font size=2>https://bugzilla.redhat.com/show_bug.cgi?id=1041109</font></tt></a><tt><font size=2><br><br>gr.<br>Johan<br><br>On 10-12-13 12:52, Johan Huysmans wrote:<br>&gt; Hi All,<br>&gt;<br>&gt; It seems I can easily reproduce the problem.<br>&gt;<br>&gt; * on node 1 create a file (touch , cat , ...).<br>&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file)<br>&gt; * on node 1 move file to other name (mv file file1)<br>&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file), this

is <br>&gt; still working although the file is not really there<br>&gt; * on node 1 change file content<br>&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file), this

is <br>&gt; still working and has a changed md5sum<br>&gt;<br>&gt; This is really strange behaviour.<br>&gt; Is this normal, can this be altered with a a setting?<br>&gt;<br>&gt; Thanks for any info,<br>&gt; gr.<br>&gt; Johan<br>&gt;<br>&gt; On 10-12-13 10:02, Johan Huysmans wrote:<br>&gt;&gt; I could reproduce this problem with while my mount point is running

<br>&gt;&gt; in debug mode.<br>&gt;&gt; logfile is attached.<br>&gt;&gt;<br>&gt;&gt; gr.<br>&gt;&gt; Johan Huysmans<br>&gt;&gt;<br>&gt;&gt; On 10-12-13 09:30, Johan Huysmans wrote:<br>&gt;&gt;&gt; Hi All,<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; When reading some files we get this error:<br>&gt;&gt;&gt; md5sum: /path/to/file.xml: Structure needs cleaning<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; in /var/log/glusterfs/mnt-sharedfs.log we see these errors:<br>&gt;&gt;&gt; [2013-12-10 08:07:32.256910] W <br>&gt;&gt;&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-0:

<br>&gt;&gt;&gt; remote operation failed: No such file or directory<br>&gt;&gt;&gt; [2013-12-10 08:07:32.257436] W <br>&gt;&gt;&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-1:

<br>&gt;&gt;&gt; remote operation failed: No such file or directory<br>&gt;&gt;&gt; [2013-12-10 08:07:32.259356] W [fuse-bridge.c:705:fuse_attr_cbk]

<br>&gt;&gt;&gt; 0-glusterfs-fuse: 8230: STAT() /path/to/file.xml =&gt; -1

(Structure <br>&gt;&gt;&gt; needs cleaning)<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; We are using gluster 3.4.1-3 on CentOS6.<br>&gt;&gt;&gt; Our servers are 64-bit, our clients 32-bit (we are already

using <br>&gt;&gt;&gt; --enable-ino32 on the mountpoint)<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; This is my gluster configuration:<br>&gt;&gt;&gt; Volume Name: testvolume<br>&gt;&gt;&gt; Type: Replicate<br>&gt;&gt;&gt; Volume ID: ca9c2f87-5d5b-4439-ac32-b7c138916df7<br>&gt;&gt;&gt; Status: Started<br>&gt;&gt;&gt; Number of Bricks: 1 x 2 = 2<br>&gt;&gt;&gt; Transport-type: tcp<br>&gt;&gt;&gt; Bricks:<br>&gt;&gt;&gt; Brick1: SRV-1:/gluster/brick1<br>&gt;&gt;&gt; Brick2: SRV-2:/gluster/brick2<br>&gt;&gt;&gt; Options Reconfigured:<br>&gt;&gt;&gt; performance.force-readdirp: on<br>&gt;&gt;&gt; performance.stat-prefetch: off<br>&gt;&gt;&gt; network.ping-timeout: 5<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; And this is how the applications work:<br>&gt;&gt;&gt; We have 2 client nodes who both have a fuse.glusterfs mountpoint.<br>&gt;&gt;&gt; On 1 client node we have a application which writes files.<br>&gt;&gt;&gt; On the other client node we have a application which reads

these files.<br>&gt;&gt;&gt; On the node where the files are written we don't see any problem,

<br>&gt;&gt;&gt; and can read that file without problems.<br>&gt;&gt;&gt; On the other node we have problems (error messages above)

reading <br>&gt;&gt;&gt; that file.<br>&gt;&gt;&gt; The problem occurs when we perform a md5sum on the exact file,

when <br>&gt;&gt;&gt; perform a md5sum on all files in that directory there is no

problem.<br>&gt;&gt;&gt;<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; How can we solve this problem as this is annoying.<br>&gt;&gt;&gt; The problem occurs after some time (can be days), an umount

and <br>&gt;&gt;&gt; mount of the mountpoint solves it for some days.<br>&gt;&gt;&gt; Once it occurs (and we don't remount) it occurs every time.<br>&gt;&gt;&gt;<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; I hope someone can help me with this problems.<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; Thanks,<br>&gt;&gt;&gt; Johan Huysmans<br>&gt;&gt;&gt; _______________________________________________<br>&gt;&gt;&gt; Gluster-users mailing list<br>&gt;&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt; _______________________________________________<br>&gt;&gt; Gluster-users mailing list<br>&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;<br>&gt;<br>&gt;<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/97a7843c/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/97a7843c/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 3<br>Date: Fri, 13 Dec 2013 00:13:30 +0530<br>From: Vijay Bellur &lt;vbellur@redhat.com&gt;<br>To: James &lt;purpleidea@gmail.com&gt;<br>Cc: 
&quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;,

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

Gluster<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

Devel &lt;gluster-devel@nongnu.org&gt;, Niels de Vos &lt;ndevos@redhat.com&gt;<br>Subject: Re: [Gluster-users] Gluster Community Weekly Meeting<br>Message-ID: &lt;52AA03D2.9030708@redhat.com&gt;<br>Content-Type: text/plain; charset=UTF-8; format=flowed<br><br>On 12/12/2013 10:42 AM, James wrote:<br>&gt; RE: meeting, sorry I couldn't make it, but I have some comments:<br><br>No problem. It would be really good to have everybody in the meeting, <br>but if you cannot comments are definitely welcome :).<br><br>&gt;<br>&gt; 1) About the pre-packaged VM comment's. I've gotten Vagrant working

on<br>&gt; Fedora. I'm using this to rapidly spin up and test GlusterFS.<br>&gt; </font></tt><a href="https://ttboj.wordpress.com/2013/12/09/vagrant-on-fedora-with-libvirt/"><tt><font size=2>https://ttboj.wordpress.com/2013/12/09/vagrant-on-fedora-with-libvirt/</font></tt></a><tt><font size=2><br>&gt; In the coming week or so, I'll be publishing the Vagrant file for

my<br>&gt; GlusterFS setup, but if you really want it now I can send you an early<br>&gt; version. This obviously integrates with Puppet-Gluster, but whether<br>&gt; you use that or not is optional. I think this is the best way to test<br>&gt; GlusterFS. If someone gives me hosting, I could publish &quot;pre-built&quot;<br>&gt; images very easily. Let me know what you think.<br><br>Niels - do you have any thoughts here?<br><br>&gt;<br>&gt; 2) I never heard back from any action items from 2 weeks ago. I think<br>&gt; someone was going to connect me with a way to get access to some VM's<br>&gt; for testing stuff !<br><br>I see that there is an ongoing offline thread now. I think that should

<br>result in you getting those VMs.<br><br>&gt;<br>&gt; 3) Hagarth: &nbsp;RE: typos, I have at least one spell check patch

against<br>&gt; 3.4.1 I sent it to list before, but someone told me to enroll in the<br>&gt; jenkins thing, which wasn't worth it for a small patch. Let me know

if<br>&gt; you want it.<br><br>There are more typos now. I ran a cursory check with misspell-check [1]

<br>and found quite a few. Having those cleaned up on master and release-3.5

<br>would be great. Since there are now more of them, I am sure the patch would be

<br>non-trivial, and having it routed through gerrit would be great. If you

<br>need a how to on getting to gerrit, it is available at [2].<br><br>&gt;<br>&gt; 4a) Someone mentioned documentation. Please feel free to merge in<br>&gt; </font></tt><a href="https://github.com/purpleidea/puppet-gluster/blob/master/DOCUMENTATION.md"><tt><font size=2>https://github.com/purpleidea/puppet-gluster/blob/master/DOCUMENTATION.md</font></tt></a><tt><font size=2><br>&gt; (markdown format). I have gone to great lengths to format this so

that<br>&gt; it displays properly in github markdown, and standard (pandoc)<br>&gt; markdown. This way it works on github, and can also be rendered to

a<br>&gt; pdf easily. Example:<br>&gt; </font></tt><a href="https://github.com/purpleidea/puppet-gluster/raw/master/puppet-gluster-documentation.pdf"><tt><font size=2>https://github.com/purpleidea/puppet-gluster/raw/master/puppet-gluster-documentation.pdf</font></tt></a><tt><font size=2><br>&gt; &nbsp; You can use the file as a template!<br><br>Again having this in gerrit would be useful for merging the puppet <br>documentation.<br><br>&gt;<br>&gt; 4b) I think the documentation should be kept in the same repo as<br>&gt; GlusterFS. This way, when you submit a feature branch, it can also<br>&gt; come with documentation. Lots of people work this way. It helps you<br>&gt; get minimal docs there, and/or at least some example code or a few<br>&gt; sentences. Also, looking at the docs, you can see what commits came<br>&gt; with this<br><br>I am with you on this one. After we are done with the planned <br>documentation hackathon, let us open a new thread on this to get more <br>opinions.<br><br>-Vijay<br><br>[1] </font></tt><a href="https://github.com/lyda/misspell-check"><tt><font size=2>https://github.com/lyda/misspell-check</font></tt></a><tt><font size=2><br><br>[2] <br></font></tt><a href=http://www.gluster.org/community/documentation/index.php/Development_Work_Flow><tt><font size=2>http://www.gluster.org/community/documentation/index.php/Development_Work_Flow</font></tt></a><tt><font size=2><br><br>&gt;<br>&gt; Thanks!<br>&gt;<br>&gt; James<br>&gt;<br>&gt;<br><br><br><br>------------------------------<br><br>Message: 4<br>Date: Thu, 12 Dec 2013 13:48:31 -0500<br>From: James &lt;purpleidea@gmail.com&gt;<br>To: Vijay Bellur &lt;vbellur@redhat.com&gt;<br>Cc: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;,

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

Gluster<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

Devel &lt;gluster-devel@nongnu.org&gt;<br>Subject: Re: [Gluster-users] Gluster Community Weekly Meeting<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CADCaTgqJrJ6uTyGiti+q0SpXxMjE+m-vr6yeohgxP3xEdYyt5w@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=UTF-8<br><br>On Thu, Dec 12, 2013 at 1:43 PM, Vijay Bellur &lt;vbellur@redhat.com&gt;

wrote:<br>&gt;&gt; 4a) Someone mentioned documentation. Please feel free to merge

in<br>&gt;&gt; </font></tt><a href="https://github.com/purpleidea/puppet-gluster/blob/master/DOCUMENTATION.md"><tt><font size=2>https://github.com/purpleidea/puppet-gluster/blob/master/DOCUMENTATION.md</font></tt></a><tt><font size=2><br>&gt;&gt; (markdown format). I have gone to great lengths to format this

so that<br>&gt;&gt; it displays properly in github markdown, and standard (pandoc)<br>&gt;&gt; markdown. This way it works on github, and can also be rendered

to a<br>&gt;&gt; pdf easily. Example:<br>&gt;&gt;<br>&gt;&gt; </font></tt><a href="https://github.com/purpleidea/puppet-gluster/raw/master/puppet-gluster-documentation.pdf"><tt><font size=2>https://github.com/purpleidea/puppet-gluster/raw/master/puppet-gluster-documentation.pdf</font></tt></a><tt><font size=2><br>&gt;&gt; &nbsp; You can use the file as a template!<br>&gt;<br>&gt;<br>&gt; Again having this in gerrit would be useful for merging the puppet<br>&gt; documentation.<br><br><br>Okay, I'll try to look into Gerrit and maybe submit a fake patch for testing.<br>When and where (in the tree) would be a good time to submit a doc<br>patch? It's probably best to wait until after your docs hackathon,<br>right?<br><br><br>------------------------------<br><br>Message: 5<br>Date: Fri, 13 Dec 2013 00:30:09 +0530<br>From: Vijay Bellur &lt;vbellur@redhat.com&gt;<br>To: James &lt;purpleidea@gmail.com&gt;<br>Cc: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;,

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

Gluster<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

Devel &lt;gluster-devel@nongnu.org&gt;<br>Subject: Re: [Gluster-users] Gluster Community Weekly Meeting<br>Message-ID: &lt;52AA07B9.9030807@redhat.com&gt;<br>Content-Type: text/plain; charset=UTF-8; format=flowed<br><br>On 12/13/2013 12:18 AM, James wrote:<br>&gt; On Thu, Dec 12, 2013 at 1:43 PM, Vijay Bellur &lt;vbellur@redhat.com&gt;

wrote:<br>&gt;&gt;&gt; 4a) Someone mentioned documentation. Please feel free to merge

in<br>&gt;&gt;&gt; </font></tt><a href="https://github.com/purpleidea/puppet-gluster/blob/master/DOCUMENTATION.md"><tt><font size=2>https://github.com/purpleidea/puppet-gluster/blob/master/DOCUMENTATION.md</font></tt></a><tt><font size=2><br>&gt;&gt;&gt; (markdown format). I have gone to great lengths to format

this so that<br>&gt;&gt;&gt; it displays properly in github markdown, and standard (pandoc)<br>&gt;&gt;&gt; markdown. This way it works on github, and can also be rendered

to a<br>&gt;&gt;&gt; pdf easily. Example:<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; </font></tt><a href="https://github.com/purpleidea/puppet-gluster/raw/master/puppet-gluster-documentation.pdf"><tt><font size=2>https://github.com/purpleidea/puppet-gluster/raw/master/puppet-gluster-documentation.pdf</font></tt></a><tt><font size=2><br>&gt;&gt;&gt; &nbsp; &nbsp;You can use the file as a template!<br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt; Again having this in gerrit would be useful for merging the puppet<br>&gt;&gt; documentation.<br>&gt;<br>&gt;<br>&gt; Okay, I'll try to look into Gerrit and maybe submit a fake patch for

testing.<br>&gt; When and where (in the tree) would be a good time to submit a doc<br>&gt; patch? It's probably best to wait until after your docs hackathon,<br>&gt; right?<br>&gt;<br><br>Just added a page in preparation for the documentation hackathon:<br><br></font></tt><a href=http://www.gluster.org/community/documentation/index.php/Submitting_Documentation_Patches><tt><font size=2>http://www.gluster.org/community/documentation/index.php/Submitting_Documentation_Patches</font></tt></a><tt><font size=2><br><br>I think the puppet guide can be under a new hierarchy located at <br>doc/deploy-guide/markdown/en-US/. You can certainly submit the puppet <br>doc patch as part of the hackathon.<br><br>-Vijay<br><br><br><br>------------------------------<br><br>Message: 6<br>Date: Thu, 12 Dec 2013 21:46:12 +0100<br>From: &quot;Maik Kulbe&quot; &lt;info@linux-web-development.de&gt;<br>To: &quot;Johan Huysmans&quot; &lt;johan.huysmans@inuits.be&gt;,<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Structure needs cleaning on some files<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;026e5e3d74c6956119dcde7b393990c4@linux-web-development.de&gt;<br>Content-Type: text/plain; charset=&quot;utf-8&quot;; Format=&quot;flowed&quot;<br><br>How do you mount your Client? FUSE? I had similar problems when playing

around with the timeout options for the FUSE mount. If they are too high,

the client caches the metadata for too long. When you move the file, the inode

should stay the same, and on the second node the path should stay in the cache

for a while, so the client still knows the inode for the moved file's old path and

can act on the file without knowing its path.<br><br>The problems kick in when you delete a file and recreate it: the cache

tries to access the old inode, which was deleted, and so throws errors.

If I recall correctly, &quot;structure needs cleaning&quot; is one of

two error messages I got, depending on which of the timeout mount options

was set to a higher value.<br><br>-----Original Mail-----<br>From: Johan Huysmans [johan.huysmans@inuits.be]<br>Sent: 12.12.13 - 14:51:35<br>To: gluster-users@gluster.org [gluster-users@gluster.org]<br><br>Subject: Re: [Gluster-users] Structure needs cleaning on some files<br><br>&gt; I created a bug for this issue:<br>&gt;<br>&gt; </font></tt><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1041109"><tt><font size=2>https://bugzilla.redhat.com/show_bug.cgi?id=1041109</font></tt></a><tt><font size=2><br>&gt;<br>&gt; gr.<br>&gt; Johan<br>&gt;<br>&gt; On 10-12-13 12:52, Johan Huysmans wrote:<br>&gt;<br>&gt; Hi All,<br>&gt;<br>&gt; It seems I can easily reproduce the problem.<br>&gt;<br>&gt; * on node 1 create a file (touch , cat , ...).<br>&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file)<br>&gt; * on node 1 move file to other name (mv file file1)<br>&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file), this

is<br>&gt; still working although the file is not really there<br>&gt; * on node 1 change file content<br>&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file), this

is<br>&gt; still working and has a changed md5sum<br>&gt;<br>&gt; This is really strange behaviour.<br>&gt; Is this normal, can this be altered with a a setting?<br>&gt;<br>&gt; Thanks for any info,<br>&gt; gr.<br>&gt; Johan<br>&gt;<br>&gt; On 10-12-13 10:02, Johan Huysmans wrote:<br>&gt;<br>&gt; I could reproduce this problem with while my mount point is running

in<br>&gt; debug mode.<br>&gt; logfile is attached.<br>&gt;<br>&gt; gr.<br>&gt; Johan Huysmans<br>&gt;<br>&gt; On 10-12-13 09:30, Johan Huysmans wrote:<br>&gt;<br>&gt; Hi All,<br>&gt;<br>&gt; When reading some files we get this error:<br>&gt; md5sum: /path/to/file.xml: Structure needs cleaning<br>&gt;<br>&gt; in /var/log/glusterfs/mnt-sharedfs.log we see these errors:<br>&gt; [2013-12-10 08:07:32.256910] W<br>&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-0:<br>&gt; remote operation failed: No such file or directory<br>&gt; [2013-12-10 08:07:32.257436] W<br>&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-1:<br>&gt; remote operation failed: No such file or directory<br>&gt; [2013-12-10 08:07:32.259356] W [fuse-bridge.c:705:fuse_attr_cbk]<br>&gt; 0-glusterfs-fuse: 8230: STAT() /path/to/file.xml =&gt; -1 (Structure<br>&gt; needs cleaning)<br>&gt;<br>&gt; We are using gluster 3.4.1-3 on CentOS6.<br>&gt; Our servers are 64-bit, our clients 32-bit (we are already using<br>&gt; --enable-ino32 on the mountpoint)<br>&gt;<br>&gt; This is my gluster configuration:<br>&gt; Volume Name: testvolume<br>&gt; Type: Replicate<br>&gt; Volume ID: ca9c2f87-5d5b-4439-ac32-b7c138916df7<br>&gt; Status: Started<br>&gt; Number of Bricks: 1 x 2 = 2<br>&gt; Transport-type: tcp<br>&gt; Bricks:<br>&gt; Brick1: SRV-1:/gluster/brick1<br>&gt; Brick2: SRV-2:/gluster/brick2<br>&gt; Options Reconfigured:<br>&gt; performance.force-readdirp: on<br>&gt; performance.stat-prefetch: off<br>&gt; network.ping-timeout: 5<br>&gt;<br>&gt; And this is how the applications work:<br>&gt; We have 2 client nodes who both have a fuse.glusterfs mountpoint.<br>&gt; On 1 client node we have a application which writes files.<br>&gt; On the other client node we have a application which reads these<br>&gt; files.<br>&gt; On the node where the files are written we don't see any problem,<br>&gt; and can read that file without problems.<br>&gt; On the other node we have problems (error 
messages above) reading<br>&gt; that file.<br>&gt; The problem occurs when we perform a md5sum on the exact file, when<br>&gt; perform a md5sum on all files in that directory there is no problem.<br>&gt;<br>&gt; How can we solve this problem as this is annoying.<br>&gt; The problem occurs after some time (can be days), an umount and<br>&gt; mount of the mountpoint solves it for some days.<br>&gt; Once it occurs (and we don't remount) it occurs every time.<br>&gt;<br>&gt; I hope someone can help me with this problems.<br>&gt;<br>&gt; Thanks,<br>&gt; Johan Huysmans<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>-------------- next part --------------<br>A non-text attachment was scrubbed...<br>Name: smime.p7s<br>Type: application/x-pkcs7-signature<br>Size: 2332 bytes<br>Desc: not available</font></tt><br><tt><font size=2>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/99d59ead/attachment-0001.bin"><tt><font 
size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/99d59ead/attachment-0001.bin</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 7<br>Date: Thu, 12 Dec 2013 13:26:56 -0800<br>From: Anand Avati &lt;avati@gluster.org&gt;<br>To: Maik Kulbe &lt;info@linux-web-development.de&gt;<br>Cc: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Structure needs cleaning on some files<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CAFboF2x1CraXbYSokGt1jhOhBCny+9LRPzASt-Wk8LYqx6nFdw@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>I have the same question. Do you have an excessively high --entry-timeout<br>parameter on your FUSE mount? In any case, the &quot;Structure needs cleaning&quot;

error<br>should not surface up to FUSE, and that is still a bug.<br><br><br>On Thu, Dec 12, 2013 at 12:46 PM, Maik Kulbe<br>&lt;info@linux-web-development.de&gt; wrote:<br><br>&gt; How do you mount your Client? FUSE? I had similar problems when playing<br>&gt; around with the timeout options for the FUSE mount. If they are too

high,<br>&gt; they cache the metadata for too long. When you move the file, the inode<br>&gt; should stay the same, and on the second node the path should stay in

cache<br>&gt; for a while, so it still knows the inode for that moved file's old path

and thus<br>&gt; can act on the file without knowing its path.<br>&gt;<br>&gt; The problems kick in when you delete a file and recreate it: the

cache<br>&gt; tries to access the old inode, which was deleted, thus throwing errors.

If<br>&gt; I recall correctly, &quot;structure needs cleaning&quot; is one

of two error<br>&gt; messages I got, depending on which of the timeout mount options was

set to<br>&gt; a higher value.<br>&gt;<br>&gt; -----Original Mail-----<br>&gt; From: Johan Huysmans [johan.huysmans@inuits.be]<br>&gt; Sent: 12.12.13 - 14:51:35<br>&gt; To: gluster-users@gluster.org [gluster-users@gluster.org]<br>&gt;<br>&gt; Subject: Re: [Gluster-users] Structure needs cleaning on some files<br>&gt;<br>&gt;<br>&gt; &nbsp;I created a bug for this issue:<br>&gt;&gt;<br>&gt;&gt; </font></tt><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1041109"><tt><font size=2>https://bugzilla.redhat.com/show_bug.cgi?id=1041109</font></tt></a><tt><font size=2><br>&gt;&gt;<br>&gt;&gt; gr.<br>&gt;&gt; Johan<br>&gt;&gt;<br>&gt;&gt; On 10-12-13 12:52, Johan Huysmans wrote:<br>&gt;&gt;<br>&gt;&gt; Hi All,<br>&gt;&gt;<br>&gt;&gt; It seems I can easily reproduce the problem.<br>&gt;&gt;<br>&gt;&gt; * on node 1 create a file (touch , cat , ...).<br>&gt;&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file)<br>&gt;&gt; * on node 1 move file to other name (mv file file1)<br>&gt;&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file),

this is<br>&gt;&gt; still working although the file is not really there<br>&gt;&gt; * on node 1 change file content<br>&gt;&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file),

this is<br>&gt;&gt; still working and has a changed md5sum<br>&gt;&gt;<br>&gt;&gt; This is really strange behaviour.<br>&gt;&gt; Is this normal, can this be altered with a setting?<br>&gt;&gt;<br>&gt;&gt; Thanks for any info,<br>&gt;&gt; gr.<br>&gt;&gt; Johan<br>&gt;&gt;<br>&gt;&gt; On 10-12-13 10:02, Johan Huysmans wrote:<br>&gt;&gt;<br>&gt;&gt; I could reproduce this problem while my mount point is running

in<br>&gt;&gt; debug mode.<br>&gt;&gt; logfile is attached.<br>&gt;&gt;<br>&gt;&gt; gr.<br>&gt;&gt; Johan Huysmans<br>&gt;&gt;<br>&gt;&gt; On 10-12-13 09:30, Johan Huysmans wrote:<br>&gt;&gt;<br>&gt;&gt; Hi All,<br>&gt;&gt;<br>&gt;&gt; When reading some files we get this error:<br>&gt;&gt; md5sum: /path/to/file.xml: Structure needs cleaning<br>&gt;&gt;<br>&gt;&gt; in /var/log/glusterfs/mnt-sharedfs.log we see these errors:<br>&gt;&gt; [2013-12-10 08:07:32.256910] W<br>&gt;&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-0:<br>&gt;&gt; remote operation failed: No such file or directory<br>&gt;&gt; [2013-12-10 08:07:32.257436] W<br>&gt;&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-1:<br>&gt;&gt; remote operation failed: No such file or directory<br>&gt;&gt; [2013-12-10 08:07:32.259356] W [fuse-bridge.c:705:fuse_attr_cbk]<br>&gt;&gt; 0-glusterfs-fuse: 8230: STAT() /path/to/file.xml =&gt; -1 (Structure<br>&gt;&gt; needs cleaning)<br>&gt;&gt;<br>&gt;&gt; We are using gluster 3.4.1-3 on CentOS6.<br>&gt;&gt; Our servers are 64-bit, our clients 32-bit (we are already using<br>&gt;&gt; --enable-ino32 on the mountpoint)<br>&gt;&gt;<br>&gt;&gt; This is my gluster configuration:<br>&gt;&gt; Volume Name: testvolume<br>&gt;&gt; Type: Replicate<br>&gt;&gt; Volume ID: ca9c2f87-5d5b-4439-ac32-b7c138916df7<br>&gt;&gt; Status: Started<br>&gt;&gt; Number of Bricks: 1 x 2 = 2<br>&gt;&gt; Transport-type: tcp<br>&gt;&gt; Bricks:<br>&gt;&gt; Brick1: SRV-1:/gluster/brick1<br>&gt;&gt; Brick2: SRV-2:/gluster/brick2<br>&gt;&gt; Options Reconfigured:<br>&gt;&gt; performance.force-readdirp: on<br>&gt;&gt; performance.stat-prefetch: off<br>&gt;&gt; network.ping-timeout: 5<br>&gt;&gt;<br>&gt;&gt; And this is how the applications work:<br>&gt;&gt; We have 2 client nodes who both have a fuse.glusterfs mountpoint.<br>&gt;&gt; On 1 client node we have a application which writes files.<br>&gt;&gt; On the other client node we have a application which reads 
these<br>&gt;&gt; files.<br>&gt;&gt; On the node where the files are written we don't see any problem,<br>&gt;&gt; and can read that file without problems.<br>&gt;&gt; On the other node we have problems (error messages above) reading<br>&gt;&gt; that file.<br>&gt;&gt; The problem occurs when we perform a md5sum on the exact file,

when<br>&gt;&gt; perform a md5sum on all files in that directory there is no problem.<br>&gt;&gt;<br>&gt;&gt; How can we solve this problem as this is annoying.<br>&gt;&gt; The problem occurs after some time (can be days), an umount and<br>&gt;&gt; mount of the mountpoint solves it for some days.<br>&gt;&gt; Once it occurs (and we don't remount) it occurs every time.<br>&gt;&gt;<br>&gt;&gt; I hope someone can help me with this problems.<br>&gt;&gt;<br>&gt;&gt; Thanks,<br>&gt;&gt; Johan Huysmans<br>&gt;&gt; _______________________________________________<br>&gt;&gt; Gluster-users mailing list<br>&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt;<br>&gt;&gt; _______________________________________________<br>&gt;&gt; Gluster-users mailing list<br>&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt;<br>&gt;&gt; _______________________________________________<br>&gt;&gt; Gluster-users mailing list<br>&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt;<br>&gt;<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;<br>-------------- next part --------------<br>An HTML attachment was 
scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/730b11a2/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/730b11a2/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 8<br>Date: Thu, 12 Dec 2013 13:38:18 -0800<br>From: Anand Avati &lt;avati@gluster.org&gt;<br>To: Maik Kulbe &lt;info@linux-web-development.de&gt;<br>Cc: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Structure needs cleaning on some files<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CAFboF2zNOFFbuM9_ayrw6Wv+DdXVOr+D=9Az0cxNx+gHiPEpaw@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>Looks like your issue was fixed by patch </font></tt><a href=http://review.gluster.org/4989/><tt><font size=2>http://review.gluster.org/4989/</font></tt></a><tt><font size=2> in the<br>master branch. Backporting this to release-3.4 now.<br><br>Thanks!<br>Avati<br><br><br>On Thu, Dec 12, 2013 at 1:26 PM, Anand Avati &lt;avati@gluster.org&gt;

wrote:<br><br>&gt; I have the same question. Do you have an excessively high --entry-timeout<br>&gt; parameter on your FUSE mount? In any case, the &quot;Structure needs cleaning&quot;

error<br>&gt; should not surface up to FUSE, and that is still a bug.<br>&gt;<br>&gt;<br>&gt; On Thu, Dec 12, 2013 at 12:46 PM, Maik Kulbe &lt;<br>&gt; info@linux-web-development.de&gt; wrote:<br>&gt;<br>&gt;&gt; How do you mount your Client? FUSE? I had similar problems when

playing<br>&gt;&gt; around with the timeout options for the FUSE mount. If they are

too high,<br>&gt;&gt; they cache the metadata for too long. When you move the file, the

inode<br>&gt;&gt; should stay the same, and on the second node the path should stay

in cache<br>&gt;&gt; for a while, so it still knows the inode for that moved file's old

path and thus<br>&gt;&gt; can act on the file without knowing its path.<br>&gt;&gt;<br>&gt;&gt; The problems kick in when you delete a file and recreate it:

the cache<br>&gt;&gt; tries to access the old inode, which was deleted, thus throwing

errors. If<br>&gt;&gt; I recall correctly, &quot;structure needs cleaning&quot; is

one of two error<br>&gt;&gt; messages I got, depending on which of the timeout mount options

was set to<br>&gt;&gt; a higher value.<br>&gt;&gt;<br>&gt;&gt; -----Original Mail-----<br>&gt;&gt; From: Johan Huysmans [johan.huysmans@inuits.be]<br>&gt;&gt; Sent: 12.12.13 - 14:51:35<br>&gt;&gt; To: gluster-users@gluster.org [gluster-users@gluster.org]<br>&gt;&gt;<br>&gt;&gt; Subject: Re: [Gluster-users] Structure needs cleaning on some

files<br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt; &nbsp;I created a bug for this issue:<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; </font></tt><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1041109"><tt><font size=2>https://bugzilla.redhat.com/show_bug.cgi?id=1041109</font></tt></a><tt><font size=2><br>&gt;&gt;&gt;<br>&gt;&gt;&gt; gr.<br>&gt;&gt;&gt; Johan<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; On 10-12-13 12:52, Johan Huysmans wrote:<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; Hi All,<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; It seems I can easily reproduce the problem.<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; * on node 1 create a file (touch , cat , ...).<br>&gt;&gt;&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file)<br>&gt;&gt;&gt; * on node 1 move file to other name (mv file file1)<br>&gt;&gt;&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file),

this is<br>&gt;&gt;&gt; still working although the file is not really there<br>&gt;&gt;&gt; * on node 1 change file content<br>&gt;&gt;&gt; * on node 2 take md5sum of direct file (md5sum /path/to/file),

this is<br>&gt;&gt;&gt; still working and has a changed md5sum<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; This is really strange behaviour.<br>&gt;&gt;&gt; Is this normal, can this be altered with a setting?<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; Thanks for any info,<br>&gt;&gt;&gt; gr.<br>&gt;&gt;&gt; Johan<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; On 10-12-13 10:02, Johan Huysmans wrote:<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; I could reproduce this problem while my mount point is

running in<br>&gt;&gt;&gt; debug mode.<br>&gt;&gt;&gt; logfile is attached.<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; gr.<br>&gt;&gt;&gt; Johan Huysmans<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; On 10-12-13 09:30, Johan Huysmans wrote:<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; Hi All,<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; When reading some files we get this error:<br>&gt;&gt;&gt; md5sum: /path/to/file.xml: Structure needs cleaning<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; in /var/log/glusterfs/mnt-sharedfs.log we see these errors:<br>&gt;&gt;&gt; [2013-12-10 08:07:32.256910] W<br>&gt;&gt;&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-0:<br>&gt;&gt;&gt; remote operation failed: No such file or directory<br>&gt;&gt;&gt; [2013-12-10 08:07:32.257436] W<br>&gt;&gt;&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-1:<br>&gt;&gt;&gt; remote operation failed: No such file or directory<br>&gt;&gt;&gt; [2013-12-10 08:07:32.259356] W [fuse-bridge.c:705:fuse_attr_cbk]<br>&gt;&gt;&gt; 0-glusterfs-fuse: 8230: STAT() /path/to/file.xml =&gt; -1

(Structure<br>&gt;&gt;&gt; needs cleaning)<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; We are using gluster 3.4.1-3 on CentOS6.<br>&gt;&gt;&gt; Our servers are 64-bit, our clients 32-bit (we are already

using<br>&gt;&gt;&gt; --enable-ino32 on the mountpoint)<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; This is my gluster configuration:<br>&gt;&gt;&gt; Volume Name: testvolume<br>&gt;&gt;&gt; Type: Replicate<br>&gt;&gt;&gt; Volume ID: ca9c2f87-5d5b-4439-ac32-b7c138916df7<br>&gt;&gt;&gt; Status: Started<br>&gt;&gt;&gt; Number of Bricks: 1 x 2 = 2<br>&gt;&gt;&gt; Transport-type: tcp<br>&gt;&gt;&gt; Bricks:<br>&gt;&gt;&gt; Brick1: SRV-1:/gluster/brick1<br>&gt;&gt;&gt; Brick2: SRV-2:/gluster/brick2<br>&gt;&gt;&gt; Options Reconfigured:<br>&gt;&gt;&gt; performance.force-readdirp: on<br>&gt;&gt;&gt; performance.stat-prefetch: off<br>&gt;&gt;&gt; network.ping-timeout: 5<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; And this is how the applications work:<br>&gt;&gt;&gt; We have 2 client nodes who both have a fuse.glusterfs mountpoint.<br>&gt;&gt;&gt; On 1 client node we have a application which writes files.<br>&gt;&gt;&gt; On the other client node we have a application which reads

these<br>&gt;&gt;&gt; files.<br>&gt;&gt;&gt; On the node where the files are written we don't see any problem,<br>&gt;&gt;&gt; and can read that file without problems.<br>&gt;&gt;&gt; On the other node we have problems (error messages above)

reading<br>&gt;&gt;&gt; that file.<br>&gt;&gt;&gt; The problem occurs when we perform a md5sum on the exact file,

when<br>&gt;&gt;&gt; perform a md5sum on all files in that directory there is no

problem.<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; How can we solve this problem as this is annoying.<br>&gt;&gt;&gt; The problem occurs after some time (can be days), an umount

and<br>&gt;&gt;&gt; mount of the mountpoint solves it for some days.<br>&gt;&gt;&gt; Once it occurs (and we don't remount) it occurs every time.<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; I hope someone can help me with this problems.<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; Thanks,<br>&gt;&gt;&gt; Johan Huysmans<br>&gt;&gt;&gt; _______________________________________________<br>&gt;&gt;&gt; Gluster-users mailing list<br>&gt;&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt;&gt;<br>&gt;&gt;&gt; _______________________________________________<br>&gt;&gt;&gt; Gluster-users mailing list<br>&gt;&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt;&gt;<br>&gt;&gt;&gt; _______________________________________________<br>&gt;&gt;&gt; Gluster-users mailing list<br>&gt;&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt;&gt;<br>&gt;&gt;<br>&gt;&gt; _______________________________________________<br>&gt;&gt; Gluster-users mailing list<br>&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt;<br>&gt;<br>&gt;<br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a 
href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/f3934ac1/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/f3934ac1/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 9<br>Date: Thu, 12 Dec 2013 17:35:02 -0500<br>From: James &lt;purpleidea@gmail.com&gt;<br>To: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;,

<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Gluster Devel &lt;gluster-devel@nongnu.org&gt;<br>Subject: [Gluster-users] Gerrit doesn't use HTTPS<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CADCaTgrmcoJuNL4=vSCDGjjSSND8_eojSEjgUiocHV_cVskKxA@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=UTF-8<br><br>I just noticed that the Gluster Gerrit [1] doesn't use HTTPS!<br><br>Can this be fixed ASAP?<br><br>Cheers,<br>James<br><br>[1] </font></tt><a href=http://review.gluster.org/><tt><font size=2>http://review.gluster.org/</font></tt></a><tt><font size=2><br><br><br>------------------------------<br><br>Message: 10<br>Date: Thu, 12 Dec 2013 17:03:12 -0800<br>From: harry mangalam &lt;harry.mangalam@uci.edu&gt;<br>To: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: [Gluster-users] gluster fails under heavy array job load load<br>Message-ID: &lt;2288372.PhVIIXonbv@stunted&gt;<br>Content-Type: text/plain; charset=&quot;us-ascii&quot;<br><br>Hi All,<br>(Gluster Volume Details at bottom)<br><br>I've posted some of this previously, but even after various upgrades, <br>attempted fixes, etc, it remains a problem.<br><br><br>Short version: &nbsp;Our gluster fs (~340TB) provides scratch space for

a ~5000core <br>academic compute cluster. &nbsp;<br>Much of our load is streaming IO, doing a lot of genomics work, and that

is <br>the load under which we saw this latest failure.<br>Under heavy batch load, especially array jobs, where there might be several

<br>64core nodes doing I/O on the 4servers/8bricks, we often get job failures

that <br>have the following profile:<br><br>Client POV:<br>Here is a sampling of the client logs (/var/log/glusterfs/gl.log) for all

<br>compute nodes that indicated interaction with the user's files<br>&lt;</font></tt><a href=http://pastie.org/8548781><tt><font size=2>http://pastie.org/8548781</font></tt></a><tt><font size=2>&gt;<br><br>Here are some client Info logs that seem fairly serious:<br>&lt;</font></tt><a href=http://pastie.org/8548785><tt><font size=2>http://pastie.org/8548785</font></tt></a><tt><font size=2>&gt;<br><br>The errors that referenced this user were gathered from all the nodes that

<br>were running his code (in compute*) and agglomerated with:<br><br>cut -f2,3 -d']' compute* |cut -f1 -dP | sort | uniq -c | sort -gr <br><br>and placed here to show the profile of errors that his run generated.<br>&lt;</font></tt><a href=http://pastie.org/8548796><tt><font size=2>http://pastie.org/8548796</font></tt></a><tt><font size=2>&gt;<br><br>so 71 of them were:<br> &nbsp;W [client-rpc-fops.c:2624:client3_3_lookup_cbk] 0-gl-client-7: remote

<br>operation failed: Transport endpoint is not connected. <br>etc<br><br>We've seen this before and previously discounted it bc it seems to have

been <br>related to the problem of spurious NFS-related bugs, but now I'm wondering

<br>whether it's a real problem. <br>Also the 'remote operation failed: Stale file handle. ' warnings.<br><br>There were no Errors logged per se, tho some of the W's looked fairly nasty,

<br>like the 'dht_layout_dir_mismatch'<br><br>From the server side, however, during the same period, there were:<br>0 Warnings about this user's files<br>0 Errors <br>458 Info lines<br>of which only 1 line was not a 'cleanup' line like this:<br>---<br>10.2.7.11:[2013-12-12 21:22:01.064289] I [server-helpers.c:460:do_fd_cleanup]

<br>0-gl-server: fd cleanup on /path/to/file<br>---<br>it was:<br>---<br>10.2.7.14:[2013-12-12 21:00:35.209015] I [server-rpc-<br>fops.c:898:_gf_server_log_setxattr_failure] 0-gl-server: 113697332: SETXATTR

<br>/bio/tdlong/RNAseqIII/ckpt.1084030 (c9488341-c063-4175-8492-75e2e282f690)

==&gt; <br>trusted.glusterfs.dht<br>---<br><br>We're losing about 10% of these kinds of array jobs bc of this, which is

just <br>not supportable.<br><br><br><br>Gluster details<br><br>servers and clients running gluster 3.4.0-8.el6 over QDR IB, IPoIB, thru

2 <br>Mellanox, 1 Voltaire switches, Mellanox cards, CentOS 6.4<br><br>$ gluster volume info<br> <br>Volume Name: gl<br>Type: Distribute<br>Volume ID: 21f480f7-fc5a-4fd8-a084-3964634a9332<br>Status: Started<br>Number of Bricks: 8<br>Transport-type: tcp,rdma<br>Bricks:<br>Brick1: bs2:/raid1<br>Brick2: bs2:/raid2<br>Brick3: bs3:/raid1<br>Brick4: bs3:/raid2<br>Brick5: bs4:/raid1<br>Brick6: bs4:/raid2<br>Brick7: bs1:/raid1<br>Brick8: bs1:/raid2<br>Options Reconfigured:<br>performance.write-behind-window-size: 1024MB<br>performance.flush-behind: on<br>performance.cache-size: 268435456<br>nfs.disable: on<br>performance.io-cache: on<br>performance.quick-read: on<br>performance.io-thread-count: 64<br>auth.allow: 10.2.*.*,10.1.*.*<br><br><br>'gluster volume status gl detail': <br>&lt;</font></tt><a href=http://pastie.org/8548826><tt><font size=2>http://pastie.org/8548826</font></tt></a><tt><font size=2>&gt;<br><br>---<br>Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine<br>[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487<br>415 South Circle View Dr, Irvine, CA, 92697 [shipping]<br>MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)<br>---<br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/f804e657/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/f804e657/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 11<br>Date: Thu, 12 Dec 2013 22:28:57 -0700<br>From: Joe Topjian &lt;joe@topjian.net&gt;<br>To: gluster-users@gluster.org<br>Subject: [Gluster-users] qemu remote insecure connections<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CA+y7hvj8LR2mQ9-AbroXLOCb9jQu+M6GLPAduVfs5a3UGcWBEw@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>Hello,<br><br>I'm having a problem getting remote servers to connect to Gluster with

qemu.<br><br>I have 5 servers, 4 of which run Gluster and host a volume. The qemu user<br>on all 5 servers has the same uid.<br><br>storage.owner-uid and storage.owner-gid are set to that user.<br><br>In addition, server.allow-insecure is on and is also set in the<br>glusterd.vol file. glusterd has also been restarted (numerous times).<br><br>When attempting to create a qemu file by connecting to the same server,<br>everything works:<br><br>qemu@192.168.1.11&gt; qemu-img create gluster://192.168.1.11/volumes/v.img

1M<br>Formatting 'gluster://192.168.1.11/volumes/v.img', fmt=raw size=1048576<br>qemu@192.168.1.11&gt;<br><br>But when trying to do it remotely, the command hangs indefinitely:<br><br>qemu@192.168.1.12&gt; qemu-img create gluster://192.168.1.11/volumes/v.img

1M<br>Formatting 'gluster://192.168.1.11/volumes/v.img', fmt=raw size=1048576<br>^C<br><br>Yet when 192.168.1.12 connects to gluster://192.168.1.12, the command works<br>and the file shows up in the distributed volume.<br><br>Further, when turning server.allow-insecure off, I get an immediate error<br>no matter what the source and destination connection is:<br><br>qemu@192.168.1.12&gt; qemu-img create gluster://192.168.1.11/volumes/v.img

1M<br>Formatting 'gluster://192.168.1.11/volumes/v.img', fmt=raw size=1048576<br>qemu-img: Gluster connection failed for server=192.168.1.11 port=0<br>volume=volumes image=v.img transport=tcp<br>qemu-img: gluster://192.168.1.11/volumes/v.img: error while creating raw:<br>No data available<br><br>Does anyone have any ideas how I can have an unprivileged user connect

to<br>remote gluster servers?<br><br>Thanks,<br>Joe<br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/86b86bc1/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/86b86bc1/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 12<br>Date: Fri, 13 Dec 2013 11:43:42 +0530<br>From: Vijay Bellur &lt;vbellur@redhat.com&gt;<br>To: &quot;'gluster-devel@nongnu.org'&quot; &lt;gluster-devel@nongnu.org&gt;<br>Cc: gluster-users Discussion List &lt;Gluster-users@gluster.org&gt;<br>Subject: [Gluster-users] Documentation hackathon for 3.5<br>Message-ID: &lt;52AAA596.6020908@redhat.com&gt;<br>Content-Type: text/plain; charset=ISO-8859-1; format=flowed<br><br>Hi All,<br><br>The documentation hackathon for 3.5 is underway. You can find more <br>details here [1].<br><br>Anybody who submits a documentation patch that gets accepted between now

<br>and next week will stand a chance to get some swag :).<br><br>Keep your patches coming!<br><br>Cheers,<br>Vijay<br><br>[1] <br></font></tt><a href=http://www.gluster.org/community/documentation/index.php/3.5_Documentation_Hackathon><tt><font size=2>http://www.gluster.org/community/documentation/index.php/3.5_Documentation_Hackathon</font></tt></a><tt><font size=2><br><br><br>------------------------------<br><br>Message: 13<br>Date: Thu, 12 Dec 2013 23:46:03 -0800<br>From: Anand Avati &lt;avati@gluster.org&gt;<br>To: harry mangalam &lt;harry.mangalam@uci.edu&gt;<br>Cc: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] gluster fails under heavy array job load<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

load<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CAFboF2y8HpFB3xo1Mh43VOiv7jf=g22rHnpxzatGEvtHQgYQ6A@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>Please provide the full client and server logs (in a bug report). The<br>snippets give some hints, but are not very meaningful without the full<br>context/history since mount time (they have after-the-fact symptoms, but<br>not the part which shows the reason why disconnects happened).<br><br>Even before looking into the full logs, here are some quick observations:<br><br>- write-behind-window-size = 1024MB seems *excessively* high. Please set<br>this to 1MB (default) and check whether the stability improves.<br><br>- I see RDMA is enabled on the volume. Are you mounting clients through<br>RDMA? If so, for the purpose of diagnostics can you mount through TCP and<br>check whether the stability improves? If you are using RDMA with such a high<br>write-behind-window-size, spurious ping-timeouts are almost a certainty<br>during heavy writes. The RDMA driver has limited flow control, and setting<br>such a high window-size can easily congest all the RDMA buffers, resulting<br>in spurious ping-timeouts and disconnections.<br><br>Avati<br></font></tt><br><tt><font size=2><br>On Thu, Dec 12, 2013 at 5:03 PM, harry mangalam &lt;harry.mangalam@uci.edu&gt; wrote:<br><br>&gt; &nbsp;Hi All,<br>&gt;<br>&gt; (Gluster Volume Details at bottom)<br>&gt;<br>&gt;<br>&gt;<br>&gt; I've posted some of this previously, but even after various upgrades,<br>&gt; attempted fixes, etc, it remains a problem.<br>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt; Short version: Our gluster fs (~340TB) provides scratch space for

a<br>&gt; ~5000core academic compute cluster.<br>&gt;<br>&gt; Much of our load is streaming IO, doing a lot of genomics work, and

that<br>&gt; is the load under which we saw this latest failure.<br>&gt;<br>&gt; Under heavy batch load, especially array jobs, where there might be<br>&gt; several 64core nodes doing I/O on the 4servers/8bricks, we often get

job<br>&gt; failures that have the following profile:<br>&gt;<br>&gt;<br>&gt;<br>&gt; Client POV:<br>&gt;<br>&gt; Here is a sampling of the client logs (/var/log/glusterfs/gl.log)

for all<br>&gt; compute nodes that indicated interaction with the user's files<br>&gt;<br>&gt; &lt;</font></tt><a href=http://pastie.org/8548781><tt><font size=2>http://pastie.org/8548781</font></tt></a><tt><font size=2>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt; Here are some client Info logs that seem fairly serious:<br>&gt;<br>&gt; &lt;</font></tt><a href=http://pastie.org/8548785><tt><font size=2>http://pastie.org/8548785</font></tt></a><tt><font size=2>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt; The errors that referenced this user were gathered from all the nodes

that<br>&gt; were running his code (in compute*) and agglomerated with:<br>&gt;<br>&gt;<br>&gt;<br>&gt; cut -f2,3 -d']' compute* |cut -f1 -dP | sort | uniq -c | sort -gr<br>&gt;<br>&gt;<br>&gt;<br>&gt; and placed here to show the profile of errors that his run generated.<br>&gt;<br>&gt; &lt;</font></tt><a href=http://pastie.org/8548796><tt><font size=2>http://pastie.org/8548796</font></tt></a><tt><font size=2>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt; so 71 of them were:<br>&gt;<br>&gt; W [client-rpc-fops.c:2624:client3_3_lookup_cbk] 0-gl-client-7: remote<br>&gt; operation failed: Transport endpoint is not connected.<br>&gt;<br>&gt; etc<br>&gt;<br>&gt;<br>&gt;<br>&gt; We've seen this before and previously discounted it bc it seems to

have<br>&gt; been related to the problem of spurious NFS-related bugs, but now

I'm<br>&gt; wondering whether it's a real problem.<br>&gt;<br>&gt; Also the 'remote operation failed: Stale file handle. ' warnings.<br>&gt;<br>&gt;<br>&gt;<br>&gt; There were no Errors logged per se, tho some of the W's looked fairly<br>&gt; nasty, like the 'dht_layout_dir_mismatch'<br>&gt;<br>&gt;<br>&gt;<br>&gt; From the server side, however, during the same period, there were:<br>&gt;<br>&gt; 0 Warnings about this user's files<br>&gt;<br>&gt; 0 Errors<br>&gt;<br>&gt; 458 Info lines<br>&gt;<br>&gt; of which only 1 line was not a 'cleanup' line like this:<br>&gt;<br>&gt; ---<br>&gt;<br>&gt; 10.2.7.11:[2013-12-12 21:22:01.064289] I<br>&gt; [server-helpers.c:460:do_fd_cleanup] 0-gl-server: fd cleanup on<br>&gt; /path/to/file<br>&gt;<br>&gt; ---<br>&gt;<br>&gt; it was:<br>&gt;<br>&gt; ---<br>&gt;<br>&gt; 10.2.7.14:[2013-12-12 21:00:35.209015] I<br>&gt; [server-rpc-fops.c:898:_gf_server_log_setxattr_failure] 0-gl-server:<br>&gt; 113697332: SETXATTR /bio/tdlong/RNAseqIII/ckpt.1084030<br>&gt; (c9488341-c063-4175-8492-75e2e282f690) ==&gt; trusted.glusterfs.dht<br>&gt;<br>&gt; ---<br>&gt;<br>&gt;<br>&gt;<br>&gt; We're losing about 10% of these kinds of array jobs bc of this, which

is<br>&gt; just not supportable.<br>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt; Gluster details<br>&gt;<br>&gt;<br>&gt;<br>&gt; servers and clients running gluster 3.4.0-8.el6 over QDR IB, IPoIB,

thru 2<br>&gt; Mellanox, 1 Voltaire switches, Mellanox cards, CentOS 6.4<br>&gt;<br>&gt;<br>&gt;<br>&gt; $ gluster volume info<br>&gt;<br>&gt; &nbsp;Volume Name: gl<br>&gt;<br>&gt; Type: Distribute<br>&gt;<br>&gt; Volume ID: 21f480f7-fc5a-4fd8-a084-3964634a9332<br>&gt;<br>&gt; Status: Started<br>&gt;<br>&gt; Number of Bricks: 8<br>&gt;<br>&gt; Transport-type: tcp,rdma<br>&gt;<br>&gt; Bricks:<br>&gt;<br>&gt; Brick1: bs2:/raid1<br>&gt;<br>&gt; Brick2: bs2:/raid2<br>&gt;<br>&gt; Brick3: bs3:/raid1<br>&gt;<br>&gt; Brick4: bs3:/raid2<br>&gt;<br>&gt; Brick5: bs4:/raid1<br>&gt;<br>&gt; Brick6: bs4:/raid2<br>&gt;<br>&gt; Brick7: bs1:/raid1<br>&gt;<br>&gt; Brick8: bs1:/raid2<br>&gt;<br>&gt; Options Reconfigured:<br>&gt;<br>&gt; performance.write-behind-window-size: 1024MB<br>&gt;<br>&gt; performance.flush-behind: on<br>&gt;<br>&gt; performance.cache-size: 268435456<br>&gt;<br>&gt; nfs.disable: on<br>&gt;<br>&gt; performance.io-cache: on<br>&gt;<br>&gt; performance.quick-read: on<br>&gt;<br>&gt; performance.io-thread-count: 64<br>&gt;<br>&gt; auth.allow: 10.2.*.*,10.1.*.*<br>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt; 'gluster volume status gl detail':<br>&gt;<br>&gt; &lt;</font></tt><a href=http://pastie.org/8548826><tt><font size=2>http://pastie.org/8548826</font></tt></a><tt><font size=2>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt; ---<br>&gt;<br>&gt; Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine<br>&gt;<br>&gt; [m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487<br>&gt;<br>&gt; 415 South Circle View Dr, Irvine, CA, 92697 [shipping]<br>&gt;<br>&gt; MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)<br>&gt;<br>&gt; ---<br>&gt;<br>&gt;<br>&gt;<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font 
size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;<br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/4fea4498/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131212/4fea4498/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 14<br>Date: Fri, 13 Dec 2013 09:30:17 +0100<br>From: Niels de Vos &lt;ndevos@redhat.com&gt;<br>To: Vijay Bellur &lt;vbellur@redhat.com&gt;<br>Cc: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;,

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

Gluster<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

Devel &lt;gluster-devel@nongnu.org&gt;<br>Subject: Re: [Gluster-users] Gluster Community Weekly Meeting<br>Message-ID: &lt;20131213083017.GQ11941@ndevos-laptop.usersys.redhat.com&gt;<br>Content-Type: text/plain; charset=us-ascii<br><br>On Fri, Dec 13, 2013 at 12:13:30AM +0530, Vijay Bellur wrote:<br>&gt; On 12/12/2013 10:42 AM, James wrote:<br>&gt; &gt;RE: meeting, sorry I couldn't make it, but I have some comments:<br>&gt; <br>&gt; No problem. It would be really good to have everybody in the<br>&gt; meeting, but if you cannot, comments are definitely welcome :).<br>&gt; <br>&gt; &gt;<br>&gt; &gt;1) About the pre-packaged VM comments. I've gotten Vagrant working

on<br>&gt; &gt;Fedora. I'm using this to rapidly spin up and test GlusterFS.<br>&gt; &gt;</font></tt><a href="https://ttboj.wordpress.com/2013/12/09/vagrant-on-fedora-with-libvirt/"><tt><font size=2>https://ttboj.wordpress.com/2013/12/09/vagrant-on-fedora-with-libvirt/</font></tt></a><tt><font size=2><br>&gt; &gt;In the coming week or so, I'll be publishing the Vagrant file

for my<br>&gt; &gt;GlusterFS setup, but if you really want it now I can send you

an early<br>&gt; &gt;version. This obviously integrates with Puppet-Gluster, but whether<br>&gt; &gt;you use that or not is optional. I think this is the best way

to test<br>&gt; &gt;GlusterFS. If someone gives me hosting, I could publish &quot;pre-built&quot;<br>&gt; &gt;images very easily. Let me know what you think.<br>&gt; <br>&gt; Niels - do you have any thoughts here? I was thinking of using virt-builder[1] which is already part of Fedora.

&nbsp;<br>Personally I would stick with the Fedora tools, and not use yet <br>something else again. However, I'm more than happy if James builds and

<br>publishes one or more VMs for the test days, of course he is free to use

<br>whatever tools he likes :-)<br><br>The images should be minimal, and I do not expect them to be bigger than

<br>512MB when compressed. Best would be to host them on the <br>download.gluster.org server, at the same location of the packages.<br><br>Niels<br><br>1. </font></tt><a href="http://libguestfs.org/virt-builder.1.html"><tt><font size=2>http://libguestfs.org/virt-builder.1.html</font></tt></a><tt><font size=2><br><br><br>------------------------------<br><br>_______________________________________________<br>Gluster-users mailing list<br>Gluster-users@gluster.org<br></font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br>End of Gluster-users Digest, Vol 68, Issue 14<br>*********************************************<br></font></tt><br><BR>
