<font size=2 face="sans-serif">Gluster community,</font><br><br><font size=2 face="sans-serif">[2013-12-11 04:40:06.609091] W [server-resolve.c:419:resolve_anonfd_simple]

0-server: inode for the gfid (76240621-1362-494d-a70a-f5824c3ce56e) is

not found. anonymous fd creation failed</font><br><font size=2 face="sans-serif">[2013-12-11 04:40:06.610588] W [server-resolve.c:419:resolve_anonfd_simple]

0-server: inode for the gfid (03ada1a2-ee51-4c85-a79f-a72aabde116d) is

not found. anonymous fd creation failed</font><br><font size=2 face="sans-serif">[2013-12-11 04:40:06.616978] W [server-resolve.c:419:resolve_anonfd_simple]

0-server: inode for the gfid (64fbc834-e00b-4afd-800e-97d64a32de92) is

not found. anonymous fd creation failed</font><br><font size=2 face="sans-serif">[2013-12-11 04:40:06.617069] W [server-resolve.c:419:resolve_anonfd_simple]

0-server: inode for the gfid (64fbc834-e00b-4afd-800e-97d64a32de92) is

not found. anonymous fd creation failed</font><br><font size=2 face="sans-serif">[2013-12-11 04:40:06.624845] W [server-resolve.c:419:resolve_anonfd_simple]

0-server: inode for the gfid (27837527-5dea-4367-a050-248a6266b2db) is

not found. anonymous fd creation failed</font><br><font size=2 face="sans-serif">followed by </font><br><font size=2 face="sans-serif"><b>[2013-12-11 04:40:10.462202] W [marker-quota.c:2039:mq_inspect_directory_xattr]

0-devstatic-marker: cannot add a new contribution node</b></font><br><font size=2 face="sans-serif"><b>[2013-12-11 04:40:29.331476] W [marker-quota.c:2039:mq_inspect_directory_xattr]

0-devstatic-marker: cannot add a new contribution node</b></font><br><font size=2 face="sans-serif"><b>[2013-12-11 04:40:53.125088] W [marker-quota.c:2039:mq_inspect_directory_xattr]

0-devstatic-marker: cannot add a new contribution node</b></font><br><font size=2 face="sans-serif"><b>[2013-12-11 04:41:00.975222] W [marker-quota.c:2039:mq_inspect_directory_xattr]

0-devstatic-marker: cannot add a new contribution node</b></font><br><font size=2 face="sans-serif"><b>[2013-12-11 04:41:01.517990] W [marker-quota.c:2039:mq_inspect_directory_xattr]

0-devstatic-marker: cannot add a new contribution node</b></font><br><font size=2 face="sans-serif"><b>Tue Dec 10 22:41:01 CST 2013</b></font><br><font size=2 face="sans-serif"><b>[2013-12-11 04:41:05.874819] W [marker-quota.c:2039:mq_inspect_directory_xattr]

0-devstatic-marker: cannot add a new contribution node</b></font><br><font size=2 face="sans-serif"><b>[2013-12-11 04:41:05.878135] W [marker-quota.c:2039:mq_inspect_directory_xattr]

0-devstatic-marker: cannot add a new contribution node</b></font><br><font size=2 face="sans-serif"><b>Tue Dec 10 22:42:01 CST 2013</b></font><br><font size=2 face="sans-serif"><b>[2013-12-11 04:42:05.136054] W [marker-quota.c:2039:mq_inspect_directory_xattr]

0-devstatic-marker: cannot add a new contribution node</b></font><br><font size=2 face="sans-serif"><b>[2013-12-11 04:42:05.330591] W [marker-quota.c:2039:mq_inspect_directory_xattr]

0-devstatic-marker: cannot add a new contribution node</b></font><br><font size=2 face="sans-serif"><b>[2013-12-11 04:42:41.224927] W [marker-quota.c:2039:mq_inspect_directory_xattr]

0-devstatic-marker: cannot add a new contribution node</b></font><br><br><font size=2 face="sans-serif">Please help me understand what is being

logged in the /var/log/glusterfs/bricks/static-content.log file.</font><br><br><font size=2 face="sans-serif">Here is my config for this particular

brick in a 4-node distributed-replicated design.</font><br><br><br><font size=2 face="sans-serif">cat /var/lib/glusterd/vols/devstatic/devstatic.host2.static-content.vol</font><br><br><font size=2 face="sans-serif">volume devstatic-posix</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; type storage/posix</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; option volume-id 75832afb-f20e-4018-8d74-8550a92233fc</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; option directory /static/content</font><br><font size=2 face="sans-serif">end-volume</font><br><br><font size=2 face="sans-serif">volume devstatic-access-control</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; type features/access-control</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; subvolumes devstatic-posix</font><br><font size=2 face="sans-serif">end-volume</font><br><br><font size=2 face="sans-serif">volume devstatic-locks</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; type features/locks</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; subvolumes devstatic-access-control</font><br><font size=2 face="sans-serif">end-volume</font><br><br><font size=2 face="sans-serif">volume devstatic-io-threads</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; type performance/io-threads</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; subvolumes devstatic-locks</font><br><font size=2 face="sans-serif">end-volume</font><br><br><font size=2 face="sans-serif">volume devstatic-index</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; type features/index</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; option index-base /static/content/.glusterfs/indices</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; subvolumes devstatic-io-threads</font><br><font size=2 face="sans-serif">end-volume</font><br><br><font size=2 face="sans-serif">volume devstatic-marker</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; type features/marker</font><br><font size=2 face="sans-serif">&nbsp; 
&nbsp; option quota on</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; option xtime off</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; option timestamp-file

/var/lib/glusterd/vols/devstatic/marker.tstamp</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; option volume-uuid 75832afb-f20e-4018-8d74-8550a92233fc</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; subvolumes devstatic-index</font><br><font size=2 face="sans-serif">end-volume</font><br><br><font size=2 face="sans-serif">volume /static/content</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; type debug/io-stats</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; option count-fop-hits

off</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; option latency-measurement

off</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; subvolumes devstatic-marker</font><br><font size=2 face="sans-serif">end-volume</font><br><br><font size=2 face="sans-serif">volume devstatic-server</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; type protocol/server</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; option auth.addr./static/content.allow

*</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; option auth.login.6173ce00-d694-4793-a755-cd1d80f5001f.password

13702989-510c-44c1-9bc4-8f1f21b65403</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; option auth.login./static/content.allow

6173ce00-d694-4793-a755-cd1d80f5001f</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; option transport-type

tcp</font><br><font size=2 face="sans-serif">&nbsp; &nbsp; subvolumes /static/content</font><br><font size=2 face="sans-serif">end-volume<br><br>Khoi Mai<br><br></font><br><br><br><br><font size=1 color=#5f5f5f face="sans-serif">From: &nbsp; &nbsp; &nbsp;

&nbsp;</font><font size=1 face="sans-serif">gluster-users-request@gluster.org</font><br><font size=1 color=#5f5f5f face="sans-serif">To: &nbsp; &nbsp; &nbsp;

&nbsp;</font><font size=1 face="sans-serif">gluster-users@gluster.org</font><br><font size=1 color=#5f5f5f face="sans-serif">Date: &nbsp; &nbsp; &nbsp;

&nbsp;</font><font size=1 face="sans-serif">12/10/2013 05:58 AM</font><br><font size=1 color=#5f5f5f face="sans-serif">Subject: &nbsp; &nbsp;

&nbsp; &nbsp;</font><font size=1 face="sans-serif">Gluster-users

Digest, Vol 68, Issue 11</font><br><font size=1 color=#5f5f5f face="sans-serif">Sent by: &nbsp; &nbsp;

&nbsp; &nbsp;</font><font size=1 face="sans-serif">gluster-users-bounces@gluster.org</font><br><hr noshade><br><br><br><tt><font size=2>Today's Topics:<br><br> &nbsp; 1. Re: Testing failover and recovery (Per Hallsmark)<br> &nbsp; 2. Gluster - replica - Unable to self-heal contents of &nbsp; &nbsp;

'/'<br> &nbsp; &nbsp; &nbsp;(possible split-brain) (Alexandru Coseru)<br> &nbsp; 3. Gluster infrastructure question (Heiko Krämer)<br> &nbsp; 4. Re: How reliable is XFS under Gluster? (Kal Black)<br> &nbsp; 5. Re: Gluster infrastructure question (Nux!)<br> &nbsp; 6. Scalability - File system or Object Store (Randy Breunling)<br> &nbsp; 7. Re: Scalability - File system or Object Store (Jay Vyas)<br> &nbsp; 8. Re: Gluster infrastructure question (Joe Julian)<br> &nbsp; 9. Re: [Gluster-devel] GlusterFest Test Weekend - 3.5 Test #1<br> &nbsp; &nbsp; &nbsp;(John Mark Walker)<br> &nbsp;10. Re: Gluster infrastructure question (Nux!)<br> &nbsp;11. compatibility between 3.3 and 3.4 (samuel)<br> &nbsp;12. Re: Gluster infrastructure question (bernhard glomm)<br> &nbsp;13. Re: Gluster infrastructure question (Ben Turner)<br> &nbsp;14. Re: Gluster infrastructure question (Ben Turner)<br> &nbsp;15. Re: Scalability - File system or Object Store (Jeff Darcy)<br> &nbsp;16. Re: Gluster infrastructure question (Dan Mons)<br> &nbsp;17. Re: Gluster infrastructure question (Joe Julian)<br> &nbsp;18. Re: Gluster infrastructure question (Dan Mons)<br> &nbsp;19. Re: [CentOS 6] Upgrade to the glusterfs version in base or in<br> &nbsp; &nbsp; &nbsp;glusterfs-epel (Diep Pham Van)<br> &nbsp;20. Where does the 'date' string in '/var/log/glusterfs/gl.log'<br> &nbsp; &nbsp; &nbsp;come from? (harry mangalam)<br> &nbsp;21. Re: Where does the 'date' string in<br> &nbsp; &nbsp; &nbsp;'/var/log/glusterfs/gl.log' come from? (Sharuzzaman Ahmat Raslan)<br> &nbsp;22. FW: Self Heal Issue GlusterFS 3.3.1 (Bobby Jacob)<br> &nbsp;23. Re: Self Heal Issue GlusterFS 3.3.1 (Joe Julian)<br> &nbsp;24. Pausing rebalance (Franco Broi)<br> &nbsp;25. Re: Where does the 'date' string in<br> &nbsp; &nbsp; &nbsp;'/var/log/glusterfs/gl.log' come from? (Vijay Bellur)<br> &nbsp;26. Re: Pausing rebalance (shishir gowda)<br> &nbsp;27. Re: replace-brick failing - transport.address-family not<br> &nbsp; &nbsp; &nbsp;specified (Vijay Bellur)<br> &nbsp;28. Re: [CentOS 6] Upgrade to the glusterfs version in base or in<br> &nbsp; &nbsp; &nbsp;glusterfs-epel (Vijay Bellur)<br> &nbsp;29. Re: Pausing rebalance (Franco Broi)<br> &nbsp;30. Re: replace-brick failing - transport.address-family not<br> &nbsp; &nbsp; &nbsp;specified (Vijay Bellur)<br> &nbsp;31. Re: Pausing rebalance (Kaushal M)<br> &nbsp;32. Re: Pausing rebalance (Franco Broi)<br> &nbsp;33. Re: Self Heal Issue GlusterFS 3.3.1 (Bobby Jacob)<br> &nbsp;34. Structure needs cleaning on some files (Johan Huysmans)<br> &nbsp;35. Re: replace-brick failing - transport.address-family not<br> &nbsp; &nbsp; &nbsp;specified (Bernhard Glomm)<br> &nbsp;36. Re: Structure needs cleaning on some files (Johan Huysmans)<br> &nbsp;37. Re: Gluster infrastructure question (Heiko Krämer)<br> &nbsp;38. Re: Errors from PHP stat() on files and directories in a<br> &nbsp; &nbsp; &nbsp;glusterfs mount (Johan Huysmans)<br> &nbsp;39. Re: Gluster infrastructure question (Andrew Lau)<br> &nbsp;40. Re: replace-brick failing - transport.address-family not<br> &nbsp; &nbsp; &nbsp;specified (Vijay Bellur)<br> &nbsp;41. Re: Gluster - replica - Unable to self-heal contents of '/'<br> &nbsp; &nbsp; &nbsp;(possible split-brain) (Vijay Bellur)<br> &nbsp;42. Error after crash of Virtual Machine during migration<br> &nbsp; &nbsp; &nbsp;(Mariusz Sobisiak)<br> &nbsp;43. Re: Structure needs cleaning on some files (Johan Huysmans)<br><br><br>----------------------------------------------------------------------<br><br>Message: 1<br>Date: Mon, 9 Dec 2013 14:12:22 +0100<br>From: Per Hallsmark &lt;per@hallsmark.se&gt;<br>To: gluster-users@gluster.org<br>Subject: Re: [Gluster-users] Testing failover and recovery<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CAPaVuL-DL8R3GBNzv9fMJq-rTOYCs=NufTf-B5V7xKpoNML+7Q@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>Hello,<br><br>Interesting, there seem to be several of us with issues regarding recovery,<br>but there are few to no replies... ;-)<br><br>I did some more testing over the weekend. Same initial workload (two<br>glusterfs servers, one client that continuously<br>updates a file with timestamps) and then two easy test cases:<br><br>1. One of the glusterfs servers is constantly rebooting (just an init script<br>that sleeps for 60 seconds before issuing &quot;reboot&quot;).<br><br>2. Similar to 1, but instead of rebooting itself, a server reboots the other<br>glusterfs server, so the result is that a server<br> &nbsp; &nbsp;comes up, waits for a bit and then reboots the other server.<br><br>Over the whole weekend this progressed nicely. The client ran<br>the whole time without issues, and the glusterfs server<br>that comes back (either always the same one or one of the two servers, depending on the<br>test case above) actively gets back into<br>sync and updates its copy of the file.<br><br>So it seems to me that we need to look deeper into the recovery case (of<br>course, but it is interesting to know about the<br>nice and easy use cases as well). I'm surprised that recovery from

a<br>failover (to restore the redundancy) isn't getting<br>more attention here. Are we (and others who have difficulties in this<br>area) running an unusual use case?<br><br>BR,<br>Per<br><br><br>On Wed, Dec 4, 2013 at 12:17 PM, Per Hallsmark &lt;per@hallsmark.se&gt;

wrote:<br><br>&gt; Hello,<br>&gt;<br>&gt; I've found GlusterFS to be an interesting project. I don't have much experience<br>&gt; with it<br>&gt; (although I do with similar use cases using DRBD+NFS setups), so I set up a<br>&gt; test case to try out failover and recovery.<br>&gt;<br>&gt; For this I have a setup with two glusterfs servers (each is a VM)

and one<br>&gt; client (also a VM).<br>&gt; I'm using GlusterFS 3.4 btw.<br>&gt;<br>&gt; The servers manage a gluster volume created as:<br>&gt;<br>&gt; gluster volume create testvol rep 2 transport tcp gs1:/export/vda1/brick<br>&gt; gs2:/export/vda1/brick<br>&gt; gluster volume start testvol<br>&gt; gluster volume set testvol network.ping-timeout 5<br>&gt;<br>&gt; Then the client mounts this volume as:<br>&gt; mount -t glusterfs gs1:/testvol /import/testvol<br>&gt;<br>&gt; Everything seems to work well in normal use cases: I can write/read

to the<br>&gt; volume, take servers down and up again etc.<br>&gt;<br>&gt; As a fault scenario, I'm testing a fault injection like this:<br>&gt;<br>&gt; 1. Continuously writing timestamps to a file on the volume from the

client.<br>&gt; It is automated in a small test script like:<br>&gt; :~/glusterfs-test$ cat scripts/test-gfs-client.sh<br>&gt; #!/bin/sh<br>&gt;<br>&gt; gfs=/import/testvol<br>&gt;<br>&gt; while true; do<br>&gt; date +%s &gt;&gt; $gfs/timestamp.txt<br>&gt; ts=`tail -n 1 $gfs/timestamp.txt`<br>&gt; &nbsp;md5sum=`md5sum $gfs/timestamp.txt | cut -f1 -d&quot; &quot;`<br>&gt; echo &quot;Timestamp = $ts, md5sum = $md5sum&quot;<br>&gt; &nbsp;sleep 1<br>&gt; done<br>&gt; :~/glusterfs-test$<br>&gt;<br>&gt; As can be seen, the client is a quite simple user of the glusterfs

volume.<br>&gt; Low data rate and a single user, for example.<br>&gt;<br>&gt;<br>&gt; 2. Disabling ethernet in one of the VMs (ifconfig eth0 down) to simulate<br>&gt; a broken network.<br>&gt;<br>&gt; 3. After a short while, the failed server is brought back up again (ifconfig<br>&gt; eth0 up).<br>&gt;<br>&gt; Steps 2 and 3 are also automated in a test script like:<br>&gt;<br>&gt; :~/glusterfs-test$ cat scripts/fault-injection.sh<br>&gt; #!/bin/sh<br>&gt;<br>&gt; # fault injection script tailored for two glusterfs nodes named gs1

and gs2<br>&gt;<br>&gt; # plain sh: compare with = and ask hostname(1), as $HOSTNAME may be unset<br>&gt; if [ &quot;$(hostname)&quot; = &quot;gs1&quot; ]; then<br>&gt; peer=&quot;gs2&quot;<br>&gt; else<br>&gt; peer=&quot;gs1&quot;<br>&gt; fi<br>&gt;<br>&gt; inject_eth_fault() {<br>&gt; echo &quot;network down...&quot;<br>&gt; ifconfig eth0 down<br>&gt; &nbsp;sleep 10<br>&gt; ifconfig eth0 up<br>&gt; echo &quot;... and network up again.&quot;<br>&gt; }<br>&gt;<br>&gt; recover() {<br>&gt; echo &quot;recovering from fault...&quot;<br>&gt; service glusterd restart<br>&gt; }<br>&gt;<br>&gt; while true; do<br>&gt; sleep 60<br>&gt; if [ ! -f /tmp/nofault ]; then<br>&gt; if ping -c 1 &quot;$peer&quot;; then<br>&gt; &nbsp;inject_eth_fault<br>&gt; recover<br>&gt; fi<br>&gt; fi<br>&gt; done<br>&gt; :~/glusterfs-test$<br>&gt;<br>&gt;<br>&gt; I then see that:<br>&gt;<br>&gt; A. This goes well the first time: one server leaves the cluster and the

client<br>&gt; hangs for about 8 seconds before being able to write to the volume

again.<br>&gt;<br>&gt; B. When the failed server comes back, I can check from both servers that<br>&gt; they see each other, and &quot;gluster peer status&quot; shows they

believe the other<br>&gt; is in the connected state.<br>&gt;<br>&gt; C. When the failed server comes back, it does not automatically seek<br>&gt; active participation in syncing the volume etc. (the local storage timestamp<br>&gt; file isn't updated).<br>&gt;<br>&gt; D. If I restart the glusterd service (service glusterd restart),

the<br>&gt; failed node seems to get back to how it was before. Not always, though...

The<br>&gt; chance is higher if I have a long time between fault injections (long

= 60<br>&gt; sec or so, with a forced faulty state of 10 sec).<br>&gt; With a period of some minutes, I could have the cluster servicing

the<br>&gt; client OK for 8+ hours at least.<br>&gt; Shortening the period, I'm easily down to about 10-15 minutes.<br>&gt;<br>&gt; E. Sooner or later I enter a state where the two servers seem to

be up,<br>&gt; each seeing its peer (gluster peer status) and such, but neither is serving

the<br>&gt; volume to the client.<br>&gt; I've tried to &quot;heal&quot; the volume in different ways, but it

doesn't help.<br>&gt; Sometimes it is just that one of the timestamp copies on the<br>&gt; servers is ahead, which is simpler, but sometimes both timestamp<br>&gt; files have added data at the end that the other doesn't have.<br>&gt;<br>&gt; To the questions:<br>&gt;<br>&gt; * Is it the case that, from a design perspective, the choice of

the<br>&gt; glusterfs team is that one shouldn't rely solely on glusterfs daemons

being<br>&gt; able to recover from a faulty state? Is there a need for cluster

manager<br>&gt; services (like heartbeat for example) to be part of the setup? That would make<br>&gt; experience C understandable, and one could then use heartbeat or similar<br>&gt; packages to start/stop services.<br>&gt;<br>&gt; * What would then be the recommended procedure to recover from a faulty<br>&gt; glusterfs node (so that experiences D and E do not happen)?<br>&gt;<br>&gt; * What is the expected failover timing (of course depending on config,

but<br>&gt; say with a given ping timeout etc.)?<br>&gt; &nbsp; And the expected recovery timing (with a similar dependency on config)?<br>&gt;<br>&gt; * What/how does the glusterfs team test to make sure that the failover,<br>&gt; recovery/healing functionality etc. works?<br>&gt;<br>&gt; Any opinion on whether the test case is bad is of course also very welcome.<br>&gt;<br>&gt; Best regards,<br>&gt; Per<br>&gt;<br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/69c23114/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/69c23114/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 2<br>Date: Mon, 9 Dec 2013 15:51:31 +0200<br>From: &quot;Alexandru Coseru&quot; &lt;alex.coseru@simplus.ro&gt;<br>To: &lt;gluster-users@gluster.org&gt;<br>Subject: [Gluster-users] Gluster - replica - Unable to self-heal<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

contents of &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

'/' (possible split-brain)<br>Message-ID: &lt;01fe01cef4e5$c3f2cb00$4bd86100$@coseru@simplus.ro&gt;<br>Content-Type: text/plain; charset=&quot;us-ascii&quot;<br><br>Hello,<br><br> <br><br>I'm trying to build a replica volume, on two servers.<br><br> <br><br>The servers are: &nbsp;blade6 and blade7. &nbsp;(another blade1 in the

peer, but with<br>no volumes)<br><br>The volume seems ok, but I cannot mount it from NFS.<br><br> <br><br>Here are some logs:<br><br> <br><br> <br><br>[root@blade6 stor1]# df -h<br><br>/dev/mapper/gluster_stor1 &nbsp;882G &nbsp;200M &nbsp;837G &nbsp; 1% /gluster/stor1<br><br> <br><br>[root@blade7 stor1]# df -h<br><br>/dev/mapper/gluster_fast &nbsp; 846G &nbsp;158G &nbsp;646G &nbsp;20% /gluster/stor_fast<br><br>/dev/mapper/gluster_stor1 &nbsp;882G &nbsp; 72M &nbsp;837G &nbsp; 1% /gluster/stor1<br><br> <br><br>[root@blade6 stor1]# pwd<br><br>/gluster/stor1<br><br>[root@blade6 stor1]# ls -lh<br><br>total 0<br><br> <br><br>[root@blade7 stor1]# pwd<br><br>/gluster/stor1<br><br>[root@blade7 stor1]# ls -lh<br><br>total 0<br><br> <br><br> <br><br>[root@blade6 stor1]# gluster volume info<br><br>Volume Name: stor_fast<br><br>Type: Distribute<br><br>Volume ID: ad82b554-8ff0-4903-be32-f8dcb9420f31<br><br>Status: Started<br><br>Number of Bricks: 1<br><br>Transport-type: tcp<br><br>Bricks:<br><br>Brick1: blade7.xen:/gluster/stor_fast<br><br>Options Reconfigured:<br><br>nfs.port: 2049<br><br> <br><br>Volume Name: stor1<br><br>Type: Replicate<br><br>Volume ID: 6bd88164-86c2-40f6-9846-b21e90303e73<br><br>Status: Started<br><br>Number of Bricks: 1 x 2 = 2<br><br>Transport-type: tcp<br><br>Bricks:<br><br>Brick1: blade7.xen:/gluster/stor1<br><br>Brick2: blade6.xen:/gluster/stor1<br><br>Options Reconfigured:<br><br>nfs.port: 2049<br><br> <br><br> <br><br>[root@blade7 stor1]# gluster volume info<br><br>Volume Name: stor_fast<br><br>Type: Distribute<br><br>Volume ID: ad82b554-8ff0-4903-be32-f8dcb9420f31<br><br>Status: Started<br><br>Number of Bricks: 1<br><br>Transport-type: tcp<br><br>Bricks:<br><br>Brick1: blade7.xen:/gluster/stor_fast<br><br>Options Reconfigured:<br><br>nfs.port: 2049<br><br> <br><br>Volume Name: stor1<br><br>Type: Replicate<br><br>Volume ID: 6bd88164-86c2-40f6-9846-b21e90303e73<br><br>Status: Started<br><br>Number of Bricks: 1 x 2 = 2<br><br>Transport-type: 
tcp<br><br>Bricks:<br><br>Brick1: blade7.xen:/gluster/stor1<br><br>Brick2: blade6.xen:/gluster/stor1<br><br>Options Reconfigured:<br><br>nfs.port: 2049<br><br> <br><br>[root@blade6 stor1]# gluster volume status<br><br>Status of volume: stor_fast<br><br>Gluster process &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; Port &nbsp; &nbsp;Online &nbsp;Pid<br><br>----------------------------------------------------------------------------<br>--<br><br>Brick blade7.xen:/gluster/stor_fast &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;49152

&nbsp; Y &nbsp; &nbsp; &nbsp; 1742<br><br>NFS Server on localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 2049 &nbsp;

&nbsp;Y<br>20074<br><br>NFS Server on blade1.xen &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; 2049 &nbsp; &nbsp;Y &nbsp; &nbsp; &nbsp; 22255<br><br>NFS Server on blade7.xen &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; 2049 &nbsp; &nbsp;Y &nbsp; &nbsp; &nbsp; 7574<br><br> <br><br>There are no active volume tasks<br><br>Status of volume: stor1<br><br>Gluster process &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; Port &nbsp; &nbsp;Online &nbsp;Pid<br><br>----------------------------------------------------------------------------<br>--<br><br>Brick blade7.xen:/gluster/stor1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp;49154 &nbsp; Y &nbsp; &nbsp; &nbsp; 7562<br><br>Brick blade6.xen:/gluster/stor1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp;49154 &nbsp; Y &nbsp; &nbsp; &nbsp; 20053<br><br>NFS Server on localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 2049 &nbsp;

&nbsp;Y<br>20074<br><br>Self-heal Daemon on localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; N/A &nbsp; &nbsp; Y<br>20079<br><br>NFS Server on blade1.xen &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; 2049 &nbsp; &nbsp;Y &nbsp; &nbsp; &nbsp; 22255<br><br>Self-heal Daemon on blade1.xen &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; N/A &nbsp; &nbsp; Y &nbsp; &nbsp; &nbsp; 22260<br><br>NFS Server on blade7.xen &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; 2049 &nbsp; &nbsp;Y &nbsp; &nbsp; &nbsp; 7574<br><br>Self-heal Daemon on blade7.xen &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; N/A &nbsp; &nbsp; Y &nbsp; &nbsp; &nbsp; 7578<br><br> <br><br>There are no active volume tasks<br><br> <br><br>[root@blade7 stor1]# gluster volume status<br><br>Status of volume: stor_fast<br><br>Gluster process &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; Port &nbsp; &nbsp;Online &nbsp;Pid<br><br>----------------------------------------------------------------------------<br>--<br><br>Brick blade7.xen:/gluster/stor_fast &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp;49152 &nbsp; Y &nbsp; &nbsp; &nbsp; 1742<br><br>NFS Server on localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;2049 &nbsp; &nbsp;Y

&nbsp; &nbsp; &nbsp; 7574<br><br>NFS Server on blade6.xen &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 2049 &nbsp; &nbsp;Y

&nbsp; &nbsp; &nbsp; 20074<br><br>NFS Server on blade1.xen &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; 2049 &nbsp; &nbsp;Y &nbsp; &nbsp; &nbsp; 22255<br><br> <br><br>There are no active volume tasks<br><br>Status of volume: stor1<br><br>Gluster process &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; Port &nbsp; &nbsp;Online &nbsp;Pid<br><br>----------------------------------------------------------------------------<br>--<br><br>Brick blade7.xen:/gluster/stor1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp;49154 &nbsp; Y &nbsp; &nbsp; &nbsp; 7562<br><br>Brick blade6.xen:/gluster/stor1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp;49154 &nbsp; Y &nbsp; &nbsp; &nbsp; 20053<br><br>NFS Server on localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 2049 &nbsp;

&nbsp;Y &nbsp; &nbsp; &nbsp; 7574<br><br>Self-heal Daemon on localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; N/A &nbsp; &nbsp; Y &nbsp;

&nbsp; &nbsp; 7578<br><br>NFS Server on blade1.xen &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; 2049 &nbsp; &nbsp;Y &nbsp; &nbsp; &nbsp; 22255<br><br>Self-heal Daemon on blade1.xen &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; N/A &nbsp; &nbsp; Y &nbsp; &nbsp; &nbsp; 22260<br><br>NFS Server on blade6.xen &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;2049 &nbsp; &nbsp;Y

&nbsp; &nbsp; &nbsp; 20074<br><br>Self-heal Daemon on blade6.xen &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;N/A &nbsp; &nbsp; Y &nbsp; &nbsp;

&nbsp; 20079<br><br> <br><br>There are no active volume tasks<br><br> <br><br> <br><br>[root@blade6 stor1]# gluster peer status<br><br>Number of Peers: 2<br><br> <br><br>Hostname: blade1.xen<br><br>Port: 24007<br><br>Uuid: 194a57a7-cb0e-43de-a042-0ac4026fd07b<br><br>State: Peer in Cluster (Connected)<br><br> <br><br>Hostname: blade7.xen<br><br>Port: 24007<br><br>Uuid: 574eb256-30d2-4639-803e-73d905835139<br><br>State: Peer in Cluster (Connected)<br><br> <br><br>[root@blade7 stor1]# gluster peer status<br><br>Number of Peers: 2<br><br> <br><br>Hostname: blade6.xen<br><br>Port: 24007<br><br>Uuid: a65cadad-ef79-4821-be41-5649fb204f3e<br><br>State: Peer in Cluster (Connected)<br><br> <br><br>Hostname: blade1.xen<br><br>Uuid: 194a57a7-cb0e-43de-a042-0ac4026fd07b<br><br>State: Peer in Cluster (Connected)<br><br> <br><br>[root@blade6 stor1]# gluster volume heal stor1 info<br><br>Gathering Heal info on volume stor1 has been successful<br><br> <br><br>Brick blade7.xen:/gluster/stor1<br><br>Number of entries: 0<br></font></tt><br><tt><font size=2>&nbsp;<br><br>Brick blade6.xen:/gluster/stor1<br><br>Number of entries: 0<br><br> <br><br>[root@blade7 stor1]# gluster volume heal stor1 info<br><br>Gathering Heal info on volume stor1 has been successful<br><br> <br><br>Brick blade7.xen:/gluster/stor1<br><br>Number of entries: 0<br><br> <br><br>Brick blade6.xen:/gluster/stor1<br><br>Number of entries: 0<br><br> <br><br> <br><br>When I'm trying to mount the volume with NFS, I have the following errors:<br><br> <br><br>[2013-12-09 13:20:52.066978] E<br>[afr-self-heal-common.c:197:afr_sh_print_split_brain_log]<br>0-stor1-replicate-0: Unable to self-heal contents of '/' (possible<br>split-brain). 
Please delete the file from all but the preferred subvolume.-<br>Pending matrix: &nbsp;[ [ 0 2 ] [ 2 0 ] ]<br><br>[2013-12-09 13:20:52.067386] E<br>[afr-self-heal-common.c:2212:afr_self_heal_completion_cbk]<br>0-stor1-replicate-0: background &nbsp;meta-data self-heal failed on /<br><br>[2013-12-09 13:20:52.067452] E [mount3.c:290:mnt3svc_lookup_mount_cbk]<br>0-nfs: error=Input/output error<br><br>[2013-12-09 13:20:53.092039] E<br>[afr-self-heal-common.c:197:afr_sh_print_split_brain_log]<br>0-stor1-replicate-0: Unable to self-heal contents of '/' (possible<br>split-brain). Please delete the file from all but the preferred subvolume.-<br>Pending matrix: &nbsp;[ [ 0 2 ] [ 2 0 ] ]<br><br>[2013-12-09 13:20:53.092497] E<br>[afr-self-heal-common.c:2212:afr_self_heal_completion_cbk]<br>0-stor1-replicate-0: background &nbsp;meta-data self-heal failed on /<br><br>[2013-12-09 13:20:53.092559] E [mount3.c:290:mnt3svc_lookup_mount_cbk]<br>0-nfs: error=Input/output error<br><br> <br><br>What am I doing wrong?<br><br>PS: &nbsp;Volume stor_fast works like a charm.<br><br> <br><br>Best Regards,<br><br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/b0b21677/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/b0b21677/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 3<br>Date: Mon, 09 Dec 2013 14:18:28 +0100<br>From: Heiko Krämer &lt;hkraemer@anynines.de&gt;<br>To: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: [Gluster-users] Gluster infrastructure question<br>Message-ID: &lt;52A5C324.4090408@anynines.de&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>-----BEGIN PGP SIGNED MESSAGE-----<br>Hash: SHA1<br><br>Heyho guys,<br><br>I've been running glusterfs for years 
in a small environment without big<br>problems.<br><br>Now I'm going to use glusterFS for a bigger cluster but I've some<br>questions :)<br><br>Environment:<br>* 4 Servers<br>* 20 x 2TB HDD, each<br>* Raidcontroller<br>* Raid 10<br>* 4x bricks =&gt; Replicated, Distributed volume<br>* Gluster 3.4<br><br>1)<br>I'm asking me, if I can delete the raid10 on each server and create<br>for each HDD a separate brick.<br>In this case have a volume 80 Bricks so 4 Server x 20 HDD's. Is there<br>any experience about the write throughput in a production system with<br>many of bricks like in this case? In addition i'll get double of HDD<br>capacity.<br><br>2)<br>I've heard a talk about glusterFS and out scaling. The main point was<br>if more bricks are in use, the scale out process will take a long<br>time. The problem was/is the Hash-Algo. So I'm asking me how is it if<br>I've one very big brick (Raid10 20TB on each server) or I've much more<br>bricks, what's faster and is there any issues?<br>Is there any experiences ?<br><br>3)<br>Failover of a HDD is for a raid controller with HotSpare HDD not a big<br>deal. Glusterfs will rebuild automatically if a brick fails and there<br>are no data present, this action will perform a lot of network traffic<br>between the mirror bricks but it will handle it equal as the raid<br>controller right ?<br><br><br><br>Thanks and cheers<br>Heiko<br><br><br><br>- -- <br>Anynines.com<br><br>Avarteq GmbH<br>B.Sc. 
Informatik<br>Heiko Kr?mer<br>CIO<br>Twitter: @anynines<br><br>- ----<br>Gesch?ftsf?hrer: Alexander Fai?t, Dipl.-Inf.(FH) Julian Fischer<br>Handelsregister: AG Saarbr?cken HRB 17413, Ust-IdNr.: DE262633168<br>Sitz: Saarbr?cken<br>-----BEGIN PGP SIGNATURE-----<br>Version: GnuPG v1.4.14 (GNU/Linux)<br>Comment: Using GnuPG with Thunderbird - </font></tt><a href=http://www.enigmail.net/><tt><font size=2>http://www.enigmail.net/</font></tt></a><tt><font size=2><br><br>iQEcBAEBAgAGBQJSpcMfAAoJELxFogM4ixOF/ncH/3L9DvOWHrF0XBqCgeT6QQ6B<br>lDwtXiD9xoznht0Zs2S9LA9Z7r2l5/fzMOUSOawEMv6M16Guwq3gQ1lClUi4Iwj0<br>GKKtYQ6F4aG4KXHY4dlu1QKT5OaLk8ljCQ47Tc9aAiJMhfC1/IgQXOslFv26utdJ<br>N9jxiCl2+r/tQvQRw6mA4KAuPYPwOV+hMtkwfrM4UsIYGGbkNPnz1oqmBsfGdSOs<br>TJh6+lQRD9KYw72q3I9G6ZYlI7ylL9Q7vjTroVKH232pLo4G58NLxyvWvcOB9yK6<br>Bpf/gRMxFNKA75eW5EJYeZ6EovwcyCAv7iAm+xNKhzsoZqbBbTOJxS5zKm4YWoY=<br>=bDly<br>-----END PGP SIGNATURE-----<br>-------------- next part --------------<br>A non-text attachment was scrubbed...<br>Name: hkraemer.vcf<br>Type: text/x-vcard<br>Size: 277 bytes<br>Desc: not available<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/d70112ef/attachment-0001.vcf"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/d70112ef/attachment-0001.vcf</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 4<br>Date: Mon, 9 Dec 2013 09:51:41 -0500<br>From: Kal Black &lt;kaloblak@gmail.com&gt;<br>To: Paul Robert Marino &lt;prmarino1@gmail.com&gt;<br>Cc: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] How reliable is XFS under Gluster?<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CADZk1LMcRjn=qG-mWbc5S8SeJtkFB2AZica2NKuU3Z7mwQ=2kQ@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>Thank you all for the wonderful input,<br>I haven't used XFS extensively so far and my concerns primarily came from<br>reading an article (mostly the discussion after it) by Jonathan Corbet<br>on LWN (</font></tt><a href=http://lwn.net/Articles/476263/><tt><font size=2>http://lwn.net/Articles/476263/</font></tt></a><tt><font size=2>)

and another one </tt><a href="http://toruonu.blogspot.ca/2012/12/xfs-vs-ext4.html"><tt><font size=2>http://toruonu.blogspot.ca/2012/12/xfs-vs-ext4.html</tt></a><tt><font size=2>.

They are both<br>relatively recent and I was under the impression that XFS still has<br>problems in certain cases of power loss, where the metadata and the actual<br>data are not in sync, which might lead to existing data being corrupted.<br>But again, as Paul Robert Marino pointed out, choosing the right IO<br>scheduler might greatly reduce the risk of this happening.<br><br><br>On Sun, Dec 8, 2013 at 11:04 AM, Paul Robert Marino &lt;prmarino1@gmail.com&gt;wrote:<br><br>&gt; XFS is fine Ive been using it on various distros in production for<br>&gt; over a decade now and I've rarely had any problems with it and when

I<br>&gt; have they have been trivial to fix which is something I honestly cant<br>&gt; say about ext3 or ext4.<br>&gt;<br>&gt; Usually when there is a power failure during a write if the<br>&gt; transaction wasn't completely committed to the disk it is rolled back<br>&gt; via the journal. The one exception to this is when you have a battery<br>&gt; backed cache where the battery discharges before power is restored,

or<br>&gt; a very cheap consumer grade disk which uses its cache for writes and<br>&gt; lies about the sync state.<br>&gt; in either of these scenarios any file system will have problems.<br>&gt;<br>&gt; Out of any of the filesystems Ive worked with in general XFS handles<br>&gt; the battery discharge senario the cleanest and is the easiest to<br>&gt; recover.<br>&gt; if you have the second scenario with the cheap disks with a cache

that<br>&gt; lies nothing will help you not even a fsync because the hardware lies.<br>&gt; Also the subject of fsync is a little more complicated than most<br>&gt; people think there are several kinds of fsync and each behaves<br>&gt; differently on different filesystems. PostgreSQL has documentation<br>&gt; about it here<br>&gt; </font></tt><a href="http://www.postgresql.org/docs/9.1/static/runtime-config-wal.html"><tt><font size=2>http://www.postgresql.org/docs/9.1/static/runtime-config-wal.html</font></tt></a><tt><font size=2><br>&gt; look at wal_sync_method if you would like to have a better idea about

how<br>&gt; fsync works without getting too deep into the subject.<br>&gt;<br>&gt; By the way most apps don't need to do fsyncs and it would bring your<br>&gt; system to a crawl if they all did so take people saying<br>&gt; all programs should fsync with a grain of salt.<br>&gt;<br>&gt; In most cases when these problems come up its really that they didn't<br>&gt; set the right IO scheduler for what the server does. For example CFQ<br>&gt; which is the EL default can leave your write in ram cache for quite

a<br>&gt; while before sending it to disk in an attempt to optimize your IO;<br>&gt; however the deadline scheduler will attempt to optimize your IO but<br>&gt; will predictably sync it to disk after a period of time regardless

of<br>&gt; whether it was able to fully optimize it or not. Also there is noop<br>&gt; which does no optimization at all and leaves everything to the<br>&gt; hardware, this is common and recommended for VMs and there is some<br>&gt; argument to use it with high end raid controllers for things like<br>&gt; financial data where you need to absolutely ensure the writes happen<br>&gt; ASAP because there may be fines or other large penalties if you lose<br>&gt; any data.<br>&gt;<br>&gt;<br>&gt;<br>&gt; On Sat, Dec 7, 2013 at 3:04 AM, Franco Broi &lt;Franco.Broi@iongeo.com&gt;<br>&gt; wrote:<br>&gt; &gt; Been using ZFS for about 9 months and am about to add another

400TB, no<br>&gt; &gt; issues so far.<br>&gt; &gt;<br>&gt; &gt; On 7 Dec 2013 04:23, Brian Foster &lt;bfoster@redhat.com&gt;

wrote:<br>&gt; &gt; On 12/06/2013 01:57 PM, Kal Black wrote:<br>&gt; &gt;&gt; Hello,<br>&gt; &gt;&gt; I am at the point of picking a FS for new brick nodes.

I used to<br>&gt; &gt;&gt; like and use ext4 until now but I recently read about an issue

introduced<br>&gt; by<br>&gt; &gt;&gt; a<br>&gt; &gt;&gt; patch in ext4 that breaks the distributed translator. In

the same time,<br>&gt; it<br>&gt; &gt;&gt; looks like the recommended FS for a brick is no longer ext4

but XFS<br>&gt; which<br>&gt; &gt;&gt; apparently will also be the default FS in the upcoming RedHat7.

On the<br>&gt; &gt;&gt; other hand, XFS is being known as a file system that can

be easily<br>&gt; &gt;&gt; corrupted (zeroing files) in case of a power failure. Supporters

of the<br>&gt; &gt;&gt; file system claim that this should never happen if an application

has<br>&gt; been<br>&gt; &gt;&gt; properly coded (properly committing/fsync-ing data to storage)

and the<br>&gt; &gt;&gt; storage itself has been properly configured (disk cache disabled

on<br>&gt; &gt;&gt; individual disks and battery backed cache used on the controllers).

My<br>&gt; &gt;&gt; question is, should I be worried about losing data in a power

failure or<br>&gt; &gt;&gt; similar scenarios (or any) using GlusterFS and XFS? Are there

best<br>&gt; &gt;&gt; practices for setting up a Gluster brick + XFS? Has the ext4

issue been<br>&gt; &gt;&gt; reliably fixed? (my understanding is that this will be impossible

unless<br>&gt; &gt;&gt; ext4 is modified to allow proper work with Gluster)<br>&gt; &gt;&gt;<br>&gt; &gt;<br>&gt; &gt; Hi Kal,<br>&gt; &gt;<br>&gt; &gt; You are correct in that Red Hat recommends using XFS for gluster

bricks.<br>&gt; &gt; I'm sure there are plenty of ext4 (and other fs) users as well,

so other<br>&gt; &gt; users should chime in as far as real experiences with various

brick<br>&gt; &gt; filesystems goes. Also, I believe the dht/ext issue has been

resolved<br>&gt; &gt; for some time now.<br>&gt; &gt;<br>&gt; &gt; With regard to &quot;XFS zeroing files on power failure,&quot;

I'd suggest you<br>&gt; &gt; check out the following blog post:<br>&gt; &gt;<br>&gt; &gt;<br>&gt; </font></tt><a href="http://sandeen.net/wordpress/computers/xfs-does-not-null-files-and-requires-no-flux/"><tt><font size=2>http://sandeen.net/wordpress/computers/xfs-does-not-null-files-and-requires-no-flux/</font></tt></a><tt><font size=2><br>&gt; &gt;<br>&gt; &gt; My cursory understanding is that there were apparently situations

where<br>&gt; &gt; the inode size of a recently extended file would be written to

the log<br>&gt; &gt; before the actual extending data is written to disk, thus creating

a<br>&gt; &gt; crash window where the updated size would be seen, but not the

actual<br>&gt; &gt; data. In other words, this isn't a &quot;zeroing files&quot;

behavior in as much<br>&gt; &gt; as it is an ordering issue with logging the inode size. This

is probably<br>&gt; &gt; why you've encountered references to fsync(), because with the

fix your<br>&gt; &gt; data is still likely lost (unless/until you've run an fsync to

flush to<br>&gt; &gt; disk), you just shouldn't see the extended inode size unless

the actual<br>&gt; &gt; data made it to disk.<br>&gt; &gt;<br>&gt; &gt; Also note that this was fixed in 2007. ;)<br>&gt; &gt;<br>&gt; &gt; Brian<br>&gt; &gt;<br>&gt; &gt;&gt; Best regards<br>&gt; &gt;&gt;<br>&gt; &gt;&gt;<br>&gt; &gt;&gt;<br>&gt; &gt;&gt; _______________________________________________<br>&gt; &gt;&gt; Gluster-users mailing list<br>&gt; &gt;&gt; Gluster-users@gluster.org<br>&gt; &gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt; &gt;&gt;<br>&gt; &gt;<br>&gt; &gt; _______________________________________________<br>&gt; &gt; Gluster-users mailing list<br>&gt; &gt; Gluster-users@gluster.org<br>&gt; &gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt; &gt;<br>&gt; &gt; ________________________________<br>&gt; &gt;<br>&gt; &gt;<br>&gt; &gt; This email and any files transmitted with it are confidential

and are<br>&gt; &gt; intended solely for the use of the individual or entity to whom

they are<br>&gt; &gt; addressed. If you are not the original recipient or the person<br>&gt; responsible<br>&gt; &gt; for delivering the email to the intended recipient, be advised

that you<br>&gt; have<br>&gt; &gt; received this email in error, and that any use, dissemination,<br>&gt; forwarding,<br>&gt; &gt; printing, or copying of this email is strictly prohibited. If

you<br>&gt; received<br>&gt; &gt; this email in error, please immediately notify the sender and

delete the<br>&gt; &gt; original.<br>&gt; &gt;<br>&gt; &gt;<br>&gt; &gt; _______________________________________________<br>&gt; &gt; Gluster-users mailing list<br>&gt; &gt; Gluster-users@gluster.org<br>&gt; &gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;<br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/4b56a323/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/4b56a323/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 5<br>Date: Mon, 09 Dec 2013 15:44:24 +0000<br>From: Nux! &lt;nux@li.nux.ro&gt;<br>To: gluster-users@gluster.org<br>Subject: Re: [Gluster-users] Gluster infrastructure question<br>Message-ID: &lt;9775f8114ebbc392472010f2d9bdf432@li.nux.ro&gt;<br>Content-Type: text/plain; charset=UTF-8; format=flowed<br><br>On 09.12.2013 13:18, Heiko Kr?mer wrote:<br>&gt; 1)<br>&gt; I'm asking me, if I can delete the raid10 on each server and create<br>&gt; for each HDD a separate brick.<br>&gt; In this case have a volume 80 Bricks so 4 Server x 20 HDD's. Is there<br>&gt; any experience about the write throughput in a production system with<br>&gt; many of bricks like in this case? In addition i'll get double of HDD<br>&gt; capacity.<br><br>I have found problems with bricks to be disruptive whereas replacing a

<br>RAID member is quite trivial. I would recommend against dropping RAID.<br><br>&gt; 3)<br>&gt; Failover of a HDD is for a raid controller with HotSpare HDD not a

big<br>&gt; deal. Glusterfs will rebuild automatically if a brick fails and there<br>&gt; are no data present, this action will perform a lot of network traffic<br>&gt; between the mirror bricks but it will handle it equal as the raid<br>&gt; controller right ?<br><br>Gluster will not &quot;rebuild automatically&quot; a brick, you will need

to  manually add/remove it. Additionally, if a brick goes bad gluster won't do anything about it,  the affected volumes will just slow down or stop working at all. Again, my advice is KEEP THE RAID and set up good monitoring of drives.
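<br><br>For reference, the capacity side of question 1 can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a replica count of 2 (the original post only says "Replicated, Distributed"), ignores hot spares and filesystem overhead, and the function name <tt>usable_capacity_tb</tt> is made up for this example.

```python
def usable_capacity_tb(servers, disks_per_server, disk_tb, raid10, replica):
    """Usable volume capacity in TB for a distributed-replicated layout.

    Assumption: hot spares and filesystem overhead are ignored.
    """
    raw_tb = servers * disks_per_server * disk_tb
    # RAID 10 mirrors every disk locally, halving what each server exposes.
    brick_tb = raw_tb / 2 if raid10 else raw_tb
    # Gluster replication then stores each file `replica` times across bricks.
    return brick_tb / replica


# Figures from the original post: 4 servers, 20 x 2 TB disks each.
raid10_tb = usable_capacity_tb(4, 20, 2, raid10=True, replica=2)
jbod_tb = usable_capacity_tb(4, 20, 2, raid10=False, replica=2)

print(f"RAID 10 bricks:     {raid10_tb:.0f} TB usable")
print(f"one brick per disk: {jbod_tb:.0f} TB usable")
```

Under these assumptions, dropping RAID 10 does double the usable capacity, but every disk failure then becomes a brick failure that Gluster has to heal over the network.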

<br>:)<br><br>HTH<br>Lucian<br><br>-- <br>Sent from the Delta quadrant using Borg technology!<br><br>Nux!<br></font></tt><a href=www.nux.ro><tt><font size=2>www.nux.ro</font></tt></a><tt><font size=2><br><br><br>------------------------------<br><br>Message: 6<br>Date: Mon, 9 Dec 2013 07:57:47 -0800<br>From: Randy Breunling &lt;rbreunling@gmail.com&gt;<br>To: gluster-users@gluster.org<br>Cc: Randy Breunling &lt;rbreunling@gmail.com&gt;<br>Subject: [Gluster-users] Scalability - File system or Object Store<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CAJwwApQ5-SvboWV_iRGC+HJSuT25xSoz_9CBJfGDmpqT4tDJzw@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>&gt;From any experience...which has shown to scale better...a file system

or an<br>object store?<br><br>--Randy<br>San Jose CA<br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/dcf7491e/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/dcf7491e/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 7<br>Date: Mon, 9 Dec 2013 11:07:58 -0500<br>From: Jay Vyas &lt;jayunit100@gmail.com&gt;<br>To: Randy Breunling &lt;rbreunling@gmail.com&gt;<br>Cc: &quot;Gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Scalability - File system or Object Store<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CAAu13zE4kYJ1Dt9ypOMt=M=ps7QfyPSn4LSqZ3YLYBnW5pE4yA@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>In object stores you sacrifice the consistency guaranteed by filesystems<br>for **higher** availability. Probably by &quot;scale&quot;

you mean higher<br>availability, so... &nbsp;the answer is probably object storage.<br><br>That said, gluster is an interesting file system in that it is<br>&quot;object-like&quot; --- it is really fast for lookups.... and so if

you aren't<br>really sure you need objects, you might be able to do just fine with<br>gluster out of the box.<br><br>One really cool idea that is permeating the gluster community nowadays

is<br>this &quot;UFO&quot; concept, -- you can easily start with regular gluster,

and then<br>layer an object store on top at a later date if you want to &nbsp;sacrifice<br>posix operations for (even) higher availability.<br><br>&quot;Unified File and Object Storage - Unified file and object storage

allows<br>admins to utilize the same data store for both POSIX-style mounts as well<br>as S3 or Swift-compatible APIs.&quot; &nbsp; (from<br></font></tt><a href=http://gluster.org/community/documentation/index.php/3.3beta><tt><font size=2>http://gluster.org/community/documentation/index.php/3.3beta</font></tt></a><tt><font size=2>)<br><br><br>On Mon, Dec 9, 2013 at 10:57 AM, Randy Breunling &lt;rbreunling@gmail.com&gt;wrote:<br><br>&gt; From any experience...which has shown to scale better...a file system

or<br>&gt; an object store?<br>&gt;<br>&gt; --Randy<br>&gt; San Jose CA<br>&gt;<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;<br><br><br><br>-- <br>Jay Vyas<br></font></tt><a href=http://jayunit100.blogspot.com/><tt><font size=2>http://jayunit100.blogspot.com</font></tt></a><tt><font size=2><br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/e46cf569/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/e46cf569/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 8<br>Date: Mon, 09 Dec 2013 08:09:24 -0800<br>From: Joe Julian &lt;joe@julianfamily.org&gt;<br>To: Nux! &lt;nux@li.nux.ro&gt;,gluster-users@gluster.org<br>Subject: Re: [Gluster-users] Gluster infrastructure question<br>Message-ID: &lt;698ab788-9f27-44a6-bd98-a53eb25f4573@email.android.com&gt;<br>Content-Type: text/plain; charset=UTF-8<br><br><br><br>Nux! &lt;nux@li.nux.ro&gt; wrote:<br>&gt;On 09.12.2013 13:18, Heiko Kr?mer wrote:<br>&gt;&gt; 1)<br>&gt;&gt; I'm asking me, if I can delete the raid10 on each server and create<br>&gt;&gt; for each HDD a separate brick.<br>&gt;&gt; In this case have a volume 80 Bricks so 4 Server x 20 HDD's. Is

there<br>&gt;&gt; any experience about the write throughput in a production system

with<br>&gt;&gt; many of bricks like in this case? In addition i'll get double

of HDD<br>&gt;&gt; capacity.<br>&gt;<br>&gt;I have found problems with bricks to be disruptive whereas replacing

a <br>&gt;RAID member is quite trivial. I would recommend against dropping RAID.<br>&gt;<br><br>Brick disruption has been addressed in 3.4.<br><br>&gt;&gt; 3)<br>&gt;&gt; Failover of a HDD is for a raid controller with HotSpare HDD not

a<br>&gt;big<br>&gt;&gt; deal. Glusterfs will rebuild automatically if a brick fails and

there<br>&gt;&gt; are no data present, this action will perform a lot of network<br>&gt;traffic<br>&gt;&gt; between the mirror bricks but it will handle it equal as the raid<br>&gt;&gt; controller right ?<br>&gt;<br>&gt;Gluster will not &quot;rebuild automatically&quot; a brick, you will

need to <br>&gt;manually add/remove it.<br>Not exactly, but you will have to manually add an attribute and &quot;heal...full&quot;

to re-mirror the replacement.<br><br>&gt;Additionally, if a brick goes bad gluster won't do anything about it,

<br>&gt;the affected volumes will just slow down or stop working at all.<br>&gt;<br><br>Again, addressed in 3.4.<br><br>&gt;Again, my advice is KEEP THE RAID and set up good monitoring of drives.<br>&gt;<br><br>I'm not arguing for or against RAID. It's another tool in our tool box.

I, personally, use JBOD. Our use case has a lot of different files being

used by different clients. JBOD maximizes our use of cache.<br><br><br><br>------------------------------<br><br>Message: 9<br>Date: Mon, 9 Dec 2013 11:28:05 -0500 (EST)<br>From: John Mark Walker &lt;johnmark@gluster.org&gt;<br>To: &quot;Kaleb S. KEITHLEY&quot; &lt;kkeithle@redhat.com&gt;<br>Cc: &quot;Gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;,<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

Gluster Devel &lt;gluster-devel@nongnu.org&gt;<br>Subject: Re: [Gluster-users] [Gluster-devel] GlusterFest Test Weekend<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

- 3.5 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

Test #1<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;1654421306.26844542.1386606485161.JavaMail.root@redhat.com&gt;<br>Content-Type: text/plain; charset=utf-8<br><br>Incidentally, we're wrapping this up today. If you want to be included

in the list of swag-receivers (t-shirt, USB car charger, and stickers),

you still have a couple of hours to file a bug and have it verified by

the dev team.<br><br>Thanks, everyone :)<br><br>-JM<br><br><br>----- Original Message -----<br>&gt; On 12/05/2013 09:31 PM, John Mark Walker wrote:<br>&gt; &gt; Greetings,<br>&gt; &gt;<br>&gt; &gt; If you've been keeping up with our weekly meetings and the 3.5

planning<br>&gt; &gt; page, then you know that tomorrow, December 6, is the first testing

&quot;day&quot;<br>&gt; &gt; for 3.5. But since this is a Friday, we're going to make the

party last<br>&gt; &gt; all weekend, through mid-day Monday.<br>&gt; &gt;<br>&gt; <br>&gt; YUM repos with 3.5.0qa3 RPMs for EPEL-6 and Fedora 18, 19, and 20

&nbsp;are<br>&gt; available at<br>&gt; </font></tt><a href="http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.5.0qa3/"><tt><font size=2>http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.5.0qa3/</font></tt></a><tt><font size=2><br>&gt; <br>&gt; <br>&gt; --<br>&gt; <br>&gt; Kaleb<br>&gt; <br>&gt; _______________________________________________<br>&gt; Gluster-devel mailing list<br>&gt; Gluster-devel@nongnu.org<br>&gt; </font></tt><a href="https://lists.nongnu.org/mailman/listinfo/gluster-devel"><tt><font size=2>https://lists.nongnu.org/mailman/listinfo/gluster-devel</font></tt></a><tt><font size=2><br>&gt; <br><br><br>------------------------------<br><br>Message: 10<br>Date: Mon, 09 Dec 2013 16:43:42 +0000<br>From: Nux! &lt;nux@li.nux.ro&gt;<br>To: Joe Julian &lt;joe@julianfamily.org&gt;<br>Cc: gluster-users@gluster.org<br>Subject: Re: [Gluster-users] Gluster infrastructure question<br>Message-ID: &lt;b48aa7ed1b14432fc4047c934320e941@li.nux.ro&gt;<br>Content-Type: text/plain; charset=UTF-8; format=flowed<br><br>On 09.12.2013 16:09, Joe Julian wrote:<br>&gt;&gt; <br>&gt; <br>&gt; Brick disruption has been addressed in 3.4.<br><br>Good to know! What exactly happens when the brick goes unresponsive?<br><br><br>&gt;&gt; Additionally, if a brick goes bad gluster won't do anything about

it,<br>&gt;&gt; the affected volumes will just slow down or stop working at all.<br>&gt;&gt; <br>&gt; <br>&gt; Again, addressed in 3.4.<br><br>How? What is the expected behaviour now?<br><br>Thanks!<br><br>-- <br>Sent from the Delta quadrant using Borg technology!<br><br>Nux!<br></font></tt><a href=www.nux.ro><tt><font size=2>www.nux.ro</font></tt></a><tt><font size=2><br><br><br>------------------------------<br><br>Message: 11</font></tt><br><tt><font size=2>Date: Mon, 9 Dec 2013 18:03:59 +0100<br>From: samuel &lt;samu60@gmail.com&gt;<br>To: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: [Gluster-users] compatibility between 3.3 and 3.4<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CAOg=WDc-JT=CfqE39qWSPTjP2OqKj4L_oCfDG8icQKVTpi+0JQ@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>Hi all,<br><br>We're playing around with new versions and upgrading options. We currently<br>have a 2x2x2 striped-distributed-replicated volume based on 3.3.0 and<br>we're planning to upgrade to version 3.4.<br><br>We've tried upgrading the clients first and we've tried with 3.4.0, 3.4.1<br>and 3.4.2qa2 but all of them caused the same error:<br><br>Failed to get stripe-size<br><br>So it seems as if 3.4 clients are not compatible with 3.3 volumes. Is this<br>assumption right?<br><br>Is there any procedure to upgrade the gluster from 3.3 to 3.4 without<br>stopping the service?<br>What are the compatibility limitations between these 2 versions?<br><br>Any hint or link to documentation would be highly appreciated.<br><br>Thank you in advance,<br>Samuel.<br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/cec50893/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/cec50893/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 12<br>Date: Mon, 9 Dec 2013 19:52:57 +0100<br>From: bernhard glomm &lt;bernhard.glomm@ecologic.eu&gt;<br>To: Heiko Kr?mer &lt;hkraemer@anynines.de&gt;<br>Cc: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Gluster infrastructure question<br>Message-ID: &lt;E2AB54DC-4D82-4734-9BE2-E7B0B700BBA3@ecologic.eu&gt;<br>Content-Type: text/plain; charset=&quot;windows-1252&quot;<br><br>Hi Heiko,<br><br>some years ago I had to deliver a reliable storage that should be easy

to grow in size over time.<br>For that I was in close contact with<br>presto prime who produced a lot of interesting research results accessible

to the public.<br></font></tt><a href=http://www.prestoprime.org/project/public.en.html><tt><font size=2>http://www.prestoprime.org/project/public.en.html</font></tt></a><tt><font size=2><br>what was striking me was the general concern of how and when and with which

pattern hard drives will fail, and the rebuilding time in case a &quot;big&quot; (i.e. 2TB+) drive fails.

(one of the papers at pp was dealing in detail with that)<br>From that background my approach was to build relatively small raid6 bricks

(9 * 2 TB + 1 Hot-Spare)<br>and connect them together with a distributed glusterfs.<br>I never experienced any problems with that and felt quite comfortable about

it.<br>That was for just a lot of big file data exported via samba.<br>At the same time I used another, mirrored, glusterfs as a storage backend

for <br>my VM-images, same there, no problem and much less hassle and headache than

drbd and ocfs2 <br>which I run on another system.<br>hth<br>best <br><br>Bernhard<br><br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp;Bernhard Glomm<br>IT Administration<br><br>Phone: &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp;+49 (30) 86880 134<br>Fax: &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp;+49 (30) 86880 100<br>Skype: &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp;bernhard.glomm.ecologic<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br>Ecologic Institut gemeinn?tzige GmbH | Pfalzburger Str. 43/44 | 10717 Berlin

| Germany<br>GF: R. Andreas Kraemer | AG: Charlottenburg HRB 57947 | USt/VAT-IdNr.:

DE811963464<br>Ecologic? is a Trade Mark (TM) of Ecologic Institut gemeinn?tzige GmbH<br><br>On Dec 9, 2013, at 2:18 PM, Heiko Kr?mer &lt;hkraemer@anynines.de&gt; wrote:<br><br>&gt; Signed PGP part<br>&gt; Heyho guys,<br>&gt; <br>&gt; I'm running since years glusterfs in a small environment without big<br>&gt; problems.<br>&gt; <br>&gt; Now I'm going to use glusterFS for a bigger cluster but I've some<br>&gt; questions :)<br>&gt; <br>&gt; Environment:<br>&gt; * 4 Servers<br>&gt; * 20 x 2TB HDD, each<br>&gt; * Raidcontroller<br>&gt; * Raid 10<br>&gt; * 4x bricks =&gt; Replicated, Distributed volume<br>&gt; * Gluster 3.4<br>&gt; <br>&gt; 1)<br>&gt; I'm asking me, if I can delete the raid10 on each server and create<br>&gt; for each HDD a separate brick.<br>&gt; In this case have a volume 80 Bricks so 4 Server x 20 HDD's. Is there<br>&gt; any experience about the write throughput in a production system with<br>&gt; many of bricks like in this case? In addition i'll get double of HDD<br>&gt; capacity.<br>&gt; <br>&gt; 2)<br>&gt; I've heard a talk about glusterFS and out scaling. The main point

was<br>&gt; if more bricks are in use, the scale out process will take a long<br>&gt; time. The problem was/is the Hash-Algo. So I'm asking me how is it

if<br>&gt; I've one very big brick (Raid10 20TB on each server) or I've much

more<br>&gt; bricks, what's faster and is there any issues?<br>&gt; Is there any experiences ?<br>&gt; <br>&gt; 3)<br>&gt; Failover of a HDD is for a raid controller with HotSpare HDD not a

big<br>&gt; deal. Glusterfs will rebuild automatically if a brick fails and there<br>&gt; are no data present, this action will perform a lot of network traffic<br>&gt; between the mirror bricks but it will handle it equal as the raid<br>&gt; controller right ?<br>&gt; <br>&gt; <br>&gt; <br>&gt; Thanks and cheers<br>&gt; Heiko<br>&gt; <br>&gt; <br>&gt; <br>&gt; --<br>&gt; Anynines.com<br>&gt; <br>&gt; Avarteq GmbH<br>&gt; B.Sc. Informatik<br>&gt; Heiko Kr?mer<br>&gt; CIO<br>&gt; Twitter: @anynines<br>&gt; <br>&gt; ----<br>&gt; Gesch?ftsf?hrer: Alexander Fai?t, Dipl.-Inf.(FH) Julian Fischer<br>&gt; Handelsregister: AG Saarbr?cken HRB 17413, Ust-IdNr.: DE262633168<br>&gt; Sitz: Saarbr?cken<br>&gt; <br>&gt; &lt;hkraemer.vcf&gt;_______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</tt></a><tt><font size=2><br><br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/c95b9cc8/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/c95b9cc8/attachment-0001.html</tt></a><tt><font size=2>&gt;<br>-------------- next part --------------<br>A non-text attachment was scrubbed...<br>
Name: signature.asc<br>Type: application/pgp-signature<br>Size: 495 bytes<br>Desc: Message signed with OpenPGP using GPGMail<br>URL: &lt;</tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/c95b9cc8/attachment-0001.sig"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/c95b9cc8/attachment-0001.sig</tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 13<br>Date: Mon, 9 Dec 2013 14:26:45 -0500 (EST)<br>From: Ben Turner &lt;bturner@redhat.com&gt;<br>To: Heiko Kr?mer &lt;hkraemer@anynines.de&gt;<br>Cc: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Gluster infrastructure question<br>Message-ID: &lt;124648027.2334242.1386617205234.JavaMail.root@redhat.com&gt;<br>Content-Type: text/plain; charset=utf-8<br><br>----- Original Message -----<br>&gt; From: &quot;Heiko Kr?mer&quot; &lt;hkraemer@anynines.de&gt;<br>&gt; To: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>&gt; Sent: Monday, December 9, 2013 8:18:28 AM<br>&gt; Subject: [Gluster-users] Gluster infrastructure question<br>&gt; <br>&gt; -----BEGIN PGP SIGNED MESSAGE-----<br>&gt; Hash: SHA1<br>&gt; <br>&gt; Heyho guys,<br>&gt; <br>&gt; I'm running since years glusterfs in a small environment without big<br>&gt; problems.<br>&gt; <br>&gt; Now I'm going to use glusterFS for a bigger cluster but I've some<br>&gt; questions :)<br>&gt; <br>&gt; Environment:<br>&gt; * 4 Servers<br>&gt; * 20 x 2TB HDD, each<br>&gt; * Raidcontroller<br>&gt; * Raid 10<br>&gt; * 4x bricks =&gt; Replicated, Distributed volume<br>&gt; * Gluster 3.4<br>&gt; <br>&gt; 1)<br>&gt; I'm asking me, if I can delete the raid10 on each server and create<br>&gt; for each HDD a separate brick.<br>&gt; In this case have a volume 80 Bricks so 4 Server x 20 HDD's. Is there<br>&gt; any experience about the write throughput in a production system with<br>&gt; many of bricks like in this case? In addition i'll get double of HDD<br>&gt; capacity.<br>
<br><br>Have a look at:<br><br></tt><a href="http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf"><tt><font size=2>http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf</tt></a><tt><font size=2><br><br>Specifically:<br><br>* RAID arrays<br>* More RAID LUNs for better concurrency<br>* For RAID6, 256-KB stripe size<br><br>I use a single RAID 6 that is divided into several LUNs for my bricks.

&nbsp;For example, on my Dell servers (with PERC6 RAID controllers) each

server has 12 disks that I put into RAID 6. &nbsp;Then I break the RAID

6 into 6 LUNs and create a new PV/VG/LV for each brick. &nbsp;From there

I follow the recommendations listed in the presentation.<br><br>HTH!<br><br>-b<br> <br>&gt; 2)<br>&gt; I've heard a talk about glusterFS and out scaling. The main point

was<br>&gt; if more bricks are in use, the scale out process will take a long<br>&gt; time. The problem was/is the Hash-Algo. So I'm asking me how is it

if<br>&gt; I've one very big brick (Raid10 20TB on each server) or I've much

more<br>&gt; bricks, what's faster and is there any issues?<br>&gt; Is there any experiences ?<br>&gt; <br>&gt; 3)<br>&gt; Failover of a HDD is for a raid controller with HotSpare HDD not a

big<br>&gt; deal. Glusterfs will rebuild automatically if a brick fails and there<br>&gt; are no data present, this action will perform a lot of network traffic<br>&gt; between the mirror bricks but it will handle it equal as the raid<br>&gt; controller right ?<br>&gt; <br>&gt; <br>&gt; <br>&gt; Thanks and cheers<br>&gt; Heiko<br>&gt; <br>&gt; <br>&gt; <br>&gt; - --<br>&gt; Anynines.com<br>&gt; <br>&gt; Avarteq GmbH<br>&gt; B.Sc. Informatik<br>&gt; Heiko Kr?mer<br>&gt; CIO<br>&gt; Twitter: @anynines<br>&gt; <br>&gt; - ----<br>&gt; Gesch?ftsf?hrer: Alexander Fai?t, Dipl.-Inf.(FH) Julian Fischer<br>&gt; Handelsregister: AG Saarbr?cken HRB 17413, Ust-IdNr.: DE262633168<br>&gt; Sitz: Saarbr?cken<br>&gt; -----BEGIN PGP SIGNATURE-----<br>&gt; Version: GnuPG v1.4.14 (GNU/Linux)<br>&gt; Comment: Using GnuPG with Thunderbird - </font></tt><a href=http://www.enigmail.net/><tt><font size=2>http://www.enigmail.net/</font></tt></a><tt><font size=2><br>&gt; <br>&gt; iQEcBAEBAgAGBQJSpcMfAAoJELxFogM4ixOF/ncH/3L9DvOWHrF0XBqCgeT6QQ6B<br>&gt; lDwtXiD9xoznht0Zs2S9LA9Z7r2l5/fzMOUSOawEMv6M16Guwq3gQ1lClUi4Iwj0<br>&gt; GKKtYQ6F4aG4KXHY4dlu1QKT5OaLk8ljCQ47Tc9aAiJMhfC1/IgQXOslFv26utdJ<br>&gt; N9jxiCl2+r/tQvQRw6mA4KAuPYPwOV+hMtkwfrM4UsIYGGbkNPnz1oqmBsfGdSOs<br>&gt; TJh6+lQRD9KYw72q3I9G6ZYlI7ylL9Q7vjTroVKH232pLo4G58NLxyvWvcOB9yK6<br>&gt; Bpf/gRMxFNKA75eW5EJYeZ6EovwcyCAv7iAm+xNKhzsoZqbBbTOJxS5zKm4YWoY=<br>&gt; =bDly<br>&gt; -----END PGP SIGNATURE-----<br>&gt; <br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br><br>------------------------------<br><br>Message: 14<br>Date: Mon, 9 Dec 2013 14:31:00 -0500 (EST)<br>From: Ben Turner &lt;bturner@redhat.com&gt;<br>To: Heiko Kr?mer &lt;hkraemer@anynines.de&gt;<br>Cc: 
&quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Gluster infrastructure question<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;1676822821.2336090.1386617460049.JavaMail.root@redhat.com&gt;<br>Content-Type: text/plain; charset=utf-8<br><br>----- Original Message -----<br>&gt; From: &quot;Ben Turner&quot; &lt;bturner@redhat.com&gt;<br>&gt; To: &quot;Heiko Kr?mer&quot; &lt;hkraemer@anynines.de&gt;<br>&gt; Cc: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>&gt; Sent: Monday, December 9, 2013 2:26:45 PM<br>&gt; Subject: Re: [Gluster-users] Gluster infrastructure question<br>&gt; <br>&gt; ----- Original Message -----<br>&gt; &gt; From: &quot;Heiko Kr?mer&quot; &lt;hkraemer@anynines.de&gt;<br>&gt; &gt; To: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>&gt; &gt; Sent: Monday, December 9, 2013 8:18:28 AM<br>&gt; &gt; Subject: [Gluster-users] Gluster infrastructure question<br>&gt; &gt; <br>&gt; &gt; -----BEGIN PGP SIGNED MESSAGE-----<br>&gt; &gt; Hash: SHA1<br>&gt; &gt; <br>&gt; &gt; Heyho guys,<br>&gt; &gt; <br>&gt; &gt; I'm running since years glusterfs in a small environment without

big<br>&gt; &gt; problems.<br>&gt; &gt; <br>&gt; &gt; Now I'm going to use glusterFS for a bigger cluster but I've

some<br>&gt; &gt; questions :)<br>&gt; &gt; <br>&gt; &gt; Environment:<br>&gt; &gt; * 4 Servers<br>&gt; &gt; * 20 x 2TB HDD, each<br>&gt; &gt; * Raidcontroller<br>&gt; &gt; * Raid 10<br>&gt; &gt; * 4x bricks =&gt; Replicated, Distributed volume<br>&gt; &gt; * Gluster 3.4<br>&gt; &gt; <br>&gt; &gt; 1)<br>&gt; &gt; I'm asking me, if I can delete the raid10 on each server and

create<br>&gt; &gt; for each HDD a separate brick.<br>&gt; &gt; In this case have a volume 80 Bricks so 4 Server x 20 HDD's.

Is there<br>&gt; &gt; any experience about the write throughput in a production system

with<br>&gt; &gt; many of bricks like in this case? In addition i'll get double

of HDD<br>&gt; &gt; capacity.<br>&gt; <br>&gt; Have a look at:<br>&gt; <br>&gt; </tt><a href="http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf"><tt><font size=2>http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf</tt></a><tt><font size=2><br><br>That one was from 2012, here is the latest:<br><br></tt><a href="http://rhsummit.files.wordpress.com/2013/07/england_th_0450_rhs_perf_practices-4_neependra.pdf"><tt><font size=2>http://rhsummit.files.wordpress.com/2013/07/england_th_0450_rhs_perf_practices-4_neependra.pdf</tt></a><tt><font size=2><br><br>-b<br><br>&gt; Specifically:<br>&gt; <br>&gt; * RAID arrays<br>&gt; * More RAID LUNs for better concurrency<br>&gt; * For RAID6, 256-KB stripe size<br>&gt; <br>&gt; I use a single RAID 6 that is divided into several LUNs for my bricks.

&nbsp;For<br>&gt; example, on my Dell servers(with PERC6 RAID controllers) each server

has 12<br>&gt; disks that I put into raid 6. &nbsp;Then I break the RAID 6 into 6

LUNs and<br>&gt; create a new PV/VG/LV for each brick. &nbsp;From there I follow the<br>&gt; recommendations listed in the presentation.<br>&gt; <br>&gt; HTH!<br>&gt; <br>&gt; -b<br>&gt; &nbsp;<br>&gt; &gt; 2)<br>&gt; &gt; I've heard a talk about glusterFS and out scaling. The main point

was<br>&gt; &gt; if more bricks are in use, the scale out process will take a

long<br>&gt; &gt; time. The problem was/is the Hash-Algo. So I'm asking me how

is it if<br>&gt; &gt; I've one very big brick (Raid10 20TB on each server) or I've

much more<br>&gt; &gt; bricks, what's faster and is there any issues?<br>&gt; &gt; Is there any experiences ?<br>&gt; &gt; <br>&gt; &gt; 3)<br>&gt; &gt; Failover of a HDD is for a raid controller with HotSpare HDD

not a big<br>&gt; &gt; deal. Glusterfs will rebuild automatically if a brick fails and

there<br>&gt; &gt; are no data present, this action will perform a lot of network

traffic<br>&gt; &gt; between the mirror bricks but it will handle it equal as the

raid<br>&gt; &gt; controller right ?<br>&gt; &gt; <br>&gt; &gt; <br>&gt; &gt; <br>&gt; &gt; Thanks and cheers<br>&gt; &gt; Heiko<br>&gt; &gt; <br>&gt; &gt; <br>&gt; &gt; <br>&gt; &gt; - --<br>&gt; &gt; Anynines.com<br>&gt; &gt; <br>&gt; &gt; Avarteq GmbH<br>&gt; &gt; B.Sc. Informatik<br>&gt; &gt; Heiko Kr?mer<br>&gt; &gt; CIO<br>&gt; &gt; Twitter: @anynines<br>&gt; &gt; <br>&gt; &gt; - ----<br>&gt; &gt; Gesch?ftsf?hrer: Alexander Fai?t, Dipl.-Inf.(FH) Julian Fischer<br>&gt; &gt; Handelsregister: AG Saarbr?cken HRB 17413, Ust-IdNr.: DE262633168<br>&gt; &gt; Sitz: Saarbr?cken<br>&gt; &gt; -----BEGIN PGP SIGNATURE-----<br>&gt; &gt; Version: GnuPG v1.4.14 (GNU/Linux)<br>&gt; &gt; Comment: Using GnuPG with Thunderbird - </font></tt><a href=http://www.enigmail.net/><tt><font size=2>http://www.enigmail.net/</font></tt></a><tt><font size=2><br>&gt; &gt; <br>&gt; &gt; iQEcBAEBAgAGBQJSpcMfAAoJELxFogM4ixOF/ncH/3L9DvOWHrF0XBqCgeT6QQ6B<br>&gt; &gt; lDwtXiD9xoznht0Zs2S9LA9Z7r2l5/fzMOUSOawEMv6M16Guwq3gQ1lClUi4Iwj0<br>&gt; &gt; GKKtYQ6F4aG4KXHY4dlu1QKT5OaLk8ljCQ47Tc9aAiJMhfC1/IgQXOslFv26utdJ<br>&gt; &gt; N9jxiCl2+r/tQvQRw6mA4KAuPYPwOV+hMtkwfrM4UsIYGGbkNPnz1oqmBsfGdSOs<br>&gt; &gt; TJh6+lQRD9KYw72q3I9G6ZYlI7ylL9Q7vjTroVKH232pLo4G58NLxyvWvcOB9yK6<br>&gt; &gt; Bpf/gRMxFNKA75eW5EJYeZ6EovwcyCAv7iAm+xNKhzsoZqbBbTOJxS5zKm4YWoY=<br>&gt; &gt; =bDly<br>&gt; &gt; -----END PGP SIGNATURE-----<br>&gt; &gt; <br>&gt; &gt; _______________________________________________<br>&gt; &gt; Gluster-users mailing list<br>&gt; &gt; Gluster-users@gluster.org<br>&gt; &gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font 
size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br><br>------------------------------<br><br>Message: 15<br>Date: Mon, 09 Dec 2013 14:57:08 -0500<br>From: Jeff Darcy &lt;jdarcy@redhat.com&gt;<br>To: Randy Breunling &lt;rbreunling@gmail.com&gt;, gluster-users@gluster.org<br>Subject: Re: [Gluster-users] Scalability - File system or Object Store<br>Message-ID: &lt;52A62094.1000507@redhat.com&gt;<br>Content-Type: text/plain; charset=ISO-8859-1; format=flowed<br><br>On 12/09/2013 10:57 AM, Randy Breunling wrote:<br>&gt; From any experience...which has shown to scale better...a file system<br>&gt; &nbsp;or an object store?<br><br>In terms of numbers of files/objects, I'd have to say object stores. &nbsp;S3<br>and Azure are both over a *trillion* objects, and I've never heard of a<br>filesystem that size. &nbsp;In terms of performance it might go the other<br>way. &nbsp;More importantly, I think the object stores give up too much

in<br>terms of semantics - e.g. hierarchical directories and rename, byte<br>granularity, consistency/durability guarantees. &nbsp;It saddens me to

see so<br>many people working around these limitations in their apps based on<br>object stores - duplicating each other's work, creating<br>incompatibility (e.g. with a half dozen &quot;conventions&quot; for simulating<br>hierarchical directories), and sometimes even losing data to subtle<br>distributed-coordination bugs. &nbsp;An app that uses a subset of an<br>underlying filesystem's functionality is far more likely to be correct<br>and portable than one that tries to build extra abstractions on top of

a<br>bare-bones object store.<br><br><br><br>------------------------------<br><br>Message: 16<br>Date: Tue, 10 Dec 2013 07:58:25 +1000<br>From: Dan Mons &lt;dmons@cuttingedge.com.au&gt;<br>To: Ben Turner &lt;bturner@redhat.com&gt;<br>Cc: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;,<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

Heiko Kr&auml;mer &lt;hkraemer@anynines.de&gt;<br>Subject: Re: [Gluster-users] Gluster infrastructure question<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CACa6TycgVYLNOWkk7eO2L80hhEdQLJpgk-+Bav_dfL2gPVGpjw@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=UTF-8<br><br>I went with big RAID on each node (16x 3TB SATA disks in RAID6 with a<br>hot spare per node) rather than brick-per-disk. &nbsp;The simple reason<br>being that I wanted to configure distribute+replicate at the GlusterFS<br>level, and be 100% guaranteed that the replication happened across to<br>another node, and not to another brick on the same node. &nbsp;As each

node<br>only has one giant brick, the cluster is forced to replicate to a<br>separate node each time.<br><br>Some careful initial setup could probably have done the same, but I<br>wanted to avoid the dramas of my employer expanding the cluster one<br>node at a time later on, causing that design goal to fail as the new<br>single node with many bricks found replication partners on itself.<br><br>On a different topic, I find no real-world difference in RAID10 to<br>RAID6 with GlusterFS. &nbsp;Most of the access delay in Gluster has little<br>to do with the speed of the disk. &nbsp;The only downside to RAID6 is a<br>long rebuild time if you're unlucky enough to blow a couple of drives<br>at once. &nbsp;RAID50 might be a better choice if you're up at 20 drives<br>per node.<br><br>We invested in SSD caching on our nodes, and to be honest it was<br>rather pointless. &nbsp;Certainly not bad, but the real-world speed boost<br>is not noticed by end users.<br><br>-Dan<br><br>----------------<br>Dan Mons<br>R&amp;D SysAdmin<br>Unbreaker of broken things<br>Cutting Edge<br></font></tt><a href=http://cuttingedge.com.au/><tt><font size=2>http://cuttingedge.com.au</font></tt></a><tt><font size=2><br><br><br>On 10 December 2013 05:31, Ben Turner &lt;bturner@redhat.com&gt; wrote:<br>&gt; ----- Original Message -----<br>&gt;&gt; From: &quot;Ben Turner&quot; &lt;bturner@redhat.com&gt;<br>&gt;&gt; To: &quot;Heiko Kr?mer&quot; &lt;hkraemer@anynines.de&gt;<br>&gt;&gt; Cc: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>&gt;&gt; Sent: Monday, December 9, 2013 2:26:45 PM<br>&gt;&gt; Subject: Re: [Gluster-users] Gluster infrastructure question<br>&gt;&gt;<br>&gt;&gt; ----- Original Message -----<br>&gt;&gt; &gt; From: &quot;Heiko Kr?mer&quot; &lt;hkraemer@anynines.de&gt;<br>&gt;&gt; &gt; To: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>&gt;&gt; &gt; Sent: Monday, December 9, 2013 8:18:28 AM<br>&gt;&gt; &gt; Subject: 
[Gluster-users] Gluster infrastructure question<br>&gt;&gt; &gt;<br>&gt;&gt; &gt; -----BEGIN PGP SIGNED MESSAGE-----<br>&gt;&gt; &gt; Hash: SHA1<br>&gt;&gt; &gt;<br>&gt;&gt; &gt; Heyho guys,<br>&gt;&gt; &gt;<br>&gt;&gt; &gt; I'm running since years glusterfs in a small environment

without big<br>&gt;&gt; &gt; problems.<br>&gt;&gt; &gt;<br>&gt;&gt; &gt; Now I'm going to use glusterFS for a bigger cluster but I've

some<br>&gt;&gt; &gt; questions :)<br>&gt;&gt; &gt;<br>&gt;&gt; &gt; Environment:<br>&gt;&gt; &gt; * 4 Servers<br>&gt;&gt; &gt; * 20 x 2TB HDD, each<br>&gt;&gt; &gt; * Raidcontroller<br>&gt;&gt; &gt; * Raid 10<br>&gt;&gt; &gt; * 4x bricks =&gt; Replicated, Distributed volume<br>&gt;&gt; &gt; * Gluster 3.4<br>&gt;&gt; &gt;<br>&gt;&gt; &gt; 1)<br>&gt;&gt; &gt; I'm asking me, if I can delete the raid10 on each server

and create<br>&gt;&gt; &gt; for each HDD a separate brick.<br>&gt;&gt; &gt; In this case have a volume 80 Bricks so 4 Server x 20 HDD's.

Is there<br>&gt;&gt; &gt; any experience about the write throughput in a production

system with<br>&gt;&gt; &gt; many of bricks like in this case? In addition i'll get double

of HDD<br>&gt;&gt; &gt; capacity.<br>&gt;&gt;<br>&gt;&gt; Have a look at:<br>&gt;&gt;<br>&gt;&gt; </font></tt><a href="http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf"><tt><font size=2>http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf</font></tt></a><tt><font size=2><br>&gt;<br>&gt; That one was from 2012, here is the latest:<br>&gt;<br>&gt; </font></tt><a href="http://rhsummit.files.wordpress.com/2013/07/england_th_0450_rhs_perf_practices-4_neependra.pdf"><tt><font size=2>http://rhsummit.files.wordpress.com/2013/07/england_th_0450_rhs_perf_practices-4_neependra.pdf</font></tt></a><tt><font size=2><br>&gt;<br>&gt; -b<br>&gt;<br>&gt;&gt; Specifically:<br>&gt;&gt;<br>&gt;&gt; ? RAID arrays<br>&gt;&gt; ? More RAID LUNs for better concurrency<br>&gt;&gt; ? For RAID6, 256-KB stripe size<br>&gt;&gt;<br>&gt;&gt; I use a single RAID 6 that is divided into several LUNs for my

bricks. &nbsp;For<br>&gt;&gt; example, on my Dell servers(with PERC6 RAID controllers) each

server has 12<br>&gt;&gt; disks that I put into raid 6. &nbsp;Then I break the RAID 6 into

6 LUNs and<br>&gt;&gt; create a new PV/VG/LV for each brick. &nbsp;From there I follow

the<br>&gt;&gt; recommendations listed in the presentation.<br>&gt;&gt;<br>&gt;&gt; HTH!<br>&gt;&gt;<br>&gt;&gt; -b<br>&gt;&gt;<br>&gt;&gt; &gt; 2)<br>&gt;&gt; &gt; I've heard a talk about glusterFS and out scaling. The main

point was<br>&gt;&gt; &gt; if more bricks are in use, the scale out process will take

a long<br>&gt;&gt; &gt; time. The problem was/is the Hash-Algo. So I'm asking me

how is it if<br>&gt;&gt; &gt; I've one very big brick (Raid10 20TB on each server) or I've

much more<br>&gt;&gt; &gt; bricks, what's faster and is there any issues?<br>&gt;&gt; &gt; Is there any experiences ?<br>&gt;&gt; &gt;<br>&gt;&gt; &gt; 3)<br>&gt;&gt; &gt; Failover of a HDD is for a raid controller with HotSpare

HDD not a big<br>&gt;&gt; &gt; deal. Glusterfs will rebuild automatically if a brick fails

and there<br>&gt;&gt; &gt; are no data present, this action will perform a lot of network

traffic<br>&gt;&gt; &gt; between the mirror bricks but it will handle it equal as

the raid<br>&gt;&gt; &gt; controller right ?<br>&gt;&gt; &gt;<br>&gt;&gt; &gt;<br>&gt;&gt; &gt;<br>&gt;&gt; &gt; Thanks and cheers<br>&gt;&gt; &gt; Heiko<br>&gt;&gt; &gt;<br>&gt;&gt; &gt;<br>&gt;&gt; &gt;<br>&gt;&gt; &gt; - --<br>&gt;&gt; &gt; Anynines.com<br>&gt;&gt; &gt;<br>&gt;&gt; &gt; Avarteq GmbH<br>&gt;&gt; &gt; B.Sc. Informatik<br>&gt;&gt; &gt; Heiko Kr?mer<br>&gt;&gt; &gt; CIO<br>&gt;&gt; &gt; Twitter: @anynines<br>&gt;&gt; &gt;<br>&gt;&gt; &gt; - ----<br>&gt;&gt; &gt; Gesch?ftsf?hrer: Alexander Fai?t, Dipl.-Inf.(FH) Julian Fischer<br>&gt;&gt; &gt; Handelsregister: AG Saarbr?cken HRB 17413, Ust-IdNr.: DE262633168<br>&gt;&gt; &gt; Sitz: Saarbr?cken<br>&gt;&gt; &gt; -----BEGIN PGP SIGNATURE-----<br>&gt;&gt; &gt; Version: GnuPG v1.4.14 (GNU/Linux)<br>&gt;&gt; &gt; Comment: Using GnuPG with Thunderbird - </font></tt><a href=http://www.enigmail.net/><tt><font size=2>http://www.enigmail.net/</font></tt></a><tt><font size=2><br>&gt;&gt; &gt;<br>&gt;&gt; &gt; iQEcBAEBAgAGBQJSpcMfAAoJELxFogM4ixOF/ncH/3L9DvOWHrF0XBqCgeT6QQ6B<br>&gt;&gt; &gt; lDwtXiD9xoznht0Zs2S9LA9Z7r2l5/fzMOUSOawEMv6M16Guwq3gQ1lClUi4Iwj0<br>&gt;&gt; &gt; GKKtYQ6F4aG4KXHY4dlu1QKT5OaLk8ljCQ47Tc9aAiJMhfC1/IgQXOslFv26utdJ<br>&gt;&gt; &gt; N9jxiCl2+r/tQvQRw6mA4KAuPYPwOV+hMtkwfrM4UsIYGGbkNPnz1oqmBsfGdSOs<br>&gt;&gt; &gt; TJh6+lQRD9KYw72q3I9G6ZYlI7ylL9Q7vjTroVKH232pLo4G58NLxyvWvcOB9yK6<br>&gt;&gt; &gt; Bpf/gRMxFNKA75eW5EJYeZ6EovwcyCAv7iAm+xNKhzsoZqbBbTOJxS5zKm4YWoY=<br>&gt;&gt; &gt; =bDly<br>&gt;&gt; &gt; -----END PGP SIGNATURE-----<br>&gt;&gt; &gt;<br>&gt;&gt; &gt; _______________________________________________<br>&gt;&gt; &gt; Gluster-users mailing list</font></tt><br><tt><font size=2>&gt;&gt; &gt; Gluster-users@gluster.org<br>&gt;&gt; &gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt; 
_______________________________________________<br>&gt;&gt; Gluster-users mailing list<br>&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br><br>------------------------------<br><br>Message: 17<br>Date: Mon, 09 Dec 2013 14:09:11 -0800<br>From: Joe Julian &lt;joe@julianfamily.org&gt;<br>To: Dan Mons &lt;dmons@cuttingedge.com.au&gt;<br>Cc: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Gluster infrastructure question<br>Message-ID: &lt;52A63F87.8070107@julianfamily.org&gt;<br>Content-Type: text/plain; charset=UTF-8; format=flowed<br><br>Replicas are defined in the order bricks are listed in the volume create

<br>command. So<br> &nbsp; gluster volume create myvol replica 2 server1:/data/brick1 <br>server2:/data/brick1 server3:/data/brick1 server4:/data/brick1<br>will replicate between server1 and server2 and replicate between server3

<br>and server4.<br><br>Bricks added to a replica 2 volume after it's been created will require

<br>pairs of bricks.<br><br>The best way to &quot;force&quot; replication to happen on another server

is to <br>just define it that way.<br><br>On 12/09/2013 01:58 PM, Dan Mons wrote:<br>&gt; I went with big RAID on each node (16x 3TB SATA disks in RAID6 with

a<br>&gt; hot spare per node) rather than brick-per-disk. &nbsp;The simple reason<br>&gt; being that I wanted to configure distribute+replicate at the GlusterFS<br>&gt; level, and be 100% guaranteed that the replication happened across

to<br>&gt; another node, and not to another brick on the same node. &nbsp;As

each node<br>&gt; only has one giant brick, the cluster is forced to replicate to a<br>&gt; separate node each time.<br>&gt;<br>&gt; Some careful initial setup could probably have done the same, but

I<br>&gt; wanted to avoid the dramas of my employer expanding the cluster one<br>&gt; node at a time later on, causing that design goal to fail as the new<br>&gt; single node with many bricks found replication partners on itself.<br>&gt;<br>&gt; On a different topic, I find no real-world difference in RAID10 to<br>&gt; RAID6 with GlusterFS. &nbsp;Most of the access delay in Gluster has

little<br>&gt; to do with the speed of the disk. &nbsp;The only downside to RAID6

is a<br>&gt; long rebuild time if you're unlucky enough to blow a couple of drives<br>&gt; at once. &nbsp;RAID50 might be a better choice if you're up at 20

drives<br>&gt; per node.<br>&gt;<br>&gt; We invested in SSD caching on our nodes, and to be honest it was<br>&gt; rather pointless. &nbsp;Certainly not bad, but the real-world speed

boost<br>&gt; is not noticed by end users.<br>&gt;<br>&gt; -Dan<br>&gt;<br>&gt; ----------------<br>&gt; Dan Mons<br>&gt; R&amp;D SysAdmin<br>&gt; Unbreaker of broken things<br>&gt; Cutting Edge<br>&gt; </font></tt><a href=http://cuttingedge.com.au/><tt><font size=2>http://cuttingedge.com.au</font></tt></a><tt><font size=2><br>&gt;<br>&gt;<br>&gt; On 10 December 2013 05:31, Ben Turner &lt;bturner@redhat.com&gt; wrote:<br>&gt;&gt; ----- Original Message -----<br>&gt;&gt;&gt; From: &quot;Ben Turner&quot; &lt;bturner@redhat.com&gt;<br>&gt;&gt;&gt; To: &quot;Heiko Kr?mer&quot; &lt;hkraemer@anynines.de&gt;<br>&gt;&gt;&gt; Cc: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>&gt;&gt;&gt; Sent: Monday, December 9, 2013 2:26:45 PM<br>&gt;&gt;&gt; Subject: Re: [Gluster-users] Gluster infrastructure question<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; ----- Original Message -----<br>&gt;&gt;&gt;&gt; From: &quot;Heiko Kr?mer&quot; &lt;hkraemer@anynines.de&gt;<br>&gt;&gt;&gt;&gt; To: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>&gt;&gt;&gt;&gt; Sent: Monday, December 9, 2013 8:18:28 AM<br>&gt;&gt;&gt;&gt; Subject: [Gluster-users] Gluster infrastructure question<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; -----BEGIN PGP SIGNED MESSAGE-----<br>&gt;&gt;&gt;&gt; Hash: SHA1<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; Heyho guys,<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; I'm running since years glusterfs in a small environment

without big<br>&gt;&gt;&gt;&gt; problems.<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; Now I'm going to use glusterFS for a bigger cluster but

I've some<br>&gt;&gt;&gt;&gt; questions :)<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; Environment:<br>&gt;&gt;&gt;&gt; * 4 Servers<br>&gt;&gt;&gt;&gt; * 20 x 2TB HDD, each<br>&gt;&gt;&gt;&gt; * Raidcontroller<br>&gt;&gt;&gt;&gt; * Raid 10<br>&gt;&gt;&gt;&gt; * 4x bricks =&gt; Replicated, Distributed volume<br>&gt;&gt;&gt;&gt; * Gluster 3.4<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; 1)<br>&gt;&gt;&gt;&gt; I'm asking me, if I can delete the raid10 on each server

and create<br>&gt;&gt;&gt;&gt; for each HDD a separate brick.<br>&gt;&gt;&gt;&gt; In this case have a volume 80 Bricks so 4 Server x 20

HDD's. Is there<br>&gt;&gt;&gt;&gt; any experience about the write throughput in a production

system with<br>&gt;&gt;&gt;&gt; many of bricks like in this case? In addition i'll get

double of HDD<br>&gt;&gt;&gt;&gt; capacity.<br>&gt;&gt;&gt; Have a look at:<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; </font></tt><a href="http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf"><tt><font size=2>http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf</font></tt></a><tt><font size=2><br>&gt;&gt; That one was from 2012, here is the latest:<br>&gt;&gt;<br>&gt;&gt; </font></tt><a href="http://rhsummit.files.wordpress.com/2013/07/england_th_0450_rhs_perf_practices-4_neependra.pdf"><tt><font size=2>http://rhsummit.files.wordpress.com/2013/07/england_th_0450_rhs_perf_practices-4_neependra.pdf</font></tt></a><tt><font size=2><br>&gt;&gt;<br>&gt;&gt; -b<br>&gt;&gt;<br>&gt;&gt;&gt; Specifically:<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; ? RAID arrays<br>&gt;&gt;&gt; ? More RAID LUNs for better concurrency<br>&gt;&gt;&gt; ? For RAID6, 256-KB stripe size<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; I use a single RAID 6 that is divided into several LUNs for

my bricks. &nbsp;For<br>&gt;&gt;&gt; example, on my Dell servers(with PERC6 RAID controllers) each

server has 12<br>&gt;&gt;&gt; disks that I put into raid 6. &nbsp;Then I break the RAID

6 into 6 LUNs and<br>&gt;&gt;&gt; create a new PV/VG/LV for each brick. &nbsp;From there I follow

the<br>&gt;&gt;&gt; recommendations listed in the presentation.<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; HTH!<br>&gt;&gt;&gt;<br>&gt;&gt;&gt; -b<br>&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; 2)<br>&gt;&gt;&gt;&gt; I've heard a talk about glusterFS and out scaling. The

main point was<br>&gt;&gt;&gt;&gt; if more bricks are in use, the scale out process will

take a long<br>&gt;&gt;&gt;&gt; time. The problem was/is the Hash-Algo. So I'm asking

me how is it if<br>&gt;&gt;&gt;&gt; I've one very big brick (Raid10 20TB on each server) or

I've much more<br>&gt;&gt;&gt;&gt; bricks, what's faster and is there any issues?<br>&gt;&gt;&gt;&gt; Is there any experiences ?<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; 3)<br>&gt;&gt;&gt;&gt; Failover of a HDD is for a raid controller with HotSpare

HDD not a big<br>&gt;&gt;&gt;&gt; deal. Glusterfs will rebuild automatically if a brick

fails and there<br>&gt;&gt;&gt;&gt; are no data present, this action will perform a lot of

network traffic<br>&gt;&gt;&gt;&gt; between the mirror bricks but it will handle it equal

as the raid<br>&gt;&gt;&gt;&gt; controller right ?<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; Thanks and cheers<br>&gt;&gt;&gt;&gt; Heiko<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; - --<br>&gt;&gt;&gt;&gt; Anynines.com<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; Avarteq GmbH<br>&gt;&gt;&gt;&gt; B.Sc. Informatik<br>&gt;&gt;&gt;&gt; Heiko Kr?mer<br>&gt;&gt;&gt;&gt; CIO<br>&gt;&gt;&gt;&gt; Twitter: @anynines<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; - ----<br>&gt;&gt;&gt;&gt; Gesch?ftsf?hrer: Alexander Fai?t, Dipl.-Inf.(FH) Julian

Fischer<br>&gt;&gt;&gt;&gt; Handelsregister: AG Saarbr&uuml;cken HRB 17413, Ust-IdNr.:

DE262633168<br>&gt;&gt;&gt;&gt; Sitz: Saarbr?cken<br>&gt;&gt;&gt;&gt; -----BEGIN PGP SIGNATURE-----<br>&gt;&gt;&gt;&gt; Version: GnuPG v1.4.14 (GNU/Linux)<br>&gt;&gt;&gt;&gt; Comment: Using GnuPG with Thunderbird - </font></tt><a href=http://www.enigmail.net/><tt><font size=2>http://www.enigmail.net/</font></tt></a><tt><font size=2><br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; iQEcBAEBAgAGBQJSpcMfAAoJELxFogM4ixOF/ncH/3L9DvOWHrF0XBqCgeT6QQ6B<br>&gt;&gt;&gt;&gt; lDwtXiD9xoznht0Zs2S9LA9Z7r2l5/fzMOUSOawEMv6M16Guwq3gQ1lClUi4Iwj0<br>&gt;&gt;&gt;&gt; GKKtYQ6F4aG4KXHY4dlu1QKT5OaLk8ljCQ47Tc9aAiJMhfC1/IgQXOslFv26utdJ<br>&gt;&gt;&gt;&gt; N9jxiCl2+r/tQvQRw6mA4KAuPYPwOV+hMtkwfrM4UsIYGGbkNPnz1oqmBsfGdSOs<br>&gt;&gt;&gt;&gt; TJh6+lQRD9KYw72q3I9G6ZYlI7ylL9Q7vjTroVKH232pLo4G58NLxyvWvcOB9yK6<br>&gt;&gt;&gt;&gt; Bpf/gRMxFNKA75eW5EJYeZ6EovwcyCAv7iAm+xNKhzsoZqbBbTOJxS5zKm4YWoY=<br>&gt;&gt;&gt;&gt; =bDly<br>&gt;&gt;&gt;&gt; -----END PGP SIGNATURE-----<br>&gt;&gt;&gt;&gt;<br>&gt;&gt;&gt;&gt; _______________________________________________<br>&gt;&gt;&gt;&gt; Gluster-users mailing list<br>&gt;&gt;&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt;&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt;&gt; _______________________________________________<br>&gt;&gt;&gt; Gluster-users mailing list<br>&gt;&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt; _______________________________________________<br>&gt;&gt; Gluster-users mailing list<br>&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font 
size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br><br><br>------------------------------<br><br>Message: 18<br>Date: Tue, 10 Dec 2013 09:38:03 +1000<br>From: Dan Mons &lt;dmons@cuttingedge.com.au&gt;<br>To: Joe Julian &lt;joe@julianfamily.org&gt;<br>Cc: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Gluster infrastructure question<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CACa6TyenCTAgoKKsXCmrvd0G191VdBPkdNf3j4yROkT_9jTyhQ@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=ISO-8859-1<br><br>On 10 December 2013 08:09, Joe Julian &lt;joe@julianfamily.org&gt; wrote:<br>&gt; Replicas are defined in the order bricks are listed in the volume

create<br>&gt; command. So<br>&gt; &nbsp; gluster volume create myvol replica 2 server1:/data/brick1<br>&gt; server2:/data/brick1 server3:/data/brick1 server4:/data/brick1<br>&gt; will replicate between server1 and server2 and replicate between server3

and<br>&gt; server4.<br>&gt;<br>&gt; Bricks added to a replica 2 volume after it's been created will require<br>&gt; pairs of bricks,<br>&gt;<br>&gt; The best way to &quot;force&quot; replication to happen on another

server is to just<br>&gt; define it that way.<br><br>Yup, that's understood. &nbsp;The problem is when (for argument's sake):

<br><br>* We've defined 4 hosts with 10 disks each<br>* Each individual disk is a brick<br>* Replication is defined correctly when creating the volume initially<br>* I'm on holidays, my employer buys a single node, configures it<br>brick-per-disk, and the IT junior adds it to the cluster<br><br>All good up until that final point, and then I've got that fifth node<br>at the end replicating to itself. &nbsp;Node goes down some months later,<br>chaos ensues.<br><br>Not a GlusterFS/technology problem, but a problem with what frequently<br>happens at a human level. &nbsp;As a sysadmin, these are also things I

need<br>to work around, even if it means deviating from best practices. :)<br><br>-Dan<br><br><br>------------------------------<br><br>Message: 19<br>Date: Tue, 10 Dec 2013 11:06:06 +0700<br>From: Diep Pham Van &lt;imeo@favadi.com&gt;<br>To: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] [CentOS 6] Upgrade to the glusterfs<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

version in base or in glusterfs-epel<br>Message-ID: &lt;20131210110606.2e217dc6@debbox&gt;<br>Content-Type: text/plain; charset=US-ASCII<br><br>On Mon, 9 Dec 2013 19:53:20 +0900<br>Nguyen Viet Cuong &lt;mrcuongnv@gmail.com&gt; wrote:<br><br>&gt; There is no glusterfs-server in the &quot;base&quot; repository, just

client.<br>Silly me.<br>After installing and attempting to mount with the base version of glusterfs-fuse,<br>I realized that I have to change the 'backupvolfile-server' mount option to<br>'backup-volfile-servers'[1].<br><br>Links:<br>[1] </font></tt><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1023950"><tt><font size=2>https://bugzilla.redhat.com/show_bug.cgi?id=1023950</font></tt></a><tt><font size=2><br><br>-- <br>PHAM Van Diep<br><br><br>------------------------------<br><br>Message: 20<br>Date: Mon, 09 Dec 2013 20:44:06 -0800<br>From: harry mangalam &lt;harry.mangalam@uci.edu&gt;<br>To: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: [Gluster-users] Where does the 'date' string in<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

'/var/log/glusterfs/gl.log' come from?<br>Message-ID: &lt;34671480.j6DT7uby7B@stunted&gt;<br>Content-Type: text/plain; charset=&quot;us-ascii&quot;<br><br>Admittedly I should search the source, but I wonder if anyone knows this

<br>offhand.<br><br>Background: &nbsp;of our &nbsp;84 ROCKS (6.1) -provisioned compute nodes,

4 have picked <br>up an 'advanced date' &nbsp;in the /var/log/glusterfs/gl.log file - that

date <br>string is running about 5-6 hours ahead of the system date and all the

Gluster <br>servers (which are identical and correct). &nbsp;The time advancement does

not <br>appear to be identical tho it's hard to tell since it only shows on errors

and <br>those update irregularly.<br><br>All the clients are the same version and all the servers are the same (gluster

<br>v 3.4.0-8.el6.x86_64<br><br>This would not be of interest except that those 4 clients are losing files,

<br>unable to reliably do IO, etc on the gluster fs. &nbsp;They don't appear

to be <br>having problems with NFS mounts, nor with a Fraunhofer FS that is also

mounted <br>on each node,<br><br>Rebooting 2 of them has no effect - they come right back with an advanced

<br>date.<br><br><br>---<br>Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine<br>[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487<br>415 South Circle View Dr, Irvine, CA, 92697 [shipping]<br>MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)<br>---<br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/9cde5ba3/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131209/9cde5ba3/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 21<br>Date: Tue, 10 Dec 2013 12:49:25 +0800<br>From: Sharuzzaman Ahmat Raslan &lt;sharuzzaman@gmail.com&gt;<br>To: harry mangalam &lt;harry.mangalam@uci.edu&gt;<br>Cc: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Where does the 'date' string in<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

'/var/log/glusterfs/gl.log' come from?<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CAK+zuc=5SY7wuFXUe-i2nUXAhGr+Ddaahr_7TKYgMxgtWKh1zg@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>Hi Harry,<br><br>Did you set up NTP on each of the nodes, and sync the time to a single<br>source?<br><br>Thanks.<br><br><br>On Tue, Dec 10, 2013 at 12:44 PM, harry mangalam &lt;harry.mangalam@uci.edu&gt; wrote:<br><br>&gt; &nbsp;Admittedly I should search the source, but I wonder if anyone

knows this<br>&gt; offhand.<br>&gt;<br>&gt;<br>&gt;<br>&gt; Background: of our 84 ROCKS (6.1) -provisioned compute nodes, 4 have<br>&gt; picked up an 'advanced date' in the /var/log/glusterfs/gl.log file

- that<br>&gt; date string is running about 5-6 hours ahead of the system date and

all the<br>&gt; Gluster servers (which are identical and correct). The time advancement<br>&gt; does not appear to be identical tho it's hard to tell since it only

shows<br>&gt; on errors and those update irregularly.<br>&gt;<br>&gt;<br>&gt;<br>&gt; All the clients are the same version and all the servers are the same<br>&gt; (gluster v 3.4.0-8.el6.x86_64<br>&gt;<br>&gt;<br>&gt;<br>&gt; This would not be of interest except that those 4 clients are losing<br>&gt; files, unable to reliably do IO, etc on the gluster fs. They don't

appear<br>&gt; to be having problems with NFS mounts, nor with a Fraunhofer FS that

is<br>&gt; also mounted on each node,<br>&gt;<br>&gt;<br>&gt;<br>&gt; Rebooting 2 of them has no effect - they come right back with an advanced<br>&gt; date.<br>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt; ---<br>&gt;<br>&gt; Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine<br>&gt;<br>&gt; [m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487<br>&gt;<br>&gt; 415 South Circle View Dr, Irvine, CA, 92697 [shipping]<br>&gt;<br>&gt; MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)<br>&gt;<br>&gt; ---<br>&gt;<br><br><br><br>-- <br>Sharuzzaman Ahmat Raslan<br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/d0de4ecd/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/d0de4ecd/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 22<br>Date: Tue, 10 Dec 2013 04:49:50 +0000<br>From: Bobby Jacob &lt;bobby.jacob@alshaya.com&gt;<br>To: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: [Gluster-users] FW: Self Heal Issue GlusterFS 3.3.1<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;AC3305F9C186F849B835A3E6D3C9BEFEB5A763@KWTPRMBX001.mha.local&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>Hi,<br><br>Can someone please advise on this issue. ?? Urgent. Selfheal is working

every 10 minutes only. ??<br><br>Thanks &amp; Regards,<br>Bobby Jacob<br><br>From: Bobby Jacob<br>Sent: Tuesday, December 03, 2013 8:51 AM<br>To: gluster-users@gluster.org<br>Subject: FW: Self Heal Issue GlusterFS 3.3.1<br><br>Just an addition: on the node where the self heal is not working, when

I check /var/log/glusterd/glustershd.log, I see the following:<br><br>[2013-12-03 05:49:18.348637] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.350273] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.354813] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.355893] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.356901] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.357730] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.359136] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.360276] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.361168] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.362135] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.363569] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.364232] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.364872] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.365777] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.367383] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br>[2013-12-03 05:49:18.368075] E [afr-self-heald.c:685:_link_inode_update_loc]

0-glustervol-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)<br><br>Thanks &amp; Regards,<br>Bobby Jacob<br><br>From: gluster-users-bounces@gluster.org [</font></tt><a href="mailto:gluster-users-bounces@gluster.org"><tt><font size=2>mailto:gluster-users-bounces@gluster.org</font></tt></a><tt><font size=2>]

On Behalf Of Bobby Jacob<br>Sent: Tuesday, December 03, 2013 8:48 AM<br>To: gluster-users@gluster.org<br>Subject: [Gluster-users] Self Heal Issue GlusterFS 3.3.1<br><br>Hi,<br><br>I'm running glusterFS 3.3.1 on Centos 6.4.<br><br>? &nbsp;Gluster volume status<br><br><br><br>Status of volume: glustervol<br><br>Gluster process &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; Port &nbsp; &nbsp;Online &nbsp;Pid<br><br>------------------------------------------------------------------------------<br><br>Brick KWTOCUATGS001:/mnt/cloudbrick &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 24009 &nbsp; Y &nbsp; &nbsp; &nbsp;

20031<br><br>Brick KWTOCUATGS002:/mnt/cloudbrick &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 24009 &nbsp; Y &nbsp; &nbsp; &nbsp;

1260<br><br>NFS Server on localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 38467 &nbsp; Y

&nbsp; &nbsp; &nbsp; 43320<br><br>Self-heal Daemon on localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp;N/A &nbsp; &nbsp; Y &nbsp; &nbsp; &nbsp; 43326<br><br>NFS Server on KWTOCUATGS002 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 38467 &nbsp; Y &nbsp;

&nbsp; &nbsp; 5842<br><br>Self-heal Daemon on KWTOCUATGS002 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; N/A &nbsp; &nbsp; Y &nbsp; &nbsp; &nbsp;

5848<br><br>Self-heal stops working, and the application writes to only one brick; the data

doesn't replicate. When I check /var/log/glusterfs/glustershd.log, I see

the following:<br><br>[2013-12-03 05:42:32.033563] W [socket.c:410:__socket_keepalive] 0-socket:

failed to set keep idle on socket 8<br>[2013-12-03 05:42:32.033646] W [socket.c:1876:socket_server_event_handler]

0-socket.glusterfsd: Failed to set keep-alive: Operation not supported<br>[2013-12-03 05:42:32.790473] I [client-handshake.c:1614:select_server_supported_programs]

0-glustervol-client-1: Using Program GlusterFS 3.3.2, Num (1298437), Version

(330)<br>[2013-12-03 05:42:32.790840] I [client-handshake.c:1411:client_setvolume_cbk]

0-glustervol-client-1: Connected to 172.16.95.153:24009, attached to remote

volume '/mnt/cloudbrick'.<br>[2013-12-03 05:42:32.790884] I [client-handshake.c:1423:client_setvolume_cbk]

0-glustervol-client-1: Server and Client lk-version numbers are not same,

reopening the fds<br>[2013-12-03 05:42:32.791003] I [afr-common.c:3685:afr_notify] 0-glustervol-replicate-0:

Subvolume 'glustervol-client-1' came back up; going online.<br>[2013-12-03 05:42:32.791161] I [client-handshake.c:453:client_set_lk_version_cbk]

0-glustervol-client-1: Server lk version = 1<br>[2013-12-03 05:42:32.795103] E [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]

0-glustervol-replicate-0: open of &lt;gfid:a7e88fd1-6e32-40ab-90f6-ea452242a7c6&gt;

failed on child glustervol-client-0 (Transport endpoint is not connected)<br>[2013-12-03 05:42:32.798064] E [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]

0-glustervol-replicate-0: open of &lt;gfid:081c6657-301a-42a4-9f95-6eeba6c67413&gt;

failed on child glustervol-client-0 (Transport endpoint is not connected)<br>[2013-12-03 05:42:32.799278] E [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]

0-glustervol-replicate-0: open of &lt;gfid:565f1358-449c-45e2-8535-93b5632c0d1e&gt;

failed on child glustervol-client-0 (Transport endpoint is not connected)<br>[2013-12-03 05:42:32.800636] E [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]

0-glustervol-replicate-0: open of &lt;gfid:9c7010ac-5c11-4561-8b86-5c4d6561f34e&gt;

failed on child glustervol-client-0 (Transport endpoint is not connected)<br>[2013-12-03 05:42:32.802223] E [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]

0-glustervol-replicate-0: open of &lt;gfid:25fd406f-63e0-4037-bb01-da282cbe4d76&gt;

failed on child glustervol-client-0 (Transport endpoint is not connected)<br>[2013-12-03 05:42:32.803339] E [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]

0-glustervol-replicate-0: open of &lt;gfid:a109c429-5885-499e-8711-09fdccd396f2&gt;

failed on child glustervol-client-0 (Transport endpoint is not connected)<br>[2013-12-03 05:42:32.804308] E [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]

0-glustervol-replicate-0: open of &lt;gfid:5a8fd3bf-9215-444c-b974-5c280f5699a6&gt;

failed on child glustervol-client-0 (Transport endpoint is not connected)<br>[2013-12-03 05:42:32.804877] I [client-handshake.c:1614:select_server_supported_programs]

0-glustervol-client-0: Using Program GlusterFS 3.3.2, Num (1298437), Version

(330)<br>[2013-12-03 05:42:32.807517] I [client-handshake.c:1411:client_setvolume_cbk]

0-glustervol-client-0: Connected to 172.16.107.154:24009, attached to remote

volume '/mnt/cloudbrick'.<br>[2013-12-03 05:42:32.807562] I [client-handshake.c:1423:client_setvolume_cbk]

0-glustervol-client-0: Server and Client lk-version numbers are not same,

reopening the fds<br>[2013-12-03 05:42:32.810357] I [client-handshake.c:453:client_set_lk_version_cbk]

0-glustervol-client-0: Server lk version = 1<br>[2013-12-03 05:42:32.827437] E [afr-self-heal-data.c:764:afr_sh_data_fxattrop_fstat_done]

0-glustervol-replicate-0: Unable to self-heal contents of '&lt;gfid:1262d40d-46a3-4e57-b07b-0fcc972c8403&gt;'

(possible split-brain). Please delete the file from all but the preferred

subvolume.<br>[2013-12-03 05:42:39.205157] E [afr-self-heal-metadata.c:472:afr_sh_metadata_fix]

0-glustervol-replicate-0: Unable to self-heal permissions/ownership of

'&lt;gfid:c590e3fb-a376-4ac9-86a6-14a80814e06f&gt;' (possible split-brain).

Please fix the file on all backend volumes<br>[2013-12-03 05:42:39.215793] E [afr-self-heal-metadata.c:472:afr_sh_metadata_fix]

0-glustervol-replicate-0: Unable to self-heal permissions/ownership of

'&lt;gfid:c0660768-289f-48ac-b8e5-e5b5a3a4b965&gt;' (possible split-brain).

Please fix the file on all backend volumes<br><br><br>PLEASE ADVISE.<br><br>Thanks &amp; Regards,<br>Bobby Jacob<br><br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/8fa935eb/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/8fa935eb/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br>-------------- next part --------------<br>An embedded and charset-unspecified text was scrubbed...<br>Name: ATT00001.txt<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/8fa935eb/attachment-0001.txt"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/8fa935eb/attachment-0001.txt</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 23<br>Date: Mon, 09 Dec 2013 20:59:21 -0800<br>From: Joe Julian &lt;joe@julianfamily.org&gt;<br>To: Bobby Jacob &lt;bobby.jacob@alshaya.com&gt;<br>Cc: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Self Heal Issue GlusterFS 3.3.1<br>Message-ID: &lt;1386651561.2455.12.camel@bunion-ii.julianfamily.org&gt;<br>Content-Type: text/plain; charset=&quot;UTF-8&quot;<br><br>On Tue, 2013-12-03 at 05:47 +0000, Bobby Jacob wrote:<br>&gt; Hi,<br>&gt; <br>&gt; &nbsp;<br>&gt; <br>&gt; I'm running glusterFS 3.3.1 on Centos 6.4. <br>&gt; <br>&gt; ? Gluster volume status<br>&gt; <br>&gt; &nbsp;<br>&gt; <br>&gt; Status of volume: glustervol<br>&gt; <br>&gt; Gluster process &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; Port &nbsp; &nbsp;Online<br>&gt; Pid<br>&gt; <br>&gt; ------------------------------------------------------------------------------<br>&gt; <br>&gt; Brick KWTOCUATGS001:/mnt/cloudbrick &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 24009 &nbsp; Y<br>&gt; 20031<br>&gt; <br>&gt; Brick KWTOCUATGS002:/mnt/cloudbrick &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 24009 &nbsp; Y<br>&gt; 1260<br>&gt; <br>&gt; NFS Server on localhost<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; 38467 &nbsp; Y &nbsp; &nbsp; &nbsp; 43320<br>&gt; <br>&gt; Self-heal Daemon on localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp;N/A<br>&gt; Y &nbsp; &nbsp; &nbsp; 43326<br>&gt; <br>&gt; NFS Server on KWTOCUATGS002 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 38467 &nbsp; Y<br>&gt; 5842<br>&gt; <br>&gt; Self-heal Daemon on KWTOCUATGS002 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; N/A &nbsp; &nbsp; Y<br>&gt; 5848<br>&gt; <br>&gt; &nbsp;<br>&gt; <br>&gt; The self heal stops working and application write only to 1 brick

and<br>&gt; it doesn't replicate. When I check /var/log/glusterfs/glustershd.log

I<br>&gt; see the following.:<br>&gt; <br>&gt; &nbsp;<br>&gt; <br>&gt; [2013-12-03 05:42:32.033563] W [socket.c:410:__socket_keepalive]<br>&gt; 0-socket: failed to set keep idle on socket 8<br>&gt; <br>&gt; [2013-12-03 05:42:32.033646] W<br>&gt; [socket.c:1876:socket_server_event_handler] 0-socket.glusterfsd:<br>&gt; Failed to set keep-alive: Operation not supported<br>&gt; <br>&gt; [2013-12-03 05:42:32.790473] I<br>&gt; [client-handshake.c:1614:select_server_supported_programs]<br>&gt; 0-glustervol-client-1: Using Program GlusterFS 3.3.2, Num (1298437),<br>&gt; Version (330)<br>&gt; <br>&gt; [2013-12-03 05:42:32.790840] I<br>&gt; [client-handshake.c:1411:client_setvolume_cbk] 0-glustervol-client-1:<br>&gt; Connected to 172.16.95.153:24009, attached to remote volume<br>&gt; '/mnt/cloudbrick'.<br>&gt; <br>&gt; [2013-12-03 05:42:32.790884] I<br>&gt; [client-handshake.c:1423:client_setvolume_cbk] 0-glustervol-client-1:<br>&gt; Server and Client lk-version numbers are not same, reopening the fds<br>&gt; <br>&gt; [2013-12-03 05:42:32.791003] I [afr-common.c:3685:afr_notify]<br>&gt; 0-glustervol-replicate-0: Subvolume 'glustervol-client-1' came back<br>&gt; up; going online.<br>&gt; <br>&gt; [2013-12-03 05:42:32.791161] I<br>&gt; [client-handshake.c:453:client_set_lk_version_cbk]<br>&gt; 0-glustervol-client-1: Server lk version = 1<br>&gt; </font></tt><br><tt><font size=2>&gt; [2013-12-03 05:42:32.795103] E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:a7e88fd1-6e32-40ab-90f6-ea452242a7c6&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.798064] E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:081c6657-301a-42a4-9f95-6eeba6c67413&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.799278] 
E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:565f1358-449c-45e2-8535-93b5632c0d1e&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.800636] E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:9c7010ac-5c11-4561-8b86-5c4d6561f34e&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.802223] E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:25fd406f-63e0-4037-bb01-da282cbe4d76&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.803339] E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:a109c429-5885-499e-8711-09fdccd396f2&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.804308] E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:5a8fd3bf-9215-444c-b974-5c280f5699a6&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.804877] I<br>&gt; [client-handshake.c:1614:select_server_supported_programs]<br>&gt; 0-glustervol-client-0: Using Program GlusterFS 3.3.2, Num (1298437),<br>&gt; Version (330)<br>&gt; <br>&gt; [2013-12-03 05:42:32.807517] I<br>&gt; [client-handshake.c:1411:client_setvolume_cbk] 0-glustervol-client-0:<br>&gt; Connected to 172.16.107.154:24009, attached to remote volume<br>&gt; '/mnt/cloudbrick'.<br>&gt; <br>&gt; [2013-12-03 05:42:32.807562] I<br>&gt; [client-handshake.c:1423:client_setvolume_cbk] 0-glustervol-client-0:<br>&gt; Server and Client lk-version numbers are 
not same, reopening the fds<br>&gt; <br>&gt; [2013-12-03 05:42:32.810357] I<br>&gt; [client-handshake.c:453:client_set_lk_version_cbk]<br>&gt; 0-glustervol-client-0: Server lk version = 1<br>&gt; <br>&gt; [2013-12-03 05:42:32.827437] E<br>&gt; [afr-self-heal-data.c:764:afr_sh_data_fxattrop_fstat_done]<br>&gt; 0-glustervol-replicate-0: Unable to self-heal contents of<br>&gt; '&lt;gfid:1262d40d-46a3-4e57-b07b-0fcc972c8403&gt;' (possible split-brain).<br>&gt; Please delete the file from all but the preferred subvolume.<br><br>That file is at<br>$brick/.glusterfs/12/62/1262d40d-46a3-4e57-b07b-0fcc972c8403<br><br>Try picking one to remove like it says.<br>&gt; <br>&gt; [2013-12-03 05:42:39.205157] E<br>&gt; [afr-self-heal-metadata.c:472:afr_sh_metadata_fix]<br>&gt; 0-glustervol-replicate-0: Unable to self-heal permissions/ownership

of<br>&gt; '&lt;gfid:c590e3fb-a376-4ac9-86a6-14a80814e06f&gt;' (possible split-brain).<br>&gt; Please fix the file on all backend volumes<br>&gt; <br>&gt; [2013-12-03 05:42:39.215793] E<br>&gt; [afr-self-heal-metadata.c:472:afr_sh_metadata_fix]<br>&gt; 0-glustervol-replicate-0: Unable to self-heal permissions/ownership

of<br>&gt; '&lt;gfid:c0660768-289f-48ac-b8e5-e5b5a3a4b965&gt;' (possible split-brain).<br>&gt; Please fix the file on all backend volumes<br>&gt; <br>&gt; &nbsp;<br>If that doesn't allow it to heal, you may need to find which filename<br>that gfid file is hardlinked to. Run ls -li on the gfid file at the path I demonstrated<br>earlier. With that inode number in hand, run find $brick -inum $inode_number.<br>Once you know which filenames it's linked with, remove all linked copies<br>from all but one replica. Then the self-heal can continue successfully.<br><br><br><br><br>------------------------------<br><br>Message: 24<br>Date: Tue, 10 Dec 2013 13:09:38 +0800<br>From: Franco Broi &lt;franco.broi@iongeo.com&gt;<br>To: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: [Gluster-users] Pausing rebalance<br>Message-ID: &lt;1386652178.1682.110.camel@tc1&gt;<br>Content-Type: text/plain; charset=&quot;UTF-8&quot;<br><br><br>Before attempting a rebalance on my existing distributed Gluster volume<br>I thought I'd do some testing with my new storage. I created a volume<br>consisting of 4 bricks on the same server and wrote some data to it. I<br>then added a new brick from another server. I ran the fix-layout and<br>wrote some new files and could see them on the new brick. All good so<br>far, so I started the data rebalance. After it had been running for a<br>while I wanted to add another brick, which I obviously couldn't do while<br>it was running so I stopped it. Even with it stopped it wouldn't let me<br>add a brick so I tried restarting it, but it wouldn't let me do that<br>either. I presume you just reissue the start command as there's no<br>restart?<br><br>[root@nas3 ~]# gluster vol rebalance test-volume status<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Node Rebalanced-files

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;size &nbsp; &nbsp; &nbsp; scanned &nbsp;

&nbsp; &nbsp;failures &nbsp; &nbsp; &nbsp; skipped &nbsp; &nbsp; &nbsp;

&nbsp; status run time in secs<br>--------- &nbsp; &nbsp; &nbsp;----------- &nbsp; ----------- &nbsp; -----------

&nbsp; ----------- &nbsp; ----------- &nbsp; ------------ &nbsp; --------------<br>localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;7 &nbsp;

&nbsp; &nbsp; 611.7GB &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;1358 &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; 0 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;10

&nbsp; &nbsp; &nbsp; &nbsp;stopped &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;4929.00<br>localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;7 &nbsp;

&nbsp; &nbsp; 611.7GB &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;1358 &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; 0 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;10

&nbsp; &nbsp; &nbsp; &nbsp;stopped &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;4929.00<br> nas4-10g &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;0 &nbsp;

&nbsp; &nbsp; &nbsp;0Bytes &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;1506 &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 0 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; 0 &nbsp; &nbsp; &nbsp;completed &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; 8.00<br>volume rebalance: test-volume: success: <br>[root@nas3 ~]# gluster vol add-brick test-volume nas4-10g:/data14/gvol<br>volume add-brick: failed: Volume name test-volume rebalance is in progress.

Please retry after completion<br>[root@nas3 ~]# gluster vol rebalance test-volume start<br>volume rebalance: test-volume: failed: Rebalance on test-volume is already

started<br><br>In the end I used the force option to make it start but was that the<br>right thing to do?<br><br>glusterfs 3.4.1 built on Oct 28 2013 11:01:59<br>Volume Name: test-volume<br>Type: Distribute<br>Volume ID: 56ee0173-aed1-4be6-a809-ee0544f9e066<br>Status: Started<br>Number of Bricks: 5<br>Transport-type: tcp<br>Bricks:<br>Brick1: nas3-10g:/data9/gvol<br>Brick2: nas3-10g:/data10/gvol<br>Brick3: nas3-10g:/data11/gvol<br>Brick4: nas3-10g:/data12/gvol<br>Brick5: nas4-10g:/data13/gvol<br><br><br><br><br>------------------------------<br><br>Message: 25<br>Date: Tue, 10 Dec 2013 10:42:28 +0530<br>From: Vijay Bellur &lt;vbellur@redhat.com&gt;<br>To: harry mangalam &lt;harry.mangalam@uci.edu&gt;,<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Where does the 'date' string in<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

'/var/log/glusterfs/gl.log' come from?<br>Message-ID: &lt;52A6A2BC.7010501@redhat.com&gt;<br>Content-Type: text/plain; charset=ISO-8859-1; format=flowed<br><br>On 12/10/2013 10:14 AM, harry mangalam wrote:<br>&gt; Admittedly I should search the source, but I wonder if anyone knows

this<br>&gt; offhand.<br>&gt;<br>&gt; Background: of our 84 ROCKS (6.1) -provisioned compute nodes, 4 have<br>&gt; picked up an 'advanced date' in the /var/log/glusterfs/gl.log file

-<br>&gt; that date string is running about 5-6 hours ahead of the system date

and<br>&gt; all the Gluster servers (which are identical and correct). The time<br>&gt; advancement does not appear to be identical tho it's hard to tell

since<br>&gt; it only shows on errors and those update irregularly.<br><br>The timestamps in the log file are by default in UTC. That could <br>possibly explain why the timestamps look advanced in the log file.<br><br>&gt;<br>&gt; All the clients are the same version and all the servers are the same<br>&gt; (gluster v 3.4.0-8.el6.x86_64<br>&gt;<br>&gt; This would not be of interest except that those 4 clients are losing<br>&gt; files, unable to reliably do IO, etc on the gluster fs. They don't<br>&gt; appear to be having problems with NFS mounts, nor with a Fraunhofer

FS<br>&gt; that is also mounted on each node,<br><br>Do you observe anything in the client log files of these machines that

<br>indicate I/O problems?<br><br>Thanks,<br>Vijay<br><br><br>------------------------------<br><br>Message: 26<br>Date: Tue, 10 Dec 2013 10:56:52 +0530<br>From: shishir gowda &lt;gowda.shishir@gmail.com&gt;<br>To: Franco Broi &lt;franco.broi@iongeo.com&gt;<br>Cc: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Pausing rebalance<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CAMYy+hVgyiPMYiDtkKtA1EBbbcpJAyp3O1_1=oAqKq1dc4NN+g@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>Hi Franco,<br><br>If a file is under migration, and a rebalance stop is encountered, then<br>rebalance process exits only after the completion of the migration.<br>That might be one of the reasons why you saw rebalance in progress message<br>while trying to add the brick<br><br>Could you please share the average file size in your setup?<br><br>You could always check the rebalance status command to ensure rebalance

has<br>indeed completed/stopped before proceeding with the add-brick. Using<br>add-brick force while rebalance is on-going should not be used in normal<br>scenarios. I do see that in your case, they show stopped/completed.<br>Glusterd logs would help in triaging the issue.<br><br>Rebalance re-writes layouts, and migrates data. While this is happening,

if an add-brick is done, then the cluster might go into an imbalanced state. Hence the check for whether a rebalance is in progress while doing add-brick.<br><br>With regards,<br>Shishir<br><br><br>On 10 December 2013 10:39, Franco Broi &lt;franco.broi@iongeo.com&gt; wrote:<br><br>&gt;<br>&gt; Before attempting a rebalance on my existing distributed Gluster volume<br>&gt; I thought I'd do some testing with my new storage. I created a volume<br>&gt; consisting of 4 bricks on the same server and wrote some data to it.

I<br>&gt; then added a new brick from a another server. I ran the fix-layout

and<br>&gt; wrote some new files and could see them on the new brick. All good

so<br>&gt; far, so I started the data rebalance. After it had been running for

a<br>&gt; while I wanted to add another brick, which I obviously couldn't do

while<br>&gt; it was running so I stopped it. Even with it stopped It wouldn't let

me<br>&gt; add a brick so I tried restarting it, but it wouldn't let me do that<br>&gt; either. I presume you just reissue the start command as there's no<br>&gt; restart?<br>&gt;<br>&gt; [root@nas3 ~]# gluster vol rebalance test-volume status<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

Node &nbsp; Rebalanced-files &nbsp; size &nbsp; scanned &nbsp; failures &nbsp; skipped &nbsp; status &nbsp; run time in secs<br>
&gt; --------- &nbsp; ----------- &nbsp; ----------- &nbsp; ----------- &nbsp; ----------- &nbsp; ----------- &nbsp; ------------ &nbsp; --------------<br>
&gt; localhost &nbsp; 7 &nbsp; 611.7GB &nbsp; 1358 &nbsp; 0 &nbsp; 10 &nbsp; stopped &nbsp; 4929.00<br>
&gt; localhost &nbsp; 7 &nbsp; 611.7GB &nbsp; 1358 &nbsp; 0 &nbsp; 10 &nbsp; stopped &nbsp; 4929.00<br>
&gt; nas4-10g &nbsp; 0 &nbsp; 0Bytes &nbsp; 1506 &nbsp; 0 &nbsp; 0 &nbsp; completed

&nbsp; &nbsp; &nbsp; &nbsp; 8.00<br>&gt; volume rebalance: test-volume: success:<br>&gt; [root@nas3 ~]# gluster vol add-brick test-volume nas4-10g:/data14/gvol<br>&gt; volume add-brick: failed: Volume name test-volume rebalance is in<br>&gt; progress. Please retry after completion<br>&gt; [root@nas3 ~]# gluster vol rebalance test-volume start<br>&gt; volume rebalance: test-volume: failed: Rebalance on test-volume is

already<br>&gt; started<br>&gt;<br>&gt; In the end I used the force option to make it start but was that the<br>&gt; right thing to do?<br>&gt;<br>&gt; glusterfs 3.4.1 built on Oct 28 2013 11:01:59<br>&gt; Volume Name: test-volume<br>&gt; Type: Distribute<br>&gt; Volume ID: 56ee0173-aed1-4be6-a809-ee0544f9e066<br>&gt; Status: Started<br>&gt; Number of Bricks: 5<br>&gt; Transport-type: tcp<br>&gt; Bricks:<br>&gt; Brick1: nas3-10g:/data9/gvol<br>&gt; Brick2: nas3-10g:/data10/gvol<br>&gt; Brick3: nas3-10g:/data11/gvol<br>&gt; Brick4: nas3-10g:/data12/gvol<br>&gt; Brick5: nas4-10g:/data13/gvol<br>&gt;<br>&gt;<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;<br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/1944e9e8/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/1944e9e8/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 27<br>Date: Tue, 10 Dec 2013 11:02:52 +0530<br>From: Vijay Bellur &lt;vbellur@redhat.com&gt;<br>To: Alex Pearson &lt;alex@apics.co.uk&gt;<br>Cc: gluster-users Discussion List &lt;Gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] replace-brick failing -<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

transport.address-family not specified<br>Message-ID: &lt;52A6A784.6070404@redhat.com&gt;<br>Content-Type: text/plain; charset=ISO-8859-1; format=flowed<br><br>On 12/08/2013 05:44 PM, Alex Pearson wrote:<br>&gt; Hi All,<br>&gt; Just to assist anyone else having this issue, and so people can correct

me if I'm wrong...<br>&gt;<br>&gt; It would appear that replace-brick is 'horribly broken' and should

not be used in Gluster 3.4. &nbsp;Instead a combination of &quot;remove-brick

... count X ... start&quot; should be used to remove the resilience from

a volume and the brick, then &quot;add-brick ... count X&quot; to add the

new brick.<br>&gt;<br>&gt; This does beg the question of why the hell a completely broken command

was left in the 'stable' release of the software. &nbsp;This sort of thing

really hurts Glusters credibility.<br><br>A mention of replace-brick not being functional was made in the release

<br>note for 3.4.0:<br><br></font></tt><a href="https://github.com/gluster/glusterfs/blob/release-3.4/doc/release-notes/3.4.0.md"><tt><font size=2>https://github.com/gluster/glusterfs/blob/release-3.4/doc/release-notes/3.4.0.md</font></tt></a><tt><font size=2><br><br>&gt;<br>&gt; Ref: </font></tt><a href="http://www.gluster.org/pipermail/gluster-users/2013-August/036936.html"><tt><font size=2>http://www.gluster.org/pipermail/gluster-users/2013-August/036936.html</font></tt></a><tt><font size=2><br><br>This discussion happened after the release of GlusterFS 3.4. However, I

<br>do get the point you are trying to make here. We can have an explicit <br>warning in CLI when operations considered broken are attempted. There is

<br>a similar plan to add a warning for rdma volumes:<br><br></font></tt><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1017176"><tt><font size=2>https://bugzilla.redhat.com/show_bug.cgi?id=1017176</font></tt></a><tt><font size=2><br><br>There is a patch under review currently to remove the replace-brick <br>command from CLI:<br><br></font></tt><a href=http://review.gluster.org/6031><tt><font size=2>http://review.gluster.org/6031</font></tt></a><tt><font size=2><br><br>This is intended for master. If you can open a bug report indicating an

<br>appropriate warning message that you would like to see when <br>replace-brick is attempted, I would be happy to get such a fix in to <br>both 3.4 and 3.5.<br><br>Thanks,<br>Vijay<br><br>&gt;<br>&gt; Cheers<br>&gt;<br>&gt; Alex<br>&gt;<br>&gt; ----- Original Message -----<br>&gt; From: &quot;Alex Pearson&quot; &lt;alex@apics.co.uk&gt;<br>&gt; To: gluster-users@gluster.org<br>&gt; Sent: Friday, 6 December, 2013 5:25:43 PM<br>&gt; Subject: [Gluster-users] replace-brick failing - transport.address-family

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

not specified<br>&gt;<br>&gt; Hello,<br>&gt; I have what I think is a fairly basic Gluster setup, however when

I try to carry out a replace-brick operation it consistently fails...<br>&gt;<br>&gt; Here are the command line options:<br>&gt;<br>&gt; root@osh1:~# gluster volume info media<br>&gt;<br>&gt; Volume Name: media<br>&gt; Type: Replicate<br>&gt; Volume ID: 4c290928-ba1c-4a45-ac05-85365b4ea63a<br>&gt; Status: Started<br>&gt; Number of Bricks: 1 x 2 = 2<br>&gt; Transport-type: tcp<br>&gt; Bricks:<br>&gt; Brick1: osh1.apics.co.uk:/export/sdc/media<br>&gt; Brick2: osh2.apics.co.uk:/export/sdb/media<br>&gt;<br>&gt; root@osh1:~# gluster volume replace-brick media osh1.apics.co.uk:/export/sdc/media

osh1.apics.co.uk:/export/WCASJ2055681/media start<br>&gt; volume replace-brick: success: replace-brick started successfully<br>&gt; ID: 60bef96f-a5c7-4065-864e-3e0b2773d7bb<br>&gt; root@osh1:~# gluster volume replace-brick media osh1.apics.co.uk:/export/sdc/media

osh1.apics.co.uk:/export/WCASJ2055681/media status<br>&gt; volume replace-brick: failed: Commit failed on localhost. Please check

the log file for more details.<br>&gt;<br>&gt; root@osh1:~# tail /var/log/glusterfs/bricks/export-sdc-media.log<br>&gt; [2013-12-06 17:24:54.795754] E [name.c:147:client_fill_address_family]

0-media-replace-brick: transport.address-family not specified. Could not

guess default value from (remote-host:(null) or transport.unix.connect-path:(null))

options<br>&gt; [2013-12-06 17:24:57.796422] W [dict.c:1055:data_to_str] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(+0x528b)

[0x7fb826e3428b] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e)

[0x7fb826e3a25e] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(client_fill_address_family+0x200)

[0x7fb826e39f50]))) 0-dict: data is NULL<br>&gt; [2013-12-06 17:24:57.796494] W [dict.c:1055:data_to_str] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(+0x528b)

[0x7fb826e3428b] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e)

[0x7fb826e3a25e] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(client_fill_address_family+0x20b)

[0x7fb826e39f5b]))) 0-dict: data is NULL<br>&gt; [2013-12-06 17:24:57.796519] E [name.c:147:client_fill_address_family]

0-media-replace-brick: transport.address-family not specified. Could not

guess default value from (remote-host:(null) or transport.unix.connect-path:(null))

options<br>&gt; [2013-12-06 17:25:00.797153] W [dict.c:1055:data_to_str] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(+0x528b)

[0x7fb826e3428b] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e)

[0x7fb826e3a25e] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(client_fill_address_family+0x200)

[0x7fb826e39f50]))) 0-dict: data is NULL<br>&gt; [2013-12-06 17:25:00.797226] W [dict.c:1055:data_to_str] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(+0x528b)

[0x7fb826e3428b] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e)

[0x7fb826e3a25e] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(client_fill_address_family+0x20b)

[0x7fb826e39f5b]))) 0-dict: data is NULL<br>&gt; [2013-12-06 17:25:00.797251] E [name.c:147:client_fill_address_family]

0-media-replace-brick: transport.address-family not specified. Could not

guess default value from (remote-host:(null) or transport.unix.connect-path:(null))

options<br>&gt; [2013-12-06 17:25:03.797811] W [dict.c:1055:data_to_str] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(+0x528b)

[0x7fb826e3428b] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e)

[0x7fb826e3a25e] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(client_fill_address_family+0x200)

[0x7fb826e39f50]))) 0-dict: data is NULL<br>&gt; [2013-12-06 17:25:03.797883] W [dict.c:1055:data_to_str] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(+0x528b)

[0x7fb826e3428b] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e)

[0x7fb826e3a25e] (--&gt;/usr/lib/x86_64-linux-gnu/glusterfs/3.4.1/rpc-transport/socket.so(client_fill_address_family+0x20b)

[0x7fb826e39f5b]))) 0-dict: data is NULL<br>&gt; [2013-12-06 17:25:03.797909] E [name.c:147:client_fill_address_family]

0-media-replace-brick: transport.address-family not specified. Could not

guess default value from (remote-host:(null) or transport.unix.connect-path:(null))

options<br>&gt;<br>&gt;<br>&gt; I've tried placing the transport.address-family option in various

places, however it hasn't helped.<br>&gt;<br>&gt; Any help would be very much appreciated.<br>&gt;<br>&gt; Thanks in advance<br>&gt;<br>&gt; Alex<br>&gt;<br><br><br><br>------------------------------<br><br>Message: 28<br>Date: Tue, 10 Dec 2013 11:04:49 +0530<br>From: Vijay Bellur &lt;vbellur@redhat.com&gt;<br>To: Diep Pham Van &lt;imeo@favadi.com&gt;, &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &quot;gluster-users@gluster.org&quot;<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] [CentOS 6] Upgrade to the glusterfs<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

version in base or in glusterfs-epel<br>Message-ID: &lt;52A6A7F9.2090009@redhat.com&gt;<br>Content-Type: text/plain; charset=ISO-8859-1; format=flowed<br><br>On 12/10/2013 09:36 AM, Diep Pham Van wrote:<br>&gt; On Mon, 9 Dec 2013 19:53:20 +0900<br>&gt; Nguyen Viet Cuong &lt;mrcuongnv@gmail.com&gt; wrote:<br>&gt;<br>&gt;&gt; There is no glusterfs-server in the &quot;base&quot; repository,

just client.<br>&gt; Silly me.<br>&gt; After install and attempt to mount with base version of glusterfs-fuse,<br>&gt; I realize that I have to change 'backupvolfile-server' mount option

to<br>&gt; 'backup-volfile-servers'[1].<br><br>And a patch to provide backward compatibility for 'backupvolfile-server'

<br>is available now [1].<br><br>-Vijay<br><br>[1] </font></tt><a href=http://review.gluster.org/6464><tt><font size=2>http://review.gluster.org/6464</font></tt></a><tt><font size=2><br><br><br>&gt;<br>&gt; Links:<br>&gt; [1] </font></tt><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1023950"><tt><font size=2>https://bugzilla.redhat.com/show_bug.cgi?id=1023950</font></tt></a><tt><font size=2><br>&gt;<br><br><br><br>------------------------------<br><br>Message: 29<br>Date: Tue, 10 Dec 2013 13:39:38 +0800<br>From: Franco Broi &lt;franco.broi@iongeo.com&gt;<br>To: shishir gowda &lt;gowda.shishir@gmail.com&gt;<br>Cc: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Pausing rebalance<br>Message-ID: &lt;1386653978.1682.125.camel@tc1&gt;<br>Content-Type: text/plain; charset=&quot;utf-8&quot;<br><br>On Tue, 2013-12-10 at 10:56 +0530, shishir gowda wrote: <br>&gt; Hi Franco,<br>&gt; <br>&gt; <br>&gt; If a file is under migration, and a rebalance stop is encountered,<br>&gt; then rebalance process exits only after the completion of the<br>&gt; migration.<br>&gt; <br>&gt; That might be one of the reasons why you saw rebalance in progress<br>&gt; message while trying to add the brick<br><br>The status said it was stopped. I didn't do a top on the machine but are<br>you saying that it was still rebalancing despite saying it had stopped?<br><br>&gt; <br>&gt; Could you please share the average file size in your setup?<br>&gt; <br><br>Bit hard to say, I just copied some data from our main processing<br>system. The sizes range from very small to 10's of gigabytes.<br><br>&gt; <br>&gt; You could always check the rebalance status command to ensure<br>&gt; rebalance has indeed completed/stopped before proceeding with the<br>&gt; add-brick. Using add-brick force while rebalance is on-going should<br>&gt; not be used in normal scenarios. I do see that in your case, they

show<br>&gt; stopped/completed. Glusterd logs would help in triaging the issue.<br><br>See attached.<br><br>&gt; <br>&gt; <br>&gt; Rebalance re-writes layouts, and migrates data. While this is<br>&gt; happening, if a add-brick is done, then the cluster might go into

a<br>&gt; imbalanced stated. Hence, the check if rebalance is in progress while<br>&gt; doing add-brick<br><br>I can see that but as far as I could tell, the rebalance had stopped<br>according to the status.<br><br>Just to be clear, what command restarts the rebalancing?<br><br>&gt; <br>&gt; <br>&gt; With regards,<br>&gt; Shishir<br>&gt; <br>&gt; <br>&gt; <br>&gt; On 10 December 2013 10:39, Franco Broi &lt;franco.broi@iongeo.com&gt;

wrote:<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; <br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Before attempting a rebalance on my existing

distributed<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Gluster volume<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; I thought I'd do some testing with my

new storage. I created a<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; volume<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; consisting of 4 bricks on the same server

and wrote some data<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; to it. I<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; then added a new brick from a another

server. I ran the<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; fix-layout and<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; wrote some new files and could see them

on the new brick. All<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; good so<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; far, so I started the data rebalance.

After it had been<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; running for a<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; while I wanted to add another brick, which

I obviously<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; couldn't do while<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; it was running so I stopped it. Even with

it stopped It<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; wouldn't let me<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; add a brick so I tried restarting it,

but it wouldn't let me<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; do that<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; either. I presume you just reissue the

start command as<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; there's no<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; restart?<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; <br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; [root@nas3 ~]# gluster vol rebalance test-volume

status<br>
&gt; &nbsp; &nbsp; &nbsp; &nbsp; Node &nbsp; Rebalanced-files &nbsp; size &nbsp; scanned &nbsp; failures &nbsp; skipped &nbsp; status &nbsp; run time in secs<br>
&gt; &nbsp; &nbsp; &nbsp; &nbsp; --------- &nbsp; ----------- &nbsp; ----------- &nbsp; ----------- &nbsp; ----------- &nbsp; ----------- &nbsp; ------------ &nbsp; --------------<br>
&gt; &nbsp; &nbsp; &nbsp; &nbsp; localhost &nbsp; 7 &nbsp; 611.7GB &nbsp; 1358 &nbsp; 0 &nbsp; 10 &nbsp; stopped &nbsp; 4929.00<br>
&gt; &nbsp; &nbsp; &nbsp; &nbsp; localhost &nbsp; 7 &nbsp; 611.7GB &nbsp; 1358 &nbsp; 0 &nbsp; 10 &nbsp; stopped &nbsp; 4929.00<br>
&gt; &nbsp; &nbsp; &nbsp; &nbsp; nas4-10g &nbsp; 0 &nbsp; 0Bytes &nbsp; 1506 &nbsp; 0 &nbsp; 0 &nbsp; completed

8.00<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; volume rebalance: test-volume: success:<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; [root@nas3 ~]# gluster vol add-brick test-volume<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; nas4-10g:/data14/gvol<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; volume add-brick: failed: Volume name

test-volume rebalance is<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; in progress. Please retry after completion<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; [root@nas3 ~]# gluster vol rebalance test-volume

start<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; volume rebalance: test-volume: failed:

Rebalance on<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; test-volume is already started<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; <br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; In the end I used the force option to

make it start but was<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; that the<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; right thing to do?<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; <br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; glusterfs 3.4.1 built on Oct 28 2013 11:01:59<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Volume Name: test-volume<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Type: Distribute<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Volume ID: 56ee0173-aed1-4be6-a809-ee0544f9e066<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Status: Started<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Number of Bricks: 5<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Transport-type: tcp<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Bricks:<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick1: nas3-10g:/data9/gvol<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick2: nas3-10g:/data10/gvol<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick3: nas3-10g:/data11/gvol<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick4: nas3-10g:/data12/gvol<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick5: nas4-10g:/data13/gvol<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; <br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; <br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; _______________________________________________<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Gluster-users mailing list<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Gluster-users@gluster.org<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt; <br>&gt; <br><br>-------------- next part --------------<br>A non-text attachment was scrubbed...<br>Name: etc-glusterfs-glusterd.vol.log.gz<br>Type: application/gzip<br>Size: 7209 bytes<br>Desc: not available<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/adc5d486/attachment-0001.bin"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/adc5d486/attachment-0001.bin</font></tt></a><tt><font 
size=2>&gt;<br><br>------------------------------<br><br>Message: 30<br>Date: Tue, 10 Dec 2013 11:09:47 +0530<br>From: Vijay Bellur &lt;vbellur@redhat.com&gt;<br>To: Nguyen Viet Cuong &lt;mrcuongnv@gmail.com&gt;<br>Cc: &quot;Gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] replace-brick failing -<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

transport.address-family not specified<br>Message-ID: &lt;52A6A923.4030208@redhat.com&gt;<br>Content-Type: text/plain; charset=ISO-8859-1; format=flowed<br><br>On 12/08/2013 07:06 PM, Nguyen Viet Cuong wrote:<br>&gt; Thanks for sharing.<br>&gt;<br>&gt; Btw, I do believe that GlusterFS 3.2.x is much more stable than 3.4.x

in<br>&gt; production.<br>&gt;<br><br>This is quite contrary to what we have seen in the community. From a <br>development perspective too, we feel much better about 3.4.1. Are there

<br>specific instances that worked well with 3.2.x which does not work fine

<br>for you in 3.4.x?<br><br>Cheers,<br>Vijay<br><br><br><br><br><br></font></tt><br><tt><font size=2><br><br>------------------------------<br><br>Message: 31<br>Date: Tue, 10 Dec 2013 11:30:21 +0530<br>From: Kaushal M &lt;kshlmster@gmail.com&gt;<br>To: Franco Broi &lt;franco.broi@iongeo.com&gt;<br>Cc: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Pausing rebalance<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CAOujamU0J4Tam9ojFAmCoPqSzd5Tm1FeyfMYEBv2znMX9yN=4A@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=ISO-8859-1<br><br>On Tue, Dec 10, 2013 at 11:09 AM, Franco Broi &lt;franco.broi@iongeo.com&gt;

wrote:<br>&gt; On Tue, 2013-12-10 at 10:56 +0530, shishir gowda wrote:<br>&gt;&gt; Hi Franco,<br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt; If a file is under migration, and a rebalance stop is encountered,<br>&gt;&gt; then rebalance process exits only after the completion of the<br>&gt;&gt; migration.<br>&gt;&gt;<br>&gt;&gt; That might be one of the reasons why you saw rebalance in progress<br>&gt;&gt; message while trying to add the brick<br>&gt;<br>&gt; The status said it was stopped. I didn't do a top on the machine but

are<br>&gt; you saying that it was still rebalancing despite saying it had stopped?<br>&gt;<br><br>The 'stopped' status is a little bit misleading. The rebalance process<br>could have been migrating a large file when the stop command was<br>issued, so the process would continue migrating that file and quit<br>once it finished. In this time period, though the status says<br>'stopped' the rebalance process is actually running, which prevents<br>other operations from happening. Ideally, we would have a 'stopping'<br>status which would convey the correct meaning. But for now we can only<br>verify that a rebalance process has actually stopped by monitoring the<br>actual rebalance process. The rebalance process is a 'glusterfs'<br>process with some arguments containing rebalance.<br><br>&gt;&gt;<br>&gt;&gt; Could you please share the average file size in your setup?<br>&gt;&gt;<br>&gt;<br>&gt; Bit hard to say, I just copied some data from our main processing<br>&gt; system. The sizes range from very small to 10's of gigabytes.<br>&gt;<br>&gt;&gt;<br>&gt;&gt; You could always check the rebalance status command to ensure<br>&gt;&gt; rebalance has indeed completed/stopped before proceeding with

the<br>&gt;&gt; add-brick. Using add-brick force while rebalance is on-going should<br>&gt;&gt; not be used in normal scenarios. I do see that in your case, they

show<br>&gt;&gt; stopped/completed. Glusterd logs would help in triaging the issue.<br>&gt;<br>&gt; See attached.<br>&gt;<br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt; Rebalance re-writes layouts, and migrates data. While this is<br>&gt;&gt; happening, if a add-brick is done, then the cluster might go into

a<br>&gt;&gt; imbalanced stated. Hence, the check if rebalance is in progress

while<br>&gt;&gt; doing add-brick<br>&gt;<br>&gt; I can see that but as far as I could tell, the rebalance had stopped<br>&gt; according to the status.<br>&gt;<br>&gt; Just to be clear, what command restarts the rebalancing?<br>&gt;<br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt; With regards,<br>&gt;&gt; Shishir<br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt; On 10 December 2013 10:39, Franco Broi &lt;franco.broi@iongeo.com&gt;

wrote:<br>&gt;&gt;<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Before attempting a rebalance on my

existing distributed<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Gluster volume<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; I thought I'd do some testing with

my new storage. I created a<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; volume<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; consisting of 4 bricks on the same

server and wrote some data<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; to it. I<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; then added a new brick from a another

server. I ran the<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; fix-layout and<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; wrote some new files and could see

them on the new brick. All<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; good so<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; far, so I started the data rebalance.

After it had been<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; running for a<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; while I wanted to add another brick,

which I obviously<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; couldn't do while<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; it was running so I stopped it. Even

with it stopped It<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; wouldn't let me<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; add a brick so I tried restarting

it, but it wouldn't let me<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; do that<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; either. I presume you just reissue

the start command as<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; there's no<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; restart?<br>&gt;&gt;<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; [root@nas3 ~]# gluster vol rebalance

test-volume status<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Node &nbsp; Rebalanced-files &nbsp; size &nbsp; scanned &nbsp; failures &nbsp; skipped &nbsp; status &nbsp; run time in secs<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; --------- &nbsp; ----------- &nbsp; ----------- &nbsp; ----------- &nbsp; ----------- &nbsp; ----------- &nbsp; ------------ &nbsp; --------------<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; localhost &nbsp; 7 &nbsp; 611.7GB &nbsp; 1358 &nbsp; 0 &nbsp; 10 &nbsp; stopped &nbsp; 4929.00<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; localhost &nbsp; 7 &nbsp; 611.7GB &nbsp; 1358 &nbsp; 0 &nbsp; 10 &nbsp; stopped &nbsp; 4929.00<br>
&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; nas4-10g &nbsp; 0 &nbsp; 0Bytes &nbsp; 1506 &nbsp; 0 &nbsp; 0 &nbsp; completed

&nbsp; 8.00<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; volume rebalance: test-volume: success:<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; [root@nas3 ~]# gluster vol add-brick

test-volume<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; nas4-10g:/data14/gvol<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; volume add-brick: failed: Volume name

test-volume rebalance is<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; in progress. Please retry after completion<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; [root@nas3 ~]# gluster vol rebalance

test-volume start<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; volume rebalance: test-volume: failed:

Rebalance on<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; test-volume is already started<br>&gt;&gt;<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; In the end I used the force option

to make it start but was<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; that the<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; right thing to do?<br>&gt;&gt;<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; glusterfs 3.4.1 built on Oct 28 2013

11:01:59<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Volume Name: test-volume<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Type: Distribute<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Volume ID: 56ee0173-aed1-4be6-a809-ee0544f9e066<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Status: Started<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Number of Bricks: 5<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Transport-type: tcp<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Bricks:<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick1: nas3-10g:/data9/gvol<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick2: nas3-10g:/data10/gvol<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick3: nas3-10g:/data11/gvol<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick4: nas3-10g:/data12/gvol<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick5: nas4-10g:/data13/gvol<br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; _______________________________________________<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Gluster-users mailing list<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Gluster-users@gluster.org<br>&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;&gt;<br>&gt;&gt;<br>&gt;<br>&gt;<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br><br>------------------------------<br><br>Message: 32<br>Date: Tue, 10 Dec 2013 14:32:46 +0800<br>From: Franco Broi &lt;franco.broi@iongeo.com&gt;<br>To: Kaushal M &lt;kshlmster@gmail.com&gt;<br>Cc: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Pausing rebalance<br>Message-ID: 
&lt;1386657166.1682.130.camel@tc1&gt;<br>Content-Type: text/plain; charset=&quot;UTF-8&quot;<br><br><br>Thanks for clearing that up. I had to wait about 30 minutes for all<br>rebalancing activity to cease, then I was able to add a new brick.<br><br>What does it use to migrate the files? The copy rate was pretty slow<br>considering both bricks were on the same server, I only saw about<br>200MB/Sec. Each brick is a 16 disk ZFS raidz2, copying with dd I can get<br>well over 500MB/Sec.<br><br><br>On Tue, 2013-12-10 at 11:30 +0530, Kaushal M wrote: <br>&gt; On Tue, Dec 10, 2013 at 11:09 AM, Franco Broi &lt;franco.broi@iongeo.com&gt;

wrote:<br>&gt; &gt; On Tue, 2013-12-10 at 10:56 +0530, shishir gowda wrote:<br>&gt; &gt;&gt; Hi Franco,<br>&gt; &gt;&gt;<br>&gt; &gt;&gt;<br>&gt; &gt;&gt; If a file is under migration, and a rebalance stop is encountered,<br>&gt; &gt;&gt; then rebalance process exits only after the completion of

the<br>&gt; &gt;&gt; migration.<br>&gt; &gt;&gt;<br>&gt; &gt;&gt; That might be one of the reasons why you saw rebalance in

progress<br>&gt; &gt;&gt; message while trying to add the brick<br>&gt; &gt;<br>&gt; &gt; The status said it was stopped. I didn't do a top on the machine

but are<br>&gt; &gt; you saying that it was still rebalancing despite saying it had

stopped?<br>&gt; &gt;<br>&gt; <br>&gt; The 'stopped' status is a little bit misleading. The rebalance process<br>&gt; could have been migrating a large file when the stop command was<br>&gt; issued, so the process would continue migrating that file and quit<br>&gt; once it finished. In this time period, though the status says<br>&gt; 'stopped' the rebalance process is actually running, which prevents<br>&gt; other operations from happening. Ideally, we would have a 'stopping'<br>&gt; status which would convey the correct meaning. But for now we can

only<br>&gt; verify that a rebalance process has actually stopped by monitoring

the<br>&gt; actual rebalance process. The rebalance process is a 'glusterfs'<br>&gt; process with some arguments containing rebalance.<br>&gt; <br>&gt; &gt;&gt;<br>&gt; &gt;&gt; Could you please share the average file size in your setup?<br>&gt; &gt;&gt;<br>&gt; &gt;<br>&gt; &gt; Bit hard to say, I just copied some data from our main processing<br>&gt; &gt; system. The sizes range from very small to 10's of gigabytes.<br>&gt; &gt;<br>&gt; &gt;&gt;<br>&gt; &gt;&gt; You could always check the rebalance status command to ensure<br>&gt; &gt;&gt; rebalance has indeed completed/stopped before proceeding

with the<br>&gt; &gt;&gt; add-brick. Using add-brick force while rebalance is on-going

should<br>&gt; &gt;&gt; not be used in normal scenarios. I do see that in your case,

they show<br>&gt; &gt;&gt; stopped/completed. Glusterd logs would help in triaging the

issue.<br>&gt; &gt;<br>&gt; &gt; See attached.<br>&gt; &gt;<br>&gt; &gt;&gt;<br>&gt; &gt;&gt;<br>&gt; &gt;&gt; Rebalance re-writes layouts, and migrates data. While this

is<br>&gt; &gt;&gt; happening, if an add-brick is done, then the cluster might

go into an<br>&gt; &gt;&gt; imbalanced state. Hence the check for whether a rebalance is in progress

while<br>&gt; &gt;&gt; doing add-brick<br>&gt; &gt;<br>&gt; &gt; I can see that but as far as I could tell, the rebalance had

stopped<br>&gt; &gt; according to the status.<br>&gt; &gt;<br>&gt; &gt; Just to be clear, what command restarts the rebalancing?<br>&gt; &gt;<br>&gt; &gt;&gt;<br>&gt; &gt;&gt;<br>&gt; &gt;&gt; With regards,<br>&gt; &gt;&gt; Shishir<br>&gt; &gt;&gt;<br>&gt; &gt;&gt;<br>&gt; &gt;&gt;<br>&gt; &gt;&gt; On 10 December 2013 10:39, Franco Broi &lt;franco.broi@iongeo.com&gt;

wrote:<br>&gt; &gt;&gt;<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Before attempting a rebalance

on my existing distributed<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Gluster volume<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; I thought I'd do some testing

with my new storage. I created a<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; volume<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; consisting of 4 bricks on the

same server and wrote some data<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; to it. I<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; then added a new brick from

another server. I ran the<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; fix-layout and<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; wrote some new files and could

see them on the new brick. All<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; good so<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; far, so I started the data rebalance.

After it had been<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; running for a<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; while I wanted to add another

brick, which I obviously<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; couldn't do while<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; it was running so I stopped it.

Even with it stopped it<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; wouldn't let me<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; add a brick so I tried restarting

it, but it wouldn't let me<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; do that<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; either. I presume you just reissue

the start command as<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; there's no<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; restart?<br>&gt; &gt;&gt;<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; [root@nas3 ~]# gluster vol rebalance

test-volume status<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Node &nbsp; &nbsp; Rebalanced-files &nbsp; &nbsp; size &nbsp; &nbsp; scanned &nbsp; &nbsp; failures &nbsp; &nbsp; skipped &nbsp; &nbsp; status &nbsp; &nbsp; run time in secs<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; --------- &nbsp; &nbsp; ----------- &nbsp; &nbsp; ----------- &nbsp; &nbsp; ----------- &nbsp; &nbsp; ----------- &nbsp; &nbsp; ----------- &nbsp; &nbsp; ------------ &nbsp; &nbsp; --------------<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; localhost &nbsp; &nbsp; 7 &nbsp; &nbsp; 611.7GB &nbsp; &nbsp; 1358 &nbsp; &nbsp; 0 &nbsp; &nbsp; 10 &nbsp; &nbsp; stopped &nbsp; &nbsp; 4929.00<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; localhost &nbsp; &nbsp; 7 &nbsp; &nbsp; 611.7GB &nbsp; &nbsp; 1358 &nbsp; &nbsp; 0 &nbsp; &nbsp; 10 &nbsp; &nbsp; stopped &nbsp; &nbsp; 4929.00<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; nas4-10g &nbsp; &nbsp; 0 &nbsp; &nbsp; 0Bytes &nbsp; &nbsp; 1506 &nbsp; &nbsp; 0 &nbsp; &nbsp; 0 &nbsp; &nbsp; completed &nbsp; &nbsp; 8.00<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; volume rebalance: test-volume:

success:<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; [root@nas3 ~]# gluster vol add-brick

test-volume<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; nas4-10g:/data14/gvol<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; volume add-brick: failed: Volume

name test-volume rebalance is<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; in progress. Please retry after

completion<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; [root@nas3 ~]# gluster vol rebalance

test-volume start<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; volume rebalance: test-volume:

failed: Rebalance on<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; test-volume is already started<br>&gt; &gt;&gt;<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; In the end I used the force option

to make it start but was<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; that the<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; right thing to do?<br>&gt; &gt;&gt;<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; glusterfs 3.4.1 built on Oct

28 2013 11:01:59<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Volume Name: test-volume<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Type: Distribute<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Volume ID: 56ee0173-aed1-4be6-a809-ee0544f9e066<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Status: Started<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Number of Bricks: 5<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Transport-type: tcp<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Bricks:<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick1: nas3-10g:/data9/gvol<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick2: nas3-10g:/data10/gvol<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick3: nas3-10g:/data11/gvol<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick4: nas3-10g:/data12/gvol<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Brick5: nas4-10g:/data13/gvol<br>&gt; &gt;&gt;<br>&gt; &gt;&gt;<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; _______________________________________________<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Gluster-users mailing list<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; Gluster-users@gluster.org<br>&gt; &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt; &gt;&gt;<br>&gt; &gt;&gt;<br>&gt; &gt;<br>&gt; &gt;<br>&gt; &gt; _______________________________________________<br>&gt; &gt; Gluster-users mailing list<br>&gt; &gt; Gluster-users@gluster.org<br>&gt; &gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br><br><br><br>------------------------------<br><br>Message: 33<br>Date: Tue, 10 Dec 2013 07:42:57 +0000<br>From: Bobby Jacob &lt;bobby.jacob@alshaya.com&gt;<br>To: Joe Julian &lt;joe@julianfamily.org&gt;<br>Cc: 
&quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Self Heal Issue GlusterFS 3.3.1<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;AC3305F9C186F849B835A3E6D3C9BEFEB5A841@KWTPRMBX001.mha.local&gt;<br>Content-Type: text/plain; charset=&quot;utf-8&quot;<br><br>Hi,<br><br>Thanks Joe, the split-brain files have been removed as you recommended.

How can we deal with this situation as there is no document which solves

such issues. ?<br><br>[root@KWTOCUATGS001 83]# gluster volume heal glustervol info<br>Gathering Heal info on volume glustervol has been successful<br><br>Brick KWTOCUATGS001:/mnt/cloudbrick<br>Number of entries: 14<br>/Tommy Kolega<br>&lt;gfid:10429dd5-180c-432e-aa4a-8b1624b86f4b&gt;<br>&lt;gfid:7883309e-8764-4cf6-82a6-d8d81cb60dd7&gt;<br>&lt;gfid:3e3d77d6-2818-4766-ae3b-4f582118321b&gt;<br>&lt;gfid:8bd03482-025c-4c09-8704-60be9ddfdfd8&gt;<br>&lt;gfid:2685e11a-4eb9-4a92-883e-faa50edfa172&gt;<br>&lt;gfid:24d83cbd-e621-4330-b0c1-ae1f0fd2580d&gt;<br>&lt;gfid:197e50fa-bfc0-4651-acaa-1f3d2d73936f&gt;<br>&lt;gfid:3e094ee9-c9cf-4010-82f4-6d18c1ab9ca0&gt;<br>&lt;gfid:77783245-4e03-4baf-8cb4-928a57b266cb&gt;<br>&lt;gfid:70340eaa-7967-41d0-855f-36add745f16f&gt;<br>&lt;gfid:c590e3fb-a376-4ac9-86a6-14a80814e06f&gt;<br>&lt;gfid:b1651457-175a-43ec-b476-d91ae8b52b0b&gt;<br>/Tommy Kolega/lucene_index<br><br>Brick KWTOCUATGS002:/mnt/cloudbrick<br>Number of entries: 15<br>&lt;gfid:7883309e-8764-4cf6-82a6-d8d81cb60dd7&gt;<br>&lt;gfid:0454d0d2-d432-4ac8-8476-02a8522e4a6a&gt;<br>&lt;gfid:a7e88fd1-6e32-40ab-90f6-ea452242a7c6&gt;<br>&lt;gfid:00389876-700f-4351-b00e-1c57496eed89&gt;<br>&lt;gfid:0cd48d89-1dd2-47f6-9311-58224b19446e&gt;<br>&lt;gfid:081c6657-301a-42a4-9f95-6eeba6c67413&gt;<br>&lt;gfid:565f1358-449c-45e2-8535-93b5632c0d1e&gt;<br>&lt;gfid:9c7010ac-5c11-4561-8b86-5c4d6561f34e&gt;<br>&lt;gfid:25fd406f-63e0-4037-bb01-da282cbe4d76&gt;<br>&lt;gfid:a109c429-5885-499e-8711-09fdccd396f2&gt;<br>&lt;gfid:5a8fd3bf-9215-444c-b974-5c280f5699a6&gt;<br>/Tommy Kolega<br>/Tommy Kolega/lucene_index<br>&lt;gfid:c49e9d76-e5d4-47dc-9cf1-3f858f6d07ea&gt;<br>&lt;gfid:c590e3fb-a376-4ac9-86a6-14a80814e06f&gt;<br><br>Thanks &amp; Regards,<br>Bobby Jacob<br><br>-----Original Message-----<br>From: Joe Julian [</font></tt><a href=mailto:joe@julianfamily.org><tt><font size=2>mailto:joe@julianfamily.org</font></tt></a><tt><font size=2>]

<br>Sent: Tuesday, December 10, 2013 7:59 AM<br>To: Bobby Jacob<br>Cc: gluster-users@gluster.org<br>Subject: Re: [Gluster-users] Self Heal Issue GlusterFS 3.3.1<br><br>On Tue, 2013-12-03 at 05:47 +0000, Bobby Jacob wrote:<br>&gt; Hi,<br>&gt; <br>&gt; I'm running glusterFS 3.3.1 on Centos 6.4. <br>&gt; <br>&gt; * Gluster volume status<br>&gt; <br>&gt; Status of volume: glustervol<br>&gt; Gluster process &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Port &nbsp; &nbsp; Online &nbsp; &nbsp; Pid<br>&gt; ------------------------------------------------------------------------------<br>&gt; Brick KWTOCUATGS001:/mnt/cloudbrick &nbsp; &nbsp; 24009 &nbsp; &nbsp; Y &nbsp; &nbsp; 20031<br>&gt; Brick KWTOCUATGS002:/mnt/cloudbrick &nbsp; &nbsp; 24009 &nbsp; &nbsp; Y &nbsp; &nbsp; 1260<br>&gt; NFS Server on localhost &nbsp; &nbsp; 38467 &nbsp; &nbsp; Y &nbsp; &nbsp; 43320<br>&gt; Self-heal Daemon on localhost &nbsp; &nbsp; N/A &nbsp; &nbsp; Y &nbsp; &nbsp; 43326<br>&gt; NFS Server on KWTOCUATGS002 &nbsp; &nbsp; 38467 &nbsp; &nbsp; Y &nbsp; &nbsp; 5842<br>&gt; Self-heal Daemon on KWTOCUATGS002 &nbsp; &nbsp; N/A &nbsp; &nbsp; Y &nbsp; &nbsp; 5848<br>&gt; <br>&gt; The self heal stops working and the application writes only to 1 brick

and <br>&gt; it doesn't replicate. When I check /var/log/glusterfs/glustershd.log

I <br>&gt; see the following.:<br>&gt; <br>&gt; &nbsp;<br>&gt; <br>&gt; [2013-12-03 05:42:32.033563] W [socket.c:410:__socket_keepalive]<br>&gt; 0-socket: failed to set keep idle on socket 8<br>&gt; <br>&gt; [2013-12-03 05:42:32.033646] W<br>&gt; [socket.c:1876:socket_server_event_handler] 0-socket.glusterfsd:<br>&gt; Failed to set keep-alive: Operation not supported<br>&gt; <br>&gt; [2013-12-03 05:42:32.790473] I<br>&gt; [client-handshake.c:1614:select_server_supported_programs]<br>&gt; 0-glustervol-client-1: Using Program GlusterFS 3.3.2, Num (1298437),

<br>&gt; Version (330)<br>&gt; <br>&gt; [2013-12-03 05:42:32.790840] I<br>&gt; [client-handshake.c:1411:client_setvolume_cbk] 0-glustervol-client-1:<br>&gt; Connected to 172.16.95.153:24009, attached to remote volume <br>&gt; '/mnt/cloudbrick'.<br>&gt; <br>&gt; [2013-12-03 05:42:32.790884] I<br>&gt; [client-handshake.c:1423:client_setvolume_cbk] 0-glustervol-client-1:<br>&gt; Server and Client lk-version numbers are not same, reopening the fds<br>&gt; <br>&gt; [2013-12-03 05:42:32.791003] I [afr-common.c:3685:afr_notify]<br>&gt; 0-glustervol-replicate-0: Subvolume 'glustervol-client-1' came back

<br>&gt; up; going online.<br>&gt; <br>&gt; [2013-12-03 05:42:32.791161] I<br>&gt; [client-handshake.c:453:client_set_lk_version_cbk]<br>&gt; 0-glustervol-client-1: Server lk version = 1<br>&gt; <br>&gt; [2013-12-03 05:42:32.795103] E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:a7e88fd1-6e32-40ab-90f6-ea452242a7c6&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.798064] E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:081c6657-301a-42a4-9f95-6eeba6c67413&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.799278] E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:565f1358-449c-45e2-8535-93b5632c0d1e&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.800636] E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:9c7010ac-5c11-4561-8b86-5c4d6561f34e&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.802223] E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:25fd406f-63e0-4037-bb01-da282cbe4d76&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.803339] E<br>&gt; [afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:a109c429-5885-499e-8711-09fdccd396f2&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.804308] E<br>&gt; 
[afr-self-heal-data.c:1321:afr_sh_data_open_cbk]<br>&gt; 0-glustervol-replicate-0: open of<br>&gt; &lt;gfid:5a8fd3bf-9215-444c-b974-5c280f5699a6&gt; failed on child<br>&gt; glustervol-client-0 (Transport endpoint is not connected)<br>&gt; <br>&gt; [2013-12-03 05:42:32.804877] I<br>&gt; [client-handshake.c:1614:select_server_supported_programs]<br>&gt; 0-glustervol-client-0: Using Program GlusterFS 3.3.2, Num (1298437),

<br>&gt; Version (330)<br>&gt; <br>&gt; [2013-12-03 05:42:32.807517] I<br>&gt; [client-handshake.c:1411:client_setvolume_cbk] 0-glustervol-client-0:<br>&gt; Connected to 172.16.107.154:24009, attached to remote volume <br>&gt; '/mnt/cloudbrick'.<br>&gt; <br>&gt; [2013-12-03 05:42:32.807562] I<br>&gt; [client-handshake.c:1423:client_setvolume_cbk] 0-glustervol-client-0:<br>&gt; Server and Client lk-version numbers are not same, reopening the fds<br>&gt; <br>&gt; [2013-12-03 05:42:32.810357] I<br>&gt; [client-handshake.c:453:client_set_lk_version_cbk]<br>&gt; 0-glustervol-client-0: Server lk version = 1<br>&gt; <br>&gt; [2013-12-03 05:42:32.827437] E<br>&gt; [afr-self-heal-data.c:764:afr_sh_data_fxattrop_fstat_done]<br>&gt; 0-glustervol-replicate-0: Unable to self-heal contents of <br>&gt; '&lt;gfid:1262d40d-46a3-4e57-b07b-0fcc972c8403&gt;' (possible split-brain).<br>&gt; Please delete the file from all but the preferred subvolume.<br><br>That file is at<br>$brick/.glusterfs/12/62/1262d40d-46a3-4e57-b07b-0fcc972c8403<br><br>Try picking one to remove like it says.<br>&gt; <br>&gt; [2013-12-03 05:42:39.205157] E<br>&gt; [afr-self-heal-metadata.c:472:afr_sh_metadata_fix]<br>&gt; 0-glustervol-replicate-0: Unable to self-heal permissions/ownership

of <br>&gt; '&lt;gfid:c590e3fb-a376-4ac9-86a6-14a80814e06f&gt;' (possible split-brain).<br>&gt; Please fix the file on all backend volumes<br>&gt; <br>&gt; [2013-12-03 05:42:39.215793] E<br>&gt; [afr-self-heal-metadata.c:472:afr_sh_metadata_fix]<br>&gt; 0-glustervol-replicate-0: Unable to self-heal permissions/ownership

of <br>&gt; '&lt;gfid:c0660768-289f-48ac-b8e5-e5b5a3a4b965&gt;' (possible split-brain).<br>&gt; Please fix the file on all backend volumes<br>&gt; <br>&gt; &nbsp;<br>If that doesn't allow it to heal, you may need to find which filename that's

hardlinked to. Run ls -li on the gfid file at the path I demonstrated earlier.<br>With that inode number in hand, run:<br>find $brick -inum $inode_number<br>Once you

know which filenames it's linked with, remove all linked copies from all

but one replica. Then the self-heal can continue successfully.<br><br><br><br>------------------------------<br><br>Message: 34<br>Date: Tue, 10 Dec 2013 09:30:22 +0100<br>From: Johan Huysmans &lt;johan.huysmans@inuits.be&gt;<br>To: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: [Gluster-users] Structure needs cleaning on some files<br>Message-ID: &lt;52A6D11E.4030406@inuits.be&gt;<br>Content-Type: text/plain; charset=ISO-8859-1; format=flowed<br><br>Hi All,<br><br>When reading some files we get this error:<br>md5sum: /path/to/file.xml: Structure needs cleaning<br><br>in /var/log/glusterfs/mnt-sharedfs.log we see these errors:<br>[2013-12-10 08:07:32.256910] W <br>[client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-0: remote

<br>operation failed: No such file or directory<br>[2013-12-10 08:07:32.257436] W <br>[client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-1: remote

<br>operation failed: No such file or directory<br>[2013-12-10 08:07:32.259356] W [fuse-bridge.c:705:fuse_attr_cbk] <br>0-glusterfs-fuse: 8230: STAT() /path/to/file.xml =&gt; -1 (Structure needs

<br>cleaning)<br><br>We are using gluster 3.4.1-3 on CentOS6.<br>Our servers are 64-bit, our clients 32-bit (we are already using <br>--enable-ino32 on the mountpoint)<br><br>This is my gluster configuration:<br>Volume Name: testvolume<br>Type: Replicate<br>Volume ID: ca9c2f87-5d5b-4439-ac32-b7c138916df7<br>Status: Started<br>Number of Bricks: 1 x 2 = 2<br>Transport-type: tcp<br>Bricks:<br>Brick1: SRV-1:/gluster/brick1<br>Brick2: SRV-2:/gluster/brick2<br>Options Reconfigured:<br>performance.force-readdirp: on<br>performance.stat-prefetch: off<br>network.ping-timeout: 5<br><br>And this is how the applications work:<br>We have 2 client nodes which both have a fuse.glusterfs mountpoint.<br>On 1 client node we have an application which writes files.<br>On the other client node we have an application which reads these files.<br>On the node where the files are written we don't see any problem, and <br>can read that file without problems.<br>On the other node we have problems (error messages above) reading that

file.<br>The problem occurs when we perform a md5sum on the exact file; when we <br>perform a md5sum on all files in that directory there is no problem.<br><br></font></tt><br><tt><font size=2>How can we solve this problem? It is annoying.<br>The problem occurs after some time (can be days), an umount and mount of

<br>the mountpoint solves it for some days.<br>Once it occurs (and we don't remount) it occurs every time.<br><br><br>I hope someone can help me with this problems.<br><br>Thanks,<br>Johan Huysmans<br><br><br>------------------------------<br><br>Message: 35<br>Date: Tue, 10 Dec 2013 08:56:56 +0000<br>From: &quot;Bernhard Glomm&quot; &lt;bernhard.glomm@ecologic.eu&gt;<br>To: vbellur@redhat.com, mrcuongnv@gmail.com<br>Cc: gluster-users@gluster.org<br>Subject: Re: [Gluster-users] replace-brick failing -<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

transport.address-family not specified<br>Message-ID: &lt;03a55549428f5909f0b3db1dee93d8c55e3ba3c3@ecologic.eu&gt;<br>Content-Type: text/plain; charset=&quot;utf-8&quot;<br><br>Am 10.12.2013 06:39:47, schrieb Vijay Bellur:<br>&gt; On 12/08/2013 07:06 PM, Nguyen Viet Cuong wrote:<br>&gt; &gt; Thanks for sharing.<br>&gt; &gt; <br>&gt; &gt; Btw, I do believe that GlusterFS 3.2.x is much more stable than

3.4.x in<br>&gt; &gt; production.<br>&gt; &gt; <br><br>&gt; This is quite contrary to what we have seen in the community. From

a <br>&gt; development perspective too, we feel much better about 3.4.1. Are

there <br>&gt; specific instances that worked well with 3.2.x which does not work

fine <br>&gt; for you in 3.4.x?<br><br><br>987555 - is that fixed in 3.5? Or did it even make it into 3.4.2? Couldn't

find a note on that. Show stopper for moving from 3.2.x to anywhere for

me!<br>cheers<br>b<br>&gt; <br>&gt; Cheers,<br>&gt; Vijay<br>&gt; <br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br>-- <br>Bernhard Glomm<br>IT Administration<br><br>Phone: +49 (30) 86880 134<br>Fax: +49 (30) 86880 100<br>Skype: bernhard.glomm.ecologic<br><br>Ecologic Institut gemeinnützige GmbH | Pfalzburger Str. 43/44 | 10717 Berlin | Germany<br>GF: R. Andreas Kraemer | AG: Charlottenburg HRB 57947 | USt/VAT-IdNr.: DE811963464<br>Ecologic® is a Trade Mark (TM) of Ecologic Institut gemeinnützige GmbH<br><br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/475454d4/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/475454d4/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 36<br>Date: Tue, 10 Dec 2013 10:02:14 +0100<br>From: Johan Huysmans &lt;johan.huysmans@inuits.be&gt;<br>To: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Structure needs cleaning on some files<br>Message-ID: &lt;52A6D896.1020404@inuits.be&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;; Format=&quot;flowed&quot;<br><br>I could reproduce this problem while my mount point is running in

<br>debug mode.<br>logfile is attached.<br><br>gr.<br>Johan Huysmans<br><br>On 10-12-13 09:30, Johan Huysmans wrote:<br>&gt; Hi All,<br>&gt;<br>&gt; When reading some files we get this error:<br>&gt; md5sum: /path/to/file.xml: Structure needs cleaning<br>&gt;<br>&gt; in /var/log/glusterfs/mnt-sharedfs.log we see these errors:<br>&gt; [2013-12-10 08:07:32.256910] W <br>&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-0:

<br>&gt; remote operation failed: No such file or directory<br>&gt; [2013-12-10 08:07:32.257436] W <br>&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-1:

<br>&gt; remote operation failed: No such file or directory<br>&gt; [2013-12-10 08:07:32.259356] W [fuse-bridge.c:705:fuse_attr_cbk] <br>&gt; 0-glusterfs-fuse: 8230: STAT() /path/to/file.xml =&gt; -1 (Structure

<br>&gt; needs cleaning)<br>&gt;<br>&gt; We are using gluster 3.4.1-3 on CentOS6.<br>&gt; Our servers are 64-bit, our clients 32-bit (we are already using <br>&gt; --enable-ino32 on the mountpoint)<br>&gt;<br>&gt; This is my gluster configuration:<br>&gt; Volume Name: testvolume<br>&gt; Type: Replicate<br>&gt; Volume ID: ca9c2f87-5d5b-4439-ac32-b7c138916df7<br>&gt; Status: Started<br>&gt; Number of Bricks: 1 x 2 = 2<br>&gt; Transport-type: tcp<br>&gt; Bricks:<br>&gt; Brick1: SRV-1:/gluster/brick1<br>&gt; Brick2: SRV-2:/gluster/brick2<br>&gt; Options Reconfigured:<br>&gt; performance.force-readdirp: on<br>&gt; performance.stat-prefetch: off<br>&gt; network.ping-timeout: 5<br>&gt;<br>&gt; And this is how the applications work:<br>&gt; We have 2 client nodes which both have a fuse.glusterfs mountpoint.<br>&gt; On 1 client node we have an application which writes files.<br>&gt; On the other client node we have an application which reads these files.<br>&gt; On the node where the files are written we don't see any problem,

and <br>&gt; can read that file without problems.<br>&gt; On the other node we have problems (error messages above) reading

that <br>&gt; file.<br>&gt; The problem occurs when we perform a md5sum on the exact file; when we

<br>&gt; perform a md5sum on all files in that directory there is no problem.<br>&gt;<br>&gt;<br>&gt; How can we solve this problem? It is annoying.<br>&gt; The problem occurs after some time (can be days), an umount and mount

<br>&gt; of the mountpoint solves it for some days.<br>&gt; Once it occurs (and we don't remount) it occurs every time.<br>&gt;<br>&gt;<br>&gt; I hope someone can help me with this problems.<br>&gt;<br>&gt; Thanks,<br>&gt; Johan Huysmans<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br>-------------- next part --------------<br>A non-text attachment was scrubbed...<br>Name: gluster_debug.log<br>Type: text/x-log<br>Size: 16600 bytes<br>Desc: not available<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/bdf626dc/attachment-0001.bin"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/bdf626dc/attachment-0001.bin</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 37<br>Date: Tue, 10 Dec 2013 10:08:43 +0100<br>From: Heiko Kr?mer &lt;hkraemer@anynines.com&gt;<br>To: gluster-users@gluster.org<br>Subject: Re: [Gluster-users] Gluster infrastructure question<br>Message-ID: &lt;52A6DA1B.3030209@anynines.com&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;<br><br>-----BEGIN PGP SIGNED MESSAGE-----<br>Hash: SHA1<br><br>Hi guys,<br><br>thanks for all these reports. Well, I think I'll change my Raid level<br>to 6 and let the Raid controller build and rebuild all Raid members<br>and replicate again with glusterFS. 
I get more capacity but I need to<br>check if the write throughput is acceptable.<br><br>I think I can't take advantage of using glusterFS with a lot of<br>Bricks because I've found more cons than pros in my case.<br><br>@Ben thx for this very detailed document!<br><br><br>Cheers and Thanks<br>Heiko<br><br><br>On 10.12.2013 00:38, Dan Mons wrote:<br>&gt; On 10 December 2013 08:09, Joe Julian &lt;joe@julianfamily.org&gt;<br>&gt; wrote:<br>&gt;&gt; Replicas are defined in the order bricks are listed in the volume<br>&gt;&gt; create command. So gluster volume create myvol replica 2<br>&gt;&gt; server1:/data/brick1 server2:/data/brick1 server3:/data/brick1<br>&gt;&gt; server4:/data/brick1 will replicate between server1 and server2<br>&gt;&gt; and replicate between server3 and server4.<br>&gt;&gt; <br>&gt;&gt; Bricks added to a replica 2 volume after it's been created will<br>&gt;&gt; require pairs of bricks.<br>&gt;&gt; <br>&gt;&gt; The best way to &quot;force&quot; replication to happen on another

server<br>&gt;&gt; is to just define it that way.<br>&gt; <br>&gt; Yup, that's understood. &nbsp;The problem is when (for argument's

sake)<br>&gt; :<br>&gt; <br>&gt; * We've defined 4 hosts with 10 disks each * Each individual disk<br>&gt; is a brick * Replication is defined correctly when creating the<br>&gt; volume initially * I'm on holidays, my employer buys a single node,<br>&gt; configures it brick-per-disk, and the IT junior adds it to the<br>&gt; cluster<br>&gt; <br>&gt; All good up until that final point, and then I've got that fifth<br>&gt; node at the end replicating to itself. &nbsp;Node goes down some months<br>&gt; later, chaos ensues.<br>&gt; <br>&gt; Not a GlusterFS/technology problem, but a problem with what<br>&gt; frequently happens at a human level. &nbsp;As a sysadmin, these are

also<br>&gt; things I need to work around, even if it means deviating from best<br>&gt; practices. :)<br>&gt; <br>&gt; -Dan _______________________________________________ Gluster-users<br>&gt; mailing list Gluster-users@gluster.org <br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt; <br><br>- -- <br>Anynines.com<br><br>Avarteq GmbH<br>B.Sc. Informatik<br>Heiko Kr&auml;mer<br>CIO<br>Twitter: @anynines<br><br>- ----<br>Gesch&auml;ftsf&uuml;hrer: Alexander Fai&szlig;t, Dipl.-Inf.(FH) Julian Fischer<br>Handelsregister: AG Saarbr&uuml;cken HRB 17413, Ust-IdNr.: DE262633168<br>Sitz: Saarbr&uuml;cken<br>-----BEGIN PGP SIGNATURE-----<br>Version: GnuPG v1.4.14 (GNU/Linux)<br>Comment: Using GnuPG with Thunderbird - </font></tt><a href=http://www.enigmail.net/><tt><font size=2>http://www.enigmail.net/</font></tt></a><tt><font size=2><br><br>iQEcBAEBAgAGBQJSptoTAAoJELxFogM4ixOFJTsIAJBWed3AGiiI+PDC2ubfboKc<br>UPkMc+zuirRh2+QJBAoZ4CsAv9eIZ5NowclSSby9PTq2XRjjLvMdKuI+IbXCRT4j<br>AbMLYfP3g4Q+agXnY6N6WJ6ZIqXQ8pbCK3shYp9nBfVYkiDUT1bGk0WcgQmEWTCw<br>ta1h17LYkworIDRtqWQAl4jr4JR4P3x4cmwOZiHCVCtlyOP02x/fN4dji6nyOtuB<br>kQPBVsND5guQNU8Blg5cQoES5nthtuwJdkWXB+neaCZd/u3sexVSNe5m15iWbyYg<br>mAoVvlBJ473IKATlxM5nVqcUhmjFwNcc8MMwczXxTkwniYzth53BSoltPn7kIx4=<br>=epys<br>-----END PGP SIGNATURE-----<br>-------------- next part --------------<br>A non-text attachment was scrubbed...<br>Name: hkraemer.vcf<br>Type: text/x-vcard<br>Size: 277 bytes<br>Desc: not available<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/f663943d/attachment-0001.vcf"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/f663943d/attachment-0001.vcf</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 38<br>Date: Tue, 10 Dec 2013 10:42:43 +0100<br>From: Johan 
Huysmans &lt;johan.huysmans@inuits.be&gt;<br>To: gluster-users@gluster.org, bill.mair@web.de<br>Subject: Re: [Gluster-users] Errors from PHP stat() on files and<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

directories in a glusterfs mount<br>Message-ID: &lt;52A6E213.3000109@inuits.be&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;; Format=&quot;flowed&quot;<br><br>Hi,<br><br>It seems I have a related problem (just posted this on the mailing list).<br>Do you already have a solution for this problem?<br><br>gr.<br>Johan Huysmans<br><br>On 05-12-13 20:05, Bill Mair wrote:<br>&gt; Hi,<br>&gt;<br>&gt; I'm trying to use glusterfs to mirror the ownCloud &quot;data&quot;

area between <br>&gt; 2 servers.<br>&gt;<br>&gt; They are using Debian jessie due to some dependencies that I have

for <br>&gt; other components.<br>&gt;<br>&gt; This is where my issue rears its ugly head. This is failing because

I <br>&gt; can't stat the files and directories on my glusterfs mount.<br>&gt;<br>&gt; /var/www/owncloud/data is where I am mounting the volume and I can

<br>&gt; reproduce the error using a simple php test application, so I don't

<br>&gt; think that it is apache or owncloud related.<br>&gt;<br>&gt; I'd be grateful for any pointers on how to resolve this problem.<br>&gt;<br>&gt; Thanks,<br>&gt;<br>&gt; &nbsp; &nbsp; Bill<br>&gt;<br>&gt; Attached is &quot;simple.php&quot; test and the results of executing

&quot;strace <br>&gt; php5 simple.php&quot; twice, once with the glusterfs mounted <br>&gt; (simple.php.strace-glusterfs) and once against the file system when

<br>&gt; unmounted (simple.php.strace-unmounted).<br>&gt;<br>&gt; ------------------------------------------------------------------------<br>&gt;<br>&gt; Here is what I get in the gluster log when I run the test (as root):<br>&gt;<br>&gt; /var/log/glusterfs/var-www-owncloud-data.log<br>&gt;<br>&gt; [2013-12-05 18:33:50.802250] D <br>&gt; [client-handshake.c:185:client_start_ping] 0-gv-ocdata-client-0: <br>&gt; returning as transport is already disconnected OR there are no frames

<br>&gt; (0 || 0)<br>&gt; [2013-12-05 18:33:50.825132] D <br>&gt; [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] <br>&gt; 0-gv-ocdata-replicate-0: pending_matrix: [ 0 0 ]<br>&gt; [2013-12-05 18:33:50.825322] D <br>&gt; [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] <br>&gt; 0-gv-ocdata-replicate-0: pending_matrix: [ 0 0 ]<br>&gt; [2013-12-05 18:33:50.825393] D <br>&gt; [afr-self-heal-common.c:887:afr_mark_sources] 0-gv-ocdata-replicate-0:

<br>&gt; Number of sources: 0<br>&gt; [2013-12-05 18:33:50.825456] D <br>&gt; [afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type]

<br>&gt; 0-gv-ocdata-replicate-0: returning read_child: 0<br>&gt; [2013-12-05 18:33:50.825511] D <br>&gt; [afr-common.c:1380:afr_lookup_select_read_child] <br>&gt; 0-gv-ocdata-replicate-0: Source selected as 0 for /<br>&gt; [2013-12-05 18:33:50.825579] D <br>&gt; [afr-common.c:1117:afr_lookup_build_response_params] <br>&gt; 0-gv-ocdata-replicate-0: Building lookup response from 0<br>&gt; [2013-12-05 18:33:50.827069] D <br>&gt; [afr-common.c:131:afr_lookup_xattr_req_prepare] <br>&gt; 0-gv-ocdata-replicate-0: /check.txt: failed to get the gfid from dict<br>&gt; [2013-12-05 18:33:50.829409] D <br>&gt; [client-handshake.c:185:client_start_ping] 0-gv-ocdata-client-0: <br>&gt; returning as transport is already disconnected OR there are no frames

<br>&gt; (0 || 0)<br>&gt; [2013-12-05 18:33:50.836719] D <br>&gt; [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] <br>&gt; 0-gv-ocdata-replicate-0: pending_matrix: [ 0 0 ]<br>&gt; [2013-12-05 18:33:50.836870] D <br>&gt; [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] <br>&gt; 0-gv-ocdata-replicate-0: pending_matrix: [ 0 0 ]<br>&gt; [2013-12-05 18:33:50.836941] D <br>&gt; [afr-self-heal-common.c:887:afr_mark_sources] 0-gv-ocdata-replicate-0:

<br>&gt; Number of sources: 0<br>&gt; [2013-12-05 18:33:50.837002] D <br>&gt; [afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type]

<br>&gt; 0-gv-ocdata-replicate-0: returning read_child: 0<br>&gt; [2013-12-05 18:33:50.837058] D <br>&gt; [afr-common.c:1380:afr_lookup_select_read_child] <br>&gt; 0-gv-ocdata-replicate-0: Source selected as 0 for /check.txt<br>&gt; [2013-12-05 18:33:50.837129] D <br>&gt; [afr-common.c:1117:afr_lookup_build_response_params] <br>&gt; 0-gv-ocdata-replicate-0: Building lookup response from 0<br>&gt;<br>&gt; Other bits of information<br>&gt;<br>&gt; root@bbb-1:/var/www/owncloud# uname -a<br>&gt; Linux bbb-1 3.8.13-bone30 #1 SMP Thu Nov 14 02:59:07 UTC 2013 armv7l

<br>&gt; GNU/Linux<br>&gt;<br>&gt; root@bbb-1:/var/www/owncloud# dpkg -l glusterfs-*<br>&gt; Desired=Unknown/Install/Remove/Purge/Hold<br>&gt; | <br>&gt; Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend<br>&gt; |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)<br>&gt; ||/ Name Version &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; Architecture Description<br>&gt; +++-============================================-===========================-===========================-==============================================================================================<br>&gt; ii &nbsp;glusterfs-client 3.4.1-1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; armhf clustered <br>&gt; file-system (client package)<br>&gt; ii &nbsp;glusterfs-common 3.4.1-1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; armhf GlusterFS <br>&gt; common libraries and translator modules<br>&gt; ii &nbsp;glusterfs-server 3.4.1-1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; armhf clustered <br>&gt; file-system (server package)<br>&gt;<br>&gt; mount<br>&gt;<br>&gt; bbb-1:gv-ocdata on /var/www/owncloud/data type fuse.glusterfs <br>&gt; (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)<br>&gt;<br>&gt; /etc/fstab<br>&gt;<br>&gt; UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx /sdhc ext4 defaults 0 0<br>&gt; bbb-1:gv-ocdata /var/www/owncloud/data glusterfs <br>&gt; defaults,_netdev,log-level=DEBUG 0 0<br>&gt;<br>&gt; ls -al on the various paths<br>&gt;<br>&gt; root@bbb-1:/var/log/glusterfs# ll -d /sdhc/<br>&gt; drwxrwxr-x 7 root root 4096 Nov 28 19:15 /sdhc/<br>&gt;<br>&gt; root@bbb-1:/var/log/glusterfs# ll -d /sdhc/gv-ocdata/<br>&gt; drwxrwx--- 5 www-data www-data 4096 Dec &nbsp;5 00:50 /sdhc/gv-ocdata/<br>&gt;<br>&gt; root@bbb-1:/var/log/glusterfs# ll -d /sdhc/gv-ocdata/check.txt<br>&gt; -rw-r--r-- 2 root root 10 Dec &nbsp;5 00:50 /sdhc/gv-ocdata/check.txt<br>&gt;<br>&gt; root@bbb-1:/var/www/owncloud# ll -d /var/www/owncloud/data/<br>&gt; drwxrwx--- 5 www-data www-data 4096 Dec &nbsp;5 00:50 /var/www/owncloud/data/<br>&gt;<br>&gt; root@bbb-1:/var/www/owncloud# ll -d /var/www/owncloud/data/check.txt<br>&gt; -rw-r--r-- 1 root root 10 Dec &nbsp;5 00:50 /var/www/owncloud/data/check.txt<br>&gt;<br>&gt; file &amp; dir attr information:<br>&gt;<br>&gt; root@bbb-1:/var/www/owncloud# attr -l /var/www/owncloud/data<br>&gt; Attribute &quot;glusterfs.volume-id&quot; has a 16 byte value for

<br>&gt; /var/www/owncloud/data<br>&gt;<br>&gt; root@bbb-1:/var/www/owncloud# attr -l /var/www/owncloud/data/check.txt<br>&gt; root@bbb-1:/var/www/owncloud#<br>&gt;<br>&gt; root@bbb-1:/var/www/owncloud# attr -l /sdhc/gv-ocdata/<br>&gt; Attribute &quot;glusterfs.volume-id&quot; has a 16 byte value for

/sdhc/gv-ocdata/<br>&gt; Attribute &quot;gfid&quot; has a 16 byte value for /sdhc/gv-ocdata/<br>&gt; Attribute &quot;glusterfs.dht&quot; has a 16 byte value for /sdhc/gv-ocdata/<br>&gt; Attribute &quot;afr.gv-ocdata-client-0&quot; has a 12 byte value for

<br>&gt; /sdhc/gv-ocdata/<br>&gt; Attribute &quot;afr.gv-ocdata-client-1&quot; has a 12 byte value for

<br>&gt; /sdhc/gv-ocdata/<br>&gt;<br>&gt; root@bbb-1:/var/www/owncloud# attr -l /sdhc/gv-ocdata/check.txt<br>&gt; Attribute &quot;gfid&quot; has a 16 byte value for /sdhc/gv-ocdata/check.txt<br>&gt; Attribute &quot;afr.gv-ocdata-client-0&quot; has a 12 byte value for

<br>&gt; /sdhc/gv-ocdata/check.txt<br>&gt; Attribute &quot;afr.gv-ocdata-client-1&quot; has a 12 byte value for

<br>&gt; /sdhc/gv-ocdata/check.txt<br>&gt; root@bbb-1:/var/www/owncloud#<br>&gt;<br>&gt;<br>&gt;<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/d77e25bb/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/d77e25bb/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 39<br>Date: Tue, 10 Dec 2013 21:03:36 +1100<br>From: Andrew Lau &lt;andrew@andrewklau.com&gt;<br>To: Ben Turner &lt;bturner@redhat.com&gt;<br>Cc: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Gluster infrastructure question<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;CAD7dF9c3uexEG++1YEHwh3zw7a1Xy+=Co_xO+zrDrggDuV2DJQ@mail.gmail.com&gt;<br>Content-Type: text/plain; charset=&quot;utf-8&quot;<br><br>Hi Ben,<br><br>For glusterfs would you recommend the enterprise-storage<br>or throughput-performance tuned profile?<br><br>Thanks,<br>Andrew<br><br><br>On Tue, Dec 10, 2013 at 6:31 AM, Ben Turner &lt;bturner@redhat.com&gt;

wrote:<br><br>&gt; ----- Original Message -----<br>&gt; &gt; From: &quot;Ben Turner&quot; &lt;bturner@redhat.com&gt;<br>&gt; &gt; To: &quot;Heiko Kr&auml;mer&quot; &lt;hkraemer@anynines.de&gt;<br>&gt; &gt; Cc: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>&gt; &gt; Sent: Monday, December 9, 2013 2:26:45 PM<br>&gt; &gt; Subject: Re: [Gluster-users] Gluster infrastructure question<br>&gt; &gt;<br>&gt; &gt; ----- Original Message -----<br>&gt; &gt; &gt; From: &quot;Heiko Kr&auml;mer&quot; &lt;hkraemer@anynines.de&gt;<br>&gt; &gt; &gt; To: &quot;gluster-users@gluster.org List&quot; &lt;gluster-users@gluster.org&gt;<br>&gt; &gt; &gt; Sent: Monday, December 9, 2013 8:18:28 AM<br>&gt; &gt; &gt; Subject: [Gluster-users] Gluster infrastructure question<br>&gt; &gt; &gt;<br>&gt; &gt; &gt; -----BEGIN PGP SIGNED MESSAGE-----<br>&gt; &gt; &gt; Hash: SHA1<br>&gt; &gt; &gt;<br>&gt; &gt; &gt; Heyho guys,<br>&gt; &gt; &gt;<br>&gt; &gt; &gt; I've been running glusterfs for years in a small environment

without big<br>&gt; &gt; &gt; problems.<br>&gt; &gt; &gt;<br>&gt; &gt; &gt; Now I'm going to use glusterFS for a bigger cluster but

I've some<br>&gt; &gt; &gt; questions :)<br>&gt; &gt; &gt;<br>&gt; &gt; &gt; Environment:<br>&gt; &gt; &gt; * 4 Servers<br>&gt; &gt; &gt; * 20 x 2TB HDD, each<br>&gt; &gt; &gt; * Raidcontroller<br>&gt; &gt; &gt; * Raid 10<br>&gt; &gt; &gt; * 4x bricks =&gt; Replicated, Distributed volume<br>&gt; &gt; &gt; * Gluster 3.4<br>&gt; &gt; &gt;<br>&gt; &gt; &gt; 1)<br>&gt; &gt; &gt; I'm wondering if I can delete the raid10 on each server

and create<br>&gt; &gt; &gt; for each HDD a separate brick.<br>&gt; &gt; &gt; In this case I'd have a volume with 80 bricks, so 4 servers x 20 HDDs.

Is there<br>&gt; &gt; &gt; any experience about the write throughput in a production

system with<br>&gt; &gt; &gt; many of bricks like in this case? In addition i'll get double

of HDD<br>&gt; &gt; &gt; capacity.<br>&gt; &gt;<br>&gt; &gt; Have a look at:<br>&gt; &gt;<br>&gt; &gt; </font></tt><a href="http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf"><tt><font size=2>http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf</font></tt></a><tt><font size=2><br>&gt;<br>&gt; That one was from 2012, here is the latest:<br>&gt;<br>&gt;<br>&gt; </font></tt><a href="http://rhsummit.files.wordpress.com/2013/07/england_th_0450_rhs_perf_practices-4_neependra.pdf"><tt><font size=2>http://rhsummit.files.wordpress.com/2013/07/england_th_0450_rhs_perf_practices-4_neependra.pdf</font></tt></a><tt><font size=2><br>&gt;<br>&gt; -b<br>&gt;<br>&gt; &gt; Specifically:<br>&gt; &gt;<br>&gt; &gt; * RAID arrays<br>&gt; &gt; * More RAID LUNs for better concurrency<br>&gt; &gt; * For RAID6, 256-KB stripe size<br>&gt; &gt;<br>&gt; &gt; I use a single RAID 6 that is divided into several LUNs for my

bricks.<br>&gt; &nbsp;For<br>&gt; &gt; example, on my Dell servers(with PERC6 RAID controllers) each

server has<br>&gt; 12<br>&gt; &gt; disks that I put into raid 6. &nbsp;Then I break the RAID 6 into

6 LUNs and<br>&gt; &gt; create a new PV/VG/LV for each brick. &nbsp;From there I follow

the<br>&gt; &gt; recommendations listed in the presentation.<br>&gt; &gt;<br>&gt; &gt; HTH!<br>&gt; &gt;<br>&gt; &gt; -b<br>&gt; &gt;<br>&gt; &gt; &gt; 2)<br>&gt; &gt; &gt; I've heard a talk about glusterFS and out scaling. The main

point was<br>&gt; &gt; &gt; if more bricks are in use, the scale out process will take

a long<br>&gt; &gt; &gt; time. The problem was/is the Hash-Algo. So I'm asking me

how is it if<br>&gt; &gt; &gt; I've one very big brick (Raid10 20TB on each server) or

I've much more<br>&gt; &gt; &gt; bricks, what's faster and is there any issues?<br>&gt; &gt; &gt; Is there any experiences ?<br>&gt; &gt; &gt;<br>&gt; &gt; &gt; 3)<br>&gt; &gt; &gt; Failover of a HDD is for a raid controller with HotSpare

HDD not a big<br>&gt; &gt; &gt; deal. Glusterfs will rebuild automatically if a brick fails

and there<br>&gt; &gt; &gt; are no data present, this action will perform a lot of network

traffic<br>&gt; &gt; &gt; between the mirror bricks but it will handle it equal as

the raid<br>&gt; &gt; &gt; controller right ?</font></tt><br><tt><font size=2>&gt; &gt; &gt;<br>&gt; &gt; &gt;<br>&gt; &gt; &gt;<br>&gt; &gt; &gt; Thanks and cheers<br>&gt; &gt; &gt; Heiko<br>&gt; &gt; &gt;<br>&gt; &gt; &gt;<br>&gt; &gt; &gt;<br>&gt; &gt; &gt; - --<br>&gt; &gt; &gt; Anynines.com<br>&gt; &gt; &gt;<br>&gt; &gt; &gt; Avarteq GmbH<br>&gt; &gt; &gt; B.Sc. Informatik<br>&gt; &gt; &gt; Heiko Kr&auml;mer<br>&gt; &gt; &gt; CIO<br>&gt; &gt; &gt; Twitter: @anynines<br>&gt; &gt; &gt;<br>&gt; &gt; &gt; - ----<br>&gt; &gt; &gt; Gesch&auml;ftsf&uuml;hrer: Alexander Fai&szlig;t, Dipl.-Inf.(FH) Julian

Fischer<br>&gt; &gt; &gt; Handelsregister: AG Saarbr&uuml;cken HRB 17413, Ust-IdNr.: DE262633168<br>&gt; &gt; &gt; Sitz: Saarbr&uuml;cken<br>&gt; &gt; &gt; -----BEGIN PGP SIGNATURE-----<br>&gt; &gt; &gt; Version: GnuPG v1.4.14 (GNU/Linux)<br>&gt; &gt; &gt; Comment: Using GnuPG with Thunderbird - </font></tt><a href=http://www.enigmail.net/><tt><font size=2>http://www.enigmail.net/</font></tt></a><tt><font size=2><br>&gt; &gt; &gt;<br>&gt; &gt; &gt; iQEcBAEBAgAGBQJSpcMfAAoJELxFogM4ixOF/ncH/3L9DvOWHrF0XBqCgeT6QQ6B<br>&gt; &gt; &gt; lDwtXiD9xoznht0Zs2S9LA9Z7r2l5/fzMOUSOawEMv6M16Guwq3gQ1lClUi4Iwj0<br>&gt; &gt; &gt; GKKtYQ6F4aG4KXHY4dlu1QKT5OaLk8ljCQ47Tc9aAiJMhfC1/IgQXOslFv26utdJ<br>&gt; &gt; &gt; N9jxiCl2+r/tQvQRw6mA4KAuPYPwOV+hMtkwfrM4UsIYGGbkNPnz1oqmBsfGdSOs<br>&gt; &gt; &gt; TJh6+lQRD9KYw72q3I9G6ZYlI7ylL9Q7vjTroVKH232pLo4G58NLxyvWvcOB9yK6<br>&gt; &gt; &gt; Bpf/gRMxFNKA75eW5EJYeZ6EovwcyCAv7iAm+xNKhzsoZqbBbTOJxS5zKm4YWoY=<br>&gt; &gt; &gt; =bDly<br>&gt; &gt; &gt; -----END PGP SIGNATURE-----<br>&gt; &gt; &gt;<br>&gt; &gt; &gt; _______________________________________________<br>&gt; &gt; &gt; Gluster-users mailing list<br>&gt; &gt; &gt; Gluster-users@gluster.org<br>&gt; &gt; &gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt; &gt; _______________________________________________<br>&gt; &gt; Gluster-users mailing list<br>&gt; &gt; Gluster-users@gluster.org<br>&gt; &gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font 
size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;<br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/b19779ff/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/b19779ff/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>Message: 40<br>Date: Tue, 10 Dec 2013 15:34:56 +0530<br>From: Vijay Bellur &lt;vbellur@redhat.com&gt;<br>To: Bernhard Glomm &lt;bernhard.glomm@ecologic.eu&gt;, mrcuongnv@gmail.com<br>Cc: gluster-users@gluster.org<br>Subject: Re: [Gluster-users] replace-brick failing -<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

transport.address-family not specified<br>Message-ID: &lt;52A6E748.5070300@redhat.com&gt;<br>Content-Type: text/plain; charset=UTF-8; format=flowed<br><br>On 12/10/2013 02:26 PM, Bernhard Glomm wrote:<br>&gt; Am 10.12.2013 06:39:47, schrieb Vijay Bellur:<br>&gt;<br>&gt; &nbsp; &nbsp; On 12/08/2013 07:06 PM, Nguyen Viet Cuong wrote:<br>&gt;<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Thanks for sharing.<br>&gt;<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; Btw, I do believe that GlusterFS 3.2.x

is much more stable than<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; 3.4.x in<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; production.<br>&gt;<br>&gt;<br>&gt; &nbsp; &nbsp; This is quite contrary to what we have seen in the community.

From a<br>&gt; &nbsp; &nbsp; development perspective too, we feel much better about

3.4.1. Are there<br>&gt; &nbsp; &nbsp; specific instances that worked well with 3.2.x which

do not work fine<br>&gt; &nbsp; &nbsp; for you in 3.4.x?<br>&gt;<br>&gt;<br>&gt; 987555 - is that fixed in 3.5?<br>&gt;<br>&gt; Or did it even make it into 3.4.2?<br>&gt;<br>&gt; couldn't find a note on that.<br>&gt;<br><br>Yes, this will be part of 3.4.2. Note that the original problem was due

<br>to libvirt being rigid about the ports that it needs to use for <br>migrations. AFAIK this has been addressed in upstream libvirt as well.<br>Through this bug fix, glusterfs provides a mechanism where it can use a

<br>separate range of ports for bricks. This configuration can be enabled to

<br>work with other applications that do not adhere to the guidelines laid out

<br>by IANA.<br><br>Cheers,<br>Vijay<br><br><br><br><br>------------------------------<br><br>Message: 41<br>Date: Tue, 10 Dec 2013 15:38:16 +0530<br>From: Vijay Bellur &lt;vbellur@redhat.com&gt;<br>To: Alexandru Coseru &lt;alex.coseru@simplus.ro&gt;,<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

gluster-users@gluster.org<br>Subject: Re: [Gluster-users] Gluster - replica - Unable to self-heal<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

contents of '/' (possible split-brain)<br>Message-ID: &lt;52A6E810.9050900@redhat.com&gt;<br>Content-Type: text/plain; charset=windows-1252; format=flowed<br><br>On 12/09/2013 07:21 PM, Alexandru Coseru wrote:<br><br>&gt;<br>&gt; [2013-12-09 13:20:52.066978] E<br>&gt; [afr-self-heal-common.c:197:afr_sh_print_split_brain_log]<br>&gt; 0-stor1-replicate-0: Unable to self-heal contents of '/' (possible<br>&gt; split-brain). Please delete the file from all but the preferred<br>&gt; subvolume.- Pending matrix: &nbsp;[ [ 0 2 ] [ 2 0 ] ]<br>&gt;<br>&gt; [2013-12-09 13:20:52.067386] E<br>&gt; [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk]<br>&gt; 0-stor1-replicate-0: background &nbsp;meta-data self-heal failed on

/<br>&gt;<br>&gt; [2013-12-09 13:20:52.067452] E [mount3.c:290:mnt3svc_lookup_mount_cbk]<br>&gt; 0-nfs: error=Input/output error<br>&gt;<br>&gt; [2013-12-09 13:20:53.092039] E<br>&gt; [afr-self-heal-common.c:197:afr_sh_print_split_brain_log]<br>&gt; 0-stor1-replicate-0: Unable to self-heal contents of '/' (possible<br>&gt; split-brain). Please delete the file from all but the preferred<br>&gt; subvolume.- Pending matrix: &nbsp;[ [ 0 2 ] [ 2 0 ] ]<br>&gt;<br>&gt; [2013-12-09 13:20:53.092497] E<br>&gt; [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk]<br>&gt; 0-stor1-replicate-0: background &nbsp;meta-data self-heal failed on

/<br>&gt;<br>&gt; [2013-12-09 13:20:53.092559] E [mount3.c:290:mnt3svc_lookup_mount_cbk]<br>&gt; 0-nfs: error=Input/output error<br>&gt;<br>&gt; What am I doing wrong?<br><br>Looks like there is a metadata split-brain on /.<br><br>The split-brain resolution document at [1] can possibly be of help here.<br><br>-Vijay<br><br>[1] </font></tt><a href="https://github.com/gluster/glusterfs/blob/master/doc/split-brain.md"><tt><font size=2>https://github.com/gluster/glusterfs/blob/master/doc/split-brain.md</font></tt></a><tt><font size=2><br><br><br>&gt;<br>&gt; PS: &nbsp;Volume stor_fast works like a charm.<br>&gt;<br><br>Good to know, thanks!<br><br><br><br>------------------------------<br><br>Message: 42<br>Date: Tue, 10 Dec 2013 11:59:44 +0100<br>From: &quot;Mariusz Sobisiak&quot; &lt;MSobisiak@ydp.pl&gt;<br>To: &lt;gluster-users@gluster.org&gt;<br>Subject: [Gluster-users] Error after crash of Virtual Machine during<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

migration<br>Message-ID:<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&lt;507D8C234E515F4F969362F9666D7EBBE875D1@nagato1.intranet.ydp&gt;<br>Content-Type: text/plain; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;

&nbsp; &nbsp; charset=&quot;us-ascii&quot;<br><br>Greetings,<br><br>Legend:<br>storage-gfs-3-prd - the first gluster.<br>storage-1-saas - new gluster where &quot;the first gluster&quot; had to

be<br>migrated.<br>storage-gfs-4-prd - the second gluster (which had to be migrated later).<br><br>I started the replace-brick command:<br>'gluster volume replace-brick sa_bookshelf storage-gfs-3-prd:/ydp/shared<br>storage-1-saas:/ydp/shared start'<br><br>During that, the virtual machine (Xen) crashed. Now I can't abort<br>the migration and continue it again.<br><br>When I try:<br>'# gluster volume replace-brick sa_bookshelf<br>storage-gfs-3-prd:/ydp/shared storage-1-saas:/ydp/shared abort'<br>The command runs for about 5 minutes, then finishes with no result. Apart<br>from that, after that command Gluster starts to behave very strangely. <br>For example I can't do '# gluster volume heal sa_bookshelf info' because<br>it runs for about 5 minutes and returns a black screen (the same as abort).<br><br>Then I restart the Gluster server and Gluster returns to normal operation, except for<br>the replace-brick commands. When I do:<br>'# gluster volume replace-brick sa_bookshelf<br>storage-gfs-3-prd:/ydp/shared storage-1-saas:/ydp/shared status'<br>I get:<br>Number of files migrated = 0 &nbsp; &nbsp; &nbsp; Current file=<br>I can do 'volume heal info' commands etc. until I call the command:<br>'# gluster volume replace-brick sa_bookshelf<br>storage-gfs-3-prd:/ydp/shared storage-1-saas:/ydp/shared abort'.<br><br><br><br># gluster --version<br>glusterfs 3.3.1 built on Oct 22 2012 07:54:24<br>Repository revision:<br>git://git.gluster.com/glusterfs.git<br>Copyright (c) 2006-2011 Gluster Inc. &lt;</font></tt><a href=http://www.gluster.com/><tt><font size=2>http://www.gluster.com</font></tt></a><tt><font size=2>&gt;

GlusterFS<br>comes with ABSOLUTELY NO WARRANTY.<br>You may redistribute copies of GlusterFS under the terms of the GNU<br>General Public License.<br><br>Brick (/ydp/shared) logs (repeats the same constantly):<br>[2013-12-06 11:29:44.790299] W [dict.c:995:data_to_str]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_connect+0xab<br>) [0x7ff4a5d35fcb]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_client_get_r<br>emote_sockaddr+0x15d) [0x7ff4a5d3d64d]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(client_fill_address<br>_family+0x2bb) [0x7ff4a5d3d4ab]))) 0-dict: data is NULL<br>[2013-12-06 11:29:44.790402] W [dict.c:995:data_to_str]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_connect+0xab<br>) [0x7ff4a5d35fcb]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_client_get_r<br>emote_sockaddr+0x15d) [0x7ff4a5d3d64d]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(client_fill_address<br>_family+0x2c6) [0x7ff4a5d3d4b6]))) 0-dict: data is NULL<br>[2013-12-06 11:29:44.790465] E [name.c:141:client_fill_address_family]<br>0-sa_bookshelf-replace-brick: transport.address-family not specified.<br>Could not guess default value from (remote-host:(null) or<br>transport.unix.connect-path:(null)) options<br>[2013-12-06 11:29:47.791037] W [dict.c:995:data_to_str]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_connect+0xab<br>) [0x7ff4a5d35fcb]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_client_get_r<br>emote_sockaddr+0x15d) [0x7ff4a5d3d64d]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(client_fill_address<br>_family+0x2bb) [0x7ff4a5d3d4ab]))) 0-dict: data is NULL<br>[2013-12-06 11:29:47.791141] W [dict.c:995:data_to_str]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_connect+0xab<br>) [0x7ff4a5d35fcb]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_client_get_r<br>emote_sockaddr+0x15d) 
[0x7ff4a5d3d64d]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(client_fill_address<br>_family+0x2c6) [0x7ff4a5d3d4b6]))) 0-dict: data is NULL<br>[2013-12-06 11:29:47.791174] E [name.c:141:client_fill_address_family]<br>0-sa_bookshelf-replace-brick: transport.address-family not specified.<br>Could not guess default value from (remote-host:(null) or<br>transport.unix.connect-path:(null)) options<br>[2013-12-06 11:29:50.791775] W [dict.c:995:data_to_str]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_connect+0xab<br>) [0x7ff4a5d35fcb]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_client_get_r<br>emote_sockaddr+0x15d) [0x7ff4a5d3d64d]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(client_fill_address<br>_family+0x2bb) [0x7ff4a5d3d4ab]))) 0-dict: data is NULL<br>[2013-12-06 11:29:50.791986] W [dict.c:995:data_to_str]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_connect+0xab<br>) [0x7ff4a5d35fcb]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_client_get_r<br>emote_sockaddr+0x15d) [0x7ff4a5d3d64d]<br>(--&gt;/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(client_fill_address<br>_family+0x2c6) [0x7ff4a5d3d4b6]))) 0-dict: data is NULL<br>[2013-12-06 11:29:50.792046] E [name.c:141:client_fill_address_family]<br>0-sa_bookshelf-replace-brick: transport.address-family not specified.<br>Could not guess default value from (remote-host:(null) or<br>transport.unix.connect-path:(null)) options<br><br><br># gluster volume info<br><br>Volume Name: sa_bookshelf<br>Type: Distributed-Replicate<br>Volume ID: 74512f52-72ec-4538-9a54-4e50c4691722<br>Status: Started<br>Number of Bricks: 2 x 2 = 4<br>Transport-type: tcp<br>Bricks:<br>Brick1: storage-gfs-3-prd:/ydp/shared<br>Brick2: storage-gfs-4-prd:/ydp/shared<br>Brick3: storage-gfs-3-prd:/ydp/shared2<br>Brick4: storage-gfs-4-prd:/ydp/shared2<br><br><br># gluster volume status<br>Status of volume: sa_bookshelf<br>Gluster process &nbsp; 
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Port &nbsp; &nbsp;Online &nbsp; Pid<br>
------------------------------------------------------------------------------<br>
Brick storage-gfs-3-prd:/ydp/shared &nbsp; &nbsp; 24009 &nbsp; Y &nbsp; &nbsp; &nbsp; &nbsp;758<br>
Brick storage-gfs-4-prd:/ydp/shared &nbsp; &nbsp; 24009 &nbsp; Y &nbsp; &nbsp; &nbsp; &nbsp;730<br>
Brick storage-gfs-3-prd:/ydp/shared2 &nbsp; &nbsp;24010 &nbsp; Y &nbsp; &nbsp; &nbsp; &nbsp;764<br>
Brick storage-gfs-4-prd:/ydp/shared2 &nbsp; &nbsp;24010 &nbsp; Y &nbsp; &nbsp; &nbsp; &nbsp;4578<br>
NFS Server on localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 38467 &nbsp; Y &nbsp; &nbsp; &nbsp; &nbsp;770<br>
Self-heal Daemon on localhost &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; N/A &nbsp; &nbsp; Y &nbsp; &nbsp; &nbsp; &nbsp;776<br>
NFS Server on storage-1-saas &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;38467 &nbsp; Y &nbsp; &nbsp; &nbsp; &nbsp;840<br>
Self-heal Daemon on storage-1-saas &nbsp; &nbsp; &nbsp;N/A &nbsp; &nbsp; Y &nbsp; &nbsp; &nbsp; &nbsp;846<br>
NFS Server on storage-gfs-4-prd &nbsp; &nbsp; &nbsp; &nbsp; 38467 &nbsp; Y &nbsp; &nbsp; &nbsp; &nbsp;4584<br>
Self-heal Daemon on storage-gfs-4-prd &nbsp; N/A &nbsp; &nbsp; Y &nbsp; &nbsp; &nbsp; &nbsp;4590<br><br>storage-gfs-3-prd:~# gluster peer status<br>Number of Peers: 2<br><br>Hostname: storage-1-saas<br>Uuid: 37b9d881-ce24-4550-b9de-6b304d7e9d07<br>State: Peer in Cluster (Connected)<br><br>Hostname: storage-gfs-4-prd<br>Uuid: 4c384f45-873b-4c12-9683-903059132c56<br>State: Peer in Cluster (Connected)<br><br><br>(from storage-1-saas)# gluster peer status<br>Number of Peers: 2<br><br>Hostname: 172.16.3.60<br>Uuid: 1441a7b0-09d2-4a40-a3ac-0d0e546f6884<br>State: Peer in Cluster (Connected)<br><br>Hostname: storage-gfs-4-prd<br>Uuid: 4c384f45-873b-4c12-9683-903059132c56<br>State: Peer in Cluster (Connected)<br><br><br><br>Clients work properly.<br>I googled this and found that it was a bug, but in version 3.3.0. How<br>can I repair this and continue my migration? Thank you for any help.<br><br>BTW: I moved the Gluster server by following the &quot;Gluster 3.4: Brick<br>Restoration - Replace Crashed Server&quot; how-to.<br><br>Regards,<br>Mariusz<br><br><br>------------------------------<br><br>Message: 43<br>Date: Tue, 10 Dec 2013 12:52:29 +0100<br>From: Johan Huysmans &lt;johan.huysmans@inuits.be&gt;<br>To: &quot;gluster-users@gluster.org&quot; &lt;gluster-users@gluster.org&gt;<br>Subject: Re: [Gluster-users] Structure needs cleaning on some files<br>Message-ID: &lt;52A7007D.6020005@inuits.be&gt;<br>Content-Type: text/plain; charset=&quot;iso-8859-1&quot;; Format=&quot;flowed&quot;<br><br>Hi All,<br><br>It seems I can easily reproduce the problem.<br><br>* on node 1 create a file (touch, cat, ...).<br>* on node 2 take the md5sum of the file directly (md5sum /path/to/file)<br>* on node 1 move the file to another name (mv file file1)<br>* on node 2 take the md5sum of the file directly (md5sum /path/to/file), this is

<br>still working although the file is not really there<br>* on node 1 change the file content<br>* on node 2 take the md5sum of the file directly (md5sum /path/to/file), this is

<br>still working and has a changed md5sum<br><br>This is really strange behaviour.<br>Is this normal? Can this be altered with a setting?<br><br>Thanks for any info,<br>gr.<br>Johan<br><br>On 10-12-13 10:02, Johan Huysmans wrote:<br>&gt; I could reproduce this problem while my mount point is running

in <br>&gt; debug mode.<br>&gt; logfile is attached.<br>&gt;<br>&gt; gr.<br>&gt; Johan Huysmans<br>&gt;<br>&gt; On 10-12-13 09:30, Johan Huysmans wrote:<br>&gt;&gt; Hi All,<br>&gt;&gt;<br>&gt;&gt; When reading some files we get this error:<br>&gt;&gt; md5sum: /path/to/file.xml: Structure needs cleaning<br>&gt;&gt;<br>&gt;&gt; in /var/log/glusterfs/mnt-sharedfs.log we see these errors:<br>&gt;&gt; [2013-12-10 08:07:32.256910] W <br>&gt;&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-0:

<br>&gt;&gt; remote operation failed: No such file or directory<br>&gt;&gt; [2013-12-10 08:07:32.257436] W <br>&gt;&gt; [client-rpc-fops.c:526:client3_3_stat_cbk] 1-testvolume-client-1:

<br>&gt;&gt; remote operation failed: No such file or directory<br>&gt;&gt; [2013-12-10 08:07:32.259356] W [fuse-bridge.c:705:fuse_attr_cbk]

<br>&gt;&gt; 0-glusterfs-fuse: 8230: STAT() /path/to/file.xml =&gt; -1 (Structure

<br>&gt;&gt; needs cleaning)<br>&gt;&gt;<br>&gt;&gt; We are using gluster 3.4.1-3 on CentOS6.<br>&gt;&gt; Our servers are 64-bit, our clients 32-bit (we are already using

<br>&gt;&gt; --enable-ino32 on the mountpoint)<br>&gt;&gt;<br>&gt;&gt; This is my gluster configuration:<br>&gt;&gt; Volume Name: testvolume<br>&gt;&gt; Type: Replicate<br>&gt;&gt; Volume ID: ca9c2f87-5d5b-4439-ac32-b7c138916df7<br>&gt;&gt; Status: Started<br>&gt;&gt; Number of Bricks: 1 x 2 = 2<br>&gt;&gt; Transport-type: tcp<br>&gt;&gt; Bricks:<br>&gt;&gt; Brick1: SRV-1:/gluster/brick1<br>&gt;&gt; Brick2: SRV-2:/gluster/brick2<br>&gt;&gt; Options Reconfigured:<br>&gt;&gt; performance.force-readdirp: on<br>&gt;&gt; performance.stat-prefetch: off<br>&gt;&gt; network.ping-timeout: 5<br>&gt;&gt;<br>&gt;&gt; And this is how the applications work:<br>&gt;&gt; We have 2 client nodes that both have a fuse.glusterfs mountpoint.<br>&gt;&gt; On 1 client node we have an application which writes files.<br>&gt;&gt; On the other client node we have an application which reads these

files.<br>&gt;&gt; On the node where the files are written we don't see any problem,

and <br>&gt;&gt; can read that file without problems.<br>&gt;&gt; On the other node we have problems (error messages above) reading

<br>&gt;&gt; that file.<br>&gt;&gt; The problem occurs when we perform a md5sum on the exact file,

when <br>&gt;&gt; we perform a md5sum on all files in that directory there is no problem.<br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt; How can we solve this problem, as this is annoying?<br>&gt;&gt; The problem occurs after some time (can be days), an umount and

mount <br>&gt;&gt; of the mountpoint solves it for some days.<br>&gt;&gt; Once it occurs (and we don't remount) it occurs every time.<br>&gt;&gt;<br>&gt;&gt;<br>&gt;&gt; I hope someone can help me with this problem.<br>&gt;&gt;<br>&gt;&gt; Thanks,<br>&gt;&gt; Johan Huysmans<br>&gt;&gt; _______________________________________________<br>&gt;&gt; Gluster-users mailing list<br>&gt;&gt; Gluster-users@gluster.org<br>&gt;&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br>&gt;<br>&gt;<br>&gt;<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt; Gluster-users@gluster.org<br>&gt; </font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br>-------------- next part --------------<br>An HTML attachment was scrubbed...<br>URL: &lt;</font></tt><a href="http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/32f9069c/attachment-0001.html"><tt><font size=2>http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131210/32f9069c/attachment-0001.html</font></tt></a><tt><font size=2>&gt;<br><br>------------------------------<br><br>_______________________________________________<br>Gluster-users mailing list<br>Gluster-users@gluster.org<br></font></tt><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"><tt><font size=2>http://supercolony.gluster.org/mailman/listinfo/gluster-users</font></tt></a><tt><font size=2><br><br>End of Gluster-users Digest, Vol 68, Issue 11<br>*********************************************<br></font></tt><br><BR>
