<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
On 11/06/2013 11:52 AM, Justin Dossey wrote:<br>
<blockquote
cite="mid:CAPMPShxa3fbp4BH4-W5opG+ObiFCfk_OKSvL92FmzKguRKVJ0Q@mail.gmail.com"
type="cite">
<div dir="ltr">Shawn,
<div><br>
</div>
<div>I had a very similar experience with a rebalance on 3.3.1,
and it took weeks to get everything straightened out. I would
be happy to share the scripts I wrote to correct the
permissions issues if you wish, though I'm not sure it would
be appropriate to share them directly on this list. Perhaps I
should just create a project on GitHub that is devoted to
collecting scripts people use to fix their GlusterFS
environments! </div>
<div><br>
</div>
<div>After that (awful) experience, I am loath to run further
rebalances. I've even spent days evaluating alternatives to
GlusterFS, as my experience with this list over the last six
months indicates that support for community users is minimal,
even in the face of major bugs such as the one with
rebalancing and the continuing "gfid different on subvolume"
bugs with 3.3.2.</div>
</div>
</blockquote>
I'm one of the oldest GlusterFS users around here and one of the biggest
proponents, and even I have been loath to rebalance until 3.4.1.<br>
<br>
There are no open bugs for gfid mismatches that I could find. The
last time someone mentioned that error in IRC it was 2am, I was at a
convention, and I told the user how to solve that problem
(<a class="moz-txt-link-freetext" href="http://irclog.perlgeek.de/gluster/2013-06-14#i_7196149">http://irclog.perlgeek.de/gluster/2013-06-14#i_7196149</a>). It was
caused by split-brain. If you have a bug, it would be more
productive to file it than to make negative comments about a
community of people who have no obligation to help anybody, but do
it anyway just because they're nice people.<br>
<br>
This is going to sound snarky because it's in text, but I mean this
sincerely. If community support is not sufficient, you might
consider purchasing support from a company that provides it
professionally.<br>
<br>
<blockquote
cite="mid:CAPMPShxa3fbp4BH4-W5opG+ObiFCfk_OKSvL92FmzKguRKVJ0Q@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><br>
</div>
<div>Let me know what you think of the GitHub thing and I'll
proceed appropriately.</div>
</div>
</blockquote>
Even better, put them up on <a class="moz-txt-link-freetext" href="http://forge.gluster.org">http://forge.gluster.org</a><br>
<br>
<blockquote
cite="mid:CAPMPShxa3fbp4BH4-W5opG+ObiFCfk_OKSvL92FmzKguRKVJ0Q@mail.gmail.com"
type="cite">
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Tue, Nov 5, 2013 at 9:05 PM, Shawn
Heisey <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:gluster@elyograg.org" target="_blank">gluster@elyograg.org</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">We
recently added storage servers to our gluster install,
running 3.3.1<br>
on CentOS 6. It went from 40TB usable (8x2
distribute-replicate) to<br>
80TB usable (16x2). There was a little bit over 20TB used
space on the<br>
volume.<br>
<br>
The add-brick went through without incident, but the
rebalance failed<br>
after moving 1.5TB of the approximately 10TB that needed to
be moved. A<br>
side issue is that it took four days for that 1.5TB to move.
I'm aware<br>
that gluster has overhead, and that there's only so much
speed you can<br>
get out of gigabit, but a 100Mb/s half-duplex link could
have copied the<br>
data faster if it had been a straight copy.<br>
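<br>
As a rough check of the figures above, the average rate implied by 1.5TB in four days can be compared with a 100Mb/s line rate. This is a sketch only: it assumes decimal terabytes and ignores protocol overhead, and the function names are hypothetical.<br>
<pre wrap="">
```python
def effective_mbit_per_s(bytes_moved, seconds):
    """Average throughput in megabits per second (decimal units)."""
    return bytes_moved * 8 / seconds / 1e6


def hours_at_link_rate(bytes_moved, link_mbit_per_s):
    """Hours needed to move the data at a given line rate, ignoring overhead."""
    return bytes_moved * 8 / (link_mbit_per_s * 1e6) / 3600


TB = 10**12                  # assumption: decimal terabytes
moved = 1.5 * TB             # data moved before the rebalance failed
elapsed = 4 * 24 * 3600      # four days, in seconds

print(round(effective_mbit_per_s(moved, elapsed), 1))  # 34.7 (Mb/s achieved)
print(round(hours_at_link_rate(moved, 100), 1))        # 33.3 (hours at 100Mb/s)
```
</pre>
Under those assumptions the rebalance averaged about a third of what a 100Mb/s link could carry.<br>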
<br>
After I discovered that the rebalance had failed, I noticed
that there<br>
were other problems. There are a small number of completely
lost files<br>
(91 that I know about so far), a huge number of permission
issues (over<br>
800,000 files changed to 000), and about 32,000 files that
are throwing<br>
read errors via the fuse/nfs mount but seem to be available
directly on<br>
bricks. That last category of problem file has the sticky
bit set, with<br>
almost all of them having ---------T permissions. The good
files on<br>
bricks typically have the same permissions, but are readable
by root. I<br>
haven't worked out the scripting necessary to automate all
the fixing<br>
that needs to happen yet.<br>
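<br>
The two permission patterns described above (mode 000 and ---------T) can be told apart from the mode bits alone. What follows is a minimal, read-only sketch, assuming it is run with enough privilege to walk a brick directory. The names classify_mode and scan and the /path/to/brick placeholder are hypothetical and not part of any GlusterFS tool. It only counts files and changes nothing.<br>
<pre wrap="">
```python
import os
import stat


def classify_mode(mode):
    """Label a file mode according to the patterns described above."""
    perms = stat.S_IMODE(mode)  # permission bits plus setuid/setgid/sticky
    if perms == 0:
        return "mode-000"       # ---------- : every permission bit cleared
    if perms == stat.S_ISVTX:
        return "sticky-only"    # ---------T : only the sticky bit is set
    return "other"


def scan(root):
    """Walk root and count regular files per category, without following symlinks."""
    counts = {"mode-000": 0, "sticky-only": 0, "other": 0}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            st = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(st.st_mode):
                counts[classify_mode(st.st_mode)] += 1
    return counts
```
</pre>
Calling scan("/path/to/brick") on each brick would give a per-category count to compare against the numbers seen through the fuse/nfs mount.<br>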
<br>
We really need to know what happened. We do plan to upgrade
to 3.4.1,<br>
but there were some reasons that we didn't want to upgrade
before adding<br>
storage.<br>
<br>
* Upgrading will result in service interruption to our
clients, which<br>
mount via NFS. It would likely be just a hiccup, with quick
failover,<br>
but it's still a service interruption.<br>
* We have a pacemaker cluster providing the shared IP
address for NFS<br>
mounting. It's running CentOS 6.3. A "yum upgrade" to
upgrade gluster<br>
will also upgrade to CentOS 6.4. The pacemaker in 6.4 is
incompatible<br>
with the pacemaker in 6.3, which will likely result in<br>
longer-than-expected downtime for the shared IP address.<br>
* We didn't want to risk potential problems with running
gluster 3.3.1<br>
on the existing servers and 3.4.1 on the new servers.<br>
* We needed the new storage added right away, before we
could schedule<br>
maintenance to deal with the upgrade issues.<br>
<br>
Something that would be extremely helpful would be obtaining
the<br>
services of an expert-level gluster consultant who can look
over<br>
everything we've done to see if there is anything we've done
wrong and<br>
how we might avoid problems in the future. I don't know how
much the<br>
company can authorize for this, but we obviously want it to
be as cheap<br>
as possible. We are in Salt Lake City, UT, USA. It would
be preferable<br>
to have the consultant be physically present at our
location.<br>
<br>
I'm working on redacting one bit of identifying info from
our rebalance<br>
log, then I can put it up on dropbox for everyone to
examine.<br>
<br>
Thanks,<br>
Shawn<br>
<br>
_______________________________________________<br>
Gluster-users mailing list<br>
<a moz-do-not-send="true"
href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>
<a moz-do-not-send="true"
href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"
target="_blank">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a><br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
Justin Dossey<br>
CTO, PodOmatic
<div><br>
</div>
</div>
<br>
</blockquote>
<br>
</body>
</html>