<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
<div class="moz-cite-prefix">On 08/10/2014 11:42 PM, Ravishankar N
wrote:<br>
</div>
<blockquote cite="mid:53E865C0.4070800@redhat.com" type="cite">
<div class="moz-cite-prefix">On 08/09/2014 01:23 AM, Joe Julian
wrote:<br>
</div>
<blockquote cite="mid:53E52AB2.4080104@julianfamily.org"
type="cite">
Thinking about it more, I'd still rather have this functionality
exposed at the client through xattrs. For 5 years I've thought
about this, and the more I encounter split-brain, the more I
think this is the needed approach.<br>
<br>
</blockquote>
<br>
Joe, why do you feel resolving split-brains should be exposed to
clients? Whatever approach is taken (either a gluster CLI command
or an overloaded get/setfattr call), is it not better to have this
done at the server side?<br>
<br>
</blockquote>
* It's consistent with the way other functions (rebalance, self-heal,
etc.) actually operate: they're really just clients.<br>
* On the client, it offers more possibilities for us admins to fix
something on the fly.<br>
* It's an API at that point. Software could be coded to perform its
own self-heal based on whatever rules apply to its particular use
case.<br>
* If multi-tenancy is ever added, it gives the tenant a way to
repair his own files.<br>
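<br>
To make the API point concrete, here is a rough sketch of what such software might look like. It assumes the xattrs proposed earlier in this thread existed (they don't today; trusted.glusterfs.stat and trusted.glusterfs.sb-pick are only the names floated here), and the JSON layout and the "newest mtime wins" rule are invented purely for illustration:<br>
<pre wrap="">
```python
import json
import os

# Hypothetical xattr names from this thread; neither exists in GlusterFS today.
STAT_XATTR = "trusted.glusterfs.stat"
PICK_XATTR = "trusted.glusterfs.sb-pick"


def pick_newest(stat_json):
    """Return the brick whose copy has the latest mtime, or None on a tie.

    stat_json is assumed to look like
    {"server1:/srv/brick1": {"mtime": 100}, "server2:/srv/brick1": {"mtime": 200}};
    that layout is made up for this sketch.
    """
    stats = json.loads(stat_json)
    if not stats:
        return None
    newest = max(s["mtime"] for s in stats.values())
    winners = [brick for brick, s in stats.items() if s["mtime"] == newest]
    # An ambiguous result is left for a human rather than guessed at.
    return winners[0] if len(winners) == 1 else None


def resolve(path):
    """Apply the newest-mtime rule to one file on a client mount (Linux only)."""
    raw = os.getxattr(path, STAT_XATTR)
    brick = pick_newest(raw.decode())
    if brick is not None:
        os.setxattr(path, PICK_XATTR, brick.encode())
    return brick
```
</pre>
The policy lives in pick_newest(), so a different use case would swap in its own rule and leave resolve() alone.<br>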
<br>
It was late last time, and I missed one important operation: the
ability to mv one copy of the split-brain file to a new filename, in
case you choose wrongly and need it later. I've seen that with VM
images. Typically, it doesn't really matter which VM image you choose
(if your data's in a smart place instead of on the image): pick
either one and boot it back up. Occasionally, though, the image is
irreparable. Frequently, the "other copy" is ok, so if one fails to
boot, we swap to the other.<br>
<blockquote cite="mid:53E865C0.4070800@redhat.com" type="cite"> <br>
<blockquote cite="mid:53E52AB2.4080104@julianfamily.org"
type="cite"> "getfattr -n trusted.glusterfs.stat" returns
xml/json/some_madeup_datastructure with the results of stat from
each brick<br>
"getfattr -n trusted.glusterfs.afr" returns the afr matrix<br>
"setfattr -n trusted.glusterfs.sb-pick -v server2:/srv/brick1"<br>
<br>
That gives us the tools we need to choose what to do with any
given split-brain. For large swaths of automated repair, we can
use find.<br>
<br>
I suppose that last bit could still be implemented through that
cli command.<br>
<br>
<br>
<div class="moz-cite-prefix">On 08/07/2014 01:35 AM, Ravishankar
N wrote:<br>
</div>
<blockquote cite="mid:53E33A56.9060801@redhat.com" type="cite">
<br>
Manual resolution of split-brains [1] has been a tedious task
involving understanding and modifying AFR's changelog extended
attributes. To simplify and to an extent automate this task,
we are proposing a new CLI command with which the user can
specify what the source brick/file is, and automatically heal
the files in the appropriate direction. <br>
<br>
Command: gluster volume resolve-split-brain &lt;VOLNAME&gt;
{&lt;bigger_file&gt; | source-brick &lt;brick_name&gt;
[&lt;file&gt;] } <br>
<br>
Breaking up the command into its possible options, we have: <br>
<br>
a) gluster volume resolve-split-brain &lt;VOLNAME&gt;
&lt;bigger_file&gt; <br>
When this command is executed, AFR will consider the brick
having the highest file size as the source and heal it to all
other bricks (including all other sources and sinks) in that
replica subvolume. If the file size is the same on all the bricks,
it does <b class="moz-txt-star"><span class="moz-txt-tag">*</span>not<span
class="moz-txt-tag">*</span></b> heal the file. <br>
<br>
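As a minimal sketch of the source selection described for this option (not AFR's actual code): the function name and the dict of per-brick sizes are made up for illustration, and treating a tie for the largest size as "no heal" is an assumption, since only the all-equal case is spelled out above.<br>
<pre wrap="">
```python
def pick_bigger_file(sizes):
    """Return the brick holding the largest copy, or None when no heal should happen.

    sizes maps a brick name to the size in bytes of the file on that brick.
    """
    if not sizes:
        return None
    largest = max(sizes.values())
    candidates = [brick for brick, size in sizes.items() if size == largest]
    # Same size everywhere, or a tie for largest (assumption): there is no
    # unique source, so the file is skipped.
    if len(candidates) != 1:
        return None
    return candidates[0]
```
</pre>
For example, sizes of 10 and 20 bytes select the 20-byte brick as the source, while 10 and 10 select nothing.<br>
<br>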
b) gluster volume resolve-split-brain &lt;VOLNAME&gt;
source-brick &lt;brick_name&gt; [&lt;file&gt;] <br>
<br>
When this command is executed, if &lt;file&gt; is specified,
AFR heals the file from the source-brick &lt;brick_name&gt; to
all other bricks of that replica subvolume. For resolving
multiple files, the command must be run iteratively, once per
file. <br>
If &lt;file&gt; is not specified, AFR heals all the files that
have an entry in .glusterfs/indices/xattrop <b
class="moz-txt-star"><span class="moz-txt-tag">*</span>and<span
class="moz-txt-tag">*</span></b> are in split-brain. As
before, heals happen from source-brick &lt;brick_name&gt; to
all other bricks. <br>
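<br>
The file selection for this form can be sketched roughly as follows. This is an illustration only, not AFR's implementation: the function name is hypothetical, and the index entries and the split-brain set are assumed to be plain lists of comparable identifiers.<br>
<pre wrap="">
```python
def files_to_heal(index_entries, split_brain, requested=None):
    """Return the files one run of the source-brick form would heal.

    index_entries: identifiers listed under .glusterfs/indices/xattrop
    split_brain:   identifiers currently known to be in split-brain
    requested:     the optional file argument from the command line
    """
    if requested is not None:
        # A file was named explicitly: only that file is healed.
        return [requested]
    # No file named: heal whatever is both indexed and in split-brain.
    return sorted(set(index_entries).intersection(split_brain))
```
</pre>
Either way, every selected file is healed from the named source brick to all other bricks.<br>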
<br>
Future work could also include extending the command with
other policies, such as choosing the file with the latest mtime
as the source, or integrating with the trash xlator so that files
deleted from the sink are moved to the trash dir.<br>
<br>
Please give feedback on the above. <br>
<br>
Regards,<br>
Ravi<br>
<br>
[1] <a moz-do-not-send="true" class="moz-txt-link-freetext"
href="https://github.com/gluster/glusterfs/blob/master/doc/split-brain.md">https://github.com/gluster/glusterfs/blob/master/doc/split-brain.md</a>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
Gluster-devel mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:Gluster-devel@gluster.org">Gluster-devel@gluster.org</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://supercolony.gluster.org/mailman/listinfo/gluster-devel">http://supercolony.gluster.org/mailman/listinfo/gluster-devel</a>
</pre>
</blockquote>
<br>
<br>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>