<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On 8 September 2014 05:05, Krishnan Parthasarathi <span dir="ltr">&lt;<a href="mailto:kparthas@redhat.com" target="_blank">kparthas@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span class=""><br>

<br>

&gt; The bulk of the current GlusterD code deals with keeping the configuration of the<br>

&gt; cluster and the volumes in it consistent and available across the nodes. The<br>

&gt; current algorithm is not scalable (N^2 in the number of nodes) and doesn&#39;t prevent<br>

&gt; split-brain of configuration. This is the problem area we are targeting for<br>

&gt; the first phase.<br>

&gt;<br>

&gt; As part of the first phase, we aim to delegate the distributed configuration<br>

&gt; store. We are exploring Consul [1] as a replacement for the existing<br>

&gt; distributed configuration store (sum total of /var/lib/glusterd/* across all<br>

&gt; nodes). Consul provides a distributed configuration store that is consistent<br>

&gt; and partition tolerant. By moving all Gluster-related configuration<br>

&gt; information into Consul, we could avoid split-brain situations.<br>

&gt; Did you get a chance to go over the following questions while making the<br>

&gt; decision? If yes, could you please share the info?<br>

&gt; What are the consistency guarantees for changing the configuration in case of<br>

&gt; network partitions?<br>

&gt; Specifically, when there are 2 nodes and 1 of them is not reachable?<br>

&gt; What are the consistency guarantees when there are more than 2 nodes?<br>

&gt; What are the consistency guarantees for reading configuration in case of<br>

&gt; network partitions?<br>

<br>

</span>Consul uses the Raft [1] distributed consensus algorithm internally to maintain<br>

consistency. The Raft consensus algorithm is proven to be correct. I will be<br>

going through the workings of the algorithm soon. I will share my answers to<br>

the above questions after that. Thanks for the questions; it is important<br>

for users to understand the behaviour of a system, especially under failure.<br>

I am considering adding a FAQ section to this proposal, where questions like the above would<br>

go, once it gets accepted and makes it to the feature page.<br>

<br>

[1] - <a href="https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf" target="_blank">https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf</a><br>

<span class=""><font color="#888888"><br></font></span></blockquote><div><br></div><div>The following article reports test results on how Consul behaves under network partitions, including whether it recovers successfully:</div><div><a href="http://aphyr.com/posts/316-call-me-maybe-etcd-and-consul">http://aphyr.com/posts/316-call-me-maybe-etcd-and-consul</a></div><div> <br></div><div>It gives Consul a positive review.</div><div><br></div><div>HTH</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span class=""><font color="#888888">

~KP<br>

</font></span><div class=""><div class="h5"><br>

_______________________________________________<br>

Gluster-users mailing list<br>

<a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>

<a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a><br>

</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br>Jonathan Barber &lt;<a href="mailto:jonathan.barber@gmail.com">jonathan.barber@gmail.com</a>&gt;

</div></div>