<div dir="ltr">I did a lot of testing on distributed-replicated volumes, and my conclusion was that automatic self-heal in 3.3.1 is not adequate. I ran the QA version of 3.3.2 and was not able to find a fault. Also, if you can raise your replica count to 3, you can set a quorum of 2, which makes a split-brain very unlikely. I don&#39;t recommend this to everyone, but in my scenario I was appending to CSV files, and I found that using the diff option for self-heal resulted in less data loss. Again, if you don&#39;t understand that option, don&#39;t change it.<div>
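<br>
For anyone who wants to try the quorum and diff settings mentioned above, here is a rough sketch in Python that only builds the matching &quot;gluster volume set&quot; command lines and prints them; it does not run anything. The volume name myvol is hypothetical, and the option names (cluster.quorum-type, cluster.quorum-count, cluster.data-self-heal-algorithm) are assumed to match your release, so check them against &quot;gluster volume set help&quot; before applying anything.<br>
<pre>
```python
def gluster_set_commands(volume, quorum_count=2, heal_algorithm="diff"):
    """Build, but do not run, the 'gluster volume set' command lines.

    The option names are assumptions to verify against your Gluster version.
    """
    options = [
        ("cluster.quorum-type", "fixed"),
        ("cluster.quorum-count", str(quorum_count)),
        ("cluster.data-self-heal-algorithm", heal_algorithm),
    ]
    return [["gluster", "volume", "set", volume, key, value]
            for key, value in options]


if __name__ == "__main__":
    # "myvol" is a placeholder volume name, not one from this thread.
    for command in gluster_set_commands("myvol"):
        print(" ".join(command))
```
</pre>
Printing the commands first lets you review them before running them by hand on one of the servers.<br>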


<br></div><div>Also, I&#39;m still waiting for PHP to fix my bug: <a href="https://bugs.php.net/bug.php?id=60110">https://bugs.php.net/bug.php?id=60110</a></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Jul 15, 2013 at 1:52 AM, Toby Corkindale <span dir="ltr">&lt;<a href="mailto:toby.corkindale@strategicdata.com.au" target="_blank">toby.corkindale@strategicdata.com.au</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On 12/07/13 06:44, Michael Peek wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi gurus,<br>

<br>

So I have a cluster that I&#39;ve set up and I&#39;m banging on.  It&#39;s comprised<br>

of four machines with two drives in each machine.  (By the way, the<br>

3.2.5 version that comes with stock Ubuntu 12.04 seems to have a lot of<br>

bugs/instability.  I was screwing it up daily just by putting it through<br>

some heavy-use tests.  Then I downloaded 3.3.1 from the PPA, and so far<br>

things seem a LOT more stable.  I haven&#39;t managed to break anything yet,<br>

although the night is still young.)<br>

<br>

I&#39;m dumping data to it like mad, and I decide to simulate a filesystem<br>

error by remounting half of the cluster&#39;s drives in read-only mode with<br>

&quot;mount -o remount,ro&quot;.<br>

<br>

The cluster seems to slow just slightly, but it kept on ticking.  Great.<br>

</blockquote>

<br>

<br></div>

While you&#39;re performing your testing, can I suggest you also test the following behaviour, to ensure the performance meets your needs.<br>

<br>

Fill the volumes up with data, to a point similar to what you expect to reach in production use. Not just in terms of disk space, but number of files and directories as well. You might need to write a small script that can build a simulated directory tree, populated with a range of file sizes.<br>
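<br>
A minimal sketch of such a tree-building script, in Python, might look like the following. The function name build_tree, its parameters, and the default sizes are all made up for illustration; tune them to resemble your real data, and point root at the mounted volume rather than a scratch directory.<br>
<pre>
```python
import os
import random
import tempfile


def build_tree(root, depth=3, dirs_per_level=3, files_per_dir=5,
               sizes=(1024, 64 * 1024, 1024 * 1024), seed=0):
    """Create a nested tree under root and return the file paths written.

    depth is the number of directory levels below root; every directory,
    including root, receives files_per_dir files of randomly chosen sizes.
    """
    rng = random.Random(seed)  # seeded so a run can be repeated
    created = []

    def fill(directory, level):
        os.makedirs(directory, exist_ok=True)
        for index in range(files_per_dir):
            path = os.path.join(directory, "file_%d.dat" % index)
            with open(path, "wb") as handle:
                handle.write(os.urandom(rng.choice(sizes)))
            created.append(path)
        if level == depth:
            return  # deepest level reached, no more subdirectories
        for index in range(dirs_per_level):
            fill(os.path.join(directory, "dir_%d" % index), level + 1)

    fill(root, 0)
    return created


if __name__ == "__main__":
    # Small demo in a throwaway directory; raise the parameters and use
    # the mounted volume as root for a real test.
    with tempfile.TemporaryDirectory() as scratch:
        files = build_tree(scratch, depth=2, dirs_per_level=2,
                           files_per_dir=3, sizes=(10, 100))
        print(len(files), "files created")
```
</pre>
The number of directories grows geometrically with depth, so increase depth and dirs_per_level gradually until the file count is close to what you expect in production.<br>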


<br>

Take one of the nodes offline (or read-only), and then touch and modify a large number of files randomly around the volume. Imagine that a node was offline for 24 hours, and that you&#39;re simulating the quantity of write patterns that would occur in total over that time.<br>
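<br>
The random touch-and-modify step could be sketched roughly like this in Python. The name churn_files and its parameters are hypothetical; the fraction of files changed and the number of bytes appended are stand-ins for whatever a day of your real writes would look like.<br>
<pre>
```python
import os
import random


def churn_files(root, fraction=0.2, append_bytes=512, seed=0):
    """Touch or append to a random sample of the files under root.

    Returns the list of paths that were changed.
    """
    rng = random.Random(seed)
    paths = []
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            paths.append(os.path.join(directory, name))
    if not paths:
        return []
    paths.sort()  # stable order so the same seed picks the same files
    count = min(len(paths), max(1, int(len(paths) * fraction)))
    changed = rng.sample(paths, count)
    for path in changed:
        if rng.choice(("touch", "append")) == "touch":
            os.utime(path, None)  # update timestamps only
        else:
            with open(path, "ab") as handle:
                handle.write(os.urandom(append_bytes))  # grow the file
    return changed
```
</pre>
Run it against the client mount while the node is offline, so that the changes have to be healed once the node returns.<br>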


<br>

Now bring the &quot;failed&quot; node back online and start the healing process.<br>

Meanwhile, continue to simulate client access patterns on the files you were modifying earlier. Ensure that performance is still sufficient for your needs.<br>
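<br>
One simple way to put numbers on &quot;sufficient&quot; is to time reads of the modified files while the heal is running and compare against the same measurement taken on a healthy volume. The Python sketch below does only that; the names time_reads and summarize are made up for illustration, and client-side caching can make repeated reads look faster than they really are.<br>
<pre>
```python
import statistics
import time


def time_reads(paths, chunk_size=1024 * 1024):
    """Read each file fully and return per-file latencies in seconds."""
    latencies = []
    for path in paths:
        start = time.perf_counter()
        with open(path, "rb") as handle:
            while handle.read(chunk_size):
                pass  # discard the data, only the elapsed time matters
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return count, median, and worst-case latency for a non-empty list."""
    return {
        "count": len(latencies),
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
    }
```
</pre>
Comparing the median and the worst case before, during, and after the heal shows whether the slowdown is general or limited to a few files.<br>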

<br>

<br>

It&#39;s a more complicated test to run, but it&#39;s important to measure how gluster performs with your workload in non-ideal circumstances that you will eventually hit.<div class="HOEnZb"><div class="h5"><br>

_______________________________________________<br>

Gluster-users mailing list<br>

<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>

<a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a><br>

</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br>Follow Me: <a href="http://www.twitter.com/scottix" target="_blank">@Scottix</a><br><a href="http://about.me/scottix" target="_blank">http://about.me/scottix</a><br>


Scottix@Gmail.com

</div>