<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <br>

    <div class="moz-cite-prefix">On 06/10/2014 02:00 PM,

      <a class="moz-txt-link-abbreviated" href="mailto:gluster-users-request@gluster.org">gluster-users-request@gluster.org</a> wrote:<br>

    </div>

    <blockquote

      cite="mid:mailman.3.1402401602.24727.gluster-users@gluster.org"

      type="cite">

      <pre wrap="">From: Laurent Chouinard <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:laurent.chouinard@ubisoft.com">&lt;laurent.chouinard@ubisoft.com&gt;</a>

To: Pranith Kumar Karampuri <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:pkarampu@redhat.com">&lt;pkarampu@redhat.com&gt;</a>

Cc: <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:gluster-users@gluster.org">"gluster-users@gluster.org"</a> <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:gluster-users@gluster.org">&lt;gluster-users@gluster.org&gt;</a>

Subject: Re: [Gluster-users] Unavailability during self-heal for large

                 volumes

Message-ID:

 <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:95ea1865fac2484980d020c6a3b7f0cd@MSR-MAIL-EXCH02.ubisoft.org">&lt;95ea1865fac2484980d020c6a3b7f0cd@MSR-MAIL-EXCH02.ubisoft.org&gt;</a>

Content-Type: text/plain; charset="utf-8"

</pre>

      <blockquote type="cite" style="color: #000000;">
        <pre wrap=""><span class="moz-txt-citetags">&gt; </span>Laurent,
<span class="moz-txt-citetags">&gt; </span>   This has been improved significantly in afr-v2 (enhanced version of
<span class="moz-txt-citetags">&gt; </span>replication translator in gluster) which will be released with 3.6 I
<span class="moz-txt-citetags">&gt; </span>believe. The issue happens because of the directory self-heal in the
<span class="moz-txt-citetags">&gt; </span>older versions. In the new version per file healing in a directory is
<span class="moz-txt-citetags">&gt; </span>performed instead of Full directory heal at-once which was creating a
<span class="moz-txt-citetags">&gt; </span>lot of traffic. Unfortunately This is too big a change to backport to
<span class="moz-txt-citetags">&gt; </span>older releases <span class="moz-smiley-s2" title=":-("><span>:-(</span></span>.
<span class="moz-txt-citetags">&gt;</span>
<span class="moz-txt-citetags">&gt; </span>Pranith
</pre>
      </blockquote>

      <pre wrap="">

Hi Pranith,

Thank you for this information. 

Do you think there is a way to limit/throttle the current directory 

self-heal then? I don't mind if it takes a long time.

Alternatively, is there a way to completely disable the complete healing 

system? I would consider running a manual healing operation by STAT'ing 

every file, which would allow me to throttle the speed to a more 

manageable level.

Thanks,

Laurent Chouinard

</pre>

    </blockquote>
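    <p>On the idea of healing manually by STAT'ing every file at a
      controlled pace, a minimal sketch might look like the one below. It
      assumes that a lookup through a client mount is what triggers the
      heal of a file on your version, and that the automatic healing has
      been turned off separately. The function name
      <code>throttled_stat_walk</code> and the
      <code>files_per_second</code> parameter are made up for
      illustration, not part of gluster.</p>

```python
import os
import time


def throttled_stat_walk(root, files_per_second=50.0, sleep=time.sleep):
    """Stat every entry under root, pausing between calls to cap the rate.

    root is assumed to be a client mount of the volume (hypothetical path).
    Returns the number of entries that were stat'ed successfully.
    """
    if files_per_second <= 0:
        raise ValueError("files_per_second must be positive")
    delay = 1.0 / files_per_second
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                os.lstat(path)  # the lookup is the only thing we need
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            count += 1
            sleep(delay)  # the throttle: at most files_per_second lookups
    return count


if __name__ == "__main__":
    import sys

    # Example: python stat_walk.py /mnt/glustervol 20
    if len(sys.argv) >= 2:
        rate = float(sys.argv[2]) if len(sys.argv) > 2 else 50.0
        print(throttled_stat_walk(sys.argv[1], files_per_second=rate))
```

    <p>Lowering <code>files_per_second</code> stretches the walk out over
      a longer time, which is the trade-off Laurent said he would
      accept.</p>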

    You could try this:<br>

    <br>

    <a class="moz-txt-link-freetext" href="http://www.gluster.org/author/andrew-lau/">http://www.gluster.org/author/andrew-lau/</a><br>

    <br>

    <span class="author">by <a

        href="http://www.gluster.org/author/andrew-lau/" title="Posts by

        Andrew Lau" rel="author">Andrew Lau</a> on February 3, 2014</span>

    <h2><a

href="http://www.andrewklau.com//controlling-glusterfsd-cpu-outbreaks-with-cgroups/">Controlling

        glusterfsd CPU outbreaks with cgroups</a></h2>

    <p>Some of you may know that feeling: you add a new brick to a
      gluster replicated volume that already holds more than 1TB of
      data, and suddenly your gluster server shoots up to 500% CPU
      usage. What made it worse for me is that my hosts run alongside
      oVirt, so while gluster hogged all the CPU, my VMs started to
      crawl. Even running simple commands like <code>top</code> would
      take 30+ seconds. Not a good feeling.</p>

    <p>On my first attempt, I limited the NIC's bandwidth to 200Mbps
      rather than the 2x1Gbps aggregated link, and this calmed
      glusterfsd down to a healthy 50%. It was only a temporary fix,
      however, because clients accessing gluster storage would be
      bottlenecked by that shared limit.</p>

    <p>So off to the mailing list, where James/purpleidea (<a
        href="https://ttboj.wordpress.com/code/puppet-gluster/">https://ttboj.wordpress.com/code/puppet-gluster/</a>)
      made a great suggestion: use cgroups.</p>

    <p>The concept is simple: we limit the total CPU glusterfsd sees, so
      that when it comes to doing the checksums for self-heals,
      replication, etc., those tasks won't have the high priority that
      other services, such as running VMs, would have. <strong>This
        effectively slows down the replication rate in return for lower
        CPU usage.<br>
      </strong></p>
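    <p>To make the idea concrete, here is a rough sketch of the
      arithmetic behind a hard CPU cap with the cgroup v1 CPU controller
      (<code>cpu.cfs_period_us</code> and <code>cpu.cfs_quota_us</code>).
      The excerpt above does not say which settings the post uses, so
      the cap approach, the mount point <code>/sys/fs/cgroup/cpu</code>
      and the group name <code>glusterfsd</code> are all assumptions for
      illustration. The script only prints the values you would write;
      it does not touch the system.</p>

```python
# Assumptions: cgroup v1 with the cpu controller mounted at this path,
# and a control group named "glusterfsd" (hypothetical name).
CGROUP_ROOT = "/sys/fs/cgroup/cpu"
GROUP_NAME = "glusterfsd"


def cfs_quota_us(cpu_percent, period_us=100_000):
    """Return the CFS quota (microseconds per period) for a CPU cap.

    100 means one full core, 150 means one and a half cores, and so on.
    """
    if cpu_percent <= 0 or period_us <= 0:
        raise ValueError("cpu_percent and period_us must be positive")
    return period_us * cpu_percent // 100


def cgroup_writes(cpu_percent, period_us=100_000,
                  root=CGROUP_ROOT, group=GROUP_NAME):
    """List the (file, value) pairs that would apply the cap."""
    base = f"{root}/{group}"
    quota = cfs_quota_us(cpu_percent, period_us)
    return [
        (f"{base}/cpu.cfs_period_us", str(period_us)),
        (f"{base}/cpu.cfs_quota_us", str(quota)),
    ]


if __name__ == "__main__":
    # Show what capping glusterfsd at 1.5 cores would involve.
    for path, value in cgroup_writes(150):
        print(f"echo {value} > {path}")
```

    <p>The cap only takes effect for processes that are in the group, so
      the glusterfsd PIDs would also have to be added to the group's
      <code>tasks</code> file. If you would rather give glusterfsd a
      lower relative priority than a hard ceiling,
      <code>cpu.shares</code> in the same controller is the knob to look
      at.</p>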

    <p>Kind regards,<br>

    </p>

    Jorick Astrego<br>

    Netbulae B.V.<br>

    <a class="moz-txt-link-freetext" href="http://www.netbulae.eu">http://www.netbulae.eu</a><br>

    <br>

  </body>

</html>