<div dir="ltr">Hi Pranith,<div><br></div><div>Thank you very much for the quick reply and the information.  I am now in the process of recreating the cluster using XFS.  This brings up a few questions:</div><div><br></div><div>- I assume the change from EXT4 to XFS will correct the problem with readdir (in other words, the issue is not present in XFS)?</div><div>- Do you have any idea when the patch for this might be out?  My reason for asking is that I have another cluster that has been updated to 3.6 and is running on EXT4 but does not yet show the issue.  This concerns me, so I am hoping the patch will be out soon.</div><div>- What exactly does cluster.entry-self-heal do?  I can&#39;t seem to find a description of it.</div><div>- I assume from your posts that the reason the cluster is fine until traffic hits it is that the self-heal does not happen until traffic causes the files to be read.  Is that how it works?</div><div><br></div><div>Thank you again for the fast response and the great product!</div><div><br></div><div>----</div><div>Kyle</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Nov 22, 2014 at 11:36 AM, Pranith Kumar Karampuri <span dir="ltr">&lt;<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div text="#000000" bgcolor="#FFFFFF"><span class="">

    <br>

    <div>On 11/22/2014 11:04 PM, Pranith Kumar

      Karampuri wrote:<br>

    </div>

    <blockquote type="cite">

      <br>

      <div>On 11/22/2014 10:40 PM, Pranith Kumar

        Karampuri wrote:<br>

      </div>

      <blockquote type="cite">

        <br>

        <div>On 11/22/2014 10:29 PM, Kyle Harris

          wrote:<br>

        </div>

        <blockquote type="cite">

          <div dir="ltr">

            <div>Hello,</div>

            <div><br>

            </div>

            <div>I have an issue with a 3 node replicated cluster.  My

              issue started after a reboot a while back.  The top command

              would show the glusterfs and glusterfsd processes eating

              up almost all the resources on all three nodes of the

              cluster.  So much so that it would not run the web sites

              that are hosted on it.  The httpd processes would begin to

              hang.  I finally decided to tear down the cluster and

              rebuild it from the ground up.  I did so and then copied

              all the data back which took all night due to the amount

              of data.  All was well during that entire copy process

              back to the cluster with no resource spikes.<br>

            </div>

          </div>

        </blockquote>

      </blockquote>

      Assuming you go back to 3.5.2, execute the following command:<br>

      # gluster volume set &lt;volname&gt; cluster.entry-self-heal off<br>

      <br>

      This should prevent httpd hangs.<br>

      <br>

      If you still find that the CPU usage is very high, execute the

      following command:<br>

      # gluster volume set &lt;volname&gt; cluster.self-heal-daemon off<br>

      <br>

      This disables self-healing. You should still heal the data

      periodically, though, by enabling self-heal-daemon with the

      following command:<br>

      # gluster volume set &lt;volname&gt; cluster.self-heal-daemon on<br>

      <br>

      Once &quot;gluster volume heal &lt;volname&gt; info&quot; shows zero

      entries, then healing is complete.<br>
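      <br>
      A minimal sketch of how that check could be scripted, in case it
      helps. It assumes each brick section of the heal info output
      contains a line of the form "Number of entries: N". The function
      names and the GLUSTER_BIN variable are hypothetical, chosen only
      for illustration.<br>
      <pre>
```shell
#!/bin/sh
# Sketch only: sum the per-brick counts from "gluster volume heal VOLNAME info".
# Assumption: each brick section contains a line "Number of entries: N".

# Reads heal-info text on stdin and prints the total number of pending entries.
count_heal_entries() {
    awk '/Number of entries:/ { total += $NF } END { print total + 0 }'
}

# Hypothetical helper: poll until the volume reports zero pending entries.
# GLUSTER_BIN and the 60-second default interval are illustrative choices.
wait_for_heal() {
    volname=$1
    interval=${2:-60}
    while :; do
        # Stop if the CLI itself fails, rather than misreading that as "0 entries".
        info=$("${GLUSTER_BIN:-gluster}" volume heal "$volname" info) || return 1
        pending=$(printf '%s\n' "$info" | count_heal_entries)
        if [ "$pending" -eq 0 ]; then
            echo "heal complete for $volname"
            return 0
        fi
        echo "$pending entries still pending on $volname"
        sleep "$interval"
    done
}
```
</pre>
      For example, after turning self-heal-daemon back on, calling
      wait_for_heal myvol 30 would poll every 30 seconds, where myvol
      is a placeholder volume name.<br>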

      <br>

      We took some steps to improve this in 3.6, but readdir on EXT4 is

      not working correctly, which is probably causing the problems here.

      Let&#39;s wait for Vijay to merge the patch I mentioned; then things

      should be fine.<br>

    </blockquote></span>

    Sorry for the inconvenience caused. We found the issue after the

    release was made :-(.<span class="HOEnZb"><font color="#888888"><br>

    <br>

    Pranith</font></span><div><div class="h5"><br>

    <blockquote type="cite"> <br>

      Pranith<br>

      <blockquote type="cite">

        <blockquote type="cite">

          <div dir="ltr">

            <div><br>

            </div>

            <div>I should note that this cluster is home to many

              Apache/PHP based web sites.  The problem starts again,

              however, the minute I point traffic back to the sites on

              the cluster.  Before pointing traffic to it, all is fine

              but as soon as the traffic begins to hit it, the

              utilization again begins to spike.  Note that all the

              sites run just fine when hosted from a standard EXT4

              partition.  I noticed another thread labeled &quot;glusterfsd

              process thrashing CPU&quot; where Pranith asks if the user has

              directories with lots of files and I do.</div>

            <div><br>

            </div>

            <div>Here are some other details of my cluster:</div>

            <div>- OS:  CentOS 6.6 with all updates on all 3 nodes as of

              11-22-2014</div>

            <div>- All 3 nodes have 8 cores with 16 GB of RAM</div>

            <div>- Nodes are all formatted with EXT4</div>

            <div>- All three nodes also have the file systems mounted

              on them for use with Apache.  I have experimented with

              both NFS and Fuse mounts and it doesn&#39;t seem to make a

              difference which I use for this particular problem.  I am

              currently using Fuse.</div>

            <div>- Approximately 135 GB of data.  Some deep directories

              with many small files.</div>

            <div>- No optimization or changes have been made to the

              cluster . . . it is running with default options</div>

            <div>- Gluster version 3.6.1-1 installed from RPMs</div>

            <div>- Note the issue originally occurred on version 3.5.2

              but I updated before rebuilding it in hopes that would fix

              it (it didn&#39;t)</div>

            <div><br>

            </div>

            <div>Can anyone give me guidance on how to tackle this

              problem?  I am hoping perhaps Pranith can give some

              details as to why the question about many files and how to

              proceed given my situation.  I know others have commented

              about having many small files with regard to performance

              but when the processors are not spiked, performance has

              been acceptable.  Any help would be greatly appreciated.</div>

            <div><br>

            </div>

          </div>

        </blockquote>

        Kyle,<br>

              3.6.1 with EXT4 has a problem because of 64-bit offsets.

        The afr-v2 implementation introduced this problem. We thought the

        following patch had been merged, but it was not :-( <a href="http://review.gluster.com/8201" target="_blank">http://review.gluster.com/8201</a>.

        Please don&#39;t use 3.6.1 with EXT4<br>

        <br>

        Vijay,<br>

              Please merge <a href="http://review.gluster.com/8201" target="_blank">http://review.gluster.com/8201</a><br>

        <br>

        Pranith<br>

        <blockquote type="cite">

          <div dir="ltr">-- <br>

            <div>

              <div dir="ltr">Kyle 

                <div>

                  <div><br>

                  </div>

                </div>

              </div>

            </div>

          </div>

          <br>

          <fieldset></fieldset>

          <br>

          <pre>_______________________________________________

Gluster-users mailing list

<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a>

<a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a></pre>

        </blockquote>

        <br>

      </blockquote>

      <br>

      <br>


    </blockquote>

    <br>

  </div></div></div>

</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr">Kyle A. Harris<div>Kyle@TheHarrisHome.com<br><div>615-364-6752<br></div><div><br></div></div></div></div>

</div>