<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <br>

    <div class="moz-cite-prefix">On 11/22/2014 11:50 PM, Kyle Harris

      wrote:<br>

    </div>

    <blockquote

cite="mid:CAO5ZC7ELc9v5Xu4PhGyD4kyDxitt4gcKg6zJxKoWcKwxcw2qZA@mail.gmail.com"

      type="cite">

      <div dir="ltr">Hi Pranith,

        <div><br>

        </div>

        <div>Thank you very much for the quick reply and the

          information.  I am in the process now of recreating the

          cluster using XFS.  This all brings up a few questions:</div>

        <div><br>

        </div>

        <div>- I assume the change from EXT4 to XFS will correct the

          problem with readdir (in other words, the issue is not present

          in XFS)?</div>

      </div>

    </blockquote>

    Yes. This particular readdir issue occurs because of the way

    gluster handles EXT4's 64-bit offsets in readdir.<br>

    <blockquote

cite="mid:CAO5ZC7ELc9v5Xu4PhGyD4kyDxitt4gcKg6zJxKoWcKwxcw2qZA@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>- Do you have any idea when the patch for this might be

          out?  My reason for asking is that I have another cluster that

          has been updated to 3.6 and is running on EXT4 but does not

          yet have an issue.  This concerns me so I am hoping the patch

          will be out soon?</div>

      </div>

    </blockquote>

    The patch is out, but we need to wait for the next release. Let me

    talk to Vijay and see if we can get it released quickly.<br>

    <blockquote

cite="mid:CAO5ZC7ELc9v5Xu4PhGyD4kyDxitt4gcKg6zJxKoWcKwxcw2qZA@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>- What exactly does cluster.entry-self-heal do?  I can't

          seem to find a description of it.</div>

      </div>

    </blockquote>

    It enables/disables directory self-heal.<br>

    <blockquote

cite="mid:CAO5ZC7ELc9v5Xu4PhGyD4kyDxitt4gcKg6zJxKoWcKwxcw2qZA@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>- I assume from your posts that the reason the cluster is

          fine until traffic hits it is because the self-heal is not

          happening until traffic causes the files to be read.  Is that

          how it works?</div>

      </div>

    </blockquote>

    Yes.<br>

    <blockquote

cite="mid:CAO5ZC7ELc9v5Xu4PhGyD4kyDxitt4gcKg6zJxKoWcKwxcw2qZA@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div><br>

        </div>

        <div>Thank you again for the fast response and the great

          product!</div>

        <div><br>

        </div>

        <div>----</div>

        <div>Kyle</div>

      </div>

      <div class="gmail_extra"><br>

        <div class="gmail_quote">On Sat, Nov 22, 2014 at 11:36 AM,

          Pranith Kumar Karampuri <span dir="ltr">&lt;<a

              moz-do-not-send="true" href="mailto:pkarampu@redhat.com"

              target="_blank">pkarampu@redhat.com</a>&gt;</span> wrote:<br>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            <div text="#000000" bgcolor="#FFFFFF"><span class=""> <br>

                <div>On 11/22/2014 11:04 PM, Pranith Kumar Karampuri

                  wrote:<br>

                </div>

                <blockquote type="cite"> <br>

                  <div>On 11/22/2014 10:40 PM, Pranith Kumar Karampuri

                    wrote:<br>

                  </div>

                  <blockquote type="cite"> <br>

                    <div>On 11/22/2014 10:29 PM, Kyle Harris wrote:<br>

                    </div>

                    <blockquote type="cite">

                      <div dir="ltr">

                        <div>Hello,</div>

                        <div><br>

                        </div>

                        <div>I have an issue with a 3 node replicated

                          cluster.  My issue started after a reboot a

                          while back.  The top command would show the

                          glusterfs and glusterfsd processes eating up

                          almost all the resources on all three nodes

                          of the cluster.  So much so that it would not

                          run the web sites that are hosted on it.  The

                          httpd processes would begin to hang.  I

                          finally decided to tear down the cluster and

                          rebuild it from the ground up.  I did so and

                          then copied all the data back which took all

                          night due to the amount of data.  All was well

                          during that entire copy process back to the

                          cluster with no resource spikes.<br>

                        </div>

                      </div>

                    </blockquote>

                  </blockquote>

                  Assuming you go back to 3.5.2, execute the following

                  command:<br>

                  # gluster volume set &lt;volname&gt;

                  cluster.entry-self-heal off<br>

                  <br>

                  This should prevent httpd hangs.<br>

                  <br>

                  If you still find that the CPU usage is very high,

                  execute the following command:<br>

                  # gluster volume set &lt;volname&gt;

                  cluster.self-heal-daemon off<br>

                  <br>

                  This disables self-healing. You should still heal

                  periodically so that pending changes get healed, by

                  re-enabling self-heal-daemon with the following command:<br>

                  # gluster volume set &lt;volname&gt;

                  cluster.self-heal-daemon on<br>

                  <br>

                  Once "gluster volume heal &lt;volname&gt; info" shows

                  zero entries, then healing is complete.<br>
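                  <br>
                  A minimal sketch of that check, assuming the heal info
                  output reports a "Number of entries: N" line for each
                  brick (the exact output can differ between versions, and
                  the function name here is only illustrative):<br>
<pre>
```shell
# Sum the per-brick "Number of entries: N" lines read from stdin.
# Assumption: heal info output reports one such line per brick.
pending_heal_entries() {
  awk -F': *' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}

# Hypothetical usage on a cluster node (VOLNAME is a placeholder):
#   gluster volume heal "$VOLNAME" info | pending_heal_entries
# A result of 0 would correspond to "zero entries" above.
```
</pre>
                  The function only parses text, so it can be tried on saved
                  output before running it against a live volume.<br>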

                  <br>

                  We took some steps to improve this in 3.6, but readdir

                  on EXT4 is not working correctly, which is probably

                  causing the problems here. Let's wait for Vijay to merge the

                  patch I mentioned; then things should be fine.<br>

                </blockquote>

              </span> Sorry for the inconvenience caused. We found the

              issue after the release was made :-(.<span class="HOEnZb"><font

                  color="#888888"><br>

                  <br>

                  Pranith</font></span>

              <div>

                <div class="h5"><br>

                  <blockquote type="cite"> <br>

                    Pranith<br>

                    <blockquote type="cite">

                      <blockquote type="cite">

                        <div dir="ltr">

                          <div><br>

                          </div>

                          <div>I should note that this cluster is home

                            to many Apache/PHP based web sites.  The

                            problem starts again, however, the minute I

                            point traffic back to the sites on the

                            cluster.  Before pointing traffic to it, all

                            is fine but as soon as the traffic begins to

                            hit it, the utilization again begins to

                            spike.  Note that all the sites run just

                            fine when hosted from a standard EXT4

                            partition.  I noticed another thread labeled

                            "glusterfsd process thrashing CPU" where

                            Pranith asks if the user has directories

                            with lots of files and I do.</div>

                          <div><br>

                          </div>

                          <div>Here are some other details of my

                            cluster:</div>

                          <div>- OS:  CentOS 6.6 with all updates on all

                            3 nodes as of 11-22-2014</div>

                          <div>- All 3 nodes have 8 cores with 16 GB of

                            RAM</div>

                          <div>- Nodes are all formatted with EXT4</div>

                          <div>- All three nodes also have the file

                            systems mounted on them for use with

                            Apache.  I have experimented with both NFS

                            and Fuse mounts and it doesn't seem to make

                            a difference which I use for this particular

                            problem.  I am currently using Fuse.</div>

                          <div>- Approximately 135 GB of data.  Some

                            deep directories with many small files.</div>

                          <div>- No optimization or changes have been

                            made to the cluster . . . it is running with

                            default options</div>

                          <div>- Gluster version 3.6.1-1 installed from

                            RPMs</div>

                          <div>- Note the issue originally occurred on

                            version 3.5.2 but I updated before

                            rebuilding it in hopes that would fix it (it

                            didn't)</div>

                          <div><br>

                          </div>

                          <div>Can anyone give me guidance on how to

                            tackle this problem?  I am hoping perhaps

                            Pranith can explain the reason for the

                            question about many files and how to proceed

                            given my situation.  I know others have

                            commented about having many small files with

                            regard to performance but when the

                            processors are not spiked, performance has

                            been acceptable.  Any help would be greatly

                            appreciated.</div>

                          <div><br>

                          </div>

                        </div>

                      </blockquote>

                      Kyle,<br>

                            3.6.1 with EXT4 has a problem because of 64-bit

                      offsets. The afr-v2 implementation introduced this

                      problem. We thought the following patch had been merged,

                      but it had not :-( <a moz-do-not-send="true"

                        href="http://review.gluster.com/8201"

                        target="_blank">http://review.gluster.com/8201</a>.

                      Please don't use 3.6.1 with EXT4<br>

                      <br>

                      Vijay,<br>

                            Please merge <a moz-do-not-send="true"

                        href="http://review.gluster.com/8201"

                        target="_blank">http://review.gluster.com/8201</a><br>

                      <br>

                      Pranith<br>

                      <blockquote type="cite">

                        <div dir="ltr">-- <br>

                          <div>

                            <div dir="ltr">Kyle 

                              <div>

                                <div><br>

                                </div>

                              </div>

                            </div>

                          </div>

                        </div>

                        <br>

                        <fieldset></fieldset>

                        <br>

                        <pre>_______________________________________________

Gluster-users mailing list

<a moz-do-not-send="true" href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a>

<a moz-do-not-send="true" href="http://supercolony.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a></pre>

                      </blockquote>

                      <br>

                    </blockquote>

                    <br>

                    <br>

                    <fieldset></fieldset>

                    <br>


                  </blockquote>

                  <br>

                </div>

              </div>

            </div>

          </blockquote>

        </div>

        <br>

        <br clear="all">

        <div><br>

        </div>

        -- <br>

        <div class="gmail_signature">

          <div dir="ltr">Kyle A. Harris

            <div><a class="moz-txt-link-abbreviated" href="mailto:Kyle@TheHarrisHome.com">Kyle@TheHarrisHome.com</a><br>

              <div>615-364-6752<br>

              </div>

              <div><br>

              </div>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    <br>

  </body>

</html>