<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<div class="moz-cite-prefix">On 11/22/2014 10:40 PM, Pranith Kumar
Karampuri wrote:<br>
</div>
<blockquote cite="mid:5470C370.4060002@redhat.com" type="cite">
<br>
<div class="moz-cite-prefix">On 11/22/2014 10:29 PM, Kyle Harris
wrote:<br>
</div>
<blockquote
cite="mid:CAO5ZC7HgTS4PwCa+q3nyfwDKtJm91NhMjLOPc8ogfcJvbzZzfw@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>Hello,</div>
<div><br>
</div>
<div>I have an issue with a 3 node replicated cluster. My
issue started after a reboot a while back. The top command
would show the glusterfs and glusterfsd processes eating up
almost all the resources on all three nodes of the
cluster. So much so that it would not run the web sites
that are hosted on it. The httpd processes would begin to
hang. I finally decided to tear down the cluster and
rebuild it from the ground up. I did so and then copied all
the data back which took all night due to the amount of
data. All was well during that entire copy process back to
the cluster with no resource spikes.<br>
</div>
</div>
</blockquote>
</blockquote>
Assuming you go back to 3.5.2, execute the following command:<br>
# gluster volume set &lt;volname&gt; cluster.entry-self-heal off<br>
<br>
This should prevent httpd hangs.<br>
<br>
If you still find that the CPU usage is very high, execute the
following command:<br>
# gluster volume set &lt;volname&gt; cluster.self-heal-daemon off<br>
<br>
This disables self-healing, so nothing is healed while it is off. You
should periodically re-enable the self-heal daemon so that pending
heals get processed, using the following command:<br>
# gluster volume set &lt;volname&gt; cluster.self-heal-daemon on<br>
<br>
Healing is complete once "gluster volume heal &lt;volname&gt; info"
shows zero entries.<br>
<br>
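The enable, wait, disable cycle described above could be scripted
roughly as follows. This is only a sketch, not something I have run
against a live volume: the function names count_heal_entries and
heal_then_disable and the polling interval are hypothetical, and it
assumes that the heal info output contains one "Number of entries: N"
line per brick, which may differ between Gluster versions.<br>
<pre>
```shell
#!/bin/sh
# Sketch only: re-enable the self-heal daemon, wait until no heals are
# pending, then switch the daemon off again.

# Sum the "Number of entries: N" lines read from stdin.
# Assumption: heal info prints one such line per brick.
count_heal_entries() {
    awk '/^Number of entries:/ { sum += $NF } END { print sum + 0 }'
}

# $1 = volume name, $2 = seconds between polls (hypothetical default: 60)
heal_then_disable() {
    volname=$1
    poll_seconds=${2:-60}

    gluster volume set "$volname" cluster.self-heal-daemon on || return 1

    while :; do
        # Stop if heal info itself fails, rather than treating that as "0 pending".
        info=$(gluster volume heal "$volname" info) || return 1
        pending=$(printf '%s\n' "$info" | count_heal_entries)
        if [ "$pending" -eq 0 ]; then
            break
        fi
        sleep "$poll_seconds"
    done

    gluster volume set "$volname" cluster.self-heal-daemon off
}
```
</pre>
It would be invoked as "heal_then_disable &lt;volname&gt; 120", by hand
or from cron during a quiet period. The loop returns early when heal
info fails, so the daemon is left on in that case instead of being
switched off with heals still pending.<br>
<br>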
We took some steps to improve this in 3.6. But readdir on EXT4 is
not working correctly, which is probably causing the problems here.
Let's wait for Vijay to merge the patch I mentioned, then things
should be fine.<br>
<br>
Pranith<br>
<blockquote cite="mid:5470C370.4060002@redhat.com" type="cite">
<blockquote
cite="mid:CAO5ZC7HgTS4PwCa+q3nyfwDKtJm91NhMjLOPc8ogfcJvbzZzfw@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><br>
</div>
<div>I should note that this cluster is home to many
Apache/PHP based web sites. The problem starts again,
however, the minute I point traffic back to the sites on the
cluster. Before pointing traffic to it, all is fine but as
soon as the traffic begins to hit it, the utilization again
begins to spike. Note that all the sites run just fine when
hosted from a standard EXT4 partition. I noticed another
thread labeled "glusterfsd process thrashing CPU" where
Pranith asks if the user has directories with lots of files
and I do.</div>
<div><br>
</div>
<div>Here are some other details of my cluster:</div>
<div>- OS: CentOS 6.6 with all updates on all 3 nodes as of
11-22-2014</div>
<div>- All 3 nodes have 8 cores with 16 GB of RAM</div>
<div>- Nodes are all formatted with EXT4</div>
<div>- All three nodes also have the file systems mounted on
them for use with Apache. I have experimented with both NFS
and Fuse mounts and it doesn't seem to make a difference
which I use for this particular problem. I am currently
using Fuse.</div>
<div>- Approximately 135 GB of data. Some deep directories
with many small files.</div>
<div>- No optimization or changes have been made to the
cluster . . . it is running with default options</div>
<div>- Gluster version 3.6.1-1 installed from RPMs</div>
<div>- Note the issue originally occurred on version 3.5.2 but
I updated before rebuilding it in hopes that would fix it
(it didn't)</div>
<div><br>
</div>
<div>Can anyone give me guidance on how to tackle this
problem? I am hoping perhaps Pranith can explain why he
asked about many files and how to proceed
given my situation. I know others have commented about
having many small files with regard to performance but when
the processors are not spiked, performance has been
acceptable. Any help would be greatly appreciated.</div>
<div><br>
</div>
</div>
</blockquote>
Kyle,<br>
3.6.1 has a problem with EXT4 because of 64-bit offsets.
The Afr-v2 implementation introduced this problem. We thought the
following patch had been merged, but it wasn't :-( <a
moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://review.gluster.com/8201">http://review.gluster.com/8201</a>.
Please don't use 3.6.1 with EXT4<br>
<br>
Vijay,<br>
Please merge <a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="http://review.gluster.com/8201">http://review.gluster.com/8201</a><br>
<br>
Pranith<br>
<blockquote
cite="mid:CAO5ZC7HgTS4PwCa+q3nyfwDKtJm91NhMjLOPc8ogfcJvbzZzfw@mail.gmail.com"
type="cite">
<div dir="ltr">-- <br>
<div class="gmail_signature">
<div dir="ltr">Kyle
<div>
<div><br>
</div>
</div>
</div>
</div>
</div>
<br>
<br>
<pre wrap="">_______________________________________________
Gluster-users mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://supercolony.gluster.org/mailman/listinfo/gluster-users">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a></pre>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>