<div dir="ltr">Thank you!<div><br></div><div>I think we may need some sort of dampening method and more specific input into Nagios. i.e. Details on which files are out-of-sync, versus just the number of files out-of-sync.<div><br></div><div>I'm using these: <a href="http://download.gluster.org/pub/gluster/glusterfs-nagios/">http://download.gluster.org/pub/gluster/glusterfs-nagios/</a></div><div><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Nov 19, 2014 at 10:14 AM, Nishanth Thomas <span dir="ltr"><<a href="mailto:nthomas@redhat.com" target="_blank">nthomas@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Vince,<br>
On Wed, Nov 19, 2014 at 10:14 AM, Nishanth Thomas <nthomas@redhat.com> wrote:

Hi Vince,

Are you referring to the monitoring scripts mentioned in the blog
(http://gopukrish.wordpress.com/2014/11/16/monitor-glusterfs-using-nagios-plugin/)
or the scripts that ship with gluster
(http://gluster.org/pipermail/gluster-users.old/2014-June/017819.html)?
Could you please confirm?

Thanks,
Nishanth

----- Original Message -----
From: "Humble Devassy Chirammal" <humble.devassy@gmail.com>
To: "Vince Loschiavo" <vloschiavo@gmail.com>
Cc: "gluster-users@gluster.org" <Gluster-users@gluster.org>, "Sahina Bose" <sabose@redhat.com>, nthomas@redhat.com
Sent: Wednesday, November 19, 2014 11:22:18 PM
Subject: Re: [Gluster-users] v3.6.1 vs v3.5.2 self heal - help (Nagios related)

Hi Vince,
It could be a behavioural change in how the heal info output is captured in
the latest GlusterFS. If that is the case, we may be able to tune the
interval at which Nagios collects the heal info output, or some other
settings, to avoid continuous alerts. I am CCing the gluster-nagios devs.
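For example, something along these lines in the Nagios service definition
might damp the alerts until we understand the new behaviour (an untested
sketch: the host, service, and command names are placeholders, the intervals
are guesses, and it assumes the stock generic-service template):

define service {
  use                     generic-service
  host_name               gluster-node1          ; placeholder host
  service_description     GlusterFS self-heal
  check_command           check_vol_heal_status  ; plugin command name assumed
  check_interval          5   ; minutes between normal checks
  retry_interval          2   ; minutes between re-checks after a non-OK result
  max_check_attempts      3   ; notify only after 3 consecutive non-OK results
  flap_detection_enabled  1   ; let Nagios suppress rapidly alternating states
}

With max_check_attempts above 1, a heal entry has to stay visible across
several consecutive checks before a notification goes out.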

--Humble

On Wed, Nov 19, 2014 at 9:50 PM, Vince Loschiavo <vloschiavo@gmail.com>
wrote:

>
> Hello Gluster Community,
>
> I have been using the Nagios monitoring scripts, mentioned in the thread
> below, on 3.5.2 with great success. The most useful of these is the
> self-heal check.
>
> However, I've just upgraded to 3.6.1 in the lab, and the self-heal daemon
> has become quite aggressive. I continually get alerts/warnings on 3.6.1
> that virt disk images need self-heal, and then they clear. This is not the
> case on 3.5.2.
>
> Configuration:
> 2-node, 2-brick replicated volume with a 2x1GB LAG network between the
> peers, using this volume as a QEMU/KVM virt image store through the FUSE
> mount on CentOS 6.5.
>
> Example:
> On 3.5.2:
> *gluster volume heal volumename info* shows the bricks and the number of
> entries to be healed: 0
>
> On v3.5.2, during normal gluster operations, I can run this command over
> and over again, 2-4 times per second, and it will always show 0 entries to
> be healed. I've used this as an indicator that the bricks are
> synchronized.
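(For reference, heal info output on this kind of two-brick replica looks
roughly like the below; the hostnames and brick paths are placeholders, and
the exact layout may differ between versions:)

# gluster volume heal volumename info
Brick server1:/bricks/brick1/
Number of entries: 0

Brick server2:/bricks/brick1/
Number of entries: 0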
>
> Last night, I upgraded to 3.6.1 in the lab and I'm seeing different
> behavior. Running *gluster volume heal volumename info* during normal
> operations will show a file out of sync, seemingly between every block
> written to disk and then synced to the peer. I can run the command over
> and over again, 2-4 times per second, and it will almost always show
> something out of sync. The individual files change, meaning:
>
> Example:
> 1st run: shows file 1 out of sync
> 2nd run: shows file 2 and file 3 out of sync, but file 1 is now in sync
> (not in the list)
> 3rd run: shows file 3 and file 4 out of sync, but file 1 and file 2 are
> in sync (not in the list)
> ...
> nth run: shows 0 files out of sync
> nth+1 run: shows file 3 and file 12 out of sync.
>
> From looking at the virtual machines running off this gluster volume, it's
> obvious that gluster is working well. However, this obviously plays havoc
> with Nagios and alerts: Nagios will run the heal info command, get different
> and non-useful results each time, and send alerts.
>
> Is this behavior change (3.5.2 vs 3.6.1) expected? Is there a way to tune
> the settings or change the monitoring method to get better results into
> Nagios?
>
> Thank you,
>
> --
> -Vince Loschiavo
>
>
> On Wed, Nov 19, 2014 at 4:35 AM, Humble Devassy Chirammal
> <humble.devassy@gmail.com> wrote:
>
>> Hi Gopu,
>>
>> Awesome!!
>>
>> We can have a Gluster blog post about this implementation.
>>
>> --Humble
>>
>>
>> On Wed, Nov 19, 2014 at 5:38 PM, Gopu Krishnan
>> <gopukrishnantec@gmail.com> wrote:
>>
>>> Thanks for all your help... I was able to configure Nagios using the
>>> glusterfs plugin. The following link shows how I configured it; hope it
>>> helps someone else:
>>>
>>> http://gopukrish.wordpress.com/2014/11/16/monitor-glusterfs-using-nagios-plugin/
>>>
>>> On Sun, Nov 16, 2014 at 11:44 AM, Humble Devassy Chirammal
>>> <humble.devassy@gmail.com> wrote:
>>><br>
>>>> Hi,<br>
>>>><br>
>>>> Please look at this thread<br>
>>>> <a href="http://gluster.org/pipermail/gluster-users.old/2014-June/017819.html" target="_blank">http://gluster.org/pipermail/gluster-users.old/2014-June/017819.html</a><br>
>>>><br>
>>>> Btw, if you are around, we have a talk on same topic in upcoming<br>
>>>> GlusterFS India meetup.<br>
>>>><br>
>>>> Details can be fetched from:<br>
>>>> <a href="http://www.meetup.com/glusterfs-India/" target="_blank">http://www.meetup.com/glusterfs-India/</a><br>
>>>><br>
>>>> --Humble<br>
>>>><br>
>>>> --Humble<br>
>>>><br>
>>>><br>
>>>> On Sun, Nov 16, 2014 at 11:23 AM, Gopu Krishnan <<br>
>>>> <a href="mailto:gopukrishnantec@gmail.com">gopukrishnantec@gmail.com</a>> wrote:<br>
>>>>
>>>>> How can we monitor the gluster nodes and get alerted if something goes
>>>>> wrong? I found some Nagios plugins, but they haven't worked so far.
>>>>> I am still experimenting with them. Any suggestions would be very
>>>>> helpful.
>>>>>
>>>>
>>>
>>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>

--
-Vince Loschiavo