<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <br>

    <div class="moz-cite-prefix">On 5/13/2014 10:43 PM, Sahina Bose

      wrote:<br>

    </div>

    <blockquote cite="mid:5373027B.3040301@redhat.com" type="cite">


      <br>

      <div class="moz-cite-prefix">On 05/14/2014 07:42 AM, Milo&#353; Koz&aacute;k

        wrote:<br>

      </div>

      <blockquote cite="mid:5372D100.40508@lejmr.com" type="cite">Hi, <br>

        I am running a field trial of Gluster 3.5 on two servers. Each

        server uses one 10k HDD formatted with XFS as a brick. On top of

        these bricks I have one replica 2 volume: <br>

        <br>

        [root@nodef01i ~]# gluster volume info ph-fs-0 <br>

        <br>

        Volume Name: ph-fs-0 <br>

        Type: Replicate <br>

        Volume ID: 5085e018-7c47-4d4f-8dcb-cd89ec240393 <br>

        Status: Started <br>

        Number of Bricks: 1 x 2 = 2 <br>

        Transport-type: tcp <br>

        Bricks: <br>

        Brick1: 10.11.100.1:/gfs/s3-sata-10k/brick <br>

        Brick2: 10.11.100.2:/gfs/s3-sata-10k/brick <br>

        Options Reconfigured: <br>

        performance.io-thread-count: 12 <br>

        network.ping-timeout: 2 <br>

        performance.cache-max-file-size: 0 <br>

        performance.flush-behind: on <br>

        <br>

        Additionally, I run Nagios to monitor everything, and for Gluster I

        use <a moz-do-not-send="true" class="moz-txt-link-freetext"

href="http://exchange.nagios.org/directory/Plugins/System-Metrics/File-System/GlusterFS-checks/details">http://exchange.nagios.org/directory/Plugins/System-Metrics/File-System/GlusterFS-checks/details</a>.

        I extended it slightly so that it also monitors the number of

        split-brain files, and all of this information goes into the

        performance data, so I can graph it (the graphs are

        attached). <br>

        <br>

        My problem is that I receive quite a lot of false warnings

        from Nagios during the day because some files are reported as

        unsynced (gluster volume heal XXX info). I don't know whether

        this is a bug or is caused by my configuration. Either way it is

        quite disturbing, and I am afraid that after receiving many false

        warnings I could overlook an important one. <br>

      </blockquote>

      <br>

      <br>

      I think this happens because "gluster volume heal info"

      reports files undergoing I/O in addition to files that need

      self-heal. See <a moz-do-not-send="true"

        class="moz-txt-link-freetext"

href="http://supercolony.gluster.org/pipermail/gluster-users/2014-May/040239.html">http://supercolony.gluster.org/pipermail/gluster-users/2014-May/040239.html</a>

      for more information on this. Pranith, please correct me if I am wrong.<br>

      <br>

    </blockquote>

    <br>

    That's what I've seen as well.<br>

    <br>
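    One way to cut down on the false warnings might be to have the check
    sample heal info a few times and only warn about entries that show up
    in every sample. Below is a rough, untested sketch of that idea in
    Python. The function names, the number of rounds and the delay are my
    own placeholders, and the parser assumes the output consists of
    "Brick host:/path" headers, one entry per line, and a
    "Number of entries: N" summary line, so please check it against what
    your version actually prints.<br>
    <br>
    <pre wrap="">
```python
import subprocess
import time


def parse_heal_info(output):
    """Return the set of (brick, entry) pairs listed in heal-info output.

    Assumed layout: a "Brick host:/path" header, then one entry per line,
    plus a "Number of entries: N" summary line somewhere in the section.
    """
    entries = set()
    brick = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
        elif line.startswith(("Number of entries", "Status:", "Gathering")):
            continue  # summary or status lines, not entries
        elif brick is not None:
            entries.add((brick, line))
    return entries


def persistent_entries(samples):
    """Keep only the entries that are present in every sample."""
    if not samples:
        return set()
    return set.intersection(*samples)


def check_volume(volume, rounds=3, delay=10.0):
    """Nagios-style check: return (exit_code, message with perfdata)."""
    samples = []
    for i in range(rounds):
        if i:
            time.sleep(delay)  # wait between samples, not before the first
        out = subprocess.check_output(
            ["gluster", "volume", "heal", volume, "info"],
            universal_newlines=True)
        samples.append(parse_heal_info(out))
    stuck = persistent_entries(samples)
    if stuck:
        count = len(stuck)
        return 1, "WARNING: %d entries pending heal | unsynced=%d" % (count, count)
    return 0, "OK: no persistent heal entries | unsynced=0"
```
</pre>
    Calling check_volume("ph-fs-0") would run the command three times,
    ten seconds apart. Note that this only filters out entries that come
    and go; a file under constant write, such as a busy VM image, could
    still appear in every sample and trigger the warning.<br>
    <br>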

    <blockquote cite="mid:5373027B.3040301@redhat.com" type="cite"> On

      another note, we are also developing Nagios plugins that can be

      used to monitor the various entities and services in the gluster

      cluster. The repositories are here - <br>

      <br>

      gluster-nagios-addons - <a moz-do-not-send="true"

        class="moz-txt-link-freetext"

        href="http://review.gluster.org/#/admin/projects/gluster-nagios-addons">http://review.gluster.org/#/admin/projects/gluster-nagios-addons</a><br>

      nagios-server-addons - <a moz-do-not-send="true"

        class="moz-txt-link-freetext"

        href="http://review.gluster.org/#/admin/projects/nagios-server-addons">http://review.gluster.org/#/admin/projects/nagios-server-addons</a><br>

      <br>

      We will be putting together a short doc on these soon. Meanwhile,

      please feel free to check them out and give us your

      feedback.<br>

      <br>

      <br>

      <br>

      <blockquote cite="mid:5372D100.40508@lejmr.com" type="cite"> <br>

        network.ping-timeout is set to 2 because I cannot allow the VM

        servers to hang for 2x42 sec when the other node is rebooted (we

        have a reboot policy). <br>

        <br>

        Thanks for help, <br>

        Milos <br>

        <br>

        <br>

        <br>

        <br>

        <br>

        <fieldset class="mimeAttachmentHeader"></fieldset>

        <br>

        <pre wrap="">_______________________________________________

Gluster-users mailing list

<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a>

<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://supercolony.gluster.org/mailman/listinfo/gluster-users">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a></pre>

      </blockquote>

      <br>

      <br>


    </blockquote>

    <br>

  </body>

</html>