<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#ffffff" text="#000000">

    All - <br>

    &nbsp; A much more detailed update from engineering is below. The range

    of affected versions is bigger than I thought; I apologize for not

    getting that right earlier.&nbsp; Please let me know if you have any

    questions. <br>

    <br>

    <span style="font-weight: bold;">VERSIONS AFFECTED: All current

      GlusterFS releases up to and including 3.1.5 and 3.2.1<br>

      <br>

      SEVERITY: For Enomaly users, a conflict results in denial of

      service but no loss of data. We have not observed the same race

      conditions in non-Enomaly environments.<br>

    </span><br>

    <b>CAUSE: &nbsp;Gluster and Enomaly incompatibility&nbsp;</b>

    <div>There is an incompatibility between GlusterFS and Enomaly in
      how they perform directory operations.&nbsp;Enomaly's agents monitor
      node failures and migrate VMs automatically. These distributed
      agents communicate with each other by performing directory
      operations, such as mkdir. These directory operations trigger a
      race condition in GlusterFS, locking up a storage node. Enomaly
      agents then get confused and propagate the error across the site,
      even for a single node failure.

      <br>

    </div>
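    <div><br>
      To make the mechanism concrete, here is a minimal sketch of how a
      directory operation such as mkdir can serve as a lock between
      agents. It is not Enomaly's code: the function names, the lock
      name, and the temporary directory standing in for the shared mount
      are all assumptions for illustration. The scheme relies on mkdir
      either creating the directory or failing because it already
      exists, which is the kind of guarantee a race in the storage layer
      can undermine.
    </div>
    <pre>
```python
import os
import tempfile


def try_acquire(lock_dir):
    """Return True if this caller created lock_dir, False if it already existed."""
    try:
        os.mkdir(lock_dir)  # creation and existence check happen in one call
        return True
    except FileExistsError:
        return False


def release(lock_dir):
    """Remove the lock directory so another agent can claim it."""
    os.rmdir(lock_dir)


if __name__ == "__main__":
    shared = tempfile.mkdtemp()  # assumption: stand-in for a shared GlusterFS mount
    lock = os.path.join(shared, "vm-0001.lock")  # hypothetical lock name
    print(try_acquire(lock))  # first agent takes the lock
    print(try_acquire(lock))  # second agent is refused while it is held
    release(lock)
    print(try_acquire(lock))  # lock is available again after release
```
    </pre>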

    <div><br>

    </div>

    There was a race condition related to changing GFIDs that was fixed

    in 3.1.5 - <a

      href="http://blog.gluster.com/2011/06/glusterfs-3-1-5-now-available/"

      target="_blank">

      http://blog.gluster.com/2011/06/glusterfs-3-1-5-now-available/</a>

    - but this is only a partial fix for the behavior described. Race

    conditions can still occur.<br>

    <br>

    After fixing the initial outage, if any node fails, you will see the

    issue again. Upgrading GlusterFS to 3.1.5 and restarting GlusterFS

    and Enomaly is a temporary fix. A permanent&nbsp;solution requires the

    Gluster 3.1.6 or 3.2.2 release (coming soon, see "Solution" below).<br>

    <br>

    Other possible race conditions are fixed in the current source tree,

    subject to further testing.<br>

    <div><br>

    </div>

    <div><b>SOLUTION:</b></div>

    <div>

      <div><span style="font-weight: normal;">This issue has been fixed

          in our source repository (<a class="moz-txt-link-freetext" href="https://github.com/gluster">https://github.com/gluster</a>) and will

          be released soon with 3.1.6

        </span><span style="font-weight: normal;">and 3.2.2. If you'd

          like to help test the current fixes, please contact us before

          you do anything foolish (read: use in production). Users who

          test these patches in their non-critical, development

          environments and send us feedback will each get a Gluster

          t-shirt, maybe even a hat!!<br>

          <br>

          We will send out another alert as soon as both releases are

          GA.<br>

        </span></div>

    </div>
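    <div><br>
      As a rough aid for checking an installation against the version
      ranges in this alert, the sketch below classifies a version
      string. The helper names are hypothetical, 3.1.6 and 3.2.2 are
      taken as the first fixed releases as stated above (they are not
      yet GA), and the handling of branches other than 3.1 and 3.2 is an
      assumption, not something engineering has confirmed.
    </div>
    <pre>
```python
def parse_version(text):
    """Turn '3.1.5' into (3, 1, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# First release on each branch expected to carry the permanent fix,
# according to this alert.
FIRST_FIXED = {(3, 1): (3, 1, 6), (3, 2): (3, 2, 2)}


def is_affected(text):
    """Return True if the given GlusterFS version predates the permanent fix."""
    version = parse_version(text)
    branch = version[:2]
    if branch in FIRST_FIXED:
        return version < FIRST_FIXED[branch]
    # Assumption: branches older than 3.1 stay affected and any newer
    # branch ships with the fix.
    return branch < (3, 1)


if __name__ == "__main__":
    for candidate in ("3.1.4", "3.1.5", "3.1.6", "3.2.1", "3.2.2"):
        print(candidate, is_affected(candidate))
```
    </pre>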

    <br>

    <div class="moz-signature">-- <br>


      <span><span style="font-family: Arial;">Thanks,</span><br

          style="font-family: Arial;">

        <br style="font-family: Arial;">

        <span style="font-family: Arial;">Craig Carl</span><br

          style="font-family: Arial;">

        <span style="font-family: Arial;">Senior Systems Engineer |

          Gluster</span><br style="font-family: Arial;">

        <span class="" id="OBJ_PREFIX_DWT11525" style="font-family:

          Arial;"><a href="callto:+1408-829-9953">408-829-9953</a></span><span

          style="font-family: Arial;"> | San Francisco, CA<br>

        </span><a href="http://www.gluster.com/gluster-for-aws/"

          style="font-family: Arial;">http://www.gluster.com/gluster-for-aws/</a>

        <br>

      </span></div>

    <br>

    <br>

    <blockquote style="border: 0px none;"

      cite="mid:4E04FA34.6000504@gluster.com" type="cite">

      <div style="margin-left: 40px;">

        <hr style="border-width: 1px 0pt 0pt; border-style: dotted none

          none; border-color: rgb(181, 181, 181) -moz-use-text-color

          -moz-use-text-color; height: 1px; margin: 0pt;"

          class="__pbConvHr"><br>

      </div>

      <div style="margin-left: 40px;"><a moz-do-not-send="true"

          href="mailto:craig@gluster.com" style="color: rgb(0, 136, 204)

          ! important; text-decoration: none ! important;">Craig Carl</a><br>

        <font color="#888888">June 24, 2011 1:57 PM</font></div>

      <br>

      <div style="color: rgb(136, 136, 136); margin-left: 40px;"

        __pbrmquotes="true" class="__pbConvBody"><br>


        All - <br>

        <br>

        Gluster has identified a serious issue that affects anyone

        hosting VM images for the Enomaly Elastic Cloud Platform with

        Gluster. This issue is limited to Gluster versions &lt;=3.1.4.&nbsp;

        We strongly encourage anyone using Enomaly and Gluster to

        upgrade to Gluster version 3.1.5 or higher.&nbsp; <br>

        <br>

        What causes the failure - <br>

        <br>

        Use a distribute-replicate volume.<br>

        Mount using either NFS or the GlusterFS native client.<br>

        Fail one of the replica nodes. <br>

        &nbsp;&nbsp;&nbsp; ** Production is unaffected at this point.<br>

        Restart the failed node.&nbsp; <br>

        &nbsp;&nbsp; ** All the virtual machines fail. <br>

        &nbsp;&nbsp; ** The ecpagent service on each hypervisor will constantly

        restart.<br>

        <br>

        Root cause - <br>

        <br>

        Enomaly uses a locking mechanism in addition to and above the

        standard POSIX locking to make sure that a VM never starts on

        two servers at the same time. When an Enomaly server starts a VM

        it writes a file (&lt;randomstuff.tmp&gt;) to the directory. The

        combination of self-heal and a race between the ecpagents on the

        hypervisors results in the VMs failing to start. <span>No data

          is lost or damaged.</span><br>

        <br>

        Again, this issue is specific to Enomaly. Enomaly users should

        immediately upgrade to a version of Gluster &gt;=3.1.5.<br>

        <br>

        <a moz-do-not-send="true"

          href="http://download.gluster.com/pub/gluster/glusterfs/"

          class="moz-txt-link-freetext">http://download.gluster.com/pub/gluster/glusterfs/</a><br>

        <br>

        <div>_______________________________________________<br>

          Gluster-users mailing list<br>

          <a class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>

          <a class="moz-txt-link-freetext" href="http://gluster.org/cgi-bin/mailman/listinfo/gluster-users">http://gluster.org/cgi-bin/mailman/listinfo/gluster-users</a><br>

        </div>

      </div>

    </blockquote>

  </body>

</html>