<table cellspacing="0" cellpadding="0" border="0"><tr><td valign="top"><div>I see. Thanks a tonne for the thorough explanation! :) I can see that our setup would be vulnerable here, because the logger on one server is generally not aware of the state of the replica on the other server. So it is possible that the log files were renamed before heal had a chance to kick in. <br /><br />Could I also ask for the bug ID (should there be one) against which you are coding the fix, so that we can be notified once it is accepted?<br /><br />Also, as an aside, is O_DIRECT expected to prevent this from occurring, if one accepts the performance hit? <br /><br />Thanks again,<br />Anirban</div></td></tr></table>            <div id="_origMsg_">
                <div>
                    <br />
                    <div>
                        <div style="font-size:0.9em">
                            <hr size="1">
                            <b>
                                <span style="font-weight:bold">From:</span>
                            </b>
                            Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;;                            <br>
                            <b>
                                <span style="font-weight:bold">To:</span>
                            </b>
                            Anirban Ghoshal &lt;chalcogen_eg_oxygen@yahoo.com&gt;;  &lt;gluster-users@gluster.org&gt;;                                                                             <br>
                            <b>
                                <span style="font-weight:bold">Subject:</span>
                            </b>
                            Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors                            <br>
                            <b>
                                <span style="font-weight:bold">Sent:</span>
                            </b>
                            Sun, Oct 19, 2014 9:01:58 AM                            <br>
                        </div>
                            <br>
                            <table cellspacing="0" cellpadding="0" border="0">
                                <tbody>
                                    <tr>
                                        <td valign="top">
    <br clear="none">
    <div class="moz-cite-prefix">On 10/19/2014 01:36 PM, Anirban Ghoshal
      wrote:<br clear="none">
    </div>
    <blockquote type="cite">
      <table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top">
              <div>It is possible, yes, because these are actually a
                kind of log files. I suppose, like other logging
                frameworks, these files can remain open for a considerable
                period, and then get renamed to support log rotate
                semantics. <br clear="none">
                <br clear="none">
                That said, I might need to check with the team that
                actually manages the logging framework to be sure. I
                only take care of the file-system stuff. I can tell you
                for sure Monday. <br clear="none">
                <br clear="none">
                If it is the same race that you mention, is there a fix
                for it?<br clear="none">
                <br clear="none">
                Thanks,<br clear="none">
                Anirban</div>
            </td></tr></tbody></table>
      <div id="_origMsg_">
        <div> <br clear="none">
        </div>
      </div>
    </blockquote>
    I am working on the fix.<br clear="none">
    <br clear="none">
    RCA:<br clear="none">
    0) Let&#39;s say the file &#39;abc.log&#39; is opened for writing on replica pair
    (brick-0, brick-1)<br clear="none">
    1) brick-0 went down<br clear="none">
    2) abc.log is renamed to abc.log.1<br clear="none">
    3) brick-0 comes back up<br clear="none">
    4) re-open on old abc.log happens from mount to brick-0<br clear="none">
    5) self-heal kicks in and deletes old abc.log and creates and syncs
    abc.log.1<br clear="none">
    6) But the mount is still writing to the deleted &#39;old abc.log&#39; on
    brick-0, so abc.log.1 on brick-0 stays the same size while abc.log.1
    on brick-1 keeps growing. This leads to a size-mismatch
    split-brain on abc.log.1.<br clear="none">
    <br clear="none">
    The race is between steps 4) and 5). If 5) happens before 4), no
    split-brain will be observed.<br clear="none">
    <br clear="none">
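    A minimal local sketch of step 6), assuming a POSIX system and plain local files rather than a real brick. The helper name simulate_stale_fd and the temporary directory are hypothetical. It only illustrates that writes through an already-open descriptor keep landing in the deleted file, so a freshly created abc.log.1 does not grow.<br clear="none">
<pre>
```python
import os
import tempfile

def simulate_stale_fd(workdir):
    """Local analogy for steps 4) to 6): keep writing through an fd whose file was unlinked."""
    old_path = os.path.join(workdir, "abc.log")
    new_path = os.path.join(workdir, "abc.log.1")

    # Step 4 analogue: the writer (the mount) holds an open fd on the old abc.log.
    writer = open(old_path, "wb")
    writer.write(b"before heal\n")
    writer.flush()

    # Step 5 analogue: heal deletes the old abc.log and creates a fresh abc.log.1.
    os.unlink(old_path)
    with open(new_path, "wb") as healed:
        healed.write(b"before heal\n")
    size_after_heal = os.path.getsize(new_path)

    # Step 6 analogue: further writes land in the unlinked inode, not in abc.log.1.
    writer.write(b"after heal\n" * 10)
    writer.flush()
    stale_size = os.fstat(writer.fileno()).st_size
    writer.close()

    return size_after_heal, os.path.getsize(new_path), stale_size

with tempfile.TemporaryDirectory() as d:
    healed_before, healed_after, stale = simulate_stale_fd(d)
    print(healed_before, healed_after, stale)
```
</pre>
    The two sizes reported for abc.log.1 are equal while the size seen through the stale descriptor is larger, which is the same kind of mismatch as described above for brick-0 versus brick-1.<br clear="none">
    <br clear="none">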
    Work-around:<br clear="none">
    <br clear="none">
    0) Take backup of good abc.log.1 file from brick-1. (Just being
    paranoid)<br clear="none">
    <br clear="none">
    Do either of the following two steps to make sure the stale open
    file is closed:<br clear="none">
    1-a) Take the brick process with bad file down using kill -9
    &lt;brick-pid&gt; (In my example brick-0).<br clear="none">
    1-b) Introduce a temporary disconnect between mount and brick-0.<br clear="none">
    (I would choose 1-a)<br clear="none">
    2) Remove the bad file (abc.log.1) and its gfid-backend-file from
    brick-0<br clear="none">
    3) Bring the brick back up (gluster volume start &lt;volname&gt;
    force)/restore the connection and let it heal by doing &#39;stat&#39; on the
    file abc.log.1 on the mount.<br clear="none">
    <br clear="none">
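    For step 2), a sketch of locating the gfid-backend-file, assuming the usual brick layout in which each file has a hard link under .glusterfs, in a directory named after the first two hex digits of the gfid, then the next two, then the dashed gfid. That layout, the brick path /bricks/brick-0, the helper name gfid_backend_path and the gfid value below are assumptions for illustration, not details from this thread. Confirm the path on the brick before removing anything.<br clear="none">
<pre>
```python
import os
import uuid

def gfid_backend_path(brick_root, gfid_hex):
    """Return the assumed .glusterfs hard-link path for a gfid on a brick."""
    # Accept the hex form of the trusted.gfid xattr, with or without a 0x prefix.
    cleaned = gfid_hex.lower()
    if cleaned.startswith("0x"):
        cleaned = cleaned[2:]
    # Convert 32 hex digits to the dashed 8-4-4-4-12 form.
    canonical = str(uuid.UUID(hex=cleaned))
    # Assumed layout: BRICK/.glusterfs/first two hex digits/next two/dashed gfid
    return os.path.join(brick_root, ".glusterfs", canonical[0:2], canonical[2:4], canonical)

# Hypothetical brick root and gfid, for illustration only.
print(gfid_backend_path("/bricks/brick-0", "0x307a5c9efdd744ab94f0a1b2c3d4e5f6"))
```
</pre>
    The gfid itself would be read from the trusted.gfid extended attribute of abc.log.1 on the brick, and the printed path is the second entry to remove alongside the file.<br clear="none">
    <br clear="none">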
    This bug has existed since 2012, from the first time I implemented
    rename/hard-link self-heal. It is difficult to re-create: I have to
    put break-points at several places in the process to hit the race.<br clear="none">
    <br clear="none">
    Pranith<div class="yqt0243337077" id="yqtfd49805"><br clear="none">
    <blockquote type="cite">
      <table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top">
              <div><br clear="none">
                Thanks,<br clear="none">
                Anirban</div>
            </td></tr></tbody></table>
    </blockquote>
    <blockquote type="cite">
      <div id="_origMsg_">
        <div>
          <div>
            <div style="font-size:0.9em;">
              <hr size="1"> <b> <span style="font-weight:bold;">From:</span>
              </b> Pranith Kumar Karampuri <a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E" ymailto="mailto:pkarampu@redhat.com" target="_blank" href="javascript:return">&lt;pkarampu@redhat.com&gt;</a>;
              <br clear="none">
              <b> <span style="font-weight:bold;">To:</span> </b>
              Anirban Ghoshal <a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E" ymailto="mailto:chalcogen_eg_oxygen@yahoo.com" target="_blank" href="javascript:return">&lt;chalcogen_eg_oxygen@yahoo.com&gt;</a>;
              <a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E" ymailto="mailto:gluster-users@gluster.org" target="_blank" href="javascript:return">&lt;gluster-users@gluster.org&gt;</a>; <br clear="none">
              <b> <span style="font-weight:bold;">Subject:</span> </b>
              Re: [Gluster-users] Split-brain seen with [0 0] pending
              matrix and io-cache page errors <br clear="none">
              <b> <span style="font-weight:bold;">Sent:</span> </b>
              Sun, Oct 19, 2014 5:42:24 AM <br clear="none">
            </div>
            <br clear="none">
            <table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top"> <br clear="none">
                    <div class="moz-cite-prefix">On 10/18/2014 04:36 PM,
                      Anirban Ghoshal wrote:<br clear="none">
                    </div>
                    <blockquote type="cite">
                      <table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top">
                              <div>Hi,<br clear="none">
                                <br clear="none">
                                Yes, they do, and considerably. I&#39;d
                                forgotten to mention that on my last
                                email. Their mtimes, however, as far as
                                I could tell on separate servers, seemed
                                to coincide. <br clear="none">
                                <br clear="none">
                                Thanks,<br clear="none">
                                Anirban</div>
                            </td></tr></tbody></table>
                      <div id="_origMsg_">
                        <div> <br clear="none">
                        </div>
                      </div>
                    </blockquote>
                    <br clear="none">
                    Are these files always open? And is it possible that
                    the file could have been renamed when one of the
                    bricks was offline? I know of a race that can
                    cause this. I am just trying to find out whether it
                    is the same case.<br clear="none">
                    <br clear="none">
                    Pranith
                    <div class="yqt7836717428" id="yqtfd37976"><br clear="none">
                      <br clear="none">
                      <blockquote type="cite">
                        <div id="_origMsg_">
                          <div>
                            <div>
                              <div style="font-size:0.9em;">
                                <hr size="1"> <b> <span style="font-weight:bold;">From:</span>
                                </b> Pranith Kumar Karampuri <a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E">&lt;pkarampu@redhat.com&gt;</a>;
                                <br clear="none">
                                <b> <span style="font-weight:bold;">To:</span>
                                </b> Anirban Ghoshal <a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E">&lt;chalcogen_eg_oxygen@yahoo.com&gt;</a>;
                                <a rel="nofollow" shape="rect" class="moz-txt-link-abbreviated">gluster-users@gluster.org</a>
                                <a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E">&lt;gluster-users@gluster.org&gt;</a>;
                                <br clear="none">
                                <b> <span style="font-weight:bold;">Subject:</span>
                                </b> Re: [Gluster-users] Split-brain
                                seen with [0 0] pending matrix and
                                io-cache page errors <br clear="none">
                                <b> <span style="font-weight:bold;">Sent:</span>
                                </b> Sat, Oct 18, 2014 12:26:08 AM <br clear="none">
                              </div>
                              <br clear="none">
                              <table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top"> Hi,<br clear="none">
                                            Could you check whether the size of
                                      the file differs between the replicas?<br clear="none">
                                      <br clear="none">
                                      Pranith<br clear="none">
                                      <br clear="none">
                                      <div class="yqt8170167658" id="yqt92241">
                                        <div class="moz-cite-prefix">On
                                          10/18/2014 04:20 AM, Anirban
                                          Ghoshal wrote:<br clear="none">
                                        </div>
                                        <blockquote type="cite">
                                          <div style="color:#000;background-color:#fff;font-family:Courier New, courier, monaco, monospace, sans-serif;font-size:13px;">
                                            <div class="" style="">Hi
                                              everyone,</div>
                                            <div class="" style=""><br clear="none" class="" style="">
                                            </div>
                                            <div class="" style="color:rgb(0, 0,                                               0);font-size:13px;font-family:'Courier                                               New', courier, monaco, monospace, sans-serif;font-style:normal;background-color:transparent;">I
                                              have this really confusing
                                              split-brain here that&#39;s
                                              bothering me. I am running
                                              GlusterFS 3.4.2 on Linux
                                              2.6.34. I have a replica 2
                                              volume &#39;testvol&#39; that is
                                              It seems I cannot
                                              read/stat/edit the file in
                                              question, and `gluster
                                              volume heal testvol info
                                              split-brain` shows
                                              nothing. Here are the logs
                                              from the fuse-mount for
                                              the volume:</div>
                                            <div class="" style="color:rgb(0, 0,                                               0);font-size:13px;font-family:'Courier                                               New', courier, monaco, monospace, sans-serif;font-style:normal;background-color:transparent;"><br clear="none" class="" style="">
                                            </div>
                                            <div class="" style="color:rgb(0, 0,                                               0);font-size:13px;font-family:'Courier                                               New', courier, monaco, monospace, sans-serif;font-style:normal;background-color:transparent;"><font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:53:02.867111] W
                                                [fuse-bridge.c:1172:fuse_err_cbk]
                                                0-glusterfs-fuse:
                                                4560969: FLUSH() ERR
                                                =&gt; -1 (Input/output
                                                error) </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
                                              <font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:54:16.007799] W
                                                [page.c:991:__ioc_page_error]
                                                0-testvol-io-cache: page
                                                error for page =
                                                0x7fd5c8529d20 &amp;
                                                waitq = 0x7fd5c8067d40 </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
                                              <font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:54:16.007854] W
                                                [fuse-bridge.c:2089:fuse_readv_cbk]
                                                0-glusterfs-fuse:
                                                4561103: READ =&gt; -1
                                                (Input/output error) </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
                                              <font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:54:16.008018] W
                                                [page.c:991:__ioc_page_error]
                                                0-testvol-io-cache: page
                                                error for page =
                                                0x7fd5c8607ee0 &amp;
                                                waitq = 0x7fd5c8067d40 </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
                                              <font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:54:16.008056] W
                                                [fuse-bridge.c:2089:fuse_readv_cbk]
                                                0-glusterfs-fuse:
                                                4561104: READ =&gt; -1
                                                (Input/output error) </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
                                              <font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:54:16.008233] W
                                                [page.c:991:__ioc_page_error]
                                                0-testvol-io-cache: page
                                                error for page =
                                                0x7fd5c8066f30 &amp;
                                                waitq = 0x7fd5c8067d40 </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
                                              <font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:54:16.008269] W
                                                [fuse-bridge.c:2089:fuse_readv_cbk]
                                                0-glusterfs-fuse:
                                                4561105: READ =&gt; -1
                                                (Input/output error) </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
                                              <font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:54:16.008800] W
                                                [page.c:991:__ioc_page_error]
                                                0-testvol-io-cache: page
                                                error for page =
                                                0x7fd5c860bcf0 &amp;
                                                waitq = 0x7fd5c863b1f0 </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
                                              <font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:54:16.008839] W
                                                [fuse-bridge.c:2089:fuse_readv_cbk]
                                                0-glusterfs-fuse:
                                                4561107: READ =&gt; -1
                                                (Input/output error) </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
                                              <font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:54:16.009365] W
                                                [page.c:991:__ioc_page_error]
                                                0-testvol-io-cache: page
                                                error for page =
                                                0x7fd5c85fd120 &amp;
                                                waitq = 0x7fd5c8067d40 </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
                                              <font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:54:16.009413] W
                                                [fuse-bridge.c:2089:fuse_readv_cbk]
                                                0-glusterfs-fuse:
                                                4561109: READ =&gt; -1
                                                (Input/output error) </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
                                              <font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:54:16.040549] W
                                                [afr-open.c:213:afr_open]
                                                0-testvol-replicate-0:
                                                failed to open as split
                                                brain seen, returning
                                                EIO </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
                                              <font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
                                                07:54:16.040594] W
                                                [fuse-bridge.c:915:fuse_fd_cbk]
                                                0-glusterfs-fuse:
                                                4561142: OPEN()
                                                /SECLOG/20140908.d/SECLOG_00000000000000427425_00000000000000000000.log
                                                =&gt; -1 (Input/output
                                                error)</font><br clear="none" class="" style="">
                                            </div>
                                            <div class="" style="color:rgb(0, 0,                                               0);font-size:13px;font-family:'Courier                                               New', courier, monaco, monospace, sans-serif;font-style:normal;background-color:transparent;"><br clear="none" class="" style="">
                                            </div>
                                            <div class="" style="color:rgb(0, 0,                                               0);font-size:13px;font-style:normal;background-color:transparent;"><span style="font-family:verdana, helvetica, sans-serif;">Could
                                                somebody please give me
                                                some clue on where to
                                                begin? I checked the
                                                xattrs on <span class="">/SECLOG/20140908.d/SECLOG_00000000000000427425_00000000000000000000.log


                                                  and it seems the
                                                  changelogs are [0, 0]
                                                  on both replicas, and
                                                  the gfid&#39;s match.</span></span></div>
                                            <div class="" style="color:rgb(0, 0,                                               0);font-size:13px;font-style:normal;font-family:verdana, helvetica, sans-serif;background-color:transparent;"><span style="font-family:verdana, helvetica, sans-serif;"><span class=""><br clear="none">
                                                </span></span></div>
                                            <div class="" style="color:rgb(0, 0,                                               0);font-size:13px;font-style:normal;font-family:verdana, helvetica, sans-serif;background-color:transparent;"><span style="font-family:verdana, helvetica, sans-serif;"><span class="">Thank you
                                                  very much for any help
                                                  on this.</span></span></div>
                                            <div class="" style="color:rgb(0, 0,                                               0);font-size:13px;font-style:normal;font-family:verdana, helvetica, sans-serif;background-color:transparent;"><span style="font-family:verdana, helvetica, sans-serif;"><span class="">Anirban</span></span></div>
                                            <div class="" style="color:rgb(0, 0,                                               0);font-size:13px;font-family:Verdana, Arial, Helvetica, sans-serif;font-style:normal;background-color:transparent;"><span class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;"><br clear="none">
                                              </span></div>
                                            <div class="" style="color:rgb(0, 0,                                               0);font-size:13px;font-family:'Courier                                               New', courier, monaco, monospace, sans-serif;font-style:normal;background-color:transparent;"><br clear="none" class="" style="">
                                            </div>
                                            <div class="" style="color:rgb(0, 0,                                               0);font-size:13px;font-family:'Courier                                               New', courier, monaco, monospace, sans-serif;font-style:normal;background-color:transparent;"><br clear="none" class="" style="">
                                            </div>
                                          </div>
                                          <br clear="none">
                                          <fieldset class="mimeAttachmentHeader"></fieldset>
                                          <br clear="none">
                                          <pre>_______________________________________________
Gluster-users mailing list
<a rel="nofollow" shape="rect" class="moz-txt-link-abbreviated">Gluster-users@gluster.org</a>
<a rel="nofollow" shape="rect" class="moz-txt-link-freetext" target="_blank" href="http://supercolony.gluster.org/mailman/listinfo/gluster-users">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a></pre>
                                        </blockquote>
                                      </div>
                                      <br clear="none">
                                    </td></tr></tbody></table>
                            </div>
                          </div>
                        </div>
                      </blockquote>
                      <br clear="none">
                    </div>
                  </td></tr></tbody></table>
          </div>
        </div>
      </div>
    </blockquote>
    <br clear="none">
  </div></td>
                                    </tr>
                                </tbody>
                            </table>
                    </div>
                </div>
            </div>