<table cellspacing="0" cellpadding="0" border="0"><tr><td valign="top"><div>Ok, no problem. The issue is very rare, even with our setup - we have seen it only once on one site even though we have been in production for several months now. For now, we can live with that IMO. <br /><br />And, thanks again. <br /><br />Anirban</div></td></tr></table> <div id="_origMsg_">
<div>
<br />
<div>
<div style="font-size:0.9em">
<hr size="1">
<b>
<span style="font-weight:bold">From:</span>
</b>
Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;; <br>
<b>
<span style="font-weight:bold">To:</span>
</b>
Anirban Ghoshal &lt;chalcogen_eg_oxygen@yahoo.com&gt;; &lt;gluster-users@gluster.org&gt;; <br>
<b>
<span style="font-weight:bold">Subject:</span>
</b>
Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors <br>
<b>
<span style="font-weight:bold">Sent:</span>
</b>
Mon, Oct 20, 2014 4:08:05 AM <br>
</div>
<br>
<table cellspacing="0" cellpadding="0" border="0">
<tbody>
<tr>
<td valign="top">
<br clear="none">
<div class="moz-cite-prefix">On 10/19/2014 06:05 PM, Anirban Ghoshal
wrote:<br clear="none">
</div>
<blockquote type="cite">
<table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top">
<div>I see. Thanks a tonne for the thorough explanation!
:) I can see that our setup would be vulnerable here
because the logger on one server is not generally aware
of the state of the replica on the other server. So, it
is possible that the log files may have been renamed
before heal had a chance to kick in. <br clear="none">
<br clear="none">
Could I also ask you for the bug ID (should there be
one) against which you are coding up the fix, so that we
can get a notification once it is fixed?<br clear="none">
</div>
</td></tr></tbody></table>
</blockquote>
This bug was reported by Red Hat QE and has been cloned upstream. I
copied the relevant content so that you can follow the context:<br clear="none">
<a rel="nofollow" shape="rect" class="moz-txt-link-freetext" target="_blank" href="https://bugzilla.redhat.com/show_bug.cgi?id=1154491">https://bugzilla.redhat.com/show_bug.cgi?id=1154491</a><br clear="none">
<br clear="none">
Pranith<br clear="none">
<blockquote type="cite">
<table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top">
<div><br clear="none">
Also, as an aside, is O_DIRECT supposed to prevent this
from occurring if one were to make allowance for the
performance hit? <br clear="none">
</div>
</td></tr></tbody></table>
</blockquote>
Unfortunately no :-(. As far as I understand that was the only
work-around.<br clear="none">
<br clear="none">
Pranith<div class="yqt8270079415" id="yqtfd57243"><br clear="none">
<blockquote type="cite">
<table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top">
<div><br clear="none">
Thanks again,<br clear="none">
Anirban</div>
</td></tr></tbody></table>
<div id="_origMsg_">
<div> <br clear="none">
<div>
<div style="font-size:0.9em;">
<hr size="1"> <b> <span style="font-weight:bold;">From:</span>
</b> Pranith Kumar Karampuri <a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E" ymailto="mailto:pkarampu@redhat.com" target="_blank" href="javascript:return">&lt;pkarampu@redhat.com&gt;</a>;
<br clear="none">
<b> <span style="font-weight:bold;">To:</span> </b>
Anirban Ghoshal <a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E" ymailto="mailto:chalcogen_eg_oxygen@yahoo.com" target="_blank" href="javascript:return">&lt;chalcogen_eg_oxygen@yahoo.com&gt;</a>;
<a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E" ymailto="mailto:gluster-users@gluster.org" target="_blank" href="javascript:return">&lt;gluster-users@gluster.org&gt;</a>; <br clear="none">
<b> <span style="font-weight:bold;">Subject:</span> </b>
Re: [Gluster-users] Split-brain seen with [0 0] pending
matrix and io-cache page errors <br clear="none">
<b> <span style="font-weight:bold;">Sent:</span> </b>
Sun, Oct 19, 2014 9:01:58 AM <br clear="none">
</div>
<br clear="none">
<table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top"> <br clear="none">
<div class="moz-cite-prefix">On 10/19/2014 01:36 PM,
Anirban Ghoshal wrote:<br clear="none">
</div>
<blockquote type="cite">
<table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top">
<div>It is possible, yes, because these
are actually a kind of log file. I
suppose that, as with other logging frameworks,
these files can remain open for a
considerable period, and then get
renamed to support log-rotate semantics.
<br clear="none">
<br clear="none">
That said, I might need to check with
the team that actually manages the
logging framework to be sure. I only
take care of the file-system stuff. I
can tell you for sure Monday. <br clear="none">
<br clear="none">
If it is the same race that you mention,
is there a fix for it?<br clear="none">
<br clear="none">
Thanks,<br clear="none">
Anirban</div>
</td></tr></tbody></table>
<div id="_origMsg_">
<div> <br clear="none">
</div>
</div>
</blockquote>
I am working on the fix.<br clear="none">
<br clear="none">
RCA:<br clear="none">
0) Let's say the file 'abc.log' is opened for writing
on replica pair (brick-0, brick-1)<br clear="none">
1) brick-0 went down<br clear="none">
2) abc.log is renamed to abc.log.1<br clear="none">
3) brick-0 comes back up<br clear="none">
4) re-open on old abc.log happens from mount to
brick-0<br clear="none">
5) self-heal kicks in and deletes old abc.log and
creates and syncs abc.log.1<br clear="none">
6) But the mount is still writing to the deleted
'old abc.log' on brick-0, so abc.log.1 stays
at the same size on brick-0 while abc.log.1 keeps
growing on brick-1. This leads to a size-mismatch
split-brain on abc.log.1.<br clear="none">
<br clear="none">
The race is between steps 4) and 5). If 5) happens
before 4), no split-brain will be observed.<br clear="none">
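<br clear="none">
The effect in step 6) can be seen on any POSIX file system, without
GlusterFS: a file descriptor that was opened before the unlink keeps
writing to the deleted inode, and the newly created file never sees those
writes. A minimal sketch, not GlusterFS code; the function name and the
temporary directory standing in for brick-0 are made up for illustration:<br clear="none">
<pre>
```python
import os
import tempfile


def simulate_stale_fd():
    """Mimic steps 4-6 on one local directory standing in for brick-0."""
    with tempfile.TemporaryDirectory() as brick:
        old_path = os.path.join(brick, "abc.log")
        new_path = os.path.join(brick, "abc.log.1")

        # Step 4: the writer (re-)opens the old name and keeps the fd.
        fd = os.open(old_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        os.write(fd, b"before heal\n")

        # Step 5: "self-heal" deletes the old name and creates abc.log.1
        # with the content it synced from the good copy.
        os.unlink(old_path)
        with open(new_path, "wb") as healed:
            healed.write(b"before heal\n")

        # Step 6: the writer keeps appending through the stale fd.
        os.write(fd, b"after heal\n")

        stale_size = os.fstat(fd).st_size         # deleted inode keeps growing
        visible_size = os.stat(new_path).st_size  # abc.log.1 does not
        os.close(fd)
        return stale_size, visible_size


if __name__ == "__main__":
    stale, visible = simulate_stale_fd()
    print("stale fd size:", stale, "abc.log.1 size:", visible)
```
</pre>
The two sizes differ after the last write, which is the same kind of
mismatch that later shows up between brick-0 and brick-1.<br clear="none">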
<br clear="none">
Work-around:<br clear="none">
<br clear="none">
0) Take a backup of the good abc.log.1 file from brick-1.
(Just being paranoid)<br clear="none">
<br clear="none">
Do either of the following two steps to make sure the
stale file that is still open gets closed:<br clear="none">
1-a) Take down the brick process that has the bad file using
kill -9 &lt;brick-pid&gt; (brick-0 in my example).<br clear="none">
1-b) Introduce a temporary disconnect between the mount
and brick-0.<br clear="none">
(I would choose 1-a)<br clear="none">
2) Remove the bad file (abc.log.1) and its
gfid-backend-file from brick-0.<br clear="none">
3) Bring the brick back up (gluster volume start
&lt;volname&gt; force) or restore the connection, and
let it heal by doing 'stat' on the file abc.log.1 on
the mount.<br clear="none">
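<br clear="none">
For step 2), the gfid-backend-file is normally a hard link under the
brick's .glusterfs directory, named after the file's gfid. A small sketch of
how its path can be derived, assuming the usual
.glusterfs/&lt;first two hex digits&gt;/&lt;next two&gt;/&lt;gfid&gt; layout; the helper name,
the brick path and the gfid value below are hypothetical, and on a real brick
the 16 gfid bytes would come from the file's trusted.gfid xattr:<br clear="none">
<pre>
```python
import os
import uuid


def gfid_backend_path(brick_root, gfid_bytes):
    """Return the expected .glusterfs path for a 16-byte gfid.

    Assumes the common layout .glusterfs/aa/bb/aabbccdd-...; verify it
    on your own brick before removing anything.
    """
    gfid = str(uuid.UUID(bytes=gfid_bytes))
    return os.path.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)


if __name__ == "__main__":
    # Hypothetical gfid and brick path, for illustration only.
    example_gfid = bytes.fromhex("0123456789abcdef0123456789abcdef")
    print(gfid_backend_path("/bricks/brick-0", example_gfid))
```
</pre>
Checking that the computed path exists and has the same inode as abc.log.1
before deleting either is a cheap sanity check.<br clear="none">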
<br clear="none">
This bug has existed since 2012, when I first
implemented rename/hard-link self-heal. It is
difficult to re-create: I have to put break-points
at several places in the process to hit the race.<br clear="none">
<br clear="none">
Pranith
<div class="yqt0243337077" id="yqtfd49805"><br clear="none">
<blockquote type="cite">
<table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top">
<div><br clear="none">
Thanks,<br clear="none">
Anirban</div>
</td></tr></tbody></table>
</blockquote>
<blockquote type="cite">
<div id="_origMsg_">
<div>
<div>
<div style="font-size:0.9em;">
<hr size="1"> <b> <span style="font-weight:bold;">From:</span>
</b> Pranith Kumar Karampuri <a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E">&lt;pkarampu@redhat.com&gt;</a>;
<br clear="none">
<b> <span style="font-weight:bold;">To:</span>
</b> Anirban Ghoshal <a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E">&lt;chalcogen_eg_oxygen@yahoo.com&gt;</a>;
<a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E">&lt;gluster-users@gluster.org&gt;</a>;
<br clear="none">
<b> <span style="font-weight:bold;">Subject:</span>
</b> Re: [Gluster-users] Split-brain
seen with [0 0] pending matrix and
io-cache page errors <br clear="none">
<b> <span style="font-weight:bold;">Sent:</span>
</b> Sun, Oct 19, 2014 5:42:24 AM <br clear="none">
</div>
<br clear="none">
<table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top"> <br clear="none">
<div class="moz-cite-prefix">On
10/18/2014 04:36 PM, Anirban
Ghoshal wrote:<br clear="none">
</div>
<blockquote type="cite">
<table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top">
<div>Hi,<br clear="none">
<br clear="none">
Yes, they do, and
considerably. I'd
forgotten to mention
that in my last email.
Their mtimes, however,
as far as I could tell
on separate servers,
seemed to coincide. <br clear="none">
<br clear="none">
Thanks,<br clear="none">
Anirban</div>
</td></tr></tbody></table>
<div id="_origMsg_">
<div> <br clear="none">
</div>
</div>
</blockquote>
<br clear="none">
Are these files always open? And
is it possible that the file could
have been renamed while one of the
bricks was offline? I know of a
race that can cause this kind of split-brain.
I am just trying to find out whether this is the
same case.<br clear="none">
<br clear="none">
Pranith
<div class="yqt7836717428" id="yqtfd37976"><br clear="none">
<br clear="none">
<blockquote type="cite">
<div id="_origMsg_">
<div>
<div>
<div style="font-size:0.9em;">
<hr size="1"> <b> <span style="font-weight:bold;">From:</span> </b> Pranith Kumar Karampuri <a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E">&lt;pkarampu@redhat.com&gt;</a>;
<br clear="none">
<b> <span style="font-weight:bold;">To:</span>
</b> Anirban Ghoshal <a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E">&lt;chalcogen_eg_oxygen@yahoo.com&gt;</a>;
<a rel="nofollow" shape="rect" class="moz-txt-link-abbreviated">gluster-users@gluster.org</a>
<a rel="nofollow" shape="rect" class="moz-txt-link-rfc2396E">&lt;gluster-users@gluster.org&gt;</a>;
<br clear="none">
<b> <span style="font-weight:bold;">Subject:</span>
</b> Re:
[Gluster-users]
Split-brain seen with
[0 0] pending matrix
and io-cache page
errors <br clear="none">
<b> <span style="font-weight:bold;">Sent:</span>
</b> Sat, Oct 18, 2014
12:26:08 AM <br clear="none">
</div>
<br clear="none">
<table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td colspan="1" rowspan="1" valign="top">
Hi,<br clear="none">
Could you
check whether the size
of the file
differs between the two replicas?<br clear="none">
<br clear="none">
Pranith<br clear="none">
<br clear="none">
<div class="yqt8170167658" id="yqt92241">
<div class="moz-cite-prefix">On
10/18/2014
04:20 AM,
Anirban
Ghoshal wrote:<br clear="none">
</div>
<blockquote type="cite">
<div style="color:#000;background-color:#fff;font-family:Courier New, courier, monaco, monospace, sans-serif;font-size:13px;">
<div class="" style="">Hi
everyone,</div>
<div class="" style=""><br clear="none" class="" style="">
</div>
<div class="" style="color:rgb(0, 0, 0);font-size:13px;font-family:'Courier New', courier, monaco, monospace, sans-serif;font-style:normal;background-color:transparent;">I
have this
really
confusing
split-brain
here that's
bothering me.
I am running
glusterfs
3.4.2 over
linux 2.6.34.
I have a
replica 2
volume
'testvol' that
is It seems I
cannot
read/stat/edit
the file in
question, and
`gluster
volume heal
testvol info
split-brain`
shows nothing.
Here are the
logs from the
fuse-mount for
the volume:</div>
<div class="" style="color:rgb(0, 0, 0);font-size:13px;font-family:'Courier New', courier, monaco, monospace, sans-serif;font-style:normal;background-color:transparent;"><br clear="none" class="" style="">
</div>
<div class="" style="color:rgb(0, 0, 0);font-size:13px;font-family:'Courier New', courier, monaco, monospace, sans-serif;font-style:normal;background-color:transparent;"><font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:53:02.867111]
W
[fuse-bridge.c:1172:fuse_err_cbk]
0-glusterfs-fuse:
4560969:
FLUSH() ERR
=> -1
(Input/output
error) </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
<font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:54:16.007799]
W
[page.c:991:__ioc_page_error]
0-testvol-io-cache:
page error for
page =
0x7fd5c8529d20
& waitq =
0x7fd5c8067d40 </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
<font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:54:16.007854]
W
[fuse-bridge.c:2089:fuse_readv_cbk]
0-glusterfs-fuse:
4561103: READ
=> -1
(Input/output
error) </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
<font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:54:16.008018]
W
[page.c:991:__ioc_page_error]
0-testvol-io-cache:
page error for
page =
0x7fd5c8607ee0
& waitq =
0x7fd5c8067d40 </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
<font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:54:16.008056]
W
[fuse-bridge.c:2089:fuse_readv_cbk]
0-glusterfs-fuse:
4561104: READ
=> -1
(Input/output
error) </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
<font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:54:16.008233]
W
[page.c:991:__ioc_page_error]
0-testvol-io-cache:
page error for
page =
0x7fd5c8066f30
& waitq =
0x7fd5c8067d40 </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
<font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:54:16.008269]
W
[fuse-bridge.c:2089:fuse_readv_cbk]
0-glusterfs-fuse:
4561105: READ
=> -1
(Input/output
error) </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
<font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:54:16.008800]
W
[page.c:991:__ioc_page_error]
0-testvol-io-cache:
page error for
page =
0x7fd5c860bcf0
& waitq =
0x7fd5c863b1f0 </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
<font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:54:16.008839]
W
[fuse-bridge.c:2089:fuse_readv_cbk]
0-glusterfs-fuse:
4561107: READ
=> -1
(Input/output
error) </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
<font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:54:16.009365]
W
[page.c:991:__ioc_page_error]
0-testvol-io-cache:
page error for
page =
0x7fd5c85fd120
& waitq =
0x7fd5c8067d40 </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
<font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:54:16.009413]
W
[fuse-bridge.c:2089:fuse_readv_cbk]
0-glusterfs-fuse:
4561109: READ
=> -1
(Input/output
error) </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
<font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:54:16.040549]
W
[afr-open.c:213:afr_open]
0-testvol-replicate-0:
failed to open
as split brain
seen,
returning EIO </font><br clear="none" class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">
<font class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;">[2014-09-29
07:54:16.040594]
W
[fuse-bridge.c:915:fuse_fd_cbk]
0-glusterfs-fuse:
4561142:
OPEN()
/SECLOG/20140908.d/SECLOG_00000000000000427425_00000000000000000000.log
=> -1
(Input/output
error)</font><br clear="none" class="" style="">
</div>
<div class="" style="color:rgb(0, 0, 0);font-size:13px;font-family:'Courier New', courier, monaco, monospace, sans-serif;font-style:normal;background-color:transparent;"><br clear="none" class="" style="">
</div>
<div class="" style="color:rgb(0, 0, 0);font-size:13px;font-style:normal;background-color:transparent;"><span style="font-family:verdana, helvetica, sans-serif;">Could
somebody
please give me
some clue on
where to
begin? I
checked the
xattrs on <span class="">/SECLOG/20140908.d/SECLOG_00000000000000427425_00000000000000000000.log
and it seems
the changelogs
are [0, 0] on
both replicas,
and the gfid's
match.</span></span></div>
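<div>For anyone repeating this check, here is a rough sketch of decoding
such a changelog xattr. It assumes the usual 12-byte trusted.afr.* value
holding three big-endian 32-bit pending counters (data, metadata, entry);
the function name is made up for illustration and is not a GlusterFS API.</div>
<pre>
```python
import struct


def decode_afr_changelog(value):
    """Split a 12-byte AFR changelog xattr into its pending counters.

    Assumed layout: three big-endian unsigned 32-bit integers in the
    order (data, metadata, entry).
    """
    if len(value) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(value))
    return struct.unpack(">III", value)


if __name__ == "__main__":
    # An all-zero value means nothing is pending against that replica.
    print(decode_afr_changelog(bytes(12)))
```
</pre>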
<div class="" style="color:rgb(0, 0, 0);font-size:13px;font-style:normal;font-family:verdana, helvetica, sans-serif;background-color:transparent;"><span style="font-family:verdana, helvetica, sans-serif;"><span class=""><br clear="none">
</span></span></div>
<div class="" style="color:rgb(0, 0, 0);font-size:13px;font-style:normal;font-family:verdana, helvetica, sans-serif;background-color:transparent;"><span style="font-family:verdana, helvetica, sans-serif;"><span class="">Thank
you very much
for any help
on this.</span></span></div>
<div class="" style="color:rgb(0, 0, 0);font-size:13px;font-style:normal;font-family:verdana, helvetica, sans-serif;background-color:transparent;"><span style="font-family:verdana, helvetica, sans-serif;"><span class="">Anirban</span></span></div>
<div class="" style="color:rgb(0, 0, 0);font-size:13px;font-family:Verdana, Arial, Helvetica, sans-serif;font-style:normal;background-color:transparent;"><span class="" style="font-family:Verdana, Arial, Helvetica, sans-serif;"><br clear="none">
</span></div>
<div class="" style="color:rgb(0, 0, 0);font-size:13px;font-family:'Courier New', courier, monaco, monospace, sans-serif;font-style:normal;background-color:transparent;"><br clear="none" class="" style="">
</div>
<div class="" style="color:rgb(0, 0, 0);font-size:13px;font-family:'Courier New', courier, monaco, monospace, sans-serif;font-style:normal;background-color:transparent;"><br clear="none" class="" style="">
</div>
</div>
<br clear="none">
<br clear="none">
<pre>_______________________________________________
Gluster-users mailing list
<a rel="nofollow" shape="rect" class="moz-txt-link-abbreviated">Gluster-users@gluster.org</a>
<a rel="nofollow" shape="rect" class="moz-txt-link-freetext" target="_blank" href="http://supercolony.gluster.org/mailman/listinfo/gluster-users">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a></pre>
</blockquote>
</div>
<br clear="none">
</td></tr></tbody></table>
</div>
</div>
</div>
</blockquote>
<br clear="none">
</div>
</td></tr></tbody></table>
</div>
</div>
</div>
</blockquote>
<br clear="none">
</div>
</td></tr></tbody></table>
</div>
</div>
</div>
</blockquote>
<br clear="none">
</div></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>