<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<div class="moz-cite-prefix">On 10/19/2014 01:36 PM, Anirban Ghoshal
wrote:<br>
</div>
<blockquote
cite="mid:1413705960.93919.YahooMailMobile@web193905.mail.sg3.yahoo.com"
type="cite">
<table border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<td valign="top">
<div>It is possible, yes, because these are actually a
kind of log file. I suppose that, as with other logging
frameworks, these files can remain open for a considerable
period and then get renamed to support log-rotation
semantics. <br>
<br>
That said, I might need to check with the team that
actually manages the logging framework to be sure. I
only take care of the file-system stuff. I can tell you
for sure on Monday. <br>
<br>
If it is the same race that you mention, is there a fix
for it?<br>
<br>
Thanks,<br>
Anirban</div>
</td>
</tr>
</tbody>
</table>
<div id="_origMsg_">
<div> <br>
</div>
</div>
</blockquote>
I am working on the fix.<br>
<br>
RCA:<br>
0) Let's say the file 'abc.log' is opened for writing on replica pair
(brick-0, brick-1)<br>
1) brick-0 goes down<br>
2) abc.log is renamed to abc.log.1<br>
3) brick-0 comes back up<br>
4) re-open on old abc.log happens from mount to brick-0<br>
5) self-heal kicks in, deletes the old abc.log, and creates and syncs
abc.log.1<br>
6) But the mount is still writing to the deleted 'old abc.log' on
brick-0, so abc.log.1 on brick-0 stays at the same size while abc.log.1
on brick-1 keeps growing. This leads to a size-mismatch
split-brain on abc.log.1.<br>
<br>
The race is between steps 4) and 5). If 5) happens before 4), no
split-brain will be observed.<br>
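<br>
The ordering dependence can be illustrated with a toy model. This is only a sketch and not GlusterFS code: the Brick class, the simulate function, and the byte counts are hypothetical, and a brick is reduced to a name table plus per-inode sizes.<br>
<pre>
```python
# Toy model of the re-open vs. self-heal race. Not GlusterFS code;
# every name here is hypothetical.

class Brick:
    """A brick reduced to a name table plus per-inode sizes."""

    def __init__(self):
        self.names = {}   # file name to inode id
        self.sizes = {}   # inode id to size in bytes

    def create(self, name, size=0):
        inode = len(self.sizes) + 1
        self.names[name] = inode
        self.sizes[inode] = size
        return inode

    def size_of(self, name):
        return self.sizes[self.names[name]]


def simulate(heal_before_reopen):
    brick0, brick1 = Brick(), Brick()

    # Step 0: abc.log is open for writing on both bricks.
    brick0.create("abc.log")
    fd1 = brick1.create("abc.log")

    # Steps 1-3: brick-0 is down during the rename, so only brick-1 sees it.
    brick1.names["abc.log.1"] = brick1.names.pop("abc.log")

    def reopen():
        # Step 4: the mount re-opens whatever inode brick-0 has for the file.
        names = brick0.names
        return names["abc.log"] if "abc.log" in names else names["abc.log.1"]

    def heal():
        # Step 5: unlink the stale name, then create and sync abc.log.1.
        # The old inode survives the unlink for as long as it stays open.
        brick0.names.pop("abc.log")
        brick0.create("abc.log.1", size=brick1.size_of("abc.log.1"))

    if heal_before_reopen:
        heal()
        fd0 = reopen()
    else:
        fd0 = reopen()
        heal()

    # Step 6: the application appends 10 bytes through both open fds.
    brick0.sizes[fd0] += 10
    brick1.sizes[fd1] += 10
    return brick0.size_of("abc.log.1"), brick1.size_of("abc.log.1")
```
</pre>
In this model, simulate(False) (re-open before heal) leaves abc.log.1 at different sizes on the two bricks, while simulate(True) (heal before re-open) keeps them equal.<br>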
<br>
Work-around:<br>
<br>
0) Take a backup of the good abc.log.1 file from brick-1. (Just being
paranoid.)<br>
<br>
Do either of the following two steps to make sure the stale open file
is closed:<br>
1-a) Take down the brick process that holds the bad file using kill -9
&lt;brick-pid&gt; (brick-0 in my example).<br>
1-b) Introduce a temporary disconnect between the mount and brick-0.<br>
(I would choose 1-a.)<br>
2) Remove the bad file (abc.log.1) and its gfid backend file from
brick-0.<br>
3) Bring the brick back up (gluster volume start &lt;volname&gt;
force) or restore the connection, and let it heal by doing 'stat' on the
file abc.log.1 on the mount.<br>
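<br>
For step 2, a minimal sketch of locating the gfid backend file. It assumes the usual brick layout, in which the gfid hard link sits under .glusterfs in two directory levels named after the first two and next two hex characters of the gfid, and it assumes the gfid was read as a hex string (for example from the trusted.gfid xattr). The function name and the brick path are hypothetical; please verify the path on your own brick before removing anything.<br>
<pre>
```python
import posixpath
import uuid


def gfid_backend_path(brick_root, gfid):
    """Return the assumed .glusterfs path of a gfid on a brick."""
    # Hex dumps of the xattr usually carry a leading "0x"; drop it if present.
    if gfid.startswith("0x"):
        gfid = gfid[2:]
    # uuid.UUID accepts 32 hex digits with or without hyphens and
    # str() gives the lowercase 8-4-4-4-12 form.
    canonical = str(uuid.UUID(gfid))
    return posixpath.join(
        brick_root, ".glusterfs", canonical[0:2], canonical[2:4], canonical
    )


if __name__ == "__main__":
    # Hypothetical brick root and gfid, for illustration only.
    print(gfid_backend_path("/bricks/b0", "0x0123456789abcdef0123456789abcdef"))
```
</pre>
The printed path is what would be removed from brick-0 together with abc.log.1, after the backup in step 0 and with the brick process stopped.<br>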
<br>
This bug has existed since 2012, when I first implemented
rename/hard-link self-heal. It is difficult to reproduce. I have to
put breakpoints at several places in the process to hit the race.<br>
<br>
Pranith<br>
<blockquote
cite="mid:1413705960.93919.YahooMailMobile@web193905.mail.sg3.yahoo.com"
type="cite">
<div id="_origMsg_">
<div>
<div>
<div style="font-size:0.9em">
<hr size="1"> <b> <span style="font-weight:bold">From:</span>
</b> Pranith Kumar Karampuri <a class="moz-txt-link-rfc2396E" href="mailto:pkarampu@redhat.com">&lt;pkarampu@redhat.com&gt;</a>;
<br>
<b> <span style="font-weight:bold">To:</span> </b>
Anirban Ghoshal <a class="moz-txt-link-rfc2396E" href="mailto:chalcogen_eg_oxygen@yahoo.com">&lt;chalcogen_eg_oxygen@yahoo.com&gt;</a>;
<a class="moz-txt-link-rfc2396E" href="mailto:gluster-users@gluster.org">&lt;gluster-users@gluster.org&gt;</a>; <br>
<b> <span style="font-weight:bold">Subject:</span> </b>
Re: [Gluster-users] Split-brain seen with [0 0] pending
matrix and io-cache page errors <br>
<b> <span style="font-weight:bold">Sent:</span> </b>
Sun, Oct 19, 2014 5:42:24 AM <br>
</div>
<br>
<table border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<td valign="top"> <br clear="none">
<div class="moz-cite-prefix">On 10/18/2014 04:36 PM,
Anirban Ghoshal wrote:<br clear="none">
</div>
<blockquote type="cite">
<table border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<td colspan="1" rowspan="1" valign="top">
<div>Hi,<br clear="none">
<br clear="none">
Yes, they do, and considerably. I'd
forgotten to mention that in my last
email. Their mtimes, however, as far as
I could tell on separate servers, seemed
to coincide. <br clear="none">
<br clear="none">
Thanks,<br clear="none">
Anirban</div>
</td>
</tr>
</tbody>
</table>
<div id="_origMsg_">
<div> <br clear="none">
</div>
</div>
</blockquote>
<br clear="none">
Are these files always open? And is it possible that
the file was renamed while one of the
bricks was offline? I know of a race that can
cause this. I am just trying to find out whether it is the
same case.<br clear="none">
<br clear="none">
Pranith
<div class="yqt7836717428" id="yqtfd37976"><br
clear="none">
<br clear="none">
<blockquote type="cite">
<div id="_origMsg_">
<div>
<div>
<div style="font-size:0.9em;">
<hr size="1"> <b> <span
style="font-weight:bold;">From:</span>
</b> Pranith Kumar Karampuri <a
moz-do-not-send="true" rel="nofollow"
shape="rect"
class="moz-txt-link-rfc2396E"
ymailto="mailto:pkarampu@redhat.com"
target="_blank"
href="javascript:return">&lt;pkarampu@redhat.com&gt;</a>;
<br clear="none">
<b> <span style="font-weight:bold;">To:</span>
</b> Anirban Ghoshal <a
moz-do-not-send="true" rel="nofollow"
shape="rect"
class="moz-txt-link-rfc2396E"
ymailto="mailto:chalcogen_eg_oxygen@yahoo.com"
target="_blank"
href="javascript:return">&lt;chalcogen_eg_oxygen@yahoo.com&gt;</a>;
<a moz-do-not-send="true" rel="nofollow"
shape="rect"
class="moz-txt-link-abbreviated"
ymailto="mailto:gluster-users@gluster.org"
target="_blank"
href="javascript:return">gluster-users@gluster.org</a>
<a moz-do-not-send="true" rel="nofollow"
shape="rect"
class="moz-txt-link-rfc2396E"
ymailto="mailto:gluster-users@gluster.org"
target="_blank"
href="javascript:return">&lt;gluster-users@gluster.org&gt;</a>;
<br clear="none">
<b> <span style="font-weight:bold;">Subject:</span>
</b> Re: [Gluster-users] Split-brain
seen with [0 0] pending matrix and
io-cache page errors <br clear="none">
<b> <span style="font-weight:bold;">Sent:</span>
</b> Sat, Oct 18, 2014 12:26:08 AM <br
clear="none">
</div>
<br clear="none">
<table border="0" cellpadding="0"
cellspacing="0">
<tbody>
<tr>
<td colspan="1" rowspan="1"
valign="top"> Hi,<br clear="none">
Could you check whether the size of
the file differs between the replicas?<br
clear="none">
<br clear="none">
Pranith<br clear="none">
<br clear="none">
<div class="yqt8170167658"
id="yqt92241">
<div class="moz-cite-prefix">On
10/18/2014 04:20 AM, Anirban
Ghoshal wrote:<br clear="none">
</div>
<blockquote type="cite">
<div
style="color:#000;background-color:#fff;font-family:Courier
New, courier, monaco,
monospace,
sans-serif;font-size:13px;">
<div class="" style="">Hi
everyone,</div>
<div class="" style=""><br
class="" style=""
clear="none">
</div>
<div class=""
style="color:rgb(0, 0,
0);font-size:13px;font-family:'Courier
New', courier, monaco,
monospace,
sans-serif;font-style:normal;background-color:transparent;">I
have this really confusing
split-brain here that's
bothering me. I am running
GlusterFS 3.4.2 on Linux
2.6.34. I have a replica 2
volume 'testvol' that is
It seems I cannot
read/stat/edit the file in
question, and `gluster
volume heal testvol info
split-brain` shows
nothing. Here are the logs
from the fuse-mount for
the volume:</div>
<div class=""
style="color:rgb(0, 0,
0);font-size:13px;font-family:'Courier
New', courier, monaco,
monospace,
sans-serif;font-style:normal;background-color:transparent;"><br
class="" style=""
clear="none">
</div>
<div class=""
style="color:rgb(0, 0,
0);font-size:13px;font-family:'Courier
New', courier, monaco,
monospace,
sans-serif;font-style:normal;background-color:transparent;"><font
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:53:02.867111] W
[fuse-bridge.c:1172:fuse_err_cbk]
0-glusterfs-fuse:
4560969: FLUSH() ERR
=> -1 (Input/output
error) </font><br
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"
clear="none">
<font class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:54:16.007799] W
[page.c:991:__ioc_page_error]
0-testvol-io-cache: page
error for page =
0x7fd5c8529d20 &
waitq = 0x7fd5c8067d40 </font><br
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"
clear="none">
<font class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:54:16.007854] W
[fuse-bridge.c:2089:fuse_readv_cbk]
0-glusterfs-fuse:
4561103: READ => -1
(Input/output error) </font><br
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"
clear="none">
<font class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:54:16.008018] W
[page.c:991:__ioc_page_error]
0-testvol-io-cache: page
error for page =
0x7fd5c8607ee0 &
waitq = 0x7fd5c8067d40 </font><br
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"
clear="none">
<font class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:54:16.008056] W
[fuse-bridge.c:2089:fuse_readv_cbk]
0-glusterfs-fuse:
4561104: READ => -1
(Input/output error) </font><br
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"
clear="none">
<font class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:54:16.008233] W
[page.c:991:__ioc_page_error]
0-testvol-io-cache: page
error for page =
0x7fd5c8066f30 &
waitq = 0x7fd5c8067d40 </font><br
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"
clear="none">
<font class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:54:16.008269] W
[fuse-bridge.c:2089:fuse_readv_cbk]
0-glusterfs-fuse:
4561105: READ => -1
(Input/output error) </font><br
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"
clear="none">
<font class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:54:16.008800] W
[page.c:991:__ioc_page_error]
0-testvol-io-cache: page
error for page =
0x7fd5c860bcf0 &
waitq = 0x7fd5c863b1f0 </font><br
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"
clear="none">
<font class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:54:16.008839] W
[fuse-bridge.c:2089:fuse_readv_cbk]
0-glusterfs-fuse:
4561107: READ => -1
(Input/output error) </font><br
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"
clear="none">
<font class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:54:16.009365] W
[page.c:991:__ioc_page_error]
0-testvol-io-cache: page
error for page =
0x7fd5c85fd120 &
waitq = 0x7fd5c8067d40 </font><br
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"
clear="none">
<font class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:54:16.009413] W
[fuse-bridge.c:2089:fuse_readv_cbk]
0-glusterfs-fuse:
4561109: READ => -1
(Input/output error) </font><br
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"
clear="none">
<font class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:54:16.040549] W
[afr-open.c:213:afr_open]
0-testvol-replicate-0:
failed to open as split
brain seen, returning
EIO </font><br class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"
clear="none">
<font class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;">[2014-09-29
07:54:16.040594] W
[fuse-bridge.c:915:fuse_fd_cbk]
0-glusterfs-fuse:
4561142: OPEN()
/SECLOG/20140908.d/SECLOG_00000000000000427425_00000000000000000000.log
=> -1 (Input/output
error)</font><br
class="" style=""
clear="none">
</div>
<div class=""
style="color:rgb(0, 0,
0);font-size:13px;font-family:'Courier
New', courier, monaco,
monospace,
sans-serif;font-style:normal;background-color:transparent;"><br
class="" style=""
clear="none">
</div>
<div class=""
style="color:rgb(0, 0,
0);font-size:13px;font-style:normal;background-color:transparent;"><span
style="font-family:verdana,
helvetica, sans-serif;">Could
somebody please give me
some clue on where to
begin? I checked the
xattrs on <span class="">/SECLOG/20140908.d/SECLOG_00000000000000427425_00000000000000000000.log
and it seems the
changelogs are [0, 0]
on both replicas, and
the gfids match.</span></span></div>
<div class=""
style="color:rgb(0, 0,
0);font-size:13px;font-style:normal;font-family:verdana,
helvetica,
sans-serif;background-color:transparent;"><span
style="font-family:verdana,
helvetica, sans-serif;"><span
class=""><br
clear="none">
</span></span></div>
<div class=""
style="color:rgb(0, 0,
0);font-size:13px;font-style:normal;font-family:verdana,
helvetica,
sans-serif;background-color:transparent;"><span
style="font-family:verdana,
helvetica, sans-serif;"><span
class="">Thank you
very much for any help
on this.</span></span></div>
<div class=""
style="color:rgb(0, 0,
0);font-size:13px;font-style:normal;font-family:verdana,
helvetica,
sans-serif;background-color:transparent;"><span
style="font-family:verdana,
helvetica, sans-serif;"><span
class="">Anirban</span></span></div>
<div class=""
style="color:rgb(0, 0,
0);font-size:13px;font-family:Verdana,
Arial, Helvetica,
sans-serif;font-style:normal;background-color:transparent;"><span
class=""
style="font-family:Verdana,
Arial, Helvetica,
sans-serif;"><br
clear="none">
</span></div>
<div class=""
style="color:rgb(0, 0,
0);font-size:13px;font-family:'Courier
New', courier, monaco,
monospace,
sans-serif;font-style:normal;background-color:transparent;"><br
class="" style=""
clear="none">
</div>
<div class=""
style="color:rgb(0, 0,
0);font-size:13px;font-family:'Courier
New', courier, monaco,
monospace,
sans-serif;font-style:normal;background-color:transparent;"><br
class="" style=""
clear="none">
</div>
</div>
<br clear="none">
<fieldset
class="mimeAttachmentHeader"></fieldset>
<br clear="none">
<pre>_______________________________________________
Gluster-users mailing list
<a moz-do-not-send="true" rel="nofollow" shape="rect" class="moz-txt-link-abbreviated">Gluster-users@gluster.org</a>
<a moz-do-not-send="true" rel="nofollow" shape="rect" class="moz-txt-link-freetext" target="_blank" href="http://supercolony.gluster.org/mailman/listinfo/gluster-users">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a></pre>
</blockquote>
</div>
<br clear="none">
</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</blockquote>
<br clear="none">
</div>
</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>