<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<div class="moz-cite-prefix">On 08/05/2014 02:06 PM, Roman wrote:<br>
</div>
<blockquote
cite="mid:CAFR=TBo=b+c7OCOc-fO-Q+yE1TDh3ZxEZ4sgPQnmLS5NsSNbuw@mail.gmail.com"
type="cite">
<div dir="ltr">Well, it seems like it doesn't see that changes were
made to the volume? I created two files, 200 MB and 100 MB (from
/dev/zero), after I disconnected the first brick. Then I connected
it back and got these logs:
<div>
<br>
</div>
<div>
<div>[2014-08-05 08:30:37.830150] I
[glusterfsd-mgmt.c:1584:mgmt_getspec_cbk] 0-glusterfs: No
change in volfile, continuing</div>
<div>[2014-08-05 08:30:37.830207] I
[rpc-clnt.c:1676:rpc_clnt_reconfig]
0-HA-fast-150G-PVE1-client-0: changing port to 49153 (from
0)</div>
<div>[2014-08-05 08:30:37.830239] W
[socket.c:514:__socket_rwv] 0-HA-fast-150G-PVE1-client-0:
readv failed (No data available)</div>
<div>[2014-08-05 08:30:37.831024] I
[client-handshake.c:1659:select_server_supported_programs]
0-HA-fast-150G-PVE1-client-0: Using Program GlusterFS 3.3,
Num (1298437), Version (330)</div>
<div>[2014-08-05 08:30:37.831375] I
[client-handshake.c:1456:client_setvolume_cbk]
0-HA-fast-150G-PVE1-client-0: Connected to <a
moz-do-not-send="true" href="http://10.250.0.1:49153">10.250.0.1:49153</a>,
attached to remote volume '/exports/fast-test/150G'.</div>
<div>[2014-08-05 08:30:37.831394] I
[client-handshake.c:1468:client_setvolume_cbk]
0-HA-fast-150G-PVE1-client-0: Server and Client lk-version
numbers are not same, reopening the fds</div>
<div>[2014-08-05 08:30:37.831566] I
[client-handshake.c:450:client_set_lk_version_cbk]
0-HA-fast-150G-PVE1-client-0: Server lk version = 1</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div>[2014-08-05 08:30:37.830150] I
[glusterfsd-mgmt.c:1584:mgmt_getspec_cbk] 0-glusterfs: No
change in volfile, continuing<br>
</div>
<div>This line seems weird to me, to be honest.</div>
<div>I do not see any traffic on the switch interfaces between the
gluster servers, which means there is no syncing between
them.</div>
<div>I tried to ls -l the files on the client and servers to
trigger the healing, but with no success, it seems. Should I wait
longer?</div>
</div>
</blockquote>
Yes, it should take around 10-15 minutes. Could you provide the output of
'getfattr -d -m. -e hex &lt;file-on-brick&gt;' from both bricks?<br>
<br>
Pranith<br>
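<div>For reading that getfattr output, here is a minimal Python sketch. It
assumes each trusted.afr.* value is 12 bytes holding three big-endian
32-bit pending counters in the order data, metadata, entry. The volume
name VOL, the file path and the sample values are made up for
illustration and are not taken from this thread.</div>
<pre>
```python
import struct


def parse_afr_changelog(hex_value):
    """Split a trusted.afr.* value (as printed by getfattr -e hex) into
    its data, metadata and entry pending counters."""
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    # Assumption: three unsigned 32-bit big-endian counters.
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def pending_from_getfattr(text):
    """Collect the trusted.afr.* counters from one getfattr dump."""
    pending = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("trusted.afr."):
            continue  # skips the "# file:" header and other xattrs
        name, _, value = line.partition("=")
        pending[name] = parse_afr_changelog(value)
    return pending


# Hypothetical sample dump, NOT real output from the bricks in this thread.
SAMPLE = """\
# file: path/to/file-on-brick
trusted.afr.VOL-client-0=0x000000000000000000000000
trusted.afr.VOL-client-1=0x0000003c0000000000000000
"""

if __name__ == "__main__":
    for name, counters in pending_from_getfattr(SAMPLE).items():
        print(name, counters)
```
</pre>
<div>Run it against the dump from each brick: non-zero counters that a
brick holds for the other brick's client-N entry suggest operations are
still pending against that brick.</div>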
<blockquote
cite="mid:CAFR=TBo=b+c7OCOc-fO-Q+yE1TDh3ZxEZ4sgPQnmLS5NsSNbuw@mail.gmail.com"
type="cite">
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">2014-08-05 11:25 GMT+03:00 Pranith
Kumar Karampuri <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;</span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div class=""> <br>
<div>On 08/05/2014 01:10 PM, Roman wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Ahha! For some reason I was not able to
start the VM anymore; Proxmox VE told me that it was
not able to read the qcow2 header because permission
was denied. So I just deleted that
file and created a new VM. And the next message I
got was this:</div>
</blockquote>
</div>
These look like the messages from when you took down the
bricks before self-heal had completed. Could you restart the run, waiting
for self-heals to complete before taking down the next
brick?<span class="HOEnZb"><font color="#888888"><br>
<br>
Pranith</font></span>
<div>
<div class="h5"><br>
<blockquote type="cite">
<div dir="ltr">
<div> <br>
<div><br>
</div>
<div>
<div>[2014-08-05 07:31:25.663412] E
[afr-self-heal-common.c:197:afr_sh_print_split_brain_log]
0-HA-fast-150G-PVE1-replicate-0: Unable to
self-heal contents of
'/images/124/vm-124-disk-1.qcow2' (possible
split-brain). Please delete the file from
all but the preferred subvolume.- Pending
matrix: [ [ 0 60 ] [ 11 0 ] ]</div>
<div>[2014-08-05 07:31:25.663955] E
[afr-self-heal-common.c:2262:afr_self_heal_completion_cbk]
0-HA-fast-150G-PVE1-replicate-0: background
data self-heal failed on
/images/124/vm-124-disk-1.qcow2</div>
</div>
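<div>A small Python sketch for reading the "Pending matrix" in the log
above. It assumes row i holds what brick i has recorded as pending
against each brick, so that two bricks each holding a non-zero count
against the other is the split-brain condition the message warns about.
The function names are made up for illustration.</div>
<pre>
```python
import re


def parse_pending_matrix(log_line):
    """Extract the 'Pending matrix' from an AFR log line as a list of rows."""
    _, marker, tail = log_line.partition("Pending matrix:")
    if not marker:
        raise ValueError("no pending matrix in line")
    # Each inner "[ a b ]" group is one row of the matrix.
    rows = re.findall(r"\[([\d\s]+)\]", tail)
    return [[int(n) for n in row.split()] for row in rows]


def mutual_accusations(matrix):
    """Return index pairs (i, j) where brick i and brick j each record
    pending operations against the other."""
    pairs = []
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            if matrix[i][j] > 0 and matrix[j][i] > 0:
                pairs.append((i, j))
    return pairs


# Tail of the log message quoted in this thread.
LINE = ("Please delete the file from all but the preferred subvolume.- "
        "Pending matrix: [ [ 0 60 ] [ 11 0 ] ]")

if __name__ == "__main__":
    matrix = parse_pending_matrix(LINE)
    print(matrix, mutual_accusations(matrix))
```
</pre>
<div>For the matrix in this log, both off-diagonal entries are non-zero,
which under that assumption means each brick blames the other.</div>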
<div><br>
</div>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">2014-08-05 10:13
GMT+03:00 Pranith Kumar Karampuri <span
dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>&gt;</span>:<br>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> I just
responded to your earlier mail about what the
log looks like. The message appears in the mount's
log file.<span><font color="#888888"><br>
<br>
Pranith</font></span>
<div>
<div><br>
<div>On 08/05/2014 12:41 PM, Roman
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Ok, so I've waited
long enough, I think. There was no traffic
on the switch ports between the servers.
I could not find any suitable log
message about a completed self-heal
(I waited about 30 minutes). I unplugged
the other server's UTP cable
this time and ended up in the same
situation:
<div>
<div>root@gluster-test1:~# cat
/var/log/dmesg</div>
<div>-bash: /bin/cat: Input/output
error</div>
</div>
<div><br>
</div>
<div>brick logs:</div>
<div>
<div>[2014-08-05 07:09:03.005474]
I
[server.c:762:server_rpc_notify]
0-HA-fast-150G-PVE1-server:
disconnecting connectionfrom
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0</div>
<div>[2014-08-05 07:09:03.005530]
I
[server-helpers.c:729:server_connection_put]
0-HA-fast-150G-PVE1-server:
Shutting down connection
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0</div>
<div>[2014-08-05 07:09:03.005560]
I
[server-helpers.c:463:do_fd_cleanup]
0-HA-fast-150G-PVE1-server: fd
cleanup on
/images/124/vm-124-disk-1.qcow2</div>
<div>[2014-08-05 07:09:03.005797]
I
[server-helpers.c:617:server_connection_destroy]
0-HA-fast-150G-PVE1-server:
destroyed connection of
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">2014-08-05
9:53 GMT+03:00 Pranith Kumar
Karampuri <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>&gt;</span>:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div text="#000000"
bgcolor="#FFFFFF"> Would it be
possible for you
to run these tests on the
latest version, 3.5.2? 'gluster
volume heal &lt;volname&gt;
info' gives you that
information in versions &gt;
3.5.1.<br>
Otherwise you will have to
check it either from the logs
(there will be a self-heal
completed message in the mount
logs) or by observing
'getfattr -d -m. -e hex
&lt;image-file-on-bricks&gt;'.<span><font
color="#888888"><br>
<br>
Pranith</font></span>
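<div>As a rough way to check the mount log for self-heal results, a
Python sketch. The "failed" wording matches the log line quoted elsewhere
in this thread; the "completed" wording is an assumption and may differ
between versions, so adjust the pattern to what your mount log actually
prints.</div>
<pre>
```python
import re

# "failed" wording is taken from the log quoted in this thread;
# "completed" wording is an assumption.
HEAL_RE = re.compile(
    r"background\s+([\w\s-]+?)\s+self-heal (completed|failed) on (\S+)"
)


def heal_results(log_lines):
    """Map each path to the last self-heal outcome seen in the log."""
    results = {}
    for line in log_lines:
        match = HEAL_RE.search(line)
        if match:
            results[match.group(3)] = match.group(2)
    return results


def read_mount_log(path):
    """Return the heal outcomes found in the mount log file at path."""
    with open(path, "r", errors="replace") as handle:
        return heal_results(handle)
```
</pre>
<div>Call read_mount_log() with the path of the client's mount log; a
path that only ever shows "failed" has not healed.</div>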
<div>
<div><br>
<br>
<div>On 08/05/2014 12:09
PM, Roman wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Ok, I
understand. I will try
this shortly.
<div>How can I be
sure that the healing
process is done if
I am not able to see
its status?</div>
</div>
<div class="gmail_extra"><br>
<br>
<div
class="gmail_quote">2014-08-05
9:30 GMT+03:00
Pranith Kumar
Karampuri <span
dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;</span>:<br>
<blockquote
class="gmail_quote"
style="margin:0 0
0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div
text="#000000"
bgcolor="#FFFFFF">
Mounts will do
the healing, not
the
self-heal-daemon.
The issue, I
feel, is that
whichever
process does the
healing needs to
have the latest
information
about the good
bricks in this
use case. Since
in the VM use case
mounts should
have the latest
information, we
should let the
mounts do the
healing. If the
mount accesses
the VM image,
either through
someone doing
operations
inside the VM or
through an explicit
stat on
the file, it
should do the
healing.<span><font
color="#888888"><br>
<br>
Pranith.</font></span>
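<div>The "explicit stat on the file" can be scripted. A minimal Python
sketch, assuming the volume is mounted on the client and that root is a
directory on that mount (not on a brick). Whether a stat actually starts
a heal depends on the setup discussed in this thread.</div>
<pre>
```python
import os


def stat_tree(root):
    """stat() every file under root, as seen through the mount, and
    return the number of files that could be stat-ed."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                os.stat(path)
                count += 1
            except OSError as err:
                # For example an Input/output error on a split-brain file.
                print("stat failed on %s: %s" % (path, err))
    return count
```
</pre>
<div>Pass the mount point of the volume as root. Files that raise an
error here are the ones to inspect on the bricks.</div>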
<div>
<div><br>
<br>
<div>On
08/05/2014
10:39 AM,
Roman wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">Hmmm,
you told me to
turn it off.
Did I
misunderstand
something?
After I
issued the
command you
sent me, I was
not able to
watch the
healing
process; it
said it won't
be healed
because it's
turned off.</div>
<div
class="gmail_extra"><br>
<br>
<div
class="gmail_quote">2014-08-05
5:39 GMT+03:00
Pranith Kumar
Karampuri <span
dir="ltr">&lt;<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;</span>:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div
text="#000000"
bgcolor="#FFFFFF"> You didn't mention anything about self-healing. Did
you wait until
the self-heal
is complete?<span><font
color="#888888"><br>
<br>
Pranith</font></span>
<div>
<div><br>
<div>On
08/04/2014
05:49 PM,
Roman wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">
<div>Hi!</div>
<div>The result is
pretty much the same. I
set the switch
port down for the
1st server, and it
was ok. Then I
brought it back up
and set the other
server's port
down, and it
triggered IO
errors on two
virtual
machines: one
with a local
root FS but
network-mounted
storage, and the
other with a
network root
FS. The 1st gave
an error on
copying to or
from the
mounted
network disk;
the other
gave me an
error even when
reading
log files.</div>
<div><br>
</div>
<div>
<div>cat:
/var/log/alternatives.log:
Input/output
error<br>
</div>
<div>Then I
reset the KVM
VM and it told
me there was
no boot
device. Next I
virtually
powered it off
and then back
on, and it
booted.</div>
<div><br>
</div>
<div>By the
way, was I
supposed to
start/stop the
volume?</div>
</div>
<div><br>
</div>
<div>>> <span
style="font-family:arial,sans-serif;font-size:13px">Could you do the
following and
test it again?</span></div>
<span
style="font-family:arial,sans-serif;font-size:13px">>>
gluster volume
set
&lt;volname&gt;
cluster.self-heal-daemon
off</span><br
style="font-family:arial,sans-serif;font-size:13px">
<br
style="font-family:arial,sans-serif;font-size:13px">
<span
style="font-family:arial,sans-serif;font-size:13px">>>Pranith</span>
<div><br>
</div>
<div><br>
</div>
</div>
<div
class="gmail_extra"><br>
<br>
<div
class="gmail_quote">2014-08-04
14:10
GMT+03:00
Pranith Kumar
Karampuri <span
dir="ltr">&lt;<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;</span>:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div
text="#000000"
bgcolor="#FFFFFF">
<div> <br>
<div>On
08/04/2014
03:33 PM,
Roman wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">
<div><span
style="font-family:arial,sans-serif;font-size:13px">Hello!</span></div>
<div><span
style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<span
style="font-family:arial,sans-serif;font-size:13px">Facing
the same
problem as
mentioned
here:</span>
<div
style="font-family:arial,sans-serif;font-size:13px">
<br>
</div>
<div
style="font-family:arial,sans-serif;font-size:13px"><a
moz-do-not-send="true"
href="http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html"
target="_blank">http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html</a></div>
<div
style="font-family:arial,sans-serif;font-size:13px"><br>
</div>
<div
style="font-family:arial,sans-serif;font-size:13px">My
setup is up
and running,
so I'm ready
to help you
with feedback.</div>
<div
style="font-family:arial,sans-serif;font-size:13px">
<br>
</div>
<div
style="font-family:arial,sans-serif;font-size:13px">setup:</div>
<div
style="font-family:arial,sans-serif;font-size:13px">proxmox
server as
client</div>
<div
style="font-family:arial,sans-serif;font-size:13px">
2 gluster
physical
servers</div>
<div
style="font-family:arial,sans-serif;font-size:13px"><br>
</div>
<div
style="font-family:arial,sans-serif;font-size:13px">Server
side and
client side are
both currently
running glusterfs
3.4.4 from the
gluster repo.</div>
<div
style="font-family:arial,sans-serif;font-size:13px"><br>
</div>
<div
style="font-family:arial,sans-serif;font-size:13px">the
problem is:</div>
<div
style="font-family:arial,sans-serif;font-size:13px"><br>
</div>
<div
style="font-family:arial,sans-serif;font-size:13px">
1. created
replica
bricks.</div>
<div
style="font-family:arial,sans-serif;font-size:13px">2.
mounted in
proxmox (tried
both proxmox
ways: via GUI
and via fstab
(with a backup
volume line);
btw, when
mounting via
fstab I'm
unable to
launch a VM
without cache,
even though
direct-io-mode
is enabled in
the fstab line)</div>
<div
style="font-family:arial,sans-serif;font-size:13px">3.
installed VM</div>
<div
style="font-family:arial,sans-serif;font-size:13px">4.
bring one
volume down -
ok</div>
<div
style="font-family:arial,sans-serif;font-size:13px">
5. bring it back
up, wait
until sync is
done.</div>
<div
style="font-family:arial,sans-serif;font-size:13px">6.
bring the other
volume down -
getting IO
errors on the VM
guest, and not
able to
restore the VM
after I reset
it via the
host. It says
(no bootable
media). After
I shut it down
(forced) and
bring it back up,
it boots.</div>
</div>
</blockquote>
</div>
Could you do
the following
and test it
again?<br>
gluster volume
set
&lt;volname&gt;
cluster.self-heal-daemon
off<br>
<br>
Pranith<br>
<blockquote
type="cite">
<div>
<div dir="ltr">
<div
style="font-family:arial,sans-serif;font-size:13px"><br>
</div>
<div
style="font-family:arial,sans-serif;font-size:13px">Need
help. Tried
3.4.3, 3.4.4.</div>
<div
style="font-family:arial,sans-serif;font-size:13px">Packages
for 3.4.5 and
3.5.2 are still
missing for
Debian (3.5.1
always gives a
healing error
for some
reason).</div>
<div><br>
</div>
-- <br>
Best regards,<br>
Roman. </div>
<br>
<fieldset></fieldset>
<br>
</div>
<pre>_______________________________________________
Gluster-users mailing list
<a moz-do-not-send="true" href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a>
<a moz-do-not-send="true" href="http://supercolony.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a></pre>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
<br
clear="all">
<div><br>
</div>
-- <br>
Best regards,<br>
Roman. </div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
<br
clear="all">
<div><br>
</div>
-- <br>
Best regards,<br>
Roman. </div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
Best regards,<br>
Roman. </div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
Best regards,<br>
Roman. </div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
Best regards,<br>
Roman. </div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
Best regards,<br>
Roman.
</div>
</blockquote>
<br>
</body>
</html>