<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<div class="moz-cite-prefix">On 08/27/2014 12:24 PM, Roman wrote:<br>
</div>
<blockquote
cite="mid:CAFR=TBquQe9WxMecqMcgV=Phqe37=f09i15a0e8O5JfwPFBZbA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>root@stor1:~# ls -l /usr/sbin/glfsheal</div>
<div>ls: cannot access /usr/sbin/glfsheal: No such file or
directory</div>
<div>Seems like not.</div>
</div>
</blockquote>
Humble,<br>
    Seems like the binary is still not packaged?<br>
<br>
Pranith<br>
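<br>
PS: in case it helps while we wait for confirmation, a quick way to check on the Debian side whether the heal helper is simply missing from the package (assuming here that the server package is named glusterfs-server):<br>
<br>
# which installed package, if any, owns the heal helper binary<br>
dpkg -S /usr/sbin/glfsheal<br>
# list the files shipped by the server package and look for it<br>
dpkg -L glusterfs-server | grep glfsheal<br>
<br>
If neither command finds it, that would also explain the "Volume heal failed" output from 'gluster volume heal <volname> info'.<br>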
<blockquote
cite="mid:CAFR=TBquQe9WxMecqMcgV=Phqe37=f09i15a0e8O5JfwPFBZbA@mail.gmail.com"
type="cite">
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">
2014-08-27 9:50 GMT+03:00 Pranith Kumar Karampuri <span
dir="ltr"><<a moz-do-not-send="true"
href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> <br>
<div>On 08/27/2014 11:53 AM, Roman wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Okay.
<div>so here are first results:</div>
<div><br>
</div>
<div>after I disconnected the first server, I got this:</div>
<div><br>
</div>
<div>
<div>root@stor2:~# gluster volume heal
HA-FAST-PVE1-150G info</div>
<div>Volume heal failed</div>
</div>
</div>
</blockquote>
Can you check if the following binary is present?<br>
/usr/sbin/glfsheal<br>
<br>
Pranith<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div><br>
</div>
<div>but</div>
<div>
<div>[2014-08-26 11:45:35.315974] I
[afr-self-heal-common.c:2868:afr_log_self_heal_completion_status]
0-HA-FAST-PVE1-150G-replicate-0: foreground data self heal is
successfully completed, data self heal from HA-FAST-PVE1-150G-client-1
to sinks HA-FAST-PVE1-150G-client-0, with 16108814336 bytes on
HA-FAST-PVE1-150G-client-0, 16108814336 bytes on
HA-FAST-PVE1-150G-client-1, data - Pending matrix: [ [ 0 0 ] [ 348 0 ] ]
on <gfid:e3ede9c6-28d6-4755-841a-d8329e42ccc4></div>
</div>
<div><br>
</div>
<div>something wrong during upgrade?</div>
<div><br>
</div>
<div>I've got two VM-s on different volumes: one with
HD on and other with HD off.</div>
<div>Both survived the outage and both seemed synced.</div>
<div><br>
</div>
<div>but today I've found what looks like a bug with log
rotation.</div>
<div><br>
</div>
<div>the logs rotated on both the server and client sides, but new
messages are still being written to the *.log.1 files :)</div>
<div><br>
</div>
<div>/var/log/glusterfs/mnt-pve-HA-MED-PVE1-1T.log.1<br>
</div>
<div>/var/log/glusterfs/glustershd.log.1<br>
</div>
<div><br>
</div>
<div>this behavior appeared after the upgrade.</div>
<div><br>
</div>
<div>the logrotate.d config files include a HUP for the gluster
PIDs.</div>
<div><br>
</div>
<div>client:</div>
<div>
<div>/var/log/glusterfs/*.log {</div>
<div>    daily</div>
<div>    rotate 7</div>
<div>    delaycompress</div>
<div>    compress</div>
<div>    notifempty</div>
<div>    missingok</div>
<div>    postrotate</div>
<div>        [ ! -f /var/run/glusterd.pid ] || kill -HUP `cat /var/run/glusterd.pid`</div>
<div>    endscript</div>
<div>}</div>
</div>
<div><br>
</div>
<div>but I can't find that pid file on the client side
(should it be there?) :(</div>
<div><br>
</div>
<div>and servers:</div>
<div>
<div>/var/log/glusterfs/*.log {</div>
<div>    daily</div>
<div>    rotate 7</div>
<div>    delaycompress</div>
<div>    compress</div>
<div>    notifempty</div>
<div>    missingok</div>
<div>    postrotate</div>
<div>        [ ! -f /var/run/glusterd.pid ] || kill -HUP `cat /var/run/glusterd.pid`</div>
<div>    endscript</div>
<div>}</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>/var/log/glusterfs/*/*.log {</div>
<div>    daily</div>
<div>    rotate 7</div>
<div>    delaycompress</div>
<div>    compress</div>
<div>    notifempty</div>
<div>    missingok</div>
<div>    copytruncate</div>
<div>    postrotate</div>
<div>        [ ! -f /var/run/glusterd.pid ] || kill -HUP `cat /var/run/glusterd.pid`</div>
<div>    endscript</div>
<div>}</div>
</div>
<div><br>
</div>
<div>I do have /var/run/glusterd.pid on server side.</div>
<div><br>
</div>
<div>Should I change something? Log rotation seems to
be broken.</div>
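<div><br>
</div>
<div>(My guess, which may well be wrong: /var/run/glusterd.pid belongs only to the management daemon, so the HUP in postrotate never reaches the brick, self-heal-daemon or client mount processes, and on a pure client there is no glusterd.pid at all. Those processes keep the old log file descriptor open, which would explain why new messages keep landing in *.log.1. Below is a minimal stanza that sidesteps the pid question entirely by using logrotate's copytruncate.)</div>
<div><br>
</div>
<div>/var/log/glusterfs/*.log /var/log/glusterfs/*/*.log {</div>
<div>    daily</div>
<div>    rotate 7</div>
<div>    missingok</div>
<div>    notifempty</div>
<div>    compress</div>
<div>    delaycompress</div>
<div>    # no postrotate/HUP needed: the live file is truncated in place,</div>
<div>    # so the daemons keep writing to the same (now empty) file</div>
<div>    copytruncate</div>
<div>}</div>
<div><br>
</div>
<div>(Usual caveat: copytruncate can lose a few lines written between the copy and the truncate.)</div>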
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">2014-08-26 9:29 GMT+03:00
Pranith Kumar Karampuri <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div> <br>
<div>On 08/26/2014 11:55 AM, Roman wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hello all again!
<div>I'm back from vacation and I'm pretty
happy with 3.5.2 available for wheezy.
Thanks! Just made my updates.</div>
<div>For 3.5.2 do I still have to
set cluster.self-heal-daemon to off?</div>
</div>
</blockquote>
</div>
Welcome back :-). If you set it to 'off', the test case you
execute should work (please validate :-) ). But we also need to
test it with the self-heal daemon 'on' and fix any bugs if that
case does not work.<span><font color="#888888"><br>
<br>
Pranith.</font></span>
<div>
<div><br>
<blockquote type="cite">
<div dir="ltr"> </div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">2014-08-06
12:49 GMT+03:00 Humble Chirammal <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:hchiramm@redhat.com"
target="_blank">hchiramm@redhat.com</a>></span>:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div><br>
<br>
<br>
----- Original Message -----<br>
| From: "Pranith Kumar Karampuri"
<<a moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>><br>
| To: "Roman" <<a
moz-do-not-send="true"
href="mailto:romeo.r@gmail.com"
target="_blank">romeo.r@gmail.com</a>><br>
| Cc: <a moz-do-not-send="true"
href="mailto:gluster-users@gluster.org"
target="_blank">gluster-users@gluster.org</a>,
"Niels de Vos" <<a
moz-do-not-send="true"
href="mailto:ndevos@redhat.com"
target="_blank">ndevos@redhat.com</a>>,
"Humble Chirammal" <<a
moz-do-not-send="true"
href="mailto:hchiramm@redhat.com"
target="_blank">hchiramm@redhat.com</a>><br>
| Sent: Wednesday, August 6, 2014
12:09:57 PM<br>
| Subject: Re: [Gluster-users]
libgfapi failover problem on
replica bricks<br>
|<br>
| Roman,<br>
|     The file went into split-brain. I think we should do these tests<br>
| with 3.5.2, where monitoring the heals is easier. Let me also come up<br>
| with a document about how to do the testing you are trying to do.<br>
|<br>
| Humble/Niels,<br>
|     Do we have debs available for 3.5.2? In 3.5.1 there was a packaging<br>
| issue where /usr/bin/glfsheal is not packaged along with the deb. I<br>
| think that should be fixed now as well?<br>
|<br>
</div>
Pranith,<br>
<br>
The 3.5.2 packages for Debian are not
available yet. We are coordinating
internally to get them processed.<br>
I will update the list once they are
available.<br>
<br>
--Humble<br>
<div>|<br>
| On 08/06/2014 11:52 AM, Roman
wrote:<br>
| > good morning,<br>
| ><br>
| > root@stor1:~# getfattr -d
-m. -e hex<br>
| >
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| > getfattr: Removing leading
'/' from absolute path names<br>
| > # file:
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000<br>
| >
trusted.afr.HA-fast-150G-PVE1-client-1=0x000001320000000000000000<br>
| >
trusted.gfid=0x23c79523075a4158bea38078da570449<br>
| ><br>
| > getfattr: Removing leading
'/' from absolute path names<br>
| > # file:
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000040000000000000000<br>
| >
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000<br>
| >
trusted.gfid=0x23c79523075a4158bea38078da570449<br>
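<br>
(Aside, in case it helps to read these values; this is my understanding of the AFR changelog format, so treat the layout as an assumption: each trusted.afr.<volume>-client-N value is three 32-bit big-endian counters, i.e. data/metadata/entry operations still pending against client N.)<br>
<br>
trusted.afr.HA-fast-150G-PVE1-client-1 = 0x 00000132 00000000 00000000<br>
                                              data   metadata  entry<br>
# data = 0x132 = 306 operations not yet applied on client-1; metadata and entry are clean<br>
<br>
(So stor1's copy records 306 unsynced data operations against client-1, while stor2's copy records 4 against client-0. Each brick blaming the other is what AFR reports as split-brain, which matches Pranith's comment above.)<br>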
| ><br>
| ><br>
| ><br>
| > 2014-08-06 9:20 GMT+03:00
Pranith Kumar Karampuri <<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a><br>
</div>
| > <mailto:<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>| ><br>
| ><br>
| >Â Â Â On 08/06/2014 11:30 AM,
Roman wrote:<br>
| >>Â Â Â Also, this time
files are not the same!<br>
| >><br>
| >>Â Â Â root@stor1:~#
md5sum<br>
| >>Â Â
 /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >>Â Â
 32411360c53116b96a059f17306caeda<br>
| >>Â Â Â
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >><br>
| >>Â Â Â root@stor2:~#
md5sum<br>
| >>Â Â
 /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >>Â Â
 65b8a6031bcb6f5fb3a11cb1e8b1c9c9<br>
| >>Â Â Â
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >Â Â Â What is the getfattr
output?<br>
| ><br>
| >Â Â Â Pranith<br>
| ><br>
| >><br>
| >><br>
| >>Â Â Â 2014-08-05 16:33
GMT+03:00 Roman <<a
moz-do-not-send="true"
href="mailto:romeo.r@gmail.com"
target="_blank">romeo.r@gmail.com</a><br>
</div>
| >>Â Â Â <mailto:<a
moz-do-not-send="true"
href="mailto:romeo.r@gmail.com"
target="_blank">romeo.r@gmail.com</a>>>:<br>
<div>| >><br>
| >>      Nope, it is not working. But this time it went a bit differently.<br>
| >><br>
| >>      root@gluster-client:~# dmesg<br>
| >>      Segmentation fault<br>
| >><br>
| >><br>
| >>      I was not even able to start the VM after I did the tests:<br>
| >><br>
| >>      Could not read qcow2 header: Operation not permitted<br>
| >><br>
| >>      And it seems it never starts to sync files after the first<br>
| >>      disconnect. The VM survives the first disconnect, but not the second (I<br>
| >>      waited around 30 minutes). Also, I've got network.ping-timeout: 2 in the volume<br>
| >>      settings, but the logs react to the first disconnect in around 30 seconds. The second was<br>
| >>      faster, 2 seconds.<br>
| >><br>
| >>      The reaction was also different:<br>
| >><br>
| >>Â Â Â Â Â slower one:<br>
| >>Â Â Â Â Â [2014-08-05
13:26:19.558435] W
[socket.c:514:__socket_rwv]<br>
| >>Â Â Â Â Â 0-glusterfs:
readv failed (Connection timed
out)<br>
| >>Â Â Â Â Â [2014-08-05
13:26:19.558485] W<br>
| >>Â Â Â Â
 [socket.c:1962:__socket_proto_state_machine]
0-glusterfs:<br>
| >>Â Â Â Â Â reading from
socket failed. Error (Connection
timed out),<br>
</div>
| >>Â Â Â Â Â peer (<a
moz-do-not-send="true"
href="http://10.250.0.1:24007"
target="_blank">10.250.0.1:24007</a>
<<a moz-do-not-send="true"
href="http://10.250.0.1:24007"
target="_blank">http://10.250.0.1:24007</a>>)<br>
<div>| >>Â Â Â Â Â [2014-08-05
13:26:21.281426] W
[socket.c:514:__socket_rwv]<br>
| >>Â Â Â Â
 0-HA-fast-150G-PVE1-client-0:
readv failed (Connection timed
out)<br>
| >>Â Â Â Â Â [2014-08-05
13:26:21.281474] W<br>
| >>Â Â Â Â
 [socket.c:1962:__socket_proto_state_machine]<br>
| >>Â Â Â Â
 0-HA-fast-150G-PVE1-client-0:
reading from socket failed.<br>
| >>Â Â Â Â Â Error
(Connection timed out), peer (<a
moz-do-not-send="true"
href="http://10.250.0.1:49153"
target="_blank">10.250.0.1:49153</a><br>
</div>
| >>Â Â Â Â Â <<a
moz-do-not-send="true"
href="http://10.250.0.1:49153"
target="_blank">http://10.250.0.1:49153</a>>)<br>
<div>| >>Â Â Â Â Â [2014-08-05
13:26:21.281507] I<br>
| >>Â Â Â Â
 [client.c:2098:client_rpc_notify]<br>
| >>Â Â Â Â
 0-HA-fast-150G-PVE1-client-0:
disconnected<br>
| >><br>
| >>Â Â Â Â Â the fast one:<br>
| >>Â Â Â Â Â 2014-08-05
12:52:44.607389] C<br>
| >>Â Â Â Â
 [client-handshake.c:127:rpc_client_ping_timer_expired]<br>
| >>Â Â Â Â
 0-HA-fast-150G-PVE1-client-1:
server <a moz-do-not-send="true"
href="http://10.250.0.2:49153"
target="_blank">10.250.0.2:49153</a><br>
</div>
| >>Â Â Â Â Â <<a
moz-do-not-send="true"
href="http://10.250.0.2:49153"
target="_blank">http://10.250.0.2:49153</a>>
has not responded in the last 2<br>
<div>
<div>| >>Â Â Â Â Â seconds,
disconnecting.<br>
| >>Â Â Â Â Â [2014-08-05
12:52:44.607491] W
[socket.c:514:__socket_rwv]<br>
| >>Â Â Â Â
 0-HA-fast-150G-PVE1-client-1:
readv failed (No data available)<br>
| >>Â Â Â Â Â [2014-08-05
12:52:44.607585] E<br>
| >>Â Â Â Â
 [rpc-clnt.c:368:saved_frames_unwind]<br>
| >>Â Â Â Â
 (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)<br>
| >>Â Â Â Â
 [0x7fcb1b4b0558]<br>
| >>Â Â Â Â
 (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)<br>
| >>Â Â Â Â
 [0x7fcb1b4aea63]<br>
| >>Â Â Â Â
 (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)<br>
| >>Â Â Â Â
 [0x7fcb1b4ae97e])))
0-HA-fast-150G-PVE1-client-1:
forced<br>
| >>Â Â Â Â Â unwinding
frame type(GlusterFS 3.3)
op(LOOKUP(27)) called at<br>
| >>Â Â Â Â Â 2014-08-05
12:52:42.463881 (xid=0x381883x)<br>
| >>Â Â Â Â Â [2014-08-05
12:52:44.607604] W<br>
| >>Â Â Â Â
 [client-rpc-fops.c:2624:client3_3_lookup_cbk]<br>
| >>Â Â Â Â
 0-HA-fast-150G-PVE1-client-1:
remote operation failed:<br>
| >>Â Â Â Â Â Transport
endpoint is not connected. Path:
/<br>
| >>Â Â Â Â
 (00000000-0000-0000-0000-000000000001)<br>
| >>Â Â Â Â Â [2014-08-05
12:52:44.607736] E<br>
| >>Â Â Â Â
 [rpc-clnt.c:368:saved_frames_unwind]<br>
| >>Â Â Â Â
 (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)<br>
| >>Â Â Â Â
 [0x7fcb1b4b0558]<br>
| >>Â Â Â Â
 (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)<br>
| >>Â Â Â Â
 [0x7fcb1b4aea63]<br>
| >>Â Â Â Â
 (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)<br>
| >>Â Â Â Â
 [0x7fcb1b4ae97e])))
0-HA-fast-150G-PVE1-client-1:
forced<br>
| >>Â Â Â Â Â unwinding
frame type(GlusterFS Handshake)
op(PING(3)) called<br>
| >>Â Â Â Â Â at 2014-08-05
12:52:42.463891 (xid=0x381884x)<br>
| >>Â Â Â Â Â [2014-08-05
12:52:44.607753] W<br>
| >>Â Â Â Â
 [client-handshake.c:276:client_ping_cbk]<br>
| >>Â Â Â Â
 0-HA-fast-150G-PVE1-client-1:
timer must have expired<br>
| >>Â Â Â Â Â [2014-08-05
12:52:44.607776] I<br>
| >>Â Â Â Â
 [client.c:2098:client_rpc_notify]<br>
| >>Â Â Â Â
 0-HA-fast-150G-PVE1-client-1:
disconnected<br>
| >><br>
| >><br>
| >><br>
| >>Â Â Â Â Â I've got SSD
disks (just for an info).<br>
| >>Â Â Â Â Â Should I go
and give a try for 3.5.2?<br>
| >><br>
| >><br>
| >><br>
| >>Â Â Â Â Â 2014-08-05
13:06 GMT+03:00 Pranith Kumar
Karampuri<br>
</div>
</div>
| >>Â Â Â Â Â <<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>
<mailto:<a moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>| >><br>
| >>Â Â Â Â Â Â Â reply along
with gluster-users please :-). May
be you are<br>
| >>Â Â Â Â Â Â Â hitting
'reply' instead of 'reply all'?<br>
| >><br>
| >>Â Â Â Â Â Â Â Pranith<br>
| >><br>
| >>Â Â Â Â Â Â Â On
08/05/2014 03:35 PM, Roman wrote:<br>
| >>>Â Â Â Â Â Â Â To make
sure and clean, I've created
another VM with raw<br>
| >>>Â Â Â Â Â Â Â format
and goint to repeat those steps.
So now I've got<br>
| >>>Â Â Â Â Â Â Â two
VM-s one with qcow2 format and
other with raw<br>
| >>>Â Â Â Â Â Â Â format.
I will send another e-mail
shortly.<br>
| >>><br>
| >>><br>
| >>>Â Â Â Â Â Â
 2014-08-05 13:01 GMT+03:00
Pranith Kumar Karampuri<br>
</div>
| >>>Â Â Â Â Â Â Â <<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>
<mailto:<a moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>| >>><br>
| >>><br>
| >>>Â Â Â Â Â Â Â Â
 On 08/05/2014 03:07 PM, Roman
wrote:<br>
| >>>>Â Â Â Â Â Â Â
  really, seems like the same
file<br>
| >>>><br>
| >>>>Â Â Â Â Â Â Â
  stor1:<br>
| >>>>Â Â Â Â Â Â Â
Â
 a951641c5230472929836f9fcede6b04<br>
| >>>>Â Â Â Â Â Â Â
 Â
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >>>><br>
| >>>>Â Â Â Â Â Â Â
  stor2:<br>
| >>>>Â Â Â Â Â Â Â
Â
 a951641c5230472929836f9fcede6b04<br>
| >>>>Â Â Â Â Â Â Â
 Â
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >>>><br>
| >>>><br>
| >>>>Â Â Â Â Â Â Â
  one thing I've seen from
logs, that somehow proxmox<br>
| >>>>Â Â Â Â Â Â Â
  VE is connecting with wrong
version to servers?<br>
| >>>>Â Â Â Â Â Â Â
  [2014-08-05 09:23:45.218550]
I<br>
| >>>>Â Â Â Â Â Â Â
Â
 [client-handshake.c:1659:select_server_supported_programs]<br>
| >>>>Â Â Â Â Â Â Â
  0-HA-fast-150G-PVE1-client-0:
Using Program<br>
| >>>>Â Â Â Â Â Â Â
  GlusterFS 3.3, Num (1298437),
Version (330)<br>
| >>>Â Â Â Â Â Â Â Â
 It is the rpc (over the network
data structures)<br>
| >>>Â Â Â Â Â Â Â Â
 version, which is not changed
at all from 3.3 so<br>
| >>>Â Â Â Â Â Â Â Â
 thats not a problem. So what is
the conclusion? Is<br>
| >>>Â Â Â Â Â Â Â Â
 your test case working now or
not?<br>
| >>><br>
| >>>Â Â Â Â Â Â Â Â
 Pranith<br>
| >>><br>
| >>>>Â Â Â Â Â Â Â
  but if I issue:<br>
| >>>>Â Â Â Â Â Â Â
  root@pve1:~# glusterfs -V<br>
| >>>>Â Â Â Â Â Â Â
  glusterfs 3.4.4 built on Jun
28 2014 03:44:57<br>
| >>>>Â Â Â Â Â Â Â
  seems ok.<br>
| >>>><br>
| >>>>Â Â Â Â Â Â Â
  server use 3.4.4 meanwhile<br>
| >>>>Â Â Â Â Â Â Â
  [2014-08-05 09:23:45.117875]
I<br>
| >>>>Â Â Â Â Â Â Â
Â
 [server-handshake.c:567:server_setvolume]<br>
| >>>>Â Â Â Â Â Â Â
  0-HA-fast-150G-PVE1-server:
accepted client from<br>
| >>>>Â Â Â Â Â Â Â
Â
 stor1-9004-2014/08/05-09:23:45:93538-HA-fast-150G-PVE1-client-1-0<br>
| >>>>Â Â Â Â Â Â Â
  (version: 3.4.4)<br>
| >>>>Â Â Â Â Â Â Â
  [2014-08-05 09:23:49.103035]
I<br>
| >>>>Â Â Â Â Â Â Â
Â
 [server-handshake.c:567:server_setvolume]<br>
| >>>>Â Â Â Â Â Â Â
  0-HA-fast-150G-PVE1-server:
accepted client from<br>
| >>>>Â Â Â Â Â Â Â
Â
 stor1-8998-2014/08/05-09:23:45:89883-HA-fast-150G-PVE1-client-0-0<br>
| >>>>Â Â Â Â Â Â Â
  (version: 3.4.4)<br>
| >>>><br>
| >>>>Â Â Â Â Â Â Â
  if this could be the reason,
of course.<br>
| >>>>Â Â Â Â Â Â Â
  I did restart the Proxmox VE
yesterday (just for an<br>
| >>>>Â Â Â Â Â Â Â
  information)<br>
| >>>><br>
| >>>><br>
| >>>><br>
| >>>><br>
| >>>><br>
| >>>>Â Â Â Â Â Â Â
  2014-08-05 12:30 GMT+03:00
Pranith Kumar Karampuri<br>
</div>
</div>
| >>>>Â Â Â Â Â Â Â Â
 <<a moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>
<mailto:<a moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>| >>>><br>
| >>>><br>
| >>>>Â Â Â Â Â Â Â
    On 08/05/2014 02:33 PM,
Roman wrote:<br>
| >>>>>Â Â Â Â Â
      Waited long enough
for now, still different<br>
| >>>>>Â Â Â Â Â
      sizes and no logs
about healing :(<br>
| >>>>><br>
| >>>>>Â Â Â Â Â
      stor1<br>
| >>>>>Â Â Â Â Â
      # file:<br>
| >>>>>Â Â Â Â Â
    Â
 exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >>>>>Â Â Â Â Â
    Â
 trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000<br>
| >>>>>Â Â Â Â Â
    Â
 trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000<br>
| >>>>>Â Â Â Â Â
    Â
 trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921<br>
| >>>>><br>
| >>>>>Â Â Â Â Â
      root@stor1:~# du -sh<br>
| >>>>>Â Â Â Â Â
    Â
 /exports/fast-test/150G/images/127/<br>
| >>>>>Â Â Â Â Â
      1.2GÂ
/exports/fast-test/150G/images/127/<br>
| >>>>><br>
| >>>>><br>
| >>>>>Â Â Â Â Â
      stor2<br>
| >>>>>Â Â Â Â Â
      # file:<br>
| >>>>>Â Â Â Â Â
    Â
 exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >>>>>Â Â Â Â Â
    Â
 trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000<br>
| >>>>>Â Â Â Â Â
    Â
 trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000<br>
| >>>>>Â Â Â Â Â
    Â
 trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921<br>
| >>>>><br>
| >>>>><br>
| >>>>>Â Â Â Â Â
      root@stor2:~# du -sh<br>
| >>>>>Â Â Â Â Â
    Â
 /exports/fast-test/150G/images/127/<br>
| >>>>>Â Â Â Â Â
      1.4GÂ
/exports/fast-test/150G/images/127/<br>
| >>>>Â Â Â Â Â Â Â
    According to the
changelogs, the file doesn't<br>
| >>>>Â Â Â Â Â Â Â
    need any healing. Could
you stop the operations<br>
| >>>>Â Â Â Â Â Â Â
    on the VMs and take
md5sum on both these machines?<br>
| >>>><br>
| >>>>Â Â Â Â Â Â Â
    Pranith<br>
| >>>><br>
| >>>>><br>
| >>>>><br>
| >>>>><br>
| >>>>><br>
| >>>>>Â Â Â Â Â
      2014-08-05 11:49
GMT+03:00 Pranith Kumar<br>
| >>>>>Â Â Â Â Â
      Karampuri <<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
| >>>>>Â Â Â Â Â Â Â
    <mailto:<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>| >>>>><br>
| >>>>><br>
| >>>>>Â Â Â Â Â
        On 08/05/2014
02:06 PM, Roman wrote:<br>
| >>>>>>Â Â Â
          Well, it
seems like it doesn't see the<br>
| >>>>>>Â Â Â
          changes were
made to the volume ? I<br>
| >>>>>>Â Â Â
          created two
files 200 and 100 MB (from<br>
| >>>>>>Â Â Â
          /dev/zero)
after I disconnected the first<br>
| >>>>>>Â Â Â
          brick. Then
connected it back and got<br>
| >>>>>>Â Â Â
          these logs:<br>
| >>>>>><br>
| >>>>>>Â Â Â
          [2014-08-05
08:30:37.830150] I<br>
| >>>>>>Â Â Â
        Â
 [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]<br>
| >>>>>>Â Â Â
          0-glusterfs:
No change in volfile, continuing<br>
| >>>>>>Â Â Â
          [2014-08-05
08:30:37.830207] I<br>
| >>>>>>Â Â Â
        Â
 [rpc-clnt.c:1676:rpc_clnt_reconfig]<br>
| >>>>>>Â Â Â
        Â
 0-HA-fast-150G-PVE1-client-0:
changing<br>
| >>>>>>Â Â Â
          port to 49153
(from 0)<br>
| >>>>>>Â Â Â
          [2014-08-05
08:30:37.830239] W<br>
| >>>>>>Â Â Â
        Â
 [socket.c:514:__socket_rwv]<br>
| >>>>>>Â Â Â
        Â
 0-HA-fast-150G-PVE1-client-0:
readv<br>
| >>>>>>Â Â Â
          failed (No
data available)<br>
| >>>>>>Â Â Â
          [2014-08-05
08:30:37.831024] I<br>
| >>>>>>Â Â Â
        Â
 [client-handshake.c:1659:select_server_supported_programs]<br>
| >>>>>>Â Â Â
        Â
 0-HA-fast-150G-PVE1-client-0:
Using<br>
| >>>>>>Â Â Â
          Program
GlusterFS 3.3, Num (1298437),<br>
| >>>>>>Â Â Â
          Version (330)<br>
| >>>>>>Â Â Â
          [2014-08-05
08:30:37.831375] I<br>
| >>>>>>Â Â Â
        Â
 [client-handshake.c:1456:client_setvolume_cbk]<br>
| >>>>>>Â Â Â
        Â
 0-HA-fast-150G-PVE1-client-0:
Connected<br>
| >>>>>>Â Â Â
          to <a
moz-do-not-send="true"
href="http://10.250.0.1:49153"
target="_blank">10.250.0.1:49153</a><br>
</div>
</div>
| >>>>>>Â Â Â Â Â
        <<a
moz-do-not-send="true"
href="http://10.250.0.1:49153"
target="_blank">http://10.250.0.1:49153</a>>,
attached to<br>
<div>
<div>| >>>>>>Â Â
           remote
volume
'/exports/fast-test/150G'.<br>
| >>>>>>Â Â Â
          [2014-08-05
08:30:37.831394] I<br>
| >>>>>>Â Â Â
        Â
 [client-handshake.c:1468:client_setvolume_cbk]<br>
| >>>>>>Â Â Â
        Â
 0-HA-fast-150G-PVE1-client-0:
Server and<br>
| >>>>>>Â Â Â
          Client
lk-version numbers are not same,<br>
| >>>>>>Â Â Â
          reopening the
fds<br>
| >>>>>>Â Â Â
          [2014-08-05
08:30:37.831566] I<br>
| >>>>>>Â Â Â
        Â
 [client-handshake.c:450:client_set_lk_version_cbk]<br>
| >>>>>>Â Â Â
        Â
 0-HA-fast-150G-PVE1-client-0:
Server lk<br>
| >>>>>>Â Â Â
          version = 1<br>
| >>>>>><br>
| >>>>>><br>
| >>>>>>Â Â Â
          [2014-08-05
08:30:37.830150] I<br>
| >>>>>>Â Â Â
        Â
 [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]<br>
| >>>>>>Â Â Â
          0-glusterfs:
No change in volfile, continuing<br>
| >>>>>>Â Â Â
          this line
seems weird to me tbh.<br>
| >>>>>>Â Â Â
          I do not see
any traffic on switch<br>
| >>>>>>Â Â Â
          interfaces
between gluster servers, which<br>
| >>>>>>Â Â Â
          means, there
is no syncing between them.<br>
| >>>>>>Â Â Â
          I tried to ls
-l the files on the client<br>
| >>>>>>Â Â Â
          and servers
to trigger the healing, but<br>
| >>>>>>Â Â Â
          seems like no
success. Should I wait more?<br>
| >>>>>Â Â Â Â Â
        Yes, it should
take around 10-15 minutes.<br>
| >>>>>Â Â Â Â Â
        Could you provide
'getfattr -d -m. -e hex<br>
| >>>>>Â Â Â Â Â
      Â
 <file-on-brick>' on both
the bricks.<br>
| >>>>><br>
| >>>>>Â Â Â Â Â
        Pranith<br>
| >>>>><br>
| >>>>>><br>
| >>>>>><br>
| >>>>>>Â Â Â
          2014-08-05
11:25 GMT+03:00 Pranith Kumar<br>
| >>>>>>Â Â Â
          Karampuri
<<a moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
| >>>>>>Â Â Â Â Â
        <mailto:<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>| >>>>>><br>
| >>>>>><br>
| >>>>>>Â Â Â
            On
08/05/2014 01:10 PM, Roman
wrote:<br>
| >>>>>>>Â
              Ahha!
For some reason I was not able<br>
| >>>>>>>Â
              to
start the VM anymore, Proxmox VE<br>
| >>>>>>>Â
              told
me, that it is not able to read<br>
| >>>>>>>Â
              the
qcow2 header due to permission<br>
| >>>>>>>Â
              is
denied for some reason. So I
just<br>
| >>>>>>>Â
            Â
 deleted that file and created a
new<br>
| >>>>>>>Â
              VM.
And the nex message I've got was<br>
| >>>>>>>Â
              this:<br>
| >>>>>>Â Â Â
            Seems
like these are the messages<br>
| >>>>>>Â Â Â
            where you
took down the bricks before<br>
| >>>>>>Â Â Â
          Â
 self-heal. Could you restart
the run<br>
| >>>>>>Â Â Â
            waiting
for self-heals to complete<br>
| >>>>>>Â Â Â
            before
taking down the next brick?<br>
| >>>>>><br>
| >>>>>>Â Â Â
            Pranith<br>
| >>>>>><br>
| >>>>>>><br>
| >>>>>>><br>
| >>>>>>>      [2014-08-05 07:31:25.663412] E<br>
| >>>>>>>      [afr-self-heal-common.c:197:afr_sh_print_split_brain_log]<br>
| >>>>>>>      0-HA-fast-150G-PVE1-replicate-0: Unable to self-heal contents of<br>
| >>>>>>>      '/images/124/vm-124-disk-1.qcow2' (possible split-brain). Please<br>
| >>>>>>>      delete the file from all but the preferred subvolume.- Pending<br>
| >>>>>>>      matrix: [ [ 0 60 ] [ 11 0 ] ]<br>
| >>>>>>>      [2014-08-05 07:31:25.663955] E<br>
| >>>>>>>      [afr-self-heal-common.c:2262:afr_self_heal_completion_cbk]<br>
| >>>>>>>      0-HA-fast-150G-PVE1-replicate-0: background data self-heal failed on<br>
| >>>>>>>      /images/124/vm-124-disk-1.qcow2<br>
| >>>>>>><br>
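<br>
(For what it's worth, the manual recovery I'm aware of for this kind of split-brain on 3.4, sketched here with the brick path from these logs and a placeholder mount point; please double-check it against the documentation before running anything like it.)<br>
<br>
# 1. decide which brick holds the good copy (changelog xattrs, file contents, timestamps)<br>
# 2. on the OTHER brick, remove the file and its .glusterfs hard-link;<br>
#    the gfid comes from 'getfattr -n trusted.gfid -e hex <file-on-brick>' and, written<br>
#    in the usual dashed UUID form, lives under .glusterfs/<first-2-chars>/<next-2-chars>/<br>
rm /exports/fast-test/150G/images/124/vm-124-disk-1.qcow2<br>
rm /exports/fast-test/150G/.glusterfs/<aa>/<bb>/<full-gfid><br>
# 3. from the client, stat the file so the mount heals it back from the good brick<br>
stat /mnt/pve/<storage-id>/images/124/vm-124-disk-1.qcow2<br>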
| >>>>>>><br>
| >>>>>>><br>
| >>>>>>>Â
            Â
 2014-08-05 10:13 GMT+03:00
Pranith<br>
| >>>>>>>Â
              Kumar
Karampuri <<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
| >>>>>>>Â Â Â
            <mailto:<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>>>>><br>
| >>>>>>>Â
                I
just responded to your earlier<br>
| >>>>>>>Â
              Â
 mail about how the log looks.<br>
| >>>>>>>Â
              Â
 The log comes on the mount's
logfile<br>
| >>>>>>><br>
| >>>>>>>Â
              Â
 Pranith<br>
| >>>>>>><br>
| >>>>>>>Â
              Â
 On 08/05/2014 12:41 PM, Roman
wrote:<br>
|
>>>>>>>>Â
              Â
 Ok, so I've waited enough, I<br>
|
>>>>>>>>Â
              Â
 think. Had no any traffic on<br>
|
>>>>>>>>Â
              Â
 switch ports between servers.<br>
|
>>>>>>>>Â
              Â
 Could not find any suitable log<br>
|
>>>>>>>>Â
              Â
 message about completed<br>
|
>>>>>>>>Â
              Â
 self-heal (waited about 30<br>
|
>>>>>>>>Â
              Â
 minutes). Plugged out the other<br>
|
>>>>>>>>Â
              Â
 server's UTP cable this time<br>
|
>>>>>>>>Â
              Â
 and got in the same situation:<br>
|
>>>>>>>>Â
              Â
 root@gluster-test1:~# cat<br>
|
>>>>>>>>Â
              Â
 /var/log/dmesg<br>
|
>>>>>>>>Â
              Â
 -bash: /bin/cat: Input/output
error<br>
|
>>>>>>>><br>
|
>>>>>>>>Â
              Â
 brick logs:<br>
|
>>>>>>>>Â
              Â
 [2014-08-05 07:09:03.005474] I<br>
|
>>>>>>>>Â
              Â
 [server.c:762:server_rpc_notify]<br>
|
>>>>>>>>Â
              Â
 0-HA-fast-150G-PVE1-server:<br>
|
>>>>>>>>Â
              Â
 disconnecting connectionfrom<br>
|
>>>>>>>>Â
              Â
 pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0<br>
|
>>>>>>>>Â
              Â
 [2014-08-05 07:09:03.005530] I<br>
|
>>>>>>>>Â
              Â
 [server-helpers.c:729:server_connection_put]<br>
|
>>>>>>>>Â
              Â
 0-HA-fast-150G-PVE1-server:<br>
|
>>>>>>>>Â
              Â
 Shutting down connection<br>
|
>>>>>>>>Â
              Â
 pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0<br>
|
>>>>>>>>Â
              Â
 [2014-08-05 07:09:03.005560] I<br>
|
>>>>>>>>Â
              Â
 [server-helpers.c:463:do_fd_cleanup]<br>
|
>>>>>>>>Â
              Â
 0-HA-fast-150G-PVE1-server: fd<br>
|
>>>>>>>>Â
              Â
 cleanup on<br>
|
>>>>>>>>Â
              Â
 /images/124/vm-124-disk-1.qcow2<br>
|
>>>>>>>>Â
              Â
 [2014-08-05 07:09:03.005797] I<br>
|
>>>>>>>>Â
              Â
 [server-helpers.c:617:server_connection_destroy]<br>
|
>>>>>>>>Â
              Â
 0-HA-fast-150G-PVE1-server:<br>
|
>>>>>>>>Â
              Â
 destroyed connection of<br>
|
>>>>>>>>Â
              Â
 pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0<br>
|
>>>>>>>><br>
|
>>>>>>>><br>
|
>>>>>>>><br>
|
>>>>>>>><br>
|
>>>>>>>><br>
|
>>>>>>>>Â
              Â
 2014-08-05 9:53 GMT+03:00<br>
|
>>>>>>>>Â
              Â
 Pranith Kumar Karampuri<br>
|
>>>>>>>>Â
              Â
 <<a moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
| >>>>>>>>Â
              Â
 <mailto:<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>| >>>>>>>><br>
| >>>>>>>>      Do you think it is possible for you to do these tests<br>
| >>>>>>>>      on the latest version 3.5.2? 'gluster volume heal<br>
| >>>>>>>>      <volname> info' would give you that information in<br>
| >>>>>>>>      versions > 3.5.1. Otherwise you will have to check it<br>
| >>>>>>>>      either from the logs (there will be a self-heal completed<br>
| >>>>>>>>      message in the mount logs) or by observing 'getfattr -d -m.<br>
| >>>>>>>>      -e hex <image-file-on-bricks>'<br>
| >>>>>>>><br>
| >>>>>>>>      Pranith<br>
| >>>>>>>><br>
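<br>
(As a concrete version of the getfattr check above; the volume and brick path are the ones used elsewhere in this thread. Healing is finished when, on BOTH bricks, every trusted.afr.* value is all zeros.)<br>
<br>
getfattr -d -m. -e hex /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
# expected on both stor1 and stor2 once healing has completed:<br>
# trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000<br>
# trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000<br>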
|
>>>>>>>><br>
|
>>>>>>>>Â
               Â
  On 08/05/2014 12:09 PM,<br>
|
>>>>>>>>Â
               Â
  Roman wrote:<br>
|
>>>>>>>>>Â
               Â
  Ok, I understand. I will<br>
|
>>>>>>>>>Â
               Â
  try this shortly.<br>
|
>>>>>>>>>Â
               Â
  How can I be sure, that<br>
|
>>>>>>>>>Â
               Â
  healing process is done,<br>
|
>>>>>>>>>Â
               Â
  if I am not able to see<br>
|
>>>>>>>>>Â
               Â
  its status?<br>
|
>>>>>>>>><br>
|
>>>>>>>>><br>
|
>>>>>>>>>Â
               Â
  2014-08-05 9:30 GMT+03:00<br>
|
>>>>>>>>>Â
               Â
  Pranith Kumar Karampuri<br>
|
>>>>>>>>>Â
               Â
  <<a moz-do-not-send="true"
href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
|
>>>>>>>>>Â
                Â
 <mailto:<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>>>>>>><br>
|
>>>>>>>>>Â
               Â
    Mounts will do the<br>
|
>>>>>>>>>Â
               Â
    healing, not the<br>
|
>>>>>>>>>Â
               Â
    self-heal-daemon. The<br>
|
>>>>>>>>>Â
               Â
    problem I feel is that<br>
|
>>>>>>>>>Â
               Â
    whichever process does<br>
|
>>>>>>>>>Â
               Â
    the healing has the<br>
|
>>>>>>>>>Â
               Â
    latest information<br>
|
>>>>>>>>>Â
               Â
    about the good bricks<br>
|
>>>>>>>>>Â
               Â
    in this usecase. Since<br>
|
>>>>>>>>>Â
               Â
    for VM usecase, mounts<br>
|
>>>>>>>>>Â
               Â
    should have the latest<br>
|
>>>>>>>>>Â
               Â
    information, we should<br>
|
>>>>>>>>>Â
               Â
    let the mounts do the<br>
|
>>>>>>>>>Â
               Â
    healing. If the mount<br>
|
>>>>>>>>>Â
               Â
    accesses the VM image<br>
|
>>>>>>>>>Â
               Â
    either by someone<br>
|
>>>>>>>>>Â
               Â
    doing operations<br>
|
>>>>>>>>>Â
               Â
    inside the VM or<br>
|
>>>>>>>>>Â
               Â
    explicit stat on the<br>
|
>>>>>>>>>Â
               Â
    file it should do the<br>
|
>>>>>>>>>Â
               Â
    healing.<br>
|
>>>>>>>>><br>
|
>>>>>>>>>Â
               Â
    Pranith.<br>
|
>>>>>>>>><br>
|
>>>>>>>>><br>
|
>>>>>>>>>Â
               Â
    On 08/05/2014 10:39<br>
|
>>>>>>>>>Â
               Â
    AM, Roman wrote:<br>
|
>>>>>>>>>>Â
               Â
    Hmmm, you told me to<br>
|
>>>>>>>>>>Â
               Â
    turn it off. Did I<br>
|
>>>>>>>>>>Â
               Â
    understood something<br>
|
>>>>>>>>>>Â
               Â
    wrong? After I issued<br>
|
>>>>>>>>>>Â
               Â
    the command you've<br>
|
>>>>>>>>>>Â
               Â
    sent me, I was not<br>
|
>>>>>>>>>>Â
               Â
    able to watch the<br>
|
>>>>>>>>>>Â
               Â
    healing process, it<br>
|
>>>>>>>>>>Â
               Â
    said, it won't be<br>
|
>>>>>>>>>>Â
               Â
    healed, becouse its<br>
|
>>>>>>>>>>Â
               Â
    turned off.<br>
|
>>>>>>>>>><br>
|
>>>>>>>>>><br>
|
>>>>>>>>>>Â
               Â
    2014-08-05 5:39<br>
|
>>>>>>>>>>Â
               Â
    GMT+03:00 Pranith<br>
|
>>>>>>>>>>Â
               Â
    Kumar Karampuri<br>
|
>>>>>>>>>>Â
               Â
    <<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
|
>>>>>>>>>>Â
                 Â
  <mailto:<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>>>>>>>><br>
|
>>>>>>>>>>Â
               Â
      You didn't<br>
|
>>>>>>>>>>Â
               Â
      mention anything<br>
|
>>>>>>>>>>Â
               Â
      about<br>
|
>>>>>>>>>>Â
               Â
      self-healing. Did<br>
|
>>>>>>>>>>Â
               Â
      you wait until<br>
|
>>>>>>>>>>Â
               Â
      the self-heal is<br>
|
>>>>>>>>>>Â
               Â
      complete?<br>
|
>>>>>>>>>><br>
|
>>>>>>>>>>Â
               Â
      Pranith<br>
|
>>>>>>>>>><br>
|
>>>>>>>>>>Â
               Â
      On 08/04/2014<br>
|
>>>>>>>>>>Â
               Â
      05:49 PM, Roman<br>
|
>>>>>>>>>>Â
               Â
      wrote:<br>
|
>>>>>>>>>>>Â
               Â
      Hi!<br>
|
>>>>>>>>>>>Â
               Â
      Result is pretty<br>
|
>>>>>>>>>>>Â
               Â
      same. I set the<br>
|
>>>>>>>>>>>Â
               Â
      switch port down<br>
|
>>>>>>>>>>>Â
               Â
      for 1st server,<br>
|
>>>>>>>>>>>Â
               Â
      it was ok. Then<br>
|
>>>>>>>>>>>Â
               Â
      set it up back<br>
|
>>>>>>>>>>>Â
               Â
      and set other<br>
|
>>>>>>>>>>>Â
               Â
      server's port<br>
|
>>>>>>>>>>>Â
               Â
      off. and it<br>
|
>>>>>>>>>>>Â
               Â
      triggered IO<br>
|
>>>>>>>>>>>Â
               Â
      error on two<br>
|
>>>>>>>>>>>Â
               Â
      virtual<br>
|
>>>>>>>>>>>Â
               Â
      machines: one<br>
|
>>>>>>>>>>>Â
               Â
      with local root<br>
|
>>>>>>>>>>>Â
               Â
      FS but network<br>
|
>>>>>>>>>>>Â
               Â
      mounted storage.<br>
|
>>>>>>>>>>>Â
               Â
      and other with<br>
|
>>>>>>>>>>>Â
               Â
      network root FS.<br>
|
>>>>>>>>>>>Â
               Â
      1st gave an<br>
|
>>>>>>>>>>>Â
               Â
      error on copying<br>
|
>>>>>>>>>>>Â
               Â
      to or from the<br>
|
>>>>>>>>>>>Â
               Â
      mounted network<br>
|
>>>>>>>>>>>Â
               Â
      disk, other just<br>
|
>>>>>>>>>>>Â
               Â
      gave me an error<br>
|
>>>>>>>>>>>Â
               Â
      for even reading<br>
|
>>>>>>>>>>>Â
               Â
      log.files.<br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>Â
               Â
      cat:<br>
|
>>>>>>>>>>>Â
               Â
    Â
 /var/log/alternatives.log:<br>
|
>>>>>>>>>>>Â
               Â
      Input/output error<br>
|
>>>>>>>>>>>Â
               Â
      then I reset the<br>
|
>>>>>>>>>>>Â
               Â
      kvm VM and it<br>
|
>>>>>>>>>>>Â
               Â
      said me, there<br>
|
>>>>>>>>>>>Â
               Â
      is no boot<br>
|
>>>>>>>>>>>Â
               Â
      device. Next I<br>
|
>>>>>>>>>>>Â
               Â
      virtually<br>
|
>>>>>>>>>>>Â
               Â
      powered it off<br>
|
>>>>>>>>>>>Â
               Â
      and then back on<br>
|
>>>>>>>>>>>Â
               Â
      and it has booted.<br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>Â
               Â
      By the way, did<br>
|
>>>>>>>>>>>Â
               Â
      I have to<br>
|
>>>>>>>>>>>Â
               Â
      start/stop volume?<br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>Â
               Â
      >> Could you do<br>
|
>>>>>>>>>>>Â
               Â
      the following<br>
|
>>>>>>>>>>>Â
               Â
      and test it again?<br>
|
>>>>>>>>>>>Â
               Â
      >> gluster
volume<br>
|
>>>>>>>>>>>Â
               Â
      set <volname><br>
|
>>>>>>>>>>>Â
               Â
    Â
 cluster.self-heal-daemon<br>
|
>>>>>>>>>>>Â
               Â
      off<br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>Â
               Â
      >>Pranith<br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>Â
               Â
      2014-08-04 14:10<br>
|
>>>>>>>>>>>Â
               Â
      GMT+03:00<br>
|
>>>>>>>>>>>Â
               Â
      Pranith Kumar<br>
|
>>>>>>>>>>>Â
               Â
      Karampuri<br>
|
>>>>>>>>>>>Â
               Â
      <<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
|
>>>>>>>>>>>Â
                 Â
    <mailto:<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>>>>>>>>><br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>Â
               Â
        On<br>
|
>>>>>>>>>>>Â
               Â
        08/04/2014<br>
|
>>>>>>>>>>>Â
               Â
        03:33 PM,<br>
|
>>>>>>>>>>>Â
               Â
        Roman wrote:<br>
|
>>>>>>>>>>>>Â
               Â
        Hello!<br>
|
>>>>>>>>>>>><br>
|
>>>>>>>>>>>>Â
               Â
        Facing the<br>
|
>>>>>>>>>>>>Â
               Â
        same<br>
|
>>>>>>>>>>>>Â
               Â
        problem as<br>
|
>>>>>>>>>>>>Â
               Â
        mentioned<br>
|
>>>>>>>>>>>>Â
               Â
        here:<br>
|
>>>>>>>>>>>><br>
|
>>>>>>>>>>>>Â
               Â
        <a
moz-do-not-send="true"
href="http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html"
target="_blank">http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html</a><br>
|
>>>>>>>>>>>><br>
|
>>>>>>>>>>>>Â
               Â
        my set up<br>
|
>>>>>>>>>>>>Â
               Â
        is up and<br>
|
>>>>>>>>>>>>Â
               Â
        running, so<br>
|
>>>>>>>>>>>>Â
               Â
        i'm ready<br>
|
>>>>>>>>>>>>Â
               Â
        to help you<br>
|
>>>>>>>>>>>>Â
               Â
        back with<br>
|
>>>>>>>>>>>>Â
               Â
        feedback.<br>
|
>>>>>>>>>>>><br>
|
>>>>>>>>>>>>Â
               Â
        setup:<br>
|
>>>>>>>>>>>>Â
               Â
        proxmox<br>
|
>>>>>>>>>>>>Â
               Â
        server as<br>
|
>>>>>>>>>>>>Â
               Â
        client<br>
|
>>>>>>>>>>>>Â
               Â
        2 gluster<br>
|
>>>>>>>>>>>>Â
               Â
        physical<br>
|
>>>>>>>>>>>>Â
               Â
        servers<br>
|
>>>>>>>>>>>><br>
|
>>>>>>>>>>>>Â
               Â
        server side<br>
|
>>>>>>>>>>>>Â
               Â
        and client<br>
|
>>>>>>>>>>>>Â
               Â
        side both<br>
|
>>>>>>>>>>>>Â
               Â
        running atm<br>
|
>>>>>>>>>>>>Â
               Â
        3.4.4<br>
|
>>>>>>>>>>>>Â
               Â
        glusterfs<br>
|
>>>>>>>>>>>>Â
               Â
        from<br>
|
>>>>>>>>>>>>Â
               Â
        gluster repo.<br>
|
>>>>>>>>>>>><br>
|
>>>>>>>>>>>>Â
               Â
        the problem is:<br>
|
>>>>>>>>>>>><br>
|
>>>>>>>>>>>>Â
               Â
        1. craeted<br>
|
>>>>>>>>>>>>Â
               Â
        replica bricks.<br>
|
>>>>>>>>>>>>Â
               Â
        2. mounted<br>
|
>>>>>>>>>>>>Â
               Â
        in proxmox<br>
|
>>>>>>>>>>>>Â
               Â
        (tried both<br>
|
>>>>>>>>>>>>Â
               Â
        promox<br>
|
>>>>>>>>>>>>Â
               Â
        ways: via<br>
|
>>>>>>>>>>>>Â
               Â
        GUI and<br>
|
>>>>>>>>>>>>Â
               Â
        fstab (with<br>
|
>>>>>>>>>>>>Â
               Â
        backup<br>
|
>>>>>>>>>>>>Â
               Â
        volume<br>
|
>>>>>>>>>>>>Â
               Â
        line), btw<br>
|
>>>>>>>>>>>>Â
               Â
        while<br>
|
>>>>>>>>>>>>Â
               Â
        mounting<br>
|
>>>>>>>>>>>>Â
               Â
        via fstab<br>
|
>>>>>>>>>>>>Â
               Â
        I'm unable<br>
|
>>>>>>>>>>>>Â
               Â
        to launch a<br>
|
>>>>>>>>>>>>Â
               Â
        VM without<br>
|
>>>>>>>>>>>>Â
               Â
        cache,<br>
|
>>>>>>>>>>>>Â
               Â
        meanwhile<br>
|
>>>>>>>>>>>>Â
               Â
        direct-io-mode<br>
|
>>>>>>>>>>>>Â
               Â
        is enabled<br>
|
>>>>>>>>>>>>Â
               Â
        in fstab line)<br>
|
>>>>>>>>>>>>Â
               Â
        3. installed VM<br>
|
>>>>>>>>>>>>Â
               Â
        4. bring<br>
|
>>>>>>>>>>>>Â
               Â
        one volume<br>
|
>>>>>>>>>>>>Â
               Â
        down - ok<br>
|
>>>>>>>>>>>>Â
               Â
        5. bringing<br>
|
>>>>>>>>>>>>Â
               Â
        up, waiting<br>
|
>>>>>>>>>>>>Â
               Â
        for sync is<br>
|
>>>>>>>>>>>>Â
               Â
        done.<br>
|
>>>>>>>>>>>>Â
               Â
        6. bring<br>
|
>>>>>>>>>>>>Â
               Â
        other<br>
|
>>>>>>>>>>>>Â
               Â
        volume down<br>
|
>>>>>>>>>>>>Â
               Â
        - getting<br>
|
>>>>>>>>>>>>Â
               Â
        IO errors<br>
|
>>>>>>>>>>>>Â
               Â
        on VM guest<br>
|
>>>>>>>>>>>>Â
               Â
        and not<br>
|
>>>>>>>>>>>>Â
               Â
        able to<br>
|
>>>>>>>>>>>>Â
               Â
        restore the<br>
|
>>>>>>>>>>>>Â
               Â
        VM after I<br>
|
>>>>>>>>>>>>Â
               Â
        reset the<br>
|
>>>>>>>>>>>>Â
               Â
        VM via<br>
|
>>>>>>>>>>>>Â
               Â
        host. It<br>
|
>>>>>>>>>>>>Â
               Â
        says (no<br>
|
>>>>>>>>>>>>Â
               Â
        bootable<br>
|
>>>>>>>>>>>>Â
               Â
        media).<br>
|
>>>>>>>>>>>>Â
               Â
        After I<br>
|
>>>>>>>>>>>>Â
               Â
        shut it<br>
|
>>>>>>>>>>>>Â
               Â
        down<br>
|
>>>>>>>>>>>>Â
               Â
        (forced)<br>
|
>>>>>>>>>>>>Â
               Â
        and bring<br>
|
>>>>>>>>>>>>Â
               Â
        back up, it<br>
|
>>>>>>>>>>>>Â
               Â
        boots.<br>
|
>>>>>>>>>>>Â
               Â
        Could you do<br>
|
>>>>>>>>>>>Â
               Â
        the<br>
|
>>>>>>>>>>>Â
               Â
        following<br>
|
>>>>>>>>>>>Â
               Â
        and test it<br>
|
>>>>>>>>>>>Â
               Â
        again?<br>
|
>>>>>>>>>>>Â
               Â
        gluster<br>
|
>>>>>>>>>>>Â
               Â
        volume set<br>
|
>>>>>>>>>>>Â
               Â
        <volname><br>
|
>>>>>>>>>>>Â
               Â
      Â
 cluster.self-heal-daemon<br>
|
>>>>>>>>>>>Â
               Â
        off<br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>Â
               Â
        Pranith<br>
|
>>>>>>>>>>>><br>
|
>>>>>>>>>>>>Â
               Â
        Need help.<br>
|
>>>>>>>>>>>>Â
               Â
        Tried<br>
|
>>>>>>>>>>>>Â
               Â
        3.4.3, 3.4.4.<br>
|
>>>>>>>>>>>>Â
               Â
        Still<br>
|
>>>>>>>>>>>>Â
               Â
        missing<br>
|
>>>>>>>>>>>>Â
               Â
        pkg-s for<br>
|
>>>>>>>>>>>>Â
               Â
        3.4.5 for<br>
|
>>>>>>>>>>>>Â
               Â
        debian and<br>
|
>>>>>>>>>>>>Â
               Â
        3.5.2<br>
|
>>>>>>>>>>>>Â
               Â
        (3.5.1<br>
|
>>>>>>>>>>>>Â
               Â
        always<br>
|
>>>>>>>>>>>>Â
               Â
        gives a<br>
|
>>>>>>>>>>>>Â
               Â
        healing<br>
|
>>>>>>>>>>>>Â
               Â
        error for<br>
|
>>>>>>>>>>>>Â
               Â
        some reason)<br>
|
>>>>>>>>>>>><br>
|
>>>>>>>>>>>>Â
               Â
        --<br>
|
>>>>>>>>>>>>Â
               Â
        Best regards,<br>
|
>>>>>>>>>>>>Â
               Â
        Roman.<br>
|
>>>>>>>>>>>><br>
|
>>>>>>>>>>>><br>
|
>>>>>>>>>>>>Â
               Â
      Â
 _______________________________________________<br>
|
>>>>>>>>>>>>Â
               Â
        Gluster-users<br>
|
>>>>>>>>>>>>Â
               Â
        mailing list<br>
|
>>>>>>>>>>>>Â
               Â
        <a
moz-do-not-send="true"
href="mailto:Gluster-users@gluster.org"
target="_blank">Gluster-users@gluster.org</a><br>
</div>
</div>
|
>>>>>>>>>>>>Â
                 Â
      <mailto:<a
moz-do-not-send="true"
href="mailto:Gluster-users@gluster.org"
target="_blank">Gluster-users@gluster.org</a>><br>
<div>
<div>|
>>>>>>>>>>>>Â
               Â
        <a
moz-do-not-send="true"
href="http://supercolony.gluster.org/mailman/listinfo/gluster-users"
target="_blank">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a><br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>Â
               Â
      --<br>
|
>>>>>>>>>>>Â
               Â
      Best regards,<br>
|
>>>>>>>>>>>Â
               Â
      Roman.<br>
|
>>>>>>>>>><br>
|
>>>>>>>>>><br>
|
>>>>>>>>>><br>
|
>>>>>>>>>><br>
|
>>>>>>>>>>Â
               Â
    --<br>
|
>>>>>>>>>>Â
               Â
    Best regards,<br>
|
>>>>>>>>>>Â
               Â
    Roman.<br>
|
>>>>>>>>><br>
|
>>>>>>>>><br>
|
>>>>>>>>><br>
|
>>>>>>>>><br>
|
>>>>>>>>>Â
               Â
  --<br>
|
>>>>>>>>>Â
               Â
  Best regards,<br>
|
>>>>>>>>>Â
               Â
  Roman.<br>
|
>>>>>>>><br>
|
>>>>>>>><br>
|
>>>>>>>><br>
|
>>>>>>>><br>
|
>>>>>>>>Â
              Â
 --<br>
|
>>>>>>>>Â
              Â
 Best regards,<br>
|
>>>>>>>>Â
              Â
 Roman.<br>
| >>>>>>><br>
| >>>>>>><br>
| >>>>>>><br>
| >>>>>>><br>
| >>>>>>>Â
              --<br>
| >>>>>>>Â
              Best
regards,<br>
| >>>>>>>Â
            Â
 Roman.<br>
| >>>>>><br>
| >>>>>><br>
| >>>>>><br>
| >>>>>><br>
| >>>>>>Â Â Â
          --<br>
| >>>>>>Â Â Â
          Best regards,<br>
| >>>>>>Â Â Â
          Roman.<br>
| >>>>><br>
| >>>>><br>
| >>>>><br>
| >>>>><br>
| >>>>>Â Â Â Â Â
      --<br>
| >>>>>Â Â Â Â Â
      Best regards,<br>
| >>>>>Â Â Â Â Â
      Roman.<br>
| >>>><br>
| >>>><br>
| >>>><br>
| >>>><br>
| >>>>Â Â Â Â Â Â Â
  --<br>
| >>>>Â Â Â Â Â Â Â
  Best regards,<br>
| >>>>Â Â Â Â Â Â Â
  Roman.<br>
| >>><br>
| >>><br>
| >>><br>
| >>><br>
| >>>Â Â Â Â Â Â Â --<br>
| >>>Â Â Â Â Â Â Â Best
regards,<br>
| >>>Â Â Â Â Â Â
 Roman.<br>
| >><br>
| >><br>
| >><br>
| >><br>
| >>Â Â Â Â Â --<br>
| >>Â Â Â Â Â Best regards,<br>
| >>Â Â Â Â Â Roman.<br>
| >><br>
| >><br>
| >><br>
| >><br>
| >>Â Â Â --<br>
| >>Â Â Â Best regards,<br>
| >>Â Â Â Roman.<br>
| ><br>
| ><br>
| ><br>
| ><br>
| > --<br>
| > Best regards,<br>
| > Roman.<br>
|<br>
|<br>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<span class="HOEnZb"><font
color="#888888">
<div><br>
</div>
-- <br>
Best regards,<br>
Roman. </font></span></div>
<span class="HOEnZb"><font color="#888888">
</font></span></blockquote>
<span class="HOEnZb"><font color="#888888">
<br>
</font></span></div>
<span class="HOEnZb"><font color="#888888"> </font></span></div>
<span class="HOEnZb"><font color="#888888"> </font></span></div>
<span class="HOEnZb"><font color="#888888"> </font></span></blockquote>
<span class="HOEnZb"><font color="#888888"> </font></span></div>
<span class="HOEnZb"><font color="#888888"> <br>
<br clear="all">
<div><br>
</div>
-- <br>
Best regards,<br>
Roman. </font></span></div>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
Best regards,<br>
Roman.
</div>
</blockquote>
<br>
</body>
</html>