<div dir="ltr">Just to be sure: why do you guys create an updated version of glusterfs package for wheezy, if it is not able to install it on wheezy? :)</div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-08-08 9:03 GMT+03:00 Roman <span dir="ltr"><<a href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Oh, unfortunately I won't be able to install 3.5.2 nor 3.4.5 :( They both require libc6 update. I would not risk that way.<div>
<br></div><div><div>Â glusterfs-common : Depends: libc6 (>= 2.14) but 2.13-38+deb7u3 is to be installed</div>
<div>Â Â Â Â Â Â Â Â Â Â Depends: liblvm2app2.2 (>= 2.02.106) but 2.02.95-8 is to be installed</div><div>Â Â Â Â Â Â Â Â Â Â Depends: librdmacm1 (>= 1.0.16) but 1.0.15-1+deb7u1 is to be installed</div></div><div><br>
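For completeness, the breakage is visible with a plain dry run, without touching the system (apt-get's -s/--simulate switch; package names as above):

  apt-get install -s glusterfs-common glusterfs-server   # only reports what would be installed or broken
  apt-cache depends glusterfs-common                     # shows the candidate version's dependencies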
</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-08-07 15:32 GMT+03:00 Roman <span dir="ltr"><<a href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">I'm really sorry to bother, but it seems like all my previous test were waste of time with those generated from /dev/zero files :). Its good and bad news. Now I use real files for my tests. As it my almost last workday, only things I prefer to do is to test and document :) .. so here are some new results:<div>
<br></div><div>So this time I've got two gluster volumes:</div><div><br></div><div>1. with cluster.self-heal-daemon off</div><div>2. with cluster.self-heal-daemon on</div><div><br></div><div>1. real results with SHD off:</div>
Everything seems to work as expected. The VM survives outages of both glusterfs servers, and I can see the resync happening in the network traffic. FINE!

Sometimes healing kicks in late, though (it takes anywhere from 1 minute to 1 hour to sync). I don't know why. Ideas?
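For the record, this is roughly how I toggle the daemon and watch healing (HA-MED-PVE1-1T is just one of my volume names; syntax as in the 3.4 CLI, if I remember it right):

  gluster volume set HA-MED-PVE1-1T cluster.self-heal-daemon off   # or 'on' for the second volume
  gluster volume heal HA-MED-PVE1-1T info                          # list files that still need healing
  gluster volume heal HA-MED-PVE1-1T full                          # force a full heal crawl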
2. test results on the server with SHD on:
The VM is not able to survive the second server restart (as described previously). It gives I/O errors, although the files are synced. Perhaps some locks that do not allow the KVM hypervisor to reconnect to the storage in time?

So the problem actually is sparse files inside a VM :). If one uses them (e.g. generated from /dev/zero), the VM will crash and never come up again due to errors in the qcow2 file headers. Another bug?
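By the way, the sparse-file effect itself is easy to reproduce outside of gluster (assuming GNU coreutils; throwaway paths):

  truncate -s 512M /tmp/sparse.img                        # 512M logical size, but one big hole on disk
  dd if=/dev/urandom of=/tmp/dense.img bs=1M count=512    # 512M of data that cannot be stored as holes
  du -sh                 /tmp/sparse.img /tmp/dense.img   # allocated size: near 0 vs 512M
  du -sh --apparent-size /tmp/sparse.img /tmp/dense.img   # logical size: 512M for both
  md5sum /tmp/sparse.img                                  # holes read back as zeros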
2014-08-07 9:53 GMT+03:00 Roman <romeo.r@gmail.com>:
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Ok, then I hope that we will be able to test it two weeks later. Thanks for your time and  patience. </div>
<div class="gmail_extra"><br><br><div class="gmail_quote">2014-08-07 9:49 GMT+03:00 Pranith Kumar Karampuri <span dir="ltr"><<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><div><div>
<br>
<div>On 08/07/2014 12:17 PM, Roman wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Well, one thing is definitely true: If there is no
healing daemon running, I'm not able to start the VM after
outage. Seems like the qcow2 file is corrupted (KVM unable to
read its header).</div>
</blockquote>
We shall see this again once I have the document with all the steps
that need to be carried out :-)<br>
<br>
Pranith<br>
</div></div><blockquote type="cite">
<div class="gmail_extra">
<br>
<br>
<div class="gmail_quote"><div><div>2014-08-07 9:35 GMT+03:00 Roman <span dir="ltr"><<a href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>></span>:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>
<div dir="ltr">>Â <span style="font-family:arial,sans-serif;font-size:13px">This
should not happen if you do the writes lets say from
'/dev/urandom' instead of '/dev/zero'</span>
<div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px">Somewhere
deep inside me I thought so ! zero is zero :)</span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px">></span><span style="font-family:arial,sans-serif;font-size:13px">I
will provide you with a document for testing this
issue properly. I have a lot going on in my day job so
not getting enough time to write that out. Considering
the weekend is approaching I will > get a bit of
time definitely over the weekend so I will send you
the document over the weekend.</span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<div><font face="arial, sans-serif">Thank you a lot. I'll
wait. Tomorrow starts my vacation and I'll be out for
two weeks, so don't hurry very much. </font></div>
<div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
</div>
</div></div><div class="gmail_extra"><br>
<br>
<div class="gmail_quote"><div><div>
2014-08-07 9:26 GMT+03:00 Pranith Kumar Karampuri <span dir="ltr"><<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><div><div> <br>
<div>On 08/07/2014 11:48 AM, Roman wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">How can they be in sync, if they
are different in size ? And why then VM is not
able to survive gluster outage? I really want to
use glusterfs in our production for
infrastructure virtualization due to its simple
setup, but I'm not able to at this moment. Maybe
you've got some testing agenda? Or could you
list me the steps to make right tests, so our
VM-s would survive the outages.</div>
</blockquote>
This is because of sparse files. <a href="http://en.wikipedia.org/wiki/Sparse_file" target="_blank">http://en.wikipedia.org/wiki/Sparse_file</a><br>
This should not happen if you do the writes lets say
from '/dev/urandom' instead of '/dev/zero'<br>
<br>
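To check whether a given brick file is actually sparse, comparing its logical size with its allocated blocks should be enough. A rough sketch with GNU stat (the path is the example file from your mails):

  stat -c 'size=%s bytes, allocated=%b blocks of %B bytes' /exports/pve1/1T/images/125/vm-125-disk-1.qcow2
  # if size greatly exceeds allocated blocks * block size, the file has holes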
I will provide you with a document for testing this issue properly. I have a lot going on in my day job, so I'm not getting enough time to write it out. Considering the weekend is approaching, I will definitely get a bit of time over the weekend, so I will send you the document then.

Pranith
</div></div><blockquote type="cite"><div><div>
<div dir="ltr">
<div> <br>
</div>
<div>We would like to be sure, that in
situation, when one of storages is down, the
VM-s are running - it is OK, we see this.</div>
<div>We would like to be sure, that data after
the server is back up is synced - we can't see
that atm</div>
<div>We would like to be sure, that VMs are
failovering to the second storage during the
outage - we can't see this atm </div>
<div>:(</div>
</div>
</div></div><div class="gmail_extra"><br>
<br>
<div class="gmail_quote"><div><div>2014-08-07 9:12
GMT+03:00 Pranith Kumar Karampuri <span dir="ltr"><<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><div><div> <br>
<div>On 08/07/2014 11:33 AM, Roman wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">File size increases
because of me :) I generate files on
VM from /dev/zero during the outage of
one server. Then I bring up the downed
server and it seems files never sync.
I'll keep on testing today. Can't read
much from logs also :(. This morning
both VM-s (one on volume with
self-healing and other on volume
without it) survived second server
outage (first server was down
yesterday), while file sizes are
different, VM-s ran without problems.
But I've restarted them before
bringing the second gluster server
down. <br>
</div>
</blockquote>
Then there is no bug :-). It seems the
files are already in sync according to the
extended attributes you have pasted. How
to do you test if the files are in sync or
not?<br>
<br>
Pranith<br>
</div></div><blockquote type="cite"><div><div>
<div dir="ltr">
<div> <br>
</div>
<div>So I'm a bit lost at this moment.
I'll try to keep my testings ordered
and write here, what will happen.</div>
</div>
</div></div><div class="gmail_extra"><br>
<br>
<div class="gmail_quote"><div><div>2014-08-07
8:29 GMT+03:00 Pranith Kumar
Karampuri <span dir="ltr"><<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><div><div>
<div> <br>
<div>On 08/07/2014 10:46 AM,
Roman wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">yes, they do.
<div><br>
</div>
<div>
<div>getfattr: Removing
leading '/' from
absolute path names</div>
<div># file:
exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>trusted.afr.HA-MED-PVE1-1T-client-0=0x000000000000000000000000</div>
<div>trusted.afr.HA-MED-PVE1-1T-client-1=0x000000000000000000000000</div>
<div>trusted.gfid=0x207984df4e6e4ef983f285ed0c4ce8fa</div>
<div><br>
</div>
<div>root@stor1:~# du
-sh
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>1.6G Â
 /exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>root@stor1:~#
md5sum
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>c117d73c9f8a2e09ef13da31f7225fa6
 /exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>root@stor1:~# du
-sh
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>1.6G Â
 /exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>root@stor2:~#
getfattr -d -m. -e hex
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>getfattr: Removing
leading '/' from
absolute path names</div>
<div># file:
exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>trusted.afr.HA-MED-PVE1-1T-client-0=0x000000000000000000000000</div>
<div>trusted.afr.HA-MED-PVE1-1T-client-1=0x000000000000000000000000</div>
<div>trusted.gfid=0x207984df4e6e4ef983f285ed0c4ce8fa</div>
<div><br>
</div>
<div>root@stor2:~#
md5sum
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>c117d73c9f8a2e09ef13da31f7225fa6
 /exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>root@stor2:~# du
-sh
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>2.6G Â
 /exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
</div>
</div>
</blockquote>
</div>
I think the files are differing in size because of the sparse-file healing issue. Could you raise a bug with steps to re-create this issue, where the size of the file increases after healing?

Pranith
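PS: for reading the trusted.afr.* values above: as far as I remember the changelog format, each value packs three big-endian 32-bit counters (pending data, metadata, and entry operations), so all zeros means nothing to heal. A quick decode in bash, using the non-zero value from one of the earlier mails:

  v=0x000001320000000000000000
  echo $(( 0x${v:2:8} )) $(( 0x${v:10:8} )) $(( 0x${v:18:8} ))
  # -> 306 0 0: 306 pending data operations, no metadata or entry changes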
2014-08-06 12:49 GMT+03:00 Humble Chirammal <hchiramm@redhat.com>:
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>
<div><br>
<br>
<br>
----- Original
Message -----<br>
| From: "Pranith
Kumar Karampuri"
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>><br>
| To: "Roman" <<a href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>><br>
| Cc: <a href="mailto:gluster-users@gluster.org" target="_blank">gluster-users@gluster.org</a>,
"Niels de Vos"
<<a href="mailto:ndevos@redhat.com" target="_blank">ndevos@redhat.com</a>>,
"Humble Chirammal"
<<a href="mailto:hchiramm@redhat.com" target="_blank">hchiramm@redhat.com</a>><br>
| Sent: Wednesday,
August 6, 2014
12:09:57 PM<br>
| Subject: Re:
[Gluster-users]
libgfapi failover
problem on replica
bricks<br>
|<br>
| Roman,<br>
| Â Â Â The file
went into
split-brain. I
think we should do
these tests<br>
| with 3.5.2.
Where monitoring
the heals is
easier. Let me
also come up<br>
| with a document
about how to do
this testing you
are trying to do.<br>
|<br>
| Humble/Niels,<br>
| Â Â Â Do we have
debs available for
3.5.2? In 3.5.1
there was
packaging<br>
| issue where
/usr/bin/glfsheal
is not packaged
along with the
deb. I<br>
| think that
should be fixed
now as well?<br>
|<br>
</div>
Pranith,<br>
<br>
The 3.5.2 packages
for debian is not
available yet. We
are co-ordinating
internally to get it
processed.<br>
I will update the
list once its
available.<br>
<br>
--Humble<br>
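Once they land, a quick check of the package contents should catch a repeat of the glfsheal problem. A sketch from memory; the exact package (server vs. client) and .deb file name may differ:

  dpkg -c glusterfs-server_3.5.2-1_amd64.deb | grep glfsheal   # inspect a downloaded .deb
  dpkg -L glusterfs-server | grep glfsheal                     # or verify an installed package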
|
| On 08/06/2014 11:52 AM, Roman wrote:
| > good morning,
| >
| > root@stor1:~# getfattr -d -m. -e hex /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| > getfattr: Removing leading '/' from absolute path names
| > # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| > trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
| > trusted.afr.HA-fast-150G-PVE1-client-1=0x000001320000000000000000
| > trusted.gfid=0x23c79523075a4158bea38078da570449
| >
| > getfattr: Removing leading '/' from absolute path names
| > # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| > trusted.afr.HA-fast-150G-PVE1-client-0=0x000000040000000000000000
| > trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
| > trusted.gfid=0x23c79523075a4158bea38078da570449
| >
| >
| > 2014-08-06 9:20 GMT+03:00 Pranith Kumar Karampuri <pkarampu@redhat.com>:
| >
| >     On 08/06/2014 11:30 AM, Roman wrote:
| >>     Also, this time the files are not the same!
| >>
| >>     root@stor1:~# md5sum /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >>     32411360c53116b96a059f17306caeda  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >>
| >>     root@stor2:~# md5sum /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >>     65b8a6031bcb6f5fb3a11cb1e8b1c9c9  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >     What is the getfattr output?
| >
| >     Pranith
| >
| >>
| >>     2014-08-05 16:33 GMT+03:00 Roman <romeo.r@gmail.com>:
| >>
| >>         Nope, it is not working. But this time it went a bit differently:
| >>
| >>         root@gluster-client:~# dmesg
| >>         Segmentation fault
| >>
| >>         I was not even able to start the VM after I had done the tests:
| >>
| >>         Could not read qcow2 header: Operation not permitted
| >>
| >>         And it seems it never starts to sync the files after the first disconnect. The VM survives the first disconnect, but not the second (I waited around 30 minutes). Also, I've got network.ping-timeout: 2 in the volume settings, but the logs react to the first disconnect only after around 30 seconds. The second was faster, 2 seconds.
| >>
| >>         The reaction was also different:
| >>
| >>         the slower one:
| >>         [2014-08-05 13:26:19.558435] W [socket.c:514:__socket_rwv] 0-glusterfs: readv failed (Connection timed out)
| >>         [2014-08-05 13:26:19.558485] W [socket.c:1962:__socket_proto_state_machine] 0-glusterfs: reading from socket failed. Error (Connection timed out), peer (10.250.0.1:24007)
| >>         [2014-08-05 13:26:21.281426] W [socket.c:514:__socket_rwv] 0-HA-fast-150G-PVE1-client-0: readv failed (Connection timed out)
| >>         [2014-08-05 13:26:21.281474] W [socket.c:1962:__socket_proto_state_machine] 0-HA-fast-150G-PVE1-client-0: reading from socket failed. Error (Connection timed out), peer (10.250.0.1:49153)
| >>         [2014-08-05 13:26:21.281507] I [client.c:2098:client_rpc_notify] 0-HA-fast-150G-PVE1-client-0: disconnected
| >>
| >>         the fast one:
| >>         [2014-08-05 12:52:44.607389] C [client-handshake.c:127:rpc_client_ping_timer_expired] 0-HA-fast-150G-PVE1-client-1: server 10.250.0.2:49153 has not responded in the last 2 seconds, disconnecting.
| >>         [2014-08-05 12:52:44.607491] W [socket.c:514:__socket_rwv] 0-HA-fast-150G-PVE1-client-1: readv failed (No data available)
| >>         [2014-08-05 12:52:44.607585] E [rpc-clnt.c:368:saved_frames_unwind] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8) [0x7fcb1b4b0558] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3) [0x7fcb1b4aea63] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2014-08-05 12:52:42.463881 (xid=0x381883x)
| >>         [2014-08-05 12:52:44.607604] W [client-rpc-fops.c:2624:client3_3_lookup_cbk] 0-HA-fast-150G-PVE1-client-1: remote operation failed: Transport endpoint is not connected. Path: / (00000000-0000-0000-0000-000000000001)
| >>         [2014-08-05 12:52:44.607736] E [rpc-clnt.c:368:saved_frames_unwind] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8) [0x7fcb1b4b0558] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3) [0x7fcb1b4aea63] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2014-08-05 12:52:42.463891 (xid=0x381884x)
| >>         [2014-08-05 12:52:44.607753] W [client-handshake.c:276:client_ping_cbk] 0-HA-fast-150G-PVE1-client-1: timer must have expired
| >>         [2014-08-05 12:52:44.607776] I [client.c:2098:client_rpc_notify] 0-HA-fast-150G-PVE1-client-1: disconnected
| >>
| >>         I've got SSD disks (just for info).
| >>         Should I go and give 3.5.2 a try?
| >>
| >>         2014-08-05 13:06 GMT+03:00 Pranith Kumar Karampuri <pkarampu@redhat.com>:
| >>
| >>             reply along with gluster-users please :-). Maybe you are hitting 'reply' instead of 'reply all'?
| >>
| >>             Pranith
| >>
| >>             On 08/05/2014 03:35 PM, Roman wrote:
| >>>             To be sure and keep things clean, I've created another VM with raw format and am going to repeat those steps. So now I've got two VMs, one with the qcow2 format and the other with the raw format. I will send another e-mail shortly.
| >>>
| >>>             2014-08-05 13:01 GMT+03:00 Pranith Kumar Karampuri <pkarampu@redhat.com>:
| >>>
| >>>                 On 08/05/2014 03:07 PM, Roman wrote:
| >>>>                 really, seems like the same file
| >>>>
| >>>>                 stor1:
| >>>>                 a951641c5230472929836f9fcede6b04  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >>>>
| >>>>                 stor2:
| >>>>                 a951641c5230472929836f9fcede6b04  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >>>>
| >>>>                 one thing I've seen from the logs: somehow Proxmox VE is connecting to the servers with the wrong version?
| >>>>                 [2014-08-05 09:23:45.218550] I [client-handshake.c:1659:select_server_supported_programs] 0-HA-fast-150G-PVE1-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
| >>>                 It is the rpc (over-the-network data structures) version, which has not changed at all since 3.3, so that's not a problem. So what is the conclusion? Is your test case working now or not?
| >>>
| >>>                 Pranith
| >>>
| >>>>                 but if I issue:
| >>>>                 root@pve1:~# glusterfs -V
| >>>>                 glusterfs 3.4.4 built on Jun 28 2014 03:44:57
| >>>>                 it seems OK.
| >>>>
| >>>>                 meanwhile the servers use 3.4.4:
| >>>>                 [2014-08-05 09:23:45.117875] I [server-handshake.c:567:server_setvolume] 0-HA-fast-150G-PVE1-server: accepted client from stor1-9004-2014/08/05-09:23:45:93538-HA-fast-150G-PVE1-client-1-0 (version: 3.4.4)
| >>>>                 [2014-08-05 09:23:49.103035] I [server-handshake.c:567:server_setvolume] 0-HA-fast-150G-PVE1-server: accepted client from stor1-8998-2014/08/05-09:23:45:89883-HA-fast-150G-PVE1-client-0-0 (version: 3.4.4)
| >>>>
| >>>>                 if this could be the reason, of course.
| >>>>                 I did restart the Proxmox VE yesterday (just for information).
| >>>>
| >>>>                 2014-08-05 12:30 GMT+03:00 Pranith Kumar Karampuri <pkarampu@redhat.com>:
| >>>>
| >>>>                     On 08/05/2014 02:33 PM, Roman wrote:
| >>>>>                     Waited long enough for now; still different sizes and no logs about healing :(
| >>>>>
| >>>>>                     stor1
| >>>>>                     # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >>>>>                     trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
| >>>>>                     trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
| >>>>>                     trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
| >>>>>
| >>>>>                     root@stor1:~# du -sh /exports/fast-test/150G/images/127/
| >>>>>                     1.2G  /exports/fast-test/150G/images/127/
| >>>>>
| >>>>>                     stor2
| >>>>>                     # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >>>>>                     trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
| >>>>>                     trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
| >>>>>                     trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
| >>>>>
| >>>>>                     root@stor2:~# du -sh /exports/fast-test/150G/images/127/
| >>>>>                     1.4G  /exports/fast-test/150G/images/127/
| >>>>                     According to the changelogs, the file doesn't need any healing. Could you stop the operations on the VMs and take the md5sum on both these machines?
| >>>>
| >>>>                     Pranith
| >>>>>
| >>>>>                     2014-08-05 11:49 GMT+03:00 Pranith Kumar Karampuri <pkarampu@redhat.com>:
| >>>>>
| >>>>>                         On 08/05/2014 02:06 PM, Roman wrote:
| >>>>>>                         Well, it seems like it doesn't see the changes that were made to the volume? I created two files, 200 and 100 MB (from /dev/zero), after I disconnected the first brick. Then I connected it back and got these logs:
| >>>>>>
| >>>>>>                         [2014-08-05 08:30:37.830150] I [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
| >>>>>>                         [2014-08-05 08:30:37.830207] I [rpc-clnt.c:1676:rpc_clnt_reconfig] 0-HA-fast-150G-PVE1-client-0: changing port to 49153 (from 0)
| >>>>>>                         [2014-08-05 08:30:37.830239] W [socket.c:514:__socket_rwv] 0-HA-fast-150G-PVE1-client-0: readv failed (No data available)
| >>>>>>                         [2014-08-05 08:30:37.831024] I [client-handshake.c:1659:select_server_supported_programs] 0-HA-fast-150G-PVE1-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
| >>>>>>                         [2014-08-05 08:30:37.831375] I [client-handshake.c:1456:client_setvolume_cbk] 0-HA-fast-150G-PVE1-client-0: Connected to 10.250.0.1:49153, attached to remote volume '/exports/fast-test/150G'.
| >>>>>>                         [2014-08-05 08:30:37.831394] I [client-handshake.c:1468:client_setvolume_cbk] 0-HA-fast-150G-PVE1-client-0: Server and Client lk-version numbers are not same, reopening the fds
| >>>>>>                         [2014-08-05 08:30:37.831566] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-fast-150G-PVE1-client-0: Server lk version = 1
| >>>>>>
| >>>>>>                         [2014-08-05 08:30:37.830150] I [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
| >>>>>>                         this line seems weird to me, tbh.
| >>>>>>                         I do not see any traffic on the switch interfaces between the gluster servers, which
...

[message not shown in full]

--
Best regards,
Roman.