<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<div class="moz-cite-prefix">On 08/08/2014 11:35 AM, Roman wrote:<br>
</div>
<blockquote
cite="mid:CAFR=TBq=_2PN7kdxQyEiho=BKcfPB7LTt4yaDiaTiOcj6eXuqg@mail.gmail.com"
type="cite">
<div dir="ltr">Just to be sure: why do you guys create an updated glusterfs package for wheezy if it cannot be installed on wheezy? :)</div>
</blockquote>
CCed lala, Humble, and kaleb, who may know the answer<br>
<br>
Pranith<br>
<blockquote
cite="mid:CAFR=TBq=_2PN7kdxQyEiho=BKcfPB7LTt4yaDiaTiOcj6eXuqg@mail.gmail.com"
type="cite">
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">2014-08-08 9:03 GMT+03:00 Roman <span
dir="ltr"><<a moz-do-not-send="true"
href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Oh, unfortunately I won't be able to install either 3.5.2 or 3.4.5 :( They both require a libc6 update, and I wouldn't risk that.
<div>
<br>
</div>
<div>
<div> glusterfs-common : Depends: libc6 (>= 2.14) but
2.13-38+deb7u3 is to be installed</div>
<div> Depends: liblvm2app2.2 (>=
2.02.106) but 2.02.95-8 is to be installed</div>
<div> Depends: librdmacm1 (>=
1.0.16) but 1.0.15-1+deb7u1 is to be installed</div>
</div>
<div><br>
</div>
</div>
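The failure above is plain version ordering: wheezy ships libc6 2.13, and the rebuilt packages were linked against 2.14. A minimal sketch of the comparison apt is performing, using `sort -V` as a stand-in for dpkg's version comparator (close enough for these two version strings):

```shell
# Versions taken from the apt output above; sort -V approximates
# dpkg's version ordering for this simple case.
required="2.14"                # what glusterfs-common demands
installed="2.13-38+deb7u3"     # what wheezy actually ships
lowest=$(printf '%s\n' "$required" "$installed" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
    echo "libc6 dependency satisfied"
else
    echo "libc6 dependency NOT satisfiable on wheezy"
fi
```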
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">2014-08-07 15:32 GMT+03:00 Roman
<span dir="ltr"><<a moz-do-not-send="true"
href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">I'm really sorry to bother you, but it seems all my previous tests were a waste of time, since the test files were generated from /dev/zero :). That's both good and bad news. Now I use real files for my tests. As this is almost my last workday, the only things I want to do are test and document :) .. so here are some new results:
<div>
<br>
</div>
<div>So this time I've got two gluster volumes:</div>
<div><br>
</div>
<div>1. with cluster.self-heal-daemon off</div>
<div>2. with cluster.self-heal-daemon on</div>
<div><br>
</div>
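A sketch of how such a two-volume setup is toggled (the volume names are taken from the client translator names seen later in this thread; adjust to your own):

```shell
# cluster.self-heal-daemon enables/disables the self-heal daemon (SHD)
# per volume; "heal ... info" lists entries still pending heal.
gluster volume set HA-fast-150G-PVE1 cluster.self-heal-daemon off
gluster volume set HA-MED-PVE1-1T cluster.self-heal-daemon on
gluster volume heal HA-MED-PVE1-1T info
```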
<div>1. Real results with SHD off:</div>
<div>Everything seems to work as expected. The VM survives outages of both glusterfs servers, and I can watch the sync happen via network traffic. FINE! </div>
<div><br>
</div>
<div>Sometimes healing starts a bit late (the sync takes anywhere from 1 minute to 1 hour). I don't know why. Any ideas?</div>
<div><br>
</div>
<div>2. Test results with SHD on:</div>
<div>
The VM is not able to survive the second server restart (as observed before): it gets IO errors even though the files are synced. Perhaps some locks do not allow the KVM hypervisor to reconnect to the storage in time?</div>
<div><br>
</div>
<div><br>
</div>
<div>So the problem actually is sparse files inside a VM :). If one uses them (e.g. generated from /dev/zero), the VM will crash and never come up due to errors in the qcow2 file header. Another bug?</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">2014-08-07 9:53 GMT+03:00
Roman <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:romeo.r@gmail.com"
target="_blank">romeo.r@gmail.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0
0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div dir="ltr">OK, then I hope we will be able to test it in two weeks. Thanks for your time and patience. </div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">2014-08-07 9:49
GMT+03:00 Pranith Kumar Karampuri <span
dir="ltr"><<a moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>></span>:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div>
<div> <br>
<div>On 08/07/2014 12:17 PM, Roman
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Well, one thing is definitely true: if no healing daemon is running, I'm not able to start the VM after an outage. It seems the qcow2 file is corrupted (KVM is unable to read its header).</div>
</blockquote>
We shall see this again once I have
the document with all the steps that
need to be carried out :-)<br>
<br>
Pranith<br>
</div>
</div>
<blockquote type="cite">
<div class="gmail_extra"> <br>
<br>
<div class="gmail_quote">
<div>
<div>2014-08-07 9:35 GMT+03:00
Roman <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:romeo.r@gmail.com"
target="_blank">romeo.r@gmail.com</a>></span>:<br>
</div>
</div>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div>
<div>
<div dir="ltr">> <span style="font-family:arial,sans-serif;font-size:13px">This should not happen if you do the writes, let's say, from '/dev/urandom' instead of '/dev/zero'</span>
<div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px">Somewhere deep inside I thought so! Zero is zero :)</span></div>
<div><span
style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px">></span><span style="font-family:arial,sans-serif;font-size:13px">I will provide you with a document for testing this issue properly. I have a lot going on in my day job, so I'm not getting enough time to write it out. Considering the weekend is approaching, I will definitely get a bit of time over the weekend, so I will send you the document then.</span></div>
<div><span
style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<div><font face="arial,
sans-serif">Thanks a lot. I'll wait. My vacation starts tomorrow and I'll be out for two weeks, so there's no big hurry. </font></div>
<div><span
style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<div><span
style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
</div>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">
<div>
<div> 2014-08-07 9:26
GMT+03:00 Pranith Kumar
Karampuri <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>:<br>
</div>
</div>
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div text="#000000"
bgcolor="#FFFFFF">
<div>
<div> <br>
<div>On 08/07/2014
11:48 AM, Roman
wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">How can they be in sync if they differ in size? And why, then, is the VM not able to survive a gluster outage? I really want to use glusterfs in our production for infrastructure virtualization because of its simple setup, but I can't at the moment. Maybe you have a testing agenda? Or could you list the steps for proper tests, so our VMs would survive the outages?</div>
</blockquote>
This is because of sparse files. <a
moz-do-not-send="true"
href="http://en.wikipedia.org/wiki/Sparse_file" target="_blank">http://en.wikipedia.org/wiki/Sparse_file</a><br>
This should not happen if you do the writes, let's say, from '/dev/urandom' instead of '/dev/zero'<br>
<br>
I will provide you with a document for testing this issue properly. I have a lot going on in my day job, so I'm not getting enough time to write it out. Considering the weekend is approaching, I will definitely get a bit of time over the weekend, so I will send you the document then.<br>
<br>
Pranith<br>
</div>
</div>
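The du/md5sum mismatch discussed in this thread is easy to reproduce outside gluster. A minimal sketch (assuming Linux with GNU coreutils): a file full of holes and a dense copy of it have identical checksums but very different on-disk sizes, which is what a heal that rewrites holes as literal zeros produces:

```shell
# a.img is 8 MiB of holes (almost no blocks allocated); b.img holds
# the same bytes written out densely, as a naive heal would do.
dd if=/dev/zero of=a.img bs=1M seek=8 count=0 2>/dev/null
cp --sparse=never a.img b.img
md5sum a.img b.img   # checksums match
du -h a.img b.img    # a.img allocates ~0, b.img a full 8 MiB on disk
```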
<blockquote type="cite">
<div>
<div>
<div dir="ltr">
<div> <br>
</div>
<div>We would like to be sure that, when one of the storage servers is down, the VMs keep running - this is OK, we can see it.</div>
<div>We would like to be sure that the data is synced after the server comes back up - we can't see that atm.</div>
<div>We would like to be sure that the VMs fail over to the second storage server during the outage - we can't see this atm </div>
<div>:(</div>
</div>
</div>
</div>
<div
class="gmail_extra"><br>
<br>
<div
class="gmail_quote">
<div>
<div>2014-08-07
9:12 GMT+03:00
Pranith Kumar
Karampuri <span
dir="ltr"><<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>:<br>
</div>
</div>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div
text="#000000"
bgcolor="#FFFFFF">
<div>
<div> <br>
<div>On
08/07/2014
11:33 AM,
Roman wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">The file size increases because of me :) I generate files in the VM from /dev/zero during the outage of one server. Then I bring the downed server back up, and it seems the files never sync. I'll keep on testing today. I can't read much from the logs either :(. This morning both VMs (one on the volume with self-healing and the other on the volume without it) survived a second-server outage (the first server was down yesterday); even though the file sizes differed, the VMs ran without problems. But I had restarted them before bringing the second gluster server down. <br>
</div>
</blockquote>
Then there is no bug :-). It seems the files are already in sync, according to the extended attributes you have pasted. How do you test whether the files are in sync or not?<br>
<br>
Pranith<br>
</div>
</div>
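The "extended attributes" referred to here are the trusted.afr.&lt;volume&gt;-client-N changelogs: in this AFR version each value packs three big-endian 32-bit counters (data, metadata, and entry pending operations), and all zeros means nothing is pending. A small decoder sketch (`decode_afr` is a hypothetical helper, not a gluster tool):

```shell
# Splits the 24 hex digits of a trusted.afr.* value into the three
# pending-operation counters (data, metadata, entry).
decode_afr() {
    v=$(printf '%s' "$1" | sed 's/^0x//')
    printf 'data=%d metadata=%d entry=%d\n' \
        "0x$(printf '%s' "$v" | cut -c1-8)" \
        "0x$(printf '%s' "$v" | cut -c9-16)" \
        "0x$(printf '%s' "$v" | cut -c17-24)"
}
decode_afr 0x000001320000000000000000   # value from an earlier getfattr in this thread
# -> data=306 metadata=0 entry=0
```

A non-zero data counter under client-1, as in the example, means that replica still has data writes pending heal from the other brick.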
<blockquote
type="cite">
<div>
<div>
<div dir="ltr">
<div> <br>
</div>
<div>So I'm a bit lost at the moment. I'll try to keep my tests ordered and write here what happens.</div>
</div>
</div>
</div>
<div
class="gmail_extra"><br>
<br>
<div
class="gmail_quote">
<div>
<div>2014-08-07
8:29 GMT+03:00
Pranith Kumar
Karampuri <span
dir="ltr"><<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>:<br>
</div>
</div>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div
text="#000000"
bgcolor="#FFFFFF">
<div>
<div>
<div> <br>
<div>On
08/07/2014
10:46 AM,
Roman wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">yes,
they do.
<div><br>
</div>
<div>
<div>getfattr:
Removing
leading '/'
from absolute
path names</div>
<div># file:
exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>trusted.afr.HA-MED-PVE1-1T-client-0=0x000000000000000000000000</div>
<div>trusted.afr.HA-MED-PVE1-1T-client-1=0x000000000000000000000000</div>
<div>trusted.gfid=0x207984df4e6e4ef983f285ed0c4ce8fa</div>
<div><br>
</div>
<div>root@stor1:~#
du -sh
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>1.6G
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>root@stor1:~#
md5sum
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>c117d73c9f8a2e09ef13da31f7225fa6
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>root@stor1:~#
du -sh
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>1.6G
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>root@stor2:~#
getfattr -d
-m. -e hex
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>getfattr:
Removing
leading '/'
from absolute
path names</div>
<div># file:
exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>trusted.afr.HA-MED-PVE1-1T-client-0=0x000000000000000000000000</div>
<div>trusted.afr.HA-MED-PVE1-1T-client-1=0x000000000000000000000000</div>
<div>trusted.gfid=0x207984df4e6e4ef983f285ed0c4ce8fa</div>
<div><br>
</div>
<div>root@stor2:~#
md5sum
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>c117d73c9f8a2e09ef13da31f7225fa6
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>root@stor2:~#
du -sh
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>2.6G
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
</div>
</div>
</blockquote>
</div>
I think the files are differing in size because of the sparse-file healing issue. Could you raise a bug with the steps to re-create this issue, where the size of the file increases after healing?<span><font color="#888888"><br>
<br>
Pranith</font></span>
</div>
</div>
<div>
<div><br>
<blockquote
type="cite">
<div dir="ltr">
<div><br>
</div>
</div>
<div
class="gmail_extra"><br>
<br>
<div
class="gmail_quote">
<div>
<div>
2014-08-06
12:49
GMT+03:00
Humble
Chirammal <span
dir="ltr"><<a
moz-do-not-send="true" href="mailto:hchiramm@redhat.com" target="_blank">hchiramm@redhat.com</a>></span>:<br>
</div>
</div>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div>
<div>
<div><br>
<br>
<br>
----- Original
Message -----<br>
| From:
"Pranith Kumar
Karampuri"
<<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>><br>
| To: "Roman"
<<a
moz-do-not-send="true"
href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>><br>
| Cc: <a
moz-do-not-send="true"
href="mailto:gluster-users@gluster.org" target="_blank">gluster-users@gluster.org</a>,
"Niels de Vos"
<<a
moz-do-not-send="true"
href="mailto:ndevos@redhat.com" target="_blank">ndevos@redhat.com</a>>,
"Humble
Chirammal"
<<a
moz-do-not-send="true"
href="mailto:hchiramm@redhat.com" target="_blank">hchiramm@redhat.com</a>><br>
| Sent:
Wednesday,
August 6, 2014
12:09:57 PM<br>
| Subject: Re:
[Gluster-users]
libgfapi
failover
problem on
replica bricks<br>
|<br>
| Roman,<br>
| The
file went into
split-brain. I
think we
should do
these tests<br>
| with 3.5.2, where
monitoring the
heals is
easier. Let me
also come up<br>
| with a
document about
how to do this
testing you
are trying to
do.<br>
|<br>
|
Humble/Niels,<br>
| Do we
have debs
available for
3.5.2? In
3.5.1 there
was packaging<br>
| issue where
/usr/bin/glfsheal
is not
packaged along
with the deb.
I<br>
| think that
should be
fixed now as
well?<br>
|<br>
</div>
Pranith,<br>
<br>
The 3.5.2 packages for Debian are not available yet. We are co-ordinating internally to get them processed.<br>
I will update the list once they are available.<br>
<br>
--Humble<br>
<div>|<br>
| On
08/06/2014
11:52 AM,
Roman wrote:<br>
| > good
morning,<br>
| ><br>
| >
root@stor1:~#
getfattr -d
-m. -e hex<br>
| >
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >
getfattr:
Removing
leading '/'
from absolute
path names<br>
| > # file:
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000<br>
| >
trusted.afr.HA-fast-150G-PVE1-client-1=0x000001320000000000000000<br>
| >
trusted.gfid=0x23c79523075a4158bea38078da570449<br>
| ><br>
| >
getfattr:
Removing
leading '/'
from absolute
path names<br>
| > # file:
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000040000000000000000<br>
| >
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000<br>
| >
trusted.gfid=0x23c79523075a4158bea38078da570449<br>
| ><br>
| ><br>
| ><br>
| >
2014-08-06
9:20 GMT+03:00
Pranith Kumar
Karampuri <<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a><br>
</div>
| >
<mailto:<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>| ><br>
| ><br>
| > On
08/06/2014
11:30 AM,
Roman wrote:<br>
| >>
Also, this
time files are
not the same!<br>
| >><br>
| >>
root@stor1:~#
md5sum<br>
| >>
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >>
32411360c53116b96a059f17306caeda<br>
| >>
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >><br>
| >>
root@stor2:~#
md5sum<br>
| >>
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >>
65b8a6031bcb6f5fb3a11cb1e8b1c9c9<br>
| >>
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >
What is the
getfattr
output?<br>
| ><br>
| >
Pranith<br>
| ><br>
| >><br>
| >><br>
| >>
2014-08-05
16:33
GMT+03:00
Roman <<a
moz-do-not-send="true"
href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a><br>
</div>
| >>
<mailto:<a
moz-do-not-send="true" href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>>>:<br>
<div>|
>><br>
| >>
Nope, it
is not
working. But
this time it
went a bit differently<br>
| >><br>
| >>
root@gluster-client:~#
dmesg<br>
| >>
Segmentation
fault<br>
| >><br>
| >><br>
| >>
I was not even able to start the VM after I did the tests<br>
| >><br>
| >>
Could not
read qcow2
header:
Operation not
permitted<br>
| >><br>
| >>
And it seems it never starts to sync the files after the first<br>
| >>
disconnect. The VM survives the first disconnect, but not the second (I<br>
| >>
waited
around 30
minutes).
Also, I've<br>
| >>
got
network.ping-timeout:
2 in volume
settings, but
logs<br>
| >>
react on
first
disconnect
around 30
seconds.
Second was<br>
| >>
faster, 2
seconds.<br>
| >><br>
| >>
Reaction
was different
also:<br>
| >><br>
| >>
slower
one:<br>
| >>
[2014-08-05
13:26:19.558435]
W
[socket.c:514:__socket_rwv]<br>
| >>
0-glusterfs:
readv failed
(Connection
timed out)<br>
| >>
[2014-08-05
13:26:19.558485]
W<br>
| >>
[socket.c:1962:__socket_proto_state_machine]
0-glusterfs:<br>
| >>
reading
from socket
failed. Error
(Connection
timed out),<br>
</div>
| >>
peer (<a
moz-do-not-send="true"
href="http://10.250.0.1:24007" target="_blank">10.250.0.1:24007</a> <<a
moz-do-not-send="true" href="http://10.250.0.1:24007" target="_blank">http://10.250.0.1:24007</a>>)<br>
<div>|
>>
[2014-08-05
13:26:21.281426]
W
[socket.c:514:__socket_rwv]<br>
| >>
0-HA-fast-150G-PVE1-client-0:
readv failed
(Connection
timed out)<br>
| >>
[2014-08-05
13:26:21.281474]
W<br>
| >>
[socket.c:1962:__socket_proto_state_machine]<br>
| >>
0-HA-fast-150G-PVE1-client-0:
reading from
socket failed.<br>
| >>
Error
(Connection
timed out),
peer (<a
moz-do-not-send="true"
href="http://10.250.0.1:49153" target="_blank">10.250.0.1:49153</a><br>
</div>
| >>
<<a
moz-do-not-send="true"
href="http://10.250.0.1:49153" target="_blank">http://10.250.0.1:49153</a>>)<br>
<div>|
>>
[2014-08-05
13:26:21.281507]
I<br>
| >>
[client.c:2098:client_rpc_notify]<br>
| >>
0-HA-fast-150G-PVE1-client-0:
disconnected<br>
| >><br>
| >>
the fast
one:<br>
| >>
2014-08-05
12:52:44.607389]
C<br>
| >>
[client-handshake.c:127:rpc_client_ping_timer_expired]<br>
| >>
0-HA-fast-150G-PVE1-client-1:
server <a
moz-do-not-send="true"
href="http://10.250.0.2:49153" target="_blank">10.250.0.2:49153</a><br>
</div>
| >>
<<a
moz-do-not-send="true"
href="http://10.250.0.2:49153" target="_blank">http://10.250.0.2:49153</a>>
has not
responded in
the last 2<br>
<div>
<div>|
>>
seconds,
disconnecting.<br>
| >>
[2014-08-05
12:52:44.607491]
W
[socket.c:514:__socket_rwv]<br>
| >>
0-HA-fast-150G-PVE1-client-1:
readv failed
(No data
available)<br>
| >>
[2014-08-05
12:52:44.607585]
E<br>
| >>
[rpc-clnt.c:368:saved_frames_unwind]<br>
| >>
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)<br>
| >>
[0x7fcb1b4b0558]<br>
| >>
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)<br>
| >>
[0x7fcb1b4aea63]<br>
| >>
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)<br>
| >>
[0x7fcb1b4ae97e])))
0-HA-fast-150G-PVE1-client-1:
forced<br>
| >>
unwinding
frame
type(GlusterFS
3.3)
op(LOOKUP(27))
called at<br>
| >>
2014-08-05
12:52:42.463881
(xid=0x381883x)<br>
| >>
[2014-08-05
12:52:44.607604]
W<br>
| >>
[client-rpc-fops.c:2624:client3_3_lookup_cbk]<br>
| >>
0-HA-fast-150G-PVE1-client-1:
remote
operation
failed:<br>
| >>
Transport
endpoint is
not connected.
Path: /<br>
| >>
(00000000-0000-0000-0000-000000000001)<br>
| >>
[2014-08-05
12:52:44.607736]
E<br>
| >>
[rpc-clnt.c:368:saved_frames_unwind]<br>
| >>
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)<br>
| >>
[0x7fcb1b4b0558]<br>
| >>
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)<br>
| >>
[0x7fcb1b4aea63]<br>
| >>
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)<br>
| >>
[0x7fcb1b4ae97e])))
0-HA-fast-150G-PVE1-client-1:
forced<br>
| >>
unwinding
frame
type(GlusterFS
Handshake)
op(PING(3))
called<br>
| >>
at
2014-08-05
12:52:42.463891
(xid=0x381884x)<br>
| >>
[2014-08-05
12:52:44.607753]
W<br>
| >>
[client-handshake.c:276:client_ping_cbk]<br>
| >>
0-HA-fast-150G-PVE1-client-1:
timer must
have expired<br>
| >>
[2014-08-05
12:52:44.607776]
I<br>
| >>
[client.c:2098:client_rpc_notify]<br>
| >>
0-HA-fast-150G-PVE1-client-1:
disconnected<br>
| >><br>
| >><br>
| >><br>
| >>
I've got SSD disks (just for info).<br>
| >>
Should I go and give 3.5.2 a try?<br>
| >><br>
| >><br>
| >><br>
| >>
2014-08-05
13:06
GMT+03:00
Pranith Kumar
Karampuri<br>
</div>
</div>
| >>
<<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>
<mailto:<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>|
>><br>
| >>
reply
along with
gluster-users
please :-).
May be you are<br>
| >>
hitting
'reply'
instead of
'reply all'?<br>
| >><br>
| >>
Pranith<br>
| >><br>
| >>
On
08/05/2014
03:35 PM,
Roman wrote:<br>
| >>>
To
make sure and
clean, I've
created
another VM
with raw<br>
| >>>
format and am going to repeat those steps. So now I've got<br>
| >>>
two VMs, one with qcow2 format and the other with raw<br>
| >>>
format. I will
send another
e-mail
shortly.<br>
| >>><br>
| >>><br>
| >>>
2014-08-05
13:01
GMT+03:00
Pranith Kumar
Karampuri<br>
</div>
| >>>
<<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>
<mailto:<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>><br>
| >>><br>
| >>>
On
08/05/2014
03:07 PM,
Roman wrote:<br>
|
>>>>
really,
seems like the
same file<br>
|
>>>><br>
|
>>>>
stor1:<br>
|
>>>>
a951641c5230472929836f9fcede6b04<br>
|
>>>>
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
|
>>>><br>
|
>>>>
stor2:<br>
|
>>>>
a951641c5230472929836f9fcede6b04<br>
|
>>>>
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
|
>>>><br>
|
>>>><br>
|
>>>>
one thing I've seen in the logs: somehow Proxmox<br>
|
>>>>
VE is connecting to the servers with the wrong version?<br>
|
>>>>
[2014-08-05
09:23:45.218550]
I<br>
|
>>>>
[client-handshake.c:1659:select_server_supported_programs]<br>
|
>>>>
0-HA-fast-150G-PVE1-client-0:
Using Program<br>
|
>>>>
GlusterFS
3.3, Num
(1298437),
Version (330)<br>
| >>>
It is the
rpc (over the
network data
structures)<br>
| >>>
version,
which is not
changed at all
from 3.3 so<br>
| >>>
that's not a
problem. So
what is the
conclusion? Is<br>
| >>>
your test
case working
now or not?<br>
| >>><br>
| >>>
Pranith<br>
| >>><br>
|
>>>>
but if I
issue:<br>
|
>>>>
root@pve1:~#
glusterfs -V<br>
|
>>>>
glusterfs
3.4.4 built on
Jun 28 2014
03:44:57<br>
|
>>>>
seems ok.<br>
|
>>>><br>
|
>>>>
server use
3.4.4
meanwhile<br>
|
>>>>
[2014-08-05
09:23:45.117875]
I<br>
|
>>>>
[server-handshake.c:567:server_setvolume]<br>
|
>>>>
0-HA-fast-150G-PVE1-server:
accepted
client from<br>
|
>>>>
stor1-9004-2014/08/05-09:23:45:93538-HA-fast-150G-PVE1-client-1-0<br>
|
>>>>
(version:
3.4.4)<br>
|
>>>>
[2014-08-05
09:23:49.103035]
I<br>
|
>>>>
[server-handshake.c:567:server_setvolume]<br>
|
>>>>
0-HA-fast-150G-PVE1-server:
accepted
client from<br>
|
>>>>
stor1-8998-2014/08/05-09:23:45:89883-HA-fast-150G-PVE1-client-0-0<br>
|
>>>>
(version:
3.4.4)<br>
|
>>>><br>
|
>>>>
if this
could be the
reason, of
course.<br>
|
>>>>
I did
restart the
Proxmox VE
yesterday
(just for an<br>
|
>>>>
information)<br>
|
>>>><br>
|
>>>><br>
|
>>>><br>
|
>>>><br>
|
>>>><br>
|
>>>>
2014-08-05
12:30
GMT+03:00
Pranith Kumar
Karampuri<br>
</div>
</div>
|
>>>>
<<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>
<mailto:<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>><br>
|
>>>><br>
|
>>>>
On
08/05/2014
02:33 PM,
Roman wrote:<br>
|
>>>>>
I've waited long enough for now, still different<br>
|
>>>>>
sizes
and no logs
about healing
:(<br>
|
>>>>><br>
|
>>>>>
stor1<br>
|
>>>>>
# file:<br>
|
>>>>>
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
|
>>>>>
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000<br>
|
>>>>>
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000<br>
|
>>>>>
trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921<br>
|
>>>>><br>
|
>>>>>
root@stor1:~#
du -sh<br>
|
>>>>>
/exports/fast-test/150G/images/127/<br>
|
>>>>>
1.2G
/exports/fast-test/150G/images/127/<br>
|
>>>>><br>
|
>>>>><br>
|
>>>>>
stor2<br>
|
>>>>>
# file:<br>
|
>>>>>
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
|
>>>>>
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000<br>
|
>>>>>
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000<br>
|
>>>>>
trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921<br>
|
>>>>><br>
|
>>>>><br>
|
>>>>>
root@stor2:~#
du -sh<br>
|
>>>>>
/exports/fast-test/150G/images/127/<br>
|
>>>>>
1.4G
/exports/fast-test/150G/images/127/<br>
|
>>>>
According to
the
changelogs,
the file
doesn't<br>
|
>>>>
need any
healing. Could
you stop the
operations<br>
|
>>>>
on the
VMs and take
md5sum on both
these
machines?<br>
|
>>>><br>
|
>>>>
Pranith<br>
|
>>>><br>
|
>>>>><br>
|
>>>>><br>
|
>>>>><br>
|
>>>>><br>
|
>>>>>
2014-08-05
11:49
GMT+03:00
Pranith Kumar<br>
|
>>>>>
Karampuri <<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
|
>>>>>
<mailto:<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>>><br>
|
>>>>><br>
|
>>>>>
On
08/05/2014
02:06 PM,
Roman wrote:<br>
|
>>>>>>
Well, it seems
like it
doesn't see
the<br>
|
>>>>>>
changes that were made to the volume? I<br>
|
>>>>>>
created two
files 200 and
100 MB (from<br>
|
>>>>>>
/dev/zero)
after I
disconnected
the first<br>
|
>>>>>>
brick. Then
connected it
back and got<br>
|
>>>>>>
these logs:<br>
|
>>>>>><br>
|
>>>>>>
[2014-08-05
08:30:37.830150]
I<br>
|
>>>>>>
[glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]<br>
|
>>>>>>
0-glusterfs:
No change in
volfile,
continuing<br>
|
>>>>>>
[2014-08-05
08:30:37.830207]
I<br>
|
>>>>>>
[rpc-clnt.c:1676:rpc_clnt_reconfig]<br>
|
>>>>>>
0-HA-fast-150G-PVE1-client-0:
changing<br>
|
>>>>>>
port
to 49153 (from
0)<br>
|
>>>>>>
[2014-08-05
08:30:37.830239]
W<br>
|
>>>>>>
[socket.c:514:__socket_rwv]<br>
|
>>>>>>
0-HA-fast-150G-PVE1-client-0:
readv<br>
|
>>>>>>
failed (No
data
available)<br>
|
>>>>>>
[2014-08-05
08:30:37.831024]
I<br>
|
>>>>>>
[client-handshake.c:1659:select_server_supported_programs]<br>
|
>>>>>>
0-HA-fast-150G-PVE1-client-0:
Using<br>
|
>>>>>>
Program
GlusterFS 3.3,
Num (1298437),<br>
|
>>>>>>
Version (330)<br>
|
>>>>>>
[2014-08-05
08:30:37.831375]
I<br>
|
>>>>>>
[client-handshake.c:1456:client_setvolume_cbk]<br>
|
>>>>>>
0-HA-fast-150G-PVE1-client-0:
Connected<br>
|
>>>>>>
to <a
moz-do-not-send="true" href="http://10.250.0.1:49153" target="_blank">10.250.0.1:49153</a><br>
</div>
</div>
|
>>>>>>
<<a
moz-do-not-send="true" href="http://10.250.0.1:49153" target="_blank">http://10.250.0.1:49153</a>>,
attached to<br>
</div>
</div>
<div>
<div>
<div>
<div>|
>>>>>>
remote volume
'/exports/fast-test/150G'.<br>
|
>>>>>>
[2014-08-05
08:30:37.831394]
I<br>
|
>>>>>>
[client-handshake.c:1468:client_setvolume_cbk]<br>
|
>>>>>>
0-HA-fast-150G-PVE1-client-0:
Server and<br>
|
>>>>>>
Client
lk-version
numbers are
not same,<br>
|
>>>>>>
reopening the
fds<br>
|
>>>>>>
[2014-08-05
08:30:37.831566]
I<br>
|
>>>>>>
[client-handshake.c:450:client_set_lk_version_cbk]<br>
|
>>>>>>
0-HA-fast-150G-PVE1-client-0:
Server lk<br>
|
>>>>>>
version = 1<br>
|
>>>>>><br>
|
>>>>>><br>
|
>>>>>>
[2014-08-05
08:30:37.830150]
I<br>
|
>>>>>>
[glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]<br>
|
>>>>>>
0-glusterfs:
No change in
volfile,
continuing<br>
</div>
</div>
<div> |
>>>>>>
this
line seems
weird to me
tbh.<br>
|
>>>>>>
I do
not see any
traffic on
switch<br>
|
>>>>>>
interfaces
between
gluster
servers, which<br>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</div>
...<br>
<br>
[Message clipped] </blockquote>
</div>
<span><font color="#888888"><br>
<br clear="all">
<span class="HOEnZb"><font color="#888888">
<div>
<br>
</div>
-- <br>
Best regards,<br>
Roman.
</font></span></font></span></div>
<span class="HOEnZb"><font color="#888888">
</font></span></blockquote>
</div>
<span class="HOEnZb"><font color="#888888"><br>
<br clear="all">
<div><br>
</div>
-- <br>
Best regards,<br>
Roman.
</font></span></div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
Best regards,<br>
Roman.
</div>
</blockquote>
<br>
</body>
</html>