<div dir="ltr">Ok, then I hope we will be able to test it in two weeks. Thanks for your time and patience.</div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-08-07 9:49 GMT+03:00 Pranith Kumar Karampuri <span dir="ltr"><<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<br>
<div>On 08/07/2014 12:17 PM, Roman wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Well, one thing is definitely true: if there is no healing daemon running, I'm not able to start the VM after an outage. It seems the qcow2 file is corrupted (KVM is unable to read its header).</div>
</blockquote>
We shall see this again once I have the document with all the steps
that need to be carried out :-)<br>
<br>
Pranith<br>
<blockquote type="cite">
<div class="gmail_extra">
<br>
<br>
<div class="gmail_quote">2014-08-07 9:35 GMT+03:00 Roman <span dir="ltr"><<a href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">> <span style="font-family:arial,sans-serif;font-size:13px">This should not happen if you do the writes, let's say, from '/dev/urandom' instead of '/dev/zero'</span>
<div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px">Somewhere deep inside I thought so! Zero is zero :)</span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px">></span><span style="font-family:arial,sans-serif;font-size:13px">I will provide you with a document for testing this issue properly. I have a lot going on in my day job, so I'm not getting enough time to write it out. Since the weekend is approaching, I will definitely get a bit of time, so I will send you the document over the weekend.</span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<div><font face="arial, sans-serif">Thank you a lot. I'll wait. My vacation starts tomorrow and I'll be out for two weeks, so there's no great hurry.</font></div>
<div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">
2014-08-07 9:26 GMT+03:00 Pranith Kumar Karampuri <span dir="ltr"><<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> <br>
<div>On 08/07/2014 11:48 AM, Roman wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">How can they be in sync if they are different in size? And why, then, is the VM not able to survive a gluster outage? I really want to use glusterfs in our production for infrastructure virtualization due to its simple setup, but I'm not able to at this moment. Maybe you've got some testing agenda? Or could you list the steps for running the right tests, so our VM-s would survive the outages?</div>
</blockquote>
This is because of sparse files. <a href="http://en.wikipedia.org/wiki/Sparse_file" target="_blank">http://en.wikipedia.org/wiki/Sparse_file</a><br>
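The sparse-file effect is easy to demonstrate with plain coreutils (a sketch, nothing gluster-specific; file names are illustrative):

```shell
# Zeros can be stored as holes: the apparent size ('ls -l', stat %s)
# grows, but no blocks are allocated, so 'du' reports far less.
truncate -s 100M sparse.img                                    # 100M apparent, ~0 allocated
dd if=/dev/zero of=zeros.img bs=1M count=100 conv=sparse status=none
dd if=/dev/urandom of=random.img bs=1M count=100 status=none   # random data always allocates
stat -c '%n: apparent=%s bytes, allocated=%b blocks' sparse.img zeros.img random.img
du -sh sparse.img zeros.img random.img
```

If a heal copies such a file without hole detection, the destination brick ends up fully allocated, which is why 'du' can differ across bricks even while md5sums match.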
This should not happen if you do the writes, let's say, from '/dev/urandom' instead of '/dev/zero'.<br>
<br>
I will provide you with a document for testing this issue properly. I have a lot going on in my day job, so I'm not getting enough time to write it out. Since the weekend is approaching, I will definitely get a bit of time, and I will send you the document over the weekend.<br>
<br>
Pranith<br>
<blockquote type="cite">
<div dir="ltr">
<div> <br>
</div>
<div>We would like to be sure that when one of the storages is down, the VM-s keep running - this is OK, we see this.</div>
<div>We would like to be sure that data is synced after the server comes back up - we can't see that atm.</div>
<div>We would like to be sure that VMs fail over to the second storage during the outage - we can't see this atm.</div>
<div>:(</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">2014-08-07 9:12
GMT+03:00 Pranith Kumar Karampuri <span dir="ltr"><<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> <br>
<div>On 08/07/2014 11:33 AM, Roman wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">The file size increases because of me :) I generate files on the VM from /dev/zero during the outage of one server. Then I bring the downed server back up, and it seems the files never sync. I'll keep on testing today. I can't read much from the logs either :(. This morning both VM-s (the one on the volume with self-healing and the one on the volume without it) survived the second server's outage (the first server was down yesterday); while the file sizes are different, the VM-s ran without problems. But I had restarted them before bringing the second gluster server down. <br>
</div>
</blockquote>
Then there is no bug :-). It seems the files are already in sync according to the extended attributes you have pasted. How do you test whether the files are in sync or not?<br>
<br>
Pranith<br>
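For reference, the trusted.afr values pasted above can be read as three 32-bit big-endian counters (pending data, metadata, and entry operations; a sketch of the AFR changelog convention, decoded with shell arithmetic):

```shell
# Each trusted.afr.<volume>-client-N value is 12 bytes:
# bytes 0-3 = pending data ops, 4-7 = metadata ops, 8-11 = entry ops,
# counting operations this brick believes are still pending on brick N.
# All-zero on every brick for every client means no heal is needed.
decode_afr() {
    v=${1#0x}
    echo "data=$((16#${v:0:8})) metadata=$((16#${v:8:8})) entry=$((16#${v:16:8}))"
}
decode_afr 0x000000000000000000000000   # data=0 metadata=0 entry=0 (clean)
decode_afr 0x000001320000000000000000   # data=306 metadata=0 entry=0
```

So matching all-zero attributes on both bricks, as pasted, mean AFR considers the copies identical; a size difference then points at sparse allocation, not missed writes.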
<blockquote type="cite">
<div dir="ltr">
<div> <br>
</div>
<div>So I'm a bit lost at the moment. I'll try to keep my tests ordered and write here what happens.</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">2014-08-07
8:29 GMT+03:00 Pranith Kumar
Karampuri <span dir="ltr"><<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div> <br>
<div>On 08/07/2014 10:46 AM,
Roman wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">yes, they do.
<div><br>
</div>
<div>
<div>getfattr: Removing
leading '/' from
absolute path names</div>
<div># file:
exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>trusted.afr.HA-MED-PVE1-1T-client-0=0x000000000000000000000000</div>
<div>trusted.afr.HA-MED-PVE1-1T-client-1=0x000000000000000000000000</div>
<div>trusted.gfid=0x207984df4e6e4ef983f285ed0c4ce8fa</div>
<div><br>
</div>
<div>root@stor1:~# du
-sh
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>1.6G
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>root@stor1:~#
md5sum
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>c117d73c9f8a2e09ef13da31f7225fa6
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>root@stor1:~# du
-sh
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>1.6G
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>root@stor2:~#
getfattr -d -m. -e hex
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>getfattr: Removing
leading '/' from
absolute path names</div>
<div># file:
exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>trusted.afr.HA-MED-PVE1-1T-client-0=0x000000000000000000000000</div>
<div>trusted.afr.HA-MED-PVE1-1T-client-1=0x000000000000000000000000</div>
<div>trusted.gfid=0x207984df4e6e4ef983f285ed0c4ce8fa</div>
<div><br>
</div>
<div>root@stor2:~#
md5sum
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>c117d73c9f8a2e09ef13da31f7225fa6
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>root@stor2:~# du
-sh
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
<div>2.6G
/exports/pve1/1T/images/125/vm-125-disk-1.qcow2</div>
</div>
</div>
</blockquote>
</div>
I think the files differ in size because of the sparse-file healing issue. Could you raise a bug with steps to re-create this issue, where the size of the file increases after healing?<span><font color="#888888"><br>
<br>
Pranith</font></span>
<div>
<div><br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">
2014-08-06 12:49
GMT+03:00 Humble
Chirammal <span dir="ltr"><<a href="mailto:hchiramm@redhat.com" target="_blank">hchiramm@redhat.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><br>
<br>
<br>
----- Original
Message -----<br>
| From: "Pranith
Kumar Karampuri"
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>><br>
| To: "Roman" <<a href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>><br>
| Cc: <a href="mailto:gluster-users@gluster.org" target="_blank">gluster-users@gluster.org</a>,
"Niels de Vos"
<<a href="mailto:ndevos@redhat.com" target="_blank">ndevos@redhat.com</a>>,
"Humble Chirammal"
<<a href="mailto:hchiramm@redhat.com" target="_blank">hchiramm@redhat.com</a>><br>
| Sent: Wednesday,
August 6, 2014
12:09:57 PM<br>
| Subject: Re:
[Gluster-users]
libgfapi failover
problem on replica
bricks<br>
|<br>
| Roman,<br>
| The file
went into
split-brain. I
think we should do
these tests<br>
| with 3.5.2.
Where monitoring
the heals is
easier. Let me
also come up<br>
| with a document
about how to do
this testing you
are trying to do.<br>
|<br>
| Humble/Niels,<br>
| Do we have
debs available for
3.5.2? In 3.5.1
there was
packaging<br>
| issue where
/usr/bin/glfsheal
is not packaged
along with the
deb. I<br>
| think that
should be fixed
now as well?<br>
|<br>
</div>
Pranith,<br>
<br>
The 3.5.2 packages for Debian are not available yet. We are coordinating internally to get them processed.<br>
I will update the list once they are available.<br>
<br>
--Humble<br>
<div>|<br>
| On 08/06/2014
11:52 AM, Roman
wrote:<br>
| > good
morning,<br>
| ><br>
| >
root@stor1:~#
getfattr -d -m. -e
hex<br>
| >
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| > getfattr:
Removing leading
'/' from absolute
path names<br>
| > # file:
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000<br>
| >
trusted.afr.HA-fast-150G-PVE1-client-1=0x000001320000000000000000<br>
| >
trusted.gfid=0x23c79523075a4158bea38078da570449<br>
| ><br>
| > getfattr:
Removing leading
'/' from absolute
path names<br>
| > # file:
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000040000000000000000<br>
| >
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000<br>
| >
trusted.gfid=0x23c79523075a4158bea38078da570449<br>
| ><br>
| ><br>
| ><br>
| > 2014-08-06
9:20 GMT+03:00
Pranith Kumar
Karampuri <<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a><br>
</div>
| > <mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>| ><br>
| ><br>
| > On
08/06/2014 11:30
AM, Roman wrote:<br>
| >>
Also, this time
files are not the
same!<br>
| >><br>
| >>
root@stor1:~#
md5sum<br>
| >>
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >>
32411360c53116b96a059f17306caeda<br>
| >>
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >><br>
| >>
root@stor2:~#
md5sum<br>
| >>
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| >>
65b8a6031bcb6f5fb3a11cb1e8b1c9c9<br>
| >>
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
| > What is
the getfattr
output?<br>
| ><br>
| > Pranith<br>
| ><br>
| >><br>
| >><br>
| >>
2014-08-05 16:33
GMT+03:00 Roman
<<a href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a><br>
</div>
| >>
<mailto:<a href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>>>:<br>
<div>| >><br>
| >>
Nope, it is not
working. But this
time it went a bit
differently.<br>
| >><br>
| >>
root@gluster-client:~#
dmesg<br>
| >>
Segmentation fault<br>
| >><br>
| >><br>
| >>
I was not able
even to start the
VM after I done
the tests<br>
| >><br>
| >>
Could not read
qcow2 header:
Operation not
permitted<br>
| >><br>
| >>
And it seems it never starts to sync files after the first<br>
| >>
disconnect. The VM survives the first disconnect, but not the second (I<br>
| >>
waited around 30 minutes). Also, I've<br>
| >>
got network.ping-timeout: 2 in the volume settings, but the logs<br>
| >>
reacted to the first disconnect after around 30 seconds. The second was<br>
| >>
faster, 2 seconds.<br>
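For what it's worth, the two reaction times above are consistent with two different mechanisms: the 2-second one is network.ping-timeout firing on a brick connection, while the ~30-second one looks like a plain TCP read timeout on the glusterd management connection (port 24007, visible in the first log line), which ping-timeout does not govern. The setting itself (a sketch; volume name taken from this thread):

```shell
# ping-timeout applies per client->brick connection; the default is 42s.
# Very low values risk spurious disconnects under transient load.
gluster volume set HA-fast-150G-PVE1 network.ping-timeout 2
gluster volume info HA-fast-150G-PVE1 | grep ping-timeout
```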
| >><br>
| >>
Reaction was
different also:<br>
| >><br>
| >>
slower one:<br>
| >>
[2014-08-05
13:26:19.558435] W
[socket.c:514:__socket_rwv]<br>
| >>
0-glusterfs: readv
failed (Connection
timed out)<br>
| >>
[2014-08-05
13:26:19.558485] W<br>
| >>
[socket.c:1962:__socket_proto_state_machine]
0-glusterfs:<br>
| >>
reading from
socket failed.
Error (Connection
timed out),<br>
</div>
| >>
peer (<a href="http://10.250.0.1:24007" target="_blank">10.250.0.1:24007</a> <<a href="http://10.250.0.1:24007" target="_blank">http://10.250.0.1:24007</a>>)<br>
<div>| >>
[2014-08-05
13:26:21.281426] W
[socket.c:514:__socket_rwv]<br>
| >>
0-HA-fast-150G-PVE1-client-0:
readv failed
(Connection timed
out)<br>
| >>
[2014-08-05
13:26:21.281474] W<br>
| >>
[socket.c:1962:__socket_proto_state_machine]<br>
| >>
0-HA-fast-150G-PVE1-client-0:
reading from
socket failed.<br>
| >>
Error (Connection
timed out), peer (<a href="http://10.250.0.1:49153" target="_blank">10.250.0.1:49153</a><br>
</div>
| >>
<<a href="http://10.250.0.1:49153" target="_blank">http://10.250.0.1:49153</a>>)<br>
<div>| >>
[2014-08-05
13:26:21.281507] I<br>
| >>
[client.c:2098:client_rpc_notify]<br>
| >>
0-HA-fast-150G-PVE1-client-0:
disconnected<br>
| >><br>
| >>
the fast one:<br>
| >>
2014-08-05
12:52:44.607389] C<br>
| >>
[client-handshake.c:127:rpc_client_ping_timer_expired]<br>
| >>
0-HA-fast-150G-PVE1-client-1:
server <a href="http://10.250.0.2:49153" target="_blank">10.250.0.2:49153</a><br>
</div>
| >>
<<a href="http://10.250.0.2:49153" target="_blank">http://10.250.0.2:49153</a>>
has not responded in
the last 2<br>
<div>
<div>| >>
seconds,
disconnecting.<br>
| >>
[2014-08-05
12:52:44.607491]
W
[socket.c:514:__socket_rwv]<br>
| >>
0-HA-fast-150G-PVE1-client-1:
readv failed (No
data available)<br>
| >>
[2014-08-05
12:52:44.607585]
E<br>
| >>
[rpc-clnt.c:368:saved_frames_unwind]<br>
| >>
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)<br>
| >>
[0x7fcb1b4b0558]<br>
| >>
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)<br>
| >>
[0x7fcb1b4aea63]<br>
| >>
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)<br>
| >>
[0x7fcb1b4ae97e])))
0-HA-fast-150G-PVE1-client-1:
forced<br>
| >>
unwinding
frame
type(GlusterFS
3.3)
op(LOOKUP(27))
called at<br>
| >>
2014-08-05
12:52:42.463881
(xid=0x381883x)<br>
| >>
[2014-08-05
12:52:44.607604]
W<br>
| >>
[client-rpc-fops.c:2624:client3_3_lookup_cbk]<br>
| >>
0-HA-fast-150G-PVE1-client-1:
remote operation
failed:<br>
| >>
Transport
endpoint is not
connected. Path:
/<br>
| >>
(00000000-0000-0000-0000-000000000001)<br>
| >>
[2014-08-05
12:52:44.607736]
E<br>
| >>
[rpc-clnt.c:368:saved_frames_unwind]<br>
| >>
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)<br>
| >>
[0x7fcb1b4b0558]<br>
| >>
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)<br>
| >>
[0x7fcb1b4aea63]<br>
| >>
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)<br>
| >>
[0x7fcb1b4ae97e])))
0-HA-fast-150G-PVE1-client-1:
forced<br>
| >>
unwinding
frame
type(GlusterFS
Handshake)
op(PING(3))
called<br>
| >>
at 2014-08-05
12:52:42.463891
(xid=0x381884x)<br>
| >>
[2014-08-05
12:52:44.607753]
W<br>
| >>
[client-handshake.c:276:client_ping_cbk]<br>
| >>
0-HA-fast-150G-PVE1-client-1:
timer must have
expired<br>
| >>
[2014-08-05
12:52:44.607776]
I<br>
| >>
[client.c:2098:client_rpc_notify]<br>
| >>
0-HA-fast-150G-PVE1-client-1:
disconnected<br>
| >><br>
| >><br>
| >><br>
| >>
I've got SSD
disks (just for
an info).<br>
| >>
Should I go
and give a try
for 3.5.2?<br>
| >><br>
| >><br>
| >><br>
| >>
2014-08-05
13:06 GMT+03:00
Pranith Kumar
Karampuri<br>
</div>
</div>
| >>
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>
<mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>| >><br>
| >>
Please reply along with gluster-users :-). Maybe you are<br>
| >>
hitting 'reply' instead of 'reply all'?<br>
| >><br>
| >>
Pranith<br>
| >><br>
| >>
On 08/05/2014
03:35 PM, Roman
wrote:<br>
| >>>
To make sure and keep things clean, I've created another VM with raw<br>
| >>>
format and am going to repeat those steps. So now I've got<br>
| >>>
two VM-s
one with qcow2
format and other
with raw<br>
| >>>
format. I
will send another
e-mail shortly.<br>
| >>><br>
| >>><br>
| >>>
2014-08-05
13:01 GMT+03:00
Pranith Kumar
Karampuri<br>
</div>
| >>>
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>
<mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>><br>
| >>><br>
| >>>
On
08/05/2014 03:07
PM, Roman wrote:<br>
|
>>>>
really, seems
like the same
file<br>
|
>>>><br>
|
>>>>
stor1:<br>
|
>>>>
a951641c5230472929836f9fcede6b04<br>
|
>>>>
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
|
>>>><br>
|
>>>>
stor2:<br>
|
>>>>
a951641c5230472929836f9fcede6b04<br>
|
>>>>
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
|
>>>><br>
|
>>>><br>
|
>>>>
one thing I've seen from the logs is that somehow Proxmox<br>
|
>>>>
VE is connecting to the servers with the wrong version?<br>
|
>>>>
[2014-08-05
09:23:45.218550]
I<br>
|
>>>>
[client-handshake.c:1659:select_server_supported_programs]<br>
|
>>>>
0-HA-fast-150G-PVE1-client-0:
Using Program<br>
|
>>>>
GlusterFS 3.3,
Num (1298437),
Version (330)<br>
| >>>
It
is the rpc (over
the network data
structures)<br>
| >>>
version, which
is not changed
at all from 3.3
so<br>
| >>>
thats not a
problem. So what
is the
conclusion? Is<br>
| >>>
your test case
working now or
not?<br>
| >>><br>
| >>>
Pranith<br>
| >>><br>
|
>>>>
but if I issue:<br>
|
>>>>
root@pve1:~#
glusterfs -V<br>
|
>>>>
glusterfs 3.4.4
built on Jun 28
2014 03:44:57<br>
|
>>>>
seems ok.<br>
|
>>>><br>
|
>>>>
server use
3.4.4 meanwhile<br>
|
>>>>
[2014-08-05
09:23:45.117875]
I<br>
|
>>>>
[server-handshake.c:567:server_setvolume]<br>
|
>>>>
0-HA-fast-150G-PVE1-server:
accepted client
from<br>
|
>>>>
stor1-9004-2014/08/05-09:23:45:93538-HA-fast-150G-PVE1-client-1-0<br>
|
>>>>
(version: 3.4.4)<br>
|
>>>>
[2014-08-05
09:23:49.103035]
I<br>
|
>>>>
[server-handshake.c:567:server_setvolume]<br>
|
>>>>
0-HA-fast-150G-PVE1-server:
accepted client
from<br>
|
>>>>
stor1-8998-2014/08/05-09:23:45:89883-HA-fast-150G-PVE1-client-0-0<br>
|
>>>>
(version: 3.4.4)<br>
|
>>>><br>
|
>>>>
if this could be
the reason, of
course.<br>
|
>>>>
I did restart
the Proxmox VE
yesterday (just
for an<br>
|
>>>>
information)<br>
|
>>>><br>
|
>>>><br>
|
>>>><br>
|
>>>><br>
|
>>>><br>
|
>>>>
2014-08-05 12:30
GMT+03:00
Pranith Kumar
Karampuri<br>
</div>
</div>
| >>>>
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>
<mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>><br>
|
>>>><br>
|
>>>>
On
08/05/2014 02:33
PM, Roman wrote:<br>
|
>>>>>
Waited long
enough for now,
still different<br>
|
>>>>>
sizes and no
logs about
healing :(<br>
|
>>>>><br>
|
>>>>>
stor1<br>
|
>>>>>
# file:<br>
|
>>>>>
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
|
>>>>>
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000<br>
|
>>>>>
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000<br>
|
>>>>>
trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921<br>
|
>>>>><br>
|
>>>>>
root@stor1:~# du
-sh<br>
|
>>>>>
/exports/fast-test/150G/images/127/<br>
|
>>>>>
1.2G
/exports/fast-test/150G/images/127/<br>
|
>>>>><br>
|
>>>>><br>
|
>>>>>
stor2<br>
|
>>>>>
# file:<br>
|
>>>>>
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2<br>
|
>>>>>
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000<br>
|
>>>>>
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000<br>
|
>>>>>
trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921<br>
|
>>>>><br>
|
>>>>><br>
|
>>>>>
root@stor2:~# du
-sh<br>
|
>>>>>
/exports/fast-test/150G/images/127/<br>
|
>>>>>
1.4G
/exports/fast-test/150G/images/127/<br>
|
>>>>
According to
the changelogs,
the file doesn't<br>
|
>>>>
need any
healing. Could
you stop the
operations<br>
|
>>>>
on the VMs
and take md5sum
on both these
machines?<br>
|
>>>><br>
|
>>>>
Pranith<br>
|
>>>><br>
|
>>>>><br>
|
>>>>><br>
|
>>>>><br>
|
>>>>><br>
|
>>>>>
2014-08-05
11:49 GMT+03:00
Pranith Kumar<br>
|
>>>>>
Karampuri
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
|
>>>>>
<mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>>><br>
|
>>>>><br>
|
>>>>>
On
08/05/2014 02:06
PM, Roman wrote:<br>
|
>>>>>>
Well, it seems like it doesn't see the<br>
|
>>>>>>
changes that were made to the volume? I<br>
|
>>>>>>
created
two files 200
and 100 MB (from<br>
|
>>>>>>
/dev/zero) after
I disconnected
the first<br>
|
>>>>>>
brick.
Then connected
it back and got<br>
|
>>>>>>
these
logs:<br>
|
>>>>>><br>
|
>>>>>>
[2014-08-05
08:30:37.830150]
I<br>
|
>>>>>>
[glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]<br>
|
>>>>>>
0-glusterfs: No
change in
volfile,
continuing<br>
|
>>>>>>
[2014-08-05
08:30:37.830207]
I<br>
|
>>>>>>
[rpc-clnt.c:1676:rpc_clnt_reconfig]<br>
|
>>>>>>
0-HA-fast-150G-PVE1-client-0:
changing<br>
|
>>>>>>
port to
49153 (from 0)<br>
|
>>>>>>
[2014-08-05
08:30:37.830239]
W<br>
|
>>>>>>
[socket.c:514:__socket_rwv]<br>
|
>>>>>>
0-HA-fast-150G-PVE1-client-0:
readv<br>
|
>>>>>>
failed
(No data
available)<br>
|
>>>>>>
[2014-08-05
08:30:37.831024]
I<br>
|
>>>>>>
[client-handshake.c:1659:select_server_supported_programs]<br>
|
>>>>>>
0-HA-fast-150G-PVE1-client-0:
Using<br>
|
>>>>>>
Program
GlusterFS 3.3,
Num (1298437),<br>
|
>>>>>>
Version
(330)<br>
|
>>>>>>
[2014-08-05
08:30:37.831375]
I<br>
|
>>>>>>
[client-handshake.c:1456:client_setvolume_cbk]<br>
|
>>>>>>
0-HA-fast-150G-PVE1-client-0:
Connected<br>
|
>>>>>>
to <a href="http://10.250.0.1:49153" target="_blank">10.250.0.1:49153</a><br>
</div>
</div>
|
>>>>>>
<<a href="http://10.250.0.1:49153" target="_blank">http://10.250.0.1:49153</a>>,
attached to<br>
<div>
<div>|
>>>>>>
remote
volume
'/exports/fast-test/150G'.<br>
|
>>>>>>
[2014-08-05
08:30:37.831394]
I<br>
|
>>>>>>
[client-handshake.c:1468:client_setvolume_cbk]<br>
|
>>>>>>
0-HA-fast-150G-PVE1-client-0:
Server and<br>
|
>>>>>>
Client
lk-version
numbers are not
same,<br>
|
>>>>>>
reopening the
fds<br>
|
>>>>>>
[2014-08-05
08:30:37.831566]
I<br>
|
>>>>>>
[client-handshake.c:450:client_set_lk_version_cbk]<br>
|
>>>>>>
0-HA-fast-150G-PVE1-client-0:
Server lk<br>
|
>>>>>>
version
= 1<br>
|
>>>>>><br>
|
>>>>>><br>
|
>>>>>>
[2014-08-05
08:30:37.830150]
I<br>
|
>>>>>>
[glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]<br>
|
>>>>>>
0-glusterfs: No
change in
volfile,
continuing<br>
|
>>>>>>
this
line seems weird
to me tbh.<br>
|
>>>>>>
I do not
see any traffic
on switch<br>
|
>>>>>>
interfaces
between gluster
servers, which<br>
|
>>>>>>
means there is no
syncing between
them.<br>
|
>>>>>>
I tried
to ls -l the
files on the
client<br>
|
>>>>>>
and
servers to
trigger the
healing, but<br>
|
>>>>>>
seems
like no success.
Should I wait
more?<br>
|
>>>>>
Yes, it
should take
around 10-15
minutes.<br>
|
>>>>>
Could
you provide
'getfattr -d -m.
-e hex<br>
|
>>>>>
<file-on-brick>'
on both the
bricks.<br>
|
>>>>><br>
|
>>>>>
Pranith<br>
|
>>>>><br>
|
>>>>>><br>
|
>>>>>><br>
|
>>>>>>
2014-08-05 11:25
GMT+03:00
Pranith Kumar<br>
|
>>>>>>
Karampuri <<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
|
>>>>>>
<mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>>>><br>
|
>>>>>><br>
|
>>>>>>
On
08/05/2014 01:10
PM, Roman wrote:<br>
|
>>>>>>>
Ahha! For some reason I was not able<br>
|
>>>>>>>
to start the VM anymore; Proxmox VE<br>
|
>>>>>>>
told me that it was not able to read<br>
|
>>>>>>>
the qcow2 header because permission<br>
|
>>>>>>>
was denied for some reason. So I just<br>
|
>>>>>>>
deleted that file and created a new<br>
|
>>>>>>>
VM. And the next message I got was<br>
|
>>>>>>>
this:<br>
|
>>>>>>
Seems like these
are the messages<br>
|
>>>>>>
where you took
down the bricks
before<br>
|
>>>>>>
self-heal. Could
you restart the
run<br>
|
>>>>>>
waiting for
self-heals to
complete<br>
|
>>>>>>
before taking
down the next
brick?<br>
|
>>>>>><br>
|
>>>>>>
Pranith<br>
|
>>>>>><br>
|
>>>>>>><br>
|
>>>>>>><br>
|
>>>>>>>
[2014-08-05
07:31:25.663412]
E<br>
|
>>>>>>>
[afr-self-heal-common.c:197:afr_sh_print_split_brain_log]<br>
|
>>>>>>>
0-HA-fast-150G-PVE1-replicate-0:<br>
|
>>>>>>>
Unable to
self-heal
contents of<br>
|
>>>>>>>
'/images/124/vm-124-disk-1.qcow2'<br>
|
>>>>>>>
(possible
split-brain).
Please<br>
|
>>>>>>>
delete the file
from all but the<br>
|
>>>>>>>
preferred
subvolume.-
Pending<br>
|
>>>>>>>
matrix: [ [ 0
60 ] [ 11 0 ] ]<br>
|
>>>>>>>
[2014-08-05
07:31:25.663955]
E<br>
|
>>>>>>>
[afr-self-heal-common.c:2262:afr_self_heal_completion_cbk]<br>
|
>>>>>>>
0-HA-fast-150G-PVE1-replicate-0:<br>
|
>>>>>>>
background data
self-heal failed
on<br>
|
>>>>>>>
/images/124/vm-124-disk-1.qcow2<br>
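The pending matrix [ [ 0 60 ] [ 11 0 ] ] in the log above says each brick blames the other (brick 0 records 60 pending ops for brick 1, brick 1 records 11 for brick 0), which is exactly the split-brain condition. The manual fix the log suggests can be sketched as follows (paths and volume layout from this thread; double-check which copy you trust before deleting anything, as this is destructive):

```shell
# On the brick holding the BAD copy only: remove the file AND its
# hardlink under .glusterfs (both point at the same inode), then
# trigger a heal from a client mount by stat-ing the file.
BRICK=/exports/fast-test/150G
FILE=images/124/vm-124-disk-1.qcow2
GFID=$(getfattr -n trusted.gfid -e hex "$BRICK/$FILE" | sed -n 's/.*0x//p')
rm -f "$BRICK/$FILE"
rm -f "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/${GFID:0:8}-${GFID:8:4}-${GFID:12:4}-${GFID:16:4}-${GFID:20:12}"
# then, on a client: stat /<mountpoint>/images/124/vm-124-disk-1.qcow2
```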
|
>>>>>>><br>
|
>>>>>>><br>
|
>>>>>>><br>
|
>>>>>>>
2014-08-05 10:13
GMT+03:00
Pranith<br>
|
>>>>>>>
Kumar Karampuri
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
|
>>>>>>>
<mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>>>>><br>
|
>>>>>>>
I just responded
to your earlier<br>
|
>>>>>>>
mail about how
the log looks.<br>
|
>>>>>>>
The log comes on
the mount's
logfile<br>
|
>>>>>>><br>
|
>>>>>>>
Pranith<br>
|
>>>>>>><br>
|
>>>>>>>
On 08/05/2014
12:41 PM, Roman
wrote:<br>
|
>>>>>>>>
Ok, so I've
waited enough, I<br>
|
>>>>>>>>
think. Had no
any traffic on<br>
|
>>>>>>>>
switch ports
between servers.<br>
|
>>>>>>>>
Could not find
any suitable log<br>
|
>>>>>>>>
message about
completed<br>
|
>>>>>>>>
self-heal
(waited about 30<br>
|
>>>>>>>>
minutes).
Plugged out the
other<br>
|
>>>>>>>>
server's UTP
cable this time<br>
|
>>>>>>>>
and got in the
same situation:<br>
|
>>>>>>>>
root@gluster-test1:~#
cat<br>
|
>>>>>>>>
/var/log/dmesg<br>
|
>>>>>>>>
-bash: /bin/cat:
Input/output
error<br>
|
>>>>>>>><br>
|
>>>>>>>>
brick logs:<br>
|
>>>>>>>>
[2014-08-05
07:09:03.005474]
I<br>
|
>>>>>>>>
[server.c:762:server_rpc_notify]<br>
|
>>>>>>>>
0-HA-fast-150G-PVE1-server:<br>
|
>>>>>>>>
disconnecting
connectionfrom<br>
|
>>>>>>>>
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0<br>
|
>>>>>>>>
[2014-08-05
07:09:03.005530]
I<br>
|
>>>>>>>>
[server-helpers.c:729:server_connection_put]<br>
|
>>>>>>>>
0-HA-fast-150G-PVE1-server:<br>
|
>>>>>>>>
Shutting down
connection<br>
|
>>>>>>>>
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0<br>
|
>>>>>>>>
[2014-08-05
07:09:03.005560]
I<br>
|
>>>>>>>>
[server-helpers.c:463:do_fd_cleanup]<br>
|
>>>>>>>>
0-HA-fast-150G-PVE1-server:
fd<br>
|
>>>>>>>>
cleanup on<br>
|
>>>>>>>>
/images/124/vm-124-disk-1.qcow2<br>
|
>>>>>>>>
[2014-08-05
07:09:03.005797]
I<br>
|
>>>>>>>>
[server-helpers.c:617:server_connection_destroy]<br>
|
>>>>>>>>
0-HA-fast-150G-PVE1-server:<br>
|
>>>>>>>>
destroyed
connection of<br>
|
>>>>>>>>
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0<br>
|
>>>>>>>><br>
|
>>>>>>>><br>
|
>>>>>>>><br>
|
>>>>>>>><br>
|
>>>>>>>><br>
|
>>>>>>>>
2014-08-05 9:53
GMT+03:00<br>
|
>>>>>>>>
Pranith Kumar
Karampuri<br>
|
>>>>>>>>
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
|
>>>>>>>>
<mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>>>>>><br>
|
>>>>>>>>
Do you think
it is possible<br>
|
>>>>>>>>
for you to
do these tests<br>
|
>>>>>>>>
on the
latest version<br>
|
>>>>>>>>
3.5.2?
'gluster volume
heal<br>
|
>>>>>>>>
<volname>
info' would give<br>
|
>>>>>>>>
you that
information in<br>
|
>>>>>>>>
versions
> 3.5.1.<br>
|
>>>>>>>>
Otherwise
you will have to<br>
|
>>>>>>>>
check it
from either the<br>
|
>>>>>>>>
logs, there
will be<br>
|
>>>>>>>>
self-heal
completed
message<br>
|
>>>>>>>>
on the mount
logs (or) by<br>
|
>>>>>>>>
observing
'getfattr -d -m.<br>
|
>>>>>>>>
-e hex
<image-file-on-bricks>'<br>
|
>>>>>>>><br>
|
>>>>>>>>
Pranith<br>
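For reference, once on a release with glfsheal (per the note above, versions after 3.5.1), the monitoring becomes a single command (volume name assumed from this thread):

```shell
# Lists entries still pending heal, per brick; empty lists mean done.
gluster volume heal HA-fast-150G-PVE1 info
# Entries in split-brain specifically:
gluster volume heal HA-fast-150G-PVE1 info split-brain
```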
|
>>>>>>>><br>
|
>>>>>>>><br>
|
>>>>>>>>
On
08/05/2014 12:09
PM,<br>
|
>>>>>>>>
Roman wrote:<br>
|
>>>>>>>>>
Ok, I
understand. I
will<br>
|
>>>>>>>>>
try this
shortly.<br>
|
>>>>>>>>>
How can I be
sure that the<br>
|
>>>>>>>>>
healing
process is done<br>
|
>>>>>>>>>
if I am not
able to see<br>
|
>>>>>>>>>
its status?<br>
|
>>>>>>>>><br>
|
>>>>>>>>><br>
|
>>>>>>>>>
2014-08-05
9:30 GMT+03:00<br>
|
>>>>>>>>>
Pranith
Kumar Karampuri<br>
|
>>>>>>>>>
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
|
>>>>>>>>>
<mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>>>>>>><br>
|
>>>>>>>>>
Mounts
will do the<br>
|
>>>>>>>>>
healing,
not the<br>
|
>>>>>>>>>
self-heal-daemon.
The<br>
|
>>>>>>>>>
problem, I feel, is that<br>
|
>>>>>>>>>
whichever process does<br>
|
>>>>>>>>>
the healing must have the<br>
|
>>>>>>>>>
latest
information<br>
|
>>>>>>>>>
about
the good bricks<br>
|
>>>>>>>>>
in this
usecase. Since<br>
|
>>>>>>>>>
for VM
usecase, mounts<br>
|
>>>>>>>>>
should
have the latest<br>
|
>>>>>>>>>
information, we
should<br>
|
>>>>>>>>>
let the
mounts do the<br>
|
>>>>>>>>>
healing.
If the mount<br>
|
>>>>>>>>>
accesses
the VM image<br>
|
>>>>>>>>>
either
by someone<br>
|
>>>>>>>>>
doing
operations<br>
|
>>>>>>>>>
inside
the VM or<br>
|
>>>>>>>>>
explicit
stat on the<br>
|
>>>>>>>>>
file it
should do the<br>
|
>>>>>>>>>
healing.<br>
|
>>>>>>>>><br>
|
>>>>>>>>>
Pranith.<br>
|
>>>>>>>>><br>
|
>>>>>>>>><br>
|
>>>>>>>>>
On
08/05/2014 10:39<br>
|
>>>>>>>>>
AM,
Roman wrote:<br>
|
>>>>>>>>>>
Hmmm, you told me to<br>
|
>>>>>>>>>>
turn it off. Did I<br>
|
>>>>>>>>>>
misunderstand<br>
|
>>>>>>>>>>
something? After I issued<br>
|
>>>>>>>>>>
the command you<br>
|
>>>>>>>>>>
sent me, I was not<br>
|
>>>>>>>>>>
able to watch the<br>
|
>>>>>>>>>>
healing process; it<br>
|
>>>>>>>>>>
said it won't be<br>
|
>>>>>>>>>>
healed because it's<br>
|
>>>>>>>>>>
turned off.<br>
|
>>>>>>>>>><br>
|
>>>>>>>>>><br>
|
>>>>>>>>>>
2014-08-05 5:39<br>
|
>>>>>>>>>>
GMT+03:00
Pranith<br>
|
>>>>>>>>>>
Kumar
Karampuri<br>
|
>>>>>>>>>>
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
|
>>>>>>>>>>
<mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<br>
<div>
<div>|
>>>>>>>>>><br>
|
>>>>>>>>>>
You
didn't<br>
|
>>>>>>>>>>
mention anything<br>
|
>>>>>>>>>>
about<br>
|
>>>>>>>>>>
self-healing.
Did<br>
|
>>>>>>>>>>
you
wait until<br>
|
>>>>>>>>>>
the
self-heal is<br>
|
>>>>>>>>>>
complete?<br>
|
>>>>>>>>>><br>
|
>>>>>>>>>>
Pranith<br>
|
>>>>>>>>>><br>
|
>>>>>>>>>>
On
08/04/2014<br>
|
>>>>>>>>>>
05:49 PM, Roman<br>
|
>>>>>>>>>>
wrote:<br>
|
>>>>>>>>>>>
Hi!<br>
|
>>>>>>>>>>>
Result is pretty<br>
|
>>>>>>>>>>>
same. I set the<br>
|
>>>>>>>>>>>
switch port down<br>
|
>>>>>>>>>>>
for
1st server,<br>
|
>>>>>>>>>>>
it
was ok. Then<br>
|
>>>>>>>>>>>
set
it up back<br>
|
>>>>>>>>>>>
and
set other<br>
|
>>>>>>>>>>>
server's port<br>
|
>>>>>>>>>>>
off.
and it<br>
|
>>>>>>>>>>>
triggered IO<br>
|
>>>>>>>>>>>
error on two<br>
|
>>>>>>>>>>>
virtual<br>
|
>>>>>>>>>>>
machines: one<br>
|
>>>>>>>>>>>
with
local root<br>
|
>>>>>>>>>>>
FS
but network<br>
|
>>>>>>>>>>>
mounted storage.<br>
|
>>>>>>>>>>>
and
other with<br>
|
>>>>>>>>>>>
network root FS.<br>
|
>>>>>>>>>>>
1st
gave an<br>
|
>>>>>>>>>>>
error on copying<br>
|
>>>>>>>>>>>
to
or from the<br>
|
>>>>>>>>>>>
mounted network<br>
|
>>>>>>>>>>>
disk, other just<br>
|
>>>>>>>>>>>
gave
me an error<br>
|
>>>>>>>>>>>
for
even reading<br>
|
>>>>>>>>>>>
log.files.<br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>
cat:<br>
|
>>>>>>>>>>>
/var/log/alternatives.log:<br>
|
>>>>>>>>>>>
Input/output
error<br>
|
>>>>>>>>>>>
then
I reset the<br>
|
>>>>>>>>>>>
kvm
VM and it<br>
|
>>>>>>>>>>>
said
me, there<br>
|
>>>>>>>>>>>
is
no boot<br>
|
>>>>>>>>>>>
device. Next I<br>
|
>>>>>>>>>>>
virtually<br>
|
>>>>>>>>>>>
powered it off<br>
|
>>>>>>>>>>>
and
then back on<br>
|
>>>>>>>>>>>
and
it has booted.<br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>
By
the way, did<br>
|
>>>>>>>>>>>
I
have to<br>
|
>>>>>>>>>>>
start/stop
volume?<br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>
>> Could
you do<br>
|
>>>>>>>>>>>
the
following<br>
|
>>>>>>>>>>>
and
test it again?<br>
|
>>>>>>>>>>>
>> gluster
volume<br>
|
>>>>>>>>>>>
set
<volname><br>
|
>>>>>>>>>>>
cluster.self-heal-daemon<br>
|
>>>>>>>>>>>
off<br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>
>>Pranith<br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>><br>
|
>>>>>>>>>>>
2014-08-04 14:10<br>
|
>>>>>>>>>>>
GMT+03:00<br>
|
>>>>>>>>>>>
Pranith Kumar<br>
|
>>>>>>>>>>>
Karampuri<br>
|
>>>>>>>>>>>
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a><br>
</div>
</div>
&lt;mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;&gt;:<br>
<div>
<div>On 08/04/2014 03:33 PM, Roman wrote:<br>
> Hello!<br>
><br>
> I'm facing the same problem as mentioned here:<br>
> <a href="http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html" target="_blank">http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html</a><br>
><br>
> My setup is up and running, so I'm ready to help you back with feedback.<br>
><br>
> Setup:<br>
> proxmox server as client<br>
> 2 gluster physical servers<br>
><br>
> Server side and client side are both running glusterfs 3.4.4 from the gluster repo at the moment.<br>
><br>
> The problem is:<br>
><br>
> 1. Created replica bricks.<br>
> 2. Mounted in proxmox (tried both proxmox ways: via GUI and via fstab (with a backup volume line); by the way, while mounting via fstab I'm unable to launch a VM without cache, even though direct-io-mode is enabled in the fstab line).<br>
> 3. Installed a VM.<br>
> 4. Brought one volume down - OK.<br>
> 5. Brought it back up, waited until the sync was done.<br>
> 6. Brought the other volume down - got IO errors on the VM guest and was not able to restore the VM after I reset the VM via the host. It says "no bootable media". After I shut it down (forced) and bring it back up, it boots.<br>
<br>
Could you do the following and test it again?<br>
gluster volume set &lt;volname&gt; cluster.self-heal-daemon off<br>
<br>
Pranith<br>
<br>
> Need help. Tried 3.4.3, 3.4.4. Still missing packages for 3.4.5 for debian, and 3.5.2 (3.5.1 always gives a healing error for some reason).<br>
><br>
> --<br>
> Best regards,<br>
> Roman.<br>
><br>
> _______________________________________________<br>
> Gluster-users mailing list<br>
> <a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
</div>
</div>
&lt;mailto:<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a>&gt;<br>
<div>
<div>> <a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a><br>
><br>
> --<br>
> Best regards,<br>
> Roman.<br>
</div>
</div>
</blockquote>
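The test sequence quoted above boils down to toggling the self-heal daemon and then triggering healing by access. A rough sketch of the commands involved ("vmstore" and the image path are hypothetical placeholders, assuming a GlusterFS 3.4.x replica volume):

```shell
# Hypothetical volume name -- substitute your own replica volume.
VOL=vmstore

# Disable the self-heal daemon for the duration of the test, as suggested above.
gluster volume set "$VOL" cluster.self-heal-daemon off

# With the daemon off, healing after a brick comes back is only triggered
# by access to the file, e.g. an explicit stat from the mount point
# (hypothetical path shown):
stat /mnt/"$VOL"/images/vm.qcow2

# Watch progress, then re-enable the daemon once testing is done:
gluster volume heal "$VOL" info
gluster volume set "$VOL" cluster.self-heal-daemon on
```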
</div>
<br>
<br clear="all">
<span><font color="#888888">
<div><br>
</div>
-- <br>
Best regards,<br>
Roman. </font></span></div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</div><span class="HOEnZb"><font color="#888888">
</font></span></blockquote><span class="HOEnZb"><font color="#888888">
</font></span></div>
</div>
</blockquote>
<br>
</div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>Best regards,<br>Roman.
</div>