<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=ks_c_5601-1987">
<meta name=Generator content="Microsoft Word 12 (filtered medium)">
<style>
<!--
/* Font Definitions */
@font-face
        {font-family:Gulim;
        panose-1:2 11 6 0 0 1 1 1 1 1;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:"Malgun Gothic";
        panose-1:2 11 5 3 2 0 0 2 0 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:"\@Malgun Gothic";
        panose-1:2 11 5 3 2 0 0 2 0 4;}
@font-face
        {font-family:"\@Gulim";
        panose-1:2 11 6 0 0 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";
        color:black;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
pre
        {mso-style-priority:99;
        mso-style-link:"HTML Preformatted Char";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:10.0pt;
        font-family:"Courier New";
        color:black;}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
        {mso-style-priority:99;
        mso-style-link:"Balloon Text Char";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:9.0pt;
        font-family:"Malgun Gothic";
        color:black;}
span.HTMLChar
        {mso-style-name:"HTML Preformatted Char";
        mso-style-priority:99;
        mso-style-link:"HTML Preformatted";
        font-family:"Courier New";
        color:black;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Malgun Gothic";
        color:#1F497D;}
span.Char
        {mso-style-name:"Balloon Text Char";
        mso-style-priority:99;
        mso-style-link:"Balloon Text";
        font-family:"Malgun Gothic";
        color:black;}
.MsoChpDefault
        {mso-style-type:export-only;}
@page Section1
        {size:612.0pt 792.0pt;
        margin:3.0cm 72.0pt 72.0pt 72.0pt;}
div.Section1
        {page:Section1;}
-->
</style>
<!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body bgcolor=white lang=KO link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><span lang=EN-US style='font-size:10.0pt;font-family:"Malgun Gothic";
color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:10.0pt;font-family:"Malgun Gothic";
color:#1F497D'>Maybe the stall time could be reduced if you set cluster.data-self-heal-algorithm
to 'diff', which syncs only the blocks that differ instead of copying the whole file. <o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:10.0pt;font-family:"Malgun Gothic";
color:#1F497D'>But don't expect too much of a speed-up. <o:p></o:p></span></p>
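<p class=MsoNormal><span lang=EN-US style='font-size:10.0pt;font-family:"Malgun Gothic";
color:#1F497D'>For example, using the volume name from the thread below:<o:p></o:p></span></p>
<pre><span lang=EN-US>gluster volume set GLVOL1 cluster.data-self-heal-algorithm diff</span></pre>
<pre><span lang=EN-US>gluster volume info GLVOL1   # the option should now show up under "Options Reconfigured"</span></pre>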
<p class=MsoNormal><span lang=EN-US style='font-size:10.0pt;font-family:"Malgun Gothic";
color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:10.0pt;font-family:"Malgun Gothic";
color:#1F497D'><o:p> </o:p></span></p>
<div>
<div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm'>
<p class=MsoNormal><b><span lang=EN-US style='font-size:10.0pt;font-family:
"Tahoma","sans-serif";color:windowtext'>From:</span></b><span lang=EN-US
style='font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext'>
gluster-users-bounces@gluster.org [mailto:gluster-users-bounces@gluster.org] <b>On
Behalf Of </b>Peter Linder<br>
<b>Sent:</b> Tuesday, October 11, 2011 6:06 PM<br>
<b>To:</b> gluster-users@gluster.org<br>
<b>Subject:</b> Re: [Gluster-users] Problem with VM images when one node goes
online (self-healing) on a 2 node replication gluster for VMware datastore<o:p></o:p></span></p>
</div>
</div>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>With 3.2.4, no operation is allowed on a file
while it is being self-healed, so your VMs will stall and time out if the
self-heal doesn't finish quickly enough. Gluster 3.3 will fix this, but I don't
know when it will be released. There are betas to try out, though :). Perhaps
somebody else can say how stable 3.3-beta2 is compared to 3.2.4?<br>
<br>
On 10/11/2011 10:57 AM, keith wrote: <o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Hi all, <br>
<br>
I am testing gluster-3.2.4 on a 2-node storage setup with replication as our VMware
datastore. <br>
<br>
The setup runs replication across the 2 nodes with ucarp for IP failover, and
VMware mounts the gluster volume over NFS as a datastore. <br>
<br>
<br>
<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Volume Name: GLVOL1 <br>
Type: Replicate <br>
Status: Started <br>
Number of Bricks: 2 <br>
Transport-type: tcp <br>
Bricks: <br>
Brick1: t4-01.store:/EXPORT/GLVOL1 <br>
Brick2: t4-03.store:/EXPORT/GLVOL1 <br>
Options Reconfigured: <br>
performance.cache-size: 4096MB <o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><br>
High-availability testing went smoothly, without any problem or data corruption:
when either node is down, all VM guests run normally. <br>
<br>
The problem arises when I bring the failed node back up and it starts
self-healing. All my VM guests get kernel errors and finally end up with
"EXT3-fs error: ext3_journal_start_sb: detected aborted journal", remounting
the root filesystem read-only. <br>
<br>
Below are some of the VM guest kernel errors generated when I bring up the
failed gluster node for self-healing: <br>
<br>
<br>
<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Oct 11 15:57:58 testvm3 kernel: pvscsi:
task abort on host 1, ffff8100221c90c0 <br>
Oct 11 15:57:58 testvm3 kernel: pvscsi: task abort on host 1, ffff8100221c9240 <br>
Oct 11 15:57:58 testvm3 kernel: pvscsi: task abort on host 1, ffff8100221c93c0 <br>
Oct 11 15:58:34 testvm3 kernel: INFO: task kjournald:2081 blocked for more than
120 seconds. <br>
Oct 11 15:58:34 testvm3 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message. <br>
Oct 11 15:58:34 testvm3 kernel: kjournald     D ffff810001736420     0  2081     14          2494  2060 (L-TLB) <br>
Oct 11 15:58:34 testvm3 kernel: ffff81003c087cf0 0000000000000046 ffff810030ef2288
ffff81003f5d6048 <br>
Oct 11 15:58:34 testvm3 kernel: 00000000037685c8 000000000000000a
ffff810037c53820 ffffffff80314b60 <br>
Oct 11 15:58:34 testvm3 kernel: 00001883cb68d47d 0000000000002c4e
ffff810037c53a08 000000003f5128b8 <br>
Oct 11 15:58:34 testvm3 kernel: Call Trace: <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8006ec8f>]
do_gettimeofday+0x40/0x90 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800155d3>] sync_buffer+0x0/0x3f
<br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800637ce>]
io_schedule+0x3f/0x67 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8001560e>]
sync_buffer+0x3b/0x3f <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800639fa>]
__wait_on_bit+0x40/0x6e <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800155d3>] sync_buffer+0x0/0x3f
<br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80063a94>]
out_of_line_wait_on_bit+0x6c/0x78 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800a2e2b>]
wake_bit_function+0x0/0x23 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff88033a41>]
:jbd:journal_commit_transaction+0x553/0x10aa <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8003d85b>]
lock_timer_base+0x1b/0x3c <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8004ad98>]
try_to_del_timer_sync+0x7f/0x88 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff88037662>]
:jbd:kjournald+0xc1/0x213 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800a2dfd>]
autoremove_wake_function+0x0/0x2e <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800a2be5>]
keventd_create_kthread+0x0/0xc4 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff880375a1>]
:jbd:kjournald+0x0/0x213 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800a2be5>]
keventd_create_kthread+0x0/0xc4 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80032722>] kthread+0xfe/0x132 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800a2be5>]
keventd_create_kthread+0x0/0xc4 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80032624>] kthread+0x0/0x132 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11 <br>
Oct 11 15:58:34 testvm3 kernel: <br>
Oct 11 15:58:34 testvm3 kernel: INFO: task crond:3418 blocked for more than 120
seconds. <br>
Oct 11 15:58:34 testvm3 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message. <br>
Oct 11 15:58:34 testvm3 kernel: crond         D ffff810001736420     0  3418      1          3436  3405 (NOTLB) <br>
Oct 11 15:58:34 testvm3 kernel: ffff810036c55ca8 0000000000000086
0000000000000000 ffffffff80019e3e <br>
Oct 11 15:58:34 testvm3 kernel: 0000000000065bf2 0000000000000007
ffff81003ce4b080 ffffffff80314b60 <br>
Oct 11 15:58:34 testvm3 kernel: 000018899ae16270 0000000000023110
ffff81003ce4b268 000000008804ec00 <br>
Oct 11 15:58:34 testvm3 kernel: Call Trace: <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80019e3e>] __getblk+0x25/0x22c <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8006ec8f>]
do_gettimeofday+0x40/0x90 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800155d3>] sync_buffer+0x0/0x3f
<br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800637ce>]
io_schedule+0x3f/0x67 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8001560e>]
sync_buffer+0x3b/0x3f <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80063912>]
__wait_on_bit_lock+0x36/0x66 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800155d3>] sync_buffer+0x0/0x3f
<br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800639ae>]
out_of_line_wait_on_bit_lock+0x6c/0x78 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800a2e2b>]
wake_bit_function+0x0/0x23 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8803181e>]
:jbd:do_get_write_access+0x54/0x522 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80019e3e>] __getblk+0x25/0x22c <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff88031d0e>]
:jbd:journal_get_write_access+0x22/0x33 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8804dd37>]
:ext3:ext3_reserve_inode_write+0x38/0x90 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8804ddb0>]
:ext3:ext3_mark_inode_dirty+0x21/0x3c <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff88050d35>]
:ext3:ext3_dirty_inode+0x63/0x7b <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80013d98>]
__mark_inode_dirty+0x29/0x16e <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80025a49>] filldir+0x0/0xb7 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8003516b>]
vfs_readdir+0x8c/0xa9 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800389db>]
sys_getdents+0x75/0xbd <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8005d229>] tracesys+0x71/0xe0 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0 <br>
Oct 11 15:58:34 testvm3 kernel: <br>
Oct 11 15:58:34 testvm3 kernel: INFO: task httpd:3452 blocked for more than 120
seconds. <br>
Oct 11 15:58:34 testvm3 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message. <br>
Oct 11 15:58:34 testvm3 kernel: httpd         D ffff810001736420     0  3452   3405          3453 (NOTLB) <br>
Oct 11 15:58:34 testvm3 kernel: ffff810035ea9dc8 0000000000000086
0000000000000000 ffffffff80009a1c <br>
Oct 11 15:58:34 testvm3 kernel: ffff810035ea9e28 0000000000000009
ffff810037e52080 ffffffff80314b60 <br>
Oct 11 15:58:34 testvm3 kernel: 000018839f75405c 000000000003363d
ffff810037e52268 000000003f5e7150 <o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><br>
Please note that although I am using ucarp for IP failover, and by default ucarp
always has a preferred master, I have added code to make sure that a ucarp
master that goes down always comes back up as slave. This ensures that VMware
will not reconnect to the failed node when it comes back up. <br>
<br>
However, this does not prevent the problem I described above. <br>
<br>
A lot of logs are generated during the self-healing process, and they don't
make much sense to me. I am attaching them; they come to over 900k, so I zipped
them up. Hopefully the mailing list allows attachments. <br>
<br>
Are there any best practices for setting up and running gluster with replication
as a VMware datastore, so that VM guests keep running smoothly even when one
node goes into self-healing? <br>
<br>
Any advice is appreciated. <br>
<br>
Keith <br>
<br>
<br>
<br>
<br>
<o:p></o:p></span></p>
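<p class=MsoNormal><span lang=EN-US>For reference, one way to get the "failed
master comes back as slave" behaviour described above is to run ucarp without
--preempt on both nodes, so a recovering node waits as backup until the current
master dies. This is only a sketch, not the poster's actual setup; the
interface, addresses, and script paths here are made up:<o:p></o:p></span></p>
<pre><span lang=EN-US># no --preempt: a node that comes back up stays slave while a master is alive
ucarp --interface=eth0 --srcip=192.168.1.11 --vhid=1 --pass=secret \
      --addr=192.168.1.100 \
      --upscript=/etc/ucarp/vip-up.sh --downscript=/etc/ucarp/vip-down.sh</span></pre>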
<pre><span lang=EN-US><o:p> </o:p></span></pre><pre><span lang=EN-US><o:p> </o:p></span></pre><pre><span
lang=EN-US>_______________________________________________<o:p></o:p></span></pre><pre><span
lang=EN-US>Gluster-users mailing list<o:p></o:p></span></pre><pre><span
lang=EN-US><a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><o:p></o:p></span></pre><pre><span
lang=EN-US><a href="http://gluster.org/cgi-bin/mailman/listinfo/gluster-users">http://gluster.org/cgi-bin/mailman/listinfo/gluster-users</a><o:p></o:p></span></pre>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
</div>
</body>
</html>