<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=ks_c_5601-1987">
<meta name=Generator content="Microsoft Word 12 (filtered medium)">
<style>
<!--
/* Font Definitions */
@font-face
        {font-family:Gulim;
        panose-1:2 11 6 0 0 1 1 1 1 1;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:"Malgun Gothic";
        panose-1:2 11 5 3 2 0 0 2 0 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:"\@Malgun Gothic";
        panose-1:2 11 5 3 2 0 0 2 0 4;}
@font-face
        {font-family:"\@Gulim";
        panose-1:2 11 6 0 0 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";
        color:black;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
pre
        {mso-style-priority:99;
        mso-style-link:"HTML Preformatted Char";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:10.0pt;
        font-family:"Courier New";
        color:black;}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
        {mso-style-priority:99;
        mso-style-link:"Balloon Text Char";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:9.0pt;
        font-family:"Malgun Gothic";
        color:black;}
span.HTMLChar
        {mso-style-name:"HTML Preformatted Char";
        mso-style-priority:99;
        mso-style-link:"HTML Preformatted";
        font-family:"Courier New";
        color:black;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Malgun Gothic";
        color:#1F497D;}
span.Char
        {mso-style-name:"Balloon Text Char";
        mso-style-priority:99;
        mso-style-link:"Balloon Text";
        font-family:"Malgun Gothic";
        color:black;}
.MsoChpDefault
        {mso-style-type:export-only;}
@page Section1
        {size:612.0pt 792.0pt;
        margin:3.0cm 72.0pt 72.0pt 72.0pt;}
div.Section1
        {page:Section1;}
-->
</style>
<!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body bgcolor=white lang=KO link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><span lang=EN-US style='font-size:10.0pt;font-family:"Malgun Gothic";
color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:10.0pt;font-family:"Malgun Gothic";
color:#1F497D'>Maybe the stall time could be reduced if you set cluster.data-self-heal-algorithm
to 'diff', which syncs only the blocks that differ instead of copying the whole file. <o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:10.0pt;font-family:"Malgun Gothic";
color:#1F497D'>But don't expect too much of a speed-up. <o:p></o:p></span></p>
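<p class=MsoNormal><span lang=EN-US style='font-size:10.0pt;font-family:"Malgun Gothic";
color:#1F497D'>For example, using the volume name from the thread below:<o:p></o:p></span></p>
<pre><span lang=EN-US>gluster volume set GLVOL1 cluster.data-self-heal-algorithm diff</span></pre>
<pre><span lang=EN-US>gluster volume info GLVOL1   # the option should now show up under "Options Reconfigured"</span></pre>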
<p class=MsoNormal><span lang=EN-US style='font-size:10.0pt;font-family:"Malgun Gothic";
color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:10.0pt;font-family:"Malgun Gothic";
color:#1F497D'><o:p> </o:p></span></p>
<div>
<div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm'>
<p class=MsoNormal><b><span lang=EN-US style='font-size:10.0pt;font-family:
"Tahoma","sans-serif";color:windowtext'>From:</span></b><span lang=EN-US
style='font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext'>
gluster-users-bounces@gluster.org [mailto:gluster-users-bounces@gluster.org] <b>On
Behalf Of </b>Peter Linder<br>
<b>Sent:</b> Tuesday, October 11, 2011 6:06 PM<br>
<b>To:</b> gluster-users@gluster.org<br>
<b>Subject:</b> Re: [Gluster-users] Problem with VM images when one node goes
online (self-healing) on a 2 node replication gluster for VMware datastore<o:p></o:p></span></p>
</div>
</div>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>With 3.2.4, no operation is allowed on a file
while it is being self-healed, so your VMs will stall and time out if the
self-heal doesn't finish quickly enough. Gluster 3.3 will fix this, but I don't
know when it will be released. There are betas to try out, though :). Perhaps
somebody else can say how stable 3.3-beta2 is compared to 3.2.4?<br>
<br>
On 10/11/2011 10:57 AM, keith wrote: <o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Hi all, <br>
<br>
I am testing gluster-3.2.4 on a 2-node storage setup with replication as our VMware
datastore. <br>
<br>
The setup runs replication across the 2 nodes with ucarp for IP failover, and
VMware mounts the gluster volume over NFS as a datastore. <br>
<br>
<br>
<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Volume Name: GLVOL1 <br>
Type: Replicate <br>
Status: Started <br>
Number of Bricks: 2 <br>
Transport-type: tcp <br>
Bricks: <br>
Brick1: t4-01.store:/EXPORT/GLVOL1 <br>
Brick2: t4-03.store:/EXPORT/GLVOL1 <br>
Options Reconfigured: <br>
performance.cache-size: 4096MB <o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><br>
High-availability testing went smoothly, without any problem or data corruption:
when either node is down, all VM guests run normally. <br>
<br>
The problem arises when I bring the failed node back up and it starts
self-healing. All my VM guests get kernel errors and finally end up with
"EXT3-fs error: ext3_journal_start_sb: detected aborted journal", remounting
the root filesystem read-only. <br>
<br>
Below are some of the VM guest kernel errors generated when I bring up the
failed gluster node for self-healing: <br>
<br>
<br>
<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Oct 11 15:57:58 testvm3 kernel: pvscsi:
task abort on host 1, ffff8100221c90c0 <br>
Oct 11 15:57:58 testvm3 kernel: pvscsi: task abort on host 1, ffff8100221c9240 <br>
Oct 11 15:57:58 testvm3 kernel: pvscsi: task abort on host 1, ffff8100221c93c0 <br>
Oct 11 15:58:34 testvm3 kernel: INFO: task kjournald:2081 blocked for more than
120 seconds. <br>
Oct 11 15:58:34 testvm3 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message. <br>
Oct 11 15:58:34 testvm3 kernel: kjournald     D ffff810001736420     0  2081     14          2494  2060 (L-TLB) <br>
Oct 11 15:58:34 testvm3 kernel: ffff81003c087cf0 0000000000000046 ffff810030ef2288
ffff81003f5d6048 <br>
Oct 11 15:58:34 testvm3 kernel: 00000000037685c8 000000000000000a
ffff810037c53820 ffffffff80314b60 <br>
Oct 11 15:58:34 testvm3 kernel: 00001883cb68d47d 0000000000002c4e
ffff810037c53a08 000000003f5128b8 <br>
Oct 11 15:58:34 testvm3 kernel: Call Trace: <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8006ec8f>]
do_gettimeofday+0x40/0x90 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800155d3>] sync_buffer+0x0/0x3f
<br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800637ce>]
io_schedule+0x3f/0x67 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8001560e>]
sync_buffer+0x3b/0x3f <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800639fa>]
__wait_on_bit+0x40/0x6e <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800155d3>] sync_buffer+0x0/0x3f
<br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80063a94>]
out_of_line_wait_on_bit+0x6c/0x78 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800a2e2b>]
wake_bit_function+0x0/0x23 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff88033a41>]
:jbd:journal_commit_transaction+0x553/0x10aa <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8003d85b>]
lock_timer_base+0x1b/0x3c <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8004ad98>]
try_to_del_timer_sync+0x7f/0x88 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff88037662>]
:jbd:kjournald+0xc1/0x213 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800a2dfd>]
autoremove_wake_function+0x0/0x2e <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800a2be5>]
keventd_create_kthread+0x0/0xc4 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff880375a1>]
:jbd:kjournald+0x0/0x213 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800a2be5>]
keventd_create_kthread+0x0/0xc4 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80032722>] kthread+0xfe/0x132 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800a2be5>]
keventd_create_kthread+0x0/0xc4 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80032624>] kthread+0x0/0x132 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11 <br>
Oct 11 15:58:34 testvm3 kernel: <br>
Oct 11 15:58:34 testvm3 kernel: INFO: task crond:3418 blocked for more than 120
seconds. <br>
Oct 11 15:58:34 testvm3 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message. <br>
Oct 11 15:58:34 testvm3 kernel: crond         D ffff810001736420     0  3418      1          3436  3405 (NOTLB) <br>
Oct 11 15:58:34 testvm3 kernel: ffff810036c55ca8 0000000000000086
0000000000000000 ffffffff80019e3e <br>
Oct 11 15:58:34 testvm3 kernel: 0000000000065bf2 0000000000000007
ffff81003ce4b080 ffffffff80314b60 <br>
Oct 11 15:58:34 testvm3 kernel: 000018899ae16270 0000000000023110
ffff81003ce4b268 000000008804ec00 <br>
Oct 11 15:58:34 testvm3 kernel: Call Trace: <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80019e3e>] __getblk+0x25/0x22c <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8006ec8f>]
do_gettimeofday+0x40/0x90 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800155d3>] sync_buffer+0x0/0x3f
<br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800637ce>]
io_schedule+0x3f/0x67 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8001560e>]
sync_buffer+0x3b/0x3f <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80063912>]
__wait_on_bit_lock+0x36/0x66 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800155d3>] sync_buffer+0x0/0x3f
<br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800639ae>]
out_of_line_wait_on_bit_lock+0x6c/0x78 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800a2e2b>]
wake_bit_function+0x0/0x23 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8803181e>]
:jbd:do_get_write_access+0x54/0x522 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80019e3e>] __getblk+0x25/0x22c <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff88031d0e>]
:jbd:journal_get_write_access+0x22/0x33 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8804dd37>]
:ext3:ext3_reserve_inode_write+0x38/0x90 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8804ddb0>]
:ext3:ext3_mark_inode_dirty+0x21/0x3c <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff88050d35>]
:ext3:ext3_dirty_inode+0x63/0x7b <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80013d98>]
__mark_inode_dirty+0x29/0x16e <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff80025a49>] filldir+0x0/0xb7 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8003516b>]
vfs_readdir+0x8c/0xa9 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff800389db>]
sys_getdents+0x75/0xbd <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8005d229>] tracesys+0x71/0xe0 <br>
Oct 11 15:58:34 testvm3 kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0 <br>
Oct 11 15:58:34 testvm3 kernel: <br>
Oct 11 15:58:34 testvm3 kernel: INFO: task httpd:3452 blocked for more than 120
seconds. <br>
Oct 11 15:58:34 testvm3 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message. <br>
Oct 11 15:58:34 testvm3 kernel: httpd         D ffff810001736420     0  3452   3405          3453 (NOTLB) <br>
Oct 11 15:58:34 testvm3 kernel: ffff810035ea9dc8 0000000000000086
0000000000000000 ffffffff80009a1c <br>
Oct 11 15:58:34 testvm3 kernel: ffff810035ea9e28 0000000000000009
ffff810037e52080 ffffffff80314b60 <br>
Oct 11 15:58:34 testvm3 kernel: 000018839f75405c 000000000003363d
ffff810037e52268 000000003f5e7150 <o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><br>
Please note that although I am using ucarp for IP failover, and by default ucarp
always has a preferred master, I have added code to make sure that a ucarp
master that goes down always comes back up as slave. This ensures that VMware
will not reconnect to the failed node when it comes back up. <br>
<br>
However, this does not prevent the problem I described above. <br>
<br>
A lot of logs are generated during the self-healing process, and they don't
make much sense to me. I am attaching them; they come to over 900k, so I zipped
them up. Hopefully the mailing list allows attachments. <br>
<br>
Are there any best practices for setting up and running gluster with replication
as a VMware datastore, so that VM guests keep running smoothly even when one
node goes into self-healing? <br>
<br>
Any advice is appreciated. <br>
<br>
Keith <br>
<br>
<br>
<br>
<br>
<o:p></o:p></span></p>
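<p class=MsoNormal><span lang=EN-US>For reference, one way to get the "failed
master comes back as slave" behaviour described above is to run ucarp without
--preempt on both nodes, so a recovering node waits as backup until the current
master dies. This is only a sketch, not the poster's actual setup; the
interface, addresses, and script paths here are made up:<o:p></o:p></span></p>
<pre><span lang=EN-US># no --preempt: a node that comes back up stays slave while a master is alive
ucarp --interface=eth0 --srcip=192.168.1.11 --vhid=1 --pass=secret \
      --addr=192.168.1.100 \
      --upscript=/etc/ucarp/vip-up.sh --downscript=/etc/ucarp/vip-down.sh</span></pre>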
<pre><span lang=EN-US><o:p> </o:p></span></pre><pre><span lang=EN-US><o:p> </o:p></span></pre><pre><span
lang=EN-US>_______________________________________________<o:p></o:p></span></pre><pre><span
lang=EN-US>Gluster-users mailing list<o:p></o:p></span></pre><pre><span
lang=EN-US><a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><o:p></o:p></span></pre><pre><span
lang=EN-US><a href="http://gluster.org/cgi-bin/mailman/listinfo/gluster-users">http://gluster.org/cgi-bin/mailman/listinfo/gluster-users</a><o:p></o:p></span></pre>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
</div>
</body>
</html>