<div dir="ltr"><div><div><div><div><br></div>Hi Daniel,<br><br></div>OK, if gluster can&#39;t self-heal from this situation, I hope I can at least restore the volume manually from the good brick that is still available. Could you please tell me how I can &quot;simply rebuild the filesystem and let gluster attempt to restore it from a *clean* filesystem&quot;?<br>

<br></div>Many thanks.<br><br></div>Liang<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Jan 9, 2013 at 9:04 AM, Daniel Taylor <span dir="ltr">&lt;<a href="mailto:dtaylor@vocalabs.com" target="_blank">dtaylor@vocalabs.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">It seems to me that what you need to do is replace the failed brick, or simply rebuild the filesystem and let gluster attempt to restore it from a *clean* filesystem.<br>


<br>

I haven&#39;t seen anything that allows gluster to actually change the replication count on a live cluster, which is what you seem to be requesting.<div class="im"><br>

<br>

On 01/09/2013 07:57 AM, Liang Ma wrote:<br>

</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">

Todd,<br>

<br>

Thanks for your reply. But how can I take this brick offline? Since the gluster volume has replica count 2, it won&#39;t allow me to remove one brick. Is there a command that can take one replica brick offline?<br>

<br>

Many thanks.<br>

<br>

Liang<br>

<br>

<br></div><div class="im">

On Tue, Jan 8, 2013 at 3:02 PM, Todd Pfaff &lt;<a href="mailto:pfaff@rhpcs.mcmaster.ca" target="_blank">pfaff@rhpcs.mcmaster.ca</a>&gt; wrote:<br>


<br>

    Liang,<br>

<br>

    I don&#39;t claim to know the answer to your question, and my knowledge of zfs<br>

    is minimal at best, so I may be way off base here, but it seems to me that<br>

    your attempted random corruption with this command:<br>

<br>

<br>

      dd if=/dev/urandom of=/dev/sda6 bs=1024 count=20480<br>

<br>

    is likely going to corrupt the underlying zfs filesystem metadata, not<br>

    just file data, and I wouldn&#39;t expect gluster to be able to fix a<br>

    brick&#39;s corrupted filesystem.  Perhaps you now have to take the brick<br>

    offline, fix any zfs filesystem errors if possible, bring the brick back<br>

    online, and see what then happens with self-heal.<br>

<br>

    --<br></div>

    Todd Pfaff &lt;<a href="mailto:pfaff@mcmaster.ca" target="_blank">pfaff@mcmaster.ca</a>&gt;<div><div class="h5"><br>

    <a href="http://www.rhpcs.mcmaster.ca/" target="_blank">http://www.rhpcs.mcmaster.ca/</a><br>

<br>

<br>

    On Tue, 8 Jan 2013, Liang Ma wrote:<br>

<br>

        Hi There,<br>

<br>

        I&#39;d like to test and understand the self-heal feature of glusterfs.<br>

        This is what I did with 3.3.1-ubuntu1~precise4 on Ubuntu 12.04.1 LTS.<br>

<br>

        gluster volume create gtest replica 2 gluster3:/zfs-test gluster4:/zfs-test<br>

        where zfs-test is a zfs pool on partition /dev/sda6 on both nodes.<br>

<br>

        To simulate random corruption on node gluster3:<br>

<br>

        dd if=/dev/urandom of=/dev/sda6 bs=1024 count=20480<br>

<br>

        Now zfs has detected the corrupted files:<br>

<br>

          pool: zfs-test<br>

         state: ONLINE<br>

        status: One or more devices has experienced an error resulting in data<br>

                corruption.  Applications may be affected.<br>

        action: Restore the file in question if possible.  Otherwise restore the<br>

                entire pool from backup.<br>

           see: <a href="http://zfsonlinux.org/msg/ZFS-8000-8A" target="_blank">http://zfsonlinux.org/msg/ZFS-8000-8A</a><br>

         scan: none requested<br>

        config:<br>

<br>

                NAME        STATE     READ WRITE CKSUM<br>

                zfs-test   ONLINE       0     0 2.29K<br>

                  sda6     ONLINE       0     0 4.59K<br>

<br>

        errors: Permanent errors have been detected in the following files:<br>

<br>

                /zfs-test/&lt;xattrdir&gt;/trusted.gfid<br>

                /zfs-test/.glusterfs/b0/1e/b01ec17c-14cc-4999-938b-b4a71e358b46<br>

                /zfs-test/.glusterfs/b0/1e/b01ec17c-14cc-4999-938b-b4a71e358b46/&lt;xattrdir&gt;/trusted.gfid<br>

                /zfs-test/.glusterfs/dd/8c/dd8c6797-18c3-4f3b-b1ca-86def2b578c5/&lt;xattrdir&gt;/trusted.gfid<br>

<br>

        Now the gluster log file shows the self heal can&#39;t fix the corruption:<br>

        [2013-01-08 12:46:03.371214] W [afr-common.c:1196:afr_detect_self_heal_by_iatt] 2-gtest-replicate-0: /K.iso: gfid different on subvolume<br>

        [2013-01-08 12:46:03.373539] E [afr-self-heal-common.c:1419:afr_sh_common_lookup_cbk] 2-gtest-replicate-0: Missing Gfids for /K.iso<br>

        [2013-01-08 12:46:03.385701] E [afr-self-heal-common.c:2160:afr_self_heal_completion_cbk] 2-gtest-replicate-0: background  gfid self-heal failed on /K.iso<br>

        [2013-01-08 12:46:03.385760] W [fuse-bridge.c:292:fuse_entry_cbk] 0-glusterfs-fuse: 11901: LOOKUP() /K.iso =&gt; -1 (No data available)<br>

<br>

        where K.iso is one of the sample files affected by the dd command.<br>

<br>

        So could anyone tell me the best way to repair the simulated corruption?<br>

<br>

        Thank you.<br>

<br>

        Liang<br>

<br>

<br>

<br>

<br>

<br></div></div>

_______________________________________________<br>

Gluster-users mailing list<br>

<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>

<a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a><span class="HOEnZb"><font color="#888888"><br>

</font></span></blockquote><span class="HOEnZb"><font color="#888888">

<br>

-- <br>

Daniel Taylor             VP Operations       Vocal Laboratories, Inc<br>

<a href="mailto:dtaylor@vocalabs.com" target="_blank">dtaylor@vocalabs.com</a>                                     <a href="tel:612-235-5711" value="+16122355711" target="_blank">612-235-5711</a><br>

<br>


</font></span></blockquote></div><br></div>
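For reference, the "rebuild the filesystem and let gluster restore it" suggestion might look roughly like the sketch below. Nothing in this thread confirms the procedure, so treat it as an assumption-laden outline rather than a tested recipe. The function name `rebuild_plan` is hypothetical, the volume-id value is a placeholder that would have to be read from the healthy brick, and the sketch assumes that a freshly created brick on 3.3 needs the `trusted.glusterfs.volume-id` xattr restored before the brick will start. The function only prints the steps; it does not run them.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for rebuilding one replica brick
# from a clean filesystem. It does NOT execute anything destructive.
# Names taken from the thread: volume gtest, pool zfs-test, /dev/sda6.
# The volume id argument is a placeholder (assumption), not a real value.

rebuild_plan() {
    vol=$1; pool=$2; dev=$3; brick=$4; vol_id=$5
    cat <<EOF
# 1. On the healthy node, read the volume id stored on the good brick:
getfattr -n trusted.glusterfs.volume-id -e hex $brick
# 2. On the damaged node, stop the brick process
#    (its PID is listed by 'gluster volume status $vol'),
#    then recreate the pool so the brick starts from a clean filesystem:
zpool destroy $pool
zpool create $pool $dev
# 3. Stamp the empty brick with the volume id read in step 1:
setfattr -n trusted.glusterfs.volume-id -v $vol_id $brick
# 4. Restart the brick and ask gluster for a full self-heal:
gluster volume start $vol force
gluster volume heal $vol full
gluster volume heal $vol info
EOF
}

# Example: print the plan for the setup described in this thread.
rebuild_plan gtest zfs-test /dev/sda6 /zfs-test 0xVOLUME_ID_FROM_GOOD_BRICK
```

Printing the plan first makes it easy to review each step against your own hostnames and devices before running anything, since `zpool destroy` is irreversible. `zpool create` may also refuse a device that still carries an old label, in which case check the zpool man page before forcing it.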