<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<blockquote type="cite">On storage2 everything is OK, but I have a
split-brain condition. /export/brick1 on storage1 doesn't contain
any data...</blockquote>
FYI, this is not split-brain. Split-brain is when two differing copies
of a file each think they're the correct one. Here the brick on
storage1 is simply empty.<br>
<br>
I replied to the questions you asked on my blog, but I'll repeat the
answers here:<br>
<br>
<blockquote type="cite">Should I stop the volume with gluster volume
stop? Can I keep the other volumes (and glusterfs) up?</blockquote>
<br>
No need to stop the volume, and your other volumes can stay up. The
brick process cannot start without the volume-id, so there's nothing
running on that brick that could be affected.<br>
<br>
<blockquote type="cite">Next I have to reassign the volume id. Is
this right?<br>
At this point can I restart the corrupted volume?</blockquote>
<br>
Once the volume-id is set you can restart the brick either by
restarting glusterd or by running "gluster volume start $vol force".<br>
<br>
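For reference, here is a rough sketch of those steps, run as root on
storage1. It assumes the volume is named test-volume (taken from your
mount command; substitute your own), that the brick path is
/export/brick1, and that glusterd keeps its state in the default
/var/lib/glusterd location:<br>
<pre>vol=test-volume        # assumption: your volume name
brick=/export/brick1   # brick path from your message

# Read the volume-id from glusterd's volume info file and strip the dashes
vol_id=$(grep '^volume-id=' /var/lib/glusterd/vols/$vol/info | cut -d= -f2 | sed 's/-//g')

# Put the volume-id back on the brick root as an extended attribute
setfattr -n trusted.glusterfs.volume-id -v 0x$vol_id $brick

# Check that the attribute is now present
getfattr -n trusted.glusterfs.volume-id -e hex $brick

# Start the missing brick process; bricks that are already running
# should be left alone
gluster volume start $vol force</pre>
<br>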
<blockquote type="cite">Should I start any healing procedure? How?</blockquote>
<br>
I would do a "gluster volume heal $vol full" to ensure all files are
crawled for a heal.<br>
<br>
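Something along these lines should work, again assuming the volume is
called test-volume (substitute your own name):<br>
<pre>vol=test-volume   # assumption: your volume name

# Trigger a full crawl so every file on the volume is checked for healing
gluster volume heal $vol full

# List entries that still need healing; rerun until the list is empty
gluster volume heal $vol info

# Confirm that both bricks show as online
gluster volume status $vol</pre>
<br>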
<br>
<div class="moz-cite-prefix">On 10/08/2014 09:09 AM, Marco Marino
wrote:<br>
</div>
<blockquote
cite="mid:CAFHVVuL9HDXQSVqK5JeybffBpSFtj21d1=HT0X7fmFb3p7Qs0w@mail.gmail.com"
type="cite">
<div dir="ltr">Can someone help me?
        <div>I'd like to restore my /export/brick1 on server1. Currently
          I have data only on server2. </div>
        <div>I think the right steps are:</div>
        <div>1) setfattr -n ... on server1 (this is a bug; more
          info here: <a moz-do-not-send="true"
href="http://www.joejulian.name/blog/replacing-a-brick-on-glusterfs-340/">http://www.joejulian.name/blog/replacing-a-brick-on-glusterfs-340/</a>.
          I have the same error in my logs)</div>
        <div>2) Now I think I can restart the volume, so I should see an
          automatic healing procedure</div>
        <div>3) All data is replicated to server1</div>
<div><br>
</div>
        <div>Can I have confirmation of this procedure? Are other
          volumes affected? Please, I cannot lose my data</div>
<div><br>
</div>
<div>Thanks</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">2014-10-03 20:07 GMT+02:00 Marco Marino
          <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:marino.mrc@gmail.com" target="_blank">marino.mrc@gmail.com</a>&gt;</span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Hi,
              <div>I'm trying to use glusterfs with my openstack private
                cloud for storing ephemeral disks. This way, each
                compute node mounts glusterfs on /nova and saves instances
                on a remote glusterfs (shared between the compute nodes,
                so live migration is very fast).</div>
              <div>I have 2 storage nodes (storage1 and storage2) with
                replica 2.</div>
              <div>In the first configuration I used NFS on the
                clients. In /etc/fstab on the compute nodes I have:</div>
<div>storage1:/cloud_rootdisk /nova nfs
mountproto=tcp,vers=3 0 0<br>
</div>
<div><br>
</div>
              <div>This creates a single point of failure because if
                storage1 goes down, I have to remount manually on
                storage2. This causes complete disk corruption
                of all VMs running on all the compute nodes. Really
                funny... </div>
<div><br>
</div>
              <div>In a second configuration, I used the gluster
                native client with "backupvolfile-server=storage2". I've
                run only a few tests, but it seems to work.</div>
              <div>What I've tested:</div>
              <div>On the compute node I have:</div>
<div>mount -t glusterfs -o
backupvolfile-server=server2,fetch-attempts=2,log-level=WARNING,log-file=/var/log/gluster.log
server1:/test-volume /gluster_mount<br>
</div>
<div><br>
</div>
              <div>Then I booted a VM and started downloading a large
                file (1GB) from inside the VM (so I'm writing to the
                ephemeral disk stored on glusterfs). During the download I
                rebooted storage1, and the VM does not seem to be corrupted
                (so the VM wrote only to storage2).</div>
              <div>Can I have confirmation of this? Is this the
                right way?</div>
<div><br>
</div>
<div>Next question:</div>
              <div>When I rebooted storage1, it failed to start. It told
                me that /dev/sdc1 (the partition that I'm using for the
                test) is corrupted. This could be normal behavior
                because the server went down during a write. So I started
                storage1 in single-user mode and ran xfs_repair
                /dev/sdc1. This let me start storage1. (yuppy)</div>
              <div>Glusterfs starts correctly, but now I get "brick1 is
                not connected", where /export/brick1 is the brick that
                I'm using on storage1 for the test volume.</div>
              <div>On storage2 everything is OK, but I have a split-brain
                condition. /export/brick1 on storage1 doesn't contain
                any data...</div>
              <div>What do I have to do to restore /export/brick1 on
                storage1?</div>
<div><br>
</div>
<br>
</div>
</blockquote>
</div>
</div>
</blockquote>
<br>
P.S. Sorry I couldn't help earlier. I had a very busy week out of
town.<br>
</body>
</html>