<div dir="ltr">Hi Marco,<br>There is nothing suspicious in the configuration. Can you attach the complete log files of both glusterfs processes?<br><br>regards,<br><br><div class="gmail_quote">On Thu, Aug 7, 2008 at 9:19 PM, Marco Trevisan <span dir="ltr"><<a href="mailto:marco.trevisan@cardinis.com">marco.trevisan@cardinis.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Hi all,<br>
<br>
I'm in the process of evaluating GlusterFS as a clustered file system. I<br>
like it very much because, among its other cool features, it's very easy<br>
to configure and it lets me reuse the filesystems I already know as<br>
storage backends.<br>
<br>
Before trying it on expensive hardware, I decided to try it on a very<br>
low-end hardware configuration:<br>
<br>
- 2 old PCs (each with a P4-class CPU, IDE drives, and one 100 Mbps<br>
Ethernet card) connected by a 100 Mbps switch.<br>
<br>
The OS is Debian 'lenny' on both nodes. 'Lenny' ships with FUSE v2.7.2.<br>
<br>
I then compiled glusterfs 1.3.10 on both nodes and set up a server-side,<br>
single-process AFR configuration (the file content is reported below).<br>
I did NOT use the glusterfs-patched FUSE library.<br>
<br>
On top of that I've put some VMware Server virtual machines. Each<br>
virtual machine image is split into a few 2 GB "vmdk" files (not<br>
preallocated).<br>
<br>
I was successful in starting up and running my virtual machines (with<br>
only an additional line in their configuration files), so I was very<br>
happy with it.<br>
<br>
The problem now is that after putting a virtual machine under "intense"<br>
I/O, when I rebooted it today I found its root filesystem (i.e. the vmdk)<br>
was corrupted. It had lost some important directories (e.g. the kernel<br>
modules directory under /lib/modules).<br>
<br>
To give you a little more detail on the behaviour under I/O: when the<br>
virtual machine is doing I/O to the VMDK file, iptraf shows the<br>
corresponding traffic on the Ethernet link at about 15-50 Mbps, so it<br>
looks like only the modified portions of the file are being sent to the<br>
other AFR node. In fact, if I simulate a failure by powering off the<br>
other AFR node, at reboot I see 90 Mbps (link saturation) traffic as I<br>
try to open the VMDK file, and that operation blocks until full<br>
synchronization has finished.<br>
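As a back-of-the-envelope sanity check on the full-resync behaviour described above (figures taken from the message: 2 GB vmdk chunks, ~90 Mbps observed at link saturation; a sketch, not a measurement):

```python
# Rough estimate: time to re-sync one 2 GB vmdk chunk over the
# observed ~90 Mbps (link-saturated) 100 Mbps connection.
size_bits = 2 * 1024**3 * 8   # one 2 GiB vmdk chunk, in bits
rate_bps = 90 * 10**6         # ~90 Mbit/s observed with iptraf
seconds = size_bits / rate_bps
print(round(seconds))         # ~191 s, i.e. roughly 3 minutes per chunk
```

So a multi-chunk image can easily keep the link saturated (and the open blocked) for ten minutes or more, consistent with the blocking behaviour reported above.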
<br>
The glusterfs.log content is as follows:<br>
[...]<br>
2008-08-06 12:20:24 E [posix-locks.c:1148:pl_lk] gfs-ds-locks: returning<br>
EAGAIN<br>
2008-08-06 12:20:24 E [afr.c:3190:afr_lk_cbk] gfs-ds-afr:<br>
(path=/logs-127.0.1.1/.terracotta-logging.lock child=gfs-ds-locks)<br>
op_ret=-1 op_errno=11<br>
2008-08-06 12:42:25 E [posix-locks.c:1148:pl_lk] gfs-ds-locks: returning<br>
EAGAIN<br>
2008-08-06 12:42:25 E [afr.c:3190:afr_lk_cbk] gfs-ds-afr:<br>
(path=/logs-127.0.1.1/.terracotta-logging.lock child=gfs-ds-locks)<br>
op_ret=-1 op_errno=11<br>
2008-08-07 11:42:17 E [posix-locks.c:1180:pl_forget] gfs-ds-locks:<br>
Active locks found!<br>
<br>
<br>
The above log does not seem to justify such file corruption... there is<br>
nothing related to "vmdk" files.<br>
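For reference, the op_errno=11 in the log is EAGAIN on Linux, the error a non-blocking lock request gets when the region is already locked. The contention can be sketched as follows (illustrative only, not GlusterFS code; BSD flock() is used because POSIX fcntl locks never conflict within a single process, while the errno is the same):

```python
import errno
import fcntl
import tempfile

# Sketch of the EAGAIN condition seen in the log: a non-blocking lock
# request on an already-locked file fails with errno 11 (EAGAIN on
# Linux, where EWOULDBLOCK == EAGAIN).
path = tempfile.mkstemp()[1]
holder = open(path, "r+b")    # first opener takes the lock
waiter = open(path, "r+b")    # second opener will be refused
fcntl.flock(holder, fcntl.LOCK_EX)
try:
    fcntl.flock(waiter, fcntl.LOCK_EX | fcntl.LOCK_NB)
except OSError as e:
    print(e.errno == errno.EAGAIN)   # True
```

This matches the posix-locks "returning EAGAIN" lines: some lock on the Terracotta lock file was contended, but by itself it does not explain vmdk corruption.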
<br>
Any suggestions?<br>
Is the hardware configuration way too slow for AFR to work reliably?<br>
Are there mistakes in the configuration file?<br>
<br>
Any help is really appreciated.<br>
<br>
Kind regards,<br>
Marco<br>
<br>
<br>
----------GlusterFS config file ----------------<br>
<br>
# dataspace on storage1<br>
volume gfs-ds<br>
type storage/posix<br>
option directory /mnt/hda7/gfs-ds<br>
end-volume<br>
<br>
# posix locks<br>
volume gfs-ds-locks<br>
type features/posix-locks<br>
subvolumes gfs-ds<br>
end-volume<br>
<br>
volume gfs-ds-threads<br>
type performance/io-threads<br>
option thread-count 1<br>
option cache-size 32MB<br>
subvolumes gfs-ds-locks<br>
end-volume<br>
<br>
volume server<br>
type protocol/server<br>
option transport-type tcp/server<br>
subvolumes gfs-ds-threads<br>
# storage network access only<br>
option auth.ip.gfs-ds-threads.allow *<br>
option auth.ip.gfs-ds-afr.allow *<br>
end-volume<br>
<br>
<br>
# dataspace on storage2<br>
volume gfs-storage2-ds<br>
type protocol/client<br>
option transport-type tcp/client<br>
option remote-host <the other node's IP> # storage network<br>
option remote-subvolume gfs-ds-threads<br>
option transport-timeout 10 # value in seconds; it should be set relatively low<br>
end-volume<br>
<br>
# automatic file replication translator for dataspace<br>
volume gfs-ds-afr<br>
type cluster/afr<br>
subvolumes gfs-ds-locks gfs-storage2-ds # local and remote dataspaces<br>
end-volume<br>
<br>
volume writebehind<br>
type performance/write-behind<br>
option aggregate-size 128kB<br>
subvolumes gfs-ds-afr<br>
end-volume<br>
<br>
volume readahead<br>
type performance/read-ahead<br>
option page-size 64kB<br>
option page-count 16<br>
subvolumes writebehind<br>
end-volume<br>
-------------<br>
<br>
_______________________________________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>
<a href="http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users" target="_blank">http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users</a><br>
</blockquote></div><br><br clear="all"><br>-- <br>Raghavendra G<br><br>A centipede was happy quite, until a toad in fun,<br>Said, "Pray, which leg comes after which?",<br>This raised his doubts to such a pitch,<br>
He fell flat into the ditch,<br>Not knowing how to run.<br>-Anonymous<br>
</div>