Big thanks to the devel group for fixing all the memory leak issues with the earlier RC releases. 2.0.0 has been great so far without any memory issues whatsoever.

I am seeing some oddities with the replicate/distribute translators, however. I have three partitions on each gluster server, exporting three bricks - we have two servers. The gluster client replicates each brick between the two servers, and then I have a distribute translator on top of all the replicated bricks - basically gluster RAID 10.
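The layout, in short (the full vol files are at the end of this mail):

  bricks1 = replicate(server1:brick1a, server2:brick2a)  \
  bricks2 = replicate(server1:brick1b, server2:brick2b)   > distribute
  bricks3 = replicate(server1:brick1c, server2:brick2c)  /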
There are a handful of files which were copied into the gluster volume but have since disappeared, even though the physical files still exist on both bricks.

(from a client)
[root@client1 049891002526]# pwd
/intstore/data/tracks/tmg/2008_02_05/049891002526
[root@client1 049891002526]# ls -al 049891002526_01_09.wma.sigKey01.k
ls: 049891002526_01_09.wma.sigKey01.k: No such file or directory
[root@client1 049891002526]# head 049891002526_01_09.wma.sigKey01.k
head: cannot open `049891002526_01_09.wma.sigKey01.k' for reading: No such file or directory
[root@client1 049891002526]#

(from a server brick)

[root@server1 049891002526]# pwd
/intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526
[root@server1 049891002526]# ls -al 049891002526_01_09.wma.sigKey01.k
-rw-rw-rw- 1 10015 root 19377712 Feb  6  2008 049891002526_01_09.wma.sigKey01.k
[root@server1 049891002526]# attr -l 049891002526_01_09.wma.sigKey01.k
Attribute "glusterfs.createtime" has a 10 byte value for 049891002526_01_09.wma.sigKey01.k
Attribute "glusterfs.version" has a 1 byte value for 049891002526_01_09.wma.sigKey01.k
Attribute "selinux" has a 24 byte value for 049891002526_01_09.wma.sigKey01.k
[root@server1 049891002526]# attr -l .
Attribute "glusterfs.createtime" has a 10 byte value for .
Attribute "glusterfs.version" has a 1 byte value for .
Attribute "glusterfs.dht" has a 16 byte value for .
<div>Attribute "selinux" has a 24 byte value for .</div><div><br></div></div><div><br></div><div>Nothing in both the client and server logs. I've tried all the normal replication checks and self-heal such as ls -alR. If i copy the file back from one of the bricks into the volume it will show up again however it has a 1/3 chance of getting written to the files original location. So then i end up with two identical files on two different bricks.</div>
This volume has over 40 million files and directories, so it can be very tedious to find anomalies. I wrote a quick perl script to search 1/25 of our total files in the volume for missing files and md5 checksum differences; as of now it is about 15% (138,500 files) complete and has found ~7000 missing files and 0 md5 checksum differences.
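(The script is roughly along these lines - a simplified sketch rather than the real thing; the peer-brick path is a placeholder for however the second copy is reached, e.g. an NFS mount of server2:

  #!/usr/bin/perl
  # Walk a brick, sample roughly 1 in 25 files, and check each one
  # against the client mount and against the same path on the peer brick.
  use strict;
  use warnings;
  use File::Find;
  use Digest::MD5;

  my $brick = '/intstore/intstore01c/gcdata';    # local brick root
  my $mount = '/intstore';                       # gluster client mount (assumed)
  my $peer  = '/mnt/server2-intstore01c/gcdata'; # peer brick, placeholder path
  my $count = 0;

  sub md5_of {
      my ($path) = @_;
      open(my $fh, '<', $path) or return 'unreadable';
      binmode($fh);
      return Digest::MD5->new->addfile($fh)->hexdigest;
  }

  find(sub {
      return unless -f $_;
      return if ++$count % 25;   # keep every 25th file
      (my $rel = $File::Find::name) =~ s/^\Q$brick\E//;
      print "MISSING  $rel\n" unless -e "$mount$rel";
      print "MD5DIFF  $rel\n"
          if -e "$peer$rel"
          && md5_of($File::Find::name) ne md5_of("$peer$rel");
  }, $brick);

Run against one brick at a time.)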
How can I debug this? I'd imagine it has something to do with the extended attributes on either the file or the parent directory, but as far as I can tell those all look fine.
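My next step is probably to dump the attribute values (not just their names) in hex and diff them across all six bricks, e.g. from the file's directory on each brick:

  getfattr -d -m . -e hex 049891002526_01_09.wma.sigKey01.k
  getfattr -d -m . -e hex .

If the 16-byte glusterfs.dht layout on the parent directory has a hole or an overlap across the distribute subvolumes, I'd guess a lookup could miss a file that clearly exists on disk.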
thanks,
liam

client glusterfs.vol:

volume brick1a
  type protocol/client
  option transport-type tcp
  option remote-host server1
  option remote-subvolume brick1a
end-volume

volume brick1b
  type protocol/client
  option transport-type tcp
  option remote-host server1
  option remote-subvolume brick1b
end-volume

volume brick1c
  type protocol/client
  option transport-type tcp
  option remote-host server1
  option remote-subvolume brick1c
end-volume

volume brick2a
  type protocol/client
  option transport-type tcp
  option remote-host server2
  option remote-subvolume brick2a
end-volume

volume brick2b
  type protocol/client
  option transport-type tcp
  option remote-host server2
  option remote-subvolume brick2b
end-volume

volume brick2c
  type protocol/client
  option transport-type tcp
  option remote-host server2
  option remote-subvolume brick2c
end-volume

volume bricks1
  type cluster/replicate
  subvolumes brick1a brick2a
end-volume

volume bricks2
  type cluster/replicate
  subvolumes brick1b brick2b
end-volume

volume bricks3
  type cluster/replicate
  subvolumes brick1c brick2c
end-volume

volume distribute
  type cluster/distribute
  subvolumes bricks1 bricks2 bricks3
end-volume

volume writebehind
  type performance/write-behind
  option block-size 1MB
  option cache-size 64MB
  option flush-behind on
  subvolumes distribute
end-volume

volume cache
  type performance/io-cache
  option cache-size 2048MB
  subvolumes writebehind
end-volume

server glusterfsd.vol:

volume intstore01a
  type storage/posix
  option directory /intstore/intstore01a/gcdata
end-volume

volume intstore01b
  type storage/posix
  option directory /intstore/intstore01b/gcdata
end-volume

volume intstore01c
  type storage/posix
  option directory /intstore/intstore01c/gcdata
end-volume

volume locksa
  type features/posix-locks
  option mandatory-locks on
  subvolumes intstore01a
end-volume

volume locksb
  type features/posix-locks
  option mandatory-locks on
  subvolumes intstore01b
end-volume

volume locksc
  type features/posix-locks
  option mandatory-locks on
  subvolumes intstore01c
end-volume

volume brick1a
  type performance/io-threads
  option thread-count 32
  subvolumes locksa
end-volume

volume brick1b
  type performance/io-threads
  option thread-count 32
  subvolumes locksb
end-volume

volume brick1c
  type performance/io-threads
  option thread-count 32
  subvolumes locksc
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick1a.allow 192.168.12.*
  option auth.addr.brick1b.allow 192.168.12.*
  option auth.addr.brick1c.allow 192.168.12.*
  subvolumes brick1a brick1b brick1c
end-volume

On Wed, Apr 22, 2009 at 5:43 PM, Liam Slusser <lslusser@gmail.com> wrote:
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div><br></div>Avati,<div><br></div><div>Big thanks. Looks like that did the trick. I'll report back in the morning if anything has changed but its looking MUCH better. Thanks again!</div>
<div><br></div><div><font color="#888888">liam</font><div><div></div><div class="h5"><div>
<div><br><div class="gmail_quote">On Wed, Apr 22, 2009 at 2:32 PM, Anand Avati <span dir="ltr"><<a href="mailto:avati@gluster.com" target="_blank">avati@gluster.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Liam,<br>
An fd leak and a lock structure leak has been fixed in the git<br>
repository, which explains a leak in the first subvolume's server.<br>
Please pull the latest patches and let us know if it does not fixe<br>
your issues. Thanks!<br>
<font color="#888888"><br>
Avati<br>
</font><div><div></div><div><br>
On Tue, Apr 21, 2009 at 3:41 PM, Liam Slusser <<a href="mailto:lslusser@gmail.com" target="_blank">lslusser@gmail.com</a>> wrote:<br>
> There is still a memory leak with rc8 on my setup. The first server in a<br>
> cluster or two servers starts out using 18M and just slowly increases.<br>
> After 30mins it has doubled in size to over 30M and just keeps growing -<br>
> the more memory it uses the worst the performance. Funny that the second<br>
> server in my cluster using the same configuration file has no such memory<br>
> problem.<br>
> My glusterfsd.vol has no performance translators, just 3 storage/posix -> 3<br>
> features/posix-locks -> protocol/server.<br>
> thanks,<br>
> liam<br>
> On Mon, Apr 20, 2009 at 2:01 PM, Gordan Bobic <<a href="mailto:gordan@bobich.net" target="_blank">gordan@bobich.net</a>> wrote:<br>
>><br>
>> Gordan Bobic wrote:<br>
>>><br>
>>> First-access failing bug still seems to be present.<br>
>>> But other than that, it seems to be distinctly better than rc4. :)<br>
>>> Good work! :)<br>
>><br>
>> And that massive memory leak is gone, too! The process hasn't grown by a<br>
>> KB after a kernel compile! :D<br>
>><br>
>> s/Good work/Awesome work/<br>
>><br>
>> :)<br>
>><br>
>><br>
>> Gordan<br>
>><br>
>><br>