This sounds like a perfect use case for <a href="https://github.com/jdarcy/negative-lookup">https://github.com/jdarcy/negative-lookup</a><div><br></div><div>Jeff Darcy wrote this as an example of building your own translators, but it essentially caches negative lookups, so repeated queries for the same missing files go away.</div>
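<div>The caching idea itself is easy to model outside GlusterFS. A toy sketch of a negative-lookup cache (plain Python; the class and method names are mine for illustration, not taken from Jeff's translator code):</div><div><br></div>

```python
import os

class NegativeLookupCache:
    """Toy model of a negative-lookup cache: remember paths already
    found to be missing so repeated lookups skip the backend entirely."""

    def __init__(self):
        self._missing = set()

    def exists(self, path):
        if path in self._missing:
            return False              # cached "does not exist": no backend hit
        if os.path.exists(path):
            return True
        self._missing.add(path)       # cache the negative result
        return False

    def invalidate(self, path):
        # must be called when a path is created or renamed in,
        # otherwise the cache would keep answering "not found"
        self._missing.discard(path)
```

<div>A real translator additionally has to invalidate entries on create/rename/unlink across all clients, which is where most of the complexity lives.</div>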
<div><br></div><div>Hell, it might even be worth writing a translator that immediately replies "does not exist" for any filename starting with ._</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Dec 19, 2012 at 8:00 PM, Adam Tygart <span dir="ltr"><<a href="mailto:mozes@k-state.edu" target="_blank">mozes@k-state.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dan,<br>
<br>
Does something like this help for Samba?<br>
<a href="http://www.e-rave.nl/disable-creation-of-ds_store-files-on-samba-shares" target="_blank">http://www.e-rave.nl/disable-creation-of-ds_store-files-on-samba-shares</a><br>
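For what it's worth, the approach on that page comes down to vetoing the Mac<br>
metadata names in smb.conf. A minimal share-level sketch ("veto files" and<br>
"delete veto files" are real Samba parameters, but verify the pattern syntax<br>
against your Samba version's smb.conf(5) man page):<br>

```ini
[gluster-share]
    path = /mnt/glustervol
    ; vetoed names are neither visible nor accessible to SMB clients,
    ; so Finder's ._* probes should never have to hit GlusterFS
    veto files = /.DS_Store/._*/
    delete veto files = yes
```
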
<br>
Negative lookups are a particularly expensive operation on distributed<br>
filesystems, GlusterFS in particular. It has to check through each<br>
brick to see that the file *really* doesn't exist, increasing the<br>
lookup overhead immensely. If Samba doesn't pass these queries through to<br>
GlusterFS, it might help a lot.<br>
<br>
--<br>
Adam Tygart<br>
Beocat Sysadmin<br>
<div><div class="h5"><br>
On Wed, Dec 19, 2012 at 8:29 PM, Daniel Mons <<a href="mailto:daemons@kanuka.com.au">daemons@kanuka.com.au</a>> wrote:<br>
> I'm rolling out 4 lots of 6-node GlusterFS setups for my employer. Each<br>
> node is ~33TB of RAID6 backed storage (16x 3TB SATA disks in RAID6 with a<br>
> hot spare hanging off an LSI controller, with 2x SSDs configured for<br>
> caching), and Gluster is configured in distribute-replicate. Each cluster<br>
> is 200TB of raw space, 100TB usable after replication. When complete, there<br>
> will be 4 of these clusters.<br>
><br>
> Nodes are configured as XFS with 512-byte inodes, running a fully patched<br>
> CentOS6 and Gluster 3.3.1. Each node has a 6 core Xeon processor (with HT<br>
> for 12 threads) with 32GB of RAM. Each node runs 2x 10Gbps Ethernet over<br>
> fiber in a bonded configuration (single IP address per node) for a full<br>
> 20Gbits per node.<br>
><br>
> GlusterFS FUSE performance under Linux is great (clients run a mix of Ubuntu<br>
> 12.04 LTS for workstations and CentOS6 for servers). Samba performance back<br>
> to Windows 7 clients is great. NFS performance via both Gluster's userspace<br>
> setup and CentOS6's native NFSv4 kernel server is great to most other<br>
> systems where we can't get the Gluster FUSE client loaded (large<br>
> industry-specific Linux boxes that are provided by vendors as a "black box"<br>
> solution, and only allow limited access via NFS or SMB/CIFS). All testing<br>
> so far under those conditions proves orders of magnitude faster throughput<br>
> than our existing single NAS solutions.<br>
><br>
> MacOSX Finder performance is a problem, however. There's a huge bug in<br>
> MacOSX itself that prevents using NFS at all (discussions on other mailing<br>
> lists suggest it occurred somewhere around 10.6, and continues through into<br>
> 10.7 and 10.8).<br>
><br>
> Mounting via SMB under OSX is more stable than NFS; however, in folders with<br>
> a large number of files, Finder goes looking for a corresponding Apple<br>
> Resource Fork file (for every "filename.ext", it looks for a<br>
> "._filename.ext"). Running tcpdump and wireshark on the Gluster nodes shows<br>
> that the resulting "FILE_NOT_FOUND" error back to the client takes a very<br>
> long time. Configuring a single node as a pure NAS with the same software<br>
> (but no Gluster implementation) is lightning fast. As soon as GlusterFS<br>
> comes into play, reporting of each "FILE_NOT_FOUND" slows down the process<br>
> dramatically, causing a directory with ~1000 images in it to take well over<br>
> 5 minutes to display the contents in MacOSX Finder.<br>
><br>
> This problem is resolved somewhat by switching to AFP (via Netatalk loaded<br>
> on the GlusterFS nodes), but it has its own problems unique to that<br>
> protocol, and I'd rather stick to GlusterFS-FUSE, NFS or SMB in that order<br>
> of preference.<br>
><br>
> It's worth noting that through the terminal, these problems don't exist.<br>
> Mounting via SMB, browsing to the volume in terminal and running "ls" or<br>
> "find" style commands retrieve file listings at a similar speed to Linux and<br>
> Windows. The problem is limited to clients using Finder to browse<br>
> directories, and again particularly ones with a large number of files that<br>
> don't have matching Apple Resource Fork files. (Of note, creating empty<br>
> files of the matching "._filename.ext" format solves the performance<br>
> problem, but litters our filestores with millions of empty files, which we<br>
> don't want).<br>
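> For anyone wanting to reproduce that stub-file experiment (not recommending<br>
> it as a fix, for the litter reasons above), something like this is enough;<br>
> the function name and example path are mine:<br>

```shell
#!/bin/sh
# Create an empty AppleDouble stub ("._name") next to every regular
# file under the given directory that lacks one -- the workaround
# described above. Existing ._* files are left alone.
make_stubs() {
    find "$1" -type f ! -name '._*' | while read -r f; do
        stub="$(dirname "$f")/._$(basename "$f")"
        [ -e "$stub" ] || : > "$stub"
    done
}
# e.g.: make_stubs /mnt/gluster/somedir
```
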
><br>
> I understand the problem is not strictly Gluster's issue. Finder is looking<br>
> for a heck of a lot of files that don't exist (which is a pretty silly<br>
> design), and as far as we can see it occurs only when Samba re-exports<br>
> GlusterFS volumes. Likewise, Apple's NFS bug, which has now persisted<br>
> across three releases of their OS, is pretty horrible. But<br>
> hopefully I can at least describe the problem and prompt some testing by<br>
> others.<br>
><br>
> I haven't had a chance to test a MacOSX FUSE client due to time constraints,<br>
> but that would at least answer the question of whether the problem is Gluster's lag<br>
> in reporting of files not found, or Samba's.<br>
><br>
> -Dan<br>
><br>
><br>
</div></div>> _______________________________________________<br>
> Gluster-users mailing list<br>
> <a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>
> <a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a><br>
</blockquote></div><br></div>