<div dir="ltr">Thanks for the suggestion!<div><br></div><div style>My testfiles are a few bytes each, so the max-file-size setting doesn't apply, unfortunately.</div><div style><br></div><div style>Disabling quick-read for the gluster client mount indeed gives a bit lower latency, but it still isn't better than the Gluster NFS client (or using the native Linux NFS server, for that matter).</div>

Cheers!
//Willem


On Wed, Apr 24, 2013 at 5:31 AM, Raghavendra Gowdappa <rgowdapp@redhat.com> wrote:
Hi Willem,

Please find my comments inline:

----- Original Message -----
> From: "Willem" <gwillem@gmail.com>
> To: gluster-users@gluster.org
> Sent: Thursday, April 18, 2013 11:58:46 PM
> Subject: [Gluster-users] Low (<0.2ms) latency reads, is it possible at all?
>
> I'm testing GlusterFS viability for use with a typical PHP webapp (i.e.
> lots of small files). I don't care so much for the C in the CAP theorem,
> as I have very few writes. I could live with a write propagation delay of
> 5 minutes (or dirty caches for up to 5 minutes).
>
> So I'm optimizing for low-latency reads of small files. My test setup is
> 2-node replication; each node is both server and gluster client, and both
> are in sync. I stop glusterfs-server on node2. On node1, I run a simple
> benchmark: repeatedly (to prime the cache) open and close 1000 small
> files. I have enabled the client-side io-cache and quick-read translators
> (see below for config).
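>
> (For anyone reproducing this: the client mount is a plain FUSE mount,
> something along these lines, using hostname g1 from the volfile below and
> the benchmark's default mount point.)
>
>     mount -t glusterfs g1:/testvol /mnt/glusterfs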
>
> The results are consistently 2 ms per open (O_RDONLY) call, which is too
> slow, unfortunately, as I need < 0.2 ms.
>
> With the same test against a local Gluster server over an NFS mount, I get
> somewhat better performance, but still 0.6 ms.
>
> With the Linux kernel NFS server (v3) and a local mount, the same test
> gives 0.12 ms per open.
>
> I can't explain the lag using Gluster, because I can't see any traffic
> being sent to node2. I would expect that with the io-cache translator and
> local-only operation, the performance would approach that of the kernel FS
> cache.
>
> Is this assumption correct? If yes, how would I profile the client
> subsystem to detect the bottleneck?
>
> If no, then I have to accept that 0.8 ms open calls are the best I can
> squeeze out of this system, and I'll probably look into AFS, userspace
> async replication, or a gluster NFS mount with cachefilesd. Which would
> you recommend?
>
> Thanks a lot!
> BTW, I like Gluster a lot and hope it is also suitable for this
> small-files use case ;)
>
> //Willem
>
> PS: I'm testing with kernel 3.5.0-17-generic (64-bit) and gluster
> 3.2.5-1ubuntu1.
>
> Client volfile:
> +------------------------------------------------------------------------------+
> volume testvol-client-0
>     type protocol/client
>     option remote-host g1
>     option remote-subvolume /data
>     option transport-type tcp
> end-volume
>
> volume testvol-client-1
>     type protocol/client
>     option remote-host g2
>     option remote-subvolume /data
>     option transport-type tcp
> end-volume
>
> volume testvol-replicate-0
>     type cluster/replicate
>     subvolumes testvol-client-0 testvol-client-1
> end-volume
>
> volume testvol-write-behind
>     type performance/write-behind
>     option flush-behind on
>     subvolumes testvol-replicate-0
> end-volume
>
> volume testvol-io-cache
>     type performance/io-cache
>     option max-file-size 256KB
>     option cache-timeout 60
>     option priority *.php:3,*:0
>     option cache-size 256MB
>     subvolumes testvol-write-behind
> end-volume
>
> volume testvol-quick-read
>     type performance/quick-read
The default value of quick-read's "max-file-size" option is 64KB, so it seems your files are bigger than 64KB. Can you add this option and rerun the tests? Also, can you rerun the tests with quick-read disabled and compare the results?
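
For example (a sketch; I'm assuming quick-read accepts the same max-file-size spelling that io-cache uses above), the quick-read section of your client volfile would become:

    volume testvol-quick-read
        type performance/quick-read
        option max-file-size 256KB   # assumed: same syntax as the io-cache option
        option cache-size 256MB
        subvolumes testvol-io-cache
    end-volume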
<div><div class="h5"><br>
> 37: option cache-size 256MB<br>
> 38: subvolumes testvol-io-cache<br>
> 39: end-volume<br>
> 40:<br>
> 41: volume testvol<br>
> 42: type debug/io-stats<br>
> 43: option latency-measurement off<br>
> 44: option count-fop-hits off<br>
> 45: subvolumes testvol-quick-read<br>
> 46: end-volume<br>
><br>
> Server volfile:
> +------------------------------------------------------------------------------+
> volume testvol-posix
>     type storage/posix
>     option directory /data
> end-volume
>
> volume testvol-access-control
>     type features/access-control
>     subvolumes testvol-posix
> end-volume
>
> volume testvol-locks
>     type features/locks
>     subvolumes testvol-access-control
> end-volume
>
> volume testvol-io-threads
>     type performance/io-threads
>     subvolumes testvol-locks
> end-volume
>
> volume testvol-marker
>     type features/marker
>     option volume-uuid bc89684f-569c-48b0-bc67-09bfd30ba253
>     option timestamp-file /etc/glusterd/vols/testvol/marker.tstamp
>     option xtime off
>     option quota off
>     subvolumes testvol-io-threads
> end-volume
>
> volume /data
>     type debug/io-stats
>     option latency-measurement off
>     option count-fop-hits off
>     subvolumes testvol-marker
> end-volume
>
> volume testvol-server
>     type protocol/server
>     option transport-type tcp
>     option auth.addr./data.allow *
>     subvolumes /data
> end-volume
>
> My benchmark to simulate PHP webapp I/O:
>
> #!/usr/bin/env python
>
> import sys
> import os
> import time
> import optparse
>
> # Decorator: print the wall-clock time of a whole test run, in ms.
> def print_timing(func):
>     def wrapper(*arg):
>         t1 = time.time()
>         res = func(*arg)
>         t2 = time.time()
>         print '%-15.15s %6d ms' % (func.func_name, int((t2 - t1) * 1000.0))
>         return res
>     return wrapper
>
>
> def parse_options():
>     parser = optparse.OptionParser()
>     parser.add_option("--path", "-p", default="/mnt/glusterfs",
>         help="Base directory for running tests (default: /mnt/glusterfs)")
>     parser.add_option("--num", "-n", type="int", default=100,
>         help="Number of files per test (default: 100)")
>     (options, args) = parser.parse_args()
>     return options
>
>
> class FSBench():
>     def __init__(self, path="/tmp", num=100):
>         self.path = path
>         self.num = num
>
>     @print_timing
>     def test_open_read(self):
>         # Open, read and close each small file (the PHP include pattern).
>         for filename in self.get_files():
>             f = open(filename)
>             data = f.read()
>             f.close()
>
>     def get_files(self):
>         # Generate the test file names: <path>/test_000 .. test_NNN.
>         for i in range(self.num):
>             yield self.path + "/test_%03d" % i
>
>     @print_timing
>     def test_stat(self):
>         for filename in self.get_files():
>             os.stat(filename)
>
>     @print_timing
>     def test_stat_nonexist(self):
>         # stat() names that don't exist, to time negative lookups.
>         for filename in self.get_files():
>             try:
>                 os.stat(filename + "blkdsflskdf")
>             except OSError:
>                 pass
>
>     @print_timing
>     def test_write(self):
>         for filename in self.get_files():
>             f = open(filename, 'w')
>             f.write('hi there\n')
>             f.close()
>
>     @print_timing
>     def test_delete(self):
>         for filename in self.get_files():
>             os.unlink(filename)
>
>
> if __name__ == '__main__':
>     options = parse_options()
>     bench = FSBench(path=options.path, num=options.num)
>     bench.test_write()
>     bench.test_open_read()
>     bench.test_stat()
>     bench.test_stat_nonexist()
>     bench.test_delete()
>
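> For reference, I invoke it like this (fsbench.py is just my local name for
> the script; the flags are the optparse options defined above):
>
>     python fsbench.py --path /mnt/glusterfs --num 1000
>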
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users