Thanks. I believe this is fixed now.

-------- Original Message --------
From: Pitichai Pitimaneeyakul <pitichai@2-cans.com>
Sent: Fri, Jul 27, 2012 04:48 AM
To: gluster-devel@nongnu.org
Subject: Re: [Gluster-devel] [Gluster-users] kernel parameters for improving gluster writes on millions of small writes (long)

Hi John,

I tried to access the page you mentioned, and somehow got a "page not
found" error. I also tried to access http://community.gluster.org/ and
found that it is the Helpshift website.

Regards,
Pitichai


On 07/26/2012 10:07 PM, John Mark Walker wrote:
> Harry,
>
> Have you seen this post?
>
> http://community.gluster.org/a/linux-kernel-tuning-for-glusterfs/
>
> Be sure to read all the comments, as Ben England chimes in there, and
> he's one of the performance engineers at Red Hat.
>
> -JM
>
> ----- Harry Mangalam <hjmangalam@gmail.com> wrote:
>> This is a continuation of my previous posts about improving write
>> performance when trapping millions of small writes to a gluster
>> filesystem. I was able to improve write performance by ~30x by
>> running STDOUT through gzip to consolidate and reduce the output
>> stream.
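>>
>> (Concretely, that trick amounts to something like the following,
>> with 'some_app' standing in for the real program, which writes its
>> records to STDOUT:
>>
>>   some_app input.fq | gzip -1 > output.txt.gz  # 'some_app' is hypothetical
>>
>> gzip buffers and compresses the stream, so gluster sees far fewer,
>> much larger writes.)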
>>
>> Today, another similar problem, having to do with yet another
>> bioinformatics program. (These days such programs typically handle
>> the 'short reads' that come out of the majority of sequencing
>> hardware, each read being 30-150 characters, with some metadata,
>> stored in an ASCII file containing millions of such entries.)
>> Reading them doesn't seem to be a problem (at least on our systems),
>> but writing them is quite awful.
>>
>> The program is called 'art_illumina', from the Broad Inst's
>> 'ALLPATHS' suite; it generates an artificial Illumina data set from
>> an input genome, in this case about 5GB of the type of data
>> described above. Like before, the gluster process goes to >100% CPU
>> and the program itself slows to ~20-30% of a CPU. In this case, the
>> app's output cannot be externally trapped by redirecting through
>> gzip, since the output flag specifies the base filename for 2 files
>> that are created internally and then written directly. This
>> prevents even setting up a named pipe to trap and process the
>> output.
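>>
>> (For a tool that writes a single output file at a path you control,
>> the named-pipe trick would look roughly like this; a sketch, with
>> 'tool' and its '-o' flag purely hypothetical:
>>
>>   mkfifo /tmp/out.pipe                    # create the FIFO
>>   gzip -1 < /tmp/out.pipe > out.txt.gz &  # drain it in the background
>>   tool -o /tmp/out.pipe input.fa          # 'tool' is hypothetical
>>   wait && rm /tmp/out.pipe
>>
>> art_illumina defeats this because it appends its own suffixes to
>> the base filename and opens both files itself.)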
>>
>> Since this gluster storage was set up specifically for
>> bioinformatics, this is a recurring problem, and while some of the
>> issues can be dealt with by trapping and converting output, it would
>> be VERY NICE if we could deal with it at the OS level.
>>
>> The gluster volume is running over IPoIB on QDR IB and looks like
>> this:
>> Volume Name: gl
>> Type: Distribute
>> Volume ID: 21f480f7-fc5a-4fd8-a084-3964634a9332
>> Status: Started
>> Number of Bricks: 8
>> Transport-type: tcp,rdma
>> Bricks:
>> Brick1: bs2:/raid1
>> Brick2: bs2:/raid2
>> Brick3: bs3:/raid1
>> Brick4: bs3:/raid2
>> Brick5: bs4:/raid1
>> Brick6: bs4:/raid2
>> Brick7: bs1:/raid1
>> Brick8: bs1:/raid2
>> Options Reconfigured:
>> performance.write-behind-window-size: 1024MB
>> performance.flush-behind: on
>> performance.cache-size: 268435456
>> nfs.disable: on
>> performance.io-cache: on
>> performance.quick-read: on
>> performance.io-thread-count: 64
>> auth.allow: 10.2.*.*,10.1.*.*
>>
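>> (For reference, those options were applied with gluster's standard
>> volume-set command, e.g.:
>>
>>   gluster volume set gl performance.write-behind-window-size 1024MB
>>   gluster volume set gl performance.flush-behind on
>>   gluster volume set gl performance.io-thread-count 64
>>
>> and likewise for the rest.)
>>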
>> I've tried to increase every caching option that might improve this
>> kind of performance, but it doesn't seem to help. At this point,
>> I'm wondering whether changing the client (or server) kernel
>> parameters will help.
>>
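>> (The kernel knobs I'd look at first are the VM dirty-page settings;
>> a sketch only, with values that are guesses rather than tested
>> recommendations:
>>
>>   # allow more dirty data to accumulate before writeback kicks in
>>   sysctl -w vm.dirty_background_ratio=10
>>   sysctl -w vm.dirty_ratio=40
>>   # let dirty pages live longer before flushing (centiseconds)
>>   sysctl -w vm.dirty_expire_centisecs=6000
>>
>> Whether these even apply to a FUSE-mounted gluster client is part
>> of the question.)
>>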
>> The client's meminfo is:
>>
>> cat /proc/meminfo
>> MemTotal: 529425924 kB
>> MemFree: 241833188 kB
>> Buffers: 355248 kB
>> Cached: 279699444 kB
>> SwapCached: 0 kB
>> Active: 2241580 kB
>> Inactive: 278287248 kB
>> Active(anon): 190988 kB
>> Inactive(anon): 287952 kB
>> Active(file): 2050592 kB
>> Inactive(file): 277999296 kB
>> Unevictable: 16856 kB
>> Mlocked: 16856 kB
>> SwapTotal: 563198732 kB
>> SwapFree: 563198732 kB
>> Dirty: 1656 kB
>> Writeback: 0 kB
>> AnonPages: 486876 kB
>> Mapped: 19808 kB
>> Shmem: 164 kB
>> Slab: 1475476 kB
>> SReclaimable: 1205944 kB
>> SUnreclaim: 269532 kB
>> KernelStack: 5928 kB
>> PageTables: 27312 kB
>> NFS_Unstable: 0 kB
>> Bounce: 0 kB
>> WritebackTmp: 0 kB
>> CommitLimit: 827911692 kB
>> Committed_AS: 536852 kB
>> VmallocTotal: 34359738367 kB
>> VmallocUsed: 1227732 kB
>> VmallocChunk: 33888774404 kB
>> HardwareCorrupted: 0 kB
>> AnonHugePages: 376832 kB
>> HugePages_Total: 0
>> HugePages_Free: 0
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 2048 kB
>> DirectMap4k: 201088 kB
>> DirectMap2M: 15509504 kB
>> DirectMap1G: 521142272 kB
>>
>> and the server's meminfo is:
>>
>> $ cat /proc/meminfo
>> MemTotal: 32861400 kB
>> MemFree: 1232172 kB
>> Buffers: 29116 kB
>> Cached: 30017272 kB
>> SwapCached: 44 kB
>> Active: 18840852 kB
>> Inactive: 11772428 kB
>> Active(anon): 492928 kB
>> Inactive(anon): 75264 kB
>> Active(file): 18347924 kB
>> Inactive(file): 11697164 kB
>> Unevictable: 0 kB
>> Mlocked: 0 kB
>> SwapTotal: 16382900 kB
>> SwapFree: 16382680 kB
>> Dirty: 8 kB
>> Writeback: 0 kB
>> AnonPages: 566876 kB
>> Mapped: 14212 kB
>> Shmem: 1276 kB
>> Slab: 429164 kB
>> SReclaimable: 324752 kB
>> SUnreclaim: 104412 kB
>> KernelStack: 3528 kB
>> PageTables: 16956 kB
>> NFS_Unstable: 0 kB
>> Bounce: 0 kB
>> WritebackTmp: 0 kB
>> CommitLimit: 32813600 kB
>> Committed_AS: 3053096 kB
>> VmallocTotal: 34359738367 kB
>> VmallocUsed: 340196 kB
>> VmallocChunk: 34342345980 kB
>> HardwareCorrupted: 0 kB
>> AnonHugePages: 200704 kB
>> HugePages_Total: 0
>> HugePages_Free: 0
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 2048 kB
>> DirectMap4k: 6656 kB
>> DirectMap2M: 2072576 kB
>> DirectMap1G: 31457280 kB
>>
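>> (Those numbers are a single snapshot; to watch the relevant
>> counters during a run, something like
>>
>>   watch -n 1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'
>>
>> is enough. Note that Dirty is tiny on both machines.)
>>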
>> Does this suggest any approach? Is there a doc that suggests
>> optimal kernel parameters for gluster?
>>
>> I guess the only other option is to use the glusterfs as an NFS
>> mount and use the NFS client's caching? That would help a single
>> process but decrease the overall cluster bandwidth considerably.
>>
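>> (Something like the following is what I have in mind; a sketch,
>> which assumes gluster's built-in NFS server, i.e. nfs.disable
>> turned back off on the volume, and an arbitrary mount point:
>>
>>   mount -t nfs -o vers=3,noatime bs1:/gl /mnt/gl-nfs  # gluster NFS is v3-only
>>
>> The NFS client's page cache would then absorb the small writes.)
>>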
>> --
>> Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
>> [m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
>> 415 South Circle View Dr, Irvine, CA, 92697 [shipping]
>> MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)

_______________________________________________
Gluster-devel mailing list
Gluster-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/gluster-devel