<div dir="ltr">Seems like heavy FINODELK contention. As a diagnostic step, can you try disabling eager-locking and check the write performance again (gluster volume set $name cluster.eager-lock off)?</div><div class="gmail_extra">
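<div>A minimal sketch of that diagnostic sequence, assuming the volume is the homegfs volume from the quoted message and that eager-lock was on before the test. The run helper and the DRY_RUN variable are hypothetical conveniences, not part of gluster; by default the script only prints the commands it would run.</div>
<pre>
```shell
#!/bin/sh
# Volume name taken from the quoted "gluster volume info" output; override with VOL=...
VOL="${VOL:-homegfs}"

# Hypothetical helper: print each command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Start collecting per-operation statistics on the volume.
run gluster volume profile "$VOL" start
# Disable eager-locking, as suggested in the message above.
run gluster volume set "$VOL" cluster.eager-lock off
echo "# ... rerun the 2.5GB restart-file write here ..."
# Dump the statistics and compare FINODELK latency with the earlier profile.
run gluster volume profile "$VOL" info
# Restore the setting once the test is done (assumes it was on before).
run gluster volume set "$VOL" cluster.eager-lock on
```
</pre>
<div>Running it with DRY_RUN=0 on one of the gluster servers would execute the commands instead of printing them.</div>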

<br><br><div class="gmail_quote">On Tue, Aug 5, 2014 at 11:44 AM, David F. Robinson <span dir="ltr">&lt;<a href="mailto:david.robinson@corvidtec.com" target="_blank">david.robinson@corvidtec.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div>

<div>Forgot to attach profile info in previous email.  Attached...  </div><span class="HOEnZb"><font color="#888888">

<div> </div>

<div>David</div></font></span><div><div class="h5">

<div> </div>

<div> </div>

<div>------ Original Message ------</div>

<div>From: &quot;David F. Robinson&quot; &lt;<a href="mailto:david.robinson@corvidtec.com" target="_blank">david.robinson@corvidtec.com</a>&gt;</div>

<div>To: <a href="mailto:gluster-devel@gluster.org" target="_blank">gluster-devel@gluster.org</a></div>

<div>Sent: 8/5/2014 2:41:34 PM</div>

<div>Subject: Fw: Re: Corvid gluster testing</div>

<div> </div>

<div>

<blockquote cite="http://eme3fc1a31-df17-4978-a99c-f4b348e690f3@dfrobins-vaio" type="cite">

<div>I have been testing some of the fixes that Pranith incorporated into the 3.5.2-beta to see how they performed for moderate levels of i/o. All of the stability issues that I had seen in previous versions seem to have been fixed in 3.5.2; however, there still seem to be some significant performance issues.  Pranith suggested that I send this to the gluster-devel email list, so here goes: </div>


<div> </div>

<div>I am running an MPI job that saves a restart file to the gluster file system.  When I use the following in my fstab to mount the gluster volume, the i/o time for the 2.5GB file is roughly 45 seconds.</div>

<div> </div>

<div><em><font>    gfsib01a.corvidtec.com:/homegfs /homegfs glusterfs transport=tcp,_netdev 0 0<br></font></em></div>

<div>When I switch this to use the NFS protocol (see below), the i/o time is 2.5 seconds.</div>

<div> </div>

<div><em><font>  gfsib01a.corvidtec.com:/homegfs /homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768 0 0</font></em></div>

<div> </div>

<div>The read times for gluster are 10-20% faster than NFS, but the write times are almost 20x slower (45 seconds versus 2.5 seconds).  </div>
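<div>As a rough sanity check of the write figures above, the two timings can be turned into throughput numbers. This is only a sketch: the mbps function name is hypothetical, and it assumes 1GB = 1000MB.</div>
<pre>
```shell
#!/bin/sh
# Throughput in MB/s for a file of $1 megabytes written in $2 seconds.
mbps() {
    awk -v s="$1" -v t="$2" 'BEGIN { printf "%.1f", s / t }'
}

size_mb=2500    # the 2.5GB restart file, taking 1GB = 1000MB (an assumption)
fuse_secs=45    # write time on the glusterfs (fuse) mount
nfs_secs=2.5    # write time on the nfs mount

echo "fuse mount: $(mbps "$size_mb" "$fuse_secs") MB/s"
echo "nfs mount:  $(mbps "$size_mb" "$nfs_secs") MB/s"
# Ratio of the two write times.
awk -v f="$fuse_secs" -v n="$nfs_secs" 'BEGIN { printf "write slowdown: %.0fx\n", f / n }'
```
</pre>
<div>With the timings quoted above this works out to about 55.6 MB/s on the fuse mount against 1000 MB/s on the nfs mount, a factor of 18.</div>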

<div> </div>

<div>I am running SL 6.4 and glusterfs-3.5.2-0.1.beta1.el6.x86_64... </div>

<div> </div>

<div><em><font>[root@gfs01a glusterfs]# gluster volume info homegfs<br>Volume Name: homegfs<br>Type: Distributed-Replicate<br>Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071<br>Status: Started<br>Number of Bricks: 2 x 2 = 4<br>

Transport-type: tcp<br>Bricks:<br>Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs<br>Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs<br>Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs<br>Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs</font></em><br>

</div>

<div> </div>

<div>David</div>

<div> </div>

<div>------ Forwarded Message ------</div>

<div>From: &quot;Pranith Kumar Karampuri&quot; &lt;<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;</div>

<div>To: &quot;David Robinson&quot; &lt;<a href="mailto:david.robinson@corvidtec.com" target="_blank">david.robinson@corvidtec.com</a>&gt;</div>

<div>Cc: &quot;Young Thomas&quot; &lt;<a href="mailto:tom.young@corvidtec.com" target="_blank">tom.young@corvidtec.com</a>&gt;</div>

<div>Sent: 8/5/2014 2:25:38 AM</div>

<div>Subject: Re: Corvid gluster testing</div>

<div> </div>

<div>

<div><a href="mailto:gluster-devel@gluster.org" target="_blank">gluster-devel@gluster.org</a> is the email-id for the mailing list. We should probably start with the initial run numbers and the comparison between glusterfs and nfs mounts. Maybe something like: </div>


<div> </div>

<div>glusterfs mount: 90 minutes </div>

<div>nfs mount: 25 minutes </div>

<div> </div>

<div>And profile outputs, volume config, number of mounts, hardware configuration should be a good start. </div>

<div> </div>

<div>Pranith </div>

<div> </div>

<div>On 08/05/2014 09:28 AM, David Robinson wrote: </div>

<blockquote type="cite">

<div>Thanks, Pranith. </div>

<div> </div>

<div> </div>

<div>=============================== </div>

<div>David F. Robinson, Ph.D. </div>

<div>President - Corvid Technologies </div>

<div><a href="tel:704.799.6944%20x101" value="+17047996944" target="_blank">704.799.6944 x101</a> [office] </div>

<div><a href="tel:704.252.1310" value="+17042521310" target="_blank">704.252.1310</a> [cell] </div>

<div><a href="tel:704.799.7974" value="+17047997974" target="_blank">704.799.7974</a> [fax] </div>

<div><a href="mailto:David.Robinson@corvidtec.com" target="_blank">David.Robinson@corvidtec.com</a> </div>

<div><a href="http://www.corvidtechnologies.com/" target="_blank">http://www.corvidtechnologies.com</a> </div>

<div> </div>

<blockquote type="cite">

<div>On Aug 4, 2014, at 11:22 PM, Pranith Kumar Karampuri &lt;<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt; wrote: </div>

<div> </div>

<div> </div>

<blockquote type="cite">

<div>On 08/05/2014 08:33 AM, Pranith Kumar Karampuri wrote: </div>

<div> </div>

<div>On 08/05/2014 08:29 AM, David F. Robinson wrote: </div>

<blockquote type="cite">

<blockquote type="cite">

<blockquote type="cite">

<div>On 08/05/2014 12:51 AM, David F. Robinson wrote: </div>

<div>No. I don&#39;t want to use nfs. It eliminates most of the reasons I want to use gluster: failover redundancy of the pair, load balancing, etc. </div></blockquote>

<div>What do you mean by &#39;failover redundancy of the pair, load balancing&#39;? Could you elaborate? smb/nfs/glusterfs are just access protocols that gluster supports; the functionality is almost the same. </div></blockquote>


<div>Here is my understanding. Please correct me where I am wrong. </div>

<div> </div>

<div>With gluster, if I am doing a write and one member of a replicated pair goes down, there is no interruption to the I/O. The failover is handled by gluster and the fuse client. This isn&#39;t done if I use an nfs mount, unless the member of the pair that goes down isn&#39;t the one I used for the mount. </div>


<div> </div>

<div>With nfs, I will have to mount one of the bricks. So, if I have gfs01a, gfs01b, gfs02a, gfs02b, gfs03a, gfs03b, etc. and my fstab mounts gfs01a, it is my understanding that all of my I/O will go through gfs01a, which then distributes it to all of the other bricks. Gfs01a throughput becomes a bottleneck. Whereas if I do a gluster mount using fuse, the load balancing is handled on the client side, not the server side. If I have 1000 nodes accessing 20 gluster bricks, I need the load balancing aspect. I cannot have all traffic going through the network interface on a single brick. </div>


<div> </div>

<div>If I am wrong with the above assumptions, I guess my question is why would one ever use the gluster mount instead of nfs and/or samba? </div>

<div> </div>

<div>Tom: feel free to chime in if I have missed anything. </div></blockquote>

<div>I see your point now. Yes, the gluster server where you did the mount is kind of a bottleneck. </div></blockquote>

<div>Now that we have established that the problem is in the clients/protocols, you should send a detailed mail to gluster-devel and see if anyone can help you with performance xlators that can improve it a bit more. My area of expertise is more on replication. I am sub-maintainer for the replication and locks components. I also know the connection management/io-threads related issues which lead to hangs, as I worked on them before. Performance xlators are a black box to me. </div>


<div> </div>

<div>Performance xlators are enabled only on the fuse gluster stack. On nfs server mounts we disable all the performance xlators except write-behind, as the nfs client does lots of things to improve performance. I suggest you guys follow up more on gluster-devel. </div>


<div> </div>

<div>Appreciate all the help you have given in improving the product :-). Thanks a ton! </div>

<div>Pranith </div>

<blockquote type="cite">

<div>Pranith </div>

<blockquote type="cite">

<div>David (Sent from mobile) </div>

</blockquote></blockquote></blockquote></blockquote>

<div> </div></div></blockquote></div></div></div></div><br>_______________________________________________<br>

Gluster-devel mailing list<br>

<a href="mailto:Gluster-devel@gluster.org">Gluster-devel@gluster.org</a><br>

<a href="http://supercolony.gluster.org/mailman/listinfo/gluster-devel" target="_blank">http://supercolony.gluster.org/mailman/listinfo/gluster-devel</a><br>

<br></blockquote></div><br></div>