<html><head/><body><p style="color:#000000;">Paul,<br><br>I think that's a fair summary. Basically, solutions for these use cases are akin to duct tape and baling wire until we architect a real solution. Speaking of... if anyone wants to write their own scheduler, we welcome your collaboration.<br><br>To be blunt, this type of usage was not a priority for us, performance-wise. However, I can see that changing over the next quarter or so, especially as we get into the next round of roadmap planning.<br><br>Which brings me to... roadmap planning. If you care about small file performance, it is imperative that you participate in this process. If you're on gluster-devel, you'll see some announcements coming up shortly. Part of this means putting your feature requests into Bugzilla, as well as participating in the conversation so that we know if we're moving in the right direction.<br><br>We know there are limitations. Help is on the way. For those of you who are struggling now, I invite you to help us make a better project for everyone. I appreciate your patience in sticking with us.<br><br>Thanks,<br>JM <br><br><br>-------- Original Message --------<br> From: pkoelle &lt;pkoelle@gmail.com&gt;<br> Sent: Tue, Nov 6, 2012 05:16 AM<br> To: gluster-users@gluster.org<br> CC: <br> Subject: Re: [Gluster-users] Very slow directory listing and high CPU usage on replicated volume<br><br></p>Hi,
<br>

<br>

from Joe's findings it seems the IO subsystem has little impact on 
<br>

performance. If a ramdisk doesn't help with small files and directory 
<br>

listings the problem is probably gluster itself.
<br>

<br>

That being said IMO this is a general limitation in the design. There is 
<br>

no central lock manager and last time I checked gluster had to stat() on 
<br>

each brick to find the most recent entry.
<br>

FUSE overhead and context switches make things worse, but to what extent 
<br>

and which part is responsible for high CPU/high latency is still 
<br>

unknown. AFAIK no one has measured in detail, or the findings were such 
<br>

that internal policy is not to publish the results.
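<br>

<br>

A rough way to put numbers on this is to time a directory listing plus one stat() per entry, which is roughly what an ls -l does, first on a local directory and then on the Gluster mount. The sketch below is only a minimal illustration: the helper names time_listing and make_small_files are made up for this example, it measures client-side wall-clock time only, and it cannot tell FUSE overhead apart from per-brick lookups.

```python
import os
import statistics
import tempfile
import time


def time_listing(path):
    """Time one directory listing plus an lstat() per entry (similar to `ls -l`)."""
    t0 = time.perf_counter()
    names = os.listdir(path)
    list_seconds = time.perf_counter() - t0

    stat_seconds = []
    for name in names:
        t1 = time.perf_counter()
        os.lstat(os.path.join(path, name))
        stat_seconds.append(time.perf_counter() - t1)

    return {
        "entries": len(names),
        "list_seconds": list_seconds,
        "stat_total_seconds": sum(stat_seconds),
        "stat_median_seconds": statistics.median(stat_seconds) if stat_seconds else 0.0,
    }


def make_small_files(path, count, size=512):
    """Create `count` small files so the listing has something to stat."""
    payload = b"x" * size
    for i in range(count):
        with open(os.path.join(path, f"f{i:05d}"), "wb") as fh:
            fh.write(payload)


if __name__ == "__main__":
    # Assumption: a temp dir stands in for the directory under test.
    # Point `d` at a directory on the mounted volume to compare.
    with tempfile.TemporaryDirectory() as d:
        make_small_files(d, 1000)
        print(time_listing(d))
```

Running the same script against a local directory and against the mounted volume gives a per-entry stat cost for each, which is at least a starting point for the comparison nobody seems to have published.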
<br>

<br>

IIRC there was/is a stat cache translator but that is just a workaround 
<br>

and will lead to stale data.
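<br>

<br>

To illustrate why a stat cache trades correctness for speed, here is a toy time-based cache in Python. It is not the Gluster translator and makes no claim about how that translator works; TTLStatCache and demo are hypothetical names, and the injected clock is there only so the example is deterministic.

```python
import os
import tempfile
import time


class TTLStatCache:
    """Toy stat cache: serves a cached os.stat result until it is `ttl` seconds old."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}  # path -> (time fetched, os.stat_result)

    def stat(self, path):
        now = self.clock()
        hit = self._entries.get(path)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # fast path, but possibly stale
        result = os.stat(path)
        self._entries[path] = (now, result)
        return result


def demo():
    fake_now = [0.0]  # manual clock so the example does not need to sleep
    cache = TTLStatCache(ttl=1.0, clock=lambda: fake_now[0])
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "file")
        with open(path, "wb") as fh:
            fh.write(b"abc")
        first = cache.stat(path).st_size   # fetched from the filesystem
        with open(path, "ab") as fh:
            fh.write(b"defg")              # stands in for a write by another client
        stale = cache.stat(path).st_size   # served from cache, misses the append
        fake_now[0] = 2.0                  # entry is now older than the TTL
        fresh = cache.stat(path).st_size   # refetched, sees the new size
    return first, stale, fresh


if __name__ == "__main__":
    print(demo())  # (3, 3, 7)
```

The middle value is the problem: until the entry expires, the cache keeps reporting the old size even though the file has changed underneath it.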
<br>

<br>

cheers
<br>

  Paul
<br>

<br>

Am 06.11.2012 10:35, schrieb Fernando Frediani (Qube):
<br>

> Joe,
<br>

>
<br>

> I don't think we have to accept this, as it is not acceptable. I have seen countless people complaining about this problem for a while, and it seems no improvements have been made.
<br>

> The ramdisk, although it might help, looks more like chewing gum. I have seen other distributed filesystems that don't suffer from the same problem, so why does Gluster have to?
<br>

>
<br>

> -----Original Message-----
<br>

> From: gluster-users-bounces@gluster.org [mailto:gluster-users-bounces@gluster.org] On Behalf Of Joe Landman
<br>

> Sent: 05 November 2012 15:07
<br>

> To: gluster-users@gluster.org
<br>

> Subject: Re: [Gluster-users] Very slow directory listing and high CPU usage on replicated volume
<br>

>
<br>

> On 11/05/2012 09:57 AM, harry mangalam wrote:
<br>

>> Jeff Darcy wrote a nice piece in his hekafs blog about 'the importance
<br>

>> of keeping things sequential' which is essentially about the
<br>

>> contention for heads between data io and journal io.
<br>

>> &lt;<a href="http://hekafs.org/index.php/2012/11/the-importance-of-staying-sequential/">http://hekafs.org/index.php/2012/11/the-importance-of-staying-sequential/</a>&gt;
<br>

>> (also congrats on the Linux Journal article on the glupy
<br>

>> python/gluster approach).
<br>

>>
<br>

>> We've been experimenting with SSDs on ZFS (using the SSDs for the ZIL
<br>

>> (journal)) and while it's provided a little bit of a boost, it has not
<br>

>> been dramatic.  Ditto XFS.  However, we did not stress it at all with
<br>

>> heavy loads
<br>

>
<br>

> An issue you have to worry about is if the SSD streaming read/write path is around the same speed as the spinning rust performance.  If so, this design would be a wash at best.
<br>

>
<br>

> Also, if this is under Linux, the ZFS pathways may not be terribly well optimized.
<br>

>
<br>

>> in a gluster env and I'm now thinking that there is where you would
<br>

>> see the improvement. (see Jeff's graph about how the diff in
<br>

>> threads/load affects IOPS).
<br>

>>
<br>

>> Is anyone running a gluster system with the underlying XFS writing the
<br>

>> journal to SSDs?  If so, any improvement?  I would have expected to
<br>

>> hear about this as a recommended architecture for gluster if it had
<br>

>> performed MUCH better, but
<br>

>
<br>

> Yes, we've done this, and do this on occasion.  No, there's no dramatic speed boost for most use cases.
<br>

>
<br>

> Unfortunately, heavy metadata ops on GlusterFS are going to be slow, and we simply have to accept that for the near term.  This appears to be independent of the particular file system, or even storage technology.
<br>

> If you aren't doing metadata heavy ops, then you should be in good shape.  It appears that mirroring magnifies the metadata heavy ops significantly.
<br>

>
<br>

> For laughs, about a year ago, we set up large ram disks (tmpfs) in a cluster, put a loopback device on them, then a file system, then GlusterFS atop this.  Should have been very fast for metadata ops.  But it wasn't.  Gave some improvement, but not significant enough that we'd recommend doing "heroic" designs like this.
<br>

>
<br>

> If your workloads are metadata heavy, we'd recommend local IO, and if you are mostly small IO, an SSD.
<br>

>
<br>

>
<br>

>
<br>

>
<br>

<br>

<br>

-- 
<br>

<a href="https://ssl.facebook.com/help/contact.php?show_form=delete_account">https://ssl.facebook.com/help/contact.php?show_form=delete_account</a>
<br>


</body></html>