Sorry, I couldn't reply earlier; I was indisposed the last few days.

Thanks for the input, Brian, especially on the 'map' translator. It led me to
another one called the 'switch' scheduler, which seems to do exactly what I
want, i.e. distribute files onto selected bricks based on file extension.

I'm trying to find out more about it. Please point me to more information if
you have any.

Regards,

Indivar Nair


On Fri, Sep 28, 2012 at 1:12 PM, Brian Candler <B.Candler@pobox.com> wrote:

On Fri, Sep 28, 2012 at 08:58:55AM +0530, Indivar Nair wrote:
> We were trying to cater to both large file (100MB - 2GB) read speed and
> small file (10-50MB) read+write speed.
> With Gluster, we were thinking of setting the individual stripe size to
> 50MB so that each volume could hold a complete small file. While larger
> files could be striped across in 50MB chunks.

Sure, although you can do the same type of striping with RAID too. The RAID
might be slightly less efficient, in that a small file might straddle two
chunks, whereas a small file in Gluster will always hit one brick. And parity
RAID is unlikely to work well with such a large chunk size: it's very unlikely
you will ever write a whole stripe at once, so you won't avoid reading back
existing parity blocks.
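To put rough numbers on that (assuming, purely for illustration, a 10-disk
RAID6 set, i.e. 8 data disks):

    chunk_mb   = 50    # per-disk chunk size being considered
    data_disks = 8     # e.g. a 10-disk RAID6 set has 8 data disks

    full_stripe_mb = chunk_mb * data_disks
    print("full stripe = %d MB" % full_stripe_mb)   # 400 MB

    # Any write smaller than this (or not stripe-aligned) forces the
    # controller to read back existing data/parity before it can write
    # the new parity blocks.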
> The RAID Controllers that come with branded hardware does not allow
> individual disk access (no passthrough mode)

It won't let you make one-disk RAID0 sets? However, I'm not convinced that
this will make much difference in your case anyway. You're looking for high
throughput with large files, which is limited simply by drive throughput, and
a write cache will not help with that.

Write caches are useful for operations which write small bits of data and must
have confirmation that the data has reached disk before continuing (e.g. inode
updates), where writing out of sequence could result in filesystem corruption.

Of course, the best answer is simply to measure it with a test workload
representative of your expected usage.
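For example, a quick-and-dirty way to see whether the controller cache is
helping that kind of workload is to time small synchronous writes (a rough
sketch only; the path is a placeholder, and a proper tool such as fio will
give you far better numbers):

    import os, time

    path = "/mnt/test/fsync-probe.dat"   # placeholder: file on the storage under test
    count = 1000

    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    start = time.time()
    for _ in range(count):
        os.write(fd, b"x" * 4096)   # small 4 KiB write
        os.fsync(fd)                # wait until it has reached stable storage
    elapsed = time.time() - start
    os.close(fd)

    print("%d synchronous 4 KiB writes in %.2fs (%.0f ops/s)"
          % (count, elapsed, count / elapsed))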
> One more thought, is it possible to have a mix of RAID6 volumes, and
> individual disks and force Gluster to write large files (*.ma) to RAID6
> volumes and small files (*.iff) to individual disks. That would solve
> our problem completely.

Technically yes, in that you could manually hack about with the translator
stack and put in the 'map' translator:
http://gluster.org/pipermail/gluster-users/2010-March/004292.html

However, this is completely unsupported by Red Hat (both manually hacking
about with the stack, and the map translator itself).

If it were me, I'd just put the *.ma files on one volume and the *.iff ones
on a different one, at the application layer. But I'd only do this if I
really needed RAID10 for some data and RAID6 for the rest.
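As a rough sketch of what that application-layer split could look like (the
mount points and the extension-to-volume mapping below are placeholders, not
anything Gluster provides):

    import os, shutil

    # Placeholder mount points for two separate Gluster volumes.
    DESTINATIONS = {
        ".ma":  "/mnt/vol-raid6",    # large scene files
        ".iff": "/mnt/vol-raid10",   # small image files
    }

    def place(src):
        """Copy a file onto whichever volume its extension maps to."""
        ext = os.path.splitext(src)[1].lower()
        if ext not in DESTINATIONS:
            raise ValueError("no volume configured for %r" % ext)
        dest = os.path.join(DESTINATIONS[ext], os.path.basename(src))
        shutil.copy2(src, dest)
        return dest

    # e.g. place("/incoming/shot01.ma")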
Regards,

Brian.