<html><head><meta http-equiv="content-type" content="text/html; charset=UTF-8" /></head><body bgcolor="#FFFFFF" text="#000000">Excellent! I&#39;ve been toying with the same concept in the back of my mind for a long while now. I&#39;m sure there is an unrealized desire for such tools.<br>

<br>

When you&#39;re ready, please put such a toolset on <a href="http://forge.gluster.org">forge.gluster.org</a>. <br><br><div class="gmail_quote">On April 16, 2014 6:50:48 AM PDT, Michael Peek &lt;peek@nimbios.org&gt; wrote:<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">


    Hi guys,<br />

    <br />

    (I'm new to this, so pardon me if my shenanigans turn out to be a

    waste of your time.)<br />

    <br />

    I have been experimenting with Gluster by copying and deleting large

    numbers of files of all sizes.  What I found was that when deleting

    a large number of small files, the deletion process seems to take a

    good chunk of my time -- in some cases it seemed to take a

    significant percentage of the time that it took to copy the files to

    the cluster to begin with.  I'm guessing that the reason is a

    combination of find and rm -fr processing files serially and having

    to wait on the packets to travel back and forth over the network. 

    But with a clustered filesystem, processing files serially means

    waiting on network round trips that could be overlapped.<br />

    <br />

    So I decided to try an experiment.  Instead of using /bin/rm to

    delete files serially, I wrote my own quick-and-dirty recursive rm

    (and recursive ls) that uses pthreads (listed as "cluster-rm" and

    "cluster-ls" in the table below):<br />

    <br />

    Methods:<br />

    <br />

    1) This was done on a Linux system.  I suspect that Linux (or any

    modern OS) caches filesystem information.  For example, after

    setting up a directory, when running rm -fr on that directory, the

    time for rm to complete is lessened if I first run find on the same

    directory.  So to avoid this caching effect, each command was run on

    its own test directory.  (I.e. find was never run on the same

    directory as rm -fr or cluster-rm.)  This approach seemed to prevent

    inconsistencies resulting from any caching behavior, resulting in

    run times that were more consistent.<br />

    <br />

    2) Each test directory contained the exact same data for each of the

    four commands tested (find, cluster-ls, rm, cluster-rm) for each

    test run.<br />

    <br />

    3) All commands were run on a client machine and not one of the

    cluster nodes.<br />
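    <br />

    The methods above could be scripted roughly as follows.  This is only a sketch: the function names are made up for illustration, it records wall-clock time only (the table's user and sys columns are not captured), and it assumes every tool accepts the target directory as its last argument.<br />

```python
import subprocess
import time


def time_command(argv):
    """Run one command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def run_benchmark(commands, directories):
    """Time each command on its own directory so no run benefits from a warm cache.

    `commands` maps a label to an argv prefix, e.g. {"rm": ["rm", "-fr"]}.
    `directories` lists one freshly populated test directory per command.
    """
    if len(set(directories)) != len(commands):
        raise ValueError("each command needs its own test directory")
    timings = {}
    for (label, argv), directory in zip(commands.items(), directories):
        # Assumption: the directory is passed as the final argument.
        timings[label] = time_command(argv + [directory])
    return timings
```

    The directory check is the important part: it enforces that find is never run on the same directory as rm -fr or cluster-rm.<br />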

    <br />

    Results:<br />

    <br />

    
    <table border="1" cellpadding="2" cellspacing="2" width="100%"><tbody><tr><td valign="top"><u><b>Data Size</b></u><br />

          </td><td valign="top"><u><b>Command</b></u><br />

          </td><td valign="top"><u><b>Test #1</b></u><br />

          </td><td valign="top"><u><b>Test #2</b></u><br />

          </td><td valign="top"><u><b>Test #3</b></u><br />

          </td><td valign="top"><u><b>Test #4</b></u><br />

          </td></tr><tr><td valign="top">49GB<br />

          </td><td valign="top">find -print<br />

          </td><td valign="top">real    6m45.066s<br />

            user    0m0.172s<br />

            sys    0m0.748s<br />

          </td><td valign="top">real    6m18.524s<br />

            user    0m0.140s<br />

            sys    0m0.508s<br />

          </td><td valign="top">real    5m45.301s<br />

            user    0m0.156s<br />

            sys    0m0.484s<br />

          </td><td valign="top">real    5m58.577s<br />

            user    0m0.132s<br />

            sys    0m0.480s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top">cluster-ls<br />

          </td><td valign="top">real    2m32.770s<br />

            user    0m0.208s<br />

            sys    0m1.876s<br />

          </td><td valign="top">real    2m21.376s<br />

            user    0m0.164s<br />

            sys    0m1.568s<br />

          </td><td valign="top">real    2m40.511s<br />

            user    0m0.184s<br />

            sys    0m1.488s<br />

          </td><td valign="top">real    2m36.202s<br />

            user    0m0.172s<br />

            sys    0m1.412s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td></tr><tr><td valign="top">49GB<br />

          </td><td valign="top">rm -fr<br />

          </td><td valign="top">real    16m36.264s<br />

            user    0m0.232s<br />

            sys    0m1.724s<br />

          </td><td valign="top">real    16m16.795s<br />

            user    0m0.248s<br />

            sys    0m1.528s<br />

          </td><td valign="top">real    15m54.503s<br />

            user    0m0.204s<br />

            sys    0m1.396s<br />

          </td><td valign="top">real    16m10.037s<br />

            user    0m0.168s<br />

            sys    0m1.448s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top">cluster-rm<br />

          </td><td valign="top">real    1m50.717s<br />

            user    0m0.236s<br />

            sys    0m1.820s<br />

          </td><td valign="top">real    1m44.803s<br />

            user    0m0.192s<br />

            sys    0m2.100s<br />

          </td><td valign="top">real    2m6.250s<br />

            user    0m0.224s<br />

            sys    0m2.200s<br />

          </td><td valign="top">real    2m6.367s<br />

            user    0m0.224s<br />

            sys    0m2.316s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td></tr><tr><td valign="top">97GB<br />

          </td><td valign="top">find -print<br />

          </td><td valign="top">real    11m39.990s<br />

            user    0m0.380s<br />

            sys    0m1.428s<br />

          </td><td valign="top">real    11m21.018s<br />

            user    0m0.380s<br />

            sys    0m1.224s<br />

          </td><td valign="top">real    11m33.257s<br />

            user    0m0.288s<br />

            sys    0m0.924s<br />

          </td><td valign="top">real    11m4.867s<br />

            user    0m0.332s<br />

            sys    0m1.244s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top">cluster-ls<br />

          </td><td valign="top">real    4m46.829s<br />

            user    0m0.504s<br />

            sys    0m3.228s<br />

          </td><td valign="top">real    5m15.538s<br />

            user    0m0.408s<br />

            sys    0m3.736s<br />

          </td><td valign="top">real    4m52.075s<br />

            user    0m0.364s<br />

            sys    0m3.004s<br />

          </td><td valign="top">real    4m43.134s<br />

            user    0m0.452s<br />

            sys    0m3.140s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td></tr><tr><td valign="top">97GB<br />

          </td><td valign="top">rm -fr<br />

          </td><td valign="top">real    29m34.138s<br />

            user    0m0.520s<br />

            sys    0m3.908s<br />

          </td><td valign="top">real    28m11.000s<br />

            user    0m0.556s<br />

            sys    0m3.480s<br />

          </td><td valign="top">real    28m37.154s<br />

            user    0m0.412s<br />

            sys    0m2.756s<br />

          </td><td valign="top">real    28m41.724s<br />

            user    0m0.380s<br />

            sys    0m4.184s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top">cluster-rm<br />

          </td><td valign="top">real    3m30.750s<br />

            user    0m0.524s<br />

            sys    0m4.932s<br />

          </td><td valign="top">real    4m20.195s<br />

            user    0m0.456s<br />

            sys    0m5.316s<br />

          </td><td valign="top">real    4m45.206s<br />

            user    0m0.444s<br />

            sys    0m4.584s<br />

          </td><td valign="top">real    4m26.894s<br />

            user    0m0.436s<br />

            sys    0m4.732s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td></tr><tr><td valign="top">145GB<br />

          </td><td valign="top">find -print<br />

          </td><td valign="top">real    16m26.498s<br />

            user    0m0.520s<br />

            sys    0m2.244s<br />

          </td><td valign="top">real    16m53.047s<br />

            user    0m0.596s<br />

            sys    0m1.740s<br />

          </td><td valign="top">real    15m10.704s<br />

            user    0m0.364s<br />

            sys    0m1.748s<br />

          </td><td valign="top">real    15m53.943s<br />

            user    0m0.456s<br />

            sys    0m1.764s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top">cluster-ls<br />

          </td><td valign="top">real    6m52.006s<br />

            user    0m0.644s<br />

            sys    0m5.664s<br />

          </td><td valign="top">real    7m7.361s<br />

            user    0m0.804s<br />

            sys    0m5.432s<br />

          </td><td valign="top">real    7m4.109s<br />

            user    0m0.652s<br />

            sys    0m4.800s<br />

          </td><td valign="top">real    6m37.229s<br />

            user    0m0.656s<br />

            sys    0m4.652s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td></tr><tr><td valign="top">145GB<br />

          </td><td valign="top">rm -fr<br />

          </td><td valign="top">real    40m10.396s<br />

            user    0m0.624s<br />

            sys    0m5.492s<br />

          </td><td valign="top">real    42m17.851s<br />

            user    0m0.844s<br />

            sys    0m4.872s<br />

          </td><td valign="top">real    39m6.493s<br />

            user    0m0.484s<br />

            sys    0m4.868s<br />

          </td><td valign="top">real    39m52.047s<br />

            user    0m0.496s<br />

            sys    0m4.980s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top">cluster-rm<br />

          </td><td valign="top">real    6m49.769s<br />

            user    0m0.708s<br />

            sys    0m6.440s<br />

          </td><td valign="top">real    8m34.644s<br />

            user    0m0.852s<br />

            sys    0m8.345s<br />

          </td><td valign="top">real    6m3.563s<br />

            user    0m0.636s<br />

            sys    0m5.844s<br />

          </td><td valign="top">real    6m31.808s<br />

            user    0m0.664s<br />

            sys    0m5.996s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td></tr><tr><td valign="top">1.1TB<br />

          </td><td valign="top">find -print</td><td valign="top">real    62m4.043s<br />

            user    0m1.300s<br />

            sys    0m5.448s<br />

          </td><td valign="top">real    61m11.584s<br />

            user    0m1.204s<br />

            sys    0m5.172s<br />

          </td><td valign="top">real    65m37.389s<br />

            user    0m1.708s<br />

            sys    0m4.276s<br />

          </td><td valign="top">real    63m51.822s<br />

            user    0m3.096s<br />

            sys    0m9.869s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top">cluster-ls<br />

          </td><td valign="top">real    73m12.463s<br />

            user    0m2.472s<br />

            sys    0m19.289s<br />

          </td><td valign="top">real    68m37.846s<br />

            user    0m2.080s<br />

            sys    0m18.625s<br />

          </td><td valign="top">real    72m56.417s<br />

            user    0m2.516s<br />

            sys    0m18.601s<br />

          </td><td valign="top">real    69m3.575s<br />

            user    0m4.316s<br />

            sys    0m35.986s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td><td valign="top"><br />

          </td></tr><tr><td valign="top">1.1TB<br />

          </td><td valign="top">rm -fr<br />

          </td><td valign="top">real    188m1.925s<br />

            user    0m2.240s<br />

            sys    0m21.705s<br />

          </td><td valign="top">real    190m21.850s<br />

            user    0m2.372s<br />

            sys    0m18.885s<br />

          </td><td valign="top">real    200m25.712s<br />

            user    0m5.840s<br />

            sys    0m46.363s<br />

          </td><td valign="top">real    196m12.686s<br />

            user    0m4.916s<br />

            sys    0m41.519s<br />

          </td></tr><tr><td valign="top"><br />

          </td><td valign="top">cluster-rm<br />

          </td><td valign="top">real    85m46.463s<br />

            user    0m2.512s<br />

            sys    0m30.478s<br />

          </td><td valign="top">real    90m29.055s<br />

            user    0m2.600s<br />

            sys    0m30.382s<br />

          </td><td valign="top">real    88m16.063s<br />

            user    0m4.456s<br />

            sys    0m51.667s<br />

          </td><td valign="top">real    77m42.096s<br />

            user    0m2.464s<br />

            sys    0m31.638s<br />

          </td></tr></tbody></table>
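
    <br />

    For a quick read of the table, the rm -fr and cluster-rm rows can be reduced to one ratio per data size.  The sketch below copies the "real" times from the table and divides the mean rm -fr time by the mean cluster-rm time; the helper names are made up for illustration.<br />

```python
def to_seconds(stamp):
    """Convert a `time`-style stamp such as '16m36.264s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def mean_seconds(stamps):
    """Average a list of `time`-style stamps, in seconds."""
    return sum(to_seconds(s) for s in stamps) / len(stamps)


# "real" times copied from the table above, Test #1 through Test #4.
REAL = {
    "49GB": {
        "rm -fr": ["16m36.264s", "16m16.795s", "15m54.503s", "16m10.037s"],
        "cluster-rm": ["1m50.717s", "1m44.803s", "2m6.250s", "2m6.367s"],
    },
    "97GB": {
        "rm -fr": ["29m34.138s", "28m11.000s", "28m37.154s", "28m41.724s"],
        "cluster-rm": ["3m30.750s", "4m20.195s", "4m45.206s", "4m26.894s"],
    },
    "145GB": {
        "rm -fr": ["40m10.396s", "42m17.851s", "39m6.493s", "39m52.047s"],
        "cluster-rm": ["6m49.769s", "8m34.644s", "6m3.563s", "6m31.808s"],
    },
    "1.1TB": {
        "rm -fr": ["188m1.925s", "190m21.850s", "200m25.712s", "196m12.686s"],
        "cluster-rm": ["85m46.463s", "90m29.055s", "88m16.063s", "77m42.096s"],
    },
}


def speedups(results):
    """Return mean(rm -fr) / mean(cluster-rm) for each data size."""
    return {
        size: mean_seconds(rows["rm -fr"]) / mean_seconds(rows["cluster-rm"])
        for size, rows in results.items()
    }


if __name__ == "__main__":
    for size, ratio in speedups(REAL).items():
        print(f"{size}: cluster-rm is {ratio:.1f}x faster than rm -fr")
```

    With these numbers the ratio is a little over 8x at 49GB, about 6.7x at 97GB, about 5.8x at 145GB, and a little over 2x at 1.1TB.<br />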

    <br />

    <br />

    Conclusions:<br />

    <br />

    1) Once I had a threaded version of rm, a threaded version of ls was

    easy to make, so I included it in the tests (listed above as

    cluster-ls).  Performance looked spiffy up until the 1.1TB range,

    when cluster-ls started taking more time than find.  Right now I

    can't explain why.  1.1TB takes a long time to set up and process

    (about a day for each set of four commands), so it could be that

    regular nightly backups might be interfering with performance.  If

    that's the case, then it calls into question the usefulness of my

    threaded approach.  Also, naturally the output from cluster-ls is

    out of order, so grep and sed would most likely be used in

    conjunction with something like that, and I haven't yet time-tested

    'cluster-ls | some-other-command' against using plain old find by

    itself.<br />

    <br />

    2) Results from cluster-rm look pretty good to me across the board. 

    Again, performance seems to fall off in the 1.1TB tests, and the

    reasons are not clear to me at this time, but run time is still

    less than half that of rm -fr.  Run times fluctuate more than in the previous

    tests, but I suppose that's to be expected.  But since performance

    does drop, it makes me wonder how well this approach scales on

    larger sets of data.<br />

    <br />

    3) My threaded cluster-rm/ls commands are not clever.  While

    traversing directories, each subdirectory found gets a new thread

    to process it, until a hard-coded limit is reached

    (for the above results, 100 threads were used).  After the thread

    count limit is reached, directories are processed using plain old

    recursion until a thread exits, freeing up a thread to process

    another subdirectory.<br />
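
    <br />

    The strategy in point 3 can be sketched in a few lines.  This is not the original pthreads code, only a Python illustration of the same idea; the function names are invented, and a semaphore stands in for the thread counter.<br />

```python
import os
import threading

MAX_THREADS = 100  # the hard-coded limit used for the results above


def remove_tree(path, slots):
    """Delete everything under `path`, then `path` itself."""
    with os.scandir(path) as it:
        entries = list(it)  # snapshot the listing before deleting anything
    workers = []
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            if slots.acquire(blocking=False):
                # A thread slot is free: hand this subdirectory to a new thread.
                worker = threading.Thread(target=_worker, args=(entry.path, slots))
                worker.start()
                workers.append(worker)
            else:
                # Limit reached: fall back to plain recursion in this thread.
                remove_tree(entry.path, slots)
        else:
            os.unlink(entry.path)
    # The directory can only be removed once every child thread has emptied
    # its subdirectory.
    for worker in workers:
        worker.join()
    os.rmdir(path)


def _worker(path, slots):
    """Thread body: process one subdirectory, then give the slot back."""
    try:
        remove_tree(path, slots)
    finally:
        slots.release()


def cluster_rm(path, max_threads=MAX_THREADS):
    """Entry point: remove `path` recursively with at most `max_threads` extra threads."""
    remove_tree(path, threading.BoundedSemaphore(max_threads))
```

    Errors raised inside a worker thread are not reported back to the caller in this sketch; a real tool would need to collect and surface them.<br />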

    <br />

    Further Research:<br />

    <br />

    A) I would like to test further with larger data sets.<br />

    <br />

    B) I would like to implement a smarter algorithm for determining how

    many threads to use to maximize performance.  Rather than a

    hard-coded maximum, a better approach might be to use some metric

    for measuring number of inodes processed per second, and use that to

    determine the effectiveness of adding more threads until a local

    maximum is reached.<br />

    <br />

    C) How do these numbers change if the commands are run on one of the

    cluster nodes instead of a client?<br />
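
    <br />

    One possible shape for the idea in B is a simple hill climb: keep raising the thread count while the measured rate keeps improving, and stop at the first step that does not help.  Everything here is hypothetical, including the `measure_rate` callback (which would have to run a sample of the workload with a given thread count and return inodes per second) and the default parameters.<br />

```python
def tune_thread_count(measure_rate, start=1, factor=2, max_threads=1024, min_gain=0.05):
    """Grow the thread count until throughput stops improving.

    measure_rate(n) must return the observed inodes/second with n threads.
    Returns (thread_count, rate) at the first local maximum found.
    """
    best_n = start
    best_rate = measure_rate(best_n)
    while best_n * factor <= max_threads:
        candidate = best_n * factor
        rate = measure_rate(candidate)
        # Require a minimum relative gain so measurement noise is not
        # mistaken for a real improvement.
        if rate <= best_rate * (1 + min_gain):
            break
        best_n, best_rate = candidate, rate
    return best_n, best_rate
```

    This only finds a local maximum, and on a shared cluster the rate can shift with other load, so the measurement would probably need repeating during a long run.<br />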

    <br />

    I have some ideas of smarter things to try, but I am at best an

    inexperienced (if enthusiastic) dabbler in the programming arts.  A

    professional would likely do a much better job.<br />

    <br />

    But if this data looks at all interesting or useful, then maybe

    there would be a call for a handful of cluster-specific filesystem

    tools?<br />

    <br />

    Michael Peek<br />

    <br />

  
<p style="margin-top: 2.5em; margin-bottom: 1em; border-bottom: 1px solid #000"></p><pre class="k9mail"><hr /><br />Gluster-users mailing list<br />Gluster-users@gluster.org<br /><a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a></pre></blockquote></div><br>

</body></html>