<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    My minimal donation:<br>

    <br>

    <div class="moz-cite-prefix">On 07/10/2013 04:01 AM, Allan Latham

      wrote:<br>

    </div>

    <blockquote cite="mid:51DD3F0A.8080808@flexsys-group.de" type="cite">

      <pre wrap="">There seems to be a problem with the way gluster is going.

For me it would be an ideal solution if it actually worked.</pre>

    </blockquote>

    Actually working is always the ideal. Actually working for all

    possible use cases... may be a little more difficult (though still

    ideal).<br>

    <blockquote cite="mid:51DD3F0A.8080808@flexsys-group.de" type="cite">

      <pre wrap="">I have a simple scenario and it just simply doesn't work. Reading over

the network when the file is available locally is plainly wrong. Our

application cannot take the performance hit nor the extra network traffic.</pre>

    </blockquote>

    It's not "wrong", just not the way you envision it.<br>

    <br>

    Typically, in a scaled scenario where clustered storage has the

    strongest advantage, you'll have a limited number of storage servers

    and a much greater number of application servers. The likelihood

    that any of those application servers is going to have the file they

    want locally, even if they're shared-use, is pretty slim.

    Engineering for that probability is the "correct" solution in that

    use case.<br>

    <blockquote cite="mid:51DD3F0A.8080808@flexsys-group.de" type="cite">

      <pre wrap="">I would suggest:

1. get a simple minimalist configuration working - 2 hosts and

replication only.

2. make it bomb-proof.

2a. it must cope with network failures, random reboots etc.

2b. if it stops it has to auto-recover quickly.</pre>

    </blockquote>

    So far, all done within reasonable parameters. "Bomb-proof" is an

    obvious exaggeration and is unattainable. If you literally blow up

    all your servers, you're going to lose data.<br>

    <blockquote cite="mid:51DD3F0A.8080808@flexsys-group.de" type="cite">

      <pre wrap="">

2c. if it can't it needs thorough documentation and adequate logs so a

reasonable sysop can rescue it.</pre>

    </blockquote>

    Define "reasonable sysop". Recovering from any failure that isn't

    repaired automatically is going to require a certain amount of understanding

    about clustering, split-brain, and split-brain recovery. That's not

    your typical first-tier sysop, IMHO.<br>

    <blockquote cite="mid:51DD3F0A.8080808@flexsys-group.de" type="cite">

      <pre wrap="">

2d. it needs a fast validation scanner which verifies that data is where

it should be and is identical everywhere (md5sum).</pre>

    </blockquote>

    md5sum isn't the fastest checksum algorithm.<br>
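    <br>
    A minimal sketch of how one might compare candidates, assuming
    Python's standard hashlib and zlib modules. The function names and
    the 8 MiB test buffer are made up for illustration, and this is not
    how gluster itself verifies replicas. Non-cryptographic checksums
    such as CRC32 or Adler-32 are usually cheaper than MD5, but the
    numbers depend on your hardware and library build, so measure on
    your own systems:<br>
    <pre wrap="">
```python
import hashlib
import time
import zlib


def time_checksum(label, func, data, repeats=5):
    """Return (label, best wall-clock seconds) for one call of func(data)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(data)
        best = min(best, time.perf_counter() - start)
    return label, best


def compare_checksums(size_bytes=8 * 1024 * 1024):
    """Time several checksum functions over the same in-memory buffer."""
    # Hypothetical test data: a constant-filled buffer, not a real brick file.
    data = b"\x5a" * size_bytes
    candidates = [
        ("md5", lambda d: hashlib.md5(d).hexdigest()),
        ("sha1", lambda d: hashlib.sha1(d).hexdigest()),
        ("crc32", lambda d: zlib.crc32(d)),
        ("adler32", lambda d: zlib.adler32(d)),
    ]
    return dict(time_checksum(name, fn, data) for name, fn in candidates)


if __name__ == "__main__":
    results = compare_checksums()
    # Print fastest first; absolute values vary from machine to machine.
    for name, seconds in sorted(results.items(), key=lambda kv: kv[1]):
        print(f"{name:8s} {seconds * 1000:8.2f} ms")
```
</pre>
    Keep in mind that a faster checksum is also a weaker one: CRC32 and
    Adler-32 detect accidental corruption but are not collision
    resistant.<br>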

    <blockquote cite="mid:51DD3F0A.8080808@flexsys-group.de" type="cite">

      <pre wrap="">

3. make it efficient (read local whenever possible - use rsync

techniques - remove scalability obstacles so it doesn't get

exponentially slower as more files are replicated)</pre>

    </blockquote>

    See earlier point about scaled systems. Also, it does not get

    "exponentially slower as more files are replicated". That would be

    silly.<br>

    <blockquote cite="mid:51DD3F0A.8080808@flexsys-group.de" type="cite">

      <pre wrap="">4. when that works expand to multiple hosts and clever distribution

techniques.

(repeat items 2 and 3 in the more complex environment)

If it doesn't work rock solid in a simple scenario it will never work in

a large scale cluster.</pre>

    </blockquote>

    Not necessarily true. That's like <a

href="http://joejulian.name/blog/dont-get-stuck-micro-engineering-for-scale/">comparing

      Apples to Orchards</a>.<br>

    <blockquote cite="mid:51DD3F0A.8080808@flexsys-group.de" type="cite">

      <pre wrap="">

Until point 3 is reached I cannot use it - which is a great

disappointment for me as well as the good guys doing the development.</pre>

    </blockquote>

    Consider expanding your thinking to bits you have more control over.

    Network latency is probably the biggest. Consider using low-latency

    10GbE cards (1) and switches (2), or InfiniBand.<br>

    <blockquote cite="mid:51DD3F0A.8080808@flexsys-group.de" type="cite">

      <pre wrap="">

Good luck and thanks again

Allan

</pre>

    </blockquote>

    1) <a class="moz-txt-link-freetext" href="http://www.solarflare.com">http://www.solarflare.com</a> makes sub-microsecond-latency adapters

    that can use a userspace driver pinned to the CPU making the

    request, eliminating a context switch.<br>

    2) <a class="moz-txt-link-freetext" href="http://www.aristanetworks.com/en/products/7100t">http://www.aristanetworks.com/en/products/7100t</a> is a

    2.5-microsecond switch.<br>

  </body>

</html>