<div dir="ltr"><div>Dear Gluster Community,<br><br></div><div>After researching a number of distributed file systems for deployment in a production environment, with the main purpose of performing both batch and real-time distributed computing, I&#39;ve identified Gluster as a potential solution.<br>


<br></div><div>The key properties that our system should exhibit:<br><br></div><div>- open source and liberally licensed, yet production ready, i.e. a mature, <b>reliable</b> solution with both community and commercial support;<br>


</div><div>- ability to run on commodity hardware, preferably designed for it;<br></div><div>- high availability of the data, with the main focus on reads;<br></div><div>- high scalability, including operation across multiple data centres, possibly globally;<br>


</div><div>- removal of single points of failure through replication and distribution of (meta-)data.<br></div><div></div><div><br></div><div>The sensitivity points that were identified, and that resulted in the following questions, are:<br>


<br></div><div>1) Transparency to the processing layer / application with respect to data locality, e.g. knowing where data is physically located at the server level, mainly for resource allocation and fast, high-performance processing. How can this be accomplished using GlusterFS?<br>


</div><div><br>2) POSIX compliance, or conformance: Hadoop, for example, isn&#39;t POSIX compliant by design. What are the pros and cons? What is GlusterFS&#39;s approach with respect to support for POSIX operations?<br>


</div><div><br>3) Mainly with respect to evaluating the production readiness of GlusterFS: where is it currently used in production environments, and for which specific use cases does it seem most suitable? Are there any known issues or common pitfalls, and are workarounds available?<br>


<br></div><div>I realize that I&#39;ve posed quite a lot of questions above, but any answers, help, or links to where the information can be found would be very much appreciated :) In addition, specifically for GlusterFS:<br>


</div><div><br>4) It seems Gluster has the advantage in geo-replication over, for example, Ceph. What are the main advantages here?<br></div><div>5) Finally, what would be the most compelling reason to go for Gluster rather than the alternatives?<br>


<br></div><div>I&#39;m looking forward to your replies. Thanks in advance! :)<br>

</div><div><br></div>With kind regards,<br><br>Tim van Elteren<br></div>