<div dir="ltr"><br><div>To get something working quickly, one approach would be to start from the existing POSIX translator, keep the metadata on a local FS backed by an RBD volume, and store the file content directly using librados.  That would avoid the need to invent a mapping from filesystem-style metadata to librados calls, while still getting reasonably efficient data operations through to RADOS.</div>
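<div><br></div><div>Very roughly, the split could look like the Python sketch below. It only illustrates where metadata and data would live: a real translator would be written in C, the dict stands in for a RADOS pool rather than using librados, and HybridStore and all of its method names are made up for the example.</div>
<pre>
```python
import os
import tempfile
import uuid


class HybridStore:
    """Names and directories on a local FS, contents in an object store."""

    def __init__(self, meta_root):
        self.meta_root = meta_root   # would sit on the RBD-backed local FS
        self.objects = {}            # hypothetical stand-in for a RADOS pool

    def _meta_path(self, path):
        return os.path.join(self.meta_root, path.lstrip("/"))

    def _obj_id(self, path):
        with open(self._meta_path(path)) as f:
            return f.read()

    def create(self, path):
        meta = self._meta_path(path)
        os.makedirs(os.path.dirname(meta), exist_ok=True)
        obj_id = uuid.uuid4().hex
        # The local file holds only the object name, never the content.
        with open(meta, "x") as f:
            f.write(obj_id)
        self.objects[obj_id] = b""

    def write(self, path, data):
        self.objects[self._obj_id(path)] = data

    def read(self, path):
        return self.objects[self._obj_id(path)]

    def listdir(self, path):
        # Directory operations never touch the object store.
        return sorted(os.listdir(self._meta_path(path)))


store = HybridStore(tempfile.mkdtemp())
store.create("/docs/a.txt")
store.write("/docs/a.txt", b"hello")
```
</pre>
<div>Lookups and listings go to the local FS alone, so only reads and writes of file content would need to reach RADOS.</div>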

<div><br></div><div>I doubt this would be very slick, but it could be a fun hack!</div><div><br></div><div>John</div><div><br></div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, May 5, 2014 at 4:21 PM, Jeff Darcy <span dir="ltr">&lt;<a href="mailto:jdarcy@redhat.com" target="_blank">jdarcy@redhat.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Now that we&#39;re all one big happy family, I&#39;ve been mulling over<br>

different ways that the two technology stacks could work together.  One<br>

idea would be to use some of the GlusterFS upper layers for their<br>

interface and integration possibilities, but then drop down to RADOS<br>

instead of GlusterFS&#39;s own distribution and replication.  I must<br>

emphasize that I don&#39;t necessarily think this is The Right Way for<br>

anything real, but I think it&#39;s an important experiment just to see what<br>

the problems are and how well it performs.  So here&#39;s what I&#39;m thinking.<br>

<br>

For the Ceph folks, I&#39;ll describe just a tiny bit of how GlusterFS<br>

works.  The core concept in GlusterFS is a &quot;translator&quot; which accepts<br>

file system requests and generates file system requests in exactly the<br>

same form.  This allows them to be stacked in arbitrary orders, moved<br>

back and forth across the server/client divide, etc.  There are several<br>

broad classes of translators:<br>

<br>

* Some, such as FUSE or GFAPI, inject new requests into the translator<br>

  stack.<br>

<br>

* Some, such as &quot;posix&quot;, satisfy requests by calling a server-local FS.<br>

<br>

* The &quot;client&quot; and &quot;server&quot; translators together get requests from one<br>

  machine to another.<br>

<br>

* Some translators *route* requests (one in to one of several out).<br>

<br>

* Some translators *fan out* requests (one in to all of several out).<br>

<br>

* Most are one in, one out, and add e.g. locks or caching.<br>
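<br>
As a toy illustration of the route vs. fan-out distinction, here is a sketch in<br>
Python rather than the C that real translators are written in.  Every class and<br>
method name in it is hypothetical; the only point is that each layer takes and<br>
issues requests in the same form, so layers can be stacked freely.<br>
<pre>
```python
# Toy model of translator stacking; not GlusterFS code.
import zlib


class Store:
    """Bottom of the stack: satisfies requests itself (like "posix")."""

    def __init__(self, name):
        self.name = name
        self.files = {}

    def write(self, path, data):
        self.files[path] = data

    def read(self, path):
        return self.files.get(path)


class Route:
    """One in, one of several out: pick a child by hashing the path."""

    def __init__(self, children):
        self.children = children

    def _pick(self, path):
        return self.children[zlib.crc32(path.encode()) % len(self.children)]

    def write(self, path, data):
        self._pick(path).write(path, data)

    def read(self, path):
        return self._pick(path).read(path)


class FanOut:
    """One in, all of several out: every child gets every write."""

    def __init__(self, children):
        self.children = children

    def write(self, path, data):
        for child in self.children:
            child.write(path, data)

    def read(self, path):
        # Any replica will do; this sketch just asks the first one.
        return self.children[0].read(path)


# Route across two replica pairs, in the spirit of DHT on top of AFR.
bricks = [Store("brick%d" % i) for i in range(4)]
stack = Route([FanOut(bricks[0:2]), FanOut(bricks[2:4])])
stack.write("/a.txt", b"hello")
```
</pre>
After the write, exactly one replica pair holds the file, and which pair that is<br>
depends only on the hash of the path.<br>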

<br>

Of particular interest here are the DHT (routing/distribution) and AFR<br>

(fan-out/replication) translators, which mirror functionality in RADOS.<br>

My idea is to cut out everything from these on below, in favor of a<br>

translator based on librados instead.  How this works is pretty obvious<br>

for file data - just read and write to RADOS objects instead of to<br>

files.  It&#39;s a bit less obvious for metadata, especially directory<br>

entries.  One really simple idea is to store metadata as data, in some<br>

format defined by the translator itself, and have it handle the<br>

read/modify/write for adding/deleting entries and such.  That would be<br>

enough to get some basic performance tests done.  A slightly more<br>

sophisticated idea might be to use OSD class methods to do the<br>

read/modify/write, but I don&#39;t know much about that mechanism so I&#39;m not<br>

sure that&#39;s even feasible.<br>
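<br>
A minimal sketch of the simple version, in Python: a directory is one object<br>
holding a JSON map from entry names to object names, and adding or deleting an<br>
entry is a read/modify/write on that object.  FakeObjects is an in-memory<br>
stand-in, not the librados API, and the JSON format and object names are made up<br>
for illustration.<br>
<pre>
```python
import json


class FakeObjects:
    """In-memory stand-in for a RADOS pool (hypothetical, not librados)."""

    def __init__(self):
        self.objects = {}

    def read(self, name):
        return self.objects.get(name, b"")

    def write_full(self, name, data):
        self.objects[name] = data


def load_dir(pool, dir_obj):
    """Decode a directory object into a dict of entry name to object name."""
    raw = pool.read(dir_obj)
    return json.loads(raw.decode()) if raw else {}


def add_entry(pool, dir_obj, name, file_obj):
    entries = load_dir(pool, dir_obj)                        # read
    if name in entries:
        raise FileExistsError(name)
    entries[name] = file_obj                                 # modify
    pool.write_full(dir_obj, json.dumps(entries).encode())   # write


def remove_entry(pool, dir_obj, name):
    entries = load_dir(pool, dir_obj)
    if name not in entries:
        raise FileNotFoundError(name)
    del entries[name]
    pool.write_full(dir_obj, json.dumps(entries).encode())


pool = FakeObjects()
add_entry(pool, "dir.root", "a.txt", "obj.1")
add_entry(pool, "dir.root", "b.txt", "obj.2")
remove_entry(pool, "dir.root", "a.txt")
```
</pre>
Nothing in this sketch stops two clients from racing through the<br>
read/modify/write and losing an entry.<br>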

<br>

This is not something I&#39;m going to be working on as part of my main job,<br>

but I&#39;d like to get the experiment started in some of my &quot;spare&quot; time.<br>

Is there anyone else interested in collaborating, or are there any other<br>

obvious ideas I&#39;m missing?<br>


</blockquote></div><br></div>