<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <br>

    <div class="moz-cite-prefix">On 04/03/2013 02:42 PM, Jeff Darcy

      wrote:<br>

    </div>

    <blockquote cite="mid:515CA22B.10805@redhat.com" type="cite">

      <pre wrap="">(redirecting to gluster-devel as a more appropriate forum)

On 04/03/2013 05:03 PM, Jay Vyas wrote:

</pre>

      <blockquote type="cite">

        <pre wrap="">Suppose I was going to serve a petabyte of data sharded over 10 files

(1,2,3,...,10) over glusterfs, in 3 servers (call them Server1, Server2,

and Server3). 

The 3 servers would need access to the files such that :

Server 1 will usually only access file 1.

Server 2 will usually only access file 2.

Server 3 will access all ten files (the whole data set).

Is there a way to get gluster to rebalance bricks over time based on

access patterns? If not, what is the best way to increase the

average locality of access to files in the cluster?

</pre>

      </blockquote>

      <pre wrap="">

The flippant answer would be to move the computation to the data instead

of vice versa, like Hadoop is designed to do.  ;)

The less flippant answer is going to get a bit more complicated.  There

are three ways that you can control placement of a file, but none are

really supported and all could get you in trouble.  The first method is

to create the file (or a copy) with a special name of the form

file@dht:subvol, where the parts have the following meanings:

* file = the file name you really want

* dht = the name of the DHT translator in your client-side volfile

* subvol = the name (from the same volfile) of the DHT subvolume where

you want the file to go

This is reasonably safe, because it's part of how rebalance works.  To

get even fancier than that, you need to know something about how the DHT

translator uses "layouts" on directories to place files.  There's a

description here:

<a class="moz-txt-link-freetext" href="http://hekafs.org/index.php/2012/03/glusterfs-algorithms-distribution/">http://hekafs.org/index.php/2012/03/glusterfs-algorithms-distribution/</a>

The problem is that the user has very little control over how these

layouts are generated.  One thing you can do that's fairly easy is swap

the layout xattrs on two bricks, which (after a rebalance) will swap

what files they contain.  For example, if your file is on brick2 and you

want it to be on brick1, you swap the xattr values for that directory

within brick1 and brick2.
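
A minimal sketch of that swap, for test data only. It assumes the layout
is stored in an xattr named trusted.glusterfs.dht on the directory's copy
on each brick, and that it is run as root directly on the brick paths.
The function name and the injectable getx/setx parameters are
hypothetical, added only so that the swap can be dry-run.

```python
import os

# Name of the xattr assumed to hold the DHT layout for a directory.
DHT_XATTR = "trusted.glusterfs.dht"


def swap_layouts(dir_on_brick1, dir_on_brick2, getx=None, setx=None):
    """Exchange the layout xattr of the same directory on two bricks."""
    # Default to the Linux-only os.getxattr / os.setxattr; callers may
    # inject replacements, for example to rehearse the swap in memory.
    getx = getx or os.getxattr
    setx = setx or os.setxattr
    layout1 = getx(dir_on_brick1, DHT_XATTR)
    layout2 = getx(dir_on_brick2, DHT_XATTR)
    setx(dir_on_brick1, DHT_XATTR, layout2)
    setx(dir_on_brick2, DHT_XATTR, layout1)
    # Return the values now stored on brick1 and brick2, in that order.
    return layout2, layout1
```

The files only move once a rebalance has run after the swap.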

The ultimate level of control is to calculate your own layouts.  For

this to be useful in a scenario like yours, you'd need to copy or

reverse-engineer the code in the DHT translator that calculates the hash

for a file.  Knowing that, you could do something like this:

* assign a range for brick1 that contains the hash for file1

* assign a range for brick2 that contains the hash for file2

* assign the remaining range to brick3
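
The three assignments above can be sketched as follows. This is only an
illustration: it assumes a 32-bit hash space and one contiguous range per
brick, and it takes the file hashes as given rather than computing them.
The names plan_layout, pinned and rest_brick are hypothetical. Writing
the resulting ranges into the layout xattrs is not shown.

```python
HASH_MAX = 0xFFFFFFFF  # assumes a 32-bit DHT hash space


def plan_layout(pinned, rest_brick):
    """Split the hash space into one contiguous range per brick.

    pinned maps a brick name to the hash that must land on that brick.
    rest_brick receives everything above the highest pinned hash.
    """
    hashes = sorted(pinned.values())
    if len(set(hashes)) != len(hashes):
        raise ValueError("two pinned files share a hash")
    if hashes and hashes[-1] == HASH_MAX:
        raise ValueError("no hash space left for the remaining brick")
    layout = {}
    start = 0
    # Walk the bricks in hash order; each range ends at its pinned hash.
    for brick, h in sorted(pinned.items(), key=lambda item: item[1]):
        layout[brick] = (start, h)  # inclusive (start, stop)
        start = h + 1
    layout[rest_brick] = (start, HASH_MAX)
    return layout
```

Note that brick1 and brick2 also receive every other file whose hash
falls inside their ranges, so this does not isolate file1 and file2.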

I'm working on some mechanisms, and accompanying management/interface

models, to provide this sort of control in a less hacker-ish form.

Unfortunately, I'm tied down with about ten higher priorities, so I

don't have any idea when that will be ready.  In the meantime, please

try these techniques *only with test data*, and caveat emptor.

</pre>

    </blockquote>

    Rather than reverse-engineering the DHT hash function in order

    to calculate the hashes, you can just call the library function

    directly, as I do at

    <a href="http://joejulian.name/blog/dht-misses-are-expensive/">http://joejulian.name/blog/dht-misses-are-expensive/</a>

  </body>

</html>