<br><br><div class="gmail_quote">On Sun, Apr 11, 2010 at 12:42 PM, Olivier Le Cam <span dir="ltr"><<a href="mailto:Olivier.LeCam@crdp.ac-versailles.fr">Olivier.LeCam@crdp.ac-versailles.fr</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Hi -<div class="im"><br>
<br>
Raghavendra G wrote:<br>
<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
you can do that by sending the cached stats (here stat of file, stat of parent directory) through STACK_UNWIND.<br>
<br>
STACK_UNWIND_STRICT (lookup, frame, 0, 0, loc->inode, cached_stat, NULL, parent_stat);<br>
<br>
you can look into libglusterfs/src/defaults.c for default definitions of each of fop (file operations) and their call backs.<br>
</blockquote>
<br></div>
Thank you. I have been able to get a (quick and dirty) stat-lookup caching translator working. I still do not fully understand the GlusterFS internal library, so most of the caching is done by my own code.<br>
<br>
Anyway, it is enough at this stage to run some benchmarks and see whether it is possible to improve performance significantly.<br>
<br>
My first impression is quite mixed. I can indeed see some improvement when accessing small files: stat caching does its job. But for some reason, io-cache still talks to the servers before delivering a file, even if that file is available in its cache.<br>
<br>
I can see three protocol calls:<br>
- client_open() (to both servers) <br></blockquote><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
- client_stat() (to one server only: load balancing?)<br>
- client_flush() (to both servers)<br>
<br></blockquote><div><br>io-cache does indeed send open and flush to the server. This is needed for io-cache to work correctly. As you say below, since you are overriding only the lookup call, stat is still sent to the server (stat and lookup are two different calls).<br>
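To illustrate why stat still reaches the server, here is a toy model in plain C. It does not use the GlusterFS API: toy_xlator, toy_call and the other names are hypothetical, and the only assumption carried over is that a fop the translator does not override is passed down to its child.<br>

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only, NOT the GlusterFS API. All names are hypothetical. */
enum toy_fop { TOY_LOOKUP, TOY_STAT, TOY_OPEN, TOY_FLUSH, TOY_FOP_MAX };

/* A handler returns 1 if it answered the call itself, 0 to pass it down. */
typedef int (*toy_handler)(const char *path);

struct toy_xlator {
    toy_handler handlers[TOY_FOP_MAX]; /* NULL slot = default pass-through */
    struct toy_xlator *child;          /* NULL = bottom of the stack (server) */
};

int toy_server_hits[TOY_FOP_MAX];      /* calls that reached the server */

/* Pretend the metadata cache always has a hit for lookup. */
int toy_cached_lookup(const char *path)
{
    (void)path;
    return 1;
}

/* Walk down the stack until a translator answers or the server is reached. */
void toy_call(struct toy_xlator *xl, enum toy_fop op, const char *path)
{
    assert(xl != NULL);
    while (xl->child != NULL) {
        toy_handler h = xl->handlers[op];
        if (h != NULL && h(path))
            return;                    /* answered locally, no server traffic */
        xl = xl->child;
    }
    toy_server_hits[op]++;
}

/* Send one lookup, stat, open and flush through (cache -> server) and
 * return how many of them reached the server. */
int toy_demo(void)
{
    struct toy_xlator server = { {0}, NULL };
    struct toy_xlator mdc    = { {0}, &server };
    int op, total = 0;

    mdc.handlers[TOY_LOOKUP] = toy_cached_lookup; /* only lookup is overridden */

    for (op = 0; op < TOY_FOP_MAX; op++)
        toy_server_hits[op] = 0;
    for (op = 0; op < TOY_FOP_MAX; op++)
        toy_call(&mdc, (enum toy_fop)op, "/some/file");
    for (op = 0; op < TOY_FOP_MAX; op++)
        total += toy_server_hits[op];
    return total;
}
```

In this model, calling toy_demo() leaves the lookup counter at zero while stat, open and flush each reach the server once. Serving stat from the cache would need its own handler in the stat slot.<br>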
<br> <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
This might be a problem with the implementation of my translator, which only overrides "lookup" calls for now.<br>
<br>
My source code is attached (please go easy on it: as I said, it's a quick and dirty hack of the rot-13 translator).<br>
<br>
I'd like to get rid of any traffic with the servers when the file is available in the io-cache. That way I could really see if such a translator can be of any interest.<br>
<br>
Do you have any idea for achieving this goal?<div><div></div><div class="h5"><br>
<br>
Thanks and best regards,<br>
-- <br>
Olivier<br></div></div></blockquote><div><br><br>Since lookup is unwound directly whenever the stat is cached for an inode, you also have to implement calls like unlink and rmdir (which delete files/directories) and flush the cache corresponding to that inode. Otherwise lookup will keep succeeding even for unlinked files, while the actual operations (open/chmod/chown etc.) will fail.<br>
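A minimal, standalone sketch of such per-inode invalidation follows. It is plain C with no GlusterFS calls; sketch_cache, sketch_set, sketch_get and sketch_invalidate are hypothetical names, and the assumption is that the translator's unlink/rmdir handlers would call sketch_invalidate() with the affected inode number before winding the call down.<br>

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical standalone cache, not part of GlusterFS. */
#define SKETCH_BUCKETS 64

struct sketch_entry {
    ino_t ino;
    struct stat stbuf;
    struct sketch_entry *next;
};

struct sketch_cache {
    struct sketch_entry *bucket[SKETCH_BUCKETS];
};

/* Insert a stat, or overwrite the stored copy if the inode is already cached. */
int sketch_set(struct sketch_cache *c, ino_t ino, const struct stat *st)
{
    struct sketch_entry *e;

    assert(c != NULL && st != NULL);
    for (e = c->bucket[ino % SKETCH_BUCKETS]; e != NULL; e = e->next) {
        if (e->ino == ino) {
            e->stbuf = *st;            /* refresh instead of returning early */
            return 0;
        }
    }
    e = calloc(1, sizeof(*e));
    if (e == NULL)
        return -1;
    e->ino = ino;
    e->stbuf = *st;
    e->next = c->bucket[ino % SKETCH_BUCKETS];
    c->bucket[ino % SKETCH_BUCKETS] = e;
    return 0;
}

/* Return the cached stat for an inode, or NULL on a miss. */
const struct stat *sketch_get(const struct sketch_cache *c, ino_t ino)
{
    const struct sketch_entry *e;

    assert(c != NULL);
    for (e = c->bucket[ino % SKETCH_BUCKETS]; e != NULL; e = e->next)
        if (e->ino == ino)
            return &e->stbuf;
    return NULL;
}

/* Drop the entry for one inode; returns 1 if an entry was removed. */
int sketch_invalidate(struct sketch_cache *c, ino_t ino)
{
    struct sketch_entry **pp;

    assert(c != NULL);
    pp = &c->bucket[ino % SKETCH_BUCKETS];
    while (*pp != NULL) {
        if ((*pp)->ino == ino) {
            struct sketch_entry *dead = *pp;
            *pp = dead->next;          /* unlink from the chain, then free */
            free(dead);
            return 1;
        }
        pp = &(*pp)->next;
    }
    return 0;
}
```

After sketch_invalidate() the next sketch_get() for that inode misses, so a lookup would have to go to the server again instead of being answered from stale data.<br>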
<br>You should also handle calls that can change the stat of a file or directory (like write/chmod/chown). As a simple implementation, you can just flush the cache.<br> <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div class="h5">
</div></div><br>/*<br>
Copyright (c) 2006-2009 Gluster, Inc. <<a href="http://www.gluster.com" target="_blank">http://www.gluster.com</a>><br>
This file is part of GlusterFS.<br>
<br>
GlusterFS is free software; you can redistribute it and/or modify<br>
it under the terms of the GNU General Public License as published<br>
by the Free Software Foundation; either version 3 of the License,<br>
or (at your option) any later version.<br>
<br>
GlusterFS is distributed in the hope that it will be useful, but<br>
WITHOUT ANY WARRANTY; without even the implied warranty of<br>
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU<br>
General Public License for more details.<br>
<br>
You should have received a copy of the GNU General Public License<br>
along with this program. If not, see<br>
<<a href="http://www.gnu.org/licenses/" target="_blank">http://www.gnu.org/licenses/</a>>.<br>
*/<br>
<br>
#include <ctype.h><br>
#include <sys/uio.h><br>
<br>
#ifndef _CONFIG_H<br>
#define _CONFIG_H<br>
#include "config.h"<br>
#endif<br>
<br>
#include "glusterfs.h"<br>
#include "xlator.h"<br>
#include "logging.h"<br>
#include <sys/time.h><br>
<br>
#include "rot-13.h"<br>
<br>
/*<br>
* This is a rot13 ``encryption'' xlator. It rot13's data when<br>
* writing to disk and rot13's it back when reading it.<br>
* This xlator is meant as an example, NOT FOR PRODUCTION<br>
* USE ;) (hence no error-checking)<br>
*/<br>
<br>
mdc_inode_cache_t *<br>
mdc_inode_cache_delete(mdc_private_t *priv, mdc_inode_cache_t *cache)<br>
{<br>
mdc_inode_cache_t *next = cache->next;<br>
<br>
if (cache->previous)<br>
cache->previous->next = cache->next;<br>
if (cache->next)<br>
cache->next->previous = cache->previous;<br>
FREE (cache);<br>
<br>
priv->count--;<br>
return next;<br>
}<br>
<br>
int32_t<br>
mdc_inode_cache_set(xlator_t *this, ino_t ino, const struct stat *stbuf, const struct stat *postparent)<br>
{<br>
mdc_private_t *priv = (mdc_private_t*) this->private;<br>
mdc_inode_cache_t *cache = priv->inode_cache_head[ino % HASH_POS];<br>
mdc_inode_cache_t *new = NULL;<br>
<br>
if (ino == 0 || stbuf == NULL || postparent == NULL)<br>
return 0;<br>
<br>
if (cache->next) {<br>
do {<br>
cache = cache->next;<br>
if (cache->ino == ino) {<br></blockquote><div><br>Instead of just returning, you can choose to update the cached stat with the one passed as an argument to this procedure.<br> <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
return 0; /* already in */<br>
}<br>
} while(cache->next);<br>
}<br>
<br>
new = CALLOC (sizeof(mdc_inode_cache_t), 1);<br>
if (new == NULL) {<br>
return -1;<br>
}<br>
<br>
new->ino = ino;<br>
memcpy(&(new->stbuf), stbuf, sizeof(struct stat));<br>
memcpy(&(new->postparent), postparent, sizeof(struct stat));<br>
gettimeofday (&(new->tv), NULL);<br>
new->previous = cache;<br>
new->next = NULL;<br>
<br>
cache->next = new;<br>
priv->count++;<br>
<br>
return 0;<br>
}<br>
<br>
mdc_inode_cache_t *<br>
mdc_inode_cache_get(xlator_t *this, ino_t ino)<br>
{<br>
mdc_private_t *priv = (mdc_private_t*) this->private;<br>
mdc_inode_cache_t *cache = priv->inode_cache_head[ino % HASH_POS];<br>
struct timeval now = {0,};<br>
time_t timeout = 0;<br>
<br>
if (ino == 0)<br>
return NULL;<br>
<br>
gettimeofday(&now, NULL);<br>
timeout = now.tv_sec - priv->cache_timeout;<br>
<br>
while (cache) {<br>
if (cache->tv.tv_sec < timeout && cache->ino) {<br>
cache = mdc_inode_cache_delete (priv, cache);<br>
continue;<br>
}<br>
if (cache->ino == ino) {<br>
return cache;<br>
}<br>
cache = cache->next;<br>
}<br>
<br>
return NULL;<br>
}<br>
<br>
int32_t<br>
mdc_lookup_cbk (call_frame_t *frame, void *cookie, xlator_t *this,<br>
int32_t op_ret, int32_t op_errno, inode_t *inode,<br>
struct stat *stbuf, dict_t *dict, struct stat *postparent)<br>
{<br>
// char *path;<br>
// inode_path(inode, NULL, &path);<br>
<br>
if (inode == NULL)<br>
goto out;<br>
<br>
if (stbuf && stbuf->st_ino) {<br>
int32_t ret;<br>
<br>
ret = mdc_inode_cache_set(this, stbuf->st_ino, stbuf, postparent);<br>
if (ret != 0) {<br>
gf_log (this->name, GF_LOG_WARNING,<br>
"Could not cache metadata (ino=%"PRIu64")",<br>
inode->ino);<br>
}<br>
}<br>
<br>
out :<br>
STACK_UNWIND_STRICT (lookup, frame, op_ret, op_errno, inode, stbuf, dict,<br>
postparent);<br>
return 0;<br>
}<br>
<br>
int32_t<br>
mdc_lookup (call_frame_t *frame, xlator_t *this, loc_t *loc,<br>
dict_t *xattr_req)<br>
{<br>
mdc_inode_cache_t *cache = NULL;<br>
<br>
if (loc == NULL || loc->inode == NULL) {<br>
goto out;<br>
}<br>
<br>
cache = mdc_inode_cache_get(this, loc->inode->ino);<br>
<br>
if (cache) {<br>
STACK_UNWIND_STRICT (lookup, frame, 0, 0, loc->inode,<br>
&cache->stbuf, NULL, &cache->postparent);<br>
return 0;<br>
}<br>
<br>
out :<br>
STACK_WIND (frame, mdc_lookup_cbk, FIRST_CHILD (this),<br>
FIRST_CHILD (this)->fops->lookup, loc, xattr_req);<br>
return 0;<br>
}<br>
<br>
int32_t<br>
init (xlator_t *this)<br>
{<br>
int i = 0;<br>
data_t *data = NULL;<br>
mdc_private_t *priv = NULL;<br>
<br>
if (!this->children || this->children->next) {<br>
gf_log ("mdc-cache", GF_LOG_ERROR,<br>
"FATAL: mdc-cache should have exactly one child");<br>
return -1;<br>
}<br>
<br>
if (!this->parents) {<br>
gf_log (this->name, GF_LOG_WARNING,<br>
"dangling volume. check volfile ");<br>
}<br>
<br>
priv = CALLOC (sizeof (mdc_private_t), 1);<br>
ERR_ABORT (priv);<br>
LOCK_INIT (&priv->lock);<br>
<br>
for (i = 0; i < HASH_POS; i++) {<br>
priv->inode_cache_head[i] = CALLOC (sizeof (mdc_inode_cache_t), 1);<br>
if (priv->inode_cache_head[i]) {<br>
priv->inode_cache_head[i]->ino = 0;<br>
priv->inode_cache_head[i]->previous = NULL;<br>
priv->inode_cache_head[i]->next = NULL;<br>
}<br>
}<br>
<br>
priv->cache_timeout = 1;<br>
data = dict_get (this->options, "cache-timeout");<br>
if (data) {<br>
priv->cache_timeout = data_to_uint32 (data);<br>
gf_log (this->name, GF_LOG_TRACE,<br>
"Using %d seconds to revalidate cache",<br>
priv->cache_timeout);<br>
}<br>
<br>
priv->count = 0;<br>
this->private = priv;<br>
<br>
gf_log ("mdc-cache", GF_LOG_WARNING, "metadata caching (mdc-cache) xlator loaded");<br>
return 0;<br>
}<br>
<br>
void<br>
fini (xlator_t *this)<br>
{<br>
mdc_private_t *priv = this->private;<br>
<br>
FREE (priv);<br>
<br>
return;<br>
}<br>
<br>
struct xlator_fops fops = {<br>
.lookup = mdc_lookup<br>
};<br>
<br>
struct xlator_mops mops = {<br>
};<br>
<br>
struct xlator_cbks cbks = {<br>
};<br>
<br>
struct volume_options options[] = {<br>
{ .key = {"cache-timeout"},<br>
.type = GF_OPTION_TYPE_INT,<br>
.min = 1,<br>
.max = 900<br>
},<br>
{ .key = {NULL} }<br>
};<br>
<br>/*<br>
Copyright (c) 2006-2009 Gluster, Inc. <<a href="http://www.gluster.com" target="_blank">http://www.gluster.com</a>><br>
This file is part of GlusterFS.<br>
<br>
GlusterFS is free software; you can redistribute it and/or modify<br>
it under the terms of the GNU General Public License as published<br>
by the Free Software Foundation; either version 3 of the License,<br>
or (at your option) any later version.<br>
<br>
GlusterFS is distributed in the hope that it will be useful, but<br>
WITHOUT ANY WARRANTY; without even the implied warranty of<br>
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU<br>
General Public License for more details.<br>
<br>
You should have received a copy of the GNU General Public License<br>
along with this program. If not, see<br>
<<a href="http://www.gnu.org/licenses/" target="_blank">http://www.gnu.org/licenses/</a>>.<br>
*/<br>
<br>
#ifndef __ROT_13_H__<br>
#define __ROT_13_H__<br>
<br>
#ifndef _CONFIG_H<br>
#define _CONFIG_H<br>
#include "config.h"<br>
#endif<br>
<br>
#include <sys/uio.h><br>
#include "call-stub.h"<br>
<br>
#define HASH_POS 1699<br>
<br>
struct mdc_inode_cache {<br>
ino_t ino;<br>
struct stat stbuf;<br>
struct stat postparent;<br>
struct timeval tv;<br>
struct mdc_inode_cache *previous;<br>
struct mdc_inode_cache *next;<br>
};<br>
typedef struct mdc_inode_cache mdc_inode_cache_t;<br>
<br>
struct mdc_private {<br>
uint32_t cache_timeout;<br>
uint32_t max_entries;<br>
uint32_t count;<br>
struct mdc_inode_cache *inode_cache_head[HASH_POS];<br>
gf_lock_t lock;<br>
};<br>
typedef struct mdc_private mdc_private_t;<br>
<br>
#endif /* __ROT_13_H__ */<br>
<br></blockquote></div><br><br clear="all">regards,<br>-- <br>Raghavendra G<br><br>