<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 03/18/2013 10:12 PM, Marc Seeger
wrote:<br>
</div>
<blockquote
cite="mid:52D45056-B1DB-4C3E-9230-0B213FE6AE24@acquia.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<div>Sadly, we keep seeing those. The logs display the same
pattern:</div>
<div><br>
</div>
<div>
<pre style="word-wrap: break-word; white-space: pre-wrap; ">[2013-03-18 05:22:49.174382] I [afr-self-heal-common.c:1941:afr_sh_post_nb_entrylk_conflicting_sh_cbk] 0-replicate0: Non blocking entrylks failed.
[2013-03-18 05:22:49.174382] E [afr-self-heal-common.c:2160:afr_self_heal_completion_cbk] 0-replicate0: background meta-data data entry missing-entry gfid self-heal failed on /home/vcltest256/.drush/vcltest256.aliases.drushrc.php.lock
[2013-03-18 05:22:49.174382] W [inode.c:914:inode_lookup] (-->/usr/lib/libglusterfs.so.0(default_lookup_cbk+0xc1) [0x7f14a4cd6d01] (-->/usr/lib/glusterfs/3.3.1/xlator/mount/fuse.so(+0xf198) [0x7f14a27fe198] (-->/usr/lib/glusterfs/3.3.1/xlator/mount/fuse.so(+0xeffb) [0x7f14a27fdffb]))) 0-fuse: inode not found</pre>
</div>
<div><br>
</div>
<div><br>
</div>
<br>
<div>
<div>On Mar 7, 2013, at 1:14 AM, Marc Seeger <<a
moz-do-not-send="true" href="mailto:marc.seeger@acquia.com">marc.seeger@acquia.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<div style="word-wrap: break-word; -webkit-nbsp-mode: space;
-webkit-line-break: after-white-space; ">Almost
forgot: These operations were done on a symlinked directory
(/home is linked to /mnt/gfs/home where /mnt/gfs is the
gluster mountpoint)
<div><br>
<div>
<div>On Mar 7, 2013, at 1:08 AM, Marc Seeger <<a
moz-do-not-send="true"
href="mailto:marc.seeger@acquia.com">marc.seeger@acquia.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<div style="word-wrap: break-word; -webkit-nbsp-mode:
space; -webkit-line-break: after-white-space; ">
<pre style="word-wrap: break-word; white-space: pre-wrap; ">Hey, in our testing we seem to sometimes run into a problem with gluster fs breaking.</pre>
<pre style="word-wrap: break-word; white-space: pre-wrap; ">The most recent occurrence was two processes on two machines trying to stat the same lock file:</pre>
<pre style="word-wrap: break-word; white-space: pre-wrap; "><pre style="word-wrap: break-word; white-space: pre-wrap; ">2013-03-06T16:41:27+00:00 daemon.notice : creating directory: dir=/home/vcltest464/.drush, user=10036, group=10036, mode=0700
2013-03-06T16:41:27+00:00 daemon.notice : PHP Warning: stat(): stat failed for /home/vcltest464/.drush/vcltest464.aliases.drushrc.php.lock in [...].php on line 695
2013-03-06T16:41:27+00:00 daemon.notice : PHP Warning: stat(): stat failed for /home/vcltest464/.drush/vcltest464.aliases.drushrc.php.lock in [...].php on line 695
2013-03-06T16:41:27+00:00 daemon.notice : PHP Warning: stat(): stat failed for /home/vcltest464/.drush/vcltest464.aliases.drushrc.php.lock in [...].php on line 695
</pre><pre style="word-wrap: break-word; white-space: pre-wrap; ">(a few thousand times. it was a bug in the locking code when the stat failed)</pre><div>
</div></pre>
<pre style="word-wrap: break-word; white-space: pre-wrap; ">It ends up as something like this in the gluster log files:</pre>
<pre style="word-wrap: break-word; white-space: pre-wrap; ">
</pre>
<pre style="word-wrap: break-word; white-space: pre-wrap; "><pre style="word-wrap: break-word; white-space: pre-wrap; ">[2013-03-06 16:34:36.875559] W [client3_1-fops.c:2457:client3_1_link_cbk] 0-remote8: remote operation failed: File exists (00000000-0000-0000-0000-000000000000 -> /home/vcltest418/prod)
[2013-03-06 16:34:36.875559] W [client3_1-fops.c:2457:client3_1_link_cbk] 0-remote7: remote operation failed: File exists (00000000-0000-0000-0000-000000000000 -> /home/vcltest418/prod)
[2013-03-06 16:36:24.809098] W [client3_1-fops.c:327:client3_1_mkdir_cbk] 0-remote8: remote operation failed: File exists. Path: /vcltest473/php_sessions (00000000-0000-0000-0000-000000000000)
[2013-03-06 16:36:24.809098] W [client3_1-fops.c:327:client3_1_mkdir_cbk] 0-remote7: remote operation failed: File exists. Path: /vcltest473/php_sessions (00000000-0000-0000-0000-000000000000)
[2013-03-06 16:36:24.809098] W [fuse-bridge.c:292:fuse_entry_cbk] 0-glusterfs-fuse: 9061: MKDIR() /vcltest473/php_sessions => -1 (File exists)
[2013-03-06 16:36:26.179144] I [afr-self-heal-common.c:1189:afr_sh_missing_entry_call_impunge_recreate] 0-replicate0: no missing files - /home/vcltest473/.drush/vcltest473.aliases.drushrc.php.lock. proceeding to metadata check
[2013-03-06 16:36:34.899435] I [afr-self-heal-entry.c:2333:afr_sh_entry_fix] 0-replicate0: /vcltest473/livedev: Performing conservative merge
[2013-03-06 16:41:02.118580] W [client3_1-fops.c:327:client3_1_mkdir_cbk] 0-remote8: remote operation failed: File exists. Path: /vcltest723/files-private (00000000-0000-0000-0000-000000000000)
[2013-03-06 16:41:02.118580] W [client3_1-fops.c:327:client3_1_mkdir_cbk] 0-remote7: remote operation failed: File exists. Path: /vcltest723/files-private (00000000-0000-0000-0000-000000000000)
[2013-03-06 16:41:02.118580] W [fuse-bridge.c:292:fuse_entry_cbk] 0-glusterfs-fuse: 12435: MKDIR() /vcltest723/files-private => -1 (File exists)
[2013-03-06 16:41:27.179425] I [afr-self-heal-common.c:1941:afr_sh_post_nb_entrylk_conflicting_sh_cbk] 0-replicate0: Non blocking entrylks failed.
[2013-03-06 16:41:27.179425] E [afr-self-heal-common.c:2160:afr_self_heal_completion_cbk] 0-replicate0: background meta-data data entry missing-entry gfid self-heal failed on /home/vcltest464/.drush/vcltest464.aliases.drushrc.php.lock
[2013-03-06 16:41:27.179425] W [inode.c:914:inode_lookup] (-->/usr/lib/libglusterfs.so.0(default_lookup_cbk+0xc1) [0x7f93c6ac7d01] (-->/usr/lib/glusterfs/3.3.1/xlator/mount/fuse.so(+0xf198) [0x7f93c45ef198] (-->/usr/lib/glusterfs/3.3.1/xlator/mount/fuse.so(+0xeffb) [0x7f93c45eeffb]))) 0-fuse: inode not found</pre><div>
</div></pre>
<pre style="word-wrap: break-word; white-space: pre-wrap; ">
</pre>
<div>After this, the mountpoint was not responding
to file_exists() anymore which usually means the
client died. Any idea what could have caused such
behaviour?</div>
<div><br>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</blockquote>
What do you mean by 'client died'. Does it mean the glusterfs mount
process crashed? If yes do you have the crash log from the log file
of the mount process.<br>
<blockquote
cite="mid:52D45056-B1DB-4C3E-9230-0B213FE6AE24@acquia.com"
type="cite">
<div>
<blockquote type="cite">
<div style="word-wrap: break-word; -webkit-nbsp-mode: space;
-webkit-line-break: after-white-space; ">
<div>
<div>
<blockquote type="cite">
<div style="word-wrap: break-word; -webkit-nbsp-mode:
space; -webkit-line-break: after-white-space; ">
<div><br>
</div>
<div>Cheers,</div>
<div>Marc</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
</div>
<br>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
Gluster-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a>
<a class="moz-txt-link-freetext" href="http://supercolony.gluster.org/mailman/listinfo/gluster-users">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a></pre>
</blockquote>
<br>
</body>
</html>