Thanks for looking into this. We do use io-threads. Here is the server config:<br>
<br>
volume brick1-posix<br>
type storage/posix<br>
option directory /mnt/brick1<br>
end-volume<br>
<br>
volume brick2-posix<br>
type storage/posix<br>
option directory /mnt/brick2<br>
end-volume<br>
<br>
volume brick1-locks<br>
type features/locks<br>
subvolumes brick1-posix<br>
end-volume<br>
<br>
volume brick2-locks<br>
type features/locks<br>
subvolumes brick2-posix<br>
end-volume<br>
<br>
volume brick1<br>
type performance/io-threads<br>
option min-threads 16<br>
option autoscaling on<br>
subvolumes brick1-locks<br>
end-volume<br>
<br>
volume brick2<br>
type performance/io-threads<br>
option min-threads 16<br>
option autoscaling on<br>
subvolumes brick2-locks<br>
end-volume<br>
<br>
volume server<br>
type protocol/server<br>
option transport-type tcp<br>
option auth.addr.brick1.allow *<br>
option auth.addr.brick2.allow *<br>
subvolumes brick1 brick2<br>
end-volume<br>
<br><br><div class="gmail_quote">
On Sun, May 31, 2009 at 11:44 PM, Shehjar Tikoo <span dir="ltr">&lt;<a href="mailto:shehjart@gluster.com">shehjart@gluster.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="im">Alpha Electronics wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
We are testing GlusterFS before recommending it to enterprise clients. We found that the file system always hangs after running for about 2 days. After killing the server-side process and restarting it, everything goes back to normal.<br>
<br>
</blockquote>
<br></div>
What is the server config?<br>
If you're not using io-threads on the server, I suggest you do,<br>
because it does basic load-balancing to avoid timeouts.<br>
<br>
Also, avoid using autoscaling in io-threads for now.<br>
<br>
-Shehjar<br>
<br>
<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><div></div><div class="h5">
Here are the spec and the errors logged:<br>
GlusterFS version: v2.0.1<br>
<br>
Client volume:<br>
volume brick_1<br>
type protocol/client<br>
option transport-type tcp/client<br>
option remote-port 7777 # Non-default port<br>
option remote-host server1<br>
option remote-subvolume brick<br>
end-volume<br>
<br>
volume brick_2<br>
type protocol/client<br>
option transport-type tcp/client<br>
option remote-port 7777 # Non-default port<br>
option remote-host server2<br>
option remote-subvolume brick<br>
end-volume<br>
<br>
volume bricks<br>
type cluster/distribute<br>
subvolumes brick_1 brick_2<br>
end-volume<br>
<br>
Errors logged on the client side in /var/log/glusterfs.log:<br>
[2009-05-29 14:58:55] E [client-protocol.c:292:call_bail] brick_1: bailing out frame LK(28) frame sent = 2009-05-29 14:28:54. frame-timeout = 1800<br>
[2009-05-29 14:58:55] W [fuse-bridge.c:2284:fuse_setlk_cbk] glusterfs-fuse: 106850788: ERR => -1 (Transport endpoint is not connected)<br>
Errors logged on the server:<br>
[2009-05-29 14:59:15] E [client-protocol.c:292:call_bail] brick_2: bailing out frame LK(28) frame sent = 2009-05-29 14:29:05. frame-timeout = 1800<br>
[2009-05-29 14:59:15] W [fuse-bridge.c:2284:fuse_setlk_cbk] glusterfs-fuse: 106850860: ERR => -1 (Transport endpoint is not connected)<br>
<br>
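The call_bail lines above can be checked against their own frame-timeout value. The following is a minimal Python sketch, not part of GlusterFS: the regular expression is an assumption written for the exact line shape shown in this thread, and the helper names parse_bail and summarize are hypothetical. It computes how long each LK (lock) request had been outstanding when the client gave up on it.<br>

```python
import re
from datetime import datetime

# The two call_bail lines copied from the log excerpt in this message.
LOG_LINES = [
    "[2009-05-29 14:58:55] E [client-protocol.c:292:call_bail] brick_1: "
    "bailing out frame LK(28) frame sent = 2009-05-29 14:28:54. frame-timeout = 1800",
    "[2009-05-29 14:59:15] E [client-protocol.c:292:call_bail] brick_2: "
    "bailing out frame LK(28) frame sent = 2009-05-29 14:29:05. frame-timeout = 1800",
]

# Assumption: this pattern matches only the line format shown in this thread;
# other GlusterFS versions may word call_bail messages differently.
BAIL_RE = re.compile(
    r"^\[(?P<logged>[\d-]+ [\d:]+)\] E \[[^\]]*call_bail\] (?P<volume>[^:]+): "
    r"bailing out frame (?P<op>\w+)\(\d+\) "
    r"frame sent = (?P<sent>[\d-]+ [\d:]+)\. frame-timeout = (?P<timeout>\d+)"
)
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_bail(line):
    """Return a dict describing one call_bail line, or None if it does not match."""
    m = BAIL_RE.match(line)
    if m is None:
        return None
    logged = datetime.strptime(m.group("logged"), TS_FORMAT)
    sent = datetime.strptime(m.group("sent"), TS_FORMAT)
    return {
        "volume": m.group("volume"),
        "op": m.group("op"),
        # Seconds between sending the request and the client bailing out.
        "waited_seconds": int((logged - sent).total_seconds()),
        "frame_timeout": int(m.group("timeout")),
    }


def summarize(lines):
    """Parse every line and keep only the ones that are call_bail records."""
    return [rec for rec in map(parse_bail, lines) if rec is not None]


if __name__ == "__main__":
    for rec in summarize(LOG_LINES):
        print(rec)
```

For the two lines quoted here, the waits come out at 1801 and 1810 seconds, just past the 1800-second frame-timeout. That is consistent with the lock requests never being answered by the server, not with a slow reply.<br>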
An error message is logged on the server side after 1 hour in /var/log/messages:<br>
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] lib/util_sock.c:write_data(564)<br>
May 29 16:04:16 server2 winbindd[3649]: write_data: write failure. Error = Connection reset by peer<br>
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/clientgen.c:write_socket(158)<br>
May 29 16:04:16 server2 winbindd[3649]: write_socket: Error writing 104 bytes to socket 18: ERRNO = Connection reset by peer<br>
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/clientgen.c:cli_send_smb(188)<br>
May 29 16:04:16 server2 winbindd[3649]: Error writing 104 bytes to client. -1 (Connection reset by peer)<br>
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/cliconnect.c:cli_session_setup_spnego(859)<br>
May 29 16:04:16 server2 winbindd[3649]: Kinit failed: Cannot contact any KDC for requested realm<br>
<br>
<br></div></div>
</blockquote>
<br>
</blockquote></div><br><br clear="all">
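The quoted advice to avoid autoscaling in io-threads can be checked mechanically against a volfile such as the server config in this message. The following is a rough Python sketch, not a GlusterFS tool: the function names parse_volfile and io_threads_with_autoscaling are hypothetical, the sample text is a shortened copy of the config above, and the parser handles only the simple volume / type / option / subvolumes / end-volume layout seen in this thread.<br>

```python
# Shortened sample modeled on the server config in this thread.
SAMPLE_VOLFILE = """\
volume brick1-locks
  type features/locks
  subvolumes brick1-posix
end-volume

volume brick1
  type performance/io-threads
  option min-threads 16
  option autoscaling on
  subvolumes brick1-locks
end-volume
"""


def parse_volfile(text):
    """Return a list of dicts with keys: name, type, options, subvolumes."""
    volumes = []
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing "# ..." comments
        if not line:
            continue
        words = line.split()
        keyword = words[0]
        if keyword == "volume" and len(words) == 2:
            current = {"name": words[1], "type": None, "options": {}, "subvolumes": []}
        elif current is None:
            continue  # ignore anything outside a volume block
        elif keyword == "type" and len(words) == 2:
            current["type"] = words[1]
        elif keyword == "option" and len(words) >= 3:
            current["options"][words[1]] = " ".join(words[2:])
        elif keyword == "subvolumes":
            current["subvolumes"] = words[1:]
        elif keyword == "end-volume":
            volumes.append(current)
            current = None
    return volumes


def io_threads_with_autoscaling(volumes):
    """Names of io-threads volumes whose autoscaling option is set to 'on'.

    Assumption: only the literal value 'on' (as used in this thread) is
    treated as enabled; other spellings are not handled.
    """
    return [
        v["name"]
        for v in volumes
        if v["type"] == "performance/io-threads"
        and v["options"].get("autoscaling", "").lower() == "on"
    ]


if __name__ == "__main__":
    print(io_threads_with_autoscaling(parse_volfile(SAMPLE_VOLFILE)))
```

Run against the sample, the check flags brick1, the io-threads volume that has autoscaling turned on.<br>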