<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
After running under load for a while, my Gluster server keeps getting
into a state where it seems unable to reply to NFS RPCs:<br>
<br>
<pre class="sunburst" style="white-space: pre-wrap; word-wrap: break-word;"><span class="meta meta_paragraph meta_paragraph_text">[2013-05-05 01:31:16.421507] E [rpcsvc.c:1080:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x3705786983x, Program: NFS3, ProgVers: 3, Proc: 6) to rpc-transport (socket.nfs-server)
[2013-05-05 01:31:16.421528] E [nfs3.c:627:nfs3svc_submit_vector_reply] 0-nfs-nfsv3: Reply submission failed</span></pre>
<a href="http://pastie.org/7803022">http://pastie.org/7803022</a><br>
<br>
Any idea what to do about it?<br>
<br>
The NFS daemon's memory usage also grows very large (about 37 GB
resident here). I suspect it's storing up data for all these RPCs:<br>
<pre>  PID USER PRI NI  VIRT   RES  SHR S CPU% MEM%   TIME+ Command
30674 root  20  0 37.6G 37.3G 2288 R 99.0 29.6 5:29.88 /usr/local/glusterfs/sbin/glusterfs</pre>
<br>
I'm running GlusterFS 3.3.1 with a few patches:
<a class="moz-txt-link-freetext" href="https://github.com/Supermathie/glusterfs/tree/release-3.3-oracle">https://github.com/Supermathie/glusterfs/tree/release-3.3-oracle</a><br>
<br>
The workload is usually Oracle Direct NFS (dNFS).<br>
<br>
M.<br>
<pre class="moz-signature" cols="72">--
Michael Brown | `One of the main causes of the fall of
Systems Consultant | the Roman Empire was that, lacking zero,
Net Direct Inc. | they had no way to indicate successful
☎: +1 519 883 1172 x5106 | termination of their C programs.' - Firth
</pre>
</body>
</html>