<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Also, in the logfiles on the clients, it looks like I get these types of messages whenever I try to access a file that is no longer accessible.<div><br></div><div><div>2009-06-11 07:58:24 E [fuse-bridge.c:675:fuse_fd_cbk] glusterfs-fuse: 22068570: /hourlogs/myDir0/1243432800.log => -1 (5)</div><div>2009-06-11 07:58:24 E [fuse-bridge.c:436:fuse_entry_cbk] glusterfs-fuse: 22068579: /hourlogs/myDir1/1243400400.log => -1 (116)</div><div>2009-06-11 07:58:24 E [unify.c:850:unify_open] unify: /hourlogs/myDir1/1243400400.log: entry_count is 3</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir1/1243400400.log: found on afr3</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir1/1243400400.log: found on afr2</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir1/1243400400.log: found on afr-ns</div><div>2009-06-11 07:58:24 E [fuse-bridge.c:675:fuse_fd_cbk] glusterfs-fuse: 22068580: /hourlogs/myDir1/1243400400.log => -1 (5)</div><div>2009-06-11 07:58:24 E [fuse-bridge.c:436:fuse_entry_cbk] glusterfs-fuse: 22068583: /hourlogs/myDir2/1243411200.log => -1 (116)</div><div>2009-06-11 07:58:24 E [unify.c:850:unify_open] unify: /hourlogs/myDir2/1243411200.log: entry_count is 3</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir2/1243411200.log: found on afr1</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir2/1243411200.log: found on afr3</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir2/1243411200.log: found on afr-ns</div><div>2009-06-11 07:58:24 E [fuse-bridge.c:675:fuse_fd_cbk] glusterfs-fuse: 22068584: /hourlogs/myDir2/1243411200.log => -1 (5)</div><div>2009-06-11 07:58:24 E [fuse-bridge.c:436:fuse_entry_cbk] glusterfs-fuse: 22068599: /hourlogs/myDir3/1243472400.log => -1 (116)</div><div>2009-06-11 
07:58:24 E [unify.c:850:unify_open] unify: /hourlogs/myDir3/1243472400.log: entry_count is 3</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir3/1243472400.log: found on afr1</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir3/1243472400.log: found on afr3</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir3/1243472400.log: found on afr-ns</div><div>2009-06-11 07:58:24 E [fuse-bridge.c:675:fuse_fd_cbk] glusterfs-fuse: 22068600: /hourlogs/myDir3/1243472400.log => -1 (5)</div><div>2009-06-11 07:58:24 E [fuse-bridge.c:436:fuse_entry_cbk] glusterfs-fuse: 22068603: /hourlogs/myDir4/1243404000.log => -1 (116)</div><div>2009-06-11 07:58:24 E [unify.c:850:unify_open] unify: /hourlogs/myDir4/1243404000.log: entry_count is 3</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir4/1243404000.log: found on afr1</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir4/1243404000.log: found on afr-ns</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir4/1243404000.log: found on afr3</div><div>2009-06-11 07:58:24 E [fuse-bridge.c:675:fuse_fd_cbk] glusterfs-fuse: 22068604: /hourlogs/myDir5/1243404000.log => -1 (5)</div><div>2009-06-11 07:58:24 E [fuse-bridge.c:436:fuse_entry_cbk] glusterfs-fuse: 22068619: /hourlogs/myDir5/1243447200.log => -1 (116)</div><div>2009-06-11 07:58:24 E [unify.c:850:unify_open] unify: /hourlogs/myDir5/1243447200.log: entry_count is 4</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir5/1243447200.log: found on afr1</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir5/1243447200.log: found on afr3</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir5/1243447200.log: found on afr2</div><div>2009-06-11 07:58:24 E [unify.c:853:unify_open] unify: /hourlogs/myDir5/1243447200.log: found on afr-ns</div><div>2009-06-11 
07:58:24 E [fuse-bridge.c:675:fuse_fd_cbk] glusterfs-fuse: 22068620: /hourlogs/myDir5/1243447200.log => -1 (5)</div><div><br></div><div><div>On Jun 11, 2009, at 10:33 AM, Elbert Lai wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">elbert@host1:~$ dpkg -l|grep glusterfs<div>ii glusterfs-client 1.3.8-0pre2 GlusterFS fuse client</div><div>ii glusterfs-server 1.3.8-0pre2 GlusterFS fuse server</div><div>ii libglusterfs0 1.3.8-0pre2 GlusterFS libraries and translator modules</div><div><br></div><div>I have 2 hosts set up to use AFR with the package versions listed above. I have been experiencing an issue where a file that is copied to glusterfs is readable/writable for a while, then at some point in time it ceases to be. Trying to access it yields only the error message, "cannot open `filename' for reading: Input/output error".</div><div><br></div><div>Files enter glusterfs either via the "cp" command from a client or via "rsync". In the case of cp, the clients are all local and copying across a very fast connection. In the case of rsync, the one client is itself a GlusterFS client. We are testing out a later version of GlusterFS, and it rsyncs across a VPN.</div><div><br></div><div><div>elbert@host2:~$ dpkg -l|grep glusterfs</div><div>ii glusterfs-client 2.0.1-1 clustered file-system</div><div>ii glusterfs-server 2.0.1-1 clustered file-system</div><div>ii libglusterfs0 2.0.1-1 GlusterFS libraries and translator modules</div><div>ii libglusterfsclient0 2.0.1-1 GlusterFS client library</div><div><br></div><div>=========</div><div>What causes files to become inaccessible? I read that fstat() had a bug in version 1.3.x whereas stat() did not, and that it was being worked on. Could this be related?</div><div><br></div><div>When a file becomes inaccessible, I have been manually removing the file from the mount point, then copying it back in via scp. 
Then the file becomes accessible. Below I've pasted a sample of what I'm seeing.</div><div><br></div><div><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div><div><a href="mailto:elbert@tool3.sc9.admob.com">elbert@tool3</a>.:hourlogs$ cd myDir</div><div><a href="mailto:elbert@tool3.sc9.admob.com">elbert@tool3</a>.:myDir$ ls 1244682000.log</div><div>1244682000.log</div><div><a href="mailto:elbert@tool3.sc9.admob.com">elbert@tool3</a>.:myDir$ stat 1244682000.log</div><div> File: `1244682000.log'</div><div> Size: 40265114 <span class="Apple-tab-span" style="white-space: pre; ">        </span>Blocks: 78744 IO Block: 4096 regular file</div><div>Device: 15h/21d<span class="Apple-tab-span" style="white-space: pre; ">        </span>Inode: 42205749 Links: 1</div><div>Access: (0755/-rwxr-xr-x) Uid: ( 1003/ elbert) Gid: ( 6000/ ops)</div><div>Access: 2009-06-11 02:25:10.000000000 +0000</div><div>Modify: 2009-06-11 02:26:02.000000000 +0000</div><div>Change: 2009-06-11 02:26:02.000000000 +0000</div><div><a href="mailto:elbert@tool3.sc9.admob.com">elbert@tool3</a>.:myDir$ tail 1244682000.log</div><div>tail: cannot open `1244682000.log' for reading: Input/output error</div></div></div></div></blockquote><br></div><div>At this point, I am able to rm the file. Then, if I scp it back in, I am able to successfully tail it.</div><div><br></div><div>So, I have observed cases where the files had a Size of 0 but were otherwise in the same state. I'm not totally certain, but when a file gets into this state via rsync, it appears either to arrive in this state (before I try to read it) or to enter it very soon afterward. 
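</div><div><br></div><div>For reference, the numbers in parentheses in those fuse-bridge log lines ("=> -1 (5)", "=> -1 (116)") are plain Linux errno values; as a quick sanity check they can be decoded with a few lines of Python (nothing GlusterFS-specific is assumed here):</div><div><br></div>

```python
# Decode the errno numbers that appear in the glusterfs-fuse log lines,
# e.g. "=> -1 (5)" and "=> -1 (116)" (these are Linux errno values).
import errno
import os

for code in (5, 116):
    name = errno.errorcode.get(code, "?")
    print(f"{code}: {name} ({os.strerror(code)})")
```

<div><br></div><div>On Linux, 5 is EIO ("Input/output error", matching the tail failure above) and 116 is ESTALE (a stale file handle).</div><div>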
Speaking generally, file sizes tend to be several MB up to 150 MB.</div><div><br></div><div>Here's my server config:</div><div><div># Gluster Server configuration /etc/glusterfs/glusterfs-server.vol</div><div># Configured for AFR & Unify features</div><div><br></div><div>volume brick</div><div> type storage/posix </div><div> option directory /var/gluster/data/</div><div>end-volume</div><div><br></div><div>volume brick-ns</div><div> type storage/posix</div><div> option directory /var/gluster/ns/</div><div>end-volume</div><div><br></div><div>volume server</div><div> type protocol/server</div><div> option transport-type tcp/server</div><div> subvolumes brick brick-ns</div><div> option auth.ip.brick.allow 165.193.245.*,10.11.* </div><div> option auth.ip.brick-ns.allow 165.193.245.*,10.11.*</div><div>end-volume</div><div><br></div><div>Here's my client config:</div></div><div><div># Gluster Client configuration /etc/glusterfs/glusterfs-client.vol</div><div># Configured for AFR & Unify features</div><div><br></div><div>volume brick1</div><div> type protocol/client</div><div> option transport-type tcp/client # for TCP/IP transport</div><div> option remote-host 10.11.16.68 # IP address of the remote brick</div><div> option remote-subvolume brick # name of the remote volume</div><div>end-volume</div><div><br></div><div>volume brick2</div><div> type protocol/client</div><div> option transport-type tcp/client</div><div> option remote-host 10.11.16.71</div><div> option remote-subvolume brick</div><div>end-volume</div><div><br></div><div>volume brick3</div><div> type protocol/client</div><div> option transport-type tcp/client</div><div> option remote-host 10.11.16.69</div><div> option remote-subvolume brick</div><div>end-volume</div><div><br></div><div>volume brick4</div><div> type protocol/client</div><div> option transport-type tcp/client</div><div> option remote-host 10.11.16.70</div><div> option remote-subvolume brick</div><div>end-volume</div><div><br></div><div>volume 
brick5</div><div> type protocol/client</div><div> option transport-type tcp/client</div><div> option remote-host 10.11.16.119</div><div> option remote-subvolume brick</div><div>end-volume</div><div><br></div><div>volume brick6</div><div> type protocol/client</div><div> option transport-type tcp/client</div><div> option remote-host 10.11.16.120</div><div> option remote-subvolume brick</div><div>end-volume</div><div><br></div><div>volume brick-ns1</div><div> type protocol/client</div><div> option transport-type tcp/client</div><div> option remote-host 10.11.16.68</div><div> option remote-subvolume brick-ns # Note the different remote volume name.</div><div>end-volume</div><div><br></div><div>volume brick-ns2</div><div> type protocol/client</div><div> option transport-type tcp/client</div><div> option remote-host 10.11.16.71</div><div> option remote-subvolume brick-ns # Note the different remote volume name.</div><div>end-volume</div><div><br></div><div>volume afr1</div><div> type cluster/afr</div><div> subvolumes brick1 brick2</div><div>end-volume</div><div><br></div><div>volume afr2</div><div> type cluster/afr</div><div> subvolumes brick3 brick4</div><div>end-volume</div><div><br></div><div>volume afr3</div><div> type cluster/afr</div><div> subvolumes brick5 brick6</div><div>end-volume</div><div><br></div><div>volume afr-ns</div><div> type cluster/afr</div><div> subvolumes brick-ns1 brick-ns2</div><div>end-volume</div><div><br></div><div>volume unify</div><div> type cluster/unify</div><div> subvolumes afr1 afr2 afr3 </div><div> option namespace afr-ns</div><div><br></div><div> # use the ALU scheduler</div><div> option scheduler alu </div><div><br></div><div> # This option would make brick5 read-only, so that no new files are created on it.</div><div> # option alu.read-only-subvolumes brick5</div><div><br></div><div> # Don't create files on a volume with less than 10% free disk space</div><div> option alu.limits.min-free-disk 10% </div><div><br></div><div> # Don't create 
files on a volume with more than 10000 files open</div><div> option alu.limits.max-open-files 10000 </div><div> </div><div> # When deciding where to place a file, first look at the disk-usage, then at </div><div> # read-usage, write-usage, open files, and finally the disk-speed-usage.</div><div> option alu.order disk-usage:read-usage:write-usage:open-files-usage:disk-speed-usage</div><div><br></div><div> # Kick in if the discrepancy in disk-usage between volumes is more than 2GB</div><div> option alu.disk-usage.entry-threshold 2GB </div><div><br></div><div> # Don't stop writing to the least-used volume until the discrepancy is 1988MB (2GB - 60MB)</div><div> option alu.disk-usage.exit-threshold 60MB </div><div><br></div><div> # Kick in if the discrepancy in open files is 1024</div><div> option alu.open-files-usage.entry-threshold 1024 </div><div><br></div><div> # Don't stop until 992 files have been written to the least-used volume (1024 - 32)</div><div> option alu.open-files-usage.exit-threshold 32 </div><div><br></div><div> # Kick in when the read-usage discrepancy is 20%</div><div> option alu.read-usage.entry-threshold 20% </div><div><br></div><div> # Don't stop until the discrepancy has been reduced to 16% (20% - 4%)</div><div> option alu.read-usage.exit-threshold 4%</div><div><br></div><div> # Kick in when the write-usage discrepancy is 20%</div><div> option alu.write-usage.entry-threshold 20%</div><div><br></div><div> # Don't stop until the discrepancy has been reduced to 16% (20% - 4%)</div><div> option alu.write-usage.exit-threshold 4% </div><div><br></div><div> # Refresh the statistics used for decision-making every 10 seconds</div><div> option alu.stat-refresh.interval 10sec </div><div><br></div><div># Refresh the statistics used for decision-making after creating 10 files</div><div># option alu.stat-refresh.num-file-create 10 </div><div>end-volume</div><div><br></div><div><br></div><div># writebehind improves write performance a lot</div><div>volume writebehind </div><div> type 
performance/write-behind</div><div> option aggregate-size 131072 # in bytes</div><div> subvolumes unify</div><div>end-volume</div><div><br></div></div><div>Has anyone seen this issue before? Any suggestions?</div><div><br></div><div>Thanks,</div><div>-elb-</div></div></div>_______________________________________________<br>Gluster-users mailing list<br><a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users<br></blockquote></div><br></div></body></html>