Has anyone found a decent way to monitor GlusterFS volumes? <br>I'm currently using Nagios and Cacti to take care of basic CPU, load, memory, and raw disk I/O. I also need to monitor GlusterFS status and make sure all volumes are available.
<br><br>My test environment is 6 servers with 6 AFR volumes, each of which is shared between 2 of those servers. All volumes are mounted on each server.<br><br>The checks I'm testing so far include a simple Bash script that writes the hostname and the current Unix timestamp to a file once a minute. Each server does this only on the volumes that it stores. <br>
echo "$(uname -n):$(date +%s)" > /mnt/gluster01/CHECK_FILE<br><br>The Nagios NRPE daemon would then execute a Perl script on each of the clients. This script goes through each of the Gluster mount points, comparing the timestamp in the CHECK_FILE to the current system time and alarming if the timestamp is off by more than a minute. Another test, which hasn't been implemented yet, would compare the contents of the CHECK_FILE with the data that is on the raw disk.<br>
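<br>To make the comparison step concrete, here is a rough sketch of it in Bash. This is not the Perl script linked in this post; the function name check_stamp and its arguments are made up for illustration, the 60-second default follows the "off by more than a minute" rule above, and the return codes assume the usual Nagios convention (0 = OK, 2 = CRITICAL).<br>

```shell
#!/bin/bash
# Hypothetical helper (not from the original scripts):
#   check_stamp FILE [MAX_AGE_SECONDS] [NOW_EPOCH]
# Reads a "hostname:epoch" line as written by the cron job and
# returns 0 (OK) or 2 (CRITICAL), Nagios-style.
check_stamp() {
    local file max_age now line host stamp age
    file=$1
    max_age=${2:-60}
    now=${3:-$(date +%s)}   # NOW_EPOCH is only there to make testing easy

    # An unreadable file usually means the mount is missing or broken.
    if ! line=$(head -n 1 "$file" 2>/dev/null); then
        echo "CRITICAL: cannot read $file"
        return 2
    fi

    # Expect exactly "host:digits"; anything else is treated as a failure.
    case $line in
        *:*) ;;
        *) echo "CRITICAL: malformed stamp in $file"; return 2 ;;
    esac
    host=${line%%:*}
    stamp=${line#*:}
    case $stamp in
        ''|*[!0-9]*) echo "CRITICAL: malformed stamp in $file"; return 2 ;;
    esac

    # Alarm when the stamp is too old, or too far in the future (clock skew).
    age=$(( now - stamp ))
    if [ "$age" -gt "$max_age" ] || [ "$age" -lt "-$max_age" ]; then
        echo "CRITICAL: $file from $host is ${age}s old"
        return 2
    fi
    echo "OK: $file from $host is ${age}s old"
    return 0
}
```

An NRPE command could then call something like check_stamp /mnt/gluster01/CHECK_FILE 60 once per mount point and report the worst result. Since the cron job only writes once a minute, a threshold of exactly 60 seconds may alarm on a healthy volume, so a somewhat larger value is probably safer. A hung mount can also block the read itself, so some kind of timeout around the call is likely worth adding.<br>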
<br>Bash code to write the timestamps, executed via cron once a minute (write_timestamps.sh):<br><a href="http://glusterfs.pastebin.com/m5a220a6">http://glusterfs.pastebin.com/m5a220a6</a><br><br>Perl code to compare the timestamps, executed on the client (check_glusterfs_mounts.pl): <br>
<a href="http://glusterfs.pastebin.com/m2f057a77">http://glusterfs.pastebin.com/m2f057a77</a><br><br>Any ideas/questions/comments? <br><br><br>