David’s Blog

Megaraid Nagios Plugins

Posted in Systems Engineering / Unix Systems Operations by david415 on August 5, 2008

I wrote three small, useful Perl scripts, Nagios NRPE plugins to be precise. Each one monitors one aspect of an LSI MegaRAID controller: the cache policy, the patrol read status (since we never want a patrol read to affect performance), and the Battery Backup Unit (BBU) status.

As a sysadmin I often write small programs like these, and I want to start writing them in Python. Python looks like a very clean language without a lot of syntactic sugar layered on top, and its try/except syntax looks about as clean as Java’s try/catch. Anyway, here’s check_megaraid_cachepolicy.pl :

#!/usr/bin/perl

use strict;
use warnings;

# Query the cache policy of every logical drive on every adapter.
my $cmd = 'sudo /usr/local/sbin/MegaCli64 -LDGetProp -Cache -Lall -aALL';

my $output = `$cmd`;

# Debugging aid: keep a copy of the raw MegaCli output.
open(my $fh, '>', '/tmp/fu') or die "$!\n";
print $fh "output: $output\n";
close $fh;

# Exit codes follow the Nagios plugin convention: 0 = OK, 1 = WARNING.
if ($output =~ /Cache Policy:WriteBack, ReadAheadNone, Direct, No Write Cache if bad BBU/)
{
    print "OK: MegaRAID cache policy is ok\n";
    exit 0;
}
else
{
    print "WARNING: MegaRAID cache policy is not ok\n";
    exit 1;
}
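One thing to watch with the check above: the command asks for all logical drives (-Lall), but a single matching line is enough for the regex to succeed. A minimal sketch of a stricter matcher that requires every "Cache Policy:" line to carry the expected policy might look like this. The helper name classify_cache_policy is hypothetical, and the sample input is illustrative only: the text before "Cache Policy:" on each line is an assumption, not captured MegaCli output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The policy string the plugin above expects.
my $WANTED = 'WriteBack, ReadAheadNone, Direct, No Write Cache if bad BBU';

# Hypothetical helper: returns (exit_code, message) for a block of
# MegaCli output. Every "Cache Policy:" line must carry the wanted policy.
sub classify_cache_policy {
    my ($output) = @_;

    # Collect the policy text from each line that reports one.
    my @policies = $output =~ /Cache Policy:[ \t]*(.*?)[ \t\r]*$/mg;

    # No policy lines at all most likely means the command itself failed.
    return (3, 'UNKNOWN: no cache policy lines found') unless @policies;

    my @bad = grep { $_ ne $WANTED } @policies;
    return (1, sprintf('WARNING: %d of %d logical drives have an unexpected cache policy',
                       scalar @bad, scalar @policies)) if @bad;

    return (0, sprintf('OK: all %d logical drives use the expected cache policy',
                       scalar @policies));
}

# Illustrative input only; the "VD n:" prefixes are assumed.
my $sample = join "\n",
    'VD 0: Cache Policy:WriteBack, ReadAheadNone, Direct, No Write Cache if bad BBU',
    'VD 1: Cache Policy:WriteThrough, ReadAheadNone, Direct, No Write Cache if bad BBU',
    '';

my ($code, $message) = classify_cache_policy($sample);
print "$message\n";
```

In a real plugin you would pass the backtick output to the helper and finish with exit $code. Returning 3 (UNKNOWN in the Nagios convention) when nothing matches keeps a broken sudo or a missing MegaCli64 from looking like a misconfigured controller.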

check_megaraid_patrolread.pl :

#!/usr/bin/perl

use strict;
use warnings;

# Ask every adapter for its patrol read settings.
my $cmd = 'sudo /usr/local/sbin/MegaCli64 -AdpPR -Info -aALL';

my $output = `$cmd`;

if ($output =~ /Patrol Read Mode: Disabled/)
{
    print "OK: MegaRAID patrol read mode is disabled\n";
    exit 0;
}
else
{
    print "WARNING: MegaRAID patrol read mode is NOT disabled\n";
    exit 1;
}

check_megaraid_bbustatus.pl:

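#!/usr/bin/perl

use strict;
use warnings;

my $cmd = 'sudo MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL';
my $output = `$cmd`;

# The lines of interest in the MegaCli output look like this:
#  Fully Discharged        : No
#  Fully Charged           : Yes

if ($output =~ /Fully Charged           : Yes/)
{
    print "OK: BBU Fully Charged\n";
    exit 0;
}

if ($output =~ /Fully Discharged        : Yes/)
{
    print "WARNING: MegaRAID cache BBU is Fully Discharged\n";
    exit 1;
}

if ($output =~ /Fully Charged           : No/)
{
    print "WARNING: BBU is neither fully charged nor fully discharged\n";
    exit 1;
}

# Nothing matched (for example, MegaCli64 failed to run): report UNKNOWN
# instead of falling off the end with exit code 0.
print "UNKNOWN: could not read BBU charge status\n";
exit 3;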
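
The BBU check matches an exact run of spaces before the colon, so it would stop matching if MegaCli's column padding ever changed. A minimal sketch of a more tolerant matcher follows. The helper name classify_bbu is hypothetical and not part of the original plugin; the sample input is copied from the comment lines in the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: maps -GetBbuStatus output to (exit_code, message).
sub classify_bbu {
    my ($output) = @_;

    # \s* tolerates any amount of padding around the colon.
    my $charged    = $output =~ /Fully Charged\s*:\s*(Yes|No)/    ? $1 : '';
    my $discharged = $output =~ /Fully Discharged\s*:\s*(Yes|No)/ ? $1 : '';

    return (0, 'OK: BBU fully charged')         if $charged    eq 'Yes';
    return (1, 'WARNING: BBU fully discharged') if $discharged eq 'Yes';
    return (1, 'WARNING: BBU neither fully charged nor fully discharged')
        if $charged eq 'No';

    # Neither status line was found, so the state cannot be determined.
    return (3, 'UNKNOWN: no BBU charge status in MegaCli output');
}

# Sample lines taken from the comment in the script.
my $sample = "  Fully Discharged        : No\n"
           . "  Fully Charged           : Yes\n";

my ($code, $message) = classify_bbu($sample);
print "$message\n";
```

Keeping the parsing in a function that takes a string means it can be exercised with canned output on a machine that has no RAID controller; the plugin itself would call it with the backtick output and exit with the returned code.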

On the Nagios server I add these three entries to our service config file :

define service{
        hostgroup_name                  megaraid_servers
        service_description             megaraid cache policy
#        notifications_enabled           0
        check_command                   check_nrpe_1arg!check_megaraid_cachepolicy
        use                             generic-service
        }

define service{
        hostgroup_name                  megaraid_servers
        service_description             megaraid patrolread
#        notifications_enabled           0
        check_command                   check_nrpe_1arg!check_megaraid_patrolread
        use                             generic-service
        }

define service{
        hostgroup_name                  megaraid_servers
        service_description             megaraid bbu status
        notifications_enabled           0
        check_command                   check_nrpe_1arg!check_megaraid_bbustatus
        use                             generic-service
        }

The cluster nodes run the Nagios NRPE server, which is configured to run certain Nagios health check plugins locally and send the results back to the Nagios server, thus waking me up at 3am with an e-mail to my cellphone.

/etc/nagios/nrpe.cfg:

command[check_megaraid_cachepolicy]=/usr/lib/nagios/custom-plugins/check_megaraid_cachepolicy.pl
command[check_megaraid_patrolread]=/usr/lib/nagios/custom-plugins/check_megaraid_patrolread.pl
command[check_megaraid_bbustatus]=/usr/lib/nagios/custom-plugins/check_megaraid_bbustatus.pl