Coding Nagios checks from scratch in Perl

I’m going to show you how to create a basic structure that you can use for all your custom Nagios checks. I tried to understand how to use the Nagios::Plugin module, but just couldn’t get it with my limited Perl skills. Since the code below only needs a basic understanding of Perl, I hope it will be simple enough for any sysadmin to use.

Get Your Script Connected

You may want to pass arguments from Nagios to your script. We will prepare the script to accept those values with the Getopt::Long Perl module.

In this example, the script searches a text file and counts how many times a specified IP address appears.

#! /usr/bin/perl
use strict;
use Getopt::Long qw(:config no_ignore_case);
my ($host, $server, $instances);
my $result = GetOptions(
    "H|host=s"      => \$host,
    "s|server=s"    => \$server,
    "i|instances=s" => \$instances,
);

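In the code above, I declare my variables with the “my” keyword, as in “my ($host, $server, $instances);”, because “use strict;” requires every variable to be declared before it is used.

“GetOptions” is the function that reads the command-line values Nagios passes in and assigns them to the script’s variables. “my $result” stores its return value, which is true when the options parsed cleanly and false otherwise.

Assuming our script is called check_test.pl (located in your Nagios libexec dir), I could run this command: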

./check_test.pl -H 192.168.1.101 -s 192.168.1.23 -i 2

The script’s $host variable would then contain 192.168.1.101, $server would contain 192.168.1.23, and $instances would contain 2.
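If an option is missing or mistyped, the variables above simply stay undefined, so it can help to check the parse before doing any work. Here is a minimal sketch of that idea. The parse_args helper and the demo argument list are my own inventions for illustration, and I use “=i” for the instance count (instead of the “=s” above) so that Getopt::Long rejects non-numeric values; GetOptionsFromArray is the Getopt::Long variant that parses a list you hand it instead of @ARGV.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Same setting as ":config no_ignore_case" in the script above.
Getopt::Long::Configure('no_ignore_case');

# parse_args() is a hypothetical helper, not part of the original script.
# It returns a hash ref of options, or undef if parsing failed or a
# required option is missing.
sub parse_args {
    my @args = @_;
    my %opt;
    my $ok = GetOptionsFromArray(
        \@args,
        'H|host=s'      => \$opt{host},
        's|server=s'    => \$opt{server},
        'i|instances=i' => \$opt{instances},
    );
    return undef unless $ok;
    for my $name (qw(host server instances)) {
        return undef unless defined $opt{$name};
    }
    return \%opt;
}

# Stand-in for @ARGV so the sketch runs without command-line arguments.
my @demo = qw(-H 192.168.1.101 -s 192.168.1.23 -i 2);
my $opt  = parse_args(@demo);

if ($opt) {
    print "host=$opt->{host} server=$opt->{server} instances=$opt->{instances}\n";
}
else {
    # A real plugin would follow this print with "exit 3;".
    print "UNKNOWN - missing or invalid arguments\n";
}
```

In a real check you would call parse_args(@ARGV) and exit with code 3 when it returns undef, so Nagios shows UNKNOWN instead of a misleading OK or CRITICAL.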

Exercise: Save the script below with executable permissions as check_test.pl and execute it with this command: ./check_test.pl -H 192.168.1.101 -s 192.168.1.23 -i 2

#! /usr/bin/perl
use strict;

use Getopt::Long qw(:config no_ignore_case);
my ($host, $server, $instances);
my $result = GetOptions(
    "H|host=s"      => \$host,
    "s|server=s"    => \$server,
    "i|instances=s" => \$instances,
);
print "My Host IP is $host\n";
print "My Server IP is $server\n";
print "Times Server found are $instances\n";

If all goes well, you should get this output:

My Host IP is 192.168.1.101
My Server IP is 192.168.1.23
Times Server found are 2

So that’s how to get your command line arguments passed to the variables in the check_test.pl script.

Telling Nagios What’s Up

The next thing to accomplish is telling Nagios the result of your check. Assume $count holds how many times I found the IP address passed with -s (192.168.1.2 here) in a text file, and I compare that to the $instances variable. For my check to be OK, I expect to find that address exactly twice, so I pass -i 2.

# Check the arguments first; otherwise the UNKNOWN branch could never be reached.
if (!$result || !defined $server || !defined $instances || $instances !~ /^\d+$/)
{
    print "UNKNOWN - Unable to determine the status. Check the -s and -i arguments.\n";
    exit 3; # Exit code 3 is what tells Nagios the check has no clue.
}
elsif ($count == $instances) # Numeric comparison, not a regex match.
{
    print "OK - $server is in the load balancer.\n";
    exit 0; # Exit code 0 is what tells Nagios everything is OK.
}
else
{
    print "CRITICAL - $server is NOT in the load balancer. Server: $server Times Found: $count\n";
    exit 2; # Exit code 2 is what tells Nagios the check was critical.
}

I didn’t need a warning state, but that’s easy to add with an “exit 1”. Whatever you print is the status text you’ll see in your Nagios interface.
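To make the “exit 1” idea concrete, here is a small sketch of one possible policy, and it is only my assumption of what a warning could mean here: the expected count is OK, zero hits is CRITICAL, and any other count is WARNING. The exit_code_for helper and the %STATUS_NAME table are hypothetical names; the exit codes themselves (0, 1, 2, 3) are the standard Nagios ones.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Standard Nagios plugin exit codes and their labels.
my %STATUS_NAME = (0 => 'OK', 1 => 'WARNING', 2 => 'CRITICAL', 3 => 'UNKNOWN');

# exit_code_for() is a hypothetical helper implementing an assumed policy:
#   non-numeric or missing input -> 3 (UNKNOWN)
#   count equals expected        -> 0 (OK)
#   count is zero                -> 2 (CRITICAL)
#   anything else                -> 1 (WARNING)
sub exit_code_for {
    my ($count, $instances) = @_;
    return 3 unless defined($count) && defined($instances);
    return 3 unless $count =~ /^\d+$/ && $instances =~ /^\d+$/;
    return 0 if $count == $instances;
    return 2 if $count == 0;
    return 1;
}

# Show what each count would report when two instances are expected.
for my $count (0, 1, 2) {
    my $code = exit_code_for($count, 2);
    print "count=$count -> exit $code ($STATUS_NAME{$code})\n";
}
```

In the check itself you would print your status line and then call exit with the code this function returns.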

Pulling the Script Together

Of course you’ll need to make modifications to suit your needs, but this will give you the interface for Nagios with Perl.

#! /usr/bin/perl
use strict;
use Getopt::Long qw(:config no_ignore_case);

my $count = 0;
my ($host, $server, $instances);

my $result = GetOptions(
    "H|host=s"      => \$host,
    "s|server=s"    => \$server,
    "i|instances=s" => \$instances,
);

# Your code does something here. With the suggested command of
# ./check_test.pl -H 192.168.1.101 -s 192.168.1.23 -i 2
# $count has to be 2 for OK; any other value, such as 0, is CRITICAL.

$count = 0;

# Check the arguments first; otherwise the UNKNOWN branch could never be reached.
if (!$result || !defined $server || !defined $instances || $instances !~ /^\d+$/)
{
    print "UNKNOWN - Unable to determine the status. Check the -s and -i arguments.\n";
    exit 3; # Exit code 3 is what tells Nagios the check has no clue.
}
elsif ($count == $instances) # Numeric comparison, not a regex match.
{
    print "OK - $server is in the load balancer.\n";
    exit 0; # Exit code 0 is what tells Nagios everything is OK.
}
else
{
    print "CRITICAL - $server is NOT in the load balancer. Server: $server Times Found: $count\n";
    exit 2; # Exit code 2 is what tells Nagios the check was critical.
}
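For the “your code does something here” part, this is a rough sketch of how the counting could work for the text-file example. The count_ip_in_file helper, the temporary file, and its “virtual -> real” line format are all made up for illustration; your own data source will look different.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# count_ip_in_file() is a hypothetical helper, not from the script above.
# It returns how many times $ip appears in the file at $path, or undef
# if the file cannot be opened (a plugin would report UNKNOWN for that).
sub count_ip_in_file {
    my ($path, $ip) = @_;
    open(my $fh, '<', $path) or return undef;
    my $count = 0;
    while (my $line = <$fh>) {
        # \Q...\E treats the dots literally; the lookarounds stop
        # 192.168.1.2 from matching inside 192.168.1.23.
        my $hits = () = $line =~ /(?<![\d.])\Q$ip\E(?!\d|\.\d)/g;
        $count += $hits;
    }
    close($fh);
    return $count;
}

# Demo data in an invented format: one "virtual -> real" pair per line.
my ($tmp, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp} "192.168.1.101:80 -> 192.168.1.23:80\n";
print {$tmp} "192.168.1.101:443 -> 192.168.1.23:443\n";
print {$tmp} "192.168.1.101:80 -> 192.168.1.2:80\n";
close($tmp);

my $found = count_ip_in_file($tmp_path, '192.168.1.23');
print "192.168.1.23 found $found time(s)\n";
```

The anchored pattern matters because a plain substring search for 192.168.1.2 would also count every 192.168.1.23 line.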

Configure Nagios

You’ll need to add a check command to use your new check_test.pl script inside your Nagios checkcommands.cfg file.

define command{
command_name    check_test_friendly_name
command_line    $USER1$/check_test.pl -H $HOSTADDRESS$ -s $ARG1$ -i $ARG2$
}

Also add a service check to use your new command of check_test_friendly_name in your Nagios services.cfg file.

define service{
use                      generic-service
host_name                load-balancer
service_description      load balancer check
is_volatile              0
check_period             24x7
max_check_attempts       2
normal_check_interval    1
retry_check_interval     1
contact_groups           admins
notification_interval    1
notification_period      24x7
notification_options     u,c,r
check_command            check_test_friendly_name!192.168.1.2!2
}
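If the “!” syntax in check_command looks cryptic: the first field is the command name, and the fields after it become $ARG1$, $ARG2$ and so on in the command_line. The sketch below imitates that substitution in a simplified way so you can see the command Nagios ends up running. The expand_command helper is hypothetical, and the values for $USER1$ and $HOSTADDRESS$ are assumptions (in a real setup they come from your resource file and host definition).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# expand_command() is a hypothetical, simplified imitation of Nagios
# macro substitution, for illustration only.
sub expand_command {
    my ($command_line, $check_command, %macros) = @_;

    # "name!arg1!arg2" -> command name plus the $ARGn$ values.
    my ($name, @args) = split /!/, $check_command;
    for my $i (0 .. $#args) {
        $macros{ 'ARG' . ($i + 1) } = $args[$i];
    }

    # Replace each $MACRO$ with its value (empty string if unknown).
    $command_line =~ s/\$(\w+)\$/exists $macros{$1} ? $macros{$1} : ''/ge;
    return $command_line;
}

print expand_command(
    '$USER1$/check_test.pl -H $HOSTADDRESS$ -s $ARG1$ -i $ARG2$',
    'check_test_friendly_name!192.168.1.2!2',
    USER1       => '/usr/local/nagios/libexec',   # assumed plugin directory
    HOSTADDRESS => '192.168.1.101',               # assumed address of load-balancer
), "\n";
```

Running the resulting command by hand as the Nagios user is a quick way to confirm the check behaves before you restart the daemon.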

After restarting the Nagios daemon, you’ll see the new service for your host.  Now start thinking about all the things Perl can do, and how simple the logic is to pass status information to Nagios.

References
http://nagiosplug.sourceforge.net/developer-guidelines.html
http://search.cpan.org/~tonvoon/Nagios-Plugin-0.32/lib/Nagios/Plugin/Getopt.pm

