Nagios Core Administration Cookbook (Second Edition)

Implementing threshold checks in a plugin

You'll note that many of the plugins in the Nagios Plugins set let you specify thresholds for different aspects of the tests they perform, so you can configure which levels are OK, which need a warning, and which are critical. For example, the check_ping plugin requires us to specify thresholds with the -w and -c options, which define limits for round-trip time and packet loss:

$ /usr/local/nagios/libexec/check_ping -H 192.0.2.21 -w 100,20% -c 200,40%
PING OK - Packet loss = 0%, RTA=0.20 ms|rta=0.200000ms;100.000000;200.000000;0.000000 pl=0%;20;40;0

In this case, the plugin's options are set to raise a WARNING state only if the round-trip time for the check exceeds 100 milliseconds or more than 20% of the packets are lost. It will raise a CRITICAL state if the round-trip time exceeds 200 milliseconds or more than 40% of the packets are lost.

When you're checking numeric values in a plugin, this is a useful way to allow the user to set their own thresholds for the check, rather than hardcoding them in the plugin itself. In this recipe, we'll adapt the plugin from the Writing a new plugin from scratch recipe to allow the user to set a threshold for kernel version numbers. We'll call this plugin check_kernel_version.

Getting ready

You should have a Nagios Core 4.0 or newer server running with a few hosts and services configured already, and you should already have successfully deployed the check_vuln_kernel plugin from the Writing a new plugin from scratch recipe in this chapter, including the installation of Perl and the two modules Nagios::Plugin (or Monitoring::Plugin) and Readonly.

How to do it...

We can write, test, and implement our check_kernel_version plugin as follows:

  1. Change to the directory containing the plugin binaries for Nagios Core. The default location is /usr/local/nagios/libexec:
    # cd /usr/local/nagios/libexec
    
  2. Start editing a new file called check_kernel_version:
    # vi check_kernel_version
    
  3. Include the following code in it. Take note of the comments, which explain what each block of code does:
    #!/usr/bin/env perl
    
    # Use strict Perl style
    use strict;
    use warnings;
    use utf8;
    
    # Require at least Perl v5.10
    use 5.010;
    
    # Require a few modules, including Nagios::Plugin
    use Nagios::Plugin;
    use POSIX;
    use Readonly;
    
    # Run POSIX::uname() to get the kernel version string
    my @uname = uname();
    my $version = $uname[2];
    my ($version_major) = split m/[.]/msx, $version;
    
    # Create a new Nagios::Plugin object
    my $np = Nagios::Plugin->new( usage => 'Usage: %s -w THRESHOLD -c THRESHOLD' );
    
    # Add options allowing specifying warning and critical ranges
    $np->add_arg(
     spec => 'warning|w=s',
     help => "-w, --warning=THRESHOLD\n"
     . ' Kernel version number threshold for returning warning',
     required => 1
    );
    $np->add_arg(
     spec => 'critical|c=s',
     help => "-c, --critical=THRESHOLD\n"
     . ' Kernel version number threshold for returning critical',
     required => 1
    );
    
    # Read options
    $np->getopts();
    
    # If we couldn't get the major version number, bail out with UNKNOWN.
    # This must happen before nagios_exit(), which ends the program.
    if ( !$version_major ) {
     $np->nagios_die('Could not read kernel version string');
    }
    
    # Compare the major version number to the thresholds
    my $code = $np->check_threshold(
     check => $version_major,
     warning => $np->opts->warning,
     critical => $np->opts->critical,
    );
    
    # Exit with the appropriate status
    $np->nagios_exit( $code, $version );
    
  4. Make the plugin owned by the nagios group with chown(1) and executable with chmod(1):
    # chown root:nagios check_kernel_version
    # chmod 0770 check_kernel_version
    
  5. Run the plugin directly to test it; your output may differ depending on your system's kernel version:
    # sudo -s -u nagios 
    $ ./check_kernel_version -w 4: -c 3:
    KERNEL_VERSION WARNING - 3.16.0-4-amd64
    

We should now be able to use the plugin in a command and hence in a service check, just like any other command.
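
Nagios Core takes the state of a check from the plugin's exit code, not from the text it prints: 0 is OK, 1 is WARNING, 2 is CRITICAL, and 3 is UNKNOWN. The following is a minimal sketch of reading that code from Perl. The state_of() helper is hypothetical, and a child Perl process that exits with status 1 stands in for the plugin so that the example runs without Nagios::Plugin installed:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;

# Standard plugin exit codes and the state names they correspond to
my %STATE_FOR = ( 0 => 'OK', 1 => 'WARNING', 2 => 'CRITICAL', 3 => 'UNKNOWN' );

# Run a command and describe its exit code.
# state_of() is a hypothetical helper for illustration only.
sub state_of {
    my @command = @_;
    system @command;

    # -1 means the command could not be started; low bits mean a signal
    return 'no exit code' if ( $? == -1 ) || ( $? & 127 );

    my $exit = $? >> 8;
    return $STATE_FOR{$exit} // "non-standard exit code $exit";
}

# Stand-in for the plugin: a child Perl process that exits with status 1
say state_of( $^X, '-e', 'exit 1' );    # prints WARNING

# With the plugin from this recipe in the current directory, the call
# might instead look like this (assumed invocation, output will vary):
# say state_of( './check_kernel_version', '-w', '4:', '-c', '3:' );
```

From a shell, the same information is available in $? immediately after running the plugin.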

How it works...

The code in check_kernel_version differs from that of check_vuln_kernel in several ways:

  • It tests only the major kernel version number, the very first part of the kernel version string.
  • It allows us to specify the range of version numbers that should raise WARNING or CRITICAL states on the command line, rather than hardcoding them in the script. It does this using the Nagios::Plugin implementation of an option parser.
  • It includes some basic output, showing usage and help information for the options. This is required by Nagios::Plugin.
  • It uses the check_threshold method of the Nagios::Plugin package to check the version number against the thresholds for us.

We could have written our own code to compare the version numbers, but there's an advantage to using the check_threshold method; it uses the standard threshold format to specify the range for an alert level. In the recipe, we used these values:

  • 4: for -w means that a WARNING alert is generated if the value being checked is less than 4.
  • 3: for -c means that a CRITICAL alert is generated if the value being checked is less than 3.

Because the major version number of the kernel in the recipe is 3, we got the WARNING output because it's less than 4, but not less than 3.

The threshold format syntax allows you to specify the ranges for alerts precisely. There's a breakdown of the syntax available on the Nagios Plugins website, https://nagios-plugins.org/doc/guidelines.html#THRESHOLDFORMAT.
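
To make the range rules concrete, here is a minimal sketch of how a threshold string can be interpreted, written in plain Perl so that it runs without Nagios::Plugin. It is not the module's own implementation. The in_alert_range() and status_for() helpers are hypothetical names, and the sketch assumes well-formed numeric ranges with no input validation:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return true if $value falls in the alerting part of a threshold range.
# in_alert_range() is a hypothetical helper, not a Nagios::Plugin method.
sub in_alert_range {
    my ( $value, $range ) = @_;

    # A leading "@" inverts the test: alert when the value is inside the range
    my $inverted = ( $range =~ s/\A\@// ) ? 1 : 0;

    # A bare "N" is shorthand for "0:N"
    my ( $start, $end ) = ( 0, $range );
    if ( $range =~ m/\A([^:]*):(.*)\z/ ) {
        ( $start, $end ) = ( $1, $2 );
    }
    $start = 0 if $start eq q{};

    # "~" as the start means no lower bound; an empty end means no upper bound
    my $below   = $start ne q{~} && $value < $start;
    my $above   = $end ne q{}    && $value > $end;
    my $outside = ( $below || $above ) ? 1 : 0;

    return $inverted ? !$outside : $outside;
}

# Combine both thresholds; CRITICAL takes precedence over WARNING
sub status_for {
    my ( $value, $warning, $critical ) = @_;
    return 'CRITICAL' if in_alert_range( $value, $critical );
    return 'WARNING'  if in_alert_range( $value, $warning );
    return 'OK';
}

# Major version 3 with the thresholds used in the recipe
print status_for( 3, '4:', '3:' ), "\n";    # prints WARNING
```

With the same thresholds, a major version of 2 would be CRITICAL and a major version of 4 or higher would be OK. Ranges such as 10 (alert below 0 or above 10), ~:10 (alert above 10), and @10:20 (alert from 10 to 20 inclusive) follow the same rules.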

There's more...

We might set up a command and corresponding service check for this new plugin as follows:

define command {
    command_name  check_kernel_version
    command_line  $USER1$/check_kernel_version -w $ARG1$ -c $ARG2$
}
define service {
    use                  local-service
    host_name            localhost
    service_description  KERNEL_VERSION
    check_command        check_kernel_version!4:!3:
}

Note that we are able to define the WARNING and CRITICAL thresholds in the service definition as command arguments. This allows us to choose different thresholds for different services without editing the plugin's code.

Ideally, when writing plugins, we should include documentation and help output, particularly if we intend to distribute them to other users. The Nagios::Plugin module enforces some bare minimums, such as specifying the usage information output for the plugin if we declare any options, and requiring help information for each option.

If we run this plugin with a --help option, we get some useful output built by the Nagios::Plugin module's methods:

$ ./check_kernel_version --help
check_kernel_version 

This nagios plugin is free software, and comes with ABSOLUTELY NO WARRANTY. 
It may be used, redistributed and/or modified under the terms of the GNU 
General Public Licence (see http://www.fsf.org/licensing/licenses/gpl.txt).

Usage: check_kernel_version -w THRESHOLD -c THRESHOLD

 -?, --usage
 Print usage information
 -h, --help
 Print detailed help screen
 -V, --version
 Print version information
 --extra-opts=[section][@file]
 Read options from an ini file. See http://nagiosplugins.org/extra-opts
 for usage and examples.
 -w, --warning=THRESHOLD
 Kernel version number threshold for returning warning
 -c, --critical=THRESHOLD
 Kernel version number threshold for returning critical
 -t, --timeout=INTEGER
 Seconds before plugin times out (default: 15)
 -v, --verbose
 Show details for command-line debugging (can repeat up to 3 times)

Note that one of these options is --extra-opts. The module implements this so we can put any options for the plugin call into a file if we wish. For example, we could put our -w and -c options into an INI file check_kernel_version.ini:

[check_kernel_version]
warning = 4:
critical = 3:

Then we could call it like this:

$ ./check_kernel_version --extra-opts=@check_kernel_version.ini
KERNEL_VERSION WARNING - 3.16.0-4-amd64
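
Conceptually, each key = value pair in the plugin's section of the INI file is treated like the matching long option on the command line. The following is a simplified sketch of that idea in plain Perl. It is not the module's actual parser, which handles more cases, and ini_section_to_args() is a hypothetical helper name:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Turn one section of INI text into a list of long-option arguments.
# ini_section_to_args() is a hypothetical helper, not the module's parser.
sub ini_section_to_args {
    my ( $ini_text, $wanted ) = @_;
    my $section = q{};
    my @args;

    for my $line ( split m/\n/, $ini_text ) {

        # Skip blank lines and comment lines
        next if $line =~ m/\A\s*(?:[#;].*)?\z/;

        if ( $line =~ m/\A\s*\[([^\]]+)\]\s*\z/ ) {

            # A [name] line starts a new section
            $section = $1;
        }
        elsif ( $section eq $wanted
            && $line =~ m/\A\s*([^=\s]+)\s*=\s*(.*?)\s*\z/ )
        {
            # key = value in the wanted section becomes --key=value
            push @args, "--$1=$2";
        }
    }
    return @args;
}

# The same content as check_kernel_version.ini in the recipe
my $ini = "[check_kernel_version]\nwarning = 4:\ncritical = 3:\n";

print join( q{ }, ini_section_to_args( $ini, 'check_kernel_version' ) ), "\n";
# prints --warning=4: --critical=3:
```

This is why the call with --extra-opts produces the same result as passing -w 4: -c 3: directly.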

See also

  • The Creating a new command recipe in this chapter
  • The Creating a new service recipe in Chapter 1, Understanding Hosts, Services, and Contacts
  • The Writing a new plugin from scratch recipe in this chapter
  • The Using macros as environment variables in a plugin recipe in this chapter