Triggering Nagios Alerts With SNMP Traps

Although SNMP polling is a great way to monitor the health of your network infrastructure, it is impractical to poll every object on every device. Trap notifications can cover this visibility gap and provide immediate notification of network events.

Installation and configuration have been broken down into 7 steps:

  1. Configure “SNMP_TRAP” Nagios service definition
  2. Configure “TRAP” service check
  3. Change Nagios submit_check_result to use Bash
  4. Enable and configure SNMP traps on network devices
  5. Download and install MIBs
  6. Configure snmptrapd
  7. Convert MIBs to snmptt configuration

Configure “SNMP_TRAP” Nagios Service

I’m assuming you already have a working Nagios installation. If not, please start with my post “Monitoring Network Devices with Nagios”.

The first thing we will do is add a service entry to /etc/nagios3/conf.d/services.cfg:

define service {
name                            SNMP_TRAP
service_description             SNMP_TRAP
active_checks_enabled           1       ; Active service checks are enabled
passive_checks_enabled          1       ; Passive service checks are enabled/accepted
parallelize_check               1       ; Active service checks should be parallelized
obsess_over_service             0       ; Do not obsess over this service
check_freshness                 0       ; Default is to NOT check service 'freshness'
notifications_enabled           1       ; Service notifications are enabled
event_handler_enabled           1       ; Service event handler is enabled
flap_detection_enabled          1       ; Flap detection is enabled
process_perf_data               1       ; Process performance data
retain_status_information       1       ; Retain status information across program restarts
retain_nonstatus_information    1       ; Retain non-status information across program restarts
check_command                   check-host-alive      ; This will be used to reset the service to "OK"
is_volatile                     1
check_period                    24x7
max_check_attempts              1
normal_check_interval           1
retry_check_interval            1
notification_interval           120
notification_period             24x7
notification_options            w,u,c,r
contact_groups                  netops-24x7       ; Modify this to match your Nagios contact group definitions
register                        0
}

Note: About 90% of these settings should realistically be done in a service template which can then be used by multiple services. I am putting everything in the service definition for the sole purpose of making this example easier to follow.

Configure “TRAP” Service Check

Now you can add a service check which uses SNMP_TRAP for all of your network devices. In this example I am applying it to the Routers, Switches, and Security hostgroups:

define service {
use                 SNMP_TRAP
hostgroup_name      Routers,Switches,Security
service_description TRAP
check_interval      120 ; Don't clear for 2 hours
}
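Before reloading Nagios, it is worth validating the new definitions. Here is a minimal sketch, assuming the Debian/Ubuntu nagios3 package layout (binary at /usr/sbin/nagios3, main config at /etc/nagios3/nagios.cfg). The function name and the two variables are my own, so adjust them for your install:

```shell
# Assumed defaults for the Debian/Ubuntu nagios3 package; override as needed.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/sbin/nagios3}"
NAGIOS_CFG="${NAGIOS_CFG:-/etc/nagios3/nagios.cfg}"

# Run the Nagios pre-flight check (-v) and report whether a reload is safe.
# On failure the full verification output is printed to stderr.
verify_nagios_config() {
    if output=$("$NAGIOS_BIN" -v "$NAGIOS_CFG" 2>&1); then
        echo "config OK"
    else
        echo "$output" >&2
        return 1
    fi
}

# Usage (as root):
#   verify_nagios_config && service nagios3 reload
```

Chaining the reload behind the check means a typo in services.cfg will not take down a running Nagios.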

Your Nagios service should look like this:

[Image: Nagios_Trap]

The “PING OK” status comes from the check-host-alive check_command we gave. It is just an easy way of clearing the alert, since no real active check is going on.

You can now test your service by issuing the following command from the CLI to make TRAP go CRITICAL:

root@nagios# /usr/share/nagios3/plugins/eventhandlers/submit_check_result cr1.domain.com TRAP 2 "TESTING"

Command arguments are listed as follows:

#  $1 = host_name (Short name of host that the service is
#       associated with)
#  $2 = svc_description (Description of the service)
#  $3 = return_code (An integer that determines the state
#       of the service check, 0=OK, 1=WARNING, 2=CRITICAL,
#       3=UNKNOWN).
#  $4 = plugin_output (A text string that should be used
#       as the plugin output for the service check)

To reset the check to “OK” you can either run the same command with a return code of 0, click “Re-schedule the next check of this service”, or wait for the duration of the check_interval (120 minutes).
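Under the hood, submit_check_result hands Nagios a passive check result by writing a PROCESS_SERVICE_CHECK_RESULT external command to the Nagios command file. The sketch below only formats that line so you can see what Nagios receives. The function name is my own, and the command file path in the usage comment is an assumption based on the Debian nagios3 package:

```shell
# Build the external command line that Nagios expects for a passive
# service check result:
#   [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output
format_check_result() {
    host="$1"; service="$2"; code="$3"; output="$4"
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$host" "$service" "$code" "$output"
}

# Usage: reset TRAP to OK by hand (command file path is an assumption):
#   format_check_result cr1.domain.com TRAP 0 "Manually cleared" \
#       >> /var/lib/nagios3/rw/nagios.cmd
```

Seeing the raw line is handy when debugging: if the host name or service description does not match your Nagios configuration exactly, Nagios will not apply the result.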

Change Nagios “submit_check_result” to use Bash

I’ll spare you all the headache of troubleshooting this on your own. If you don’t modify this file, it will work perfectly when you execute it from the command line but fail when snmptt calls it.

Open /usr/share/nagios3/plugins/eventhandlers/submit_check_result and change the first line from #!/bin/sh to #!/bin/bash:

#!/bin/bash

# SUBMIT_CHECK_RESULT
# Written by Ethan Galstad (egalstad@nagios.org)
# Last Modified: 02-18-2002
#
# This script will write a command to the Nagios command...
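If you would rather script this change, here is a minimal sketch using sed. It rewrites line 1 only when that line is exactly #!/bin/sh and keeps a .bak copy of the original. The function name is my own:

```shell
# Switch a script's interpreter line from /bin/sh to /bin/bash.
# Only line 1 is touched, and only if it is exactly "#!/bin/sh".
use_bash_shebang() {
    script="$1"
    sed -i.bak '1s|^#!/bin/sh$|#!/bin/bash|' "$script"
    head -n 1 "$script"   # print the resulting interpreter line
}

# Usage (as root):
#   use_bash_shebang /usr/share/nagios3/plugins/eventhandlers/submit_check_result
```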

 

Enable and Configure SNMP Traps on Network Devices

You want your SNMP traps to complement your SNMP polling alerts, not duplicate them. Only trap on things you find important enough to fire an immediate alert on, such as Spanning Tree loops or hardware problems.

Configuring SNMP on network devices is trivial, so I will just paste examples rather than stepping you through it.

Full documentation can be found here: http://www.cisco.com/en/US/docs/ios/12_2/configfun/configuration/guide/fcf014.html#wp1001086

Note: You’ll see that my core router has traps enabled for configuration changes. This provides an easy way to test your complete Nagios trap setup.

Example: Cisco 6509 Core Router With VRFs Defined

snmp-server community public RO
snmp-server trap-source Vlan5
snmp-server enable traps chassis
snmp-server enable traps module
snmp-server enable traps transceiver all
snmp-server enable traps bgp
! The next two lines are for testing only; remove them after testing
snmp-server enable traps config-copy
snmp-server enable traps config
snmp-server enable traps stpx inconsistency root-inconsistency loop-inconsistency
snmp-server enable traps envmon fan shutdown supply temperature status
snmp-server enable traps errdisable
snmp-server host 192.168.5.5 vrf INTERNAL public

Example: Cisco Nexus 5596 Aggregation Layer

snmp-server contact Paul Porter
snmp-server source-interface trap Vlan5
snmp-server source-interface inform Vlan5
snmp-server user admin network-admin auth localizedkey
snmp-server host 192.168.5.5 traps version 2c public
snmp-server host 192.168.5.5 use-vrf default
snmp-server enable traps bridge newroot
snmp-server enable traps bridge topologychange
snmp-server enable traps stpx inconsistency
snmp-server enable traps stpx root-inconsistency
snmp-server enable traps stpx loop-inconsistency
snmp-server community public group network-operator

Example: Cisco 2960S Access Layer

snmp-server community public RO
snmp-server enable traps bridge topologychange
snmp-server enable traps envmon fan shutdown supply temperature status
snmp-server enable traps errdisable
snmp-server host 192.168.5.5 version 2c public

Example: Cisco ASA 5520 Remote Access VPN

snmp-server host inside 192.168.5.5 community public
snmp-server community public
snmp-server enable traps entity config-change fru-insert fru-remove
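Once traps are enabled on a device, a quick way to confirm they actually reach the Nagios server is to watch UDP port 162, the standard SNMP trap port. Here is a minimal sketch, assuming tcpdump is installed. The function name and the default packet count are my own:

```shell
# Capture a handful of packets arriving on the SNMP trap port (UDP 162).
# $1 = number of packets to capture before exiting (default 5)
watch_for_traps() {
    count="${1:-5}"
    tcpdump -n -i any -c "$count" udp port 162
}

# Usage (as root), then trigger a trap on a device, e.g. a config change:
#   watch_for_traps 3
```

If nothing shows up here, the problem is on the device or network side (trap source, VRF, ACLs) rather than in snmptrapd or snmptt.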

Download and Install MIBs

I am assuming you already have snmpd and snmptt installed on your Nagios server. If not, you should be able to simply install the packages with apt-get or yum:

root@nagios:~# apt-get install snmptt snmpd

Note: By default snmptt will look in /usr/share/snmp/mibs for a list of MIBs to use, so place your downloaded MIB files there.

A list of MIBs that are compatible with your device can be found here: http://www.cisco.com/public/sw-center/netmgmt/cmtk/mibs.shtml

Select your device from the menu and use the listing provided to help you select the proper MIBs to download from ftp://ftp.cisco.com/pub/mibs/v2/

Here’s the partial list for a Cisco 6509 running IOS 12.2:

[Image: MIB-6509]

Now use wget to download the MIBs to the /usr/share/snmp/mibs directory.

Below is the list of MIBs that I downloaded for my network. Be sure to download CISCO-CONFIG-MAN-MIB.my so you can test your Nagios setup by just issuing “write mem” on your devices!

ftp://ftp.cisco.com/pub/mibs/v2/CISCO-TC.my
ftp://ftp.cisco.com/pub/mibs/v2/CISCO-SMI.my
ftp://ftp.cisco.com/pub/mibs/v2/CISCO-VTP-MIB.my
ftp://ftp.cisco.com/pub/mibs/v2/CISCO-STP-EXTENSIONS-CAPABILITY.my
ftp://ftp.cisco.com/pub/mibs/v2/CISCO-ENVMON-MIB.my
ftp://ftp.cisco.com/pub/mibs/v2/BGP4-MIB.my
ftp://ftp.cisco.com/pub/mibs/v2/CISCO-ERR-DISABLE-MIB.my
ftp://ftp.cisco.com/pub/mibs/v2/CISCO-MODULE-AUTO-SHUTDOWN-MIB.my
ftp://ftp.cisco.com/pub/mibs/v2/CISCO-MEMORY-POOL-MIB.my
ftp://ftp.cisco.com/pub/mibs/v2/CISCO-L4L7MODULE-RESOURCE-LIMIT-MIB.my
ftp://ftp.cisco.com/pub/mibs/v2/CISCO-CONFIG-MAN-MIB.my
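Fetching the files one by one gets tedious, so here is a minimal sketch of a wget loop. It assumes the Cisco FTP path listed above is reachable from your server and that /usr/share/snmp/mibs (the path used by the snmpttconvertmib commands in this post) is your MIB directory. The function and variable names are my own:

```shell
# Assumed locations; the base URL is the Cisco FTP path listed above.
MIB_DIR="${MIB_DIR:-/usr/share/snmp/mibs}"
MIB_BASE_URL="ftp://ftp.cisco.com/pub/mibs/v2"

# Download each named MIB (given without the .my suffix) into MIB_DIR.
# Returns non-zero if any download failed.
download_mibs() {
    failed=0
    for mib in "$@"; do
        if ! wget -q -P "$MIB_DIR" "$MIB_BASE_URL/$mib.my"; then
            echo "failed to download $mib" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Usage (as root):
#   download_mibs CISCO-SMI CISCO-TC CISCO-CONFIG-MAN-MIB
```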

Configure snmptrapd

Configure snmptrapd so that it sends all traps to snmptthandler and does not require authorization. Restart snmpd when you are done.

Your configuration file should look like this:


root@nagios# cat /etc/snmp/snmptrapd.conf
disableAuthorization yes
traphandle default /usr/sbin/snmptthandler
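A typo in this file is easy to miss, so here is a minimal sketch that checks for both directives shown above. The function name is my own, and it only looks for these two lines, nothing else:

```shell
# Return 0 only if the file contains both directives shown above.
# $1 = path to snmptrapd.conf (defaults to the path used in this post)
check_trapd_conf() {
    conf="${1:-/etc/snmp/snmptrapd.conf}"
    grep -Eq '^[[:space:]]*disableAuthorization[[:space:]]+yes' "$conf" &&
        grep -Eq '^[[:space:]]*traphandle[[:space:]]+default[[:space:]]+/usr/sbin/snmptthandler' "$conf"
}

# Usage:
#   check_trapd_conf && echo "snmptrapd.conf looks right"
```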

Convert MIBs to snmptt configuration

This is where you will tell the SNMP Trap Translator to call the submit_check_result script we tested in Step #2 any time it sees a trap that matches the MIBs we downloaded in Step #5.

Here’s how we would configure a CRITICAL alert for traps generated by configuration changes:

snmpttconvertmib --in=/usr/share/snmp/mibs/CISCO-CONFIG-MAN-MIB.my --out=/etc/snmp/snmptt.conf --debug --exec='/usr/share/nagios3/plugins/eventhandlers/submit_check_result $r TRAP 2' 

Run this command with all of the MIBs you downloaded and you’re all set. Restart snmptt when you are done.
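Running the conversion by hand for every MIB is repetitive, so here is a minimal sketch of a loop around the same snmpttconvertmib options used above (minus --debug). The function name and argument order are my own:

```shell
SUBMIT=/usr/share/nagios3/plugins/eventhandlers/submit_check_result

# Convert every listed MIB file, pointing each trap at one Nagios service.
# $1 = snmptt.conf path, $2 = service description, $3 = return code,
# remaining arguments = MIB files to convert
convert_mibs() {
    out="$1"; service="$2"; code="$3"
    shift 3
    for mib in "$@"; do
        # \$r stays literal here so that snmptt substitutes it at run time
        snmpttconvertmib --in="$mib" --out="$out" \
            --exec="$SUBMIT \$r $service $code"
    done
}

# Usage (as root):
#   convert_mibs /etc/snmp/snmptt.conf TRAP 2 /usr/share/snmp/mibs/CISCO-*.my
```

As far as I know, snmpttconvertmib appends to the --out file, so running the loop twice may leave duplicate EVENT entries in snmptt.conf. Keep a backup of the file before experimenting.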

If you look at your snmptt.conf file it should contain settings like this now:


EVENT ciscoEnvMonFanNotification .1.3.6.1.4.1.9.9.13.3.0.4 "Status Events" Normal
FORMAT A ciscoEnvMonFanNotification is sent if any one of $*
EXEC /usr/share/nagios3/plugins/eventhandlers/submit_check_result $r TRAP 2 "A ciscoEnvMonFanNotification is sent if any one of $*"
SDESC
A ciscoEnvMonFanNotification is sent if any one of
the fans in the fan array (where extant) fails.
Since such a notification is usually generated before
the shutdown state is reached, it can convey more
data and has a better chance of being sent
than does the ciscoEnvMonShutdownNotification.
This notification is deprecated in favour of
ciscoEnvMonFanStatusChangeNotif.
Variables:
1: ciscoEnvMonFanStatusDescr
2: ciscoEnvMonFanState
EDESC
#
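To get a quick overview of which traps snmptt now knows about, you can list the EVENT lines. Here is a minimal sketch, relying only on the EVENT line layout shown in the sample above (keyword, name, OID). The function name is my own:

```shell
# Print "name OID" for every EVENT entry in an snmptt configuration file.
# $1 = path to snmptt.conf (defaults to the path used in this post)
list_snmptt_events() {
    awk '$1 == "EVENT" { print $2, $3 }' "${1:-/etc/snmp/snmptt.conf}"
}

# Usage:
#   list_snmptt_events | sort
```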

Verification

If you downloaded CISCO-CONFIG-MAN-MIB.my and enabled config traps, you can test the process by logging into your network device and issuing the “write mem” command. This should cause your TRAP service to go CRITICAL or WARNING, depending on your configuration.
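If you cannot touch a production device, you can also fire a trap from any Linux host with the net-snmp snmptrap tool (in the snmp package on Debian/Ubuntu, as far as I know). Here is a minimal sketch. The function name is my own, and it assumes the sending host is defined in Nagios with a TRAP service, because snmptt passes the sender's name to submit_check_result:

```shell
# Send a bare SNMPv2c test trap with the net-snmp snmptrap tool.
# $1 = trap receiver (the Nagios server), $2 = community, $3 = trap OID
send_test_trap() {
    receiver="$1"; community="$2"; trap_oid="$3"
    # The empty string tells snmptrap to fill in the current uptime itself.
    snmptrap -v 2c -c "$community" "$receiver" '' "$trap_oid"
}

# Usage: fire the ciscoEnvMonFanNotification OID from the snmptt.conf sample
#   send_test_trap 192.168.5.5 public .1.3.6.1.4.1.9.9.13.3.0.4
```

The trap carries no variable bindings, so the alert text will be incomplete, but it is enough to prove the snmptrapd, snmptt, and Nagios chain works end to end.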

Here’s an example of a BGP event causing the TRAP service to go to a WARNING state:

[Image: Nagios_TRAP_Warning]

Additional Enhancements

If you want to have unique alerts for different SNMP traps, you can create additional services in Step #2 and then replace TRAP with the new service description in your snmpttconvertmib command.

Here’s an example of setting a unique service check for Spanning Tree:

1. Create TRAP and SPANNING_TREE service checks

define service {
use                 SNMP_TRAP
hostgroup_name      Routers,Switches,Security
service_description TRAP
check_interval      120 ; Don't clear for 2 hours
}
define service {
use                 SNMP_TRAP
hostgroup_name      Switches
service_description SPANNING_TREE
check_interval      120 ; Don't clear for 2 hours
}

2. Convert MIBs

Here we are telling snmptt to change the status of the SPANNING_TREE service for traps from the CISCO-STP-EXTENSIONS-CAPABILITY MIB.


snmpttconvertmib --in=/usr/share/snmp/mibs/CISCO-STP-EXTENSIONS-CAPABILITY.my --out=/etc/snmp/snmptt.conf --debug --exec='/usr/share/nagios3/plugins/eventhandlers/submit_check_result $r SPANNING_TREE 2' 
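Once you have more than a couple of MIB-to-service mappings, a small table can be easier to maintain than a pile of one-off commands. Here is a minimal sketch that reads "MIB service return_code" lines from standard input. The function name, the table format, and the BGP line in the usage comment are my own illustrations:

```shell
SUBMIT=/usr/share/nagios3/plugins/eventhandlers/submit_check_result
MIB_DIR="${MIB_DIR:-/usr/share/snmp/mibs}"

# Read "MIB service return_code" triples from stdin and convert each MIB
# so that its traps update the named Nagios service.
# $1 = snmptt.conf path
convert_from_map() {
    out="$1"
    while read -r mib service code; do
        [ -n "$mib" ] || continue          # skip blank lines
        # </dev/null keeps the converter from consuming the mapping table
        snmpttconvertmib --in="$MIB_DIR/$mib.my" --out="$out" \
            --exec="$SUBMIT \$r $service $code" </dev/null
    done
}

# Usage (as root); the BGP line is illustrative (1 = WARNING):
#   convert_from_map /etc/snmp/snmptt.conf <<'EOF'
#   CISCO-STP-EXTENSIONS-CAPABILITY SPANNING_TREE 2
#   BGP4-MIB TRAP 1
#   EOF
```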

You should now have a TRAP and SPANNING_TREE service listed for your Switches:

[Image: Nagios_TRAP_STP]

References

http://www.cisco.com/public/sw-center/netmgmt/cmtk/mibs.shtml

http://askaralikhan.blogspot.com/2010/12/receiving-snmp-traps-in-nagios.html

http://highsecurity.blogspot.com/2009/11/nagios-receive-traps-with-snmptt_08.html

Turnbull, James. Pro Nagios 2.0, 2006. Print.

5 comments

  1. hello,
    I tried to set up SNMP traps on Icinga. When I unplug the CAT cable on a Cisco 800 device, I can see the trap in the snmptrapd.log file, and the service also goes CRITICAL in icinga-web. But how can I test to make sure the other traps work? My email: muharrem@muharremaydin.co

  2. Thank you for your insight regarding switching the submit_check_result script shell to bash… Unfortunately I had already “bashed” my head on the desk for two days without solving it by the time I found your post :-/

    Rather bizarrely, even if I su snmptt -s /bin/sh and run submit_check_result interactively, it did in fact work. Frankly, I don’t care enough any more to troubleshoot it!

  3. Hello, thanks for your site.
    Could you help me?
    I have Nagios configured with SNMP. I can see traps arriving via tcpdump on port 162, but nothing shows up in the snmptt log file.
    This command works fine:
    ./submit_check_result srv-nagios-prod "SNMP Traps" 0 "WAITING FOR TRAPS….."

    I use a debian 8.
    Thank you for your help.
