Monitor Dell Server Hardware in Nagios with SNMP v3

RHEL

Instructions can be found at http://linux.dell.com/repo/hardware/ by selecting the version of Dell OpenManage (DOM) you need to install. The following steps worked for the Dell PowerEdge (PE) 1950 and PE 2950.

THIS DOES NOT WORK FOR PE 2550s. You need an older version of DOM to monitor these. For those servers I logged into support.dell.com, entered the serial number, and downloaded the appropriate version of OM for that hardware.

Add Dell Repository to your Server

wget -q -O - http://linux.dell.com/repo/hardware/OMSA_6.3/bootstrap.cgi | bash

Install DOM on Remote Server

Now you can install srvadmin-all with the yum command: yum install srvadmin-all

Restart snmpd service: sudo /sbin/service snmpd restart
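
If the Dell values still do not show up after the restart, one thing worth checking is whether snmpd has been told to hand the Dell subtree to OpenManage. The sketch below assumes the OpenManage SNMP agent registers over SMUX through a smuxpeer line for .1.3.6.1.4.1.674.10892.1 in /etc/snmp/snmpd.conf (the usual location on RHEL; adjust if yours differs). The function name is made up for illustration.

```shell
#!/bin/sh
# has_dell_smuxpeer [FILE]
# Returns 0 if FILE (default /etc/snmp/snmpd.conf) contains an uncommented
# smuxpeer line for the Dell OpenManage subtree, non-zero otherwise.
# Assumption: OpenManage talks to snmpd over SMUX. Hypothetical helper name.
has_dell_smuxpeer() {
    conf="${1:-/etc/snmp/snmpd.conf}"
    grep -Eq '^[[:space:]]*smuxpeer[[:space:]]+\.1\.3\.6\.1\.4\.1\.674\.10892\.1([[:space:]]|$)' "$conf"
}

# Example use (commented out so the file can be sourced safely):
# has_dell_smuxpeer || echo "no Dell smuxpeer line in snmpd.conf"
```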

SuSE 10 i386

Download the tarball from Dell Support

Extract the tarball: tar xvzf OM-SrvAdmin-Dell-Web-LX-XXX.tar.gz

Install the services: sudo sh linux/srvadmin-install.sh

Follow the prompts to install. I had a problem with the install-all option, so I selected only the necessary packages.

Restart snmpd: sudo /sbin/service snmpd restart

Troubleshooting

If the ipmi service doesn't start, you will not get the system-specific SNMP variables returned. I found a solution here:

http://lists.us.dell.com/pipermail/linux-poweredge/2008-October/037701.html

Running /sbin/start_udev created the /dev/ipmi0 character device.
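
A small check like the following can tell you whether the device node is there before you go looking at the SNMP side. It is only a sketch: the function name is hypothetical, and it tests nothing beyond the existence of a character device at the given path.

```shell
#!/bin/sh
# ipmi_device_ready [PATH]
# Returns 0 if PATH (default /dev/ipmi0) exists and is a character device.
# Hypothetical helper name, for illustration only.
ipmi_device_ready() {
    [ -c "${1:-/dev/ipmi0}" ]
}

# Example: only re-run udev when the node is missing.
# ipmi_device_ready || /sbin/start_udev
```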

Monitor HP Proliant DL360 on Nagios and SNMP v3

RHEL5

Download hp-health.XXX.rpm from here:

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&swItem=MTX-83c9772afe784cb4b0bad42f57&refresh=true

Download hp-snmp-agents from here:

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=15351&prodSeriesId=452749&prodNameId=3288142&swEnvOID=4006&swLang=8&mode=2&taskId=135&swItem=MTX-f0a7ddbd9a1b4be4acc735a541

RHEL4

These instructions assume you already have SNMP version 3 configured on a RHEL4 HP ProLiant DL360 server, and a second server with Nagios installed and working.

Download and Install HP RPMs

Necessary RPMs are hp-health and hp-snmp-agents.

You can go to the below link to download hp-health for RHEL4 x86 directly: http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=15351&prodSeriesId=452749&prodNameId=3288142&swEnvOID=2025&swLang=8&mode=2&taskId=135&swItem=MTX-11651fcb8d1b4b3fb224959c4e

You can go to the below link to download hp-snmp-agents for RHEL4 x86 directly:

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=15351&prodSeriesId=452749&prodNameId=3288142&swEnvOID=2025&swLang=8&mode=2&taskId=135&swItem=MTX-15f072a096134b8397225e4612

First install the hp-health RPM: rpm -ivh hp-health-XXX.rpm

Start the service: service hp-health start

Install the hp-snmp-agents RPM: rpm -ivh hp-snmp-agents.XXX.rpm

Start the service: service hp-snmp-agents start

Edit the snmpd configuration file (snmpd.conf) to add the HP MIBs to net-snmp:

dlmod cmaX /usr/lib/libcmaX.so

(for 64-bit) dlmod cmaX /usr/lib64/libcmaX64.so
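
If you manage both 32-bit and 64-bit hosts, picking the right line can be scripted. This is a minimal sketch that assumes x86_64 is the only 64-bit machine type you have; the function name is made up, and the two paths are the ones given above.

```shell
#!/bin/sh
# cmax_dlmod_line [ARCH]
# Prints the dlmod line for the given machine type (default: uname -m).
# Assumption: anything other than x86_64 uses the 32-bit library path.
cmax_dlmod_line() {
    arch="${1:-$(uname -m)}"
    case "$arch" in
        x86_64) echo "dlmod cmaX /usr/lib64/libcmaX64.so" ;;
        *)      echo "dlmod cmaX /usr/lib/libcmaX.so" ;;
    esac
}

# Example: append the line only if no dlmod cmaX line exists yet.
# grep -q '^dlmod cmaX' /etc/snmp/snmpd.conf || cmax_dlmod_line >> /etc/snmp/snmpd.conf
```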

Configure the SNMP v3 user (run this while snmpd is stopped): net-snmp-config --create-snmpv3-user -ro -A AuthPasswd -X PrivPasswd -x AES -a SHA snmpuser

Restart the SNMP service: service snmpd restart

Verify you get a response from snmpwalk from your Nagios monitoring server:

snmpwalk -l authPriv -u "snmpuser" -X "PrivPasswd" -A "AuthPasswd" -a SHA -x AES -v 3 HOST_IP 1.3.6.1.4.1.232.6.2.6.8.1.3 (HP-specific OID)
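
If you need to repeat this against several hosts, the call can be wrapped in a small function. This is a sketch, not part of any HP or net-snmp tooling: the function name and argument order are my own, and the default OID is the HP-specific one used above.

```shell
#!/bin/sh
# hp_snmpwalk_v3 HOST USER AUTHPASS PRIVPASS [OID]
# Runs snmpwalk with SNMP v3 authPriv (SHA for auth, AES for privacy).
# Hypothetical wrapper; OID defaults to the HP-specific subtree used above.
hp_snmpwalk_v3() {
    host="$1"; user="$2"; authpass="$3"; privpass="$4"
    oid="${5:-1.3.6.1.4.1.232.6.2.6.8.1.3}"
    snmpwalk -v 3 -l authPriv -u "$user" \
        -a SHA -A "$authpass" -x AES -X "$privpass" \
        "$host" "$oid"
}

# Example (host address is a placeholder):
# hp_snmpwalk_v3 192.0.2.10 snmpuser AuthPasswd PrivPasswd
```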

Add Nagios Service Check

The next step is a Nagios check that accepts SNMP version 3 parameters. I downloaded check_hpasm from Nagios Exchange and installed it on my Nagios server. From the command line, I made sure it worked by calling check_hpasm with the following options: ./check_hpasm -H HOSTNAME/IP -P 3 --username snmpuser --authpassword snmppassword

This should work with the correct username and password.

Add this to your commands.cfg file:

# 'check_hpasm_v3' command definition
define command {
        command_name    check_hpasm_v3
        command_line    $USER1$/check_hpasm -H $HOSTADDRESS$ -P 3 --username $ARG1$ --authpassword $ARG2$
        }


Add this to a service check .cfg file:

define service{
        use                      linux-service
        host_name                g05
        service_description      HPASM SNMP v3 Check
        check_command            check_hpasm_v3!snmpuser!AuthPasswd
        normal_check_interval    5
        retry_check_interval     1
        }
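
With more than a handful of HP hosts, typing the same service block repeatedly gets tedious. The sketch below prints one block per host using the same values as the definition above. The function name, the second host name (g06), and the output file name are placeholders of mine.

```shell
#!/bin/sh
# emit_hpasm_service HOST [USER] [AUTHPASS]
# Prints a Nagios service definition for the check_hpasm_v3 command.
# Hypothetical helper; defaults mirror the example definition above.
emit_hpasm_service() {
    cat <<EOF
define service{
        use                      linux-service
        host_name                $1
        service_description      HPASM SNMP v3 Check
        check_command            check_hpasm_v3!${2:-snmpuser}!${3:-AuthPasswd}
        normal_check_interval    5
        retry_check_interval     1
        }
EOF
}

# Example: one block per host, written to a placeholder file name.
# for h in g05 g06; do emit_hpasm_service "$h"; done > hpasm-services.cfg
```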

Make sure to run a Nagios configuration check before restarting.
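
The configuration check is nagios -v followed by the path to the main config file. A minimal sketch of tying the restart to that check follows. The wrapper name is made up, and the paths in the example assume the source-install defaults under /usr/local/nagios, so adjust them to your layout.

```shell
#!/bin/sh
# nagios_verify_then NAGIOS_BIN NAGIOS_CFG COMMAND [ARGS...]
# Runs "NAGIOS_BIN -v NAGIOS_CFG" and executes COMMAND only if it succeeds.
# Hypothetical wrapper name, for illustration only.
nagios_verify_then() {
    nagios_bin="$1"; nagios_cfg="$2"; shift 2
    if "$nagios_bin" -v "$nagios_cfg" >/dev/null; then
        "$@"
    else
        echo "Nagios config check failed; not running: $*" >&2
        return 1
    fi
}

# Example, assuming the default source-install paths:
# nagios_verify_then /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg service nagios restart
```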


Nagios on OpenSuSE

Followed the instructions listed here: http://nagios.sourceforge.net/docs/3_0/quickstart-opensuse.html

Here are the changes I made to get Nagios to play with Apache2 installed via source.

#1 Make sure to add the web server user to the nagcmd group. On my system that user was not wwwrun.

#2 I created a symbolic link from the directory nagios was extracted to as follows: ln -s nagios-X.X.X nagios

#3 When I ran ./configure --with-command-group=nagcmd, the configuration summary showed the Apache2 conf.d directory as /etc/apache2/conf.d. Since Apache was installed from source, it was actually located elsewhere on the system. I accepted the defaults and ran make all anyway.

Then I continued with make install, make install-init, make install-commandmode, and make install-config.

In addition, you can run make install-webconf, which puts the Apache nagios.conf in the Apache conf.d directory. You will want to copy it to the location of your Apache extra configuration files.

I actually had multiple virtual hosts on my server, so I had to copy the nagios.conf text into a subsection of a virtual host and then reload Apache: service httpd reload

I configured Nagios according to the documentation, started it, and added it to the boot sequence with chkconfig --add nagios
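
For step #1, when Apache was built from source, the account it runs as is whatever the User directive in its httpd.conf says. The sketch below reads that directive. The function name is hypothetical, and it assumes a single top-level User line rather than one set through an include or a variable.

```shell
#!/bin/sh
# apache_run_user CONF
# Prints the value of the first uncommented "User" directive in CONF.
# Hypothetical helper; does not follow Include directives.
apache_run_user() {
    awk 'tolower($1) == "user" { print $2; exit }' "$1"
}

# Example: show which groups that account currently belongs to
# (the httpd.conf path is a placeholder for your source install).
# id -Gn "$(apache_run_user /path/to/httpd.conf)"
```

If nagcmd is missing from the output, add the account to that group as described in the quickstart, substituting your user for wwwrun.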


Add Nagios Power Supply Check for PowerEdge 770N

1. Download the Nagios Power Supply check for PowerEdge 1850 from Nagios Exchange:

http://exchange.nagios.org/directory/Plugins/Hardware/Server-Hardware/Dell/Check-Dell-server-power-supplies-status/details

wget "http://exchange.nagios.org/components/com_mtree/attachment.php?link_id=437&cf_id=24"

2. The command line options and instructions on how to implement the check are listed on the plugin site.

3. For each server, I needed to experiment with the command line options. For most servers, -n 2 -c 2 (2 power supplies and 2 CPUs) worked just fine, but for some servers I needed to change those values. For one server in particular, I had to omit the options entirely.

4. This got the Nagios checks working for all PowerEdge 1850s. But we had some 770N, 2650, and 850 servers in our environment as well.

5. First make a copy of the script so you can change the OIDs as appropriate.

6. I downloaded the MIB table from Dell for the PowerEdge 850. Searching through the table, I found a PowerSupplyStatus OID of .1.3.6.1.4.1.674.10892.1.600.12.1.5. In the copied script I changed the power supply OID from .1.3.6.1.4.1.674.10892.1.600.12.1.10.1 to .1.3.6.1.4.1.674.10892.1.600.12.1.5.1

7. Test the script from the command line to find the appropriate values for the number of power supplies (-n) and the number of CPUs (-c). For some hosts I wasn't able to enter a CPU value; the check only worked with a number for the power supplies.

./check_snmp_dell_powersupply_other.pl -H IPAddress -C CommunityString -n 2

./check_snmp_dell_powersupply_other.pl -H IPAddress -C CommunityString -n 1 -c 1

8. Add the matching command definition to commands.cfg and edit the powersupply.cfg file to add the service for the new servers.
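
Steps 5 and 6 (copy the script, then swap the power supply OID) can be done in one pass with sed. This is a sketch under a few assumptions: the function name is mine, the original script file name in the example is a guess, and the OID fragment you pass in must appear in the script only where you want it replaced.

```shell
#!/bin/sh
# clone_check_with_oid SRC DEST OLD_OID NEW_OID
# Copies SRC to DEST, replacing every literal OLD_OID with NEW_OID,
# and marks DEST executable. SRC itself is left untouched.
# Hypothetical helper name, for illustration only.
clone_check_with_oid() {
    src="$1"; dest="$2"
    # Escape the dots so sed matches them literally, not as "any character".
    old_re=$(printf '%s' "$3" | sed 's/\./\\./g')
    sed "s/$old_re/$4/g" "$src" > "$dest" && chmod +x "$dest"
}

# Example (the source file name is an assumption; check what you downloaded):
# clone_check_with_oid check_snmp_dell_powersupply.pl \
#     check_snmp_dell_powersupply_other.pl '.600.12.1.10.1' '.600.12.1.5.1'
```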