Nagios mail notifications not being sent out
Posted on 10-28-2006 02:26:00 UTC | Updated on 10-28-2006 02:26:00 UTC
Section: /software/nagios/

Had the strangest problem with Nagios today. I noticed that I was not receiving email notifications when services went down. Nagios would log that it saw the problem and update the webpage correctly, but when it came to sending an email notification I got nothing. Its log said the emails went out, but watching for them on the Nagios machine showed nothing. The log looked like this:

Oct 27 16:02:02 nagios nagios: SERVICE ALERT: mail;SSH;CRITICAL;HARD;5;CRITICAL - Socket timeout after 10 seconds  
Oct 27 16:02:02 nagios nagios: SERVICE NOTIFICATION: tech1;mail;SSH;CRITICAL;notify-by-email;CRITICAL - Socket timeout after 10 seconds  

Postfix had not logged an email going out. Tcpdump showed nothing leaving the machine at the time Nagios claimed to have sent the email. I was confused, to say the least.

Nagios uses regular Unix programs (printf and mail) to send its email. I ran the same command line Nagios uses by hand on the machine and the mail went out fine. I finally broke down and compiled Nagios with ultra (all) debug turned on. The webpage will not work with debug turned on, but the notifications and checks will. When it came time to send the mail this is what I saw:

/tmp/RshkRO1F: Permission denied

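Permissions on /tmp??? WTF? Sure enough, /tmp's permissions were screwed up. It was owned by UID 100 with mode 755 and no sticky bit, so only its owner (and root) could create files in it: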

drwxr-xr-x   5  100 users  4096 Oct 27 15:58 tmp
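
A listing like the one above is easy to miss until something breaks. As a rough sketch, a small check in the style of a Nagios plugin could flag a temp directory whose mode is not 1777. The function name check_tmp_dir is hypothetical, and `stat -c` assumes GNU coreutils (BSD stat uses different options):

```shell
# check_tmp_dir [DIR]: hypothetical helper, not part of Nagios.
# Prints OK when DIR has mode 1777 (world-writable plus sticky bit),
# otherwise prints CRITICAL and returns 2, the plugin exit code for CRITICAL.
check_tmp_dir() {
    dir=${1:-/tmp}
    # GNU stat: %a prints the octal permission bits, e.g. 1777 or 755
    mode=$(stat -c '%a' "$dir") || return 2
    if [ "$mode" = "1777" ]; then
        echo "OK: $dir mode $mode"
    else
        echo "CRITICAL: $dir mode $mode, expected 1777"
        return 2
    fi
}
```

Called with no argument it checks /tmp. It looks only at the mode, not the owner.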

Setting them back to the correct perms (below) fixed the problem right up. Mail could not create its temp file in /tmp, so it never sent the email. Nagios does not seem to check whether the mail command succeeded, so nothing about the failure gets logged anywhere.

chown root:root /tmp
chmod 1777 /tmp
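
Since Nagios does not appear to notice when the mail command fails, one way to avoid a silent failure like this is to wrap the mailer and record its exit status and stderr. This is only a sketch: notify_or_log is a made-up name, not part of Nagios, and it assumes the mailer exits non-zero when it cannot send, which may not hold for every mail implementation:

```shell
# notify_or_log LOGFILE BODY MAILER [ARGS...]
# Hypothetical wrapper: pipes BODY into the mailer the way the
# printf-to-mail notification commands do, and appends a line to
# LOGFILE when the mailer fails.
notify_or_log() {
    logfile=$1
    body=$2
    shift 2
    # Capture the mailer's stderr (e.g. "Permission denied"); discard its stdout.
    if err=$(printf '%b' "$body" | "$@" 2>&1 >/dev/null); then
        return 0
    else
        status=$?
        printf '%s notify failed (exit %s): %s\n' \
            "$(date '+%Y-%m-%d %H:%M:%S')" "$status" "$err" >> "$logfile"
        return "$status"
    fi
}
```

In a notification command the printf-piped-to-mail pair would become something like `notify_or_log /var/log/nagios-notify.log "$body" mail -s "$subject" "$addr"`, where the log path and variables are placeholders and the log file has to be writable by the nagios user.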


Nagios windows client fix by turning off DEP
Posted on 06-11-2005 01:45:00 UTC | Updated on 06-11-2005 01:45:00 UTC
Section: /software/nagios/

I got nailed by Windows Server 2003 SP1's new Data Execution Prevention (DEP) (stack protection) today. I was trying to install the Nagios NS Client program on a server with DEP turned on. When you tried to start the Nagios agent service you would get "System Error 1067 has occurred", which means the process was aborted; Windows says "The process terminated unexpectedly".

To make an exception so that certain programs run without DEP, do the following in W2k3 SP1:

1. Right click "My Computer", then "Properties".
2. Click the "Advanced" tab, then click the "Settings" button under the "Performance" section.
3. Click the "Data Execution Prevention" tab, then click the radio button "Turn on DEP for all programs and services except those I select".
4. Click the "Add" button and add the exe you don't want stack protection for.

That problem was fun to hunt down.


Nagios check_http mod
Posted on 04-22-2005 15:11:00 UTC | Updated on 04-22-2005 15:11:00 UTC
Section: /software/nagios/

I was setting up Nagios to monitor some systems and finally got to the printers. Someone made an HP plugin for Nagios to check a LaserJet's status (toner low, OK, etc.). This was great because we have 11 HP lasers, but we also have 4 Xerox Phaser 7300s. There was of course no plugin to tell us if toner was low and such for this printer. The cool thing is the printer runs a web server with a page where you can see a status of OK, LOW, or Empty for the toner and fusers. I needed to check whether the page says "Empty". If it says "Empty" we need a "Critical" state; if not, we are good and give an "OK" state. Nagios has a plugin that can check for strings on a webpage. I thought, fantastic, I'll just check for "Empty" and if it's there have it give a critical.

The thing is, the check_http program can check for a string on a webpage, but when it finds it, it gives an "OK" response. This positive "OK" response is good when the string you are looking for is supposed to be there, but not good if you want a negative "Critical" response when the string is there. check_http did not have that function; it only gave positive responses to finding the string. I've heard that C is like Perl in some ways, so I figured I should be able to put a "!" in front of the string check in the source code to have it give a negative response if the string is on the page.

Lucky me, C and Perl share many of the same operators and they work the same. I made a new variable, added a new switch at the top for my negative-response string check, slapped in another "if" statement with the negated check, and added the new option to the --help output. Recompiled. And voilà! Works like a charm! Now we can check our Xerox printers and see if they are out of supplies.

So if it were not for Open Source I would not be able to add my own needed features. I've always thought it was cool but it never hit home like this before.

The diff for check_http is below. It was done against version 1.4.3 of Nagios Plugins.

85a86
> char string_noexpect[MAX_INPUT_BUFFER] = "";
172a174
>     {"nostring", required_argument, 0, 'g'},
207c209
<     c = getopt_long (argc, argv, "Vvh46t:c:w:A:k:H:P:T:I:a:e:p:s:R:r:u:f:C:nlLSm:M:N", longopts, &option);
---
>     c = getopt_long (argc, argv, "Vvh46t:c:w:A:k:H:P:T:I:a:e:g:p:s:R:r:u:f:C:nlLSm:M:N", longopts, &option);
327a330,333
>     case 'g': /* string or substring */
>       strncpy (string_noexpect, optarg, MAX_INPUT_BUFFER - 1);
>       string_noexpect[MAX_INPUT_BUFFER - 1] = 0;
>       break;
994a1001,1017
>
>    if (strlen (string_noexpect)) {
>             if (!strstr (page, string_noexpect)) {
>                     printf (_("HTTP OK %s - %.3f second response time %s%s|%s %s\n"),
>                             status_line, elapsed_time,
>                             timestamp, (display_html ? "</A>" : ""),
>                             perfd_time (elapsed_time), perfd_size (pagesize));
>                     exit (STATE_OK);
>             }
>             else {
>                     printf (_("CRITICAL - string found%s|%s %s\n"),
>                             (display_html ? "</A>" : ""),
>                             perfd_time (elapsed_time), perfd_size (pagesize));
>                     exit (STATE_CRITICAL);
>             }
>     }
>
1259a1283,1284
>  -g, --nostring\n\
>    String not to expect in the content\n\
1344c1369
<   printf ("       [-s string] [-l] [-r <regex> | -R <case-insensitive regex>] [-P string]\n");
---
>   printf ("       [-s string] [-g string] [-l] [-r <regex> | -R <case-insensitive regex>] [-P string]\n");
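
The inverted check is easy to try outside the plugin. As a minimal sketch of the same logic as the -g branch in the diff above (OK when the string is absent, CRITICAL when it is present), here it is as a shell function. The name check_no_string is hypothetical, and this is a stand-in for experimenting, not the patched check_http itself:

```shell
# check_no_string PAGE NEEDLE
# Hypothetical helper mirroring the negated strstr() test in the patch:
# returns 0 (OK) when NEEDLE is absent from PAGE, 2 (CRITICAL) when present.
check_no_string() {
    page=$1
    needle=$2
    case $page in
        *"$needle"*)
            # The quoted needle is matched literally, like strstr()
            echo "CRITICAL - string found"
            return 2
            ;;
        *)
            echo "HTTP OK - string not found"
            return 0
            ;;
    esac
}
```

For a real printer the page text would come from something like `wget -q -O - http://printer.example/status.html`, where the URL is a placeholder, passed as the first argument with "Empty" as the second.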
