Nagios mail notifications not being sent out
Posted on 10-28-2006 02:26:00 UTC | Updated on 10-28-2006 02:26:00 UTC
Section: /software/nagios/ | Permanent Link

Had the strangest problem with Nagios today. I noticed that I was not receiving email notifications when services went down. Nagios would log that it saw the problem and update the webpage correctly, but when it came to sending an email notification I got nothing. It logged in its log that the emails went out, but when I watched for them on the Nagios machine I saw nothing. The log looked like this:

Oct 27 16:02:02 nagios nagios: SERVICE ALERT: mail;SSH;CRITICAL;HARD;5;CRITICAL - Socket timeout after 10 seconds  
Oct 27 16:02:02 nagios nagios: SERVICE NOTIFICATION: tech1;mail;SSH;CRITICAL;notify-by-email;CRITICAL - Socket timeout after 10 seconds  

Postfix had not logged an email going out. Tcpdump showed no email leaving the machine at the time Nagios claimed to have sent it. I was confused, to say the least.

Nagios uses regular Unix programs (printf and mail) to send its email. I ran the command line Nagios uses to send mail by hand on the machine and it went out fine. I finally broke down and compiled Nagios with ultra (all) debug turned on. The webpage will not work with debug turned on, but the notifications and checks will. When it came time to send the mail, this is what I saw:

/tmp/RshkRO1F: Permission denied

Permissions on /tmp ??? WTF? Sure enough /tmp's permissions were screwed up. Showing:

drwxr-xr-x   5  100 users  4096 Oct 27 15:58 tmp
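
For comparison, a healthy /tmp is world-writable with the sticky bit set, which "ls -ld" shows as drwxrwxrwt. A minimal sketch of a check for that, assuming a POSIX shell with ls and awk; check_tmp_mode is a made-up helper name, not something Nagios ships:

```shell
#!/bin/sh
# check_tmp_mode is a hypothetical helper (not part of Nagios).
# It succeeds only if the directory is world-writable with the sticky bit.
check_tmp_mode() {
    dir=${1:-/tmp}
    # First field of "ls -ld" is the mode string, e.g. drwxrwxrwt
    mode=$(ls -ld -- "$dir" | awk '{print $1}')
    case $mode in
        # Trailing * tolerates the "." or "+" some systems append
        drwxrwxrwt*)
            echo "OK: $dir is $mode"
            return 0 ;;
        *)
            echo "PROBLEM: $dir is $mode, expected drwxrwxrwt"
            return 1 ;;
    esac
}
```

Run against the listing above, "check_tmp_mode /tmp" would take the PROBLEM branch, since drwxr-xr-x lacks both the group/other write bits and the sticky bit. The sketch only looks at the mode; ownership is not checked.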

So setting them back to the correct perms (below) fixed the problem right up. Mail could not create a temp file to send out its email. Nagios does not seem to check if the mail went out correctly so you end up with nothing being logged anywhere.

chown root:root /tmp
chmod 1777 /tmp
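
Since Nagios does not seem to notice when the mail command fails, one way to avoid this kind of silent failure is to run the notification command through a small wrapper that logs any error. This is only a sketch: run_and_log and the notify-wrapper tag are made-up names, and it assumes the failing command either exits nonzero or writes to stderr, and that the standard logger utility is available.

```shell
#!/bin/sh
# run_and_log is a hypothetical wrapper (not part of Nagios).
# It runs the given command and sends any failure to syslog.
run_and_log() {
    # Capture stderr only and discard stdout. No temp file is used,
    # because an unwritable /tmp may be the very thing that is broken.
    err=$("$@" 2>&1 >/dev/null)
    status=$?
    if [ "$status" -ne 0 ] || [ -n "$err" ]; then
        logger -t notify-wrapper "'$*' exited $status: $err"
    fi
    return "$status"
}
```

Used as something like "printf 'test\n' | run_and_log mail -s test someone@example.com" (the address is a placeholder), an error such as the "Permission denied" above should end up in syslog instead of nowhere.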


Nagios windows client fix by turning off DEP
Posted on 06-11-2005 01:45:00 UTC | Updated on 06-11-2005 01:45:00 UTC
Section: /software/nagios/ | Permanent Link

I got nailed by Windows Server 2003 SP1's new Data Execution Prevention (DEP) (stack protection) today. I was trying to install the Nagios NS Client program on a server with DEP turned on. When I tried to start the Nagios agent service I got "System Error 1067 has occurred", which means the process was aborted; Windows says "The process terminated unexpectedly". To make an exception so certain programs can run without DEP, do the following in W2k3 SP1: Right-click "My Computer", then "Properties". Click the "Advanced" tab, then click the "Settings" button under the "Performance" section. Click the "Data Execution Prevention" tab, then select the radio button "Turn on DEP for all programs and services except those I select". Then click the "Add" button and add the exe you don't want stack protection for. That problem was fun to hunt down.


Nagios check_http mod
Posted on 04-22-2005 15:11:00 UTC | Updated on 04-22-2005 15:11:00 UTC
Section: /software/nagios/ | Permanent Link

I was setting up Nagios to monitor some systems and finally got to the printers. Someone made an HP plugin for Nagios to check a LaserJet's status (toner low, OK, etc.). This was great because we have 11 HP lasers, but we also have 4 Xerox Phaser 7300s. There was of course no plugin to tell us if toner was low and such for this printer. The cool thing is that the printer runs a web server with a page where you can see a status of OK, LOW, or Empty for the toner and fusers. I needed to check if the page says "Empty". If it says "Empty" we need a "Critical" state; if not, we are good and give an "OK" state. Nagios has a plugin that can check for strings on a webpage. I thought, fantastic, I'll just check for "Empty" and if it's there have it give a critical.

The thing is, the check_http program can check for a string on a webpage, but when it finds it, it gives an "OK" response. This positive "OK" response is good when the string you are looking for is supposed to be there, but not good if you want a negative "Critical" response when the string is there. check_http did not have that function; it only gave positive responses to finding the string. I've heard that C is like Perl in some ways, so I figured I should be able to put a "!" in front of the string check in the source code to have it give a negative response if the string is on the page.

Lucky for me, C and Perl share many of the same operators and they work the same. I made a new variable, added a new switch at the top for my negative response string check, slapped in another "if" statement with the negative check, and added the new option's description to the --help output. Recompiled. And voilà! Works like a charm! Now we can check our Xerox printers and see if they are out of supplies.

So if it were not for Open Source I would not be able to add my own needed features. I've always thought it was cool but it never hit home like this before.

The diff for the check_http is below. It was done on version 1.4.3 of Nagios Plugins.

85a86
> char string_noexpect[MAX_INPUT_BUFFER] = "";
172a174
>     {"nostring", required_argument, 0, 'g'},
207c209
<     c = getopt_long (argc, argv, "Vvh46t:c:w:A:k:H:P:T:I:a:e:p:s:R:r:u:f:C:nlLSm:M:N", longopts, &option);
---
>     c = getopt_long (argc, argv, "Vvh46t:c:w:A:k:H:P:T:I:a:e:g:p:s:R:r:u:f:C:nlLSm:M:N", longopts, &option);
327a330,333
>     case 'g': /* string or substring */
>       strncpy (string_noexpect, optarg, MAX_INPUT_BUFFER - 1);
>       string_noexpect[MAX_INPUT_BUFFER - 1] = 0;
>       break;
994a1001,1017
>
>    if (strlen (string_noexpect)) {
>             if (!strstr (page, string_noexpect)) {
>                     printf (_("HTTP OK %s - %.3f second response time %s%s|%s %s\n"),
>                             status_line, elapsed_time,
>                             timestamp, (display_html ? "</A>" : ""),
>                             perfd_time (elapsed_time), perfd_size (pagesize));
>                     exit (STATE_OK);
>             }
>             else {
>                     printf (_("CRITICAL - string found%s|%s %s\n"),
>                             (display_html ? "</A>" : ""),
>                             perfd_time (elapsed_time), perfd_size (pagesize));
>                     exit (STATE_CRITICAL);
>             }
>     }
>
1259a1283,1284
>  -g, --nostring\n\
>    String not to expect in the content\n\
1344c1369
<   printf ("       [-s string] [-l] [-r <regex> | -R <case-insensitive regex>] [-P string]\n");
---
>   printf ("       [-s string] [-g string] [-l] [-r <regex> | -R <case-insensitive regex>] [-P string]\n");
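
Stripped of the check_http plumbing, the logic the patch adds is small: go CRITICAL if the unwanted string is on the page, otherwise OK. A rough shell sketch of the same idea, not a replacement for the patched plugin; check_no_string is a made-up name, and 0 and 2 are the standard Nagios plugin exit codes for OK and CRITICAL:

```shell
#!/bin/sh
# Standard Nagios plugin exit codes used here.
STATE_OK=0
STATE_CRITICAL=2

# check_no_string is a hypothetical helper: it reads a page body on stdin
# and goes CRITICAL when the unwanted string is present.
check_no_string() {
    unwanted=$1
    # -F: match a fixed string (like strstr), -q: exit status only
    if grep -F -q -- "$unwanted"; then
        echo "CRITICAL - string found"
        return "$STATE_CRITICAL"
    fi
    echo "HTTP OK - string not found"
    return "$STATE_OK"
}
```

For example, "curl -s http://printer.example/status.html | check_no_string Empty" would return 2 when the page contains "Empty". The URL is a placeholder and curl is just one way to fetch the page.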
