I’ve had Nagios running for years, so I decided to play around with the alerts.
Twitter seemed the obvious choice: it’s easy for people to follow the Twitter account that’s publishing the alerts, and great if you actively use Twitter (I don’t; this was more of a ‘how would you’ than a need).
The first thing is to register a Twitter account that Nagios will publish as. I set up https://twitter.com/NagiosStarB
Once you’ve registered, edit your profile and add a mobile phone number. (This is needed before you can change the app permissions later. Once you’ve changed them, you can delete the mobile number.)
Now head over to https://apps.twitter.com/ and create a new app
Fill in Name, Description and Website (This isn’t particularly important, as we’re not pushing this app out to users).
You’ll be taken straight into the new app (if not simply click on it).
We need to change the Access Level. Click on ‘modify app permissions’.
I chose ‘Read, Write and Access direct messages’, although ‘Read and Write’ would be fine. Click ‘Update settings’. (If you didn’t add your mobile number to your account earlier, you’ll get an error.)
Now click ‘API Keys’
You need to copy the API key and API secret (please don’t try to use mine).
Now click ‘create my access token’ close to the bottom of the page.
You also need to copy your ‘Access token’ and ‘Access token secret’.
Now we move onto the notification script.
Login to your Nagios server via SSH.
You need to ensure you have python-dev & python-pip installed.
apt-get install python-dev python-pip
pip install tweepy
Then cd into your Nagios libexec folder (mine’s at /usr/local/nagios/libexec)
cd /usr/local/nagios/libexec/
We now add a new file called twitternagiosstatus.py
nano -w twitternagiosstatus.py
Copy and paste the following code into the file
#!/usr/bin/env python2.7
# tweet.py by Alex Eames http://raspi.tv/?p=5908
import tweepy
import sys
import logging
# Setup Debug Logging
logging.basicConfig(filename='/tmp/twitternagios.log',level=logging.DEBUG)
logging.debug('Starting Debug Log')
# Consumer keys and access tokens, used for OAuth
consumer_key = 'jNgRhCGx7NzZn1Cr01mucA'
consumer_secret = 'nTUDfUo0jH2oYyG8i6qdyrQXfwQ6QXT7dwjVykrWho'
access_token = '2360118330-HP5bbGQgTw5F1UIN3qOjdtvqp1ZkhxlHroiETIQ'
access_token_secret = 'rXjXwfoGGNKibKfXHw9YYL927kCBQiQL58Br0qMdaI5tB'
# OAuth process, using the keys and tokens
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
# Creation of the actual interface, using authentication
api = tweepy.API(auth)
if len(sys.argv) >= 2:
    tweet_text = sys.argv[1]
    logging.debug('Argument #1 ' + tweet_text)
    if len(tweet_text) <= 140:
        logging.debug('Tweeting: ' + tweet_text)
        api.update_status(tweet_text)
    else:
        print "tweet sent truncated. Too long. 140 chars Max."
        logging.debug('Too Long. Tweet sent truncated.')
        api.update_status(tweet_text[0:140])
Replace consumer_key with your API key, consumer_secret with your API secret, access_token with your access token and access_token_secret with your Access token secret.
Now save and exit the editor.
CTRL+x then Y then Enter.
With the file saved, we need to make it executable.
chmod +x twitternagiosstatus.py
You can now test that the script works by typing
./twitternagiosstatus.py "testy testy"
You should now be able to see the Tweet on your new account (you may need to refresh the page).
If all has gone well so far, you can now add your Nagios Configuration.
Change directory into your Nagios etc folder
cd /usr/local/nagios/etc/
Edit your commands.cfg (mine is inside objects)
nano -w objects/commands.cfg
Where you place the new configuration doesn’t really matter, but to keep things in order I chose just below the email commands.
Copy and paste the following
# 'notify-host-by-twitter' command definition
define command{
    command_name notify-host-by-twitter
    command_line /usr/local/nagios/libexec/twitternagiosstatus.py "$NOTIFICATIONTYPE$: $HOSTALIAS$ is $HOSTSTATE$"
    }
# 'notify-service-by-twitter' command definition
define command{
    command_name notify-service-by-twitter
    command_line /usr/local/nagios/libexec/twitternagiosstatus.py "$NOTIFICATIONTYPE$: $SERVICEDESC$ ON $HOSTALIAS$ is $SERVICESTATE$"
    }
You can adjust the specifics by adding other $MACRO$ arguments (use the email notification commands as an example). Save and exit.
CTRL+x, then Y, then ENTER
Now we add a new contact. Edit contacts.cfg
nano -w objects/contacts.cfg
Copy and Paste the following
define contact{
    contact_name nagios-twitter
    alias Nagios Twitter
    service_notification_period 24x7
    host_notification_period 24x7
    service_notification_options w,u,c,r,f
    host_notification_options d,u,r,f,s
    service_notification_commands notify-service-by-twitter
    host_notification_commands notify-host-by-twitter
    }
define contactgroup{
    contactgroup_name nagiostwitter
    alias Nagios Twitter Notifications
    members nagios-twitter
    }
I decided to create a specific contact and contact group for this, but you can adjust this as you wish, for example by adding the contact to other contact groups.
Now the last bit.
Add the new contact group to the hosts and services, templates, or host-groups and service-groups.
How you do this will depend on how you’ve set out your hosts, services, templates and contacts. I edited each of the host files and added contact_groups nagiostwitter to each host and service.
(IMPORTANT: this will override settings that are inherited from templates, so if you already have email notifications active you’ll either have to add nagiostwitter to the template instead, or add your existing contact groups to this line.) Don’t forget that multiple contact groups are comma-delimited.
An example host of mine
define host{
    use linux-server ; Name of host template
    ; This host definition
    ; in (or inherited by)
    host_name excalibur
    alias Excalibur
    address 192.168.1.27
    parents switch-netgear8
    hostgroups linux-servers
    statusmap_image linux40.gd2
    contact_groups nagiostwitter,sysadm
    }
An example service on this host
define service{
    use generic-service ; Name of servi
    host_name excalibur
    service_description PING
    check_command check_ping!100.0,20%!500.0,60%
    contact_groups nagiostwitter,sysadm
    }
That’s it. If everything has been done right, you can now restart the Nagios service.
/etc/init.d/nagios restart
Now your Twitter feed will start to be populated with each alert. I can’t emphasise enough that if the Nagios configuration is done wrong you may break other alerts that are already set up.
I really need to thank http://raspi.tv/2013/how-to-create-a-twitter-app-on-the-raspberry-pi-with-python-tweepy-part-1#install as I used it as a starting point.
UPDATE:
A few weeks ago I received an email from twitter telling me my application had been blocked for write operations. It also said to check the Twitter API Terms of Service. I didn’t think this would cause a problem, I’m not spamming anyone other than myself or users I’ve asked to follow the alerts. So I read the Terms of Service, and it’s all fine. I raised a support request with Twitter and had a very quick response saying “Twitter has automated systems that find and disable abusive API keys in bulk. Unfortunately, it looks like your application got caught up in one of these spam groups by mistake. We have reactivated the API key and we apologize for the inconvenience.”
This did stop my alerts for a few days though, so just be aware of it.
UPDATE 2:
Thanks to Claudio for a comment suggesting that messages over 140 characters be truncated rather than dropped. I’ve incorporated this recommendation into the code above.
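The truncation rule can also be pulled out into a small function so it can be checked without posting anything to Twitter. This is only a sketch: truncate_tweet and TWEET_LIMIT are names I've made up for illustration, they are not part of the script above or of tweepy.

```python
# Hypothetical helper, not part of the original script.
# TWEET_LIMIT mirrors the 140-character limit the script checks against.
TWEET_LIMIT = 140


def truncate_tweet(text, limit=TWEET_LIMIT):
    """Return (tweet, was_truncated), cutting text to at most `limit` characters."""
    if len(text) <= limit:
        return text, False
    return text[:limit], True


if __name__ == "__main__":
    # A deliberately long alert, built by repeating a sample message.
    tweet, was_truncated = truncate_tweet("PROBLEM: PING ON Excalibur is CRITICAL " * 5)
    print(len(tweet), was_truncated)
```

The script could then log the was_truncated flag and pass the returned text to api.update_status, in place of the if/else.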
Hello, I wanted to thank you for this thread. I was looking for such a script and didn’t find any that worked with the new API security system of Twitter. By the way, there is a very small mistake in the python script:
if len(sys.argv) >= 2:
and not
if len(sys.argv) >= 2;
Well spotted, I’ve amended. I actually had : on my system, I vaguely remember accidentally deleting text after pasting so I think I only messed it here, that’s why my system hadn’t broken.
Anyways thanks for letting me know, and I’m glad it’s of some help to someone.
This is awesome, thank you so much. Question: I’ve begun to get these “tweepy.error.TweepError: [{u'message': u'Status is a duplicate.', u'code': 187}]” errors when Nagios continues to report the exact same warning (which is normal in my situation). Am I missing something? Otherwise the script works fine.
I chose to ignore that as I’m not really interested in seeing the same alert twice in a row. This shouldn’t show up if you have a few different alerts being tweeted one after the other.
However, it may be needed if you only have one alert and need to know it’s still a problem over time. I’d suggest including something unique at the start of the tweet, maybe the epoch time, a random code or the Nagios notification ID.
http://nagios.sourceforge.net/docs/3_0/macrolist.html shows the different macros you can use:
$HOSTNOTIFICATIONID$
$SERVICENOTIFICATIONID$
Could be included in the host and service commands, and should stop the warnings.
Hope that’s some help.
MeToo, make sure you check out the update I’ve added to this post. It’s something to be aware of; I’ve been meaning to add it for a week or so.
Thank you for your quick response. BTW I’m running Nagios Core 4.0.6 on Centos 6.5 and abusing snmp :). I use templates throughout (high host volume) and I think I’ve made a mistake somewhere in the config files. I’ve set up two hosts w/services to use twitter over email (as a test) and one of them is doing it right, the other gives me the above error…I’m comparing configs. The epoch time tip is brilliant! I don’t script enough to write my own so thanks.
It’s always fun trying to work out what’s wrong.
I used to run nagios in a large organisation, had it linked in to asterisk (it could call out, and place notifications on the helpdesk lines) and sms databases to txt people.
Now I’m just running it at home on a raspberry PI monitoring a bunch of systems and VPN links. I started work the other day on a display board with LED’s, a network diagram in a picture frame with LED’s for the status. I hit a small snag thinking getting json output of the status would be easy, then today I received a nagios update mail saying new core version does json output so I’ll be doing another nagios blog entry if I can get it all working smoothly.
On the error you’re getting though, it does make more sense if you’ve only activated Twitter output for 2 servers. Twitter won’t let you post the same thing twice in succession. In my setup it’s not really a problem: if one thing goes down it affects others, so it generates a few alerts, and the next time around the first alert isn’t the same as the last alert.
$TIMET$ is the epoch macro, so a host command like /usr/local/nagios/libexec/twitternagiosstatus.py "$TIMET$ $NOTIFICATIONTYPE$: $HOSTALIAS$ is $HOSTSTATE$"
should stop your warnings, and you can do a similar one for service. The only problem is that you lose some characters.
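If you'd rather not touch the Nagios command definitions, the same idea could live in the script. A rough sketch, assuming a made-up helper called make_unique (not something tweepy or Nagios provides): it puts the epoch time in front of the message and trims the tail, so the part that makes the tweet unique is never the part that gets cut.

```python
import time

# Same 140-character limit the script above enforces.
TWEET_LIMIT = 140


def make_unique(text, now=None, limit=TWEET_LIMIT):
    """Prefix text with an epoch timestamp, then trim the result to `limit` characters.

    `now` is only a parameter so the function can be tested with a fixed time.
    """
    if now is None:
        now = int(time.time())
    prefix = "%d " % now
    # Slicing after joining keeps the whole prefix and drops only the tail.
    return (prefix + text)[:limit]


if __name__ == "__main__":
    print(make_unique("PROBLEM: Excalibur is DOWN"))
```

A ten-digit epoch plus a space costs eleven characters of the 140, which is the trade-off mentioned above.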
EXACTLY!!! I chose $TIME$ for simplicity though. So that it seems this changing bit of info in the twits makes it different than the previous even though is the same warning just at a different time. All good now, hopefully I won’t get caught up in the spamming myself “bot” but if I do, I’ll do as you instructed and follow up with Twitter. Here are my pretty Twitts:
Service – command_line /usr/local/nagios/libexec/twitternagiosstatus.py "$TIME$ ($HOSTGROUPALIAS$) Service: $SERVICEDESC$ ON $HOSTADDRESS$/$HOSTNAME$ = $SERVICESTATE$"
Host – command_line /usr/local/nagios/libexec/twitternagiosstatus.py "$TIME$ $NOTIFICATIONTYPE$: $HOSTADDRESS$/$HOSTNAME$ is $HOSTSTATE$"
I did notice the asterisk option, but it’s too broad for my “noob-self” so I’m templating ’till I get an expert handle. I’m implementing various scripts to work for my situation (mostly bash and perl) all using snmp OIDs that I’m familiar with, but have no clue how to present in an easy to read format for the average “tech”. Raspberry PI has been on my radar for a while but I thought of it as a media center for the car…prob silly…will shutting the engine/power kill the OS eventually? Probably. You’ve been immensely helpful and your script kicks! Next move…2×50 in TV display of both nagios Map and Twitter for my tech to glance at!
Really glad to be of help.
On the PI side of things, I use them a lot for media centre (rasbmc is superb, I haven’t used my xbox to watch anything since I got this in place). I’ve been wanting to do some kind of project with a PI and my car though. I backed this (https://www.kickstarter.com/projects/1312527055/raspberry-pi-car-power-supply-ignition-switch) some time ago, I got stung a little on import tax but it seemed a good project and really useful. It’s been sat in the box though since I received it.
I used to run nagios on a bunch of displays in the office, but I found the page would sometimes screw up. I used a Windows program called autoit (I think) to write some basic scripts to kill the browser every hour, relaunch it and navigate to the relevant status page. I didn’t script the mouse, just the keyboard presses, as I found that more reliable.
I bet displaying everything on 2x50in TV’s will look superb. I used to have the audio working on page refreshes to alert to a problem, this seemed to break in different versions of IE and would drive a load of people in the office mad playing bugs bunny ‘you’ve got a problem doc’ every 30 seconds. but my answer was always fix the problem and the sound goes away. or at least acknowledge the problem and take some ownership to notify the right people.
Anyway, best of luck with it all. I may dig out some old config and scripts that may be useful, so keep an eye out if you’re interested.
BH
Yeah a lifesaver, thanks. As for PI stuff it’s fascinating…that would solve the problem of a proper shutdown…as for the tax thingy…I’m in the USA. I’ll have to look into it. FYI I’ve been using PLEX since about 2009. Why? Because I had a spare macmini with a small footprint and an apple remote included. XBMC would have been an excellent choice but my PCs are bulky…I’m a server type. Things have radically changed since, and I’ve even considered one of these Androset Mini MK802 with HDMI out to run my PMS…this thing set off all sorts of alarms for me and one of them was “use in the car!!!”, but power issues (bad shutdowns) would soon degrade the OS…no matter which one…this PI project may be the solution for me. If Plex continues to “clam up” their code, I’ll have to move to XBMC. I will take all your tips/scripts into consideration regarding Nagios, right now trying to dig into VLANs, Routes, CISCO and Extreme networks routers (remarkably similar to Cisco’s IOS), email servers, RAID systems, UPSes, etc. If it’s plugged in I want to know. I will keep an eye on all your responses, thank you.
Nice article thanks a lot. Twitter alerts work like a charm.
I’d like to share my little improvement to your python script.
It will send the text message truncated to 140 characters if necessary.
I think for sysadmins having a shorter message might be a safety net.
It is just a one line edit:
[…]
    # else:
    #     print "tweet not sent. Too long. 140 chars Max."
    #     logging.debug('Too Long.')
    else:
        print "tweet sent truncated. Too long. 140 chars Max."
        logging.debug('Too Long. Tweet sent truncated.')
        api.update_status(tweet_text[0:140])
Hi Claudio, nice idea. I’ve incorporated this into the main article.
This is more related to Nagios than your script, but maybe you can help. Running Nagios Core 4.0.6 on Centos 6.6. I've managed to setup a second twitter account. I thought I'd use one for CRITICAL and one for WARNINGS and have TweetDeck running all the time. Now I'm stuck as to how to tell Service/Host which to use? as in if critical/warning tweet or not. object definitions in nagios don't allow for conditionals…maybe I'm approaching this all wrong. Thanks.
Hi,
Sorry for the delay in replying. Haven't looked at the blog in a while, just updating servers and noticed a load of comments to sort through.
I'd have to have a look at it and do some testing, but I believe you could use one of two approaches:-
1) create 2 scripts, and then 2 notification configurations within nagios. 1 setup for critical hosts and services and 1 for warning, unknown, etc.
2) adjust the script to include an if/else statement. If critical set the twitter API to xxx, else set it to yyy.
Approach 2 would probably be the simplest, the one thing I can think of being a problem with both approaches though is the OK alerts. I think lurking somewhere in my memory I remember there being a 'previous state' that could be passed from nagios into the notification command but that would mean changing the nagios configuration and script to account for this variable (if it exists). You'd have to setup 2 further if statements:-
1) if Status OK > If previous-status Warning,etc > API Warning
> If previous-status Critical > API Critical
2) else > If Status Warning, etc > API Warning
> If Status Critical > API Critical
Hopefully this rambling at 6am will make some sense. I'll have to look back at nagios notifications to get a better idea.
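To make approach 2 a bit more concrete, here is a rough, untested-against-Nagios sketch of just the decision. The account labels and the choose_account name are made up for illustration, and treating the host states DOWN and UNREACHABLE as critical is my assumption. Nagios does document $LASTSERVICESTATE$ and $LASTHOSTSTATE$ macros that might serve as the 'previous state' argument, but I haven't checked what they contain at the moment a recovery notification fires.

```python
# Hypothetical account labels; each would map to its own set of Twitter keys.
CRITICAL_ACCOUNT = "critical"
WARNING_ACCOUNT = "warning"

# Assumption: host states DOWN and UNREACHABLE are treated like CRITICAL.
CRITICAL_STATES = {"CRITICAL", "DOWN", "UNREACHABLE"}
RECOVERED_STATES = {"OK", "UP"}


def choose_account(state, previous_state=None):
    """Pick the account for an alert; recoveries follow the state they recover from."""
    state = state.upper()
    if state in RECOVERED_STATES and previous_state:
        # An OK/UP alert goes to whichever account announced the problem.
        state = previous_state.upper()
    if state in CRITICAL_STATES:
        return CRITICAL_ACCOUNT
    # WARNING, UNKNOWN, and recoveries with no known previous state land here.
    return WARNING_ACCOUNT


if __name__ == "__main__":
    print(choose_account("OK", "CRITICAL"))
```

The script would then take the state (and previous state) as extra command-line arguments and use the result to pick which keys to hand to tweepy.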
Oh hey no rush, but thanks for the tips. I'll try and report back.
Hi again,
I noticed that if for some reason the connectivity with twitter is not working, the tweet might not be published. Wouldn't it be nice to have a retry policy implemented in the plugin script?
Claudio
Hi,
I haven't used this for quite a while, but yes, I can see there could be a problem if you lose internet connectivity or hit some other problem where the script can't get to twitter. It's not set up like a mailer daemon to queue the messages and then try again later.
I suppose it wouldn't be all that difficult to make a twitter posting daemon to take and queue messages and keep retrying, then change this script to just pass a message to the queue. But for my setup, if the internet being down is the problem, seeing a load of tweets after it's been fixed wouldn't help me. The tweet times would also be out, which could lead to falsely thinking there's another problem.
Moving a queue away from this script would be really good though, you could then use it for anything else and have nagios monitor the size of the queue.
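Short of a full queue, a simple in-script retry might look something like the sketch below. It isn't what the script above does, and send_with_retry is a made-up name. The send argument stands in for whatever actually posts the tweet (for example api.update_status), and catching the broad Exception is a placeholder you'd want to narrow to tweepy's error class for your version.

```python
import time


def send_with_retry(send, text, attempts=3, delay=5.0, sleep=time.sleep):
    """Call send(text), retrying on failure with a growing pause between tries.

    Returns True as soon as one attempt succeeds, False if all attempts raise.
    `sleep` is injectable so the function can be tested without real waiting.
    """
    for attempt in range(1, attempts + 1):
        try:
            send(text)
            return True
        except Exception:  # placeholder: narrow this to the library's error class
            if attempt < attempts:
                # Wait a little longer after each failure: delay, 2*delay, ...
                sleep(delay * attempt)
    return False


if __name__ == "__main__":
    def fake_send(message):
        # Stand-in for the real posting call, so this runs without network access.
        print("sending: " + message)

    print(send_with_retry(fake_send, "PROBLEM: Excalibur is DOWN"))
```

With the defaults the script waits at most 15 seconds in total. Keep that under Nagios's notification_timeout setting, otherwise Nagios may kill the notification command before the last attempt.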
I've since changed my setup to use LED's to notify me of a problem instead of email and twitter. I'm now using Raspberry PI's and WS2801 LED strips. http://blog.starbyte.co.uk/hyperion-leds-nagios-part-1/
A few years ago I had planned to create a poster with an LED for each server/vpn link/router, it's another project I've just not managed to finish. At the time getting the status out of nagios wasn't easy, but I believe now you can pull xml (I did add a custom field on each server with an LED number ready). I'd really like to implement this with another thing I made in the past, nagios push status to asterisk, which then would call (in this case itself as a PA speaker) and announce the server, service and status. Unfortunately leaving a USB stick in the server and remotely rebooting was a bad bad idea (wiped the whole OS including the asterisk/nagios setup). If I get time to tie that together with a speaker in a picture frame I think it would be cool though.