Nagios Twitter Alerts

I’ve had nagios running for years, so I decided to play around with the alerts.
Twitter seemed the obvious choice: it’s easy for people to follow the twitter account that’s publishing the alerts, and great if you actively use twitter (I don’t, this was more of a ‘how would you’ rather than a need).
First thing is to register a twitter account that nagios will publish as. I set up https://twitter.com/NagiosStarB
Once you’ve registered you need to edit your profile and add a mobile phone number (This is needed before you can change the app permissions later. Once you’ve done that you can delete the mobile number).
Now head over to https://apps.twitter.com/ and create a new app
Fill in Name, Description and Website (This isn’t particularly important, as we’re not pushing this app out to users).
You’ll be taken straight into the new app (if not simply click on it).

We need to change the Access Level, so click on ‘modify app permissions’

I chose ‘Read, Write and Access direct messages’, although ‘Read and Write’ would be fine. Click ‘Update settings’ (If you didn’t add your mobile number to your account earlier, you’ll get an error).
Now click ‘API Keys’

You need to copy the API key and API secret (Please don’t try to use mine).
Now click ‘create my access token’ close to the bottom of the page.

You also need to copy your ‘Access token’ and ‘Access token secret’

Now we move onto the notification script.
Log in to your Nagios server via SSH.
You need to ensure you have python-dev & python-pip installed.

apt-get install python-dev python-pip
pip install tweepy

Then cd into your nagios libexec folder (mine’s at /usr/local/nagios/libexec)

cd /usr/local/nagios/libexec/

We now add a new file called twitternagiosstatus.py

nano -w twitternagiosstatus.py

Copy and paste the following code into the file

#!/usr/bin/env python2.7
# tweet.py by Alex Eames http://raspi.tv/?p=5908
import tweepy
import sys
import logging

# Setup Debug Logging
logging.basicConfig(filename='/tmp/twitternagios.log',level=logging.DEBUG)
logging.debug('Starting Debug Log')

# Consumer keys and access tokens, used for OAuth
consumer_key = 'jNgRhCGx7NzZn1Cr01mucA'
consumer_secret = 'nTUDfUo0jH2oYyG8i6qdyrQXfwQ6QXT7dwjVykrWho'
access_token = '2360118330-HP5bbGQgTw5F1UIN3qOjdtvqp1ZkhxlHroiETIQ'
access_token_secret = 'rXjXwfoGGNKibKfXHw9YYL927kCBQiQL58Br0qMdaI5tB'

# OAuth process, using the keys and tokens
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

# Creation of the actual interface, using authentication
api = tweepy.API(auth)

if len(sys.argv) >= 2:
    tweet_text = sys.argv[1]
    logging.debug('Argument #1 ' + tweet_text)

    if len(tweet_text) <= 140:
        logging.debug('Tweeting: ' + tweet_text)
        api.update_status(tweet_text)
    else:
        print "Tweet too long (140 chars max). Sending truncated."
        logging.debug('Too long. Tweet sent truncated.')
        api.update_status(tweet_text[0:140])
else:
    print "Usage: ./twitternagiosstatus.py \"tweet text\""
    logging.debug('No tweet text supplied.')

Replace consumer_key with your API key, consumer_secret with your API secret, access_token with your access token and access_token_secret with your Access token secret.
Now save and exit the editor.

CTRL+x then Y then Enter.

With the file saved, we need to make it executable.

chmod +x twitternagiosstatus.py

You can now test that the script works by typing

./twitternagiosstatus.py "testy testy"

You should now be able to see the Tweet on your new account (you may need to refresh the page).
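If the tweet doesn’t appear, you can sanity-check the keys with a short throwaway script (a sketch; tweepy’s verify_credentials returns a user object when the keys work and False when they don’t, and the filename is just my choice):

#!/usr/bin/env python2.7
# checkauth.py - quick check that the OAuth details are accepted (sketch)
import tweepy

auth = tweepy.OAuthHandler('YOUR_API_KEY', 'YOUR_API_SECRET')
auth.set_access_token('YOUR_ACCESS_TOKEN', 'YOUR_ACCESS_TOKEN_SECRET')
api = tweepy.API(auth)

# verify_credentials returns a User object if the keys are valid, False otherwise
if api.verify_credentials():
    print "Credentials OK"
else:
    print "Credentials rejected"

Fill in the same four values you used in twitternagiosstatus.py.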
If all has gone well so far, you can now add your Nagios Configuration.
Change directory into your nagios etc folder

cd /usr/local/nagios/etc/

Edit your commands.cfg (mine is inside objects)

nano -w objects/commands.cfg

Where you choose to place the new configurations doesn’t really matter, but to keep things in order I chose just below the email commands.
Copy and paste the following

# 'notify-host-by-twitter' command definition
define command{
        command_name    notify-host-by-twitter
        command_line    /usr/local/nagios/libexec/twitternagiosstatus.py "$NOTIFICATIONTYPE$: $HOSTALIAS$ is $HOSTSTATE$"
}

# 'notify-service-by-twitter' command definition
define command{
        command_name    notify-service-by-twitter
        command_line    /usr/local/nagios/libexec/twitternagiosstatus.py "$NOTIFICATIONTYPE$: $SERVICEDESC$ ON $HOSTALIAS$ is $SERVICESTATE$"
}

You can adjust the specifics by adding other $…$ macros (use the email notification commands as an example); see the sketch below. Save and exit
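For instance, here’s an untested sketch of a more detailed service command; $SERVICEOUTPUT$ and $SHORTDATETIME$ are standard Nagios macros, but bear in mind anything past 140 characters gets truncated by the script:

# 'notify-service-by-twitter-verbose' command definition (sketch)
define command{
        command_name    notify-service-by-twitter-verbose
        command_line    /usr/local/nagios/libexec/twitternagiosstatus.py "$NOTIFICATIONTYPE$: $SERVICEDESC$ ON $HOSTALIAS$ is $SERVICESTATE$ ($SERVICEOUTPUT$) at $SHORTDATETIME$"
}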

CTRL+x, then Y, then ENTER

Now we add a new contact. Edit contacts.cfg

nano -w objects/contacts.cfg

Copy and Paste the following

define contact{
        contact_name                    nagios-twitter
        alias                           Nagios Twitter

        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r,f
        host_notification_options       d,u,r,f,s
        service_notification_commands   notify-service-by-twitter
        host_notification_commands      notify-host-by-twitter
        }

define contactgroup{
        contactgroup_name       nagiostwitter
        alias                   Nagios Twitter Notifications
        members                 nagios-twitter
        }

I decided to create a specific contact and contact group for this, but you can adjust as you wish, e.g. add the contact to other contact groups.
Now the last bit:
Add the new contact group to the hosts & services, templates or host-groups and service-groups.
How you decide to do this will depend on how you’ve set out your hosts, services, templates and contacts. For me, I edited each of the host files and added contact_groups  nagiostwitter to each host and service.
(IMPORTANT: this will override settings that are inherited from templates, so if you already have email notifications active you’ll either have to add nagiostwitter to the template or add your existing contacts alongside it here). Don’t forget the list is comma-delimited.
An example host of mine

define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               excalibur
        alias                   Excalibur
        address                 192.168.1.27
        parents                 switch-netgear8
        hostgroups              linux-servers
        statusmap_image         linux40.gd2
        contact_groups          nagiostwitter,sysadm
        }

An example service on this host

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       excalibur
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        contact_groups                  nagiostwitter,sysadm
        }

That’s it; if all’s been done right you can now restart the nagios service.

/etc/init.d/nagios restart

Now your twitter feed will start to be populated with each alert. I can’t emphasise enough that if the nagios configuration is done wrong you may break other alerts that are already set up.
I really need to thank http://raspi.tv/2013/how-to-create-a-twitter-app-on-the-raspberry-pi-with-python-tweepy-part-1#install as I used it as a starting point.

UPDATE:
A few weeks ago I received an email from twitter telling me my application had been blocked for write operations. It also said to check the Twitter API Terms of Service. I didn’t think this would cause a problem: I’m not spamming anyone other than myself or users who’ve asked to follow the alerts. So I read the Terms of Service, and it’s all fine. I raised a support request with Twitter and had a very quick response saying “Twitter has automated systems that find and disable abusive API keys in bulk. Unfortunately, it looks like your application got caught up in one of these spam groups by mistake. We have reactivated the API key and we apologize for the inconvenience.”
This did stop my alerts for a few days though, so just be aware of this.

UPDATE 2:
Thanks to Claudio for a comment recommending truncating messages over 140 characters. I’ve incorporated this into the code above.

Raspberry PI + RTorrent + Apache2 + RUTorrent

I’ve been using one of my PIs as a torrent server for some time. Recently I decided to refresh the entire system. This will NOT go into the legalities of downloading anything; I expect everyone to only be using this for downloading Raspberry Pi images 🙂

Version Info:
2014-01-07-wheezy-raspbian.img
libtorrent 0.13.2
rtorrent 0.9.2
rutorrent 3.6

I’m going to assume you can SSH to your PI, and recommend you get all the latest updates before you start. I’m also going to be naughty and be running all the commands as root.
sudo su -

Then we’ll get the stuff needed to compile rtorrent and a few things needed for the plugins

apt-get install subversion build-essential automake libtool libcppunit-dev libcurl3-dev libsigc++-2.0-dev libxmlrpc-c-dev unzip unrar-free curl libncurses-dev apache2 php5 php5-cli php5-curl libapache2-mod-scgi mediainfo ffmpeg screen

While you’re waiting you may as well get a coffee. With that all finished, we’re going to grab the rtorrent packages.

mkdir /root/rtorrent
cd /root/rtorrent

wget http://libtorrent.rakshasa.no/downloads/libtorrent-0.13.2.tar.gz
wget http://libtorrent.rakshasa.no/downloads/rtorrent-0.9.2.tar.gz
wget http://dl.bintray.com/novik65/generic/rutorrent-3.6.tar.gz
wget http://dl.bintray.com/novik65/generic/plugins-3.6.tar.gz

tar xvf libtorrent-0.13.2.tar.gz
tar xvf rtorrent-0.9.2.tar.gz
tar xvf rutorrent-3.6.tar.gz
tar xvf plugins-3.6.tar.gz

Now that you’ve got everything extracted it’s time to compile and install libtorrent

cd /root/rtorrent/libtorrent-0.13.2
./autogen.sh
./configure
make
make install

With libtorrent installed it’s time to compile and install rtorrent

cd /root/rtorrent/rtorrent-0.9.2
./autogen.sh
./configure --with-xmlrpc-c
make
make install
ldconfig

Once you’ve reached this bit, we’re finished with the hanging around. We’ll now install rutorrent and its plugins.

cd /root/rtorrent
rm /var/www/index.html
cp -r rutorrent/* /var/www/
cp -r plugins/* /var/www/plugins
chown -R www-data:www-data /var/www
a2enmod scgi
service apache2 restart

We’ll next create a new user account for rtorrent to run as

useradd -m -r rtorrent

Now we switch to the new user account to add the required rtorrent directories and config

su - rtorrent
mkdir .sessions
mkdir complete
mkdir torrents
mkdir watch
nano -w .rtorrent.rc

Copy and Paste the following:-

# This is an example resource file for rTorrent. Copy to
# ~/.rtorrent.rc and enable/modify the options as needed. Remember to
# uncomment the options you wish to enable.

# Maximum and minimum number of peers to connect to per torrent.
#min_peers = 40
#max_peers = 100

# Same as above but for seeding completed torrents (-1 = same as downloading)
#min_peers_seed = 10
#max_peers_seed = 50

# Maximum number of simultaneous uploads per torrent.
#max_uploads = 15

# Global upload and download rate in KiB. "0" for unlimited.
download_rate = 0
upload_rate = 100

# Default directory to save the downloaded torrents.
directory = ~/torrents

# Default session directory. Make sure you don't run multiple instance
# of rtorrent using the same session directory. Perhaps using a
# relative path?
session = ~/.sessions

# Watch a directory for new torrents, and stop those that have been
# deleted.
schedule = watch_directory,5,5,load_start=~/watch/*.torrent
schedule = untied_directory,5,5,stop_untied=~/watch/*.torrent

# Close torrents when diskspace is low.
schedule = low_diskspace,5,10,close_low_diskspace=200M

# Stop torrents when reaching upload ratio in percent,
# when also reaching total upload in bytes, or when
# reaching final upload ratio in percent.
# example: stop at ratio 2.0 with at least 200 MB uploaded, or else ratio 20.0
#schedule = ratio,60,60,"stop_on_ratio=200,200M,2000"
#schedule = ratio,5,5,"stop_on_ratio=1,1M,10"
ratio.enable=
ratio.min.set=1
ratio.max.set=2
ratio.upload.set=1K
system.method.set = group.seeding.ratio.command, d.close=, d.stop=

# Set Schedules
#schedule = throttle_1,00:10:00,24:00:00,download_rate=0
#schedule = throttle_2,07:50:00,24:00:00,download_rate=200

# Stop Seeding When complete
#system.method.set_key = event.download.finished,1close_seeding,d.close=
#system.method.set_key = event.download.finished,2stop_seeding,d.stop=

# The ip address reported to the tracker.
#ip = 127.0.0.1
#ip = rakshasa.no

# The ip address the listening socket and outgoing connections is
# bound to.
##bind = 127.0.0.1
#bind = rakshasa.no

# Port range to use for listening.
port_range = 51515-51520

# Start opening ports at a random position within the port range.
#port_random = no

# Check hash for finished torrents. Might be useful until the bug is
# fixed that causes lack of diskspace not to be properly reported.
#check_hash = no

# Set whether the client should try to connect to UDP trackers.
use_udp_trackers = yes

# Alternative calls to bind and ip that should handle dynamic ip's.
#schedule = ip_tick,0,1800,ip=rakshasa
#schedule = bind_tick,0,1800,bind=rakshasa

# Encryption options, set to none (default) or any combination of the following:
# allow_incoming, try_outgoing, require, require_RC4, enable_retry, prefer_plaintext
#
# The example value allows incoming encrypted connections, starts unencrypted
# outgoing connections but retries with encryption if they fail, preferring
# plaintext to RC4 encryption after the encrypted handshake
#
# encryption = allow_incoming,enable_retry,prefer_plaintext
#encryption = allow_incoming,try_outgoing

# Enable DHT support for trackerless torrents or when all trackers are down.
# May be set to "disable" (completely disable DHT), "off" (do not start DHT),
# "auto" (start and stop DHT as needed), or "on" (start DHT immediately).
# The default is "off". For DHT to work, a session directory must be defined.
#
# dht = auto
dht = off

# UDP port to use for DHT.
#
# dht_port = 6881

# Enable peer exchange (for torrents not marked private)
#
# peer_exchange = yes
peer_exchange = no

#
# Do not modify the following parameters unless you know what you're doing.
#

# Hash read-ahead controls how many MB to request the kernel to read
# ahead. If the value is too low the disk may not be fully utilized,
# while if too high the kernel might not be able to keep the read
# pages in memory thus end up trashing.
#hash_read_ahead = 10

# Interval between attempts to check the hash, in milliseconds.
#hash_interval = 100

# Number of attempts to check the hash while using the mincore status,
# before forcing. Overworked systems might need lower values to get a
# decent hash checking rate.
#hash_max_tries = 10

#Added for rutorrent stuff
encoding_list = UTF-8
#scgi_local = /tmp/rpc.socket
#schedule = chmod,0,0,"execute=chmod,777,/tmp/rpc.socket"
scgi_port = localhost:5000

# Start the plugins when rtorrent starts, not when the page is first opened. If the
# apache service is restarted separately the plugins are likely to be stopped.
# Only really needed for RSS feeds.
execute = {sh,-c,/usr/bin/php /var/www/php/initplugins.php &}

Save and Exit (ctrl+x then y then enter)
We now need to perform a test run of rtorrent.

rtorrent

It should start without any problems. You may get a few warnings inside rtorrent, but it should still be running. To exit, press ctrl+q.
You should now exit the rtorrent user.

exit

Finally we’re going to setup rtorrent to automatically start when the PI is powered up.

nano -w /etc/init.d/rtorrent

Copy and Paste the Following:-

#!/bin/bash

# To start the script automatically at bootup type the following command
# update-rc.d rtorrent defaults

RTUSER=rtorrent
TORRENT=/usr/local/bin/rtorrent

case $1 in
start)
# display to the user what is being started
echo "Starting rtorrent..."
sleep 4
# remove any stale session lock, then start the process and record its pid
rm -f /home/rtorrent/.sessions/rtorrent.lock
start-stop-daemon --start --background --pidfile /var/run/rtorrent.pid --make-pidfile --exec /bin/su -- -c "/usr/bin/screen -dmUS torrent $TORRENT" $RTUSER
## start-stop-daemon --start --background --exec /usr/bin/screen -- -dmUS torrent $TORRENT
# output failure or success (checked immediately, while $? still holds the start-stop-daemon result)
if [[ $? -eq 0 ]]; then
echo "The process started successfully"
else
echo "The process failed to start"
fi
# info on how to interact with the torrent client
echo "To interact with the torrent client, you will need to reattach the screen session with the following command"
echo "screen -r torrent"
;;

stop)
#display that we are stopping the process
echo "Stopping rtorrent"
# stop the process by name
start-stop-daemon --stop --name rtorrent
#output success or failure
if [[ $? -eq 0 ]]; then
echo "The process stopped successfully"
else
echo "The process failed to stop"
fi
;;

*)
# show the options
echo "Usage: {start|stop}"
;;
esac

Save and Exit (ctrl+x then y then enter)
Then run


chmod +x /etc/init.d/rtorrent

update-rc.d rtorrent defaults

And that’s it. You could now start rtorrent using “/etc/init.d/rtorrent start”, but it’s just as easy to reboot and test that the startup script runs. Once you’ve rebooted or started rtorrent you can access the web page at http://{ip-address or name}

Notes:-

This setup is meant to run internally; as such there is no security on the apache setup.
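If you do want some basic protection, a minimal sketch using Apache HTTP basic auth (the htpasswd file path and username here are just examples):

htpasswd -c /etc/apache2/.htpasswd-rutorrent youruser

Then in your Apache config (e.g. the default vhost):

<Directory /var/www>
        AuthType Basic
        AuthName "ruTorrent"
        AuthUserFile /etc/apache2/.htpasswd-rutorrent
        Require valid-user
</Directory>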

Personally I forward ports 51515-51520 on the router onto the PI; this makes a difference in download speed (much quicker), but as it's opening ports it's a security risk, so you'll have to decide whether or not to.

I run this setup behind a vpn using ipredator.se; if there's any demand I'll write up another guide on how to configure that and ensure your traffic is locked to only go over the vpn.

otrs forbidden installer.pl or index.pl

Currently moving an OTRS installation from one server to another. This installation has been running fine for months. I’ll be upgrading from 3.2.6 to 3.3.1 as part of the process, but to keep the migration simple I thought I’d just tar up the current OTRS installation and dump the database, copy them over to the new server, extract the files, and restore the database and permissions.
This all went well, so I configured apache (copied the existing apache config from the old server), restarted the apache server and tried accessing the page. I kept getting ‘Forbidden’ messages, everything pointing towards permissions, so I checked and reran otrs.SetPermissions; still no joy. As nothing seemed to be moving forward I decided to wipe the install and perform a fresh one. I did this but then found that apache just wouldn’t start. This was down to a lot of entries in various files pointing to /opt/otrs/ rather than my installed /usr/local/otrs/. Once I sorted this I was back to encountering ‘Forbidden’ messages again.
After a lot of poking around and searching the net I found:-
http://httpd.apache.org/docs/2.4/upgrading.html
I hadn’t considered that apache might be different between servers; a quick ‘apache2 -v’ on each server confirmed the old server running 2.2 and the new one running 2.4.

So to solve this I had to replace:-
        Order allow,deny
        Allow from all

with

        Require all granted

all through the apache config. After a service apache2 restart I could get to installer.pl, so with that working I’ll be back to extracting the OTRS files and database from the old server.
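For context, a typical OTRS directory stanza ends up looking something like this under 2.4 (a sketch; the path assumes my /usr/local/otrs install):

<Directory "/usr/local/otrs/bin/cgi-bin/">
        AllowOverride None
        Options +ExecCGI -Includes
        Require all granted
</Directory>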
I’ll apologise for any mistypes; I’ve written all this from memory while having a well-deserved coffee.
The old system was running Ubuntu 12.04; the new one is 13.10, I think.

Ubuntu LDAP Authentication (You are required to change your password immediately (password aged)) Part 2

In an earlier post I was encountering password problems when authenticating via OpenLDAP. This was prompting me to change my password when logging onto certain servers, but not all. The change prompt would then disappear after typing the current password and close the putty session.

Having resolved that particular problem I’m left with another. Although the password change is successful I now have to change the password on each login.

When I encountered the first problem a few months back I thought it was to do with the LDAP ACL. I think I was partly right as this is a continuation of that problem and it does look like this will be ACL related.

So, pulling together what information I can, here’s the shadow information:-

root@Exxxxxxxx:~# getent shadow
root:*:::45::::
nobody:*:::::::
{username}:*:::365:::16177:

Using slapcat to pull all the information off ldap, below are the relevant bits:-

shadowMax: 365
shadowExpire: 16177
shadowLastChange: 15921

So it looks like the shadowLastChange isn’t allowed to be viewed. I found someone else recommending that you make shadowLastChange readable by all. Below is the current ACL:-

dn: olcDatabase={1}hdb,cn=config
olcAccess: {0}to attrs=userPassword,shadowLastChange by self write by anonymous auth by dn="cn=admin,dc=domain,dc=local" write by * none
olcAccess: {1}to dn.base="" by * read
olcAccess: {2}to * by self write by dn="cn=admin,dc=domain,dc=local" write by * read

And here is the configuration that’s supposed to work (I say supposed to as I’m writing this while doing it):-

dn: olcDatabase={1}hdb,cn=config
olcAccess: {0}to attrs=userPassword by self write by anonymous auth by dn="cn=admin,dc=domain,dc=local" write by * none
olcAccess: {1}to attrs=shadowLastChange by self write by dn="cn=admin,dc=domain,dc=local" write by * read
olcAccess: {2}to dn.base="" by * read
olcAccess: {3}to * by self write by dn="cn=admin,dc=domain,dc=local" write by * read

I’m not going to address any security concerns on making this field readable, for me it’s minimal.
So how do I change the ACL from the 1st to the 2nd? Make a new text file:-

nano -w auth_new.ldif
dn: olcDatabase={1}hdb,cn=config
changetype: modify
replace: olcAccess
olcAccess: {0}to attrs=userPassword by self write by anonymous auth by dn="cn=admin,dc=domain,dc=local" write by * none
olcAccess: {1}to attrs=shadowLastChange by self write by dn="cn=admin,dc=domain,dc=local" write by * read
olcAccess: {2}to dn.base="" by * read
olcAccess: {3}to * by self write by dn="cn=admin,dc=domain,dc=local" write by * read

Make sure to change the dn to your specific setup. Failure to do so may result in you losing admin access. Useful command:-

ldapsearch -Q -LLL -Y EXTERNAL -H ldapi:/// -b cn=config '(olcAccess=*)' olcAccess
Next you modify the ldap using:-

ldapmodify -Q -Y EXTERNAL -H ldapi:/// -f auth_new.ldif

Now when I checkout the shadow information I get:-

root@Exxxxxxxx:~/ldap# getent shadow
root:*:15797::45::::
nobody:*:::::::
{username}:*:15921::365:::16177:

Now when I login I’m not being prompted to change my password. I’m not entirely sure if this is right or wrong anymore as I’ve been changing my password all night, so I guess I’ll just wait for a few user accounts to expire and check that it does all work.

Update: It does work. I tested it with a user’s account that was having problems logging into one of the servers; they were still prompted for their ldap password and told they must change it. They did that, then closed putty and tried again, logged in with the new password, and weren’t reprompted to change it again.

Ubuntu LDAP Authentication (You are required to change your password immediately (password aged))

Been hitting a problem on one of my servers for a while: when trying to log in, users keep getting prompted to change their password, but putty just closes after they retype their password.

I thought I’d narrowed it down to the ldap option rootbinddn, which I use on some of my servers (those I consider secure). The servers that don’t have rootbinddn set up give the password change prompt; those that have it set just allow login.
I looked at it a few months back but never had the time to really investigate and resolve it. I thought it had something to do with the ldap ACL permissions, i.e. that the user doesn’t have access to the password fields for their own account. However, looking at it today I think I was only partly correct.

If I run login {username} I get the below:-

root@Exxxxxxxxx:~# login {username}
Password:
You are required to change your password immediately (password aged)
Enter login(LDAP) password:

Authentication information cannot be recovered

I hadn’t seen the ‘Authentication information cannot be recovered’ message before, as putty always closes. Checking out this error (I google every error), I found the solution was installing libpam-cracklib:-

apt-get install libpam-cracklib 

So now when I run login {username} I get:-

root@Exxxxxxxxx:~# login {username}
Password:
You are required to change your password immediately (password aged)
Enter login(LDAP) password:
New password:
Retype new password:
LDAP password information changed for {username}
Last login: Sun Aug 4 04:48:45 BST 2013 on pts/1

And a nice bash prompt.
Now onto problem #2: although I can now log in after changing the password, I get the password change prompt on each login. Changing the password does take, as logging in the 2nd time uses the new password. So I think it’s now down to the ldap ACL for shadowLastChange; I’m going to investigate that, and will put anything that corrects it in another post.

Quagga Automatic Restart/Recovery

As I’m sure I’ve posted before, I use OpenVPN and Quagga to build up my network. After recently updating all my Ubuntu servers something strange happened: quagga, which had been pretty rock solid, started screwing up. Previously I’d had the odd problem where a VPN would drop out and somehow block coming back up, so I scripted some VPN checks to confirm each link was up; if not, the script restarts the VPN link that’s down. This has been working fine on each server, and with quagga running, routes around the entire network just keep working. Until, of course, the recent updates on each system that seem to have introduced a fault with quagga. Although quagga remains running, all the routes disappear and just won’t come back. An error does get logged in one of the logs (I’ll try to find what the error was and update here), but the quagga watchdog doesn’t see a problem since everything is still running. So I’ve put together a little script below that checks the routing table, and if there are no entries relating to other networks (not local) then quagga is considered faulty and the script restarts it.

nano -w /usr/sbin/check_quagga_routes.sh
#!/bin/bash
checktime=`date`
echo $checktime : Checking Routing... >> /var/log/connection.info
# Look for /24 routes that aren't on the local interface (eth0);
# these only exist when the VPN/quagga routes are up.
routing=`route -n | grep -i 255.255.255.0 | grep -vi eth0`
if [ -z "$routing" ]
then
# No Routes to VPNs Detected. Restart Quagga
/etc/init.d/quagga restart
# log the restart
echo $checktime : VPN Routes NOT Detected. Restarting Quagga! >> /var/log/connection.info
fi
echo $checktime : Routing Check Complete. >> /var/log/connection.info
exit 0
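Don’t forget to make the script executable, or the cron job will just log errors:

chmod +x /usr/sbin/check_quagga_routes.sh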

The crontab entry is:-

*/1 * * * *   root     /usr/sbin/check_quagga_routes.sh > /dev/null 2>&1

This should mean that I won’t have to manually restart quagga again if the fault occurs. Hopefully whatever has happened in the update will be fixed, but there’s no harm in leaving this in place as far as I can see.

Normally I’d opt for Nagios to run a check and on failure run a handler script, but since all the nagios checks and handlers get run across the VPN, as soon as the routes go down nagios is pretty useless. So this has to be run on each of the servers.

Asterisk UK Caller ID

I’m sure I’ve got an older post giving details of a patch to get caller ID working. I’d used the same patch for about 5 years over the different versions of Asterisk and it always worked, until recently.

A fresh install on one of the servers: applied the patch, made a bunch of test calls, and half caught the Caller ID, half missed it. So I worked on it for a while, and could not get it to reliably detect the Caller ID. I posted onto the forums in case anyone else had used the patch, but got nothing.
What made it worse was this server was 1 of 3 running asterisk with a Digium card on a UK phone line; the other 2 were still working. So in an effort to eliminate possible causes I ended up installing the same version of asterisk and dahdi that was on each of the other 2 servers in turn and running tests, both with the patch and without. Sometimes it looked better than others, catching 8 out of 10 CLIs, but that wasn’t the 10/10 I was getting on this server before the fresh install of the OS.

After a while I gave up; it wasn’t an important server. I’d only put the asterisk card in to track calls coming into the home line, and it didn’t do anything more than log the date and CLI. Everything else was done over SIP on this server, so I left it kind of working.
Over time the logs seemed to fill with unknown CLIs more often.

Anyway, onto this month. Earlier in the month I rebooted my server remotely; when it didn’t come back online I went to investigate why. It was at this point I was almost in tears. I’d stupidly left an SD card in the server that I was imaging, and the server had booted from it. What made this a really, really bad move was that I’d imaged the SD card with the Raspberry Pi OS, and it went off and formatted sda (I presume hard-coded), which unfortunately was my HDD, NOT the SD card. To make things worse my OS on this server was encrypted, so there was little chance of recovering anything at all.
As the server runs a lot of different stuff, I had to rebuild it as quickly as I possibly could.
A few days ago I got onto the asterisk installation, so I ran through installing Dahdi and Asterisk, and changed the configuration based on memory and copying from the other 2 servers. Then I hit the Caller ID problem: out of 8 test calls, 3 CLIs. Now I knew this one had been working 100%, so it had to be something obvious.
Then I remembered seeing something on the “working” server as I was checking its configuration to copy and paste.

nano -w /etc/modprobe.d/dahdi.conf
# You should place any module parameters for your DAHDI modules here
# Example:
#
# options wctdm24xxp latency=6
options wctdm opermode=UK fwringdetect=1 battthresh=4

Not something I would have paid attention to; I only ever remember editing a dahdi.blacklist.conf to stop the card being detected as a NetJet. Anyway, I added the above file and contents, then rebooted. (It should be noted at this point that I hadn’t applied the patch, though I was almost about to.)
After the reboot I started asterisk and ran through test calls. 9 out of 9 calls, all CLIs detected.

So the following day I went and checked the server I was originally having problems with after a fresh install. That didn’t have the line

options wctdm opermode=UK fwringdetect=1 battthresh=4

But it did have the blacklist set up. So I added the above and rebooted. 10 test calls later, all CLIs detected.

I’ve no idea where I got the above from, but I must have stumbled upon it as a replacement fix instead of the patch I’d been using. I say that because the patch files weren’t on the server that had this configuration, and I always leave the patch files in root in case I need them again.

So if you’re having a problem with UK Caller ID it may be worth adding the above.
For info I’m running:-
Server 3:
dahdi-linux-complete-2.6.0+2.6.0
Asterisk 1.8.6.0

Server 2:
dahdi-linux-complete-2.6.1+2.6.1
Asterisk 1.8.6.0

Server 1:
dahdi-linux-complete-2.6.2+2.6.2
asterisk-11.2.1

Server 1 being the newest install, and Server 2 being the one I left with the problem for months.

Hope this helps someone, as UK Caller ID on Asterisk used to be a huge pain with little support, and not something people seemed to cover, because businesses tend to go SIP, IAX or ISDN, eliminating the need to fix missing CLIs.

Nagios Plugins fail to Compile

Getting the error

check_http.c:150:12: warning: ignoring return value of 'asprintf', declared with attribute warn_unused_result [-Wunused-result]

When trying to compile the nagios plugins 1.4.16 on Ubuntu 12.04, it comes down to missing SSL headers.
Running: apt-get install libssl-dev
Then ./configure and make again fixed the problem.

As a side note I was also having problems with NRPE not installing. This one was down to not reading the error: the user and group nagios don’t get created by the NRPE install script. So I just had to add the user and group (see below) and then the install was fine.
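The fix is a couple of standard commands (a sketch; the names are just the defaults the NRPE configure script looks for):

groupadd nagios
useradd -g nagios nagios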

Ubuntu 12.04 Decrypt Drive Remotely

It’s been a while since I set up a new system, and although I had to look into decrypting a drive remotely a few months back when one of my servers refused the key, it’s pretty much been just as long since I really had to set up remote decryption from scratch.

Tonight I’m building up a new system to replace an existing server. The reason is it’s undergone several major distribution updates without a full reinstall, and I think it starts getting messy after a while. Put that together with X refusing to start up and display my CCTV stuff after updates a few months back (see other post), and I really think it’s time for a clean server.

I’m not going to run through all the steps for my install here; it’s pretty common: Ubuntu 12.04 Alternate CD, encrypted root, swap and data running on LVM, and a /boot that’s the first partition on the disk.

Now I’ve got my system up and running, I want to be able to remotely access it while it’s booting to provide the decryption key and then let it continue. I do this on all my systems, and although I’m sure you don’t need all the commands when it’s booting, I use them just to be sure.

I’m making the following assumptions, you will need to adjust accordingly:-
Your machine is already on your network.
You have SSH access to your machine.
You have root privileges.

First thing is to install dropbear and busybox

apt-get install dropbear busybox
Apt will tell you it’s going to remove busybox-static and ubuntu-standard. I personally don’t have any issues with this, but you may wish to search google for what these packages do (or any packages your system says it’s removing) before you continue.

At this point I rebooted my system (you won’t have any remote access yet), purely because I’d already run updates and forgotten to reboot before I started this blog.

When the system was rebooting, pressing escape when being prompted for the decryption password showed me the interface configuration.
I was able to make an SSH connection to the dropbear server, but unable to authenticate. Also, as this was a DHCP IP address, it’s not really much good as a remote recovery system.

Next we need to edit initramfs.conf
nano -w /etc/initramfs-tools/initramfs.conf

Locate the line DEVICE=
and adjust to be DEVICE=eth0
then add a line IP=192.168.yyy.253::192.168.yyy.1:255.255.255.0:daedalus.xxxxxxxxxx.local:eth0

Replacing the yyy with your own network value and xxxxxxx with your own domain.
The IP= is separated into the following options: IP ADDRESS :: GATEWAY : SUBNET : COMPUTERNAME : INTERFACE
The empty option between the IP and the GATEWAY is the server address (used for NFS-root booting); it won’t affect anything left empty here.
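For reference, the full field order from the kernel’s nfsroot documentation is:

IP={client-ip}:{server-ip}:{gw-ip}:{netmask}:{hostname}:{device}:{autoconf}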

Personally I use the address .253 as it’s outside my DHCP scope and not an address I’m using. I also have it set up on my router to forward SSH traffic to the .253 IP. Once the machine has booted it drops the .253 address, so it is only accessible externally while booting.

Save and Exit this file. CTRL+X. Followed by Y to save. Then ENTER to keep the same name.

Now we’re going to add an authorized key. First
cd /etc/initramfs-tools/root/.ssh
then
cat authorized_keys

You’ll see it already has an entry from the dropbear installation. However, we’re going to replace this with a new key. First we must generate the key. On your windows machine open PuTTYgen from your start menu:-

Press the Generate button (I increased the bits to 2048 first).

You will be asked to move the mouse randomly over the blank area until the green bar completes.
Then a key is generated:-

Once your key generation has completed, save the private key (this is what you’ll point Putty at later). How you choose to secure this is up to you; personally I just keep it saved in my documents. It’s not decrypting the hard drive, just getting me access to do so.
With that saved, right-click the public key text and Select All, then Copy.

Now go back to your Putty SSH connection window and nano -w authorized_keys

You’ll see the existing key is already in place. You can delete this entire line if you wish with CTRL+K. If you didn’t delete it, just move to the next line. Once on a free line, simply right-click to Paste.

Nano will have scrolled to the end of the line, so you may only see the key comment. You can now Save and Exit this file. CTRL+X. Followed by Y to save. Then ENTER to keep the same name.

Now that we’ve made adjustments to the boot configuration you need to rebuild the boot files.

Type update-initramfs -u
UPDATE:- you may encounter the error “cp: cannot stat `/lib/x86_64-linux-gnu/libnss_*’: No such file or directory” I cover this in another post http://blog.starbyte.co.uk/cp-cannot-stat-liblibnss_-no-such-file-or-directory/
Once complete you can reboot your system ready to test.
When your system is rebooting and sat prompting for the password, press the Esc key and ensure that the network configuration is correct.

Now we need to connect to the system from our windows machine. Open Putty and start a new connection:-

Click on Data on the left hand side:-

Fill in the Auto-login username as ‘root’. Then expand SSH and select Auth:-

Browse to the private key you saved earlier.
You can either press Open to connect now, or return to the Session screen and Save the session for easy access later (I didn’t).
Once connecting you should be prompted to accept the fingerprint:-

You should now be connected to your server:-

You may wish to use the above to create new keys for each machine you may connect from (Desktop, Laptop, etc) and append them to the authorized_keys file. If one of your systems is then lost you can merely remove the key and regenerate the initramfs.

Now that we have our connection, we need to supply the password. Over the last few years I’ve come across various different methods: some pipe the password into a hook, others kill the script currently asking for the password and manually unlock the drive. The latter method is the one I’ve always used, mainly because I have multiple encrypted drives and unlock each of them manually.
{sidenote: rebooting from within busybox did restart the machine, but left it on the grub selection screen with no countdown. I’ve encountered a halted boot before but didn’t know why. Need to ensure grub always has a default countdown}

I’ve just run through a few quick tests of simply piping the password and it still doesn’t work for me (the usual method is sketched below for reference). So here goes with the 2nd, longer process. I’d like to thank whoever I originally found this from, but I have no idea who it was. Their steps were extremely well written (unlike mine).
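For reference, the piping method usually described is writing the passphrase into the FIFO that the cryptroot script listens on (a sketch; this is the bit that didn’t work for me here):

echo -n "yourpassphrase" > /lib/cryptsetup/passfifo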

First we need to stop the current script that’s looking for the password:-

Type ps | grep -i crypt
This will list the running crypt processes. We’re interested in the script in local-top. Its process number here is 227, so we issue: kill 227

Now that we’ve issued the kill, we need to wait a while for the system to drop to a prompt. Type ps | grep -i wait
As above you will see the wait-for-root script running, and it has a value of 30 (meaning 30 seconds). If you wait 30 seconds and rerun ps | grep -i wait, you will find it’s no longer running.
You can now continue to unlock your drive(s)

Above, you can see where wait-for-root was running, then the following check that it wasn’t, followed by the command I use to unlock my drive.
Type /sbin/cryptsetup luksOpen /dev/zzzz zzzz_crypted
Replace the zzzz with your drive and partition. If you’re unsure you can always double-tab for suggestions.
You won’t get any password feedback, so if you think you’ve made a mistake hold the backspace key for a while then type again.
If your password is accepted you will be returned to a prompt.
At this stage we tell the shell to kill itself:-

Use ps | grep -i sh to find the process number (271 in the above), then kill -9 271
Follow by exit
If all has gone well I normally receive:-

Normally if I haven’t received this, I’ve done something wrong, at which point physical access is required to correct my mistake (or to get someone to reset the machine so I can try again).

At some point I may make a script or 2 to check whether it hasn’t been unlocked within 30 mins and reset. That may give me a bit of resilience if I screw it up. After a few times of unlocking, though, you really do remember the commands. I did save a text file into the root folder /etc/initramfs-tools/root/guide.txt
(Thinking of that now, I’ll just paste that below; again, thanks to whoever originally wrote it, but after a few times I didn’t need it anymore.)

1) run “ps aux” and located the process id for the /scripts/local-top/cryptroot script
2) run “kill -9 pid” replacing pid with the process id you found in step 1
3) run “ps aux” again and look for a wait-for-root script and note the timeout on the command line
4) twiddle your thumbs for that many seconds - what will happen is that script will exit and start an initramfs shell
5) run “/scripts/local-top/cryptroot” and wait for it to prompt for your unlock passphrase
6) enter the unlock passphrase and wait for it to return you to the busybox shell prompt
6.5) Unlock each drive to get a clean boot, sda,sdb,sdc,sdd as sda_crypt,sdb_crypt,sdc_crypt,sdd_crypt
7) run “ps aux” again and locate the process id of “/bin/sh -i”
8) run “kill -9 pid” using the process id you found from step 7

As you can see I added 6.5 to remind me to decrypt the other drives. Doing this means they’re mounted as the system is coming up.
That’s pretty much it. You should now be able to remotely unlock your encrypted drives. I realised towards the end that my internal IP is visible at the top of the putty windows; I only really masked it in the examples to highlight a change. I hope people find this somewhat helpful. Any feedback welcome; I’m now off to start copying data over. And btw, I’ve changed the keys in case anyone worries for me 🙂

UPDATE:-
I’m just running through this on 12.10 and ran into a problem whereby the IP address yyy.253 wasn’t being released. Some searching suggested that network/interfaces will be ignored because of checks in the process that fail. This wasn’t the case for me; I was getting 2 addresses on the interface. The solution was to add:-

pre-up ip addr flush dev eth0

To /etc/network/interfaces to clear eth0 before applying the new IP.
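So the eth0 stanza ends up something like this (a sketch assuming eth0 is configured by DHCP):

auto eth0
iface eth0 inet dhcp
        pre-up ip addr flush dev eth0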

Logwatch pam_unix unmatched entries

Ok this post will need some work to pad it out and make more sense.
I’ve been running logwatch for years, and a few months back I had to reconfigure some of the configuration files after splitting syslog out into multiple files to make them easier to read, i.e. putting all cron stuff into cron.log, bind9 into named.log, etc.
After those simple changes my logwatch email went from a hundred or so lines to thousands, and until now I haven’t had time to look into it.
All the unmatched entries were against the cron log, and all pam_unix stuff as cron goes off running things.

As I didn’t get these before I was a bit confused, but looking around at the configs and services there is a pam_unix.conf in the services. After more changing and fiddling about I was still getting over 7k lines of logwatch email and had no idea why. But tonight, on looking closer at the email, it’s the cron service that’s marking the entries as unmatched, not, as I’d thought, that pam_unix wasn’t going near the file (to be fair it probably isn’t, but that’s not why the lines are being included in the email).

I won’t run through my entire process of narrowing it down; to be fair I couldn’t remember every step I’ve done tonight anyway. The bottom line was modifying the following:-

/usr/share/logwatch/scripts/services/cron

find the lines:

} elsif ($ThisLine =~ /FAILED to authorize user with PAM \(User not known to the underlying authentication module\)/) {
      $PAMAUTHErr++;

underneath insert the lines:-

} elsif ($ThisLine =~ /pam_unix/) {
      $PAMUNIXAUTHErr++;

then search for:-

if ($PAMAUTHErr) {
      printf "\nPAM autentification error: " . $PAMAUTHErr . " time(s)\n";
}

and underneath insert the lines:-

if ($PAMUNIXAUTHErr) {
      printf "\nPAM_UNIX autentification error: " . $PAMUNIXAUTHErr . " time(s)\n";
}

Save the file, and that’s it.
Now instead of having 7k extra lines of pam_unix stuff, I have one line summing up.

As a side problem, I’m now receiving clamav info when I wasn’t before, and I don’t run clamav or have the logfiles mentioned. That’s something to look at tomorrow, but at least the logwatch email is back down to one small scrollable window, so even with the annoying clamav stuff I’m happy to be able to read the logwatch output easily again.

As the top says, this needs some cleaning up on edit; hopefully I’ll get around to it in the next few days.