In the last few weeks I’ve been rebuilding servers & all services. I like to do this every so often to clear out any old junk, and take the opportunity to improve/upgrade systems and documentation.
This time around it’s kind of a big hit though. While I’ve had some services running with HA in mind, most would require some manual intervention to recover. So the idea this time was to complete what I’d previously started.
So far there are 2 main changes I’ve made:
- Move from MySQL Master/Master setup to MariaDB cluster.
- Get redis working as HA (which is why you’re here).
I’ll bore everyone in another post with the issues on MariaDB cluster. But this one concentrates on Redis.
The original setup was 2 redis servers, 1 master and 1 slave, with the PHP session handler configured against a hostname which was listed in /etc/hosts. However this time, as I’m running a MariaDB cluster, it kind of made sense to try out a redis cluster (don’t stop reading yet). After reading lots, I decided on 3 servers, each running a master and 2 slaves. A picture here would probably help but you’ll just have to visualize. 1, 2 & 3 are Servers, Mx is Master of x, Sx is Slave of x. So I had 1M1, 1S2, 1S3, 2S1, 2M2, 2S3, 3S1, 3S2, 3M3.
This worked, in that if server 1 died, its master role was taken over by either 2 or 3, and some nagios checks and handlers brought it back as the master once it was back online. Now I have no idea if this was really a good setup (I didn’t test it for long enough), but one of the problems I encountered was where the PHP sessions ended up. I (wrongly) thought the PHP session handler would work with the cluster to PUT and GET the data, so I gave it all 3 master addresses. From what I’d read on redis, if the server you’re asking doesn’t have the data it will tell you which one does, so I thought it wouldn’t really be a problem if 1M1 went down and the master became 2M1, because the other 2 masters would know and would say the data is over there. In manual testing this worked, but the PHP session handler doesn’t seem to cope with being redirected (and this is also a problem later).
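To give an idea of what "being redirected" means: a cluster node that doesn’t own a key answers with an error of the form MOVED <slot> <host>:<port>, and it is up to the client to reconnect to that address (redis-cli only does this when started with -c). Below is a rough sketch of what a client has to do with that reply. The function name parse_moved is made up for illustration and is not part of redis or the PHP handler.

```shell
#!/bin/bash
# Parse a Redis Cluster redirection reply of the form:
#   MOVED <slot> <host>:<port>
# and print "<host> <port>". Returns non-zero for any other reply.
# parse_moved is a hypothetical helper, for illustration only.
parse_moved() {
    reply="${1#-}"   # strip the leading '-' of a raw error reply, if present
    # Word-split the reply into: MOVED, slot, host:port
    set -- $reply
    [ "$#" -eq 3 ] && [ "$1" = "MOVED" ] || return 1
    # Split host:port on the last colon
    echo "${3%:*} ${3##*:}"
}

parse_moved "MOVED 3999 127.0.0.1:6381"   # prints: 127.0.0.1 6381
```

A client that isn’t cluster aware never does this second step, it just sees the MOVED reply as an error.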
So after seeing this as a problem, I thought maybe a cluster is a bit overkill anyway and simplifying it to 1 Master and 2 Slaves would be fine.
I wiped out the cluster configuration and started again, but I knew I was also going to need sentinel this time to manage which is the master (I’d looked at it before, but went for cluster instead. Time to read up again).
After getting a master up and running and then adding 2 slaves, I pointed PHP sessions to all 3 servers (again a mistake). I was hoping (again) that the handler would be sensible enough to connect to each and, if it’s a slave (read only), detect that it can’t write and move on to the next. It doesn’t. It happily writes errors in the logs for UNKNOWN.
So I need a way for the session handlers to also know which is the current master, and just use this.
My setup is currently 3 MariaDB/redis servers S1, S2 & S3 and 2 Nginx servers N1 & N2.
I decided to install redis-sentinel on all 5, with a quorum of 3. The important bit in my sentinel config is:
sentinel client-reconfig-script mymaster /root/SCRIPTS/redis-reconfigure.sh
and the redis-reconfigure.sh script:
#!/bin/bash

adddate() {
    while IFS= read -r line; do
        echo "$(date) $line"
    done
}

addrecord() {
    echo "## Auto Added REDIS-MASTER ##" >> /etc/hosts
    echo "$1 REDIS-MASTER" >> /etc/hosts
}

deleterecord() {
    sed -i '/REDIS-MASTER/d' /etc/hosts
}

# <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
# $1            $2     $3      $4        $5          $6      $7
if [ "$#" -eq "7" ]; then
    if grep -q "REDIS-MASTER" /etc/hosts; then
        echo "Delete and Re-Add $6 REDIS-MASTER" | adddate >> /var/log/redis/reconfigure.log
        deleterecord
        addrecord "$6"
    else
        echo "Add $6 REDIS-MASTER" | adddate >> /var/log/redis/reconfigure.log
        addrecord "$6"
    fi
fi
Basically this script is run whenever the master changes. I still need to add some further checks to make sure <role> and <state> are valid, but this was done for quick testing.
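Something along these lines is what I have in mind for those checks. Treat it as a sketch rather than what I’m actually running: is_ipv4 and valid_reconfig_args are names I’ve made up, it assumes the new master is always handed over as an IPv4 address, and it assumes <role> is either "leader" or "observer", which is how the example sentinel.conf describes it. I’ve left <state> unchecked because I’m not sure its value is the same across redis versions, so check your own sentinel.conf before relying on it.

```shell
#!/bin/bash
# Hypothetical helpers for validating sentinel's arguments before
# touching /etc/hosts. Not part of redis.

# Succeeds if $1 is a dotted-quad IPv4 address with each octet 0-255.
is_ipv4() {
    case "$1" in
        ""|*[!0-9.]*|.*|*.|*..*) return 1 ;;
    esac
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# Succeeds if the seven arguments look sane:
# <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
valid_reconfig_args() {
    [ "$#" -eq 7 ] || return 1
    # Assumption: role is "leader" or "observer" (see example sentinel.conf)
    case "$2" in
        leader|observer) ;;
        *) return 1 ;;
    esac
    # The new master address must be an IPv4 address
    is_ipv4 "$6" || return 1
    # The new master port must be numeric
    case "$7" in
        ""|*[!0-9]*) return 1 ;;
    esac
    return 0
}
```

In the script above, the "$#" -eq "7" test could then be swapped for valid_reconfig_args "$@", so a bad call never gets as far as editing /etc/hosts.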
I’ve then changed the session path:
session.save_path = "tcp://REDIS-MASTER:6379"
and again in the pool.d/www.conf:
php_admin_value[session.save_path] = "tcp://REDIS-MASTER:6379"
Quite simply I’m back to using a hosts entry which points at the master. But the good thing (which I wasn’t expecting) was that I don’t have to restart php-fpm for it to detect the change in IP address.
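Since everything now hangs off that one hosts entry, it seems worth checking that it agrees with what sentinel thinks. Here is a sketch of a nagios-style check, with some assumptions: the function names are made up, it assumes a sentinel is listening locally on the default port 26379, and it assumes the master name is mymaster as in the config line above. SENTINEL GET-MASTER-ADDR-BY-NAME returns the IP and then the port, so only the first line is used.

```shell
#!/bin/bash
# Hypothetical check: does the REDIS-MASTER hosts entry match sentinel?

# Print the IP mapped to REDIS-MASTER in a hosts file (default /etc/hosts).
hosts_master_ip() {
    awk '$2 == "REDIS-MASTER" { print $1; exit }' "${1:-/etc/hosts}"
}

# Ask the local sentinel for the current master's IP.
# Assumes port 26379 and the master name "mymaster".
sentinel_master_ip() {
    redis-cli -p 26379 sentinel get-master-addr-by-name mymaster | head -n 1
}

# Compare the two. Exit codes follow the nagios convention (0 OK, 2 CRITICAL).
check_hosts_entry() {
    expected=$(sentinel_master_ip)
    actual=$(hosts_master_ip "$1")
    if [ -n "$expected" ] && [ "$expected" = "$actual" ]; then
        echo "OK: REDIS-MASTER is $actual"
        return 0
    fi
    echo "CRITICAL: hosts has '$actual', sentinel reports '$expected'"
    return 2
}
```

Calling check_hosts_entry with no argument checks /etc/hosts. Passing a path is only there so it can be tried against a copy of the file.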
It’s probably not the most elegant way of handling sessions, but it works for me. The whole point in this is to be able to handle spikes in traffic by adding more servers N3, N4, etc. and providing they are sentinels with the script they will each continue to point to the master.
I did think about writing this out as a step by step guide with each configuration I use, but the configs are pretty standard and as it’s a bit more advanced than just installing a single redis to handle sessions, I think the above info should really only be needed by anyone with an idea of what’s where anyway.
I still don’t really get why the redis stuff for PHP doesn’t follow what the redis server tells it, i.e. the data is over there. I suppose it will evolve. If you’re programming up your own use of redis like the big boys then you’ll be programming for that. I feel like I’m doing something wrong or missing something. I’m certain I’m not the only one who uses redis for PHP sessions across multiple servers behind a load balancer, but there is very little I could find googling beyond those that just drop a single redis instance into place. As this becomes a single point of failure, it’s pretty bizarre to me that every guide seems to go this way.
If anyone does read this and actually wants a step by step, let me know in the comments.
Just to finish off I’ll give an idea of my current (it’s always changing) topology.
2 Nginx Load Balancers (1 active, 1 standby)
2 Nginx Web Servers with PHP (running syncthing to keep files in sync, I tried Gluster FS which everyone raves about, but I found it crippling so I’m still looking for a better solution).
3 Database Servers (running MariaDB in cluster setup and Redis in 1 Master, 2 Slaves configuration).
1 Monitoring Server (running nagios, checking pings, disk space, processes, ports, users, ram, cpu, MariaDB cluster status, Redis, that the DB & redis are accessible from the Web Servers, vpns, dns and probably a load more).