GlusterFS – J.Lo's Tech Blog

How I started using WordPress

I should make it clear from the outset, this post isn’t going to be solving anything. I’ve spent about 3 days working on stuff and this is just bugging me.

Let’s go back to around this time last year. A couple of friends and I were working on how to get some money back from a Facebook page one of them had set up a few years prior. I had been working with him on it since a few weeks after he set it up, and we’d pushed around ideas to sell some merchandise alongside it a few times, but never really got anywhere.

He had made a rash decision one night to use an ‘online website creation’ provider to get a site online. From the start I hated it. Not the idea, I think it was about time to get our own site running, but he spent a few weeks tweaking about 8 pages to look really good, using their WYSIWYG editor. Then he wanted me to change a few things in the code that weren’t right. It was an absolute nightmare! I can’t remember the name of the site but I think it had a W E X somewhere in the title.

It was a paid “solution”, and I think cost about £40 per month by the time he’d added a mailing list option to capture email addresses (not actually handle any emails) and a few other bits.

Coming from more of an IT position, my main concern was around load/spikes being handled. There was very little information about how well they could handle this (and I think we found the reason why). We finally put this live posting the link to about 40k followers.

Watching the nice Google Analytics (that I had to add to each page, because they didn’t have a ‘drop in the tracking code’ option), within seconds we hit around 150 hits per second. This continued for a few hours, but 2 problems became apparent:

1. The site was struggling, and without that we would probably have been hitting a higher number. We were getting positive feedback and people understood it was busy, but I still wasn’t happy: we were paying them to handle this and they just weren’t.
2. And this really feeds back into point 1. He’d set up a site that was pretty static! There was nothing to get people coming back for more. Yes, there was a news page that we could update, but other than the mailing list form there was nothing interactive. (So, back to point 1, static content should really have been handled 100x better.)

Anyway we took that for what it was, a basic site with a bit of info and something to get us started.

We already had a ‘Shop coming soon’ page, so the next thing we were trying to figure out was what are we going to sell and how?

Initially we thought t-shirts, and started looking at some of the ways we could do this. 2 main providers seemed to jump out: zazzle and cottonpress (I think, it was a while ago). While both had some good offerings, neither really grabbed us. I can’t remember which, but one of them deleted an image we uploaded for copyright reasons (it was our logo, we had it plastered everywhere, and the account was signed up with a domain name using the same logo), and they wanted us to fill in and snailmail/fax some copyright forms and reupload the logo. Considering we were only seeing what we could do at the time, we decided to drop them as an option. If we have to jump through these hoops with everything we do, we’ll spend more time filling in their paperwork than anything else.

Time went on, visits to the site died down (did I mention lack of content/interactivity), and we still hadn’t sorted out products, a store, a business.

We continued with the facebook page and still poked around ideas on how to get a site/shop running. I spent a good few weeks working with oscommerce (I’d previously used it a little for another project idea, but it never went live) and finally had something to show, a semi working shop front (it had no products).

We discussed that neither of us really had a clue about setting up a proper business. I’m all IT and have no interest in writing business plans or doing business meetings (I should mention that a previous role I was an IT Manager and regularly had to be part of “grown up” meetings, I’m a Tech, I hate people, I hate meetings, give me something broken – I’ll fix it, give me a problem – I’ll work out a solution. But in NO WAY do I want to be taking part in any more business meetings).

A few months later I was helping him move. Another friend of his was also helping. I’d met him once before but didn’t get chatting then. He mentioned he was in his last year of university and was studying business and finance! Just how he hadn’t thought of this before I don’t know, but instantly we knew he was coming on board 🙂

We spent about 3 hours in McDonalds discussing what we had, what we’d like to do, and just how crap we’d been so far. Within this conversation we talked about selling t-shirts/mugs/bags etc. Just like a genie out of a lamp our 3rd comes out with ‘oh my almost father in-law does printing stuff, I know he does mugs. Shall I speak to him?’ Just like a match made in heaven, we suddenly had our missing piece! Someone who should have more of an idea on the business side (or at least know someone to ask) and connections to a printer for the kind of stuff we want to sell. You just couldn’t make it up, he’d known this guy for a few years and never thought to ask him about business stuff.

Things started moving forward, slowly at first but at least they were moving. We met up with “almost” father in-law and went through some designs and processes. We set up a business. I continued work on the shop website and we took down the other one that was costing too much and not really doing anything.

In around October time we were set. Nothing spectacular, about 9 mugs and a few t-shirts. The mugs would be the easiest, we just send the order to “almost” and he takes care of printing, packaging and sending. The t-shirts would be a little different as we’d have to get a template made for them, and couldn’t afford the cost until we had some orders in.

We launched, I had tried to over-spec the server(s) but in itself this was tricky. There were no real stats on how well oscommerce could perform on certain hardware. And scalable VPS’s such as DigitalOcean’s current system just didn’t exist. Scaling would mean taking a new 30 day server and moving everything over to it. Certainly not a 5 min job, and definitely not something to start an hour after we’ve launched. We’ll just have to bite the bullet and see.

My memory of launch night is fuzzy to say the least. I think I’d been working 36-48 hours trying to finish stuff off. I had a big list of checks and can’t remember doing half of them.

Our page audience had grown to about 70k, so I was very nervous. We launched the shop and watched. ping, email – it’s an order, ping, another, ping another. It was working. I have to say the server(s) held up pretty well. It wasn’t without problems, we did start seeing the site timing out on new connections for about 10 mins. but a swift kick of apache sorted that and it didn’t cause a problem again.

Finally we were running. The feedback again was good. We had a bunch of concerns like

Will the system work
Will the server(s) hold up
What happens if it goes mental and we sell a thousand mugs
Can we really do this

I think all in all it went well. We could have done better, but it also could have been a lot worse. We ran with oscommerce for a few months. Shortly after launching the shop I had a discussion on just how we were going to get Facebook and the shop incorporated. There was no obvious answer, then we hit a problem. One of the posts to Facebook got reported (it’s a humorous page, and we only ever post stuff sent in to us), and this showed us just how reliant we are on Facebook. Suddenly we were all logged out and the poster was blocked for 24 hours. Luckily Facebook pretty much left the page alone (just deleted the one post), so we played on it that one of our admins was in the dog house for an earlier post. But it still didn’t take long to realise that if Facebook wanted, they could delete the page on a whim and we’d suddenly have lost all our content and fans!

This just didn’t sit right with me, and I started looking at how the hell to get a backup of OUR page and content. There was nothing. So I started looking at how can we do things differently, enter WORDPRESS.

I’d seen this name floating about for the last few months, but never really saw the point in using it. I don’t blog, we don’t blog, so what’s the point? (I’m still not entirely sure I understand the point.) But it’s close to one of the best things I’ve spent weeks fiddling with.

I’d installed WordPress on our VPS for me to have a look around. It still wasn’t a site that we could really use, but as a CMS maybe I could find a way to connect to Facebook and back up our stuff. There must be people who do this, right? WRONG. There are loads of plugins for WordPress & Facebook, but I’ve only ever found 1 that takes your page and puts it in as posts. To make matters worse, it’s flakey as f**k, hadn’t been worked on in god knows how long, and the very few comments in the code are in Chinese.

Now I would never describe myself as a coder. I’ve used Delphi and VB for writing some functional programs in the past, and had to program a few in VB.net when I was IT Manager (the old problem/solution thing). I could also write some ASP and PHP, but really most of my stuff was dragging and dropping boxes and programming them up. I did quite a bit of database stuff within them, but that was it. There was absolutely no such thing as using classes (I don’t even think they existed). But as part of my job I had a dev team who did develop in PHP and VB.net, and they were always amazed, when trying to tell me how something couldn’t work, that I could not only follow along but tell them why they were wrong. On several occasions when something broke I could actually read their code and work out a temporary fix (normally the simple thing).

And so it begins. I now have no dev team, and a bunch of PHP code and classes that really didn’t make much sense to me. Bit by bit I managed to work out what each bit was doing, then moved on to changing it so that it would run for us. I know it will seem like simple stuff (especially looking back), but things like:

Changing a hard coded loop to only pull 10 facebook posts, to take a limit from a setup interface where you can specify how many to pull.
Adding in Date ranges to pull from and to.
Improving the cron job, so it looks from the time of the latest post it has + 1 second.
Downloading any attached image and saving it to the server (huge accomplishment).
Changing the posts content that gets published and updating links back to the post it just posted.
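The limit, date range and cron changes in the list above come down to a small amount of timestamp arithmetic. Here is a minimal shell sketch of that idea, assuming an API that accepts since, until and limit query parameters with Unix timestamps. The function name is made up for illustration; the real plugin is PHP and its code isn't shown in this post.

```shell
# Build the query string for the next fetch (illustrative only).
#   $1 = Unix timestamp of the newest post already imported
#   $2 = maximum number of posts to pull (from a settings page, not hard-coded)
#   $3 = optional Unix timestamp to stop at (end of the date range)
build_fetch_query() {
    last_seen="$1"
    limit="$2"
    until_ts="${3:-}"
    since=$(( last_seen + 1 ))      # start one second after the newest stored post
    query="since=${since}&limit=${limit}"
    if [ -n "$until_ts" ]; then
        query="${query}&until=${until_ts}"
    fi
    printf '%s\n' "$query"
}
```

Starting at the newest stored timestamp plus one second means the cron run doesn't import the same post twice, at the cost of possibly missing a second post made in the exact same second.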

There’s loads of stuff I’ve had to do to this plugin to get it working, let alone better and working for us. Eventually I’d finished (you’re never really finished, I have a list of new changes to get to sometime). After running it on a fresh WordPress install, we suddenly have a complete backup of our Facebook page, around 9k posts and images, all sat in WordPress. And what’s more, it’s automatically grabbing new stuff 🙂

I showed off my new achievement. Personally I don’t think it was appreciated just how much time and effort I had put into this, but it went down really well. We now had a blog! We now had a blog alongside our store and it was really starting to come together.

Over the next few weeks I kept working on improving the blog while managing the shop. Then suddenly a new disaster: our server had for some reason gone offline. Trying to connect via the backup terminal access just gave me a blank screen; something was wrong and I couldn’t get access to see what. To make matters worse, our provider had very nicely decided to cut back on its 24/7 support, and now only operated 8-8. Around midnight is not exactly when you want to be finding out that the support has changed and no-one bothered to tell you. The ONLY thing I could do was email them, and hope someone picked it up soon. They didn’t! I spent a good few hours trying everything I could think of to get hold of someone or find a way to the console, but nothing.

This had the effect of making me sleep through my alarm at 8am, but I woke at 9am and called them. After a few choice words, I was assured the tech team would look at it right away. I was so tired I fell back to sleep and woke again about midday. The first thing I did was open our site, or I should say attempted to! It was still down!! Another call, more choice words, and me advising them I’m not going anywhere and if they cut me off I’ll just keep calling until I can speak to a “Tech”, then explaining the problem and what I had tried to the tier 1 monkey, quickly got me escalated.

I couldn’t stop laughing when I finally did get to tier 2. Their tier 1 had placed me on hold to get someone, then come back on the line and said ‘I need you to take this call, this bloke knows what the f**k he’s on about, I’d put him to tier 3 but I can’t direct transfer’ to which I replied ‘Yes and I know how to work a phone system, Line 1 is the customer, you should be on Line 2 for that conversation’ 🙂 I have to be honest, just that mistake made my day.

Tier 1, 2 and the manager that called me back an hour later were mortified, but as I explained to him, I’ve been a senior tech on phone support, I’ve been an IT manager, and I’m guessing I hit a newish person and scared the crap out of them. I only cared about getting this back online. To be fair to Tier 2, I was connected to the console while he was apologising. (This part really could have had its own post.)

Anyway, getting over that failure I started looking for another VPS provider. I had no problem with their VPS and generally it was a very stable system, but 8-8 support with no out-of-hours ‘we’re really f**ked’ option forced my hand. It had gone down very badly with the others that this had cost us money, and there was no way I could argue as I agreed the situation was crap.

I found another provider and started moving stuff, but it just wasn’t right. It was actually a previous colleague’s company, but something just wasn’t right. So I kept looking. Then I found Digital Ocean. Initially I started using them to test some WordPress plugins, but I loved that I could bring up new servers in a matter of minutes. This surely has to be better than waiting hours. And it was. Testing was going well, so I started moving everything over. Everything just worked, and where I had to contact their support for a few little things (1 account related, can’t remember the others), I had a reply very quickly, sometimes within minutes, other times within 30 mins. I couldn’t fault their support and I wasn’t bounced around, they knew exactly what I needed and sorted it.

So here we have a medium spec’d Digital Ocean server, running our WordPress and OsCommerce solutions and handling both pretty well.

But being one to never settle, I kept tweaking stuff and looking at our options. I set up another server (droplet) for testing, and another WordPress install later I’m going through trying out the ecommerce plugins. I was blown away with WooCommerce! Yes, OsCommerce worked for us, and yes, I had put in quite a bit of time customising it and getting it to work with our processes, but the whole feel of the interface was crap. WooCommerce was like a breath of fresh air. It had a bunch of functionality, there are loads more plugins, it’s far easier to customise, it works from the WordPress themes, and it fits right in with our blog and doesn’t look disjointed.

I proposed we move over to this and it went down well. Well enough in fact that the others wanted to get more involved. We spent weeks working on changes to the theme (that we’d paid about $50 for), then I moved the shop over and made it live without telling our Facebook audience. We started getting some sales via WooCommerce, and it was obvious that this just integrated well.

We were going to have a relaunch to show off the new blog and store, I think I managed to p*ss the others off, when WordPress brought in a new standard theme that worked even better with Woocommerce and I changed to it to show them. It was obvious that it did and we should stick with it, but it also meant the last few weeks of customising was wiped out (and they still bring up the time I wiped out a few weeks work when I changed the theme).

I would never say WordPress/WooCommerce is perfect. I’ve found many issues along the way and had to find workarounds for a lot of stuff. I still don’t truly feel like I know what I’m doing and there’s no way that we use WordPress to its full potential. Currently we have the blog and shop running, and we have somewhere in the region of 10k posts and around 15k sales. We still don’t publish to the blog independent of Facebook/Twitter but it’s on the roadmap.

One thing that has caught us out a few times is DigitalOcean scaling. Because we very often have little traffic, I always keep the servers scaled down with the intention of boosting them up before we push anything new. On at least 2 occasions, we’ve forgotten this and overloaded our site.

I’ve also gone through a few different configurations just trying to find the best solution.

1st We had 1 server, that was mid range and just worked, but I knew this alone wouldn’t handle the traffic.

2nd I brought up 2 web servers and a database server. This wasn’t an ideal setup, loadbalancing was at DNS level, syncing was done via cron jobs, and the whole thing held together via a VPN to keep database connections secure. This had a bunch of problems.

Next I moved back to a single web server but kept a separate database. This was better, and around the time DigitalOcean allowed you to scale up easier (but not down you still had to wipe out the server to do so).

Because having a single web server just wasn’t enough, I went back to 2, but added in the new(ish) Cloudflare CDN in front of the servers. This really helped (though I’m still not convinced it really does much CDN work for us).

As part of the above, I tried incorporating GlusterFS (absolute disaster). From every web search I did, GlusterFS looks to be THE solution. In practice, for us, it took a website (with some heavy graphics) that responded in 2secs on average and 3secs at worst, to 18secs on average and 30secs at worst. I know everyone raves on about how great it is and how if it’s slow it’s something you’ve done. I don’t believe this for a second. I’ve spent days at a time trying to make it better, but the simple truth is if the files are pulled locally I get the 2-3secs above. When using a Gluster mount point to the files (which are still actually local: Gluster on both web servers, mounted back to themselves), I get the 18-30secs. Both web servers have a private LAN connection for Gluster in the same Digital Ocean location, and NO amount of tweaking or testing seems to ever really improve this. It was only made worse during testing when I took down one of the servers, so that the other could only use itself to serve the files: this managed to take out the mount point until I restarted, and even then it served up the pages slowly. I thought the whole point in using Gluster (at least for me) was HA, no single point of failure. Having both servers offline if one goes down does not seem very HA to me.
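One way to put numbers on the local versus Gluster-mount comparison, separate from page load times, is to time a plain read of the same files through each path. This is only a rough sketch: the function name is made up, it needs GNU date for the %N format, and the Linux page cache will flatter any repeat run, so treat the figures as relative.

```shell
# Read every regular file under a directory once and print the elapsed
# time in milliseconds. Requires GNU date for nanosecond resolution (%N).
time_tree_read() {
    dir="$1"
    start=$(date +%s%N)
    find "$dir" -type f -exec cat {} + > /dev/null
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}
```

Running it once against the brick directory and once against the Gluster mount of the same data (both paths depend on your own layout) gives a like-for-like figure without nginx or PHP in the way.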

The ONE thing I really want Digital Ocean to sort out is their private LAN. In order to stop anyone else on the private LAN being able to see my traffic between servers, I’ve had to use VPNs between them. This adds complications to the entire setup, and a private LAN per account would be very welcome.

The setup I’m currently in the middle of deploying is:

a) Cloudflare

b) 2xNginx loadbalance proxies (also serve up maintenance pages if they can’t connect back).

c) 2xNginx Backend servers

d) 2xMySql+Redis Servers

e) 2xNFS Servers

I’m happy with the load balancers, though I would love for DigitalOcean to offer a proper loadbalancing solution.

The MySQL servers took some config to get replication working properly while also using SSL for the connection to each other and from the backend web servers.

I still haven’t managed to configure MySQL to be HA from the web servers, so at the moment this would be a manual switch. I’ve found HyperDB for WordPress, which should resolve this, but I had to slightly change the WordPress config to do SSL for MySQL and HyperDB doesn’t seem to be able to use SSL, so I need to work out how to do this. I find this really weird, as one of the suggestions is to have your database remote. I really would have thought being remote (especially if using something like Amazon for the database) would mean you’d want to use SSL to keep your database traffic secure. It seems strange that this isn’t a fundamental option in HyperDB (unless I’m just not seeing it).

And the last part NFS Servers, I still need to find out how to keep these in sync (without using Gluster), I’ve previously used Syncthing to keep servers in sync, it works but is pretty much held together with tape (my configuration of it not the actual program). Once I have the NFS sync’d I also need to find a way for the web servers to use both HA.

I do feel like this configuration is the closest to the best I can achieve on a budget. Once I have the MySQL and NFS stuff worked out, I will be able to scale any server without completely taking the site offline, which will really help in being able to deal with spikes. It is now much easier to scale with Digital Ocean, but I’d still really want to know that doing so, or taking a server out for maintenance, is fine because everything will just keep running.

If you’ve got this far, I really thank you for reading. I hope the next couple of posts will be my solutions to the MySQL SSL and NFS problems. It’s now 2:41am and I think I’ve been writing this for about 2 hours, so I’m going to sleep 🙂 Leave a comment if you got this far, and include the words ‘sleep deprived’ so I don’t think it’s spam.

GlusterFS woes

If you’re looking for the gluster error ‘brick2.mount_dir not present’
jump to the end

Time for another post 🙂

I’ve been using DigitalOcean for some time now, and I’m still tweaking my setup. One thing I really hope they sort soon is a proper private LAN between your own droplets. For now we just have to use a VPN between them.

Being responsible for a new website can give a lot of headaches, especially when you have to try to guess just how popular it will be. So about a year ago I setup a new droplet to host the new site, testing it was going well and I increased the droplet before we launched to handle a spike. Sadly I underestimated just how busy it would get, based on the numbers I was given I think I was about right but unfortunately those numbers were way off.

But each failure is just another learning curve 🙂 So as quickly as possible that was fixed, then the site got to normal volume so we scaled it back down (yes, it’s a whole cost exercise, especially when you’re paying for it). Then we had the lead up to Christmas. In an attempt to not repeat the problems at launch, I changed the whole configuration so that I could (if needed) take a server down while staying partly operational. This kind of worked, and was needed when some bright spark promoted the site a day early and we hadn’t scaled back up!

Come the new year I decided it was time to seriously sort the infrastructure for the site. It now has an online store and it’s important it keeps running, it’s not just a blog anymore. So I put in place the following setup (working around various obstacles).

DNS:

All the websites’ name servers point to Cloudflare, which handles the first web connection. It works really well on their free tier, and changes (adding new servers) are pretty quick to take effect.

DigitalOcean Droplets:

1x Server running as a load balancer.
2x Servers running as webservers.
1x Server running as database server.
1x Server running as email (not quite running).
At the same time as making this setup I decided to ditch apache and move to nginx, so loadbalancer and webservers are running that.

Software:

4x Nginx (loadbalancer and webservers, and installed on database server for stats).
2x Syncthing (webservers) to keep the www folders in sync.
1x MySQL Server.
4x OpenVPN (connections between loadbalancer and webservers, webservers and database).
1x Redis Server (for session data. I tried nginx load balancing options but it still screwed up if I had to take one of the web servers out for maintenance, so I installed Redis on the database server).

As this progressed I dropped the VPN between loadbalancer and webservers and just use HTTPS/SSL instead. Syncthing already has its own SSL built in, so I could leave that over the semi-private LAN. I really would like to change MySQL to be encrypted and drop the VPN from that too, but info on doing this for WordPress seems to be non-existent at the moment.

Roll forward a few months: this has been working but still has areas to resolve. Such as Syncthing: yes, it keeps the folders in sync, and it’s actually really good that I can also store them on another system easily, but it doesn’t listen to the OS for changes to files. Instead it polls every x seconds. Although there’s nothing much changing, updating plugins became a problem: if you click the update button it downloads, but then nginx sends your next request to the other server, where the plugin.zip isn’t there yet, so WordPress throws an error.

My whole reason for running Syncthing was I wanted the files to be available on each server independently. So if Server A goes down it doesn’t matter, Server B has all the files locally anyway. NFS would still give me a single point of failure. On looking into resolving this though I remembered GlusterFS. I’d played with it a long time ago, but dropped it as a solution (can’t remember what I was doing or why it wasn’t working). Now it’s time to try it again. Downside: I’m back to needing VPNs, and OpenVPN isn’t the easiest when you want to quickly add a new server.

So I’ve done the following:

Added a new server just for the files (I don’t like gluster being in a 2 replica setup in case there’s a problem; there should be a majority who think they are holding the correct file).
Swapped out OpenVPN for Tinc, I have to say one of the best decisions. yes there are downsides, it creates a mesh (only doable with OpenVPN by running quagga for manually forcing routes) but I have no idea which Server is actually connected to which Servers. There’s no VPN status and I can’t see how much traffic has gone between 2 particular servers (iptables helps but it’s not 100%)
Added another new server for nagios and central logging.

There were a load of changes within a few weeks of each other, but I now have a setup I’m confident I can scale more quickly than ever before. Yes it has single points (load balancer, mysql) but I know if the load balancer has a problem it’s pretty static so can be wiped and redeployed quickly, as well as it will take a few minutes to open the webservers to the world and let cloudflare hit them directly. So MySQL is the real problem and I’ll be addressing that one soon enough.

So now onto today’s problem 🙂

I’ve had gluster running a few weeks, and I have our testing website (for theme changes etc) set up on our webservers behind the loadbalancer. The last few days I’ve needed to do more extensive testing than just changing bits in a theme, so I’ve decided to split the tester site onto its own droplet (still behind the loadbalancer and with VPN to the databases). I thought I may as well make use of Gluster here too (yes, that means a 2 replica setup of just the new server and the fileserver, and I don’t like that idea). So I brought up a new server and configured it: new users, firewall rules, tinc, nginx, php, etc.

I added gluster and copied the /etc/hosts entries over from the other servers. All looked good. I ran gluster peer probe ServerX and it worked, then gluster peer status and I could see it fine. But on trying to add a new volume:

gluster volume create xxx-yyy-zzz replica 2 transport tcp FILESERVER:/GLUSTER/xxx.yyy-zzz TESTSERVER:/GLUSTER/xxx.yyy-zzz force

I was getting the error:

volume create: xxx-yyy-zzz: failed: Commit failed on localhost. Please check the log file for more details.

Checking the logs on both servers would show (maybe slight variation):

[2015-07-28 16:00:41.612907] E [glusterd-hooks.c:328:glusterd_hooks_run_hooks] 0-management: Failed to open dir /var/lib/glusterd/hooks/1/create/pre, due to No such file or directory
[2015-07-28 16:00:41.614499] E [glusterd-volume-ops.c:1811:glusterd_op_create_volume] 0-management: brick2.mount_dir not present
[2015-07-28 16:00:41.614587] E [glusterd-syncop.c:1288:gd_commit_op_phase] 0-management: Commit of operation 'Volume Create' failed on localhost

I tried a series of things to fix it:

I thought maybe the /GLUSTER/xxx-yyy-zzz needed to be created (I already made /GLUSTER) – Nope

I detached the peer and reattached – No.
I reboot the file server and test server – No.
I detached, reboot, reattached – No.
I tried creating the volume with just the test server and no replica – No.
I tried creating the volume on just the fileserver with no replica – Yes.

So the problem points to the new system, but it’s a brand new system. They’re peers and connected.

I tried uninstalling and reinstalling gluster – No.
I tried uninstalling, purging and reinstalling – No.
I tried uninstalling, purging, manually deleting /var/lib/glusterd (probably a mistake, as I didn’t detach first 🙁) and reinstalling – No.
I have no idea why this wont WORK!!!!!

Let’s go further back, check the VPN, ping the servers.

Ping fileserver from testserver – Yes.
Ping testserver from fileserver – Yes/Hang on, that’s the wrong IP!! Yes, I’d copied an entry from webserverB into /etc/hosts, updated the name but missed the IP address. Idiot! Correct that. Ping – OK.
Try gluster again – Yes.

So if you’re having problems and seeing brickX.mount_dir not present, make sure name resolution between your servers (DNS or /etc/hosts) is correct.

I don’t really know how the peer probe worked, but I think I must have done that from a server whose hosts file was correct.
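The mistake above (a copied hosts entry where the name was updated but the IP wasn't) leaves two different names pointing at the same address, which is easy to check for. Here's a small sketch that lists any IP appearing on more than one line of a hosts file. The function name is made up, and some duplicates are perfectly legitimate, so treat the output as things to eyeball rather than errors.

```shell
# Print every IP address that appears on more than one line of a hosts
# file, followed by all the names mapped to it.
find_duplicate_ips() {
    hosts_file="${1:-/etc/hosts}"
    awk '
        /^[[:space:]]*(#|$)/ { next }     # skip comment-only and blank lines
        {
            sub(/#.*/, "")                # drop trailing comments
            ip = $1
            count[ip]++
            for (i = 2; i <= NF; i++) names[ip] = names[ip] " " $i
        }
        END {
            for (ip in count)
                if (count[ip] > 1) print ip ":" names[ip]
        }
    ' "$hosts_file"
}
```

Run with no argument it reads /etc/hosts. On the broken test server it would have shown the webserverB address with both webserverB and the test server's name next to it.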

Raspberry PI + GlusterFS (Part 4)

In Part 1 I mentioned encrypting my disks but didn’t go into it, so here I’m going to run through encrypting, decrypting and using it with GlusterFS.
Part 2 Was an attempted but failed install of the latest GlusterFS (3.5.0) Server
Part 3 Covered installing GlusterFS Server with the new information from Ashley

To recap I’m using the following:-
2 Pis
2 8GB SD Cards
2 4GB USB Sticks
2 512MB USB Sticks

As yet we haven’t setup any Gluster Volumes and this is all on a pretty fresh system.

First we need to install some tools we’ll be using.

apt-get install cryptsetup pv

I know my 4GB USB Stick is on /dev/sda and the 512MB one is on /dev/sdb. I’ll only be concentrating on the 4GB in this, but if you’re following along make sure that you’re using the correct paths. Using the wrong paths can wipe your data.

I don’t want any partitions on the stick (I’ll be encrypting the whole drive)

fdisk -l

Shows me I’ve got a few partitions on the stick:-

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   ?   778135908  1919645538   570754815+  5b  Unknown
/dev/sda2   ?   168689522  2104717761   968014120   65  Novell Netware 386
/dev/sda3   ?  1869881465  3805909656   968014096   79  Unknown
/dev/sda4   ?  2885681152  2885736650       27749+   d  Unknown

I can’t remember what this stick was used for (to my knowledge I’ve never used Novell partitions), but we’ll delete them all.

fdisk /dev/sda
d
1
d
2
d
3
d
wq

My partitions were listed 1-4 so it was nice and easy (the last d doesn’t ask for a number because only one partition is left). You can rerun the fdisk -l command to check they’ve all gone.

This step wasn’t strictly necessary but I always like to make sure I’m working with the correct drives.

With the Drive empty of partitions I like to unplug it and plug it back in (keep everything fresh). Note: if you do reconnect the drive make sure you’re still working with the correct /dev/sd* path. Sometimes this can change.
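Since the next command is destructive, one cheap extra check is to make sure nothing on the device is currently mounted. This is a sketch only: the function name is made up, and the match is a simple prefix match, so /dev/sda also matches /dev/sda1 (which is what you want here) as well as any other device whose name happens to start with /dev/sda.

```shell
# Return 0 if the device, or anything whose name starts with it (such as
# one of its partitions), appears in a mounts table; return 1 otherwise.
# The table defaults to /proc/mounts.
is_mounted() {
    device="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v dev="$device" '
        index($1, dev) == 1 { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$mounts_file"
}
```

For example, is_mounted /dev/sda && echo "still mounted, stop" would flag a stick that the system has auto-mounted after being plugged back in.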

Now run

cryptsetup -y -v luksFormat /dev/sda

This creates a new encryption key for the Drive and asks you to set a passphrase (note this is not how you add new keys on a drive, only do this once!!)

Then we need to unlock the drive for use

cryptsetup luksOpen /dev/sda USB1_Crypt

/dev/sda is the drive path

USB1_Crypt is what we’re going to be labelling the decrypted drive.

You’ll be prompted for the Drive passphrase that you just created. If successful it doesn’t actually tell you, it just drops you back to a prompt. From here on we won’t be doing any drive work on /dev/sda, as that would be outside the encrypted bit; we’ll be using /dev/mapper/USB1_Crypt.

We can check it’s unlocked with

ls -l /dev/mapper/

You should see something similar to

lrwxrwxrwx 1 root root       8 May 13 18:39 USB1_Crypt -> ../dm-1

You can also check the status using

cryptsetup -v status USB1_Crypt

Now that we have the drive with an encryption key and unlocked we’ll write a bunch of data across the drive

pv -tpreb /dev/zero | dd of=/dev/mapper/USB1_Crypt bs=128M

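Writing zeros to a drive is generally considered bad for data security, but we’re writing them to the encrypted system not the actual stick, so the output to the stick will be encrypted data. dd will stop with a “No space left on device” message once the stick is full, which is expected here.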

Once the data has finished writing we’ll create a new filesystem on the encrypted disk

mkfs.ext4 /dev/mapper/USB1_Crypt

You don’t have to use ext4, but I generally do.

That’s the USB Stick encrypted

We close the encrypted drive and remove it from /dev/mapper/ with


cryptsetup luksClose USB1_Crypt

If all you wanted was an encrypted Drive that’s it, and you can unlock the drive on systems with cryptsetup installed and then mount away.

So far we’ve encrypted the entire USB Stick, written a bunch of encrypted data across the entire Stick, created a new filesystem, and closed the Stick.

Now we’re ready to mount the Stick ready for Gluster to use.

We’re going to create a folder to mount the Drive into


mkdir /mnt/USB1

We’ll open the encrypted Drive again using


cryptsetup luksOpen /dev/sda USB1_Crypt

Then mount the decrypted drive


mount /dev/mapper/USB1_Crypt /mnt/USB1

If you


ls -l /mnt/USB1

You should see the lost+found directory on the filesystem.

I should mention again I’ve been running through this process on 2 PI’s, and to keep things simple I’m keeping the same names on both systems /mnt/USB1

Now it’s time to get GlusterFS running with these drives.

So while on Gluster-1(PI) issue the command


gluster peer probe Gluster-2

This should find and add the peer Gluster-2 and you can check with


gluster pool list

and


gluster peer status

Now, because I always want each Gluster system working by name, from Gluster-2 I issue


gluster peer probe Gluster-1

This updates the Gluster-1 peer to its name, not its IP address. There’s nothing wrong with using IP addresses if you’re using statically assigned IPs on your PI’s, but I wouldn’t recommend doing so if your IP address is DHCP’d.
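Before probing by name it’s worth checking that each PI can actually resolve the other’s name. Here is a small sketch using getent, which uses the system’s normal lookup order (usually /etc/hosts, then DNS). The resolves and check_peers helper names are my own invention.

```shell
# resolves: succeed if the given host name can be looked up on this system.
resolves() {
    getent hosts "$1" >/dev/null
}

# check_peers NAME...: report on each peer name passed as an argument.
check_peers() {
    for name in "$@"; do
        if resolves "$name"; then
            echo "$name: ok"
        else
            echo "$name: cannot resolve"
        fi
    done
}

# Example: check_peers Gluster-1 Gluster-2
```

Run it on both PI’s; if either name can’t be resolved, fix DNS or the hosts file before the peer probe.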

With glusterfs knowing about both Gluster-1 and Gluster-2 we can create a new volume (it’s important that /mnt/USB1 has been mounted on both systems before proceeding).
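If /mnt/USB1 isn’t mounted, the brick would just end up as a folder on the SD card rather than on the encrypted stick. A quick guard for that could look like the sketch below, assuming the mountpoint tool is installed (it normally is on Debian based systems). The require_mounted name is made up for this example.

```shell
# require_mounted: refuse to continue unless the directory is a mount point.
require_mounted() {
    dir="$1"
    if mountpoint -q "$dir"; then
        echo "$dir is mounted"
    else
        echo "$dir is NOT mounted, unlock and mount it first" >&2
        return 1
    fi
}

# Example, run on each PI before creating the volume:
#   require_mounted /mnt/USB1
```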

On either PI you can create a new replica volume with


gluster volume create testvol replica 2 Gluster-1:/mnt/USB1 Gluster-2:/mnt/USB1

This will create a new volume called testvol using /mnt/USB1 on both PI’s. The folder /mnt/USB1 is now referred to as a brick, and volumes consist of bricks.
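To show how the create command is put together (one host:path brick per PI, with the replica count matching the number of bricks), here’s a sketch that builds it from a list of host names. The run helper, the DRY_RUN switch and the function name are my own additions and not part of gluster. With DRY_RUN=1 it only prints the command.

```shell
# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# create_replica_volume VOLUME BRICK_DIR HOST...
# Builds one brick per host and sets the replica count to the number of hosts.
create_replica_volume() {
    volume="$1"
    brick_dir="$2"
    shift 2
    replicas=$#
    bricks=""
    for host in "$@"; do
        bricks="$bricks $host:$brick_dir"
    done
    # $bricks is left unquoted on purpose so each brick is a separate argument.
    run gluster volume create "$volume" replica "$replicas" $bricks
}

# Example: DRY_RUN=1 create_replica_volume testvol /mnt/USB1 Gluster-1 Gluster-2
```

Once the volume exists, gluster volume info testvol lists the bricks it’s made of.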

Now we start the volume


gluster volume start testvol

Finally we need somewhere to mount the gluster filesystem


mkdir /media/testvol

Then we mount it


mount.glusterfs Gluster-1:/testvol /media/testvol

It doesn’t matter which host we use in this command, apparently it’s only used to pull the list of bricks for this volume and will then balance the load.

Now you can write data to /media/testvol. If you’ve mounted the volume on both PI’s you will see the files on both.

You can also


ls -l /mnt/USB1

To see the actual files on the stick (DO NOT do any more than just read the files from /mnt/USB1, playing in this folder can cause issues, you should only be using /media/testvol from now on).

If instead of replica you used a stripe, you’ll be able to see all the files in /media/testvol but only some files in /mnt/USB1 on each PI.

Shutting down 1 of the PI’s in a replica mode volume won’t show any difference in /media/testvol. Hopefully the new 3.5.0 version won’t cause as much of a headache if files get updated while 1 PI is offline, though it is likely to need manual intervention to fix (maybe in a later part 🙂 when I get that far). In striped mode with 1 of the PI’s offline you’ll notice files in /media/testvol have gone missing. For this reason I’m hoping to do both stripe and replica, to keep files available across multiple PI’s and allow me to increase the storage space easily.

Replicating across 2 drives will mean I will need to add new storage 2 drives at a time.

Replicating across 3 drives would mean I need to add 3 new drives each time.
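The rule behind this is that new bricks have to be added in multiples of the replica count. Here’s a small sketch of that check. The plan_add_brick helper and the Gluster-3/Gluster-4 host names are made up for illustration, and it only prints the add-brick command rather than running it.

```shell
# plan_add_brick VOLUME REPLICA_COUNT BRICK...
# Prints the add-brick command only if the number of new bricks is a
# non-zero multiple of the replica count.
plan_add_brick() {
    volume="$1"
    replica="$2"
    shift 2
    if [ "$#" -eq 0 ] || [ $(( $# % replica )) -ne 0 ]; then
        echo "need a multiple of $replica new bricks, got $#" >&2
        return 1
    fi
    echo "gluster volume add-brick $volume $*"
}

# Example (hypothetical extra PI's):
#   plan_add_brick testvol 2 Gluster-3:/mnt/USB1 Gluster-4:/mnt/USB1
```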

Just to make things easy I’ll list the commands to decrypt and mount after the PI has been rebooted


cryptsetup luksOpen /dev/sda USB1_Crypt


mount /dev/mapper/USB1_Crypt /mnt/USB1


mount.glusterfs Gluster-1:/testvol /media/testvol
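These three commands can be wrapped in a small script. This is a sketch only, using the paths and names from this post. The run helper, the DRY_RUN switch and the variable names are my own additions, there so the commands can be previewed before running them for real as root.

```shell
#!/bin/sh
# Values from this post; change them to match your own setup.
DEVICE="${DEVICE:-/dev/sda}"
MAPPER_NAME="${MAPPER_NAME:-USB1_Crypt}"
BRICK_DIR="${BRICK_DIR:-/mnt/USB1}"
GLUSTER_HOST="${GLUSTER_HOST:-Gluster-1}"
VOLUME="${VOLUME:-testvol}"
VOLUME_MOUNT="${VOLUME_MOUNT:-/media/testvol}"

# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# remount_brick: unlock the stick, mount the brick, then mount the volume.
# Stops at the first step that fails.
remount_brick() {
    run cryptsetup luksOpen "$DEVICE" "$MAPPER_NAME" || return 1
    run mount "/dev/mapper/$MAPPER_NAME" "$BRICK_DIR" || return 1
    run mount.glusterfs "$GLUSTER_HOST:/$VOLUME" "$VOLUME_MOUNT"
}

# Example: DRY_RUN=1 remount_brick   (preview), then remount_brick as root
```

cryptsetup will still prompt for the passphrase, so this isn’t something to run unattended at boot.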

Raspberry PI + GlusterFS (Part 3)

After hitting errors when installing in Part 2 I decided to split out the solution.
Ashley saw Part 2 and had already run into the same problem (see the comment), and his comment was a huge help on what to do next.
I’ve started with a fresh Raspberry Pi image so that nothing conflicts. Again, get the latest updates

apt-get update
apt-get upgrade

Then download the needed files with the following commands

wget http://download.gluster.org/pub/gluster/glusterfs/3.5/LATEST/Debian/apt/pool/main/g/glusterfs/glusterfs_3.5.0.orig.tar.gz
wget http://download.gluster.org/pub/gluster/glusterfs/3.5/LATEST/Debian/apt/pool/main/g/glusterfs/glusterfs_3.5.0-1.dsc
wget http://download.gluster.org/pub/gluster/glusterfs/3.5/LATEST/Debian/apt/pool/main/g/glusterfs/glusterfs_3.5.0-1.debian.tar.gz

Now extract the archives

tar xzvf glusterfs_3.5.0.orig.tar.gz
tar xzvf glusterfs_3.5.0-1.debian.tar.gz

We need some tools so

apt-get install devscripts

Then we move the debian folder into the glusterfs folder and change into the glusterfs folder

mv debian glusterfs-3.5.0/
cd glusterfs-3.5.0

Next run

debuild -us -uc

This will start but will throw dependency errors.
The important line is

Unmet build dependencies: dh-autoreconf libfuse-dev (>= 2.6.5) libibverbs-dev (>= 1.0.4) libdb-dev attr flex bison libreadline-dev libncurses5-dev libssl-dev libxml2-dev python-all-dev (>= 2.6.6-3~) liblvm2-dev libaio-dev librdmacm-dev chrpath hardening-wrapper

Which I resolved with

apt-get install dh-autoreconf libfuse-dev libibverbs-dev libdb-dev attr flex bison libreadline-dev libncurses5-dev libssl-dev libxml2-dev python-all-dev liblvm2-dev libaio-dev librdmacm-dev chrpath hardening-wrapper
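Rather than retyping the list by hand, the version constraints in brackets can be stripped from the error line with sed. This is a sketch, assuming the line has the same shape as the one above. The strip_versions name is just for this example.

```shell
# strip_versions: read an "Unmet build dependencies:" line on stdin and
# print just the package names, dropping "(>= x.y)" style constraints.
strip_versions() {
    sed -e 's/^.*Unmet build dependencies: //' -e 's/ *([^)]*)//g'
}

# Example:
#   echo "Unmet build dependencies: flex libfuse-dev (>= 2.6.5) bison" | strip_versions
```

The output is a plain space separated list that can be passed to apt-get install.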

With the dependencies installed I ran

debuild -us -uc

This may output some warnings. On my system I had a few warnings and 2 errors “N: 24 tags overridden (2 errors, 18 warnings, 4 info)”, but it didn’t seem to affect anything.
Now we’ll wrap up with

make
make install

The make probably isn’t necessary. Once installed we need to start the service

/etc/init.d/glusterd start

You can check everything is working ok with

gluster peer status

This should return

Number of Peers: 0

The only thing left to do is ensure glusterd starts with the system

update-rc.d glusterd defaults

And we’re all set. Now you can take a look at Part 4

Raspberry PI + GlusterFS (Part 2)

IMPORTANT: When running through the steps in Part 2 I encountered errors. Thanks to Ashley commenting I’ve created Part 3. I’ve decided to leave Part 2 intact for anyone searching on errors etc.

Hopefully you’ve read Part 1 and understood what I’m trying to do and why.

Here’s Part 2 attempting the install.
Part 3 Actually does the installation now.
Part 4 will cover the encryption and setup.

First things first, I (being naughty) use root far too much in testing, but do not recommend it all the way through on production servers.

So let’s get into root

sudo su -

This should place you in root’s home directory.

You can Skip down to get past errors I encountered, but it’s possibly still worth reading the below which ended up not working.

Now we’re going to grab gluster 3.5.0

wget http://download.gluster.org/pub/gluster/glusterfs/3.5/3.5.0/glusterfs-3.5.0.tar.gz
tar xzvf glusterfs-3.5.0.tar.gz
cd glusterfs-3.5.0/

DON’T do this step yet. At this point I jumped straight into a configure attempt

./configure

This threw errors that I’m missing flex or lex

configure: error: Flex or lex required to build glusterfs.

Clearly I’m going to need to install a few dependencies before going further

apt-get update
apt-get install make automake autoconf libtool flex bison pkg-config libssl-dev libxml2-dev python-dev libaio-dev libibverbs-dev librdmacm-dev libreadline-dev liblvm2-dev libglib2.0-dev

I’m not sure if all these are needed, but after digging around that’s the list that’s used elsewhere.

Actually reading the INSTALL file says to start with ./autogen.sh so this time we will

./autogen.sh

This took about 5 mins, but didn’t throw errors. Next onto

./configure

This eventually spits out

GlusterFS configure summary
===========================
FUSE client          : yes
Infiniband verbs     : yes
epoll IO multiplex   : yes
argp-standalone      : no
fusermount           : yes
readline             : yes
georeplication       : yes
Linux-AIO            : yes
Enable Debug         : no
systemtap            : no
Block Device xlator  : yes
glupy                : yes
Use syslog           : yes
XML output           : yes
QEMU Block formats   : yes
Encryption xlator    : yes

Now we can get on with actually compiling


make

This threw a number of warnings for me

warning: function declaration isn’t a prototype [-Wstrict-prototypes]

But didn’t seem to be of great concern.
After about an hour it was ready to continue (I wrote Part 4 from memory while waiting).
We install with


make install

Here’s where I’m getting an error.


../../py-compile: Missing argument to --destdir.

So it’s on hold until I can figure out how to resolve it. One suggestion from version 3.4.x was to add the prefix path to the configure, but this didn’t do anything for me.

Move on to Part 3

Raspberry PI + GlusterFS (Part 1)

Here’s Part 1 which is really background information.
Part 2 will be actually doing stuff; if you don’t want to read how/why I ended up here, skip to Part 2.

A few days ago I yet again ran out of space on my server. Normally this would just mean deleting a load of junk files, but I’ve been doing that for months and I’m now at the point where there are no junk files left to delete. So it’s time to increase the storage. Unfortunately, problem #2: I currently have 4 SATA drives in the server taking up all the connections. Expanding wouldn’t otherwise be a problem, as I originally set up the drives with LVM to handle large storage requirements, but now there’s nowhere to turn to increase the capacity in this server.

So instead I thought I’d have a look at the alternatives. I’d been hoping to move some of the server services over to PI’s since I first heard about the Raspberry PI project (long before they were released), I knew I could make good use of lots of them.

After a little searching I found Gluster, and this seems to be ideal for what I need. At this point I should say that I clearly expect any USB drive connected to the PI to be slower than SATA on my server, but for my use slower doesn’t matter. I doubt this would suit everyone, but I think Gluster is a good option even on beefy servers/PCs.

I have an idea that needs testing, so I set up 2 PI’s with 2 USB sticks each for storage. A simple apt-get install glusterfs-server gets me moving quickly, and after reading 2 guides, http://www.linuxjournal.com/content/two-pi-r?page=0,0 and http://www.linuxuser.co.uk/tutorials/create-your-own-high-performance-nas-using-glusterfs, I quickly have a working system. I’m not going to go into the steps, both the links give details, but here’s my layout:-
2 PI’s
2 8GB SD Cards
2 4GB USB Sticks
2 512MB USB Sticks.
So each PI has an SD card, a 4GB USB Stick and a 512MB USB Stick.

As I prefer to encrypt all data on my drives, I installed cryptsetup and used luksFormat to get the USB Sticks ready (more on this later). I unlocked the USB sticks on both PI’s, which opens the drives as /dev/mapper/USB1_Crypt and /dev/mapper/USB2_Crypt.
I then mounted each to /mnt/USB1 and /mnt/USB2 (both were already ext4 filesystems), with USB1 being the 4GB.
The PI’s are called Gluster-1 and Gluster-2. I know my DHCP and DNS are working, so I can use the names when setting up the gluster peers and volumes instead of IP addresses.
Once I’d created the Gluster volume (I tested both stripe and replica) I mounted testvol (the Gluster Volume) into /media/testvol.
I made a quick file, and could see it from both servers. It’s also interesting looking in the /mnt/USB1/ folder (but do NOT change anything in here). While using a stripe volume the new file only appears on 1 of the PI’s, and creating more files puts some on Gluster-1 and some on Gluster-2. So to see what happens I rebooted Gluster-1 and watched the files in /media/testvol. Yep, all the files that are in testvol and actually on Gluster-1 disappear from the file list. As soon as Gluster-1 was back (decrypted and mounted /mnt/USB1) the files were back.

So this is looking pretty good, and adding more storage was easy (I didn’t go into rebalancing the files across new bricks, but I don’t see this being a huge problem). Now I moved onto using Replica instead of Stripe (my ultimate intention will be to use both). Here’s where the fun begins. I actually didn’t delete the test files in /mnt/USB1 from either server, I just stopped the gluster volume and deleted it. Creating a new gluster volume in replica mode was easy, and a few seconds after I started it gluster sync’d up the files already in /mnt/USB1 on both PI’s, so now in /media/testvol I can still see all the files and access them fine. Now I rebooted Gluster-2 and checked the files in /media/testvol on Gluster-1. Yep, all the files are there (there was a 2-3 second delay on ls; I assume I was previously connected to Gluster-2 and it had to work out it wasn’t there anymore, but from there on it was fine).

I have a working replica system 🙂 and it’s taken me no time at all to get running. I brought Gluster-2 back online, then thought: what if I change files when one of them is offline? I reboot Gluster-1, change some of the text in a few of the test files in /media/testvol and create a new testfile. Then I bring Gluster-1 back online, and here’s where things fall apart!!!
The new testfile is on both and fine, and the existing files I hadn’t changed are all fine too. But the files I changed I can’t access anymore, I’m getting cat: testy1: Input/output error. This isn’t good, so I check the testy1 file in /mnt/USB1 on both PI’s. Gluster-2 has the changes (as expected, it was online when I made the changes), Gluster-1 has the original.

So I head over to google to find out how I’d fix such a scenario (a friend told me years ago “you’re never the first person to have a problem”, so google every error message). Yep, there’s information on it, it’s called split-brain. The first solution I find is to delete the incorrect copy of the file from /mnt/USB1. This didn’t solve my problem: testy1 was recreated but was still giving an I/O error. A little more reading says that gluster creates hard links and you need to go and delete these too, apparently in a .glusterfs folder, but I couldn’t find it. There was also a heal function that would show me what files are affected. But not on my system. gluster doesn’t know heal. WHY????
gluster -V tells me I’m running an old version from 2013 (I think 3.2.8 but that’s from memory); the latest version is 3.5.0.
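As far as I can tell the volume heal commands arrived with GlusterFS 3.3, which would explain why a 3.2.x install doesn’t know them. Here’s a quick sketch for comparing version numbers in a script. It relies on sort -V from GNU coreutils, the helper names are made up, and the 3.3.0 cut-off is my assumption rather than something checked against the release notes.

```shell
# version_ge A B: succeed when version A is greater than or equal to B.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# has_heal VERSION: assumed cut-off of 3.3.0 for "gluster volume heal".
has_heal() {
    version_ge "$1" 3.3.0
}

# Example:
#   has_heal 3.2.8 || echo "too old for heal"
#   has_heal 3.5.0 && gluster volume heal testvol info
```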

I also came across some information saying that replica volumes using 3 or more bricks can use quorum to resolve issues with file differences (I think this has to be set up though; I can see situations where automatically doing so could cause data loss, such as log files being written and having different data, not just that one may have stopped being written to).
Anyway it looks like having 3.5.0 with a heal function would be beneficial, and I always prefer for things to be up to date where possible anyway.
Looking at the Gluster download page there’s a simple way to add gluster to the debian apt system. So I followed the steps to add it, and ran apt-get update. Now another problem: it’s not able to download the gluster info, but the link it’s using I can access fine. Then something jumps out at me, “Architectures: amd64”. Raspberry uses arm so this isn’t built for it. Now I have no idea if this is right or wrong but it makes sense to me.
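This can be checked directly: dpkg --print-architecture prints the architecture of the running system (armhf on Raspbian), which can be compared against the Architectures line of the repository. A small sketch of that comparison; the arch_supported helper is just for illustration.

```shell
# arch_supported WANTED ARCH...
# Succeeds if WANTED appears in the list of architectures a repo offers.
arch_supported() {
    wanted="$1"
    shift
    for arch in "$@"; do
        [ "$arch" = "$wanted" ] && return 0
    done
    return 1
}

# Example on the PI, against a repo that only lists amd64:
#   arch_supported "$(dpkg --print-architecture)" amd64 || echo "no packages for this system"
```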

Like many other things, it’s time to manually compile and install.
So there’s the background (sorry it took so long), Part 2 is going to run through manually installing gluster on the PI