A nice easy one for 2am 🙁 but it took me hours to work out (and it really shouldn’t have).
I was trying to update the servers using apt-get upgrade. It listed about 6 packages to be upgraded, but kept throwing an error:
Setting up libc6:amd64 (2.23-0ubuntu4) ...
sh: echo: I/O error
sh: echo: I/O error
dpkg: error processing package libc6:amd64 (--configure):
 subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
 libc6:amd64
E: Sub-process /usr/bin/dpkg returned an error code (1)
I spent a good few hours searching Google and trying to work out what had been installed that could be causing an issue with libc6, though I was fairly certain nothing had been: this was 2 days after a migration to new servers, and I was hitting the problem on 2 of the 4 servers, which all have extremely similar setups.
By chance I noticed:
Filesystem      Size  Used Avail Use% Mounted on
udev            489M     0  489M   0% /dev
tmpfs           100M  100M     0 100% /run
/dev/vda1        30G   16G   13G  55% /
tmpfs           497M     0  497M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           497M     0  497M   0% /sys/fs/cgroup
tmpfs           100M     0  100M   0% /run/user/1000
I’ll give you a clue: “look at /run” 🙁
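For next time, here is a rough sketch of a check that would have pointed me at /run straight away. The function name flag_full and the 90% threshold are my own inventions for illustration. It reads df-style output on stdin and prints any mount point at or above the threshold. It assumes mount points without spaces, which holds for everything in the listing above.

```shell
#!/bin/sh
# flag_full LIMIT: read df-style output on stdin and print every mount
# point whose use percentage is at or above LIMIT.
# (Hypothetical helper name; assumes column 5 is the use percentage and
# column 6 is the mount point, as in `df -P` and `df -h` output.)
flag_full() {
    awk -v limit="$1" '
        NR > 1 {                    # skip the header row
            use = $5
            sub(/%/, "", use)       # "100%" -> "100"
            if (use + 0 >= limit) print $6
        }'
}

# Typical use on a live system:
#   df -P | flag_full 90
```

Run against the listing above, it should print just /run.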
Something was eating all the space. So I tried to run ncdu to look for large files (I know there are other ways, but I like ncdu). But I hadn’t installed it on this new server, and I couldn’t install it with apt-get broken.
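In hindsight, one of those other ways would have done the job without installing anything. This is only a sketch using du and sort, which are already on the box. The function name largest_in is made up for the example.

```shell
#!/bin/sh
# largest_in DIR [COUNT]: list the COUNT largest files and directories
# under DIR, sizes in KiB, biggest last. (Hypothetical helper name.)
#   -x  stay on DIR's filesystem, so separate mounts such as /run/lock
#       are not counted
#   -a  include files, not just directories
#   -k  report sizes in 1 KiB blocks so sort -n can order them
largest_in() {
    du -xak "$1" 2>/dev/null | sort -n | tail -n "${2:-10}"
}

# Typical use (may need root to read everything):
#   largest_in /run 10
```

The last line is always the total for the directory itself, so the interesting entries are the ones just above it.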
Thinking a full /run was bound to be causing some issues (though I still wasn’t sure it was causing this particular one), I rebooted the server (bad move!). It locked up and had to be power cycled. Thankfully it’s a droplet, and with DigitalOcean I can power cycle easily (I did try the console but it wouldn’t connect).
Anyway, after the reboot /run started at 5% used but quickly grew to 70%. I did manage to install ncdu though, and the fact that apt-get worked again while /run still had free space told me the earlier problems were being caused by a full /run.
After a quick (very quick) look at ncdu /run I could see hhvm.hhbc taking up approx 85MB+.
A quick check of the config and I could see hhvm was configured to put it there. So I adjusted the config to put it in /var/cache/hhvm/hhvm.hhbc instead and updated the systemd service script to create /var/cache/hhvm and set its owner.
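Roughly, the change looks like the sketch below. Treat the details as assumptions rather than a record of exactly what I ran: I’m assuming the repo location is set by hhvm.repo.central.path in /etc/hhvm/server.ini (where the packaged config usually keeps it) and that hhvm runs as www-data. Both function names are made up for the example. Check your own config and service file before copying any of it.

```shell
#!/bin/sh
# set_repo_path INI_FILE NEW_PATH: rewrite the hhvm.repo.central.path line
# in INI_FILE to point at NEW_PATH. (Hypothetical helper; the setting name
# is an assumption about how the repo path is configured.)
set_repo_path() {
    ini_file="$1"
    new_path="$2"
    tmp_file=$(mktemp) || return 1
    sed "s|^hhvm\.repo\.central\.path *=.*|hhvm.repo.central.path = ${new_path}|" \
        "$ini_file" > "$tmp_file" &&
        cat "$tmp_file" > "$ini_file"   # overwrite in place, keeping permissions
    status=$?
    rm -f "$tmp_file"
    return "$status"
}

# prepare_cache_dir DIR OWNER: create DIR owned by OWNER (needs root).
# (Hypothetical helper; OWNER is assumed to be both a user and a group.)
prepare_cache_dir() {
    install -d -o "$2" -g "$2" "$1"
}

# Typical use, as root (paths and user are assumptions):
#   set_repo_path /etc/hhvm/server.ini /var/cache/hhvm/hhvm.hhbc
#   prepare_cache_dir /var/cache/hhvm www-data
```

I had the service create the directory at start-up, but since /var/cache survives reboots, creating it once by hand should work just as well.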
Another reboot, everything seems fine and /run is now at 3% used.
And I’ve run apt-get upgrade successfully.
I’m thankful that I noticed, I really thought I’d screwed something on these 2 servers while migrating, and I could see another night of rebuilding new servers ahead of me.
Moral of the story: check you’re not out of space when you get weird errors (yes, the I/O error should have rung some bells, but hey, it was 2am).