On my MythTV Backend, there are a number of error conditions that I want to monitor and be alerted about if they should happen. For example, as of late, I’ve been having issues with one of the drives in my RAID configuration (under load I’m getting errors that I think are a result of an old SATA controller), which causes the RAID to drop into degraded mode, and error messages to be logged by the kernel. In a situation like this, I wanted a tool that could monitor my log files and email me if “interesting” things happen.
Now, the first thing I did was search the web for something that would do the job. swatch popped up immediately as one alternative. It’s a nice, simple Perl script which takes a configuration file that defines a log file to monitor, and a series of rules which define what to look for. Unfortunately, it can only monitor one log file at a time (you need to run multiple instances and have multiple configuration files if you want to monitor multiple files), and it has to run continuously in the background. And, quite frankly, the configuration file is a tad byzantine for my taste.
Another common option is logwatch. This application is definitely a lot more flexible, but the configuration is, again, rather complicated. And, at least as far as I can tell, it’s really meant to be run once a day for a given date range, as opposed to operating as a regular, polling application.
And thus ended my search, with the conclusion that it’d really be a lot simpler just to write my own tool. And thus pwatch was born. pwatch is a simple Perl script that takes an Apache-style configuration file and processes your log files. Each matching event triggers an action, and then the event is recorded in an SQLite database. Run pwatch again and it’ll skip any events it’s seen before and only report new ones. The result is that you can just fire off pwatch in a cronjob on a regular basis (I run it every five minutes), and it can alert you if something interesting has happened.
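To make the "skip anything it’s seen before" idea concrete, here’s a minimal sketch in Python. To be clear, this is not pwatch’s actual code (pwatch is Perl): the function name `scan_log`, the `seen` table, and the choice to identify an event by a hash of the whole log line are all assumptions made for illustration.

```python
import hashlib
import re
import sqlite3


def scan_log(lines, patterns, db):
    """Return matching log lines that have not been reported on a previous run.

    lines    -- iterable of log lines (strings)
    patterns -- list of regular expressions describing "interesting" events
    db       -- an open sqlite3 connection used to remember reported events
    """
    # Hypothetical schema: one row per event already reported.
    db.execute("CREATE TABLE IF NOT EXISTS seen (digest TEXT PRIMARY KEY)")
    compiled = [re.compile(p) for p in patterns]
    new_events = []
    for line in lines:
        line = line.rstrip("\n")
        if not any(rx.search(line) for rx in compiled):
            continue
        # Assumption: an event is identified by the full line, timestamp included,
        # so the same message logged at a later time counts as a new event.
        digest = hashlib.sha256(line.encode("utf-8")).hexdigest()
        already = db.execute(
            "SELECT 1 FROM seen WHERE digest = ?", (digest,)
        ).fetchone()
        if already is None:
            db.execute("INSERT INTO seen (digest) VALUES (?)", (digest,))
            new_events.append(line)
    db.commit()
    return new_events
```

Called from cron with a connection to an on-disk database, the first run returns every matching line and later runs return only lines that have shown up since. Sending the email (or whatever other action you want) would then just be a loop over the returned list.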
Now, pwatch is pretty basic at this point, and I probably won’t add much more to it unless people ask for it (or unless I need it). For example, at this point, the only action it knows how to take on an event is to send out an email. But adding new features should be trivial enough, so if anyone has any ideas, let me know. And if you find pwatch useful, send me an email!
So, one of the ongoing issues that anyone with a public-facing server has to deal with is a barrage of SSH login attempts. Now, normally this isn’t a problem, as a decent sysadmin will use fairly strong passwords (or disable password-based logins entirely), disable root logins, and so forth. But it’s certainly an irritant, and so it’s worth implementing something to mitigate the issue.
Now, traditionally, there are a few general approaches people take:
- Use iptables or something similar to throttle inbound ssh connection attempts.
- Coupled with the previous, implement tarpitting (this slows down ssh responses, which means the attacker wastes resources on your server).
- Implement something like fail2ban to automatically detect attacks and dynamically add them to a set of block rules (managed with something like iptables).
- Move SSH to a non-standard port.
All of these work reasonably well, and particularly for the lazy, something like fail2ban on Ubuntu is dead easy to deploy and works quite nicely. Of course, there’s always the chance that you lock yourself out if you fail at a few login attempts, so it’s not without its risks.
But I recently discovered a fifth option which, at least at this stage of IPv6 growth, works incredibly well: disable inbound SSH over IPv4. See, most attackers aren’t v6 connected. Meanwhile, acquiring v6 connectivity remotely is usually just a matter of running a Teredo tunneling client. The result is perfectly workable remote accessibility, while the number of SSH attacks is cut down to essentially zero.
Of course, this won’t last forever. In the future, v6 is likely to get deployed more widely, and I suspect I’ll start seeing v6-based ssh attacks. But until then, this solution is dead simple to deploy and works great!
And naturally, just a day after I finish writing this, I decided to fiddle around with NX for remotely accessing this server, and lo and behold, NX doesn’t support IPv6. :) So, I’m back to using fail2ban, until NX can get their act together (though, to be fair, latency over my v6 tunnel has an unfortunate negative impact on NX performance, and so I’m not sure I’d use v6 even if I could).
Or: Why RAID isn’t foolproof.
First, a little bit of background. At home, I have a MythTV installation. And as part of that installation, I have a MythTV Backend, which is basically a glorified fileserver that sports a couple video capture cards, the MythTV scheduling and recording software, a MySQL database, and a few other odds and ends (not the least of which is this web server). Now, being a fileserver, one of the jobs that machine fulfills is to provide large amounts of storage, primarily for MythTV recordings, and since I don’t want to lose those recordings, I have my storage set up in a RAID-1 mirror, which basically takes two drives and makes them look like a single drive, while underneath, anything written to the logical drive is actually written out to both physical disks. That way, if something bad happens, I have what amounts to a live backup that I can quickly switch to (in addition to my regular, nightly incremental and weekly checkpoint backups).
So I came home on Wednesday night to discover something rather annoying: Some sort of write error had occurred on one of those physical disks, and so the mirror was degraded and deactivated. Now, this has happened in the past (I think it’s related to a buggy DMA implementation on my SATA controller), but usually recovery is pretty easy: remove the bad disk from the mirror, then re-add it, which causes Linux to synchronize the two disks, using the good disk as the primary. But for some reason, this time, it wasn’t so easy.
See, when I ran a command to view the status of the mirror, I found both drives marked as “removed” (ie, taken out of the mirror), and one marked as a “spare”. That itself is kinda weird, as usually it’s one active, and one failed. “Whatever”, I told myself, “I’ll just take the spare out of the mirror, re-add it, and then add the other drive, and voila, that should be it”. But when I attempted to re-add the spare, I got the weirdest error message:
cannot find valid superblock in this array - HELP
I can tell you right now, when your computer is imploring you for help, it’s probably a bad thing. Now, for those not in the know, a superblock is kinda like a special marker on the disk, and in this case, it tells Linux which mirror the disk belongs to, along with a bunch of other metadata. This error indicates that this decidedly important piece of bookkeeping information was, supposedly, absent. That’s bad. Unfortunately, googling around led me nowhere. Even more confusing, when I attempted to mount (ie, attach, connect, etc) one of the halves of the mirror, the OS detected the filesystem, and the contents of the mirror looked to be intact. And running a tool to examine the RAID mirror components returned what looked like perfectly normal data.
In the end, I gave up for the day, figuring I would come up with some strategy for moving forward the next day. Eventually, I settled on breaking the mirror up, mounting both drives separately, and then using a tool like rsync to manually back up the primary disk to the secondary… not an ideal solution, as a disk failure means you lose everything since the last snapshot, but it’d do the job, and I wouldn’t have to deal with RAID headaches anymore.
So this evening, I fire up zaphod (that’s the fileserver name) into single user mode, and as I watch the kernel messages scroll by, I see the RAID mirror… start up perfectly normally. Examining the mirror showed one active disk, and one re-syncing, suggesting that the kernel was rebuilding the RAID successfully. What. The. Heck. And as of this writing, I still have absolutely no idea what on earth went wrong, or how it magically got fixed.
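For what it’s worth, the mirror state I kept checking by hand is easy enough to check from a script. Here’s a minimal sketch in Python, assuming the usual /proc/mdstat layout on Linux, where each array line (e.g. `md0 : active raid1 ...`) is followed by a status line ending in something like `[2/2] [UU]`, with `_` standing in for a missing member. The function name `degraded_arrays` is mine, and the parser only handles array names of the form `mdN`.

```python
import re

# Matches the first line of an array entry, e.g. "md0 : active raid1 sdb1[1] sda1[0]".
ARRAY_RE = re.compile(r"^(md\d+)\s*:")
# Matches the member summary, e.g. "[2/1] [U_]" (expected/active, one flag per member).
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return the names of arrays that are missing at least one active member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            expected = int(status.group(1))
            active = int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            # Only the first status line after a header belongs to that array.
            current = None
    return degraded
```

Feed it the contents of /proc/mdstat (for example `open("/proc/mdstat").read()`) and it returns a list such as `["md0"]` when the mirror is running on one disk, or an empty list when everything is healthy.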
Well, it’s been a couple days now, and I continue to fiddle around with NetBSD… it’s definitely not going to be displacing Ubuntu any time soon, but it’s definitely an amusing project to play around with.
Most recently, as I was testing out Evolution (my email client) compiled from pkgsrc, I discovered that it started up incredibly slowly. Like, 5 minutes from invocation to a window popping up on my desktop. So, a little Google-fu, and I found myself here. It turns out that one of the things Evolution does a lot is attempt to open shared libraries that don’t exist. Unfortunately, those failures are very expensive, and as of 5.0.2, NBSD’s linker doesn’t cache the failures.
And this is where that blog post comes in. The author of that post wrote up a negative lookup cache and incorporated it into the NBSD dynamic linker. By itself, that’d be interesting, but what’s deeply cool about this is that I was able to get a patch representing his change, tweak it, apply it to my local copy of the NBSD source, and then build out and install a new version of the dynamic linker. Result: startup times went from minutes to seconds. I’d call that a huge win.
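If the term is unfamiliar, a negative lookup cache just remembers which lookups have already failed, so the expensive search isn’t repeated. Here’s a rough sketch of the idea in Python. It is only an illustration of the general technique, not the actual NetBSD patch (which is C code inside the dynamic linker); the class name `LibraryResolver`, the injectable `exists` function, and the `probes` counter are all made up for the example.

```python
import os


class LibraryResolver:
    """Search a list of directories for a library, remembering failed lookups."""

    def __init__(self, search_dirs, exists=os.path.exists):
        self.search_dirs = list(search_dirs)
        self.exists = exists      # injectable so the sketch can be tested without real files
        self.missing = set()      # the negative cache: names known not to exist
        self.probes = 0           # how many filesystem checks have been made

    def find(self, name):
        # A name that already failed once is rejected without touching the disk.
        if name in self.missing:
            return None
        for directory in self.search_dirs:
            candidate = f"{directory}/{name}"
            self.probes += 1
            if self.exists(candidate):
                return candidate
        # Every directory was searched and nothing was found: remember that.
        self.missing.add(name)
        return None
```

The first failed `find` for a given name costs one probe per search directory; every repeat of that same lookup costs none. An application that asks for the same nonexistent libraries over and over is exactly the case where that adds up.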
What this fundamentally speaks to is just how open and easy it is to fiddle around with the internals of NetBSD. The entire system is designed to make it trivial to alter the base and rebuild it out from scratch, which makes it possible to do the kinds of things I just did. Very cool!
Next up: Attempt to hack nouveau DRI support into the kernel so I can get reasonable video performance.
So for no particular reason at all, I recently got the urge to try out a BSD variant on my laptop. Now, historically I’ve been a die-hard Linux user, having cut my teeth on Slackware back when you needed dozens of floppies to install the thing (as a quick aside, I didn’t have internet access at home at the time, and so I used a PC at school to download Slack from a local BBS, which meant trucking dozens of floppies there and back… which was really fun when, say, disk 12 of 20-something had a bad sector, requiring me to return to school the next day (leaving the install process up and in limbo in the mean time) to write out a new disk). Since then, I’ve worked with Redhat, Debian, Fedora, and Ubuntu, but have never strayed outside the realm of Linux, and so, in a fit of boredom, I decided to address that little shortcoming in my technical upbringing.
Of course, there are multiple BSDs out there, each with their own focus and vision, and choosing one is often a matter of taste. My initial choice was FreeBSD, which I threw on a 10GB partition on my laptop, after which I found myself facing the familiar command prompt (well, not quite familiar… it was straight sh instead of bash, which was… annoying), and a fairly barebones system. At this point I discovered an important difference between the BSDs and, say, Ubuntu: out of the box, they tend to provide a very bare-bones system, enough to get you bootstrapped so you can build the system you need. But you have to build it. Not that I mind, I’m a tinkerer at heart.
I then spent the next couple days fiddling around with the system and configuring it as necessary, which was a very different experience from what you see in Linux. You see, in FreeBSD (and NetBSD, which I’ll get to later), the primary system configuration, which includes network configuration, system daemon selection, and so forth, is all stored in a single file in /etc called ‘rc.conf’. In contrast, Linux distros tend to manage things in varying ways, which means you need to learn individual platform quirks and tools, something which is always a bit tedious. And so, by playing with the rc.conf, I was easily able to get networking up and running, including my wireless card, various system daemons, and so forth. And after that, it was off to install some interesting programs.
And this was where I discovered my next surprise. In the Linux world, package managers are really king, with two main contenders vying for the top spot: deb and rpm. Of course, there are a few outliers (Slackware’s tgz’s, Gentoo’s portage system, etc), but for the most part, modern distros are based on one of these two package management systems. Not so with FreeBSD. FBSD uses a system called ‘ports’, which should be familiar to a Gentoo user, as portage is really a rip-off of ports. In essence, ports is a gigantic set of scripts, where each supported application is represented by a directory containing Makefiles, patches, and so forth, which can be used to install the application. A simple ‘make install’ in the directory results in the source for the package being downloaded, patched, configured, built, and installed. It’s really quite slick, if you’re interested in building everything from source (which can take quite a while). Of course, FBSD also has binary package support, but building from ports is the most common way people install software in the FBSD world.
Unfortunately, I finally hit a brick wall with FBSD on my laptop when I attempted to suspend it. Big mistake. You see, it turns out that, even now, with FreeBSD 8.0, support for suspend/resume is incredibly weak. So while Linux has stumbled along and finally reached a point where things kinda sorta work most of the time, FBSD is, I’d wager, at least 5 years behind. Which is a real shame, as I use suspend all the time with my laptop. And thus it was that FBSD as a possible OS alternative was nixed.
So, what next? Well, in my mind, the most obvious alternative contender was NetBSD (I eventually chose the 32-bit version for reasons I won’t get into here). Like FreeBSD, NetBSD installs to a very barebones system, though even more barebones than FBSD, if that can be believed. In fact, the ISO for the installation media is a mere 250MB, give or take, which is pretty diminutive beside FBSD’s 2GB DVD image (though, to be fair, FBSD’s DVD ships with a ton of pre-compiled packages, while NetBSD leaves you having to download all that software from the intertubes). Similar to FBSD, the entire system is configured through /etc/rc.conf, and basic configuration was equally easy. Once that was done, again my thoughts turned to software.
The NetBSD package system shares a lot of commonalities with the FreeBSD system. Which shouldn’t be surprising because NetBSD’s system, pkgsrc, was forked from ports back in 1997. As such, they share an underlying philosophy, and so the two systems operate very similarly. I will say, though, that ports does have one significant advantage over pkgsrc: Much better OS integration. See, pkgsrc is really a sister project to NetBSD. As such, it can actually be run on myriad operating systems, including Linux, among many others. But that means that the system doesn’t tie into the OS all that well. So while a ports package, once built, will populate /etc/rc.conf with configuration values, throw itself into /usr/local/etc/rc.d, and so forth, a pkgsrc package requires the user to perform extra work to integrate the software into the OS. Additionally, I do prefer the way ports actively prompts the user for configuration directives for packages that provide them, but that’s probably just a matter of taste.
Of course, I once again made the mistake of investing a fair bit of time into installing packages before I decided to test out suspend, and once again I was disappointed, though somewhat less so (which is why NetBSD is still on my laptop). Suspending the laptop worked flawlessly, and was incredibly fast. Honestly, I’ve never seen a laptop go to sleep that quickly. But on resume, oddly enough, my videocard doesn’t get initialized properly (this is a known problem with nVidia graphics chips in general, and on my laptop model in particular). On the other hand, everything else works perfectly (the OS is actually fully responsive under the hood, the display simply doesn’t come on). Some hacking got things sorta working, but not reliably, so for now suspend on NetBSD will have to wait. But at least there appears to be a chance.
So for now I’ve decided to stick with NetBSD. Naturally I expect there to be more problems and limitations (at minimum, I’ll be stuck with nv as my X driver, as nVidia’s binary blob isn’t supported on NetBSD), and I doubt it’ll displace my Ubuntu install, but it should be fun seeing if it can!
And quick aside: I was very impressed to discover that both Free and NetBSD supported essentially all the hardware on my laptop, without exception (well, save for ACPI suspend, of course), straight out of the box. Very nice!