20100808

How are the mighty fallen...

I am defeated. The computer has won. Tomorrow I shall retaliate with a reboot of several pieces of hardware, but right now I'm done with it all.

One wouldn't think that something as simple as a NAS could bring down several servers. I'm not even sure how it's capable of that, but it is. And two of my servers have now paid the price.

I'm an into-virtualization kind of guy. I like to virtualize my hosts because it gives me the ability to send magical reset button events to them without having to drive into the office. I can RDC to their system consoles and make the magick happen whilst sshd is milling around in a zombie-like state. I never fear cycling the systems while I'm at home, comfortable, at my workstation.

But when the VM host suffers a panic-attack, the shit hits the fan. (In true "Airplane!" fashion, it also spins around a little before shlopping out onto the counter.) So I'm guessing right about now the host has a load average of around 10; it should be less than 1. I'm guessing that the VM responsible - which is already terminated, by the way - got nice and clunked up when the Netgear ReadyNAS 1100 exploded itself while trying to facilitate my very simple chmod -R request. I mean, really, wtf?! Is this what technology has come to? I know every Windows user expects a lock-up at least once a week, and a therapeutic reboot to fix all problems, but the "rest of us" like machines and operating systems that can go 180 days without flinching. Yeah, try that on Windows XP or 7 - I DARE YOU!

I once had a box live over 300 days before the power went out. Damn UPS batteries.

But here we have ReadyNAS, a strange Debian derivative running on a SPARC, serving up LVM-goodness and some superiority-complex called "X-RAID". Whatever it's got, it isn't enough to deal with my burly Linux boxen that madly request to store data on it. Even with dual gigabit ports, it simply can't handle the load, it seems...

...or it has shitty hard drives. That's also a distinct possibility. We've been through that mess already, but I have a terrible suspicion the faulty disks are not all gone. Anyway, it was over NFS, a simple chmod command, and then KABLAM! I can ping it, but I can't see it on RAIDar (their nifty little utility that doesn't do much except make NAS lights blink - which is kinda cool in and of itself), nor can I connect to it via any mechanism including the web interface, which was painfully slow when it WAS working.

What a pain in the ass...