20120913

ZFS, Additional Thoughts

I am about to expand my ZFS array, and I'm a little bit stuck...not because I don't know what to do, but because I am reflecting on my experiences thus far.

I guess I just find ZFS a little, well, uncomfortable.  That's really the best word I can come up with.  It's not necessarily all ZFS' fault, although some of the fault does lie with it.  I'll try to enumerate what's troubling me.

First, the drive references - the recommendation is to add devices via their /dev/disk/by-id (or similarly unique-but-consistent) identifiers.  This makes sense in terms of making sure the drives are always recognized and handled in the correct order, and having been through some RAID hell with drive ordering, I can attest that there have been instances where I've cursed the seeming randomness of how the /dev/sd? identifiers are assigned.  That being said, my Linux device identifiers look like this:

    scsi-3632120e0a37653430e79784212fdb020

That's really quite ugly, and as I look through the list of devices to prune out the ones I've already assigned, I'm missing my little three-character device names... a lot.  This doesn't seem to be an issue on OpenSolaris systems, but I can't/won't run OpenSolaris at this time.
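For the record, the mechanics aren't hard; a rough sketch of what I mean, with a made-up pool name and placeholder IDs standing in for the real ones, looks something like this:

    # Map the stable by-id names back to their short /dev/sd? counterparts
    # (entries ending in -part* are partitions and can be skipped)
    ls -l /dev/disk/by-id/

    # Create a raidz1 vdev referencing the by-id paths (placeholders below)
    zpool create tank raidz1 \
        /dev/disk/by-id/scsi-<disk1-id> \
        /dev/disk/by-id/scsi-<disk2-id> \
        /dev/disk/by-id/scsi-<disk3-id> \
        /dev/disk/by-id/scsi-<disk4-id>

It works, of course; it's just no fun to read back.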

Second, there's the obvious "expansion through addition" instead of "expansion through reshaping."  I want to believe that two RAID-5-style arrays will give me almost as much redundancy as a single RAID-6, but truth be told, any two drives could fail at any time, and a single RAID-6 survives any two failures while a pair of raidz1 vdevs only survives a double failure if the dead drives land in different vdevs.  I do not think we can safely say that two failing in the same array is less likely than one failing in each array.  If anything, it's just as likely.  If Fate has its say, it's more likely, just to piss off Statistics.

But this is what I've got, and I can't wait another 15 days for all my stores to resync just because I added a drive.  That will be even more true once the redundant server is remote again and syncing happens over a tiny 10Mbit link.  I'll just have to bite the bullet, build another raidz1 of four drives, and hope for the best.
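For comparison's sake, here is roughly what each route looks like; the pool and array names are placeholders, not my actual setup:

    # ZFS: expansion through addition - bolt a second raidz1 vdev onto the pool
    zpool add tank raidz1 \
        /dev/disk/by-id/scsi-<disk5-id> \
        /dev/disk/by-id/scsi-<disk6-id> \
        /dev/disk/by-id/scsi-<disk7-id> \
        /dev/disk/by-id/scsi-<disk8-id>

    # mdadm: expansion through reshaping - add one drive, grow the array,
    # then wait out the long restripe
    mdadm --add /dev/md0 /dev/sdX
    mdadm --grow /dev/md0 --raid-devices=5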

Third, I'm just a little disturbed by the fact that once you bring something like a raidz online, there is no initial sync.  I guess the creators of ZFS thought it superfluous; after all, if you've written no data, why bother syncing garbage?  An initial sync is just something I've come to expect from things like mdadm and every RAID card there is, but on reflection I suppose it doesn't make a lot of sense for ZFS.  I'm trying to find a counter-example, but so far I can't think of a good one.
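To make the contrast concrete, here's roughly what I mean; names are placeholders:

    # mdadm: creating an array kicks off an initial sync right away,
    # and /proc/mdstat shows its progress
    mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[b-e]
    cat /proc/mdstat

    # ZFS: the pool is usable the moment it exists; no initial resilver.
    # A scrub can verify whatever has actually been written, on demand.
    zpool create tank raidz1 <disks...>
    zpool scrub tank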

Fourth, the tools remind me a little of something out of the Windows age.  They're quite minimalist, especially when compared to mdadm and LVM.  Those latter two tools provide a plethora of information, and while not all admins will use it, there have been times I've needed it.  I just feel like the conveniences offered by the ZFS command-line tools actually take away from the depth of information I expect to have access to.  I know there is probably a good reason for it, yet it just isn't that satisfying.
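To put it side by side, these are the sorts of commands I'd reach for in each world (names are placeholders again):

    # ZFS: terse status and a flat list of properties
    zpool status -v tank
    zfs get all tank

    # mdadm/LVM: the detail dumps I'm used to digging through
    mdadm --detail /dev/md0
    pvdisplay
    vgdisplay -v
    lvdisplay -v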

The obvious question at this point is: why use it if I have these issues with it?  Well, for the simple fact that it does per-block integrity checks.  Nothing more.  That is the one killer feature I need because I can no longer trust my hard drives not to corrupt my data, and I can't afford to drop another $6K on new hard drives.  I want so badly to have a device driver that implements this under mdadm, but writing one still seems beyond the scope of my available time.
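That feature, at least, is dead simple to exercise; a scrub re-reads every allocated block and verifies it against its checksum, and anything that fails shows up as CKSUM errors (pool name is a placeholder):

    # Walk every allocated block and verify its checksum
    zpool scrub tank

    # Checksum failures appear in the CKSUM column; -v lists affected files
    zpool status -v tank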

Or is it?
