20120426

Cluster File Systems, Beginning Trials

My first foray into cluster file systems taught me a great deal, though there is plenty left to learn.  Since coming to understand what OCFS2 can do for me, I have employed it thus (in a sandbox environment):

  • Create one iSCSI target - this would be our backing store, or "SAN".
  • Create three different accessing nodes - my iSCSI initiators.
  • Configure the three initiators into the same OCFS2 cluster.
  • Use OCFS2 to manage the iSCSI store effectively.
In short, it worked.  All three initiators connected to the iSCSI target simultaneously and, through OCFS2, were able to read and write the shared file system together in real time (a configuration sketch follows).
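As a taste of the configuration, a minimal /etc/ocfs2/cluster.conf for those three initiators would look roughly like the sketch below.  The cluster name, node names, and addresses are placeholders rather than the values I actually used, and indentation matters (each parameter line should start with a tab):

    cluster:
        node_count = 3
        name = sandbox

    node:
        ip_port = 7777
        ip_address = 192.168.0.101
        number = 0
        name = node1
        cluster = sandbox

with matching node: stanzas for node2 and node3 (numbers 1 and 2).  Once the o2cb service is online on all three nodes, the shared device only needs to be formatted once - for example, mkfs.ocfs2 -N 4 -L sandbox /dev/sdb - and can then be mounted with mount -t ocfs2 on every initiator (device names will vary).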

After browsing through an Ubuntu document on high-availability iSCSI, I think the road-map will be as follows:
  1. Set up a data-store server (server 1) with DRBD and iSCSI Enterprise Target (see the sketch after this list).
  2. Set up a second data-store server (server 2) to mirror the first server.
  3. Configure High-Availability services.
  4. Configure one or more new iSCSI initiators to use this store and OCFS2.
  5. Test fail-over (during reading and writing).
  6. Test live-migration of some sample VMs.
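To make steps 1 and 2 concrete, here is a rough sketch of the DRBD resource the two data-store servers might share.  The resource name, hostnames, disks, and addresses are all assumptions for illustration, not a tested configuration; in /etc/drbd.d/ it would look something like:

    resource iscsi-store {
        protocol C;
        on server1 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   192.168.0.10:7788;
            meta-disk internal;
        }
        on server2 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   192.168.0.11:7788;
            meta-disk internal;
        }
    }

The matching export in /etc/ietd.conf, on whichever server currently holds the DRBD primary role, would then be along the lines of:

    Target iqn.2012-04.local.sandbox:store.lun0
        Lun 0 Path=/dev/drbd0,Type=blockio

Step 3 (presumably Heartbeat or Pacemaker) is what would promote DRBD and start ietd on the surviving server during a fail-over.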
This raises the question: what are the limits of this cluster?  Well, the physical file system limits are set by the technology itself and by how much storage I can attach.  To that end, I can expand the systems to be quite large, although performance may suffer.  Increasing the size of the data-store cluster is an option, and a very viable one even without a technology to bind the cluster members together (in terms of their storage).  As far as any member of the OCFS2 cluster is concerned, there can be any number of drive targets in the cluster, which equates (in our case) to any number of iSCSI targets.

Adding more data-store servers basically means adding one or more new iSCSI targets.  All the initiators will access all of the targets, and data-migration from one target to another can happen anywhere, at any time.  Live-migration will continue to work; the only thing we miss out on is increased redundancy - that is to say, redundancy does not improve, but it also does not necessarily diminish.  As the data-store servers are already intended to be highly available, I think this is where we reach the point of diminishing returns.
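For an initiator, picking up an additional target is just another discovery and login with open-iscsi; with a placeholder portal address and IQN (again, assumptions rather than real values), that is roughly:

    iscsiadm -m discovery -t sendtargets -p 192.168.0.20
    iscsiadm -m node -T iqn.2012-04.local.sandbox:store.lun0 -p 192.168.0.20 --login

The new block device then appears on every node in the OCFS2 cluster, where it can be formatted and mounted alongside the existing storage.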

I will hopefully post my full, working configuration files soon, so that this information may live on.
