"Disks are always full. It is futile to try to get more disk space. Data expands to fill any void." This example of Murphy's Computer Laws usually brings a smile to the faces of IT professionals, yet, while humorous, there's always the truism that "in good humor, there is usually truth" to sober one's mood.
The battle to provide additional storage is never-ending, and for well-heeled corporations, the weapon of choice is the storage area network (SAN). Smaller corporations usually make do with network-attached storage (NAS), but they don't need to. Open-source software, in combination with ever-faster commodity network speeds, can put a SAN into shops with smaller budgets. This month, I'll point you to some resources that will allow you to build your own SAN for minimal cost.
NAS vs. SAN
Network-attached storage and storage area networks share some of the same properties. From a management standpoint, NAS and SANs are usually implemented to minimize maintenance and downtime through the purchase of quality hardware and backup/recovery solutions. Contrast the management needs of a network of even as few as 10 peer computers, all with cheap hardware and their own disk storage, with those of a network consisting of one server and 10 clients, and you can easily see the utility of a client/server network. Scale that up a couple of orders of magnitude, and you can really appreciate it. The differences between NAS and SAN technologies boil down to the way data appears to the client machine and the additional recovery options made easier via a SAN.
NAS has been with us for many years and is most familiar in the guise of the Network File System (NFS) on *nix machines and i5, or SMB/CIFS on Windows machines (and i5). In both cases, the server hosting the data provides file-level access; thus it is responsible for authenticating a client's access to the various directories (shares) and files. Since many clients can be simultaneously connected to the same resource, the server also handles file contention/locking. A few applications (databases are the classic example) don't cope well with files hosted this way, but despite the occasional problem caused by multiple clients accessing a given file, NAS works well and has modest resource requirements both in computing power and network bandwidth. No modern operating system worth its salt comes without the ability to access NAS resources.
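To make the distinction concrete, file-level access from a Linux client looks like the line below; the server name and export path are placeholders, and the same idea applies to an SMB/CIFS share on Windows:

# File-level access: the server exports a directory, and the client mounts it
# ("nas-server" and "/export/projects" are placeholder names)
mount -t nfs nas-server:/export/projects /mnt/projects

The client never sees the disk itself, only the files and directories the server chooses to expose.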
A SAN server is a horse of another color. Unlike its NAS brother, a SAN server provides block-level access to the client. Thus, what a SAN serves up appears as a directly attached drive. I'm not talking about a drive as in "I have my drive N: mapped to \\server\share"; I'm talking about a drive as in fdisk, format, and install to it. It doesn't take much to imagine the network bandwidth required to support this kind of access (it's huge) and thus the cost of the network hardware to provide it. Is it any wonder that SANs have traditionally been deployed only in large enterprises?
While NAS allows multiple connections to the same resource, only the über-expensive SAN solutions permit more than one client to connect to a single resource at a time. The faux drives are not shareable, at least not by the low-cost solution I'll be discussing shortly. With this seemingly huge limitation, why would you want to implement a SAN? By utilizing SAN storage instead of directly-connected drives, you enhance your ability to swap server hardware (clients of a SAN) for upgrade or replacement. Provided your server can boot from a SAN, you have little to move except the old hardware. Remember those applications I mentioned earlier, the ones that don't work well with NAS-hosted files? They'll work just fine on a SAN, as it is the host on which the software is running that is taking care of file access. Everything works the same as it would if the storage was directly attached. From a backup perspective, SANs have some terrific tools. Even the on-the-cheap solution I'm going to describe has the ability to make snapshots for backups that won't interfere with, or be interfered with by, the applications currently accessing the SAN drive.
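One straightforward way to get those snapshots on a Linux-based target is the Linux volume manager (LVM). Assuming the block device you export is an LVM logical volume (the volume and path names below are placeholders), a point-in-time copy for backup is just a few commands on the server:

# Create a point-in-time snapshot of the exported logical volume
lvcreate --snapshot --size 2G --name iscsi_snap /dev/vg0/iscsi_lv

# Copy the frozen snapshot somewhere safe at your leisure, then discard it
dd if=/dev/vg0/iscsi_snap of=/backups/iscsi-lun0.img bs=1M
lvremove -f /dev/vg0/iscsi_snap

The applications writing to the LUN keep working against the live volume while you back up the frozen copy.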
iSCSI
The network technology that I use in my SAN is iSCSI. It uses TCP/IP to transport SCSI commands between the client (the iscsi-initiator) and the server (the iscsi-target). This is but one of the options, but it's one that I can see becoming increasingly attractive as network bandwidth becomes cheaper. The downside to iSCSI is the overhead introduced by the TCP/IP stack, which can be significant. Some iSCSI hardware solutions offload the workload from the CPU to a dedicated card, but they're beyond the scope of this article.
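Because everything rides over ordinary TCP/IP, there is no exotic network plumbing to configure; an iSCSI target listens on TCP port 3260 by default (you'll see that port number in the listings later in this article). The only preparation the server may need is a firewall rule, something along these lines if you use iptables (adjust the subnet to match your own network):

# Allow iSCSI traffic (default TCP port 3260) from the local subnet only
iptables -A INPUT -p tcp -s 192.168.1.0/24 --dport 3260 -j ACCEPT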
It bears repeating: the most important resource for a successful SAN implementation is network bandwidth. Without sufficient bandwidth, you'll end up with drive performance reminiscent of early hard-drive technology. The reigning champ in SAN network technology is Fibre Channel, with its ability to provide speeds that start at 1 Gbps and go upward from there. Fibre Channel makes our humble Ethernet speeds look pathetic (it should, given the price), yet the latest 1 Gbps Ethernet hardware is becoming inexpensive and thus is starting to encroach on Fibre Channel's turf. The gap between the two will continue to shrink, or at least the bandwidth provided by Ethernet will become sufficient to handle mid-sized SANs.
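To put rough numbers on that: 100 Mbps Ethernet tops out at about 12 MB/s of raw throughput (100 ÷ 8 = 12.5 MB/s, before TCP/IP and iSCSI overhead), which is early-IDE territory, while 1 Gbps raises the ceiling to roughly 125 MB/s, more than a single commodity drive of this era can sustain. That is why gigabit is the practical floor for anything beyond experimentation.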
Roll Your Own
While there are certainly many commercially available SAN solutions, most companies will not want to pony up the cash to purchase one unless they can be certain that it will work. Fortunately, using very modest hardware, you can roll your own for proof-of-concept or as an educational exercise. While it won't be sufficient for more than testing, that old desktop machine that you have in your closet will do just fine. You'll need at a minimum a 100 Mbps Ethernet card (anything slower won't even be worth the time for testing), a disk drive large enough to hold a Linux installation and the space you'd like to export (20 GB or more will be more than enough), and 512 MB or so of RAM. For testing, I use CentOS Linux (a RHEL derivative), which already includes the iscsi-initiator code. Installing Linux should take no more than 30 minutes if you've done it before or an hour if you haven't and you take the time to read the help text on each of the install screens.
Once you have the Linux installation completed and updates applied, install the iscsi-target software using the instructions found in the article entitled "Going Enterprise - setup your FC4 iSCSI target in 5 minutes." While his estimate of five minutes is a bit optimistic, Anze (the author) has done a pretty good job of documenting the process, complete with pictures. He uses Windows 2003 for the iscsi-initiator (client), so if you will be attaching to your SAN from that platform, you can see what it looks like. The only caveat to using his method is that you have to remember to recompile the iscsi-target module any time you update the Linux kernel. The module is tied to the kernel version and may fail to load if you reboot into a new kernel. You won't lose data if this happens; you'll just be unable to access any iscsi targets until you do recompile.
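For reference, the heart of the target-side configuration is the /etc/ietd.conf file read by the iSCSI Enterprise Target software (the "IET" vendor string you'll see in the listings below). A minimal entry looks something like this sketch; the IQN, the path to the block device, and the CHAP credentials are placeholders you'd change to suit:

Target iqn.2007-07.com.example:storage.lun0
        # Export a logical volume (or any block device) as LUN 0
        Lun 0 Path=/dev/vg0/iscsi_lv,Type=fileio
        # Optional: require the initiator to present these CHAP credentials
        IncomingUser iscsiuser secretpassword12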
For those using Linux as the initiator, you will need to load the "iscsi-initiator-utils" package (assuming RHEL or CentOS) or your distribution's equivalent. If you are not going to use authentication (a really bad idea in a production environment), you need only one line in the initiator's /etc/iscsi.conf file: DiscoveryAddress=your.server.here. Adding authentication is a simple process, described in the well-documented configuration files on both the target and initiator sides.
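On my test network, the initiator side really is that small. Using the target address that appears in the listings below, the relevant line in /etc/iscsi.conf is simply:

# Point the initiator at the machine running the iscsi-target software
DiscoveryAddress=192.168.1.15

The CHAP username and password directives are spelled out in the comments shipped inside the configuration files themselves, so I won't repeat them here.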
With the configuration in place, start the initiator with /sbin/service iscsi start. If all goes well, you should see something like this in your /var/log/messages file:
Jul 20 01:01:50 laptop2 kernel: iscsi-sfnet: Loading iscsi_sfnet version 4:0.1.11-3
Jul 20 01:01:50 laptop2 kernel: iscsi-sfnet: Control device major number 253
Jul 20 01:01:50 laptop2 iscsi: Loading iscsi driver: succeeded
Jul 20 01:01:55 laptop2 iscsid[15158]: version 4:0.1.11-4 variant (15-Jan-2007)
Jul 20 01:01:55 laptop2 iscsi: iscsid startup succeeded
Jul 20 01:01:56 laptop2 iscsid[15168]: Connected to Discovery Address 192.168.1.15
Jul 20 01:01:56 laptop2 kernel: iscsi-sfnet:host1: Session established
Jul 20 01:01:56 laptop2 kernel: scsi1 : SFNet iSCSI driver
Jul 20 01:01:56 laptop2 kernel: Vendor: IET Model: VIRTUAL-DISK Rev: 0
Jul 20 01:01:56 laptop2 kernel: Type: Direct-Access ANSI SCSI revision: 04
Jul 20 01:01:56 laptop2 kernel: SCSI device sda: 536870912 512-byte hdwr sectors (274878 MB)
Jul 20 01:01:56 laptop2 kernel: SCSI device sda: drive cache: write back
Jul 20 01:01:56 laptop2 kernel: SCSI device sda: 536870912 512-byte hdwr sectors (274878 MB)
Jul 20 01:01:56 laptop2 kernel: SCSI device sda: drive cache: write back
Jul 20 01:01:56 laptop2 kernel: sda: sda1
Jul 20 01:01:56 laptop2 kernel: Attached scsi disk sda at scsi1, channel 0, id 0, lun 0
Jul 20 01:01:56 laptop2 scsi.agent[15205]: disk at /devices/platform/host1/target1:0:0/1:0:0:0
This indicates that the new "drive" will be addressed as /dev/sda, with one partition: /dev/sda1. The partition shows up because I had earlier used fdisk to create it; your application may allow (or prefer) accessing the entire drive. Formatting the partition is the same as for any directly attached drive on a Linux system: mkfs.ext3 /dev/sda1. Mounting it into the root filesystem is also identical: mkdir /mnt/iscsi; mount /dev/sda1 /mnt/iscsi.
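If you want the new filesystem mounted automatically at boot, give it an fstab entry. The _netdev option marks it as a filesystem that needs the network (and therefore the iscsi service) before it can be mounted; on RHEL-style systems, the netfs service typically handles those at boot. A sketch of the entry:

# /etc/fstab - mount the iSCSI-backed partition after the network is up
/dev/sda1    /mnt/iscsi    ext3    _netdev    0 0

Because the /dev/sdX name can shift if other SCSI devices come and go, labeling the filesystem (e2label /dev/sda1 iscsi0) and using LABEL=iscsi0 in fstab is a sturdier long-term reference.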
The iscsi-initiator-utils package also includes iscsi-ls, which, when invoked on my laptop, produces this output:
*****************************************************************
SFNet iSCSI Driver Version ...4:0.1.11-4(15-Jan-2007)
*****************************************************************
TARGET NAME : iqn.2006-12.com.blkline.hunt.testbox1:storage.lun0
TARGET ALIAS :
HOST ID : 0
BUS ID : 0
TARGET ID : 0
TARGET ADDRESS : 192.168.1.15:3260,1
SESSION STATUS : ESTABLISHED AT Thu Jul 19 19:20:23 EDT 2007
SESSION ID : ISID 00023d000001 TSIH 100
DEVICE DETAILS:
---------------
LUN ID : 0
Vendor: IET Model: VIRTUAL-DISK Rev: 0
Type: Direct-Access ANSI SCSI revision: 04
page83 type1: 49455400000000000000000001000000040500000d000000
page80: 0a
Device: /dev/sda
*******************************************************************************
Once mounted, it appears as any other drive:
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg0-log_rootdir
6.0G 5.2G 466M 92% /
/dev/hda2 99M 20M 75M 21% /boot
none 506M 0 506M 0% /dev/shm
/dev/mapper/vg0-log_home
25G 23G 540M 98% /home
/dev/mapper/vg0-log_opt
2.5G 1.8G 657M 73% /opt
/dev/mapper/vg0-log_tmp
1.2G 35M 1.1G 4% /tmp
/dev/mapper/vg0-usr_local
248M 229M 6.6M 98% /usr/local
/dev/mapper/vg0-log_var
3.0G 2.1G 784M 73% /var
/dev/mapper/vg0-vmstorage
38G 28G 9.4G 75% /vmstorage
/dev/sda1 252G 93M 247G 1% /mnt/iscsi
Throw Money at It
If you have built your SAN using the hardware I've just described, don't expect that it will set any world records for access speed. However, by taking the time to build such a machine, you will see that it is very simple to get a SAN built and to start discovering the benefits one can provide. Once your curiosity is satisfied, it is a simple step to achieve a production-quality version: Throw money at it. If you have lots of money, jump headfirst into fibre channel and a commercial solution. For those who are more frugal or have more modest requirements, consider purchasing server-quality hardware with plenty of disk arms, RAID, and, of course, one or more Gb Ethernet NICs.
I have used my SAN to provide storage to older hardware on my network. I've used it for temporary storage for projects. I've even used it in conjunction with LVM to rearrange and upgrade hard drives—while the systems were running.
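That last trick deserves a quick sketch, because it shows off what block-level storage buys you. LVM doesn't care whether a physical volume lives on a local disk or an iSCSI LUN, so you can migrate data onto SAN storage while the filesystems stay mounted. Assuming an iSCSI device of /dev/sdb and an old local partition of /dev/hda3 (placeholder names, not the setup shown above), the moves look like this:

# Turn the iSCSI LUN into an LVM physical volume and add it to the volume group
pvcreate /dev/sdb
vgextend vg0 /dev/sdb

# Migrate all extents off the old local partition; filesystems stay online
pvmove /dev/hda3 /dev/sdb

# Retire the old partition from the volume group
vgreduce vg0 /dev/hda3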
SAN technology can be intimidating. Make it less so by dipping your toes into the pond for little more than the cost of a couple of hours. You'll be glad that you did.
Barry L. Kline is a consultant and has been developing software on various DEC and IBM midrange platforms for over 23 years. Barry discovered Linux back in the days when it was necessary to download diskette images and source code from the Internet. Since then, he has installed Linux on hundreds of machines, where it functions as servers and workstations in iSeries and Windows networks. He co-authored the book Understanding Linux Web Hosting with Don Denoncourt. Barry can be reached at