The Very Basics of How a Filer Works
Our storage devices, largely NetApp filers, are the least understood pieces of gear in our inventory. Not because they are infinitely more complicated than anything else we do; they are simply newer technology, relatively speaking, for Marines. The virtualization craze is what really pushed us to make diligent use of these devices, but more on that in another post. I'm not going to cover any commands or screenshots in this post, only the basic components and the theory behind how they work.
A filer is really just a specialized server. It has a collection of hard drives, multiple network interfaces, and multiple storage processors (normally a pair).
Hard drives are grouped together into RAID groups. You can choose how many RAID groups you would like in total, and out of that number, how many you would like per storage processor. I've written a short post on the basics of RAID. A couple of considerations when you are making this choice: each storage processor runs a version of the NetApp operating system, ONTAP, which is installed on the disks in those RAID groups. If you want your filer to be able to run in HA (High Availability) mode, you will need at least two RAID groups: one for storage processor 1 and one for storage processor 2. HA allows one storage processor to take over for its partner in the event of a failure.
What has worked for me when dealing with a 12-disk filer is to use RAID 4 for two separate groups. Group 1 is six disks assigned to storage processor 1 and is used for VM storage. Group 2 is five disks assigned to storage processor 2 and is used as a ShareDrive (using CIFS). This leaves one disk to be used as a hot spare. By separating your VMs and your ShareDrive by storage processor, the activity of one will not affect the other.
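To make that split concrete, here is a minimal sketch (plain Python as a planning aid, not ONTAP commands; the disk names are made up for illustration) of how the 12 disks break out between the two storage processors and the hot spare.

```python
# A rough sketch of the 12-disk split described above. Illustration only,
# not filer configuration; disk names are invented.

disks = [f"disk{n}" for n in range(12)]

layout = {
    # Group 1: six disks on storage processor 1, used for VM storage.
    "sp1_raid4_vm_storage": disks[0:6],
    # Group 2: five disks on storage processor 2, used for the ShareDrive (CIFS).
    "sp2_raid4_sharedrive": disks[6:11],
    # One disk left over as a hot spare.
    "hot_spare": disks[11:12],
}

for group, members in layout.items():
    print(f"{group}: {len(members)} disks -> {', '.join(members)}")
```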
Physical disks, organized into those RAID groups, are combined into an aggregate. This allows you to have capacities greater than that of a single disk. For example, an aggregate of five physical 300GB disks (four data and one parity disk) would give you a total of 1.2 TB (300GB x 4; the actual usable space will not be quite this high). It is not 300GB x 5 because the parity disk does not contribute to your total storage space. Your aggregates can be logically separated into volumes. For example, Aggr0 will contain Vol0, which is where the operating system, ONTAP, is installed. While you could just place all of your files into Vol0, it is a good idea to create another volume, Vol1, into which you place your files (such as VMs). If you accidentally outgrow your storage, keeping this logical division keeps you from filling up the volume that holds your operating system (which would be a bad thing).
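If it helps to see the arithmetic, here is a tiny worked example (plain Python, illustration only) of the usable-space math for that five-disk RAID 4 aggregate.

```python
# Worked example of the aggregate math above. Illustration only.

DISK_SIZE_GB = 300   # size of each physical disk
TOTAL_DISKS = 5      # disks in the aggregate
PARITY_DISKS = 1     # RAID 4 dedicates a single parity disk per RAID group

data_disks = TOTAL_DISKS - PARITY_DISKS
raw_usable_gb = data_disks * DISK_SIZE_GB   # the parity disk adds no usable space

print(f"{data_disks} data disks x {DISK_SIZE_GB}GB = {raw_usable_gb}GB (~{raw_usable_gb / 1000:.1f} TB)")
# Prints 1200GB (~1.2 TB); the space actually available to you will be somewhat
# lower once the operating system and filesystem overhead take their cut.
```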
The next logical question is: how do I "talk" to my storage? Say you have three servers, all ESXi hosts (servers running VMware). You would like to be able to vMotion (sometimes called live migration) your VMs between all three of these hosts. The requirement for this is "shared" storage, that is, storage all three ESXi hosts can access. The filer you put online would serve as that shared storage, but the filer and the ESXi hosts need a common language in order to talk. The three major ways this is normally accomplished are NFS (Network File System), iSCSI (Internet Small Computer System Interface), or FibreChannel. NFS and iSCSI run over IP, which means you do not need special switches or routers to use them. FibreChannel is altogether different, and requires special network adapters on your ESXi hosts and NetApp filer, as well as a special switch that understands FibreChannel. If you are big and performance is your main driver, FibreChannel is for you, but none of the Marine Corps employments I have been part of have required that level of performance, and it is more gear and complexity than we need. That leaves iSCSI vs. NFS. I'm personally a big fan of NFS, but if you want to dig deeper into the details, you can start here.
The last thing you need to worry about is the networking. A very popular configuration is to have four NICs (Network Interface Cards) per storage processor. I like to split these into two groups of two NICs. The first group normally gets a private IP and is used for storage traffic (VMs). The second group normally gets a public IP and is used for the ShareDrive (CIFS). Part of your HA configuration will be to have these interfaces work as teams: in the event storage processor 1 goes down, not only does storage processor 2 take over the RAID groups associated with storage processor 1, it also takes over the IP addresses. Make sure when you design this that you are consistent with your VLANs. For example, if the first group of interfaces on storage processor 1 is in VLAN 10, make sure the first group of interfaces on storage processor 2 is also in VLAN 10 so it will still work in the event of a failure.
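As a planning aid, here is one more minimal sketch (plain Python again, not filer configuration; the port names, IPs, and VLAN numbers are all made up) that lays out the four-NICs-per-storage-processor plan and checks that each interface group sits in the same VLAN on both storage processors, so a takeover does not strand traffic in the wrong VLAN.

```python
# A rough planning sketch of the 4-NICs-per-storage-processor layout described
# above. Port names, IPs, and VLANs are invented for illustration.

interface_plan = {
    "sp1": {
        "storage_vm_traffic": {"nics": ["e0a", "e0b"], "vlan": 10, "ip": "192.168.10.11"},   # private
        "sharedrive_cifs":    {"nics": ["e0c", "e0d"], "vlan": 20, "ip": "198.51.100.11"},   # public
    },
    "sp2": {
        "storage_vm_traffic": {"nics": ["e0a", "e0b"], "vlan": 10, "ip": "192.168.10.12"},
        "sharedrive_cifs":    {"nics": ["e0c", "e0d"], "vlan": 20, "ip": "198.51.100.12"},
    },
}

# The check that matters for HA: each group must be in the same VLAN on both
# storage processors, or a takeover leaves that traffic stranded.
for group in interface_plan["sp1"]:
    vlan1 = interface_plan["sp1"][group]["vlan"]
    vlan2 = interface_plan["sp2"][group]["vlan"]
    status = "OK" if vlan1 == vlan2 else "MISMATCH - fix before going live"
    print(f"{group}: sp1 VLAN {vlan1} / sp2 VLAN {vlan2} -> {status}")
```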