The waning importance of storage array controllers
I found an interesting text this morning. In “SSDs, pNFS Will Test RAID Controller Design”, Henry Newman speculates about the future of RAID controllers and the difficulties their vendors will face with the advent of SSDs. My opinion is somewhat divided: Yes, RAID controller vendors will have a difficult time in the future. But: No, this won’t be the fault of SSDs.
Interestingly, the storage market behaves in waves (like many other markets). The pendulum has been swinging between RAID controllers and JBODs for years now. It was on the RAID controller side for many years, and at the moment it is moving towards JBOD. The next few years may even destroy the pendulum, as the shortfalls of RAID controllers become more and more visible and other technologies become available.
The most imminent threat to dedicated RAID controllers has its foundation in the past: hardware RAID was invented many years ago, when main CPUs didn’t have enough spare power for the calculations that RAID 5, for example, requires. So it was the obvious choice to offload these calculations to an external device, and the RAID controller was born. But this method isn’t without shortfalls. There is the read-modify-write cycle when you modify part of a stripe, and there is the point that a hardware RAID controller gives you some protection, but only for the case where you have to recover.
On top of that, a hardware RAID controller hides its redundancy from more intelligent mechanisms above it. When you use hardware RAID, the more intelligent mechanism sees just one copy of the data with one checksum. The system can detect an error, but it has no redundancy to correct it. A more intelligent mechanism that controls the redundancy itself can use it to correct the data, even when the data was corrupted on the wire. A mechanism more intelligent than a hardware RAID controller can also be aware of the placement of data on the disks, so it doesn’t have to keep millions of zeros or useless, already deleted data in sync.
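The difference between detecting and correcting an error can be sketched in a few lines of Python. This is only a toy model under loudly stated assumptions: the function names are made up for illustration, SHA-256 stands in for whatever checksum a real filesystem uses, and a two-way mirror stands in for the redundancy. It is not how ZFS or any RAID controller is actually implemented.

```python
import hashlib
from typing import List


def checksum(data: bytes) -> bytes:
    """Checksum kept by the filesystem, separate from the data (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def read_behind_hw_raid(volume_block: bytes, expected: bytes) -> bytes:
    """Hypothetical read path: the controller presents ONE logical copy.

    The filesystem can detect corruption, but it has no second copy to repair from.
    """
    if checksum(volume_block) != expected:
        raise IOError("checksum mismatch, and no redundancy is visible to fix it")
    return volume_block


def read_self_healing(copies: List[bytearray], expected: bytes) -> bytes:
    """Hypothetical read path: the filesystem sees every mirror copy itself.

    It returns the first copy that matches the checksum and rewrites the bad ones.
    """
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        raise IOError("every copy is corrupt")
    for c in copies:
        if checksum(bytes(c)) != expected:
            c[:] = good  # repair the damaged copy in place
    return good


if __name__ == "__main__":
    block = b"important data"
    expected = checksum(block)
    mirror = [bytearray(block), bytearray(block)]
    mirror[0][0] ^= 0xFF  # simulate silent corruption of the first copy

    try:
        read_behind_hw_raid(bytes(mirror[0]), expected)
    except IOError as err:
        print("behind hardware RAID:", err)

    print("self-healing read:", read_self_healing(mirror, expected))
    print("first copy repaired:", bytes(mirror[0]) == block)
```

The point of the sketch is only the asymmetry: both read paths notice the bad block, but just the one that can see the individual copies is able to return good data and repair the damaged copy.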
One of those more intelligent mechanisms would be ZFS, but I’m sure that the future will bring us other, similar technologies, as other operating environments have pretty much the same problems with their storage. I really think that all storage will look very similar to the S7000 in the future. When new storage media increase the requirements on storage systems, we get to a point where a few embedded CPUs aren’t enough, and we end up with systems that look very much like an x86 server. Maybe the server will be better hidden than in our S7000 series, but it will be similar. And once we are at this point, many storage companies will come to the conclusion that it may be a good choice to throw away the operating system they’ve used in the past and use something that is already available: a Linux, a BSD or, well, an OpenSolaris.
But then we get to a more important point: the people using these components could get the idea that storage arrays as we know them today, with their centralized storage controller, are just a bottleneck. And when all this storage stuff is done by a general-purpose OS on general-purpose hardware, you could come to the conclusion that your servers could do the job as well and get rid of some of the shortfalls I’ve described above.
Explaining that dedicated storage controllers still have some advantages will be a tough job for the vendors of such components. For a long time, data services like replication were one of those advantages, but many of them are already available with OpenSolaris, for example: with ZFS you can already do replication (synchronous with the help of AVS, asynchronous natively), you can do compression, you can thin-provision, you will be able to do encryption and deduplication, and you can use all the file-level and block-level sharing protocols. So even the resort of having more data services is just a short-lived, almost non-existent last resort.
pNFS is the next new technology that may be problematic for the future of dedicated storage controllers. It is distributed in nature, and there doesn’t seem to be a niche for these controllers in it. There is just a large number of servers with a modest number of hard disks per server, but again … just as a JBOD.
So, I’ve just explained why dedicated storage controllers may lose their drive(s). But why isn’t it the fault of SSDs, as I wrote in the beginning? The reason lies in the nature of SSDs: you simply don’t put SSDs behind RAID controllers. You keep the distance between the CPU and the SSD as short as possible. Period. Any storage network introduces latency, and the higher the number of I/O operations per second, the more harmful any additional latency is. There is only one exception: in a cluster, everything the surviving node needs to take over a consistent version of your data has to sit in the storage network, otherwise a failover isn’t possible. And when you don’t put the SSDs behind the RAID controllers, the controllers can’t be the bottleneck.
ZFS offers a technology called Hybrid Storage Pool (I’m pretty sure that we will see similar technologies in Linux and Windows at some point in the future), and with this technology you have no need to put the SSD behind a controller. It sits in front of the controller and reduces the load on the controller. Many people still think of SSDs as a substitute for the hard disk, but we have to think differently, and with HSP we can think differently.
So: Yes, RAID controllers really are a technology that may be of waning importance, but the reasons are different.
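As a footnote to the latency argument: a bit of back-of-the-envelope arithmetic shows why extra latency hurts a fast device so much more than a slow one. The sketch below assumes one outstanding request at a time, and all three latency figures are illustrative assumptions of mine, not measurements of any real disk, SSD or storage network.

```python
def iops_at_queue_depth_one(device_latency_us: float, path_latency_us: float = 0.0) -> float:
    """I/O operations per second when requests are issued strictly one after another."""
    return 1_000_000 / (device_latency_us + path_latency_us)


def share_lost_to_path(device_latency_us: float, path_latency_us: float) -> float:
    """Fraction of the device's possible IOPS that is lost to the added path latency."""
    return path_latency_us / (device_latency_us + path_latency_us)


if __name__ == "__main__":
    # Illustrative assumptions only: service time per request in microseconds.
    HDD_US = 5000.0   # assumed hard disk latency
    SSD_US = 100.0    # assumed SSD latency
    PATH_US = 200.0   # assumed latency added by controller and storage network

    for name, device_us in (("hard disk", HDD_US), ("SSD", SSD_US)):
        direct = iops_at_queue_depth_one(device_us)
        remote = iops_at_queue_depth_one(device_us, PATH_US)
        lost = share_lost_to_path(device_us, PATH_US)
        print(f"{name}: {direct:.0f} IOPS direct, {remote:.0f} IOPS behind the path, "
              f"{lost:.0%} lost")
```

With these assumed numbers the same added latency costs the hard disk only a few percent of its throughput, while the SSD loses about two thirds of it. The exact figures don't matter; the ratio between device latency and path latency does.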