Flash impact

I thought a little about the upcoming flash drives in our systems yesterday evening, while summarizing a meeting with a customer regarding our Open Storage initiative. I really think these systems will change the storage business.
There is a strange development in storage. While the capacity per disk rose from a few megabytes to a terabyte, the number of hard disks for a given task didn't shrink at the same speed. Imagine the following example: assume you have a database with 1 terabyte worth of data, and assume you used 73 GB hard disks in the past. You would spread this over 14 disks, or better 28 disks for data redundancy with RAID-1. Interestingly, even with 1 TB hard disks available, you will see almost the same number of disks, although you would need only one (or better two, for redundancy) to provide the capacity.

The reasoning is relatively simple: there is a second parameter in sizing storage, the number of Input/Output Operations Per Second (IOPS). More than anything else, the IOPS rate is dictated by physics: how fast the spindle rotates. And interestingly, this is a relatively fixed number. We talked about 7.2k, 10k and 15k rounds per minute in 2000; we still talk about 7.2k, 10k and 15k in 2008.

The kicker: no matter which variable you size for, you will oversize the other part of the equation. Let's assume you need 5000 IOPS, and let's assume an ideal 15k disk delivering around 500 IOPS. With the 14 (or 28) 73 GB disks you get 7000 (respectively 14000) IOPS and roughly 1 TB of capacity. You have more IOPS than you need, but you need that number of disks to reach the capacity. Now let's assume a 1 TB 15k disk. You would need 10 (respectively 20) disks to fulfil the IOPS requirement, giving you 5000 IOPS but a whopping 10 (respectively 20) TB of storage. You have more capacity than you need, but you need that number of disks to reach your target IOPS number.

Okay, now let's think differently about the problem. Assume you have a technology that combines an IOPS-optimized medium with a capacity-optimized medium. Let's assume the first medium is an SSD qualified for 7500 IOPS, and the other medium is our 15k hard disk.
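The sizing dilemma above is easy to put into a few lines of code. This is just a sketch of the back-of-the-envelope arithmetic (the function name and the idealized 500-IOPS-per-spindle figure are my assumptions, not vendor specs): the number of spindles is driven by whichever requirement, capacity or IOPS, demands more disks.

```python
import math

def disks_needed(capacity_tb, iops, disk_capacity_tb, disk_iops):
    """Spindles needed to satisfy BOTH the capacity and the IOPS
    requirement (before any redundancy factor is applied)."""
    by_capacity = math.ceil(capacity_tb / disk_capacity_tb)
    by_iops = math.ceil(iops / disk_iops)
    return max(by_capacity, by_iops)

# 1 TB of data, 5000 IOPS, an idealized 15k disk doing 500 IOPS:
print(disks_needed(1.0, 5000, 0.073, 500))  # 73 GB disks -> 14, capacity-bound
print(disks_needed(1.0, 5000, 1.0, 500))    # 1 TB disks  -> 10, IOPS-bound
```

In the first case the capacity term wins and you oversize IOPS; in the second the IOPS term wins and you oversize capacity. Doubling for RAID-1 gives the 28 and 20 disk counts from the text.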
How many drives do you need now? Exactly: 2 (respectively 4 for redundancy). The IOPS requirement is fulfilled by the SSD, the storage capacity is fulfilled by the rotating rust. All you need is a filesystem that combines both media in an intelligent manner. And even better: there is no 1 TB 15k hard disk … but with the SSD you don't need one. You can use a standard SATA 7.2k disk. Even when it isn't qualified for 24/7 operation under a high-load pattern, this is not a problem: the SSD shields the mechanical disk from that high load.

I think this will have an interesting impact on large enterprise storage. One reason to use it was the need to administer a large number of disks from a central position. Do you still need such large systems when you need 4 disks where you needed 20 or 28 disks in former times? Do the same for 20 servers: 80 disks instead of 400, respectively 560, disks. You don't need the monster caches of enterprise storage, and you don't need the capability of housing hundreds of disks.

Additionally, this will impact backup, too. Even with enterprise tape we have reached the point where the tape drives don't keep pace with the development of hard disks. Now that we've reduced the number of disks for the working dataset, we have a power and rack-space budget to use for other backup concepts. This could be something that sounds insane at first: let's use the 1 TB 15k disk example again. We saved 16 disks by using the SSD, so we can use 16 disks for backups before we reach the same power budget. Let's assume a mirrored backup pool: we could do 8 generations of full backups to disk.

I really think that flash storage in server systems will vastly change the way we design storage in the future. This will be an interesting development.
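The two-tier sizing works the same way, just with each requirement handed to the medium that is good at it. A minimal sketch, assuming the 7500-IOPS SSD rating from the example and a hypothetical helper name of my own (real hybrid pools, e.g. a ZFS pool with separate log and cache devices, would of course size this with more care):

```python
import math

def hybrid_drives(capacity_tb, iops, ssd_iops, hdd_capacity_tb, mirrored=True):
    """Drive counts for a two-tier pool: SSDs absorb the IOPS,
    cheap capacity disks absorb the dataset. Returns (ssds, hdds)."""
    ssds = math.ceil(iops / ssd_iops)
    hdds = math.ceil(capacity_tb / hdd_capacity_tb)
    factor = 2 if mirrored else 1
    return (ssds * factor, hdds * factor)

# 1 TB dataset, 5000 IOPS, one SSD rated at 7500 IOPS, 1 TB SATA disks:
print(hybrid_drives(1.0, 5000, 7500, 1.0))  # -> (2, 2), i.e. 4 drives total
```

Compared with the 20 mirrored 1 TB 15k disks of the pure-spindle design, that is the 16-disk saving the backup argument above spends: 16 disks in a mirrored pool hold 8 one-terabyte full-backup generations.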