Saving energy ... for real - or: the need for a different view of datacenter management
I've thought a little about energy saving in data centers while riding the train to Berlin. And the more I think about the problem, the more I'm sure that the biggest part is not the processor or something like that. The biggest single component of your electricity bill is the unused cycles of your processors.
There are two reasons for unused compute power. The first is technological inefficiency: processors waiting on memory, for example, or systems waiting for hard disks. These are easy to solve. The UltraSPARC T1 solves the first problem, correctly sized storage the second.
The second reason is a little more difficult to solve. Not because of technological problems - it's a problem of habits. The problem is unloaded systems in your datacenter. We are used to running a system with only one task. We have our mail servers, we have our DNS servers, and they run only this task. And when you look at the load of such a system, you see 10 or maybe 15 percent utilisation. We are used to the fact that once we have christened a system, it keeps its assigned task until it breaks or a more powerful system obsoletes it.
Both habits deny us a different view of the datacenter. We have to see the datacenter as a pool of resources, not as a collection of distinct machines, each executing its assigned task. We have to lift the assignments from the machine level to the pool level. This would solve many efficiency problems of enterprise computing.
As I reported a few days ago, only a fully loaded system uses energy efficiently. A system that draws 400 watts at full throttle needs 4 watts per percent of utilisation. A system that, thanks to some clever power saving, draws only 100 watts at 10% utilisation (discs still spinning, DRAM still refreshing, NIC still powered) needs 10 watts per percent of utilisation. So real energy saving can only be achieved by having systems either fully loaded or switched off.
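The arithmetic above boils down to one ratio - power drawn divided by utilisation delivered. A minimal sketch of the two scenarios:

```python
# Energy cost per percent of useful utilisation, as in the 400 W vs. 100 W
# example above: the lightly loaded box is worse per unit of work delivered.
def watts_per_percent(power_watts: float, utilisation_percent: float) -> float:
    return power_watts / utilisation_percent

fully_loaded = watts_per_percent(400, 100)  # 4.0 W per percent of utilisation
power_saving = watts_per_percent(100, 10)   # 10.0 W per percent of utilisation
print(fully_loaded, power_saving)
```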
So you need an intelligent orchestrating component in your datacenter. This central component has to monitor the workload of your systems and make decisions based on a mathematical model. This can be as simple as a function of time and workload, or a more complex multidimensional model that predicts the amount of computing resources needed at a certain point in time.
With this knowledge you can power computing resources up or down as you need them. You can anticipate the workload of your entire system five or ten minutes in advance. Okay, not every surge in demand for computing power is predictable, but that could be handled by an override exception ruleset, for example: "datacenter load at 95% - start a new computing resource". And you don't need new technologies like super-duper silicon manufacturing processes. All the components are available:
- with Xen, a universal hypervisor is available to build an abstraction layer between the bare metal and the service
- with live migration you can move your service without interruption
- with the fault management architecture of Solaris you can prevent multi-service outages due to failed components more efficiently than with other operating systems
- every decent server that earns this name has Lights-Out Management
- the extended accounting of the Solaris Resource Manager would enable the central component to gather the data for the mathematical model that orchestrates the pool

All these components will be available in Solaris in 2007. You can already use them in OpenSolaris. All you need is the mathematical model, an exception ruleset to control your resources, and some scripting magic to control the LOMs and the Xen hypervisor. It's not that difficult, but it would solve many of the energy problems in enterprise computing. Designing and developing such a component would be really simple compared to the huge amount of work needed to save 10 watts at the processor level.
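Putting the pieces together, the orchestrating component could be as small as a periodic control loop: check the pool, apply the override rule, otherwise follow the model. This is only a sketch of the decision logic; `power_on`/`power_off` thresholds are hypothetical, and the actual LOM and Xen live-migration plumbing is left out.

```python
# Sketch of one iteration of the orchestrating control loop. The thresholds
# are hypothetical; the actions would be implemented via LOM commands and
# Xen live migration (drain a box before cutting its power).
OVERRIDE_THRESHOLD = 95   # the "datacenter load at 95%" exception rule
TARGET_LOW = 60           # below this, one box can likely be drained and cut

def decide(pool_utilisation: float, powered_on: int, pool_size: int) -> str:
    """Return the action for this control-loop iteration."""
    if pool_utilisation >= OVERRIDE_THRESHOLD and powered_on < pool_size:
        return "power_on"   # override rule: unpredicted surge, add a resource
    if pool_utilisation < TARGET_LOW and powered_on > 1:
        return "power_off"  # migrate services off one box, then switch it off
    return "hold"
```

A scheduler would call `decide()` every few minutes with the utilisation figures gathered from extended accounting, keeping the remaining boxes fully loaded - which is exactly where they are most energy efficient.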