James Hamilton has previously written on his blog about the inefficiency of blade servers, focusing mostly on the cooling/power/floor space equation. Equally important is the application deployment side of blade servers, which is, again, not as efficient as the vendors would like you to believe. Here are some of the reasons why:
- Deployment and capacity planning is never easy-peasy: Whoever tells you that blade servers let you deploy on a whim should be beaten with a stick. Even with virtualization, the total capacity of a blade setup is limited and fixed until you add another blade to the equation.
As a result, you need to be very careful about how you slice it up, because the last thing you want is to have to recalculate your hardware usage profiles a fair way into the deployment.
Remember, if it was hard for you to procure additional standalone servers, it will be even harder to get extra blades to augment capacity.
- Application layout nightmares: If you go in for an all-blade deployment, chances are that you will wind up with at least a handful of virtual servers. Any application that involves heavy reads and writes from disk will have to be planned rather carefully. The reason is that if you leave the data on the same platter as the guest OS in which the application runs, you will get sub-optimal disk performance, depending on how the host OS treats the guest OS-level reads and the application-level reads within the OS.
- Troubleshooting: In my experience (yours may vary), it has been much easier to troubleshoot problems with a standalone server than in a blade setup. Much like the proverbial "slip between the cup and the lip", problems crop up between the chassis and the individual servers, or between the host OS and the guest OS, leading to circumstances where you are not really sure what exactly went wrong.
- Blade maintenance: As an offshoot of the first point, it can become a bit of a problem when you have to take a blade down for regular maintenance like firmware updates. Normally, you stripe cluster nodes across two or more physical machines to extract more redundancy from your hardware. When you take down a blade, you take down an entire bunch of servers and not just one, which means that if your application layout is not absolutely perfect, it can lead to a lot of tears.
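To make the blade maintenance point a little more concrete, here is a minimal sketch of the kind of sanity check you could run against your layout before pulling a blade. It only flags clusters whose nodes all live on a single blade. The blade names, cluster names, and the `clusters_at_risk` helper are all made up for illustration, and a real layout would come from whatever inventory or virtualization tooling you use.

```python
from collections import defaultdict


def clusters_at_risk(layout):
    """Return {cluster: blade} for every cluster whose nodes all sit on one blade.

    `layout` maps a blade name to the (cluster, node) pairs hosted on it.
    This is a hypothetical structure, not the output of any real tool.
    """
    blades_per_cluster = defaultdict(set)
    for blade, guests in layout.items():
        for cluster, _node in guests:
            blades_per_cluster[cluster].add(blade)

    # A cluster seen on exactly one blade goes down entirely with that blade.
    return {
        cluster: next(iter(blades))
        for cluster, blades in blades_per_cluster.items()
        if len(blades) == 1
    }


# Hypothetical layout: three blades hosting four small clusters.
layout = {
    "blade-1": [("db", "db-a"), ("web", "web-a"), ("cache", "cache-a")],
    "blade-2": [("web", "web-b"), ("cache", "cache-b"), ("db", "db-b")],
    "blade-3": [("queue", "queue-a"), ("queue", "queue-b")],
}

at_risk = clusters_at_risk(layout)
print(at_risk)  # {'queue': 'blade-3'}
```

In this made-up layout, taking `blade-3` down for a firmware update would take out both `queue` nodes at once, while the other clusters would survive the loss of any single blade. The check says nothing about quorum or load, so treat it as a starting point rather than a guarantee.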
This is not to say that blades don't have their place in the data center or an enterprise. They certainly do have a role, but not as database servers or anything else that involves heavy disk reads and writes. They are amazing to have as front-end HTTP servers or as memcached servers. They are best suited to roles where most of their operations work on data read from RAM, or to enterprises where 100% uptime is not an operational must-have.
