Using a Service Catalog and Lifecycle Management to Increase Capacity or How Many VMs Fit Into a Server?


A colleague sent me this CIO article this morning, How Many Virtual Machines Fit on Your Server?

And it’s a very interesting read, because the answer seems to be that it depends on your workload and how well you manage your lifecycle. Here’s the first quote on best practices.

But making sure the hardware can support that additional load is a real trick because of the almost infinite variety of the software that runs within the virtual environment — each application making a slightly different set of demands on the host OS and the hardware, says Chris Wolf, an analyst at The Burton Group.

In my experience, the data center folk rarely know what’s running on the boxes they manage. They also have little visibility into the workloads (with a few exceptions like SAP) or even owners.

So one of the key things we are recommending to newScale service catalog customers is to make the capture of workload sizing and business information part of the request process.

This can be very difficult and involved. Or it can be as simple as “small, medium, large, xl” for everyday requests.
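To make the simple end of that spectrum concrete, here is a minimal sketch of a T-shirt size lookup. The size names come from the list above; the vCPU, memory, and disk figures, and the names `VmSpec`, `SIZE_CATALOG`, and `resolve_size`, are invented for illustration and are not newScale defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSpec:
    """Resources requested for one VM (hypothetical fields)."""
    vcpus: int
    memory_gb: int
    disk_gb: int


# Hypothetical sizing table: the numbers are placeholders, not vendor defaults.
SIZE_CATALOG = {
    "small": VmSpec(vcpus=1, memory_gb=2, disk_gb=40),
    "medium": VmSpec(vcpus=2, memory_gb=4, disk_gb=80),
    "large": VmSpec(vcpus=4, memory_gb=8, disk_gb=160),
    "xl": VmSpec(vcpus=8, memory_gb=16, disk_gb=320),
}


def resolve_size(size: str) -> VmSpec:
    """Translate a requester's size choice into a concrete resource spec."""
    key = size.strip().lower()
    if key not in SIZE_CATALOG:
        raise ValueError(f"unknown size {size!r}; expected one of {sorted(SIZE_CATALOG)}")
    return SIZE_CATALOG[key]
```

The point of a table like this is that the requester picks a word, and the data center gets numbers it can plan against.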

Second, we add a wizard to the form that helps the requester translate what they need into actionable Quality of Service instructions for the data center. The other good practice is to detail the workload in the portfolio so it can be mapped to risk, criticality, hours of operation, supported business process, etc.

This gives the data center what it needs to know about the workload: when it’s OK to overcommit capacity, and how critical the application is if performance is constrained. Or as one admin said to me, “I wish I’d known you only support to work on that app on Saturday nights.”
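As a rough sketch of how that captured information could feed an overcommit decision, consider the following. The policy itself is an assumption made up for this example (never overcommit against high-criticality workloads, and overcommit against the rest only outside their hours of operation), as are the names `WorkloadProfile` and `can_overcommit`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadProfile:
    """Business information captured at request time (hypothetical fields)."""
    name: str
    criticality: str          # "low", "medium", or "high"
    active_hours: frozenset   # hours of the day (0-23) when the workload is busy


def can_overcommit(profile: WorkloadProfile, hour: int) -> bool:
    """Return True if capacity may be overcommitted against this workload at `hour`.

    Assumed policy: high-criticality workloads are never overcommitted;
    everything else may be overcommitted outside its active hours.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if profile.criticality == "high":
        return False
    return hour not in profile.active_hours
```

A real policy would also look at the day of the week and the supported business process, but even this much is more than most data centers know today.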

Third, because we’ve enabled self-service (we provide the best IT Storefront), even if the user makes a mistake in estimating workload size, they can quickly fix it. Here’s another quote from the article that makes this point:

“If you put five VMs on a server, you’re running six operating systems and all the applications, so you have to ramp up to be able to handle that and keep the service levels, the performance, high for the applications,” Scanlon said. “We ended up having to put on a lot more memory than we figured during the capacity planning.”

This is not a failure of capacity planning. I was at a Morgan Stanley CTO event a few months back, and they were enabling self-service for their users to request servers. One of the attendees asked, “But won’t they make mistakes?” The answer was, “The users never get it right. With all the planning, architecture review, it’s always wrong.” But, he continued, they can self-correct immediately.

And the users don’t get to blame the data center, which is a nice bonus.

The other interesting aspect we’ve brought into play is lifecycle management. As we move from hard assets to virtual assets, we’ve lost the ability to look at what’s running and ask who owns it and how long they need it.

Here’s the recommendation from the article.

Second: build a fence. VMs are easy to launch and hard to see, so server sprawl – having too many VMs running without actually being used – is appallingly common. Killing off all the unused servers and the disk space they reserved gave Computacenter back so many resources it was able to put off a major upgrade until the following budget cycle, Scanlon says.

That’s a pretty impressive statistic. And the key to killing unused servers is to have a good life-cycle manager tied to the self-service storefront.

We’ve introduced these new capabilities in newScale 9. The service catalog details the offerings, and helps drive standard configurations. This removes a large part of the “idiosyncrasies” of workloads.

The wizard and configurator guide the user to make a request that actually works. This includes detailing the quality of service, SLAs, and portfolio elements (like criticality, disaster recovery, storage tier, etc.). And it includes a subscription expiration.
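The subscription expiration is what makes unused servers findable later. Here is a minimal sketch of that check, assuming each request leaves behind a record with an owner and an expiry date. The `Subscription` record and the `expired_subscriptions` function are hypothetical and do not reflect the newScale data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Subscription:
    """One VM request as recorded at provisioning time (hypothetical fields)."""
    vm_name: str
    owner: str
    expires_on: date


def expired_subscriptions(subscriptions, today: date):
    """Return subscriptions whose expiry date has passed.

    A subscription is still treated as valid on its expiry date itself.
    """
    return [s for s in subscriptions if s.expires_on < today]
```

Each result carries the owner's name, so the follow-up question ("do you still need this?") has somewhere to go before anything is reclaimed.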

Here’s an example of the questions (of course, it’s all customizable)

All this information drives the provisioning workflow and is also captured in the lifecycle database, which syncs with the VM infrastructure. This is what the lifecycle looks like:

The result is unprecedented technical and business visibility about the workloads running in the data center.

The real hard-dollar results come from reducing costs due to hoarded capacity, rogue servers, and zombie VMs. When you think about increasing virtualization density 20-30% by JUST KNOWING what’s running and what can be done to the workload… That’s a mind-boggling return on investment.
