Sun Inner Circle: For Business & Technology Leaders

Sun Inner Circle Newsletter - July 2006

Thinking Beyond Today's Definition of Virtualization

It's Bill Vass back with another letter. Before my faithful readers get to thinking that there is no rhyme to my reason, I should explain that in my letters I am building a stack of technologies that can help enterprises reduce data center costs, improve hardware utilization, and increase application availability and performance.

In recent letters, we discussed how Sun's CMT processor can shrink data center requirements, how open source and OpenSolaris can improve security and application performance, and how open source middleware can help companies get the most out of legacy investments. In this issue, I am going to take a step or two back down the stack to discuss a hot topic for today's enterprises — virtualization.

In its most basic terms, virtualization is about abstracting computing resources from underlying hardware. Virtual machine technologies have emerged only recently, yet they have become almost synonymous with virtualization itself. But virtual machines are by no means the "be all and end all" of virtualization. Operating system virtualization schemes like Solaris Containers have a big role to play. From a broader perspective, grids offer the potential for widespread virtualized and shared architectures that can dramatically transform the way applications are developed and delivered.

The Business Rationale for Virtualization
It's no secret that maintaining hardware in the data center can incur high costs. Power, cooling, and real estate costs form a triple whammy for IT budgets. Most enterprises build for peak loads, but peak times tend to arrive en masse at the end of the quarter or the year, and the rest of the time hardware utilization rates hover around 10 percent to 15 percent. As a result, significant resources go unused — representing a huge waste of capital, power, and space.
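To put rough numbers on that waste, here is a back-of-the-envelope sketch in Python. Only the 10 to 15 percent utilization range comes from this letter; the 100-server farm and the 60 percent target are hypothetical figures chosen for illustration, and the calculation optimistically assumes workloads can be packed freely onto equally sized hosts.

```python
def hosts_needed(servers: int, avg_pct: int, target_pct: int) -> int:
    """Estimate how many equally sized hosts could carry the same total work.

    avg_pct is today's average utilization and target_pct is the utilization
    we are willing to run at after consolidation (both whole percentages).
    Treat the result as a lower bound: real peaks often coincide.
    """
    if not (0 < avg_pct <= 100 and 0 < target_pct <= 100):
        raise ValueError("percentages must be between 1 and 100")
    total_work = servers * avg_pct        # measured in "server-percent" units
    return -(-total_work // target_pct)   # ceiling division on integers


# Hypothetical farm of 100 servers, consolidated to a 60 percent target.
print(hosts_needed(100, 15, 60))  # 25
print(hosts_needed(100, 10, 60))  # 17
```

Even with a conservative target, the same work fits on roughly a quarter of the machines or fewer, which is where the savings in capital, power, and space would come from.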

Management of large server farms creates another escalating cost. As the price for hardware falls, management costs continue to soar. So, aside from the direct costs of large data centers, reducing the number of physical machines to monitor and maintain is another motivation for deploying virtualization technologies — assuming, of course, that an enterprise doesn't simply swap physical management for virtual machine management.

Less discussed, but no less important, is the fact that enterprises depend on hardware for application availability. Many organizations implement redundant hardware to help ensure continual application and business operations and avoid crippling downtime. Not only is hardware redundancy a less-than-perfect means to guarantee application availability, it also serves to multiply data center costs.

Virtualization has emerged as a means to address both utilization and redundancy. By allowing several applications and operating system instances to share hardware resources, virtualization promotes higher utilization rates and a better return on investment. By abstracting software from the underlying hardware, virtualization offers higher availability because applications can be made redundant without depending on temperamental hardware. In the end, virtualization helps organizations:

  • Improve IT and business flexibility
  • Rapidly grow and shrink infrastructure
  • Increase availability and application performance
  • Simplify application management and increase application performance per dollar spent

The Dawn of Virtual Machines
Virtual machine technologies have become a well-accepted form of virtualization. Essentially, these technologies allow a physical machine to host multiple virtual operating system instances, and they permit different operating systems like Solaris, Linux, and Windows to co-exist peacefully on the same hardware.

In recent years, VMware technology has emerged as a likely frontrunner in the virtual machine marketplace. VMware virtualization technology works by creating a hypervisor, which acts like a mini-operating system that is booted first and essentially virtualizes the underlying hardware into little pieces, so different operating systems can run on the same physical server. This approach has proved particularly attractive to organizations with legacy applications that require a mixed operating system environment.

 

While appealing, the VMware virtual machine approach can be relatively expensive from a performance standpoint. Typically, the hypervisor consumes between 5 percent and 15 percent of the total CPU power, while each operating system adds incremental overhead. In the end, enterprises can consume a large amount of CPU resources just to support the virtual machine infrastructure. Virtual machines can also negatively impact observability because if an application fails, it becomes more difficult to know if the application, the operating system, or the virtual machine technology itself is to blame.
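A small Python sketch shows how that overhead adds up. The 5 to 15 percent hypervisor range is the figure quoted above; the per-guest overhead of 2 percent and the count of four guests are purely illustrative assumptions, since the incremental cost of each operating system depends on the workload.

```python
def cpu_left_for_apps(hypervisor_pct: float, guests: int, per_guest_pct: float) -> float:
    """Percent of total CPU left for applications after virtualization overhead.

    per_guest_pct is an assumed, illustrative cost for each guest operating
    system; it is not a measured figure.
    """
    overhead = hypervisor_pct + guests * per_guest_pct
    return max(0.0, 100.0 - overhead)


# Low and high ends of the quoted hypervisor range, with four hypothetical guests.
for hypervisor_pct in (5.0, 15.0):
    remaining = cpu_left_for_apps(hypervisor_pct, guests=4, per_guest_pct=2.0)
    print(f"hypervisor {hypervisor_pct}% -> {remaining}% left for applications")
```

Under these assumptions, somewhere between an eighth and a quarter of the machine goes to the virtualization layer before any application does useful work.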

Recently, open source Xen software has emerged as an alternative to VMware. As with VMware, Xen supports the execution of multiple guest operating systems on the same piece of hardware. Xen is a much lower-level form of virtualization that allows administrators to virtualize various parts of a system, including the memory and the CPU. And, because it resides at such a low level, it offers significant resource and fault isolation.

There are many compelling reasons to consider Xen. First, it is open source. Second, it is relatively lightweight, so it doesn't consume an inordinate amount of CPU resources. Third, it achieves a high degree of isolation among virtual machines. Finally, as with other virtual machine technologies, it supports mixed operating systems and versions, and it allows administrators to dynamically instantiate and execute an operating system instance without impacting service.

Using Solaris Containers with Virtual Machines
Another form of virtualization is available free to all users of the Solaris 10 OS, namely Solaris Containers. Within Solaris, it is possible to virtualize application environments using Solaris Containers — giving each application its own IP address, file system, users, and assigned resources. Furthermore, Solaris Containers are extremely lightweight because the virtualization happens at the kernel level. Because there isn't the added overhead of multiple kernels, it is possible to run hundreds, if not thousands, of Solaris Containers on a single server. Plus, with Solaris Containers for Linux Applications, it is now possible to run Linux applications completely unmodified in Solaris Containers.

While Solaris Containers don't permit multiple operating system instances, the technology does offer numerous advantages.

  • Higher Utilization Rates: Solaris Containers promote higher utilization rates because the containers consume only the resources required by the workload without added overhead.
  • Lower Management Costs: Solaris Containers lower management costs because administrators don't have to maintain a separate operating system instance for each workload.
  • Excellent Observability: Solaris Containers provide visibility into the virtualized environment, particularly in conjunction with other Solaris features like DTrace for real-time troubleshooting.
  • Rapid Provisioning: Solaris Containers enable rapid provisioning and short-term availability of hosts, such as in QA environments that perform extensive regression testing of different system and service configurations.
  • Reduced Licensing Costs: Solaris Containers can potentially reduce licensing costs because one OS license can support hundreds of Solaris Containers.

Still, the higher level of resource and fault isolation offered by virtual machines is not to be underestimated. There are times when applications simply call for isolation beyond the capability of Solaris Containers. In these instances, it is possible to use virtual machines in conjunction with Solaris Containers to achieve the optimum levels of isolation, utilization, and manageability.

Mixing Solaris Containers and Virtual Machines
Let's consider an example of when a mix of virtual machines and Solaris Containers might make sense. Assume there is an application that requires patching for security or performance reasons, and a second application with a conflicting patch level. Further, there is a third legacy application that is tied to the Windows operating system. Those three applications need to run in three distinct virtual machine operating system instances. Now let's assume that the same environment has four more applications that are forgiving in terms of patches.

In this example, the three applications that require their own operating system instances could be virtualized using virtual machines, and a fourth virtual machine operating system instance could house four Solaris Containers for the applications that require only a single instance of Solaris. The advantages include:

  • Solaris Containers allow the applications to share the same set of patches
  • Administrators need to maintain just one operating system instance
  • Administrators don't have to keep track of several virtual machines
  • The organization reduces its administrative costs
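The layout in this example can be written down as a simple data model. The Python sketch below is only an illustration: the application names and patch-level labels are hypothetical placeholders, and the class is not part of any Sun or VMware tooling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VirtualMachine:
    """One guest operating system instance and the applications it hosts."""
    guest_os: str
    apps: List[str] = field(default_factory=list)
    uses_containers: bool = False  # True when each app runs in its own Solaris Container


# Hypothetical names for the seven applications in the example above.
plan = [
    VirtualMachine("Solaris (patch level A)", ["app1"]),
    VirtualMachine("Solaris (patch level B)", ["app2"]),
    VirtualMachine("Windows", ["legacy_app3"]),
    VirtualMachine("Solaris (shared)", ["app4", "app5", "app6", "app7"],
                   uses_containers=True),
]


def os_instances(vms: List[VirtualMachine]) -> int:
    """Number of operating system instances administrators must maintain."""
    return len(vms)


def app_count(vms: List[VirtualMachine]) -> int:
    """Total number of applications hosted across all virtual machines."""
    return sum(len(vm.apps) for vm in vms)


print(os_instances(plan), app_count(plan))  # 4 7
```

Seven applications end up on four operating system instances instead of the seven that a one-virtual-machine-per-application approach would require.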

Grids: The Virtualization of Yesterday, Today, and Tomorrow
So far this letter has been pretty vanilla in terms of expanding the horizons of the discourse around virtualization. But that is about to change. There is one area where virtualization has long been employed but little talked about in terms of being virtualization — namely grids. While grids might not at first seem like virtualization, if one recalls the definition offered at the beginning of this letter — abstracting computing resources from underlying hardware — there is no more promising example of virtualization.

As enterprises move down the path to a virtualized environment, they will find that virtualization innately lends itself to a grid environment. That's because once there are virtualized instances of Solaris and multiple Solaris Containers, the resources in these containers can be consumed by multiple grids. Application grids can be formed and controlled by grid frameworks to deliver software as services, and grid engines can leverage the excess cycles in compute grids. But, as usual, I'm getting a bit ahead of myself.

Grids, of course, are nothing new. They have long been employed as a way to gang tackle compute-intensive tasks. For example, grids have been used to create complex weather simulations, map immense underground oil fields, model the effects of a nuclear bomb explosion, simulate the financial performance of complex derivatives, and add multi-layered textures to the individual frames of a movie. Pretty much every industry that faces intense processing demands — and most industries do — relies on grids.

Stateless vs. Stateful Grids
The way traditional grids work is pretty simple. If a movie studio needs to texture map 1000 frames of a movie, the grid controller spawns 1000 parallel processes, and those processes are farmed out to as many servers. (Of course, in a virtualized environment, it could just as easily be 1000 Solaris Containers instead of 1000 physical servers or CPUs.)

 

In any case, if one of the servers goes down, the grid controller gathers the other 999 results, realizes one is missing, and sends the missing workload to a functioning server. This most basic type of grid is known as a stateless grid because it makes little difference if someone walks up to one of the servers and turns it off: the grid controller will simply reassign the missing workload to another server. No harm, no foul.
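The reassignment logic of a stateless grid can be sketched in a few lines of Python. This is a toy simulation under stated assumptions, not how any particular grid engine works: the function name, the round-robin assignment, and the server names are all hypothetical, and "rendering" a frame is reduced to recording which server finished it.

```python
from typing import Dict, List, Set


def run_stateless_grid(frames: List[int], servers: List[str],
                       failed: Set[str]) -> Dict[int, str]:
    """Farm frames out round-robin, then reassign work lost on failed servers.

    Returns a mapping of frame number to the server that completed it.
    """
    healthy = [s for s in servers if s not in failed]
    if not healthy:
        raise RuntimeError("no functioning servers left in the grid")

    done: Dict[int, str] = {}
    # First pass: each frame goes to one server; work on failed servers is lost.
    for i, frame in enumerate(frames):
        server = servers[i % len(servers)]
        if server not in failed:
            done[frame] = server

    # The controller notices which results are missing and resubmits only those.
    missing = [frame for frame in frames if frame not in done]
    for i, frame in enumerate(missing):
        done[frame] = healthy[i % len(healthy)]
    return done


frames = list(range(1000))
servers = [f"server{i}" for i in range(1000)]
results = run_stateless_grid(frames, servers, failed={"server42"})
print(len(results), results[42])  # 1000 server0
```

Because no task depends on the state of any particular server, losing a machine costs only the time needed to redo its share of the work.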

Recently, so-called stateful grids have emerged. Sun's own display grid is one example. Instead of installing hordes of desktops with vast amounts of wasted processing power, Sun uses a display grid in conjunction with Sun Ray thin clients to send desktops to users when and where they need them. At Sun, the grid controller gives each individual the processing power he or she needs from among the 700 servers dedicated to the display grid.

For example, when an employee comes to work and fires up StarOffice, the user might need 60 percent of a typical CPU's total processing power, so that's what the grid controller gives the employee. The Sun display grid is called stateful because the user has a stateful connection to the underlying infrastructure — if someone turned off the desktop, the user would most surely notice.

Optimizing Resources with an Application Grid
Now let's consider a more advanced form of a stateful grid: an application grid. Suppose a company wants to move its non-RAC Oracle ERP system to a grid environment, and that system consists of a typical three-tier presentation layer, application layer, and database layer architecture. It is possible to allocate 20 grid containers to be Web servers and designate four of those containers to act as load balancers.

Then, for the application server, the enterprise could create a stateful cluster between two sets of containers acting as application servers, so if someone turns off one of the application servers, the grid controller can restart its tasks on a functioning application server. Now, when the company adds the database layer to the grid, the grid allocates containers as stateful components, and might even make the containers a high availability cluster across the grid, so the containers assume different physical locations.

In the past, to create such an architecture, the standalone Oracle ERP system would require something like 20 CPUs dedicated to each layer — despite the fact that a typical ERP system tends to get pounded in the morning, then there's a lull, and processing demands peak again in the afternoon. Again, there's simply too much hardware dedicated to peak use, and too many underutilized resources.

 

In a grid environment, it is possible to create the same architecture using grid containers, so the Web server layer consists of 20 containers, the application layer has 10 primary containers and 10 containers for redundancy, and the database layer has another 10 containers and another 10 for redundancy. And, all of the layers share the same virtualized hardware resources.

Moving Towards 100% Hardware Utilization Rates
The real revolution is that it is becoming possible to mix and match loads — whether they are stateless or stateful — on the same grid. And it is possible to assign priority given the statefulness or criticality of the loads. For instance, an employee firing up StarOffice takes CPU power from the large stateless loads running in the background. Similarly, when an Oracle application or database server requests processing power, the StarOffice employee gets the leftover CPU cycles, and the stateless loads receive any remaining processing power.
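The priority scheme described above can be sketched as a strict-priority allocator in Python. This is a simplified illustration, not the algorithm of any actual grid resource manager: the function, the workload names, and the demand figures are assumptions, except for the 60 percent of a CPU that a StarOffice user was said to need earlier in this letter. CPU is counted in whole percentages of one CPU, so 400 means four CPUs.

```python
from typing import Dict, List, Tuple


def allocate_cpu(capacity: int, requests: List[Tuple[str, int, int]]) -> Dict[str, int]:
    """Grant CPU in strict priority order; lower priority numbers are served first.

    Each request is (name, priority, demand), with demand and capacity measured
    in percent of one CPU. Whatever is left flows to the next workload.
    """
    remaining = capacity
    grants: Dict[str, int] = {}
    for name, _priority, demand in sorted(requests, key=lambda r: r[1]):
        grant = min(demand, remaining)
        grants[name] = grant
        remaining -= grant
    return grants


# Hypothetical four-CPU pool shared by three kinds of load.
requests = [
    ("render_batch", 2, 1000),    # stateless background work, takes what is left
    ("staroffice_user", 1, 60),   # stateful desktop session
    ("oracle_db", 0, 200),        # most critical stateful load
]
grants = allocate_cpu(400, requests)
print(grants)  # {'oracle_db': 200, 'staroffice_user': 60, 'render_batch': 140}
```

The stateless batch work soaks up every cycle the stateful loads leave behind, so the pool as a whole runs fully utilized while the critical workloads still get what they ask for.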

In the end, by allowing all applications to share the same virtualized infrastructure, and by leveraging appropriate resource management, enterprises might be able to realize hardware utilization rates approaching 100 percent instead of the 15 percent reality today — and improve application redundancy and availability at the same time.

Virtualization, Grids, and Bringing It All Together
The scenarios outlined in this article are no pie-in-the-sky fantasy. They are realities today. Certainly, enterprises are already making good use of virtual machine technologies like those offered by VMware and Xen. Many organizations have deployed Solaris Containers to improve utilization.

And several leading companies — such as some auto giants and world-class financial institutions — are moving to completely virtualized environments, including virtualized operating systems, virtualized applications, shared desktop infrastructures, and grids to handle stateless and stateful loads.

The trend is undeniable, and it is only a matter of time before institutions of all sizes look to treat their entire underlying hardware infrastructure as a shared resource for all applications. It simply makes too much dollars and sense to ignore.

Bill Vass
CIO
Sun Microsystems, Inc.
cio@sun.com