Sun Microsystems Inner Circle - For Information Technology Leaders

Sun's N1 Grid Engine increases Return on Assets (ROA) Quickly for Sun IT Operations

If you haven't already heard, grid computing is well-established, relatively easy, and a great way to increase ROA right now. In early 2004, surveys by analyst firm IDC determined that grid computing is currently one of the top five data center priorities and predicted that 20 percent of companies will use grid computing by the end of 2005.

Sun IT Operations recently proved the benefits of using N1 Grid Engine to harvest unused computing power from Sun Ray servers, increasing after-hours utilization and ROA, and gathering CIO-level support for a global production program.

You know you need to get more from your existing systems

Everyone from the CEO down is looking to get more out of existing assets. Even a 2 percent increase in ROA, the best financial measure for this need, will get C-level attention. To the CFO it means higher earnings per share, and to the CIO it increases the attractiveness of IT investments.

Sun's IT operations, like those of many of its customers, are under pressure to get more out of their significant existing computing resources. An opportunity was identified in Sun's desktop server environment, which is powered by innovative Sun Ray technology and processes thin client user sessions for all campuses worldwide. Since many of these servers were relatively idle in the evening, a strategic pilot project was initiated to identify ways to increase utilization of Sun Ray servers after business hours.

The team realized it was likely that, even with staff working longer hours and logging on to Sun Rays at home, there would always be less demand for the computing power at night than during business hours.

As they looked across the company and the incredible 24/7/365 capacity of its global computing resources, the Sun operations team found many instances of some servers working harder while others were idle. For example, during morning business hours in the UK, Sun Ray servers in England would be buzzing while servers in the US slept.

Fully optimized global computing resources

If any computing job could take advantage of any less-than-fully-utilized company computing resource available anywhere in the world, you'd end up with the smallest possible amount of computing resources and highest ROA. If the team found a way to optimize the use of global Sun Ray servers, the project would be a success.

This is exactly the kind of problem grid computing is designed to address. Sun defines grid computing as a coordinated way of managing and dynamically sharing disparate sets of computing resources. Making all global computing resources part of a single grid, and then making it easy for any computing task to access any unused part of that grid, yields maximum utilization and ROA.

The team realized that grid computing was the obvious choice for optimizing the global computing power of Sun Ray servers. Servers were identified in the UK, Singapore and the US for the pilot.

How Sun got more from its existing servers

The next challenge was identifying the best application for the pilot. Globally, many different applications were being run on the Sun Ray servers, and these applications had many different owners. Not all of these applications were architecturally suited to grid computing. Not all the application owners understood the vision or the benefit of their application having access to the grid.

The project team used Six Sigma [http://www.sun.com/2000-1115/sigma/] methodology to rank seven potential applications on the basis of more than ten criteria, including feasibility, effort, likely benefits, and organizational readiness. The matrix identified a network health check and monitoring application — called Sun CheckUp Web — as the best candidate.
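The article does not describe how the ranking matrix was scored, so the following is only a minimal sketch of one common approach, a weighted sum over criteria. The criterion weights, the 1-5 scores, and the application names `app_a` and `app_b` are hypothetical placeholders, not figures from the pilot.

```python
# Hypothetical integer weights for four of the criteria named in the article.
# The real matrix used more than ten criteria; these values are assumptions.
WEIGHTS = {"feasibility": 3, "effort": 2, "benefit": 3, "readiness": 2}


def rank_candidates(scores, weights=WEIGHTS):
    """Return (application, weighted total) pairs, best candidate first.

    `scores` maps an application name to a dict of criterion -> score.
    Higher is assumed to be better for every criterion, so a high
    "effort" score means the migration needs little effort.
    """
    totals = {
        app: sum(weights[criterion] * per_app[criterion] for criterion in weights)
        for app, per_app in scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Placeholder scores on an assumed 1-5 scale, for illustration only.
example_scores = {
    "app_a": {"feasibility": 4, "effort": 3, "benefit": 5, "readiness": 4},
    "app_b": {"feasibility": 3, "effort": 4, "benefit": 2, "readiness": 3},
}

ranking = rank_candidates(example_scores)
```

Under these placeholder numbers `app_a` ranks first. With real inputs, the same function would rank all seven candidate applications.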

As it turned out, the Sun CheckUp Web team had already budgeted to invest in an additional mid-range 8-CPU server, so they were excited that the grid project could remove that requirement. Sun CheckUp team leaders understood and bought into the vision of grid computing.

While deciding which application to migrate to the grid, the team learned that application architecture would affect both the development effort required and the degree of benefit to be gained from grid computing. Sun CheckUp's architecture was well suited to grid computing, as it supported distributed compute processing, so the planned development effort was not extensive.

The project team completed the development and deployed Sun's N1 Grid Engine software with less than four person-weeks of effort. The global network already existed. A utility was created to submit jobs to each server in the pilot grid during that server's after-hours period. The pilot was run in parallel with the existing production Sun CheckUp server, and metrics were established to assess the results of the project.
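The article does not show the job-submission utility itself. As a rough sketch of the idea, the following picks the pilot servers that are currently outside local business hours. The host names, the fixed UTC offsets (daylight saving time is ignored), and the 08:00-18:00 business window are all assumptions made for illustration. Weekends and the actual hand-off of jobs to N1 Grid Engine are left out.

```python
from datetime import datetime, time, timedelta, timezone

# Hypothetical pilot hosts mapped to assumed fixed UTC offsets in hours.
PILOT_SERVERS = {
    "sunray-uk": 0,
    "sunray-sg": 8,
    "sunray-us": -8,
}

# Assumed business hours; the article does not state the real window.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


def is_after_hours(now_utc, utc_offset_hours):
    """Return True if the server's local time is outside business hours."""
    local = now_utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return not (BUSINESS_START <= local.time() < BUSINESS_END)


def eligible_servers(now_utc, servers=PILOT_SERVERS):
    """Return the names of servers in their after-hours window, sorted."""
    return sorted(
        name for name, offset in servers.items() if is_after_hours(now_utc, offset)
    )


# 09:00 UTC: morning in the UK, late afternoon in Singapore, night in the US.
morning_uk = datetime(2004, 6, 1, 9, 0, tzinfo=timezone.utc)
idle_now = eligible_servers(morning_uk)
```

At that moment only the US host is idle, which matches the article's picture of UK servers buzzing while US servers sleep. A real utility would then submit work to the returned hosts through the grid scheduler.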

This minor effort yielded significant results. Application throughput increased by nearly 50 percent, and after-hours CPU utilization also increased. The Sun CheckUp team was able to eliminate one existing E4500 12-CPU mid-range server and no longer required a new V880 8-CPU mid-range server, resulting in savings of approximately $47,000 in procurement and $18,000 in ongoing annual support costs that more than covered the costs of the project. It also clearly proved that Sun IT could get more done with its existing computing resources, thereby increasing ROA.

"A little victory" is sometimes the best way to start

If you've been thinking about grid computing but didn't know how to get started, this is an inspiring case study. The pilot clearly demonstrated the potential for grid computing to increase both ROA and ROI by increasing total throughput by almost 50 percent and increasing after-hours average CPU utilization.

For less than one month of internal deployment effort, this project realized $65,000 in savings in the first year and nearly $20,000 in following years, a solid ROI. The production financial impact is naturally expected to be significantly higher.
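The savings figures follow from the numbers reported earlier: roughly $47,000 in avoided procurement plus $18,000 in annual support costs. The short sketch below reproduces that arithmetic. It assumes the procurement saving occurs once and the support saving stays constant each year. The project's own cost is not given in the article, so no ROI ratio is computed.

```python
PROCUREMENT_SAVINGS = 47_000      # one-time, from the article
ANNUAL_SUPPORT_SAVINGS = 18_000   # recurring, from the article


def cumulative_savings(years):
    """Total savings after `years` full years, under the assumptions above."""
    if years < 1:
        raise ValueError("years must be at least 1")
    return PROCUREMENT_SAVINGS + ANNUAL_SUPPORT_SAVINGS * years


first_year = cumulative_savings(1)    # 65000
three_years = cumulative_savings(3)   # 101000
```

The first-year total matches the $65,000 cited, and the $18,000 recurring component is the "nearly $20,000" for following years.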

The project proved the technical feasibility and business benefits of using the Sun N1 Grid Engine as a grid computing solution. The project team generated organizational knowledge on where and how to do grid computing and documented best practices and recommendations, including how to deal with network latency and potential disaster recovery scenarios.

The absence of additional help-desk calls indicated that running grid jobs on otherwise unused compute cycles did not interfere with regular system users' perceptions of system performance or health.

Perhaps most importantly, this pilot project stimulated a great deal of thinking and subsequent action that will generate significant financial returns for Sun. A dedicated grid computing production project team was created with backing from Sun's CIO to create a global grid with all Sun Ray servers.

Combined with the pilot grid's success metrics, this executive support should be more than enough to overcome any internal concerns that grid computing might negatively affect application performance or user experience. Sun is now thinking about other ways grid computing can be used on under-utilized company computing resources — from mobile phones to giant servers — to increase ROA.

Sun has shown that grid computing is here now, ready to increase ROA for your business. Sun's know-how can help you make grid work for you.
