AVAILABILITY TO THE NTH:
How Sun Is Driving the Industry's New Prime Metric
When an industry icon like General Electric CEO Jack Welch predicts that his
company will save $1.5 billion through the use of Internet technologies, it's
clear that there's a great deal at stake for today's enterprises. Indeed, the
Internet is fundamentally changing the way companies do business -- companies
with familiar names like GE, Ford, Federal Express, Prudential and
Daimler Benz AG.
But as more and more of the world's "traditional" companies grasp the
importance of the Internet and a networked economy, a new and essential
truth has emerged:
The prime measure of the network's performance is no longer megahertz, but
availability.
The reason is simple: When the network fails, business stops. Depending on
the industry, an hour of downtime can cost a business as little as $10,000
or as much as $6 million.
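
To put those figures in perspective, the arithmetic behind an availability
target can be sketched in a few lines of Python. This is an illustrative
calculation, not a Sun tool: the function names are hypothetical, a year is
assumed to have 8,760 hours, and the only numbers taken from the text are the
$10,000 and $6 million hourly costs.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored (assumption)


def annual_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year at a given availability (0..1)."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * HOURS_PER_YEAR


def annual_downtime_cost(availability: float, cost_per_hour: float) -> float:
    """Expected yearly cost of downtime at a given hourly cost."""
    return annual_downtime_hours(availability) * cost_per_hour


if __name__ == "__main__":
    # Walk from "two nines" (99%) to "five nines" (99.999%).
    for nines in range(2, 6):
        availability = 1.0 - 10.0 ** -nines
        hours = annual_downtime_hours(availability)
        low = annual_downtime_cost(availability, 10_000)      # low-end hourly cost
        high = annual_downtime_cost(availability, 6_000_000)  # high-end hourly cost
        print(f"{availability:.3%}: {hours:8.3f} h/yr, "
              f"${low:,.0f} to ${high:,.0f} per year")
```

At 99.9% availability, for instance, the expected downtime works out to 8.76
hours a year, so each additional "nine" cuts the yearly exposure by a factor
of ten.
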
While this availability metric is simple to understand, achieving it is
enormously complex and becoming radically more so each day because of what
Sun Microsystems Inc. calls the Net Effect -- the explosive growth of
millions of networks, forcing vast increases in bandwidth and carrying
trillions of bytes of data to billions of devices. The outcome, though, is
remarkable: New and exciting networks of smart, integrated web services that
are far easier to use and manage than those designed for the era of
monolithic computers. As Sun President and COO Ed Zander wrote recently in
the Financial Times, "The Internet is going away in the same sense that
electricity and plumbing did in the 20th Century ... The Net will assume an
always-present, behind-the-scenes quality... Who today ever says, 'Activate
the plumbing and pour me some water'? Like plumbing, the Internet will be
everywhere."
Availability in the Net Effect Era
Until recently, processor speed was the gold standard of performance
measurement and -- however dubious it may have been as a real-world measure
-- the supremacy of the megahertz went almost unchallenged. In the Net
Effect era, availability replaces chip-speed performance because raw
performance means nothing if the system isn't available. The fastest server
in the world has raw performance of "zero" whenever and for whatever reason
the system fails. Leading analysts agree, recommending to their clients that
information systems now need to be designed with no single point of failure
that can jeopardize the system's operation. Each tier, they say, should have
redundant elements -- from Web servers to application servers to database
servers.
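
The effect of redundancy at each tier can be estimated with a short
calculation. The sketch below rests on simplifying assumptions that are not
from the analysts' recommendations: nodes within a tier are identical and
fail independently, a tier is up as long as one node is up, and the whole
system is up only when every tier is up. The 99% per-node figure and the
function names are purely illustrative.

```python
from math import prod
from typing import Iterable, Tuple


def tier_availability(node_availability: float, nodes: int) -> float:
    """Availability of one tier: it fails only if all of its nodes fail."""
    if nodes < 1:
        raise ValueError("a tier needs at least one node")
    return 1.0 - (1.0 - node_availability) ** nodes


def system_availability(tiers: Iterable[Tuple[float, int]]) -> float:
    """Availability of tiers in series: every tier must be up."""
    return prod(tier_availability(a, n) for a, n in tiers)


if __name__ == "__main__":
    node = 0.99  # illustrative per-node availability (assumption)

    # Web, application and database tiers with one node each:
    # every node is a single point of failure.
    single = system_availability([(node, 1), (node, 1), (node, 1)])

    # The same three tiers with two redundant nodes each.
    redundant = system_availability([(node, 2), (node, 2), (node, 2)])

    print(f"no redundancy:  {single:.4%}")
    print(f"two per tier:   {redundant:.4%}")
```

Under these assumptions, three single-node tiers at 99% each give roughly
97.03% overall, while doubling every tier raises the figure to roughly
99.97%. Real systems share power, networks and software faults, so actual
gains are usually smaller than the independence assumption suggests.
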
Sun Drives Improvements in Availability
Recognizing that the company's future leadership depends largely on Sun's
ability to be the "availability" company, Sun's top executives in late 1999
mounted a focused effort to rethink, retool and reorganize the company --
leading to the creation of an entirely new organization in 2000 called
Customer Advocacy, which is responsible for driving availability
initiatives across the enterprise.
From the outset, the Customer Advocacy organization has had a broad,
trans-enterprise charter. Though it builds on the knowledge and successes of
earlier quality-improvement programs, the organization's efforts extend
beyond traditional quality centers, such as engineering, product development
and manufacturing. It ranges across the entire company, with a specific
focus on driving improvements at Sun in three critical areas:
Skills: Well-trained people who can communicate clearly and precisely are the
best defense against downtime.
Processes: Well-defined processes can accelerate the speed and accuracy with
which things get done, and play a significant role in preventing downtime.
Structures: New internal organizations and business practices are essential
to make change real and lasting.
The key was to focus on specific improvement projects that would directly
improve Sun's customers' ability to deliver continuous, real-time,
service-level availability -- both in the short term and into the future.
As already noted, unplanned downtime can most often be attributed to human
factors, from a lack of expertise or training to a moment of unclear
communication. To address them, Sun's skills improvement projects begin
with the Sun Sigma initiative.
Based on the Six Sigma business philosophy that has been proven at such
companies as GE, Allied-Signal and Motorola, Sun Sigma is a statistics-based,
rigorous approach to process improvement that gives Sun -- and equally
important, its people -- a consistent set of tools and methodologies that
clarify where improvements need to be made, as well as provide a common
language for clear communication.
Achievement of high availability, however, does not end at Sun's door. Other
programs are needed to evangelize the best practices and processes that lead
to high availability.
For example, Sun is implementing a Mission-Critical Sign Off process that
establishes a clear and consistent set of requirements as part of the purchase
process. This rigorous set of conditions -- covering everything from
environmental factors to staffing and training -- is focused on assuring that
the customer's site is ready to operate and maintain a mission-critical
installation.
Another example of how Sun is helping customers achieve higher availability
is Sun Remote Services (SRS), an automated service that gives customers
real-time monitoring of their on-site systems from Sun's offices. The
monitoring is accomplished through an automated software agent installed at
the customer site. When the agent detects a potential problem, it
automatically notifies the SRS engineers so they can initiate a resolution.
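
The detect-and-notify pattern described here can be illustrated with a
minimal sketch. It is not the SRS agent or its interface: the metric names,
thresholds, Alert type and notify callback are all hypothetical, chosen only
to show how an agent might compare local readings against limits and pass
any violations to a remote team.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Alert:
    """One reading that crossed its limit (hypothetical structure)."""
    metric: str
    value: float
    limit: float


def find_alerts(readings: Dict[str, float],
                limits: Dict[str, float]) -> List[Alert]:
    """Return an alert for every reading that exceeds its configured limit."""
    return [
        Alert(name, value, limits[name])
        for name, value in readings.items()
        if name in limits and value > limits[name]
    ]


def run_check(readings: Dict[str, float],
              limits: Dict[str, float],
              notify: Callable[[Alert], None]) -> int:
    """Check one set of readings and notify once per alert; return the count."""
    alerts = find_alerts(readings, limits)
    for alert in alerts:
        notify(alert)
    return len(alerts)


if __name__ == "__main__":
    # Illustrative readings and limits; a real agent would sample the system.
    readings = {"disk_used_pct": 93.0, "cpu_temp_c": 61.0}
    limits = {"disk_used_pct": 90.0, "cpu_temp_c": 75.0}

    # Stand-in for paging or e-mailing the remote engineers.
    run_check(readings, limits, lambda a: print(f"ALERT {a.metric}: "
                                                f"{a.value} > {a.limit}"))
```

Passing the notification step in as a callback keeps the detection logic
separate from the delivery channel, so the same check can be exercised in
tests without contacting anyone.
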
Keeping customers informed of changes that affect them is also vital to
maintaining availability. Sun's new patch management process helps make it
easy for customers to understand which software patches they need to
download and install. Sun PatchPro(TM), Sun's new, automated
patch-identification tool, has already facilitated thousands of patch
downloads for customers in just 10 months of operation. And since
Solaris(TM) 8 can be upgraded even while the production OS is operating,
these patches can be installed without interrupting service.
Likewise, Sun has implemented Sun(SM) Alert, an e-mail notification program
that proactively notifies customers of known product issues that could have
an impact on their systems' availability.
Making change happen is challenging; making it last, even more so. To do
both, Sun has chartered a number of new organizations and structures that
benefit both its employees and its customers.
For example, the company created the Sun Configurations group, which delivers
Sun systems that tightly integrate, pre-test and benchmark Sun hardware with
software from key Sun partners. The first Sun reference configuration is
based on the VOS(TM) initiative, which was jointly developed and supported
by VERITAS, Oracle and Sun. These systems are easier to order and deploy
than custom systems, as well as more affordable and more reliable. In a
world where availability is paramount, shortening the time involved in
bringing a system online is crucial.
In addition, because they are based on the VOS reference configuration, these
systems are supported through a new joint escalation center (JEC) that
provides an integrated approach to customer problem resolution: One call is
all it takes, whether the problem originated with a VERITAS, Oracle or Sun
product.
Maynard Webb, President of eBay Technologies, has said: "When you're in a
multi-vendor environment, a solution that makes the products appear like they
are from one company is a great answer. Rather than working three escalation
chains back and forth, the JEC enables us to work one. Though we hope we
don't have to use the JEC very often, it is great to know it's there."
Beyond these programs, Sun makes availability a pocketbook issue for all of
its employees. In recent months, Sun has revamped its compensation plans to
tie its quality improvement efforts directly to its bonus plan. Under the new
rules, 30% of a Sun employee's bonus compensation is dependent on the result
and success of availability and quality programs. Everyone is touched by
these goals, from recent college grads to all top executives.
Conclusion: Availability is All
As the Net Effect takes hold, society's reliance on smart web services will
grow, as will the importance of availability as the only benchmark that
counts. Raw processor speed will mean nothing if a network failure interrupts
supply chains, stops the trading of stocks, hampers a doctor in surgery, or
prevents a critical e-mail from reaching its destination. Sun is making
fundamental improvements in its skills, processes and structures to deliver
on the promise of continuous, real-time, service-level availability.