Keeping your business running in today's increasingly complex environment amid government regulations, natural disasters, and pandemic and terrorist threats is no small task. Disruption to your business can quickly translate into lost revenue, poor productivity, customer dissatisfaction, or worse.
Business continuity and disaster recovery are no longer optional, but how do you know if your plan is right for your business? Here to shed some light on the topic are two Sun experts, Hal Stern, senior vice president of global systems engineering, and Randy Chalfant, chief technology officer and distinguished marketing director.
Q: Does every enterprise need to be concerned with disaster recovery and business continuity?
Hal: Yes, and it's important to note that they are different things. Many think of disaster recovery as high availability. But disaster recovery goes much further than dealing with a single point of failure in the infrastructure. Issues can include brand hiccups or poor customer satisfaction that lead to real business operations problems. A variety of studies have shown that after 48-72 hours, there can be a material impact on a company.
Business continuity and disaster recovery (BC/DR) are about more than the infrastructure going down. They run the spectrum from knowing what to do if your PC is stolen with sensitive data, to what happened to Sun in New York when a steam pipe exploded near our building and we were locked out for four days. How do you get people to work? How do you make sure employees can continue operations and deal with customers even without physical access to their office? You need to assess how long it will take to recover from an event — hours, days, or weeks?
Randy: Business continuity is critical not only for business reasons. There is legislation that says you must have the ability to recover your business. The Disaster Recovery Institute definition of a disaster is a sudden, unplanned catastrophic event causing unacceptable damage or loss. The cost of non-recovery is even larger because many businesses that don't recover quickly simply become insolvent.
However, this rationale often fails to resonate with business leaders. Something like 50 percent of organizations worldwide have disaster recovery capability, but of them, only 50 percent are tested. If you're not testing your ability to recover, there's no proven efficacy. Perhaps a better business case is needed, so that business leaders see the effect on the bottom line.
Q: What’s the difference between business continuity and disaster recovery?
Randy: A disaster recovery plan defines the resources, actions, tasks, and data required to manage the technology recovery effort and is a component of business continuity. A business continuity plan is broader and applies to all arrangements and procedures that enable an organization to respond to a disaster.
Hal: The historical hard line dividing what IT does and what the business does is gone. Disaster recovery is when something bad happens and you suffer the loss of infrastructure. Business continuity forces you to ask, “What are we going to do?” It’s the public perception of your company while you’re recovering from the disaster. Are your Web sites there? Can people place orders? How does your presence look to customers, partners, or competitors? BC and DR are intimately tied together, and a business continuity plan needs to be in place for any threat to your infrastructure.
Q: Some say that BC/DR is just an expensive insurance plan. Thoughts?
Randy: They’re probably right. People often overbuild their infrastructure and don't have enough money left over to keep people operating it. Companies often buy a one-size-fits-all solution, which may be overkill. Only about 25 percent of infrastructures need a Tier One plan with the highest level of protection. It's an insurance policy with an associated cost, but the perception of it being expensive is mostly caused by implementations that are far greater than the application deserves.
Hal: The issue here is recovery time, so if you build an infrastructure that recovers quickly from a failure and you realize that you don't need that level for many of your applications, then you begin categorizing your applications based on the longest time you can be without them. It's one thing to go without email for a day, and another thing for the stock exchange to go down, where minutes are measured in millions of dollars.
Randy: I’ve noticed that the cost of a Tier One infrastructure that hits five nines (99.999 percent) availability for one company can be vastly different than it is for another. Hal used the example of Wall Street where huge amounts of money are at risk, so the infrastructure tends to be premium and justifiably so. Balance here is key, and that comes as a result of looking at the applications, cataloging them, categorizing them, reading the taxonomy, understanding the financial impact, and building against that set of criteria.
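As a rough illustration of what "five nines" means in practice, the short sketch below converts an availability percentage into the downtime it allows per year. The function name and the 365-day year are assumptions made for this example; they do not come from the interview.

```python
# Illustrative sketch: convert an availability target into allowed downtime.
# Assumes a 365-day year (leap years ignored).

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year permitted by an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


for target in (99.0, 99.9, 99.99, 99.999):
    print(f"{target}% availability allows about "
          f"{allowed_downtime_minutes(target):.1f} minutes of downtime per year")
```

Under these assumptions, five nines leaves a little over five minutes of downtime per year. That is why a Tier One infrastructure is costly and usually makes sense only for the applications that cannot tolerate more.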
Q: Where does a company begin assessing its disaster recovery plan?
Randy: It starts with a business impact analysis. Every business is comprised of three things — top-line growth, operational efficiency, and risk reduction. There are processes designed to hit each of these, which become automated in an IT infrastructure. If you make a change in the infrastructure, you can expect X impact on business objectives. You determine how the financial environment is impacted with the current state of preparedness, and determine what IT resources are tied to the business resources. You then look at recovery point objectives (RPO) and recovery time objectives (RTO) and the data protection window. The net-net is a measure that says if the cost of the RTO and RPO is smaller than the financial business impact, implement the technology. If these are higher than the business impact, then either don't do it, or move down to a lower class of availability.
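The decision rule Randy describes can be sketched in a few lines of code. This is a simplified illustration, not a method from Sun. The `Tier` class, the `choose_tier` function, and every dollar figure are hypothetical, and each tier's cost of meeting its RTO and RPO is reduced to a single annual number.

```python
# Illustrative sketch of the rule: implement a tier only if the cost of meeting
# its RTO/RPO is smaller than the financial business impact; otherwise move
# down to a lower class of availability, or do nothing.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tier:
    name: str
    annual_cost: float  # hypothetical cost of meeting this tier's RTO and RPO


def choose_tier(tiers: List[Tier], business_impact: float) -> Optional[Tier]:
    """Return the highest tier whose cost is below the business impact.

    `tiers` must be ordered from highest protection to lowest.
    Returns None when no tier is justified by the impact.
    """
    for tier in tiers:
        if tier.annual_cost < business_impact:
            return tier
    return None


# Made-up figures for illustration only.
tiers = [
    Tier("Tier One", 900_000),
    Tier("Tier Two", 300_000),
    Tier("Tier Three", 50_000),
]

choice = choose_tier(tiers, business_impact=400_000)
print(choice.name if choice else "No tier justified")
```

In a real business impact analysis the impact figure would come from the financial assessment of each application, and the comparison would be repeated for each application rather than once for the whole infrastructure.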
Hal: To assess your plan, get all your business functional leads who would be affected by a disaster together to do scenario planning. Pretend something just happened, discuss what happens next, walk through what employees see, what customers see, what the users see, what you say to the press. If a datacenter in New York City goes down, you can turn on a secondary site in New Jersey. But what if the system administrators can't physically get out there? You may have this assumption that you just turn on the other site when in fact there may be no one physically there to do it. Or phone access may be disrupted. You begin to look at other ways of remote management and what resources you can call on.
Start at the physical layers to determine what you’re protecting and how you provide recovery for it. Decide who has access to what and create internal and external communications plans. Then layer on various risk management plans. The biggest component is understanding the architecture of everything you’re trying to protect, so you don’t leave out something that may not be under your control like your extra network link or the ability to shift your main operations from one location to another so that everything continues to work. It's not just a question of replicating components. At some point you do a post-event analysis to see what worked and what didn't. You always discover what breaks when it's under the most duress.
Q: Is BC/DR cheaper or more expensive to implement today than two or three years ago?
Randy: It's a lot more expensive if you continue to do things without thinking about it. It's a lot less expensive if you build a tiered infrastructure that's appropriate to the needs of the business.
Hal: It's cheaper to implement because you’re hopefully thinking about being able to minimize the overhead cost of implementing disaster recovery or business continuity. At the same time, more infrastructure now is critical to the overall operation of the business such that there are more things needing to be included in the business continuity plan.
Q: Many companies are taking BC/DR back in house. Thoughts?
Randy: Good for them. I believe people should be the masters of their destiny, and too often I think people are misled by external interests that aren't aligned with the needs of their business. If companies understand the business needs, the cost of operations, and the cost of sustaining an application's ability to deliver value to the business versus the cost to protect it, they’re in a better position to make good decisions.
Hal: I see many new companies looking to outsource their entire infrastructure. They rent infrastructure rather than buy it, but they still have responsibility for building the plans around business continuity. So it's not a question of who owns the servers, or who has the license keys for the software — it's a question of who has responsibility for ensuring the continuous operations of the infrastructure. I see more and more companies owning up to their responsibilities there, regardless of who has the data center operational keys.
About Hal Stern
Hal Stern is senior vice president of Systems Engineering at Sun Microsystems, with responsibilities for technical leadership, training, and management of Sun's customer engineering teams in Global Sales and Services. In his 17 years with Sun, Stern has been CTO of Software, CTO of Sun Services, chief architect of Sun Professional Services, and CTO for Sun infrastructure products. His technical interests include security, performance, reliability, massive scale of networked systems, and data management models for the "read-write Web."
About Randy Chalfant
Randall Chalfant is responsible for Marketing Strategy and is the chief technology officer at Sun in Louisville, Colorado. Chalfant has over 33 years of experience in business development, storage, mainframes, open system servers, operating systems, applications, and networking solutions. He is responsible for a variety of technical, communication, and strategy goals at Sun. His responsibilities include driving the strategic sales process, and the analysis and development of advanced storage technologies and business strategies that drive new and emerging opportunities.