itinfraworld

Crisis and Incident Management for the 21st Century

By Louis Grosskopf, General Manager, Business Continuity Software, Sungard Availability Services

The terrorist attacks of 9/11 should serve as a wake-up call: The 20th-century approach to business continuity planning is no longer adequate to the demands of an increasingly interconnected world, buffeted by terrorism, unpredictable weather, and cybercrime. On that day, the incident response team for a major transportation coordinator picked up its 500-page business continuity document, said, “Well, this isn’t going to work,” and literally threw it in the trash. The team then turned to a whiteboard to figure out in real time how to respond.

Businesses now realize that they need to integrate the long-term, careful planning of business continuity programs with the needs of crisis and incident management teams, which must respond in real time during a crisis and answer three key questions:

• What has happened?
• What is the impact to our people and our business?
• What do we do next?

Historically, business continuity planning has been conducted independently from incident response. Often, a different part of the organization was responsible for each activity. Incident response addresses the actionable steps immediately following the disaster, while business continuity focuses on the subtle nuances of business processes and employee safety. There was a metaphorical wall between the team tasked with managing an incident and all of the rich data available in the business continuity plan.

In a complex enterprise with massive amounts of interrelated and constantly changing systems, processes, and data, static business continuity plans are nearly useless. Having a business continuity plan in place may satisfy an auditing requirement, but such a “check the box” approach will never support an effective response in a real event. Even if a static plan is updated annually, chances are it will be out of date or forgotten when needed. With the technology now available, such plans should be updated in real time. When it’s time to create a new business process, that’s the time to think about whether it’s critical to the organization or not. After all, if it’s not critical, then why is it being added in the first place?

Scenario-based crisis and incident management responses are highly dependent on pristine data and on the relationships that are established between teams during the planning process, as opposed to just being “plans.” As the saying goes, “Garbage in, garbage out.” Not only does the plan need to be based on clean, accurate, and up-to-date data, but every data point also needs context so that during an incident the responders can understand the impact of every move. The plan has to be able to see both the forest and the trees.

Incidents are never predictable. Each scenario, each instance of an incident, is unique. It could affect a whole town, a single building, or a single room. There’s no cookie-cutter plan that could cover everything. Every response has to be assembled in real time, and only those factors that apply to a given situation should be included in that specific response.

As part of the planning process, businesses need to determine the relationships between the different elements of the business and the infrastructure, and make sure that data is flowing properly between them. If a plan is good and the right data is in the database, it will show the upstream and downstream relationships from the point of incident. One scenario might involve a truck crashing into a building and taking down half the data center. A plan built around a database could demonstrate visually the area where the incident is located and all of the upstream and downstream impacts on business processes with dependencies on the equipment within that data center. It’s vital to depict in real time what’s affected by a particular incident, all without having to flip through pages and pages of paper to figure out what needs to be done. In the heat of the moment, it is impossible to rely on a written plan.
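To make the idea of upstream and downstream relationships concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the asset and process names, the `DEPENDS_ON` table standing in for the database, and the convention that "upstream" means what the failed item relies on while "downstream" means what relies on it. A real plan database would hold far more context per record.

```python
from collections import deque

# Hypothetical dependency records: each key depends directly on the items in its list.
DEPENDS_ON = {
    "order-processing": ["orders-db", "payment-gateway"],
    "orders-db": ["rack-a"],
    "payment-gateway": ["rack-b"],
    "payroll": ["hr-db"],
    "hr-db": ["rack-b"],
    "rack-a": ["data-center-east"],
    "rack-b": ["data-center-west"],
}


def invert(depends_on):
    """Build the reverse map: item -> things that depend directly on it."""
    dependents = {}
    for item, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(item)
    return dependents


def reachable(start, edges):
    """Breadth-first walk: everything reachable from start by following edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def impact(failed):
    """Return (upstream, downstream) sets for a failed item."""
    upstream = reachable(failed, DEPENDS_ON)            # what the failed item relies on
    downstream = reachable(failed, invert(DEPENDS_ON))  # what relies on the failed item
    return upstream, downstream


if __name__ == "__main__":
    up, down = impact("rack-b")
    print("Upstream:", sorted(up))
    print("Downstream:", sorted(down))
```

In this toy data, losing `rack-b` reaches `payroll` and `order-processing` even though neither names the rack directly, which is the kind of indirect impact a paper plan makes hard to see.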

Managing incidents and conducting plan or test exercises are really one and the same. Deviation from this approach creates unseen risk. The old style of thinking was that businesses needed to undertake test management, which was either a paper exercise or an informal employee fire drill. Neither of these provides a test of how a business continuity plan might perform in a real incident. Truly, there shouldn’t be any difference in how a company approaches and executes a test and a real incident response. If you’re not testing something close to the real incident, it’s not much of a test. No one is going to blow up a building to see if the plan works, but a business can pretend it blew up and dutifully walk through the planned response. The key consideration is that the test feels like the real thing; it should be run as such, in real time.

Business continuity also needs to extend beyond the borders of the organization. The Japanese tsunami and the Thai floods of 2011 show how vulnerable modern supply chains can be to region-wide events on the far side of the globe. Hard disk drives were suddenly in short supply when Thai manufacturing facilities representing 40 percent of the worldwide supply of hard drives, along with the facilities of all their suppliers, were inundated by a history-making flood. The PC manufacturers on the other side of the world probably didn’t even know that their supply chains were so vulnerable to a flood in a single area.

A business continuity plan is an integral part of any business, ensuring that the most vital functions of the business continue when an incident occurs. Incident response deals with events as they unfold, in real time. In the 21st century, the two functions must be integrated so that any response is both timely and effective.

What is needed is some way of taking the embedded knowledge available within the business continuity plan and making it available to the crisis response team in the form of a checklist or a dashboard that tells them exactly what needs to happen, in what order, and what upstream and downstream dependencies might be affected by the incident. This sort of flexibility could never be accommodated with a static, printed document. Even PDFs are not dynamic enough. Instead, the relationships and dependencies need to be tracked by a database, one that can provide written instructions on the fly.
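As a rough illustration of instructions generated on the fly, the sketch below turns stored task prerequisites into an ordered checklist for only the tasks that apply to a given incident. The task names, the `PREREQUISITES` table, and the `build_checklist` helper are all hypothetical; the ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later), which is one reasonable way to do it rather than a prescribed method.

```python
from graphlib import TopologicalSorter

# Hypothetical recovery tasks: task -> set of tasks that must be finished first.
PREREQUISITES = {
    "confirm staff are safe": set(),
    "fail over to backup site": {"confirm staff are safe"},
    "restore orders database": {"fail over to backup site"},
    "restart order processing": {"restore orders database"},
    "notify customers": {"confirm staff are safe"},
}


def build_checklist(affected, prerequisites):
    """Return the affected tasks in an order that respects their prerequisites."""
    # Keep only tasks relevant to this incident, and only prerequisites among them.
    graph = {task: prerequisites.get(task, set()) & affected for task in affected}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    steps = build_checklist(set(PREREQUISITES), PREREQUISITES)
    for number, step in enumerate(steps, start=1):
        print(f"{number}. {step}")
```

Because the checklist is computed from the data each time, adding a task or changing a prerequisite changes the next checklist without anyone reprinting a document.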