Regular testing is a vital component of disaster recovery planning

Issued by ContinuitySA
Johannesburg, Jun 30, 2015

Management 101 says: "If you can't measure it, you can't manage it." Disaster recovery 101 puts it somewhat differently: "If you don't test it, then it's not really a disaster recovery plan."

"When a disaster happens, when the power outage drags into the second day, is no time to discover that some component of your recovery plan isn't performing," says Peter Westcott - Senior BCM Advisor at ContinuitySA.

"The only way to ensure that there are no nasty surprises is to document the recovery procedures and schedule regular testing of the entire business continuity plan, including the IT disaster recovery portion. Care has to be taken to ensure the IT disaster recovery plan, usually the preserve of the IT department, is aligned with the overall business continuity plans that cover the business processes and people," he adds.

At a practical level, business continuity testing is often performed on a unit-by-unit basis, but it makes sense to test the effectiveness of the IT disaster recovery plan as a whole; indeed, IT testing can be run separately. Virtualisation can be used to mirror the entire production environment, and such a test could be performed without any disruption to the normal running of the company. Switching between the production and disaster recovery environments would then become a normal routine.

Disaster recovery exists to provide alternative processing infrastructure while your primary systems are down. In organisations where IT is a big part of product or service delivery, this option becomes even more crucial. However, the data available in DR is normally 24 hours older than in production. Many organisations cannot afford to lose (or recapture) that amount of data, so they opt instead to stay down and fix the production problem.

This is a symptom of inefficient data backup policies that only address black swan events or satisfy audits, instead of enabling the business as they are supposed to. A case in point is FNB, which suffered an outage for several hours yesterday. The question must be asked why they never switched to DR, and the answer may well be that they don't back up or replicate their data often enough to reduce the data loss to a tolerable amount. Sadly, the end result of this approach is that the DR infrastructure merely provides a false sense of security, when in reality the intention is never to actually use it.

"Considering that we are seeing hardware failure from power outages, and many insurance companies no longer cover equipment damage in that event, a proper DR policy with shortened replication times and regular testing will definitely be beneficial to the organisation's survival in the short term. This does not mean that money has to be spent. There are many good options available to shorten the recovery window, and in most cases, organisations already have the tools without realising it," Westcott concludes.