The exponential growth of data and the reliance of business on IT have forced IT to provide uninterrupted services. Unfortunately, data centers are not exempt from disasters. Most data centers are built for high availability, but a major disaster caused by human error, system breakdown, or a natural event can bring the business to its knees, and no one knows if or when it will happen. Thus, it’s important to have a Disaster Recovery Plan (DRP) and a Business Continuity Plan (BCP). While the DRP addresses the technology elements of continuity, the BCP incorporates organizational and human resources issues such as communications plans and crisis management.
According to industry data, software and hardware failures account for about half of unacceptable downtime. Less than a quarter of outages result from major events, such as fires and natural disasters. Hence, it’s important that agencies incorporate DR scenarios into service management processes. This should help leadership make better decisions about how to respond to less obvious DR scenarios, when to escalate those incidents, and whether to initiate recovery procedures rather than continue troubleshooting.
The FDA currently operates several data centers, and our minimum goal is to maintain redundant infrastructure between two centers and leverage the cloud where it makes sense. In the event of a disaster at one of the sites, we restore to alternate computing infrastructure (the DR site). Finally, defining Recovery Point Objectives (RPO) allows the FDA to identify and manage the maximum targeted period in which data might be lost due to a major incident.
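To make the RPO idea concrete, here is a minimal sketch that compares the data-loss window of an incident against a target RPO. The function name, the four-hour RPO, and the timestamps are hypothetical values chosen for illustration; they are not FDA figures.

```python
from datetime import datetime, timedelta


def meets_rpo(last_recovery_point: datetime,
              incident_time: datetime,
              rpo: timedelta) -> bool:
    """Return True if the data written since the last recovery point
    (the data-loss window) falls within the target RPO."""
    data_loss_window = incident_time - last_recovery_point
    return data_loss_window <= rpo


# Hypothetical example: a 4-hour RPO, last good backup at 08:00,
# incident at 11:30. The data-loss window is 3.5 hours.
rpo = timedelta(hours=4)
last_backup = datetime(2024, 1, 1, 8, 0)
incident = datetime(2024, 1, 1, 11, 30)
within_rpo = meets_rpo(last_backup, incident, rpo)
```

In practice the same comparison would typically run against real backup or replication timestamps, so that a recovery point drifting past the RPO is flagged before an incident occurs.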
The FDA’s disaster recovery solution encompasses a large variety of technologies and processes. Our DRP and BCP solution needs to meet a multitude of requirements for FDA to continue its mission during a disaster. During our planning, a risk analysis was performed to determine the greatest threats to production. It found that, by following industry best practices, the FDA maintains an appropriate level of redundancy within each of its data centers. The backup and recovery processes are in line with industry standards and will protect the FDA against most types of data loss.
Resources are limited, so it is important to identify the business-critical applications essential to operations. It’s also important to weigh the risk against the benefits and costs. In the absence of any DRP or BCP, there is a significant level of assumed risk. When we commenced our DRP, we needed to perform a thorough discovery of all production applications. A Business Impact Analysis (BIA) was completed for each application to set its priority. The applications were then categorized into disaster recovery tiers, with the initial focus on the high tier (the most critical applications).
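The tiering step can be sketched as a simple classification over BIA results. This is an illustrative sketch only: the `Application` fields, the downtime thresholds, and the sample application names are assumptions, not the FDA's actual BIA criteria.

```python
from dataclasses import dataclass


@dataclass
class Application:
    name: str
    # Hypothetical BIA output: longest outage the business can tolerate.
    max_tolerable_downtime_hours: float


def assign_tier(app: Application) -> str:
    """Map a BIA result to a DR tier using assumed thresholds."""
    if app.max_tolerable_downtime_hours <= 4:
        return "high"
    if app.max_tolerable_downtime_hours <= 24:
        return "medium"
    return "low"


def recovery_order(apps: list[Application]) -> list[str]:
    """List application names with the least downtime tolerance first."""
    ranked = sorted(apps, key=lambda a: a.max_tolerable_downtime_hours)
    return [a.name for a in ranked]


# Hypothetical inventory produced by application discovery.
inventory = [
    Application("reporting-portal", 72),
    Application("submission-gateway", 2),
    Application("internal-wiki", 24),
]
tiers = {app.name: assign_tier(app) for app in inventory}
order = recovery_order(inventory)
```

A real BIA usually weighs more than downtime tolerance, such as data sensitivity, dependencies, and cost of recovery, but a single ranked field is enough to show how tiers drive the order of recovery work.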
Lastly, we explored multiple options for a dedicated DR site. An extensive analysis was done to compare the solutions, and a shortlist of options was created. After completing a thorough site and cost analysis, it was clear that hosting a solution at one of the FDA facilities would give the FDA the best chance of meeting its recovery goals. The final FDA Disaster Recovery solution will include automated recovery of the critical applications within the FDA.