Getting Back to I&T Basics: The Art of War is probably the most famous treatise on military strategy, but it’s actually the lesser-known Dē Rē Mīlitārī, written by Flavius Vegetius Renatus, that yields perhaps the most commonly quoted strategy phrase: “Igitur qui desiderat pacem, praeparet bellum” or “therefore, he who desires peace should prepare for war.” The idea of preparing for battle in the business world is not as extreme as you might think. It’s just a different kind of battle.
Back in Article II, we spent some time talking about identifying and mitigating risk, and admittedly, there are a lot of risks out there. In my mind, every organization’s biggest risk is the same. What happens if an event occurs that causes the organization, or even just certain key parts of it, to cease operating as needed? What happens when I cannot service my customers? This, ladies and gentlemen, is the war organizations must prepare against. Policies can be made, risks can be mitigated, controls can be implemented, but at the end of the day the question of a disaster or an incident occurring that disrupts your organization is not a question of if but rather a question of when.
If you do not believe me, then look back just five years. COVID-19 forced every single organization to work out how to continue operating when the necessary number of staff could not go to the office or operate manufacturing facilities, and when the components needed to build products could not be sourced due to supply chain restrictions. The upstream and downstream impact was monumental. Pandemic planning is just one component of Business Continuity Management, but it drives home the importance of creating the plan today, so that the organization can be ready to respond to the crisis or incident that might occur tomorrow.
- Article I – Foundations First: Crafting Effective IT Governance and Policies
- Article II – Strengthening the Pillars of Governance: IT and Vendor Risk Management
- Article III – Essential Safeguards: Building Your IT General Controls Framework
- Article IV – Resilient by Design: Fundamentals of Business Continuity and Incident Response
In the fourth installment of the ‘Returning to I&T Basics’ series, we will examine key elements involved in developing Business Continuity Management (BCM) and Incident Response (IR) plans. While there is a distinction between these two types of plans, both serve important roles in protecting an organization from availability risks. Effective planning and preparation today can help maintain organizational resilience against disruptions and support the protection of data, operations, and reputation.
Understanding Business Continuity Management
Business Continuity Management planning is a comprehensive approach designed to enable an organization to resume delivering its critical services or products at predefined levels following a disruption. At its core, BCM serves to minimize an organization’s operational downtime, risk of financial loss, and reputational damage due to a disaster or similar incident.
The creation of business continuity and incident response plans isn’t just a sound practice; regulatory bodies across various industries mandate the implementation of robust BCM practices to ensure resilience against crises. Financial institutions must adhere to guidelines from the Federal Financial Institutions Examination Council (FFIEC). Healthcare organizations follow directives from the Health Insurance Portability and Accountability Act (HIPAA) and the Joint Commission. The energy sector is guided by standards from the North American Electric Reliability Corporation (NERC). US community water organizations are subject to America’s Water Infrastructure Act of 2018.
Despite strict requirements, we should not approach BCM as another aspect of regulatory compliance. Simply developing a plan to check a box will doom efforts from the start; rather, we should approach BCM by fostering a culture of resilience and preparedness throughout our entire organization. By adopting this mindset, companies can better navigate through unexpected disruptions and maintain their competitive advantage.
Key Components of a Business Continuity Management Plan
1. Business Impact Analysis (BIA)
The first and arguably most crucial step in developing a BCM plan is conducting a thorough Business Impact Analysis. A BIA identifies an organization’s business processes (e.g., business units, systems, and applications), assessing the potential impact of a disruption to each process and how it affects the business as a whole. The BIA then determines the prioritization of resources and identifies any process interdependencies. The result should be three key pieces of information for each process:
- Recovery Time Objective (RTO): The designated period within which a business process must be restored to enable the resumption of operations.
- Recovery Point Objective (RPO): The maximum amount of data loss, measured in time, that can be tolerated while maintaining minimal operational impact. It determines how recent the backup used for restoration must be.
- Maximum Tolerable Downtime (MTD): The longest duration that a business process may remain unavailable before resulting in critical or irreversible consequences.
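As a minimal sketch of how these three values might be recorded and sanity-checked, the Python below stores one row per process and flags any process whose RTO is longer than its MTD, since a restoration target beyond the tolerable downtime cannot work. The `BiaEntry` class, the process names, the hour values, and the ordering rule are all hypothetical illustrations, not figures from a real analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiaEntry:
    """One BIA result row; every name and number here is hypothetical."""
    process: str
    rto_hours: float  # Recovery Time Objective
    rpo_hours: float  # Recovery Point Objective
    mtd_hours: float  # Maximum Tolerable Downtime

    def is_consistent(self) -> bool:
        # The restoration target must fall within the tolerable downtime.
        return 0 <= self.rto_hours <= self.mtd_hours and self.rpo_hours >= 0


def recovery_order(entries):
    """Assumed priority rule: shortest MTD first, then shortest RTO."""
    return sorted(entries, key=lambda e: (e.mtd_hours, e.rto_hours))


entries = [
    BiaEntry("Payroll", rto_hours=48, rpo_hours=24, mtd_hours=72),
    BiaEntry("Online ordering", rto_hours=4, rpo_hours=1, mtd_hours=8),
    BiaEntry("Internal wiki", rto_hours=120, rpo_hours=24, mtd_hours=96),
]

# Processes whose objectives contradict each other and need another look.
inconsistent = [e.process for e in entries if not e.is_consistent()]
# Processes in the order they would be recovered under the assumed rule.
ordered = [e.process for e in recovery_order(entries)]

print("Needs review:", inconsistent)
print("Recovery order:", ordered)
```

In this made-up data set, the internal wiki is flagged because its 120-hour RTO exceeds its 96-hour MTD, which is the kind of inconsistency a BIA review should catch before the plan is finalized.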
2. Risk Assessment
Following the BIA, organizations should conduct comprehensive risk assessments to identify threats and vulnerabilities that could disrupt operations. Typical threats include natural disasters, cyber-attacks, equipment failures, and human errors. Risk assessments prioritize these threats based on likelihood of occurrence and potential impact, laying the groundwork for actionable risk mitigation strategies.
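One common way to prioritize by likelihood and impact is a simple score that multiplies the two. The sketch below assumes a 1-to-5 scale for each factor and breaks ties in favor of higher impact. The scale, the tie-break rule, and the example threats and ratings are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (severe)

    @property
    def score(self) -> int:
        # Simple likelihood x impact score.
        return self.likelihood * self.impact


def prioritize(threats):
    """Highest score first; ties go to the threat with the higher impact."""
    return sorted(threats, key=lambda t: (t.score, t.impact), reverse=True)


# Hypothetical ratings, not drawn from any real assessment.
threats = [
    Threat("Regional flood", likelihood=2, impact=5),
    Threat("Ransomware", likelihood=4, impact=5),
    Threat("Server hardware failure", likelihood=3, impact=3),
    Threat("Accidental file deletion", likelihood=5, impact=2),
]

ranked = [(t.name, t.score) for t in prioritize(threats)]
for name, score in ranked:
    print(f"{score:>2}  {name}")
```

A multiplied score is easy to explain to stakeholders, but it treats a rare, severe event the same as a frequent, minor one, which is why the tie-break here leans toward impact.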
3. Recovery Strategies Development
Once critical operations and risks have been determined, organizations should design specific recovery strategies tailored to mitigate identified threats. Recovery strategies must address the necessary expertise, personnel, facilities, and equipment needed to resume operations. This may include data backups, alternate site arrangements such as hot, warm, or cold sites, cloud solutions, redundancy in systems, and clear, actionable recovery procedures.
4. Plan Development and Documentation
A BCM plan details the strategies, procedures, roles, responsibilities, and resources required to swiftly recover critical business operations. Plans must be clear, actionable, and accessible, outlining steps employees should follow during disruptions. Documentation must include detailed recovery timelines, emergency contact lists, vendor agreements, and roles and responsibilities matrices. Please remember that there are certain key responsibilities that need to be explicitly assigned, such as who determines when the BCM plan is implemented and who is responsible for making external communications with customers, regulators, and law enforcement.
5. Training and Awareness
Effective BCM relies heavily on people. Staff training and awareness programs ensure employees understand their roles and responsibilities in the event of a disruption. Regular training sessions, mock drills, tabletop exercises, and awareness campaigns reinforce preparedness and improve response efficiency during actual incidents.
6. Testing and Maintenance
Regular testing and updating of the BCM plan is essential to maintain effectiveness. Testing exercises such as technical recovery tests, drills, simulations, and tabletop exercises validate the plan’s practicality and identify gaps or areas for improvement. Be sure to benchmark any recovery tests against the RTOs established within the plan to ensure their validity. Additionally, a systematic maintenance schedule ensures plans stay current, reflecting organizational changes, emerging threats, or lessons learned from previous incidents.
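Benchmarking a recovery test against an RTO comes down to comparing the measured restoration time with the target. The sketch below shows one way that comparison might be recorded. The function name, the timestamps, and the 4-hour RTO are hypothetical.

```python
from datetime import datetime, timedelta


def evaluate_recovery_test(started: datetime, restored: datetime, rto: timedelta) -> dict:
    """Compare a measured recovery time with the RTO from the plan."""
    elapsed = restored - started
    return {
        "elapsed": elapsed,
        "met_rto": elapsed <= rto,
        # A negative margin means the test overran the RTO by that amount.
        "margin": rto - elapsed,
    }


# Hypothetical test: restoration took 5.5 hours against a 4-hour RTO.
result = evaluate_recovery_test(
    started=datetime(2025, 1, 15, 9, 0),
    restored=datetime(2025, 1, 15, 14, 30),
    rto=timedelta(hours=4),
)

status = "met" if result["met_rto"] else "missed"
print(f"RTO {status}; recovery took {result['elapsed']}")
```

A missed RTO in a test does not necessarily mean the recovery procedure is wrong. It may mean the RTO set during the BIA was unrealistic, and either finding is worth feeding back into the plan.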
Incident Response Planning Fundamentals
In recent years, the line between BCM and incident response (IR) has become increasingly blurred. Where BCM traditionally provides a broad framework for resuming operations following a disruption, often with an emphasis on large-scale events or disasters, IR focuses more narrowly on addressing cybersecurity incidents as they unfold. An effective IR plan emphasizes rapid detection, containment, and recovery to minimize impact, protect critical assets, and support the timely restoration of normal business operations.
Core Elements of an Incident Response Plan
1. Preparation
Just like with BCM, an incident response team should be assembled, with clearly defined roles and responsibilities. Preparation must also address the necessary resources, tools, and communication protocols needed to set the stage for a prompt and effective response, ultimately reducing incident recovery time.
When an incident occurs, reach out to your cyber insurance carrier early, but you don’t have to wait for an incident to take advantage of their knowledge, skillset, and resources. A simple conversation today might mean a few minutes saved during an incident tomorrow, and with incidents, every minute counts.
2. Identification
Accurate and swift incident identification is crucial. Organizations must have established processes for monitoring and detecting anomalies, security breaches, or operational disruptions. Define clear indicators of compromise, as well as procedures to help determine whether an incident requires activation of the IR plan.
Because incidents vary widely in scope and severity, organizations should establish a formal classification scale to guide the level of response. Paired with this, tailored playbooks for different incident types ensure that responders have predefined steps ready for execution. Together, these tools provide both the decision framework and the practical guidance necessary to react appropriately — whether dealing with a quarantined virus on a single endpoint or a ransomware outbreak impacting critical servers.
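A classification scale paired with playbooks can be sketched as a small lookup. The example below assumes four severity levels, three classification inputs, and an activation threshold at the third level. The level names, criteria, threshold, and playbook text are all hypothetical, and a real scale would reflect the organization's own risk tolerance.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def classify(affected_systems: int, critical_asset: bool, spreading: bool) -> Severity:
    """Assumed criteria: critical assets and active spread drive severity."""
    if critical_asset and spreading:
        return Severity.CRITICAL
    if critical_asset or spreading:
        return Severity.HIGH
    if affected_systems > 1:
        return Severity.MEDIUM
    return Severity.LOW


# Hypothetical playbook summaries keyed by severity level.
PLAYBOOKS = {
    Severity.LOW: "Log the event and confirm the endpoint is clean.",
    Severity.MEDIUM: "Isolate affected endpoints and notify the IR lead.",
    Severity.HIGH: "Activate the IR plan and convene the IR team.",
    Severity.CRITICAL: "Activate the IR plan, notify executives, and engage outside support.",
}


def requires_ir_activation(severity: Severity) -> bool:
    # Assumed threshold: HIGH and above activate the IR plan.
    return severity >= Severity.HIGH


# The two scenarios from the text above, with made-up inputs.
quarantined_virus = classify(affected_systems=1, critical_asset=False, spreading=False)
ransomware_outbreak = classify(affected_systems=6, critical_asset=True, spreading=True)

for label, sev in [("Quarantined virus", quarantined_virus),
                   ("Ransomware outbreak", ransomware_outbreak)]:
    print(f"{label}: {sev.name} -> {PLAYBOOKS[sev]}")
```

Keeping the criteria this explicit means the person on call at 2 a.m. applies the same scale as everyone else, rather than judging severity by instinct.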
3. Containment
Once identified, incidents must be swiftly contained to minimize damage. Containment strategies include isolating affected systems, shutting down impacted services, or applying immediate corrective actions to prevent further damage or data loss. Effective containment limits the incident’s spread, significantly reducing overall impact.
4. Eradication
After containment, focus should shift to eradicating the root causes of the incident. Eradication might involve removing malware, patching vulnerabilities, repairing affected systems, and confirming that threats have been completely eliminated. Thorough eradication ensures that threats do not reemerge.
5. Recovery
Recovery focuses on restoring affected systems and operations to their normal states. This includes verifying system integrity, restoring data from secure backups, and gradually resuming business operations. Recovery phases are critical and must be managed systematically to avoid introducing new vulnerabilities or errors.
6. Lessons Learned and Improvement
After each incident, the IR team should complete post-incident reviews to analyze what occurred, how it was handled, and what could be improved. Lessons learned sessions contribute to continuous improvement by identifying necessary updates to the IR plan, training programs, and security measures.
You do not need an incident to learn lessons and identify plan improvements. Complete an annual scenario-based tabletop exercise with the IR team. Not only does this give the team a chance to logically step through the plan, but it also provides practice in making decisions. Keep in mind, though, that you get out what you put in.
Integration of BCM and IR Plans
Though distinct, BCM and IR plans share complementary objectives and must be closely integrated. Effective integration ensures that immediate incident responses smoothly transition into longer-term recovery strategies. Coordinating both plans prevents duplication, confusion, and operational inefficiencies.
Best Practices for Effective Integration:
- Unified Command Structure: Clearly define leadership and reporting structures across BCM and IR teams.
- Shared Resources and Communication Tools: Common communication channels and shared resources improve coordination and response times.
- Joint Exercises and Drills: Regularly conducted joint exercises strengthen collaborative effectiveness and ensure cohesive responses.
- Integrated Documentation: Consolidating BCM and IR documentation facilitates consistent and cohesive incident management.
Common Challenges and Solutions
Organizations often encounter several challenges when developing BCM and IR plans. Understanding these challenges and proactively addressing them significantly strengthens preparedness.
- Insufficient Management Support: Ensure executive buy-in through demonstrated ROI and alignment with organizational strategic objectives.
- Outdated Plans: Regular updates and scheduled reviews are critical for plan effectiveness and relevance.
- Poor Employee Awareness: Establish frequent and effective communication, training programs, and awareness campaigns.
- Complex Documentation: Simplify plans to maintain clarity and ensure accessibility during incidents.
Conclusion
Business Continuity Management and Incident Response planning are the bedrock of organizational security and resilience. If you want to safeguard organizational stability in times of “peace,” the organization must prepare for the rigor of “war” (i.e., business disruptions, disasters, and cyber events). Through methodical assessments, clear documentation, rigorous testing, and ongoing maintenance, as well as training employees and securing leadership commitment, you can be assured that every layer of the organization stands ready when disruptions strike. By embracing this mindset, organizations can withstand turbulence, recover swiftly, and continue operating with confidence, proving that true resilience is built not in calm, but in preparation for conflict.
Join us next month as we learn about cybersecurity fundamentals, which is perfect since October happens to be Cybersecurity Awareness Month. Until then, stay frosty.