Incident response planning can be risky business. It requires you to consider what could happen in the future so you can plan for it now. Underestimating the risk can result in severe strategic, reputational, operational, or financial failures, but overestimating it can be inefficient and costly, taking resources away from other important business and security matters.
So, where do you draw the line?
The best incident response planners know the secret to planning can be found in one simple word: effective. While it is not possible to predict every type of incident, you can form a plan to be used as a guide for all your responses. This is a core idea in the NIST Computer Security Incident Handling Guide (SP 800-61). Drawing on that guide, this article breaks incident handling into six phases: detection, analysis, containment, eradication, recovery, and postmortem. We will explore each phase and provide practical examples along the way.
Detection is the phase in which your organization is alerted to a suspicious event. For your incident response plan to be most effective, your detection plans should answer two questions:
- What methods do you have in place to identify potential incidents?
- When a possible incident is detected, what are the first steps you should take?
Detection methods are most successful when they exist in a layered environment. Having multiple channels for detection casts a wide net, and correlating similar data points across those channels (event correlation) can help you see more clearly whether an event is an incident. Some common detection methods include:
- Alerts: These include intrusion detection system (IDS) alerts, security information and event management (SIEM) system alerts, anti-malware application alerts, etc.
- Logs: These include operating system logs, service logs, application logs, etc.
- Publicly Available Information Sources: These include mailing lists, press releases and briefings, social media posts, other web posts, etc.
- People: These include employees, customers, third parties, etc.
Each of these detection methods can provide valuable information regarding an incident's authenticity, origins, and/or activities. Documenting the detection methods used by your organization can help identify any gaps in detection, which can result in a more adaptable plan for any type of incident.
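To make event correlation concrete, here is a minimal sketch in Python. It assumes each detection channel reports events with a source label, a host name, and a timestamp. The `Event` class, the `correlate` function, the 30-minute window, and the sample data are all hypothetical choices for illustration, not part of any specific tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    source: str        # hypothetical channel label: "alert", "log", "public", "person"
    host: str          # affected system the event refers to
    seen_at: datetime  # when the channel reported it


def correlate(events, window=timedelta(minutes=30), min_sources=2):
    """Return hosts reported by at least `min_sources` distinct channels
    within a span of time no longer than `window`."""
    by_host = defaultdict(list)
    for event in events:
        by_host[event.host].append(event)

    flagged = {}
    for host, host_events in by_host.items():
        host_events.sort(key=lambda e: e.seen_at)
        for i, first in enumerate(host_events):
            # Collect the channels that reported within the window starting here.
            sources = {
                e.source
                for e in host_events[i:]
                if e.seen_at - first.seen_at <= window
            }
            if len(sources) >= min_sources:
                flagged[host] = sorted(sources)
                break
    return flagged


# Made-up sample data: two channels agree on web-01 within 12 minutes,
# while the two reports about db-02 are hours apart.
events = [
    Event("alert", "web-01", datetime(2024, 1, 5, 9, 0)),
    Event("log", "web-01", datetime(2024, 1, 5, 9, 12)),
    Event("person", "db-02", datetime(2024, 1, 5, 9, 20)),
    Event("log", "db-02", datetime(2024, 1, 5, 14, 0)),
]
print(correlate(events))
```

With this sample data, only `web-01` is flagged, because an alert and a log entry point to it within the same window. A real deployment would likely lean on a SIEM for this work; the sketch only shows why layered channels make a suspicious event easier to confirm.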
When all the pieces start to come together and a suspicious event is detected, your plan should outline three steps to take during the detection stage:
- Assign a lead incident handler. The lead incident handler functions as the primary contact, accepts overall accountability for the incident's handling, and coordinates activities among other handlers, as needed.
- Document detection details. Be sure to include things like the date and time the incident occurred, location of the incident, and details of affected systems (e.g., serial numbers, host names, IP addresses, etc.).
- Begin recording facts. As the response process begins, record all facts about the incident, including steps taken and evidence gathered. Thorough documentation can be referenced to assist with the response process and can serve as evidence in the event of legal proceedings. Exercise discretion when recording facts about incidents that affect personally identifiable information or confidential documents, as documenting these specifics could result in further unauthorized disclosure.
When a potential incident is detected, the early steps can be some of the most critical. It is important to have first steps outlined to ensure that regardless of the type of event, personnel know what to do, when to do it, and who is responsible for initiating the response process.
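The three detection steps above could be captured in a simple record structure. The following Python sketch is one possible shape, not a prescribed format. The class names, field names, and the `sensitive` flag (which stores a placeholder instead of the details) are assumptions made for illustration, and the sample values are invented.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AffectedSystem:
    host_name: str
    ip_address: str
    serial_number: str = ""


@dataclass
class IncidentRecord:
    lead_handler: str      # step 1: the primary contact for the incident
    occurred_at: datetime  # step 2: detection details
    location: str
    affected_systems: list = field(default_factory=list)
    facts: list = field(default_factory=list)  # step 3: running fact log

    def record_fact(self, description, recorded_by, sensitive=False):
        """Append a timestamped fact to the log.

        Assumption for this sketch: when a fact involves PII or confidential
        documents, only a placeholder is stored, so the log itself does not
        become another place where the data is disclosed.
        """
        text = "[details held in restricted storage]" if sensitive else description
        entry = {
            "recorded_at": datetime.now(timezone.utc),
            "recorded_by": recorded_by,
            "fact": text,
        }
        self.facts.append(entry)
        return entry


# Invented example values.
incident = IncidentRecord(
    lead_handler="J. Rivera",
    occurred_at=datetime(2024, 1, 5, 9, 0, tzinfo=timezone.utc),
    location="Branch office",
    affected_systems=[AffectedSystem("web-01", "10.0.0.15", "SN-0001")],
)
incident.record_fact("IDS alert reviewed and confirmed as suspicious.", "J. Rivera")
incident.record_fact("Exported customer list found on host.", "J. Rivera", sensitive=True)

for entry in incident.facts:
    print(entry["recorded_by"], "-", entry["fact"])
```

Timestamping each entry as it is recorded keeps the log usable later as a timeline, whether for the response team or for legal proceedings.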
Analysis is the phase in which your organization takes steps to verify whether a suspected event is an incident or not. For your incident response plan to be most effective, your analysis plans should include details to help you answer two questions:
- What characteristics can you observe about the event?
- How should you classify the incident?
Regardless of the type of event, your plan should encourage incident handlers to look at four characteristics:
- Scope: How wide-reaching is the event? Determine if the event is localized to a certain area, system, user, etc., or if it is a widespread occurrence.
- Origins: Where did the event begin? Trace the incident's origins beyond the observable symptoms and back to the root cause.
- Occurrence Patterns: What is the event currently doing? If the event is spreading, determine how and at what rate.
- Recurrence Details: Has this event happened before? If so, determine what controls were missing or failed to prevent the event from happening again.
With the answers to these questions in mind, handlers can confirm if the event is an incident and begin walking through the classification process.
Classifying an incident allows your team to communicate about the nature and scope of an incident, determine which response plans should be implemented, and perform trend analysis during post-incident activities. Two common forms of classification include:
- Categories: Example categories could include data breach, malicious code, social engineering, and third-party incidents.
- Severity: Severity is often placed on a scale (e.g., low, medium, high), determined by evaluating the incident's functional impact, information impact, and recoverability.
Including classification strategies in your response plan helps ensure you have the resources you need, so you don't have to make as many decisions in the moment.
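As a rough illustration of severity classification, the Python sketch below rates each of the three factors on the same scale and takes the worst rating as the overall severity. The `Rating` scale and the "worst factor wins" rule are assumptions chosen for simplicity; your plan may weight the factors differently or use its own scale.

```python
from enum import IntEnum


class Rating(IntEnum):
    # Hypothetical shared scale for all three factors.
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def classify_severity(functional_impact, information_impact, recoverability):
    """Return the overall severity as the worst of the three ratings.

    For `recoverability`, a higher rating here means recovery is harder.
    """
    worst = max(functional_impact, information_impact, recoverability)
    return worst.name.lower()


# Invented example: minor service disruption, but sensitive data was exposed.
severity = classify_severity(
    functional_impact=Rating.LOW,
    information_impact=Rating.HIGH,
    recoverability=Rating.MEDIUM,
)
print({"category": "data breach", "severity": severity})
```

Agreeing on a rule like this ahead of time means handlers only need to rate the three factors during an incident, rather than debating how to combine them.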
Once an incident has been analyzed, your plan should include containment strategies to help you isolate affected areas. For your incident response plan to be most effective, your containment plans should guide your team to answer the following questions:
- Could the containment strategy interfere with evidence preservation or service availability? If so, and depending on how much it could interfere, you may wish to implement compensating controls or evaluate other options.
- How much time and how many resources will be invested in containing the incident? Containment should be thorough, but also efficient.
- If you don't take steps to contain the incident, either by choice or by oversight, how much damage could the incident cause? Refer to the analysis details to help you determine this effect.
- How much of the incident would be contained and how long would the solution last? Some solutions are temporary, while others are permanent.
The containment strategy will be largely determined in the moment, as it will depend on the nature of the incident, as well as the affected systems, areas, and/or people. By defining containment considerations up front, you can guide decisions and help ensure the chosen plan is the most effective and prudent for your organization.
While the containment phase is all about stopping the incident from spreading, eradication is about getting rid of it entirely. Like containment, eradication strategies will largely depend on the specific incident and, at times, may be combined with the recovery phase. For your response plan to be most effective, eradication strategies should help you answer these questions:
- What steps will you take to eradicate the incident? Some examples of common eradication strategies include malware removal, disabling compromised user accounts, reimaging compromised systems, patching exploited vulnerabilities, and mitigating misconfigurations.
- Will the eradication strategy mitigate the incident in all affected areas? At this point, it is important to look back on the analysis phase to ensure the root cause of the incident has been resolved. Confirm all instances of the incident are fully eradicated. Otherwise, the incident response is not complete.
During the eradication process, if you find the incident has affected other systems, further analysis and containment should be performed.
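The eradication checks described above amount to a small decision rule, sketched here in Python. The function name, its inputs, and the phase labels it returns are hypothetical. The sketch assumes you track which affected systems have been verified clean and which systems were newly found to be affected.

```python
def eradication_status(affected, verified_clean, newly_discovered=()):
    """Decide what comes next, given which systems have been verified clean.

    Returns a (phase, systems) pair naming the next phase and the systems
    that still need attention.
    """
    if newly_discovered:
        # Newly affected systems go back through analysis and containment.
        return "analysis", sorted(newly_discovered)
    outstanding = sorted(set(affected) - set(verified_clean))
    if outstanding:
        # Eradication is not complete until every affected system is clean.
        return "eradication", outstanding
    return "recovery", []


# Invented host names.
print(eradication_status(["web-01", "db-02"], ["web-01"]))
print(eradication_status(["web-01", "db-02"], ["web-01", "db-02"]))
print(eradication_status(["web-01"], ["web-01"], newly_discovered=["app-03"]))
```

The point of the sketch is the ordering: newly discovered systems take priority, and recovery only begins once nothing is outstanding.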
The recovery phase is the part of the response plan in which affected areas are returned to normal operation. For your incident response plan to be most effective, your recovery strategies should include recommended steps to follow to achieve this goal. Examples may include restoring data from a backup, rebuilding servers, replacing hardware, reprovisioning accounts, etc. The ultimate recovery goal is to do what you can to return your systems to a secure state and protect them from recurrence of the same incident. Once processes and systems are restored, they should be tested to ensure they are fully functional and ready to return to production.
The final phase in responding to an incident is the postmortem. Your plan should define the actions that come after incident recovery to review and document lessons learned from the incident. Three common postmortem activities are:
- Reporting: The final version of an incident report should include a timeline of events, the incident's effects and resulting exposure, response actions, and a monetary cost estimate. The report may be used in reporting to senior management, the board of directors, auditors, examiners, or even during legal proceedings, so it is important for the report to be comprehensive, yet easy to understand.
- Meeting: Depending on the incident's severity, a lessons-learned meeting should be held with involved parties. During the meeting, attendees should review what occurred, which steps were taken to intervene, and outcomes of the handling process.
- Effectiveness Review: Data gathered from postmortem activities should be used to assess the incident response function and identify potential areas of improvement in preparation for future incidents. For example, lessons learned can be used to update the plan or improve template action steps for further incidents.
The goal of the postmortem is to take what was learned from the incident and use the information to further improve the effectiveness of the incident response plan.
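To show how the report elements fit together, here is a minimal Python sketch that assembles a plain-text incident report. The `IncidentReport` class, its field names, and the layout of the rendered text are assumptions for illustration, and the sample content is invented.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class IncidentReport:
    title: str
    effects: str           # summary of the incident's effects and exposure
    estimated_cost: float  # monetary cost estimate
    timeline: list = field(default_factory=list)          # (datetime, description) pairs
    response_actions: list = field(default_factory=list)  # short action descriptions

    def render(self):
        """Return the report as plain text, with the timeline in time order."""
        lines = [f"Incident report: {self.title}", "", "Timeline:"]
        for when, what in sorted(self.timeline, key=lambda item: item[0]):
            lines.append(f"  {when:%Y-%m-%d %H:%M}  {what}")
        lines += ["", f"Effects: {self.effects}", "", "Response actions:"]
        lines += [f"  - {action}" for action in self.response_actions]
        lines += ["", f"Estimated cost: ${self.estimated_cost:,.2f}"]
        return "\n".join(lines)


# Invented example content; timeline entries are added out of order on purpose.
report = IncidentReport(
    title="Malicious code on web-01",
    effects="One public web server offline for four hours.",
    estimated_cost=12500,
    timeline=[
        (datetime(2024, 1, 5, 13, 0), "Server restored from backup"),
        (datetime(2024, 1, 5, 9, 0), "IDS alert received"),
    ],
    response_actions=["Isolated server", "Reimaged system", "Patched vulnerability"],
)
print(report.render())
```

Keeping the rendered report short and in plain language helps when the audience is senior management or the board rather than the response team.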
Bringing It All Together
As incidents continue to increase in frequency and severity, a well-documented plan for the six phases of incident management is a must for any organization. Building the plan alone can be challenging, but with the right partner, your plan can be effective and ready for any type of incident that comes your way. Tandem Incident Management is designed with organizations like yours in mind. Tandem's flexible framework includes an incident response plan component, as well as an incident tracking system, designed to walk your organization through developing and using the six-phase plan. To learn more about Tandem Incident Management, visit Tandem.App/Incident-Management-Software.