Incident Response
Incident response teams jump into action when cyber defenses fail. Through planning and preparation, their response templates provide the direction and authority to identify, contain, and recover from a cyber incident. But what they don’t have in advance are the insights necessary for quick, effective action.
Here’s what you need to know about incident response management and data analytics’ role in shutting down an attack.
What is the difference between SOC and incident response?
Although the responsibilities of an incident response team and a security operations center (SOC) overlap, it is incorrect to say that the former is a subset of the latter.
A SOC consists of IT and security professionals whose responsibilities encompass all aspects of day-to-day cybersecurity. They continuously monitor and manage an organization’s layered security defenses while maintaining threat intelligence. In many cases, the initial detection of security breaches happens within the SOC.
In contrast, an incident response team is both more narrowly focused and organizationally diverse. An incident response team plans for, responds to, and evaluates the impacts of security events. Other than the core process owners, team members work in other areas of the company. They are activated as needed, depending on the nature of the incident.
What is the difference between incident response and disaster recovery?
Organizations must plan for any source of disruption, from natural disasters to pandemics to cyberattacks, to ensure business continuity. In that respect, incident response is one type of disaster response. Where they differ is their frequency.
Global disease outbreaks and natural disasters are rare events that could significantly disrupt a company’s operations. Although these events are difficult to predict far in advance, companies can keep general plans ready and then tailor them to a specific event should it happen.
Disruptive cyber threats are ubiquitous and continuous. Low-level security events occur constantly as employees open email attachments and firewalls go unpatched. These events can easily transition to full-blown security incidents. Many get addressed within the SOC, but the most severe events trigger the broader incident response team and may cross into disaster recovery.
What are the 4 elements of a good incident response plan?
NIST’s Computer Security Incident Handling Guide defines a four-stage incident response life cycle: Preparation; Detection & Analysis; Containment, Eradication & Recovery; and Post-Incident Activity. Although written for federal agencies, this framework can help any private or public organization develop an effective incident response process.
1. Preparation
The initial phase of the life cycle involves putting in place the technologies, people, policies, and plans required to address security events when they happen.
Developing an incident response policy
This policy is an executive-led statement of the incident management process’s purpose, scope, and objectives. It will document the company’s definition of an incident and prioritize responses based on severity ratings.
In addition, the policy will establish an organizational structure along with the roles and responsibilities of team members within the security organization and the organization at large.
Finally, the incident response policy will authorize the team’s actions during an incident. This pre-authorization bypasses multi-layer review processes so the team can act quickly to end the security incident and mitigate its impacts by, for example, shutting down an on-premises network.
Forming an incident response team (IRT)
With the formal policy approved by the C-suite, the next step is to form the incident response team (IRT), which some companies call the computer security incident response team (CSIRT) or computer emergency response team (CERT).
The team leader must be a senior employee, as the position requires engaging with and gaining stakeholders’ support.
The team draws on IT and cybersecurity for technical expertise, and its membership also includes representatives from legal, physical security, media relations, human resources, and other departments.
In many cases, companies will outsource aspects of incident response to compensate for resource limits or to bring in outside expertise.
Creating an incident response plan
An overall risk management strategy will have identified the range, probability, and potential impact of cyber security threats the company faces. The incident response plan should address how to detect, contain, eradicate, and recover from each threat.
These plans are the playbooks the team follows to triage an incident, identify its root cause, isolate affected systems, and bring the incident under control.
Establishing communication protocols
A communication plan is essential for effective incident response. The type and severity of each incident will determine the appropriate notifications. IRT staff may document relatively minor events in a daily incident report. With increasing severity, notifications will apprise department heads, executives, the C-suite, or the Board.
Regulations will require notifications to some combination of local, tribal, state, and national law enforcement organizations (LEOs). Again, severity matters. An annual security report may list minor security breaches, while severe events require LEO notification within 24 hours. More immediate communications will be necessary should affected systems threaten public safety.
Incident severity will also determine whether, or how quickly, the company must alert the press.
Having these protocols in place ensures prompt, accurate, and responsible reporting.
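As a rough illustration of how severity can drive notifications, the sketch below encodes an escalation matrix in Python. The severity levels, the thresholds, and the audience names are all assumptions made for this example; a real matrix would come from the company’s incident response policy and the regulations that apply to it.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical four-level severity scale; real policies define their own."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Assumed escalation matrix: each audience is notified once an incident
# reaches the listed severity or higher.
ESCALATION = [
    (Severity.LOW, "daily incident report"),
    (Severity.MEDIUM, "department heads"),
    (Severity.HIGH, "executives and C-suite"),
    (Severity.HIGH, "law enforcement"),
    (Severity.CRITICAL, "board of directors"),
    (Severity.CRITICAL, "press office"),
]


def audiences_for(severity: Severity) -> list[str]:
    """Return every audience whose threshold is at or below the given severity."""
    return [audience for threshold, audience in ESCALATION if severity >= threshold]


print(audiences_for(Severity.MEDIUM))
# ['daily incident report', 'department heads']
```

Keeping the matrix as data rather than branching logic makes it easier to review with legal and communications staff, who can check the table without reading code.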
Acquiring and configuring tools
Although the details will vary from company to company, IRTs should put in place all the resources they need to support their response efforts. These security tools should not be reliant on a potentially compromised network infrastructure. For example, offline documentation of operating systems, applications, port lists, and network diagrams can be valuable references during an attack.
NIST suggests preparing “jump kits” for the IRT. These kits include cables, networking equipment, and laptops with forensic software.
2. Detection and analysis
The NIST incident response framework’s second phase is the early detection and analysis of potential security incidents.
Monitoring and detection
A continuous monitoring system should review network traffic and other activity logs for signs of potential incidents. However, networks generate so much real-time data that any attempt at manual monitoring would be overwhelmed. Technology such as a security information and event management (SIEM) system can automate monitoring activities and prioritize the alerts that require human intervention.
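To give a feel for the kind of rule automated monitoring applies, here is a minimal Python sketch that flags source addresses with repeated failed logins inside a sliding time window. It is not how any particular SIEM works; the function name, the event tuple layout, the threshold of five failures, and the one-minute window are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def flag_repeated_failures(events, threshold=5, window=timedelta(minutes=1)):
    """Return the source IPs with at least `threshold` failed logins within `window`.

    `events` is an iterable of (timestamp, source_ip, outcome) tuples,
    assumed to be sorted by timestamp, where outcome is "fail" or "success".
    """
    recent = defaultdict(deque)  # source_ip -> timestamps of recent failures
    flagged = set()
    for ts, ip, outcome in events:
        if outcome != "fail":
            continue
        failures = recent[ip]
        failures.append(ts)
        # Drop failures that have aged out of the sliding window.
        while ts - failures[0] > window:
            failures.popleft()
        if len(failures) >= threshold:
            flagged.add(ip)
    return flagged
```

A rule like this reduces thousands of raw log lines to a short list of addresses worth a human’s attention, which is the prioritization role the monitoring system plays.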
Initial assessment
Given the volume and diversity of alerts, the IRT must triage incoming incidents. Some incidents will be obvious, but many signals of potential incidents are subtle. An initial assessment will provide enough context about the incident’s scope, source, and status for the incident lead to prioritize the next steps.
Data collection and analysis
Before activating an incident response plan, the IRT must quickly document as much as possible. To do this, they must gather and synchronize forensic data such as logs and network traffic. If a server’s log time stamps lag a router’s, correlating and analyzing the data becomes challenging. In addition, compiling historical data will provide a helpful baseline for “normal” behavior.
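The time stamp problem can be sketched in a few lines of Python. The example below shifts each source’s records by a known clock offset and merges them into a single timeline. The record layout, the function names, and the 90-second lag are assumptions for illustration; in practice the offset has to be measured, and synchronizing clocks ahead of time avoids the correction altogether.

```python
import heapq
from datetime import datetime, timedelta, timezone


def corrected(records, clock_offset):
    """Yield (timestamp, source, message) records shifted onto the reference clock.

    `clock_offset` is the amount to add to this source's timestamps.
    Records must carry timezone-aware datetimes and already be in time order.
    """
    for ts, source, message in records:
        yield (ts.astimezone(timezone.utc) + clock_offset, source, message)


def merged_timeline(*streams):
    """Merge already-sorted streams into one list ordered by corrected time."""
    return list(heapq.merge(*streams, key=lambda record: record[0]))


# Illustrative data: the server clock is assumed to run 90 seconds behind the router.
router_log = [
    (datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc), "router", "inbound connection"),
]
server_log = [
    (datetime(2024, 1, 1, 11, 58, 35, tzinfo=timezone.utc), "server", "login accepted"),
]

timeline = merged_timeline(
    corrected(router_log, timedelta(0)),
    corrected(server_log, timedelta(seconds=90)),
)
for ts, source, message in timeline:
    print(ts.isoformat(), source, message)
```

Without the correction, the server’s login would appear to precede the router’s inbound connection, reversing the apparent order of events.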
3. Containment, eradication, and recovery
With sufficient documentation of the incident, the IRT can prioritize its response. However, this prioritization happens on multiple axes. The potential impact on operations, the threat to sensitive data, and the cost of recovery all inform how to approach the incident life cycle’s third phase.
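One simple way to combine several axes into a single ranking is a weighted score, sketched below in Python. The 1-to-5 rating scale, the weights, and the incident names are assumptions for this example and are not prescribed by NIST; a real team would set them in its own policy and would likely apply judgment on top of any score.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    name: str
    operational_impact: int  # assumed scale: 1 (minor) to 5 (severe)
    data_sensitivity: int    # assumed scale: 1 (public data) to 5 (regulated data)
    recovery_cost: int       # assumed scale: 1 (cheap) to 5 (very costly)


# Hypothetical weights; they sum to 1 so the score stays on the 1-5 scale.
WEIGHTS = {
    "operational_impact": 0.4,
    "data_sensitivity": 0.4,
    "recovery_cost": 0.2,
}


def priority(incident: Incident) -> float:
    """Weighted sum of the incident's ratings across all axes."""
    return sum(getattr(incident, axis) * weight for axis, weight in WEIGHTS.items())


def rank(incidents):
    """Return incidents ordered from highest to lowest priority."""
    return sorted(incidents, key=priority, reverse=True)


queue = rank([
    Incident("defaced test site", operational_impact=2, data_sensitivity=1, recovery_cost=5),
    Incident("database exfiltration", operational_impact=5, data_sensitivity=5, recovery_cost=2),
])
print([incident.name for incident in queue])
# ['database exfiltration', 'defaced test site']
```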
Containment
By the time the IRT engages, the breach is already underway, so the initial priority is to get it contained before it gets worse. Each incident type will require its own containment tactics. As an example, isolating affected systems or networks can prevent the incident from spreading.
Eradication
Once the incident is contained, the IRT must identify its root cause and the extent of its presence in the network before it can eradicate the threat. More than just deleting malware from systems, eradication includes the remediation of the exploited vulnerabilities.
Recovery
Once the IRT neutralizes the threat, its next priority is to restore operations to their normal state. Severe events may require rebuilding or replacing compromised systems. In addition, any security weaknesses identified during the response must be addressed.
For events that trigger LEO notifications or press briefings, the recovery process may continue long after the technical remediation as the company deals with investigations, audits, and media coverage.
4. Post-incident activity
Effective incident response requires continuous process improvement. Lessons learned from each incident should feed forward so the company can better handle future attacks.
After-incident briefings can yield insights that enhance all incident response services. For example, descriptions of a novel spear phishing attack should be included in employee security training.
Accelerate incident response insights with Starburst Galaxy
Starburst’s data lake analytics platform can play an essential supporting role in your security incident response plans. By making data easier to access and analyze, it helps IRTs investigate incidents faster and respond more effectively.
Starburst Galaxy is the universal discovery, governance, and sharing layer of the Starburst platform. It streamlines the discovery of structured and unstructured data and the sharing of urgent information during an incident.
Reliable, efficient, and performant, Galaxy speeds analysis when IRTs need it the most. After the fact, Galaxy can provide actionable insights for post-incident reports and continuous improvement processes.