Computer security incident management

In the fields of computer security and information technology, computer security incident management involves the monitoring and detection of security events on a computer or computer network, and the execution of proper responses to those events. Computer security incident management is a specialized form of incident management, the primary purpose of which is the development of a well-understood and predictable response to damaging events and computer intrusions. [1]

Incident management requires a process and a response team that follows this process. This definition of computer security incident management follows the standards and definitions described in the National Incident Management System (NIMS). The incident coordinator manages the response to an emergency security incident. In a natural disaster or other event requiring a response from emergency services, the incident coordinator would act as a liaison to the emergency services incident manager. [2]

Overview

Computer security incident management is an administrative function of managing and protecting computer assets, networks and information systems. These systems continue to become more critical to the personal and economic welfare of our society. Organizations (public and private sector groups, associations and enterprises) must understand their responsibilities to the public good and to the welfare of their memberships and stakeholders. This responsibility extends to having a management program for “what to do, when things go wrong.” Incident management is a program which defines and implements a process that an organization may adopt to promote its own welfare and the security of the public.

Components of an incident

Events

An event is an observable change to the normal behavior of a system, environment, process, workflow or person (components). There are three basic types of events:

  1. Normal – a normal event does not affect critical components or require change controls prior to the implementation of a resolution. Normal events do not require the participation of senior personnel or management notification of the event.
  2. Escalation – an escalated event affects critical production systems or requires that the implementation of a resolution follow a change control process. Escalated events require the participation of senior personnel and stakeholder notification of the event.
  3. Emergency – an emergency is an event which may
    1. impact the health or safety of human beings
    2. breach primary controls of critical systems
    3. materially affect component performance or, because of its impact on component systems, prevent activities that protect or may affect the health or safety of individuals
    4. be deemed an emergency as a matter of policy or by declaration by the available incident coordinator

Computer security and information technology personnel must handle emergency events according to a well-defined computer security incident response plan.
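The three event types above can be sketched as a small classification routine. This is a minimal illustration in Python, not part of any standard: the `Event` fields and the `classify` function are hypothetical names that simply mirror the criteria listed above, and the sketch assumes the most severe matching type wins.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    NORMAL = "normal"
    ESCALATION = "escalation"
    EMERGENCY = "emergency"


@dataclass
class Event:
    # Hypothetical flags; each one mirrors a criterion from the list above.
    affects_critical_systems: bool = False
    requires_change_control: bool = False
    threatens_health_or_safety: bool = False
    breaches_primary_controls: bool = False
    materially_degrades_components: bool = False
    declared_emergency: bool = False  # by policy or by the incident coordinator


def classify(event: Event) -> EventType:
    """Return the most severe event type whose criteria the event meets."""
    if (event.threatens_health_or_safety
            or event.breaches_primary_controls
            or event.materially_degrades_components
            or event.declared_emergency):
        return EventType.EMERGENCY
    if event.affects_critical_systems or event.requires_change_control:
        return EventType.ESCALATION
    return EventType.NORMAL
```

Checking the emergency criteria first means a declaration by the incident coordinator overrides any lower classification, which matches the "by declaration" criterion in the list.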

Incident

An incident is an event attributable to a human root cause. This distinction is particularly important when the event is the product of malicious intent to do harm. An important note: all incidents are events but many events are not incidents. A system or application failure due to age or defect may be an emergency event but a random flaw or failure is not an incident.
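The distinction between events and incidents can be expressed as a simple predicate on the root cause. The sketch below is illustrative only; the `RootCause` categories and the `is_incident` function are hypothetical names, chosen to reflect the examples in the paragraph above.

```python
from enum import Enum


class RootCause(Enum):
    # Hypothetical categories chosen for illustration only.
    MALICIOUS_ACTOR = "malicious actor"
    HUMAN_ERROR = "human error"
    AGE_OR_WEAR = "age or wear"
    RANDOM_DEFECT = "random defect"


# Root causes attributable to a person.
HUMAN_CAUSES = {RootCause.MALICIOUS_ACTOR, RootCause.HUMAN_ERROR}


def is_incident(root_cause: RootCause) -> bool:
    """An event counts as an incident only when its root cause is human."""
    return root_cause in HUMAN_CAUSES
```

Under this sketch, a disk that fails from age is still an event (and possibly an emergency event), but `is_incident` returns `False` for it.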

Incident response team

The security incident coordinator manages the response process and is responsible for assembling the team. The coordinator will ensure the team includes all the individuals necessary to properly assess the incident and make decisions regarding the proper course of action. The incident team meets regularly to review status reports and to authorize specific remedies. The team should utilize a pre-allocated physical and virtual meeting place. [3]

Incident investigation

The investigation seeks to determine the circumstances of the incident. Every incident will warrant or require an investigation. However, investigation resources like forensic tools, dirty networks, quarantine networks and consultation with law enforcement may be useful for the effective and rapid resolution of an emergency incident.

Process

Initial incident management process

[Figure: flowchart of the initial incident management process. Author: Michael Berman (tanjstaffl)]
  1. Employee, vendor, customer, partner, device or sensor reports event to Help Desk.
  2. Prior to creating the ticket, the help desk may filter the event as a false positive. Otherwise, the help desk system creates a ticket that captures the event, event source, initial event severity and event priority.
    1. The ticket system creates a unique ID for the event. IT Personnel must use the ticket to capture email, IM and other informal communication.
    2. Subsequent activities like change control, incident management reports and compliance reports must reference the ticket number.
    3. In instances where event information is “Restricted Access,” the ticket must reference the relevant documents in the secure document management system.
  3. The First Level Responder captures additional event data and performs preliminary analysis. In many organizations the volume of events is large relative to the staff available, so this step is often automated, typically with a SOAR (security orchestration, automation and response) tool [4] integrated with a cyber intelligence API. The SOAR tool automates the investigation through a workflow automation playbook. [4] The cyber intelligence API lets the playbook automate research related to the ticket (looking up a potential phishing URL, a suspicious hash, etc.). The First Level Responder then determines the criticality of the event. At this level, it is either a Normal or an Escalation event.
    1. Normal events do not affect critical production systems or require change controls prior to the implementation of a resolution.
    2. Events that affect critical production systems or require change controls must be escalated.
    3. Organization management may request an immediate escalation without first-level review; in that case the second tier creates the ticket.
  4. The event is ready to resolve. The resource enters the resolution and the problem category into the ticket and submits the ticket for closure.
  5. The ticket owner (employee, vendor, customer or partner) receives the resolution. They determine that the problem is resolved to their satisfaction or escalate the ticket.
  6. The escalation report is updated to show this event and the ticket is assigned to a second tier resource to investigate and respond to the event.
  7. The Second Tier resource performs additional analysis and re-evaluates the criticality of the ticket. When necessary, the Second Tier resource is responsible for implementing a change control and notifying IT Management of the event.
  8. Emergency Response:
    1. Events may follow the escalation chain until it is determined that an emergency response is necessary.
    2. Top-level organization management may determine that an emergency response is necessary and invoke this process directly.
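The first steps of this process (help desk filtering, ticket creation with a unique ID, first-level review with optional automated enrichment, and escalation to the second tier) can be sketched as follows. This is a minimal illustration under stated assumptions: the `Ticket` fields, the function names, and the `enrich` callback are hypothetical, and the callback merely stands in for whatever lookup a SOAR playbook would perform; it does not represent any real product's API.

```python
import itertools
from dataclasses import dataclass, field
from typing import Callable, List, Optional

_ticket_ids = itertools.count(1)  # source of unique ticket IDs


@dataclass
class Ticket:
    source: str
    description: str
    severity: str
    priority: str
    ticket_id: int = field(default_factory=lambda: next(_ticket_ids))
    tier: int = 1
    status: str = "open"
    log: List[str] = field(default_factory=list)  # informal communication, notes


def open_ticket(source: str, description: str, severity: str, priority: str,
                is_false_positive: bool = False) -> Optional[Ticket]:
    """Step 2: the help desk may filter the event before a ticket exists."""
    if is_false_positive:
        return None
    return Ticket(source, description, severity, priority)


def first_level_review(ticket: Ticket, affects_critical_systems: bool,
                       requires_change_control: bool,
                       enrich: Optional[Callable[[str], str]] = None) -> Ticket:
    """Step 3: gather data, then mark the event Normal or Escalation."""
    if enrich is not None:
        # Hypothetical stand-in for an automated playbook lookup.
        ticket.log.append("enrichment: " + enrich(ticket.description))
    if affects_critical_systems or requires_change_control:
        ticket.tier = 2
        ticket.status = "escalated"
        ticket.log.append("escalated to second tier")
    else:
        ticket.status = "ready to resolve"
    return ticket
```

A normal event would pass through `open_ticket` and `first_level_review` and end with status `"ready to resolve"`, while an event that touches critical production systems ends at tier 2 with status `"escalated"`.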

Emergency response detail

[Figure: flowchart of the emergency response process. Author: Michael Berman (tanjstaffl)]
  1. Emergency response is initiated by escalation of a security event or by direct declaration by the CIO or other executive organization staff. The CIO may assign the incident coordinator, but by default, the coordinator will be the most senior security staff member available at the time of the incident.
  2. The incident coordinator assembles the incident response team. The team meets using a pre-defined conference meeting space. At least one of the CIO, CSO or Director of IT must attend each incident team meeting.
  3. The meeting minutes capture the status, actions and resolution(s) for the incident. The incident coordinator reports on the cost, exposure and continuing business risk of the incident. The incident response team determines the next course of action.
  4. Lock-down and Repair – Perform the actions necessary to prevent further damage to the organization, repair impacted systems and perform changes to prevent a re-occurrence.
  5. False Positive – The incident team determines this issue did not warrant an emergency response. The team provides a written report to senior management and the issue is handled as either a normal incident or it is closed.
  6. Monitor and Capture – Perform a thorough investigation with continued monitoring to detect and capture the perpetrator. This process must include notification to the following senior and professional staff:
    1. CEO and CFO
    2. Corporate Attorney and Public Relations
  7. Review and analyze log data to determine nature and scope of incident. This step should include utilizing virus, spyware, rootkit and other detection tools to determine necessary mitigation and repair.
  8. Repair systems, eliminate vector(s) of attack, and mitigate exploitable vulnerabilities.
  9. The Test Report documents the validation of the repair process.
    1. Test systems to ensure compliance with policy and risk mitigation.
    2. Perform additional repairs to resolve all current vulnerabilities.
  10. Investigate incident to determine source of attack and capture perpetrator. This will require the use of forensics tools, log analysis, clean lab and dirty lab environments and possible communication with Law Enforcement or other outside entities.
  11. The “Investigation Status Report” captures all current information regarding the incident. The incident response team uses this information to determine the next course of action. (See Ref 2 and Ref 3)
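Two rules from the steps above lend themselves to a short sketch: the attendance requirement for incident team meetings, and the notifications tied to each course of action. The Python below is illustrative only; `CourseOfAction`, `meeting_is_valid` and `notifications_for` are hypothetical names, and the empty list returned for lock-down reflects only that the steps above name no specific recipients for it.

```python
from enum import Enum
from typing import Iterable, List


class CourseOfAction(Enum):
    LOCK_DOWN_AND_REPAIR = "lock-down and repair"
    FALSE_POSITIVE = "false positive"
    MONITOR_AND_CAPTURE = "monitor and capture"


# At least one of these roles must attend each incident team meeting.
REQUIRED_ROLES = {"CIO", "CSO", "Director of IT"}


def meeting_is_valid(attendee_roles: Iterable[str]) -> bool:
    """True when at least one required executive role is present."""
    return bool(REQUIRED_ROLES & set(attendee_roles))


def notifications_for(action: CourseOfAction) -> List[str]:
    """Return who must be informed for the chosen course of action."""
    if action is CourseOfAction.MONITOR_AND_CAPTURE:
        return ["CEO", "CFO", "Corporate Attorney", "Public Relations"]
    if action is CourseOfAction.FALSE_POSITIVE:
        # The team provides a written report to senior management.
        return ["Senior Management"]
    # The steps above name no specific recipients for lock-down and repair.
    return []
```

For example, `meeting_is_valid(["Incident Coordinator", "CSO"])` holds, while a meeting attended only by engineers does not satisfy the rule.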

Definitions

First Responder/First level review
first person to be on scene or to receive notification of an event. Organizations should train the first responder to recognize and properly react to emergency circumstances.
Help Desk Ticket (Control)
an electronic document captured in a database and issue tracking/resolution system
Ticket Owner
person reporting the event, the principal owner of the assets associated with the event or the common law or jurisdictional owner.
Escalation Report (Control)
First Responder’s documentation for ticket escalation. The Responder writes this information into the ticket or the WIKI log for the event. The ticket references the WIKI log for the event.
Second Tier
Senior technical resources assigned to resolve an escalated event.
Incident Coordinator
individual assigned by organization senior management to assemble the incident response team and to manage and document the response to the incident.
Investigation Status Report (Control)
documentation of the current investigation results. The coordinator may document this material in the ticket, WIKI or an engineer's journal.
Meeting Minutes (Control)
documentation of the incident team meeting. The minutes document the attendees, current nature of the incident and the recommended actions. The coordinator may document this material in the ticket, WIKI or an engineer's journal.
Lock-down Change Control
a process ordered as a resolution to the incident. This process follows the same authorization and response requirements as an Emergency Change Control.
Test Report (Control)
this report validates that IT personnel have performed all necessary and available repairs to systems prior to bringing them back online.
War Room
a secure environment for review of confidential material and the investigation of a security incident.
Report to Senior Management (Control)
the incident coordinator is responsible for drafting a senior management report. The coordinator may document this material in the ticket, WIKI or an engineer's journal.

References

  1. "ISO/IEC 17799:2005(E)". Information technology - Security techniques - Code of practice for information security management. ISO copyright office. 2005-06-15. pp. 90–94.
  2. "NIMS - The Incident Command System". National Incident Management System. Department of Homeland Security. 2004-03-01. Archived from the original on 2007-03-18. Retrieved 2007-04-08.
  3. "Creating a Computer Security Incident Response Team" (PDF). Computer Emergency Response Team. US-CERT. 2003-04-01. Retrieved 2007-04-08.
  4. "What is SOAR (Security Orchestration, Automation and Response)?". SearchSecurity. 2019-12-06. Retrieved 2019-12-06.
