Abstract
When fending off attacks on large organizations, it is necessary to detect the threat early and investigate it quickly in order to determine the appropriate response. Endpoint Detection and Response (EDR) tools are essential to these capabilities, providing visibility into sophisticated intrusions by matching system events against databases of known adversarial tactics, techniques, and procedures. However, current solutions suffer from three major challenges: 1) EDR tools generate a high volume of false alarms, creating huge backlogs of investigation tasks for cyber analysts; 2) determining the veracity of these threat alerts requires tedious manual labor due to the overwhelming amount of low-level log information, creating a “needle-in-a-haystack” problem; and 3) due to the tremendous resource burden of log retention, in practice the audit logs describing long-lived attack campaigns are often destroyed before an investigation is ever initiated. This paper describes an effort to bring the benefits of provenance-based causal analysis and triage to commercial EDR tools. We introduce the notion of Tactical Provenance Graphs (TPGs) that, rather than encoding low-level system dependencies, reason about causal dependencies between the behavior-based threat alerts fired by EDR tools. TPGs provide compact visual explanations of multi-stage attacks to cyber analysts, accelerating the investigation process. To address EDR’s false alert problem, we introduce a novel threat scoring methodology that assesses severity based on the temporal ordering between individual threat alerts present in the TPG. Instead of retaining unwieldy low-level system logs, we maintain a minimally-sufficient skeleton graph that can provide linkability between existing and future threat alerts.
We evaluate our prototype EDR tool, RapSheet, in a small enterprise environment and find that this approach consistently ranks truly malicious TPGs higher than false alarm TPGs. Moreover, our skeleton graph reduces the long-term burden of log retention by up to 87%. This work thus demonstrates a viable strategy for incorporating recent community advancements in the area of data provenance into commercial security solutions.