Taming the Costs of Trustworthy Provenance through Policy Reduction

Adam Bates, Dave (Jing) Tian, Grant Hernandez, Thomas Moyer, Kevin R. B. Butler, and Trent Jaeger.
ACM Transactions on Internet Technology (TOIT).
September 1, 2017.
Available Media
Share
tweet

Abstract

Provenance is an increasingly important tool for understanding and even actively preventing system intrusion, but the excessive storage burden imposed by automatic provenance collection threatens to undermine its value in practice. This situation is made worse by the fact that the majority of this metadata is unlikely to be of interest to an administrator, instead describing system noise or other background activities that are not germane to the forensic investigation. To date, storing data provenance in perpetuity was a necessary concession in even the most advanced provenance tracking systems in order to ensure the completeness of the provenance record for future analyses. In this work, we overcome this obstacle by proposing a policy-based approach to provenance filtering, leveraging the confinement properties provided by Mandatory Access Control (MAC) systems in order to identify and isolate subdomains of system activity for which to collect provenance. We introduce the notion of minimal completeness for provenance graphs, and design and implement a system that provides this property by exclusively collecting provenance for the trusted computing base of a target application. In evaluation, we discover that, while the efficacy of our approach is domain dependent, storage costs can be reduced by as much as 89% in critical scenarios such as provenance tracking in cloud computing data centers. To the best of our knowledge, this is the first policy-based provenance monitor to appear in the literature.