Abstract
Defenders of enterprise networks have a critical need to quickly identify the root causes of malware and data leakage. Increasingly, USB storage devices are the media of choice for data exfiltration, malware propagation, and even cyber-warfare. We observe that a critical aspect of explaining and preventing such attacks is understanding the provenance of data (i.e., the lineage of data from its creation to current state) on USB devices as a means of ensuring their safe usage. Unfortunately, provenance tracking is not offered by even sophisticated modern devices. This work presents ProvUSB, an architecture for fine-grained provenance collection and tracking on smart USB devices. ProvUSB maintains data provenance by recording reads and writes at the block layer and reliably identifying hosts editing those blocks through attestation over the USB channel. Our evaluation finds that ProvUSB imposes a one-time 850 ms overhead during USB enumeration, but approaches nearly-bare-metal runtime performance (90% of throughput) on larger files during normal execution, and less than 0.1% storage overhead for provenance in real-world workloads. ProvUSB thus provides essential new techniques in the defense of computer systems and USB storage devices.