Trustworthy Machine Learning... for Systems Security
Not a day goes by without reading machine learning (ML) success stories across different application domains. Systems security is no exception, where ML's tantalizing results leave one to wonder whether there are any unsolved problems left. However, machine learning has no clairvoyant abilities, and once the magic wears off, we're left in uncharted territory. We, as a community, need to understand and improve the effectiveness of machine learning methods for systems security in the presence of adversaries. One of the core challenges relates to the representation of problem-space objects (e.g., program binaries) in a numerical feature space, as the semantic gap makes it harder to reason about attacks and defences and often leaves room for adversarial manipulation. Inevitably, the effectiveness of machine learning methods for systems security is intertwined with the underlying abstractions, e.g., program analyses, used to represent the objects. In this context, is trustworthy machine learning possible?

In this talk, I will first illustrate the challenges in the context of adversarial ML evasion attacks against malware classifiers. The classic formulation of evasion attacks is ill-suited for reasoning about how to generate realizable evasive malware in the problem space. I'll provide a deep dive into recent work that offers a theoretical reformulation of the problem and enables more principled attack designs. The implications are interesting, as the framework facilitates reasoning about end-to-end attacks that can generate real-world adversarial malware, at scale, that evades both vanilla and hardened classifiers, thus calling for novel defences. Next, I'll broaden our conversation to include not just robustness against specialized attacks, but also drifting scenarios, in which threats evolve and change over time.
Prior work suggests that adversarial ML evasion attacks are intrinsically linked with concept drift, and we will discuss how drift affects the performance of malware classifiers, hinting at the role the underlying feature-space abstraction plays in the whole process. Ultimately, these threats would not exist if the abstraction could capture the 'Platonic ideal' of interesting behaviour (e.g., maliciousness); however, such a solution is still out of reach. I'll conclude by outlining current research efforts to make this goal a reality, including robust feature development, assessing vulnerability to universal perturbations, and forecasting future drift, which illustrate what trustworthy machine learning for systems security may eventually look like.