Tackling ML-based Dynamic Mispredictions using Statically Computed Invariants for Attack Surface Reduction

Chris Porter; Sharjeel Khan; Kangqi Ni; Santosh Pande

doi:10.1145/3676641.3716276

ASPLOS 2025

Conference paper

30 Mar 2025

Tackling ML-based Dynamic Mispredictions using Statically Computed Invariants for Attack Surface Reduction

View publication

Abstract

Recent work has demonstrated the utility of machine learning (ML) in carrying out highly accurate predictions at runtime. One of the major challenges with using ML, however, is that the predictions lack certain guarantees. For such approaches to become practicable in security settings involving debloating and dynamic control flow monitoring, one must distinguish between mispredictions vs. attacks. In this work, we introduce a low overhead framework for tackling mispredictions of ML-based approaches using static invariants. In particular, we tackle the problem of dynamic function call set prediction encountered in program debloating. We first introduce an ML-based prediction technique that works on the whole application, providing high precision and reducing ∼90% of code-reuse gadgets useful for staging attacks at runtime. We then propose an effective mechanism for dealing with the ML model's mispredictions: a new static relation called the ensue() of a function, which is a set of functions that could legally follow a given function under any dynamic execution. We develop efficient algorithms to statically compute such a set by modeling the relation in a Datalog-based solver. Upon misprediction, the framework invokes a lightweight mechanism to distinguish between an attack vs. misprediction. We show that the framework triggers misprediction checking in our experiments on a reasonable percentage of predictions invoked at runtime (3.8%, 16%, and 2.3% for SPEC CPU 2017 and low- and high-complexity applications, respectively), of which all cases are validated to conform to the static call relations. We contend that a low runtime overhead (7.5% for SPEC, 3% for real-world applications) and precise ML mechanism, along with the ability to effectively deal with mispredictions, yields a real-world solution.

Paper