Publication
ACM TOSEM
Paper

Stateful Serverless Computing with Crucial

View publication

Abstract

Serverless computing greatly simplifies the use of cloud resources. In particular, Function-As-A-Service (FaaS) platforms enable programmers to develop applications as individual functions that can run and scale independently. Unfortunately, applications that require fine-grained support for mutable state and synchronization, such as machine learning (ML) and scientific computing, are notoriously hard to build with this new paradigm. In this work, we aim at bridging this gap. We present Crucial, a system to program highly-parallel stateful serverless applications. Crucial retains the simplicity of serverless computing. It is built upon the key insight that FaaS resembles to concurrent programming at the scale of a datacenter. Accordingly, a distributed shared memory layer is the natural answer to the needs for fine-grained state management and synchronization. Crucial allows to port effortlessly a multi-Threaded code base to serverless, where it can benefit from the scalability and pay-per-use model of FaaS platforms. We validate Crucial with the help of micro-benchmarks and by considering various stateful applications. Beyond classical parallel tasks (e.g., a Monte Carlo simulation), these applications include representative ML algorithms such as k-means and logistic regression. Our evaluation shows that Crucial obtains superior or comparable performance to Apache Spark at similar cost (18%-40% faster). We also use Crucial to port (part of) a state-of-The-Art multi-Threaded ML library to serverless. The ported application is up to 30% faster than with a dedicated high-end server. Finally, we attest that Crucial can rival in performance with a single-machine, multi-Threaded implementation of a complex coordination problem. Overall, Crucial delivers all these benefits with less than 6% of changes in the code bases of the evaluated applications.

Date

Publication

ACM TOSEM