Archipelago: Storage and Data Orchestrator for Edge Computing Infrastructure
Abstract
Traditional cloud computing, in which data is transmitted from end users to centralized data centers for processing, has remained the dominant computing approach over the past decades. The trend is now shifting from centralized cloud computing to distributed edge computing, as the ever-growing number of IoT devices and their data traffic place a significant burden on capacity-limited Wide Area Networks (WANs), resulting in prolonged and unpredictable service delays. Edge computing clusters can relay data received from IoT devices to the core data center, acting as bridges between them. However, in many cases, it is not feasible to transfer every piece of data from the edge cluster to the core. Instead, it is highly desirable to (1) allow data replication among certain but not all edge nodes, (2) enable directional data movement within a given subset, and (3) support data replication and movement from different storage systems. Last but not least, data synchronization and replication should be predictable regardless of network conditions. In this work, we discuss the problems that emerge during this paradigm shift and propose the Archipelago framework to manage cross-edge data movement.