Generalizable Reinforcement Learning-Based Coarsening Model for Resource Allocation over Large and Diverse Stream Processing Graphs
Abstract
Resource allocation for stream processing graphs on computing devices is critical to the performance of stream processing. Efficient allocations need to balance workload distribution and minimize communication simultaneously and globally. Since this problem is known to be NP-complete, recent machine learning solutions were proposed based on an encoder-decoder framework, which predicts the device assignment of computing nodes sequentially as an approximation. However, for large graphs, these solutions suffer from the deficiency in handling long-distance dependency and global information, resulting in suboptimal predictions. This work proposes a new paradigm to deal with this challenge, which first coarsens the graph and conducts assignments on the smaller graph with existing graph partitioning methods. Unlike existing graph coarsening works, we leverage the theoretical insights in this resource allocation problem, formulate the coarsening of stream graphs as edge-collapsing predictions, and propose an edge-aware coarsening model. Extensive experiments on various datasets show that our framework significantly improves over existing learning-based and heuristic-based baselines with up to 56% relative improvement on large graphs.