Conference paper

Edge Classification on Graphs: New Directions in Topological Imbalance

Abstract

Recent years have witnessed the remarkable success of applying Graph Machine Learning (GML) to node/graph classification and link prediction. However, the edge classification task, which enjoys numerous real-world applications such as social network analysis and cybersecurity, has not seen significant advancement alongside the progress of GML. To address this gap, our study pioneers a comprehensive approach to edge classification. We identify a novel 'Topological Imbalance Issue', which arises from the skewed distribution of edges across different classes, affecting the local subgraph of each edge and harming the performance of edge classification. Inspired by recent node-level studies observing performance discrepancies with varying local structural patterns, we investigate whether topological imbalance in edge classification can also be mitigated by characterizing the local class distribution variance. Thus, we introduce Topological Entropy (TE), a novel topology-based metric that measures the topological imbalance for each edge. Our empirical studies confirm that TE effectively measures local class distribution variance, and indicate that prioritizing edges with high TE values can help address the issue of topological imbalance. Inspired by this observation, we develop two strategies, Topological Reweighting and TE Wedge-based Mixup, to adaptively focus training on (synthetic) edges based on their TEs. While Topological Reweighting directly manipulates training edge weights according to TE, our wedge-based mixup interpolates synthetic edges between high-TE wedges. To further enhance performance, we integrate these strategies into a novel topological imbalance strategy for edge classification: TopoEdge. Extensive experiments on real-world datasets demonstrate the efficacy of our proposed strategies.
Additionally, our curated datasets and designed experimental settings establish a new benchmark for future edge classification research, particularly in addressing imbalance issues.