Conference paper

AI4DE: The 1st International Workshop on AI for Data Editing

Abstract

Machine learning has traditionally emphasized developing models for given datasets, but real-world data is often messy, so improving the model alone is frequently not enough to improve performance. AI for data editing (AI4DE) is an emerging field that improves datasets systematically, which can lead to significant practical advances in ML. Experienced data scientists have long refined datasets manually through trial and error and intuition; AI4DE instead treats data enhancement as a systematic engineering discipline. It represents a shift in focus from models to the underlying data used for training and evaluation. Although common model architectures and predictable scaling rules now dominate, building and using datasets remains labor-intensive and costly, and the field lacks infrastructure and best practices. The AI4DE movement aims to develop efficient, high-productivity, open data engineering tools for modern ML systems. This workshop seeks to foster an interdisciplinary AI4DE community to address practical data challenges, including data collection, generation, labeling, preprocessing, augmentation, quality evaluation, data debt, and governance. By defining and shaping the AI4DE movement, the workshop aims to influence the future of AI and ML, and it invites interested parties to contribute through paper submissions.