Cristina Cornelio, Judy Goldsmith, et al.
JAIR
To improve unsupervised training of geospatial models, we introduce a method that emphasizes diverse and clean datasets. We extract finer-resolution metrics such as land use, temperature, and precipitation, and cluster samples with similar statistics to characterize the data distribution. Sampling is then weighted by cluster size: data points from large, frequent clusters are down-weighted so that less frequent data are favored, which increases diversity while keeping the sample representative. The result is a more balanced dataset, which significantly improves the accuracy of the geospatial foundation model. Our study underscores the potential of optimized geospatial data sampling to enhance model accuracy and broaden practical applications.
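The cluster-size weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the clustering step has already produced one cluster label per sample, and the function names, the `alpha` exponent, and the choice to sample with replacement are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def cluster_balanced_weights(labels, alpha=1.0):
    """Return one sampling probability per data point.

    Points in large clusters are down-weighted by 1 / cluster_size**alpha.
    `alpha` is a hypothetical knob: 0 gives uniform sampling over points,
    1 gives every cluster the same total probability mass.
    """
    counts = Counter(labels)
    raw = [1.0 / counts[c] ** alpha for c in labels]
    total = sum(raw)
    return [w / total for w in raw]


def sample_indices(labels, n_samples, alpha=1.0, seed=0):
    """Draw indices (with replacement) using the cluster-balanced weights."""
    rng = random.Random(seed)
    probs = cluster_balanced_weights(labels, alpha)
    return rng.choices(range(len(labels)), weights=probs, k=n_samples)


# Toy labels (assumed output of a prior clustering step):
# eight points in a frequent cluster, two in a rare one.
labels = ["cropland"] * 8 + ["glacier"] * 2
probs = cluster_balanced_weights(labels)
# Each frequent-cluster point gets 0.0625, each rare-cluster point 0.25,
# so both clusters carry half of the total probability mass.
picked = sample_indices(labels, n_samples=6)
```

With `alpha=1.0` the rare cluster is sampled as often as the frequent one in expectation; intermediate values of `alpha` trade off diversity against fidelity to the original distribution.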
Pavel Klavík, A. Cristiano I. Malossi, et al.
Philos. Trans. R. Soc. A
Erik Altman, Jovan Blanusa, et al.
NeurIPS 2023
Conrad Albrecht, Jannik Schneider, et al.
CVPR 2025