Publication
ML4EO 2024
Poster

Using continual pretraining with a geospatial foundation model

Abstract

AI foundation models are increasingly used for Earth observation tasks with deep learning. However, training these models from scratch requires significant resources; for example, the Prithvi model used in this study took ~4 days to train using 8 A100 GPUs. If we want to add another channel of optical imagery, or use a different satellite for self-supervised training, retraining the model from scratch (i.e. starting from random weights) would consume comparable resources again. Ideally, we want to reduce the cost of training the new model by reusing the weights of a previously trained model, since these still contain useful representations of the data. Here we investigate adding additional data while reusing weights from a previously trained model. Our original model is Prithvi, a transformer-based model trained on Harmonized Landsat and Sentinel-2 (HLS) multispectral satellite imagery from the continental USA. We then investigate adding Sentinel-1 Synthetic Aperture Radar (SAR) and elevation data as additional channels to the model. We test three methods for adding these new data: 1) using the pretrained model weights for the HLS channels and initialising the new channels with random weights; 2) using the pretrained model weights for the HLS channels and initialising the new channels with the average of the weights of the existing channels; and 3) using a student-teacher approach, in which the pretrained model acts as the teacher and a new student model is trained with a loss that combines agreement with the teacher and the masked image modelling task from Prithvi. We evaluate the performance of each method on the self-supervised task, including adding Sentinel-1 SAR and elevation data separately to the HLS data and creating a model with all data (i.e. HLS, SAR, and elevation). Finally, we investigate how much data and compute are needed to train these models.
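
The first two initialisation strategies can be illustrated with a minimal sketch. The code below is not the authors' implementation; it assumes the patch embedding is a standard PyTorch nn.Conv2d (the released Prithvi model may use a 3D patch embedding over time steps), and the function name extend_patch_embed and channel counts are illustrative.

```python
# Sketch of methods 1 and 2: extending a ViT patch-embedding layer with
# extra input channels while reusing the pretrained HLS weights.
import torch
import torch.nn as nn


def extend_patch_embed(pretrained_proj: nn.Conv2d, n_new_channels: int,
                       init: str = "average") -> nn.Conv2d:
    """Return a new patch-embedding conv whose first channels reuse the
    pretrained HLS weights and whose extra channels (e.g. SAR, elevation)
    are initialised either randomly (method 1) or with the mean of the
    pretrained channel weights (method 2)."""
    old_w = pretrained_proj.weight.data               # (embed_dim, C_old, k, k)
    embed_dim, c_old, kh, kw = old_w.shape
    new_proj = nn.Conv2d(c_old + n_new_channels, embed_dim,
                         kernel_size=(kh, kw),
                         stride=pretrained_proj.stride,
                         bias=pretrained_proj.bias is not None)
    with torch.no_grad():
        # Copy pretrained weights for the existing HLS channels.
        new_proj.weight[:, :c_old] = old_w
        if init == "average":
            # Method 2: each new channel starts from the per-filter mean
            # of the pretrained channel weights.
            mean_w = old_w.mean(dim=1, keepdim=True)  # (embed_dim, 1, k, k)
            new_proj.weight[:, c_old:] = mean_w.repeat(1, n_new_channels, 1, 1)
        # Method 1 ("random") keeps the default random initialisation that
        # nn.Conv2d already applied to the new channels.
        if pretrained_proj.bias is not None:
            new_proj.bias.copy_(pretrained_proj.bias)
    return new_proj


# Example: add two SAR channels to a six-channel HLS patch embedding.
# model.patch_embed.proj = extend_patch_embed(model.patch_embed.proj, 2)
```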
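
Method 3 can be read as a distillation-style setup. The sketch below is one possible instantiation under stated assumptions, not the authors' code: it assumes the student and teacher expose placeholder methods forward_mim (returning the masked-image-modelling reconstruction loss) and forward_features (returning encoder features), and that "loss compared to the teacher" is realised as a feature-matching term.

```python
# Sketch of method 3: student-teacher continual pretraining combining the
# Prithvi MIM objective with a distillation term against the frozen teacher.
import torch
import torch.nn.functional as F


def continual_pretrain_step(student, teacher, batch, optimizer, alpha=0.5):
    """One training step. `forward_mim` and `forward_features` are
    placeholder method names used here for illustration only."""
    hls = batch["hls"]              # channels the pretrained teacher has seen
    combined = batch["combined"]    # HLS + SAR + elevation for the student

    # Masked-image-modelling reconstruction loss on the multi-channel input.
    mim_loss = student.forward_mim(combined)

    # Distillation: match student features to the frozen teacher's features
    # computed on the HLS channels only.
    with torch.no_grad():
        teacher_feats = teacher.forward_features(hls)
    student_feats = student.forward_features(combined)
    distill_loss = F.mse_loss(student_feats, teacher_feats)

    loss = mim_loss + alpha * distill_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```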