SAFT: Towards Out-of-Distribution Generalization in Fine-Tuning

Bac Nguyen*, Stefan Uhlich, Fabien Cardinaux, Lukas Mauch, Marzieh Edraki, Aaron Courville

Abstract


"Handling distribution shifts from training data, known as out-of-distribution (OOD) generalization, poses a significant challenge in the field of machine learning. While a pre-trained vision-language model like CLIP has demonstrated remarkable zero-shot performance, further adaptation of the model to downstream tasks leads to undesirable degradation for OOD data. In this work, we introduce Sparse Adaptation for Fine-Tuning (SAFT), a method that prevents fine-tuning from forgetting the general knowledge in the pre-trained model. SAFT only updates a small subset of important parameters whose gradient magnitude is large, while keeping the other parameters frozen. SAFT is straightforward to implement and conceptually simple. Extensive experiments show that with only 0.1% of the model parameters, SAFT can significantly improve the performance of CLIP. It consistently outperforms baseline methods across several benchmarks. On the few-shot learning benchmark of ImageNet and its variants, SAFT gives a gain of 5.15% on average over the conventional fine-tuning method in OOD settings."
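
The selection rule described in the abstract (update only the parameters with the largest gradient magnitude, freeze the rest) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how gradients are gathered, whether selection is global or per layer, or which optimizer is used. The function names `top_fraction_mask` and `masked_sgd_step`, the flat parameter list, the plain SGD update, and all numeric values below are assumptions made purely for illustration.

```python
import math


def top_fraction_mask(grads, fraction):
    """Return a 0/1 mask marking the `fraction` of entries with the largest |gradient|.

    Hypothetical helper: assumes a single flat list of gradients and a global
    (not per-layer) ranking, which the abstract does not specify.
    """
    # Always keep at least one parameter trainable.
    k = max(1, math.ceil(fraction * len(grads)))
    # Rank indices by decreasing gradient magnitude.
    ranked = sorted(range(len(grads)), key=lambda i: abs(grads[i]), reverse=True)
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(grads))]


def masked_sgd_step(params, grads, mask, lr):
    """Apply one SGD step to the selected parameters; masked-out ones stay frozen."""
    return [p - lr * g * m for p, g, m in zip(params, grads, mask)]


# Toy values (made up): ten "pre-trained" parameters and their gradients.
params = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0, 0.25, -2.0, 1.0]
grads = [0.01, -0.9, 0.02, 0.0, 0.05, -0.03, 0.4, 0.0, 0.1, -0.02]

# Select the top 20% by gradient magnitude, then update only those entries.
mask = top_fraction_mask(grads, fraction=0.2)
new_params = masked_sgd_step(params, grads, mask, lr=0.5)

print(mask)
print(new_params)
```

In this toy run only the two entries with the largest absolute gradients change, and every other parameter keeps its pre-trained value. In a sketch like this the mask would typically be computed once and then reused at every fine-tuning step, but that schedule is also an assumption rather than something the abstract states.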
