Vision-Language Dual-Pattern Matching for Out-of-Distribution Detection

Zihan Zhang, Zhuo Xu, Xiang Xiang* ;

Abstract


"Out-of-distribution (OOD) detection is a significant challenge in deploying pattern recognition and machine learning models, as models often fail on data from novel distributions. Recent vision-language models (VLMs) such as CLIP have shown promise in OOD detection through their generalizable multimodal representations. Existing CLIP-based OOD detection methods only utilize a single modality of in-distribution (ID) information (, textual cues). However, we find that the ID visual information helps to leverage CLIP’s full potential for OOD detection. In this paper, we pursue a different approach and explore the regime to leverage both the visual and textual ID information. Specifically, we propose Dual-Pattern Matching (DPM), efficiently adapting CLIP for OOD detection by leveraging both textual and visual ID patterns. DPM stores ID class-wise text features as the textual pattern and the aggregated ID visual information as the visual pattern. At test time, the similarity to both patterns is computed to detect OOD inputs. We further extend DPM with lightweight adaptation for enhanced OOD detection. Experiments demonstrate DPM’s advantages, outperforming existing methods on common benchmarks. The dual-pattern approach provides a simple yet effective way to exploit multi-modality for OOD detection with vision-language representations."

Related Material


[pdf] [supplementary material] [DOI]