Uncertainty-aware sign language video retrieval with probability distribution modeling
Xuan Wu*, Hongxiang Li, Yuanjiang Luo, Xuxin Cheng, Xianwei Zhuang, Meng Cao, Keren Fu*
Abstract
Sign language video retrieval plays a key role in facilitating information access for the deaf community. Despite significant advances in video-text retrieval, the complexity and inherent uncertainty of sign language preclude the direct application of these techniques. Previous methods map between sign language videos and text through fine-grained modal alignment. However, because fine-grained annotations are scarce, these methods underestimate the uncertainty inherent in sign language videos, which limits further progress on sign language retrieval. To address this challenge, we propose Uncertainty-aware Probability Distribution Retrieval (UPRet), which models the mapping between sign language videos and texts as probability distributions, explores their potential interrelationships, and enables flexible mappings. Experiments on three benchmarks demonstrate the effectiveness of our method, which achieves state-of-the-art results on How2Sign (59.1%), PHOENIX-2014T (72.0%), and CSL-Daily (78.4%). Our source code is available at https://github.com/xua222/UPRet.