KeypointDETR: An End-to-End 3D Keypoint Detector

Hairong Jin, Yuefan Shen, Jianwen Lou, Kun Zhou, Youyi Zheng*

Abstract


3D keypoint detection plays a pivotal role in 3D shape analysis. Most prevalent methods produce a single shared heatmap, which then requires post-processing such as clustering or non-maximum suppression (NMS) to pinpoint keypoints within high-confidence regions, resulting in inefficiency. To address this issue, we introduce KeypointDETR, an end-to-end 3D keypoint detection framework. KeypointDETR is trained primarily with a bipartite matching loss, which compels the network to predict a set of heatmaps and probabilities for potential keypoints. Each heatmap highlights one keypoint’s location, and the associated probability indicates not only the presence of that specific keypoint but also its semantic consistency. Alongside the bipartite matching loss, we use a transformer-based network architecture that applies point-wise self-attention in the encoder and query-wise self-attention in the decoder. The point-wise encoder applies self-attention on a dynamic graph derived from the local feature space of each point to generate heatmap features. As a key part of our framework, the query-wise decoder facilitates inter-query information exchange. It captures the underlying connections among keypoints’ heatmaps, positions, and semantic attributes via cross-attention, enabling the prediction of heatmaps and probabilities. Extensive experiments on the KeypointNet dataset show that KeypointDETR outperforms competitive baselines on keypoint saliency and correspondence estimation tasks. (The code will be released at github.com/bibi547/KeypointDETR)
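
To make the set-prediction idea concrete, the following is a minimal sketch of one-to-one matching between predicted queries and ground-truth keypoints. It is not the authors' implementation: the cost function (mean squared heatmap error minus the predicted probability), the names `match_cost` and `bipartite_match`, and the toy data are all assumptions made for illustration, and an exhaustive search over assignments stands in for a proper Hungarian solver.

```python
from itertools import permutations
import math


def match_cost(pred_heatmap, pred_prob, gt_heatmap):
    """Hypothetical pairwise cost: heatmap MSE minus predicted presence probability."""
    mse = sum((p - g) ** 2 for p, g in zip(pred_heatmap, gt_heatmap)) / len(gt_heatmap)
    return mse - pred_prob


def bipartite_match(pred_heatmaps, pred_probs, gt_heatmaps):
    """Find the minimum-cost one-to-one assignment of queries to ground-truth keypoints.

    Returns (assignment, unmatched, total_cost), where assignment[j] is the index
    of the query matched to ground-truth keypoint j, and unmatched lists the
    queries left over (these would be supervised as "no keypoint").
    Exhaustive search: only suitable for tiny toy inputs.
    """
    n_queries, n_gt = len(pred_heatmaps), len(gt_heatmaps)
    if n_queries < n_gt:
        raise ValueError("need at least as many queries as ground-truth keypoints")

    best_perm, best_cost = None, math.inf
    for perm in permutations(range(n_queries), n_gt):
        cost = sum(
            match_cost(pred_heatmaps[q], pred_probs[q], gt_heatmaps[j])
            for j, q in enumerate(perm)
        )
        if cost < best_cost:
            best_perm, best_cost = perm, cost

    assignment = list(best_perm)
    unmatched = [q for q in range(n_queries) if q not in assignment]
    return assignment, unmatched, best_cost


# Toy example: a shape with 4 points, 3 queries, 2 ground-truth keypoints.
gt_heatmaps = [
    [1.0, 0.0, 0.0, 0.0],  # keypoint 0 sits on point 0
    [0.0, 0.0, 1.0, 0.0],  # keypoint 1 sits on point 2
]
pred_heatmaps = [
    [0.0, 0.0, 0.9, 0.1],  # query 0 peaks on point 2
    [0.1, 0.1, 0.1, 0.1],  # query 1 is flat (no clear keypoint)
    [0.8, 0.2, 0.0, 0.0],  # query 2 peaks on point 0
]
pred_probs = [0.8, 0.1, 0.9]

assignment, unmatched, total_cost = bipartite_match(pred_heatmaps, pred_probs, gt_heatmaps)
print(assignment)  # [2, 0]: keypoint 0 -> query 2, keypoint 1 -> query 0
print(unmatched)   # [1]: query 1 is treated as "no keypoint"
```

Because each ground-truth keypoint is claimed by exactly one query, duplicate detections are penalized during training rather than filtered afterwards, which is why this style of loss can remove the need for clustering or NMS. At realistic sizes the exhaustive search would be replaced by a polynomial-time assignment solver such as SciPy's `scipy.optimize.linear_sum_assignment`.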
