OneTrack: Demystifying the Conflict Between Detection and Tracking in End-to-End 3D Trackers

Qitai Wang, Jiawei He, Yuntao Chen, Zhaoxiang Zhang*

Abstract


"Existing end-to-end trackers for vision-based 3D perception suffer from performance degradation due to the conflict between detection and tracking tasks. In this work, we get to the bottom of this conflict, which was previously attributed only vaguely to incompatible task-specific object features. We find that the conflict between the two tasks lies in their partially conflicting classification gradients, which stem from a subtle difference in positive sample assignment. Based on this observation, we propose to coordinate the conflicting gradients from object queries whose polarity contradicts between the two tasks. We also dynamically split all object queries into four groups based on their polarity in the two tasks. Attention between query sets with conflicting positive sample assignments is masked. The tracking classification loss is modified to suppress inaccurate predictions. Building on these components, we propose OneTrack, the first one-stage joint detection and tracking model that bridges the gap between detection and tracking under a unified object feature representation. On the nuScenes camera-based object tracking benchmark, OneTrack outperforms previous works by 6.9% AMOTA on the validation set and by 3.1% AMOTA on the test set."
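
The four-way query grouping and attention masking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the group numbering, and the specific masking rule (blocking attention only between detection-only positives and tracking-only positives) are assumptions made for illustration, since the abstract does not state which group pairs are masked.

```python
from typing import List


def polarity_groups(det_pos: List[bool], trk_pos: List[bool]) -> List[int]:
    """Assign each query a group id from its polarity in the two tasks.

    Hypothetical numbering (not from the paper):
      0 = negative in both tasks
      1 = positive for tracking only
      2 = positive for detection only
      3 = positive in both tasks
    """
    return [2 * int(d) + int(t) for d, t in zip(det_pos, trk_pos)]


def conflict_mask(groups: List[int]) -> List[List[bool]]:
    """Build a mask where mask[i][j] is True if attention from i to j is blocked.

    Assumed rule: block attention between a detection-only positive (group 2)
    and a tracking-only positive (group 1), in both directions.
    """
    conflicted_pair = {1, 2}
    return [
        [{gi, gj} == conflicted_pair for gj in groups]
        for gi in groups
    ]


# Four queries covering every polarity combination.
det_pos = [True, True, False, False]
trk_pos = [True, False, True, False]

groups = polarity_groups(det_pos, trk_pos)
mask = conflict_mask(groups)
```

In a transformer decoder, a boolean mask of this shape would typically be passed as the self-attention mask over object queries, recomputed each time the positive sample assignments change.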
