6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation

Xu, Li and Qu, Haoxuan and Cai, Yujun and Liu, Jun (2024) 6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024.

Text (Accepted Version): Xu_6D-Diff_A_Keypoint_Diffusion_Framework_for_6D_Object_Pose_Estimation_CVPR_2024_paper_1_.pdf (3MB)

Abstract

Estimating the 6D object pose from a single RGB image often involves noise and indeterminacy due to challenges such as occlusions and cluttered backgrounds. Meanwhile, diffusion models have shown appealing performance in generating high-quality images from random noise with high indeterminacy through step-by-step denoising. Inspired by their denoising capability, we propose a novel diffusion-based framework (6D-Diff) to handle the noise and indeterminacy in object pose estimation for better performance. In our framework, to establish accurate 2D-3D correspondences, we formulate 2D keypoint detection as a reverse diffusion (denoising) process. To facilitate such a denoising process, we design a Mixture-of-Cauchy-based forward diffusion process and condition the reverse process on the object appearance features. Extensive experiments on the LM-O and YCB-V datasets demonstrate the effectiveness of our framework.
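
The sketch below (Python/NumPy) illustrates the general idea summarized in the abstract: perturbing 2D keypoint coordinates with heavy-tailed (Cauchy-like) noise in a forward process and then iteratively refining them in a reverse denoising process conditioned on appearance features. It is not the authors' implementation; the step count, noise scale, normalized-coordinate convention, dummy appearance feature, and the placeholder denoiser are assumptions made purely for illustration.

```python
# Minimal illustrative sketch (not the authors' code) of diffusion-based
# 2D keypoint refinement. All names and constants below are hypothetical.
import numpy as np

NUM_KEYPOINTS = 8   # e.g. keypoints per object (assumption)
NUM_STEPS = 100     # number of diffusion steps (assumption)

rng = np.random.default_rng(0)

def forward_diffuse(keypoints_2d, step, scale=0.05):
    """Add heavy-tailed (Cauchy) noise to ground-truth 2D keypoints.

    A single Cauchy perturbation is used here only to mirror the idea of a
    Mixture-of-Cauchy forward process; the actual mixture weights and
    scales are not reproduced.
    """
    noise = rng.standard_cauchy(size=keypoints_2d.shape) * scale * (step / NUM_STEPS)
    return keypoints_2d + noise

def denoiser(noisy_keypoints, step, appearance_feat):
    """Placeholder for a learned model conditioned on object appearance
    features. Here it simply pulls the estimate toward the image center
    as a stand-in prediction."""
    target = np.full_like(noisy_keypoints, 0.5)  # dummy prediction
    alpha = 1.0 / (step + 1)
    return noisy_keypoints + alpha * (target - noisy_keypoints)

def reverse_process(initial_keypoints, appearance_feat):
    """Iteratively refine noisy keypoint coordinates (reverse diffusion)."""
    kp = initial_keypoints
    for step in reversed(range(NUM_STEPS)):
        kp = denoiser(kp, step, appearance_feat)
    return kp

if __name__ == "__main__":
    # Keypoints in normalized image coordinates [0, 1] (assumption).
    gt_kp = rng.uniform(0.2, 0.8, size=(NUM_KEYPOINTS, 2))
    noisy_kp = forward_diffuse(gt_kp, step=NUM_STEPS)
    appearance_feat = np.zeros(128)  # dummy appearance feature vector
    refined_kp = reverse_process(noisy_kp, appearance_feat)
    print("refined keypoints:\n", refined_kp)
```

In the paper's setting, the refined 2D keypoints would then provide the 2D-3D correspondences from which the 6D pose is recovered; the learned denoiser, its conditioning, and the forward noise model differ from this toy stand-in.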

Item Type: Contribution in Book/Report/Proceedings
ID Code: 227549
Deposited By:
Deposited On: 29 Apr 2025 15:05
Refereed?: Yes
Published?: Published
Last Modified: 20 May 2025 01:40