8 months ago

3D Machine Vision

Computer Vision

Guanglin Li Yifeng Li Zhichao Ye Qihang Zhang Tao Kong Zhaopeng Cui Guofeng Zhang

Abstract

Empowering autonomous agents with 3D understanding for daily objects is a grand challenge in robotics applications. When exploring in an unknown environment, existing methods for object pose estimation are still not satisfactory due to the diversity of object shapes. In this paper, we propose a novel framework for category-level object shape and pose estimation from a single RGB-D image. To handle the intra-category variation, we adopt a semantic primitive representation that encodes diverse shapes into a unified latent space, which is the key to establish reliable correspondences between observed point clouds and estimated shapes. Then, by using a SIM(3)-invariant shape descriptor, we gracefully decouple the shape and pose of an object, thus supporting latent shape optimization of target objects in arbitrary poses. Extensive experiments show that the proposed method achieves SOTA pose estimation performance and better generalization in the real-world dataset. Code and video are available at https://zju3dv.github.io/gCasp.
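To make the idea of a SIM(3)-invariant descriptor concrete, the sketch below shows a toy descriptor whose value does not change when a point cloud is rotated, translated, and uniformly scaled. This is an illustration only and is not the descriptor used in the paper: the names `apply_sim3` and `toy_descriptor`, the sample points, and the choice of normalized pairwise distances are all assumptions made for this example, and the rotation is restricted to the z axis for brevity.

```python
import math
from itertools import combinations


def apply_sim3(points, scale, yaw, translation):
    """Apply a similarity transform: uniform scale, rotation about z by `yaw`, then translation.

    Hypothetical helper; a general SIM(3) element would use an arbitrary 3D rotation.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return [
        (scale * (c * x - s * y) + tx,
         scale * (s * x + c * y) + ty,
         scale * z + tz)
        for x, y, z in points
    ]


def toy_descriptor(points):
    """Sorted pairwise distances divided by their mean.

    Rotation and translation preserve distances, and dividing by the mean
    cancels a uniform scale, so the result is unchanged under SIM(3).
    This is a toy stand-in, not the paper's learned descriptor.
    """
    dists = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    mean = sum(dists) / len(dists)
    return [d / mean for d in dists]


# Made-up sample shape and an arbitrary similarity transform.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0),
          (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
moved = apply_sim3(points, scale=2.5, yaw=0.7, translation=(4.0, -1.0, 0.5))

# Largest element-wise difference between the two descriptors;
# it should be zero up to floating-point rounding.
max_diff = max(abs(a - b)
               for a, b in zip(toy_descriptor(points), toy_descriptor(moved)))
```

Because the descriptor depends only on the shape and not on where or how large the object appears, a shape code can be compared or optimized without first knowing the pose, which is the kind of decoupling the abstract describes.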
