SAM-Zero3D: Extending Segment Anything to Zero Shot 3D Scene Segmentation via Iterative Global–Local Interaction

Zhang, Dejun and Xu, Shifeng and Bai, Yanzi and Wu, Yiqi and Liu, Jun (2026) SAM-Zero3D: Extending Segment Anything to Zero Shot 3D Scene Segmentation via Iterative Global–Local Interaction. IEEE Transactions on Circuits and Systems for Video Technology. ISSN 1051-8215

Text (Manuscript_6.0_TCSVT__Final)
Manuscript_6.0_TCSVT_Final.pdf - Accepted Version
Available under License Creative Commons Attribution.

Abstract

Lifting multi-view 2D masks generated by the Segment Anything Model (SAM) into 3D space offers a promising direction for zero-shot 3D scene segmentation, but view-dependent occlusions and limited fields of view often cause incomplete observations and cross-view inconsistencies, resulting in fragmented semantics and geometric misalignment. To address this, we propose SAM-Zero3D, which extends SAM to the 3D domain through a structured fusion pipeline with two complementary branches. The global anchor point-guided branch projects 3D anchors into multi-view masks to construct a cross-view affinity graph, identifies consistent mask groups via connected component analysis, and assigns 3D masks via majority voting and nearest-neighbor propagation. The local geometry-driven branch partitions the point cloud into fine-grained regions, estimates region-level semantic similarity from aggregated mask distributions, and progressively merges similar regions through a multi-stage merging strategy. An iterative global–local interaction further refines both branches by aligning global semantic priors with local geometric cues. Extensive experiments on ShapeNetPart, ScanNetV2, and ScanNet200 show that SAM-Zero3D significantly outperforms existing zero-shot baselines, achieving accurate and structure-aware segmentation without any 3D training or supervision.
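The global branch summarized in the abstract (cross-view affinity graph, connected components, majority voting, nearest-neighbor propagation) can be illustrated with a toy sketch. This is not the authors' implementation: the array layout (`anchor_labels`, one row per view, `-1` for an anchor not visible in that view), the `min_shared` threshold, and all function names are assumptions made for illustration only.

```python
import numpy as np
from collections import Counter


def group_masks(anchor_labels, min_shared=1):
    """Group 2D masks across views via connected components.

    anchor_labels: (n_views, n_anchors) int array; entry [v, a] is the id of the
    mask that anchor a falls into in view v, or -1 if not visible (assumed layout).
    Two masks from different views are linked when they share at least
    `min_shared` anchors (hypothetical affinity rule).
    """
    n_views, n_anchors = anchor_labels.shape
    shared = Counter()
    for a in range(n_anchors):
        nodes = [(v, int(anchor_labels[v, a]))
                 for v in range(n_views) if anchor_labels[v, a] >= 0]
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                shared[(nodes[i], nodes[j])] += 1

    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Register every observed mask as a graph node, even if it has no edges.
    for v in range(n_views):
        for m in np.unique(anchor_labels[v]):
            if m >= 0:
                find((v, int(m)))
    # Merge masks whose affinity passes the threshold.
    for (u, w), count in shared.items():
        if count >= min_shared:
            parent[find(u)] = find(w)

    nodes = list(parent)
    roots = sorted({find(x) for x in nodes})
    root_id = {r: i for i, r in enumerate(roots)}
    return {x: root_id[find(x)] for x in nodes}


def label_anchors(anchor_labels, mask_group):
    """Assign each anchor the group that most of its visible masks belong to."""
    n_views, n_anchors = anchor_labels.shape
    out = np.full(n_anchors, -1)
    for a in range(n_anchors):
        votes = [mask_group[(v, int(anchor_labels[v, a]))]
                 for v in range(n_views) if anchor_labels[v, a] >= 0]
        if votes:
            out[a] = Counter(votes).most_common(1)[0][0]
    return out


def propagate(points, anchors, anchor_groups):
    """Give every 3D point the group of its nearest labeled anchor (brute force)."""
    valid = anchor_groups >= 0
    dists = np.linalg.norm(points[:, None, :] - anchors[valid][None, :, :], axis=2)
    return anchor_groups[valid][dists.argmin(axis=1)]
```

In this sketch, masks that never share an anchor with another view remain singleton groups, and raising `min_shared` makes the grouping more conservative. The brute-force distance matrix is only practical for small point sets; a spatial index would be the natural replacement at scene scale.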

Item Type:
Journal Article
Journal or Publication Title:
IEEE Transactions on Circuits and Systems for Video Technology
Subjects:
Media Technology; Electrical and Electronic Engineering
ID Code:
235989
Deposited By:
Deposited On:
11 Mar 2026 15:00
Refereed?:
Yes
Published?:
Published
Last Modified:
11 Mar 2026 22:35