MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation
Duc Dang Trung Tran (trandangtrungduc@seoultech.ac.kr), Seoul National University of Science and Technology, Department of Electrical and Information Engineering, Seoul, Republic of Korea
Byeongkeun Kang (byeongkeun.kang@seoultech.ac.kr), Seoul National University of Science and Technology, Department of Electronic Engineering
Abstract
Recently, transformer-based techniques incorporating superpoints have become prevalent in 3D instance segmentation. However, they often encounter an over-segmentation problem, especially noticeable with large objects. Additionally, unreliable mask predictions stemming from superpoint mask prediction further compound this issue. To address these challenges, we propose a novel framework called MSTA3D. It leverages multi-scale feature representation and introduces a twin-attention mechanism to effectively capture them. Furthermore, MSTA3D integrates a box query with a box regularizer, offering a complementary spatial constraint alongside semantic queries. Experimental evaluations on ScanNetV2, ScanNet200 and S3DIS datasets demonstrate that our approach surpasses state-of-the-art 3D instance segmentation methods.