A Unified Transformer Framework for Group-based Segmentation: Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection
Su Yukun ; Deng Jingliang ; Sun Ruizhou ; Lin Guosheng ; Wu Qingyao

Abstract
Humans tend to mine objects by learning from a group of images or several frames of video, since we live in a dynamic world. In computer vision, much research focuses on co-segmentation (CoS), co-saliency detection (CoSD) and video salient object detection (VSOD) to discover co-occurrent objects. However, previous approaches design separate networks for these similar tasks, and the resulting models are difficult to apply across tasks, which limits the transferability of deep learning frameworks. In addition, they fail to take full advantage of the inter- and intra-feature cues within a group of images. In this paper, we introduce a unified framework to tackle these issues, termed UFO (Unified Framework for Co-Object Segmentation). Specifically, we first introduce a transformer block, which views the image feature as a patch token and then captures long-range dependencies among the tokens through the self-attention mechanism. This helps the network excavate the patch-structured similarities among the relevant objects. Furthermore, we propose an intra-MLP learning module that produces a self-mask, helping the network avoid partial activation. Extensive experiments on four CoS benchmarks (PASCAL, iCoseg, Internet and MSRC), three CoSD benchmarks (Cosal2015, CoSOD3k and CocA) and four VSOD benchmarks (DAVIS16, FBMS, ViSal and SegV2) show that our method outperforms other state-of-the-art methods on the three tasks in both accuracy and speed while using the same network architecture, and that it can run in real time at 140 FPS.
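To make the self-attention idea in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each image in the group has already been reduced to one feature vector (one token), and that the projection matrices `w_q`, `w_k` and `w_v` are random placeholders standing in for learned weights. The function name `group_self_attention` and all dimensions are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def group_self_attention(tokens, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a group of image tokens.

    tokens: (N, D) array, one feature token per image (an assumption).
    Returns the attended tokens (N, D_v) and the (N, N) attention map,
    where row i holds the weights image i assigns to every image.
    """
    q = tokens @ w_q                      # queries, (N, D_k)
    k = tokens @ w_k                      # keys,    (N, D_k)
    v = tokens @ w_v                      # values,  (N, D_v)
    scores = q @ k.T / np.sqrt(k.shape[-1])
    attn = softmax(scores, axis=-1)       # each row sums to 1
    return attn @ v, attn


# Hypothetical group of 5 images with 16-dimensional features.
rng = np.random.default_rng(0)
n_images, dim = 5, 16
tokens = rng.standard_normal((n_images, dim))
w_q, w_k, w_v = (rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3))

out, attn = group_self_attention(tokens, w_q, w_k, w_v)
print(out.shape, attn.shape)
```

Because every token attends to every other token in the group, the attention map can relate images (or video frames) regardless of their order, which is the kind of long-range dependency the abstract refers to.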
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| co-salient-object-detection-on-coca | UFO | MAE: 0.095; mean F-measure: 0.555; S-measure: 0.697; max E-measure: 0.782; max F-measure: 0.571; mean E-measure: 0.762 |
| co-salient-object-detection-on-cosal2015 | UFO | MAE: 0.064; S-measure: 0.860; max E-measure: 0.906; max F-measure: 0.865; mean E-measure: 0.889; mean F-measure: 0.848 |
| co-salient-object-detection-on-cosod3k | UFO | MAE: 0.073; S-measure: 0.819; max E-measure: 0.874; max F-measure: 0.797; mean E-measure: 0.855; mean F-measure: 0.783 |
| co-salient-object-detection-on-icoseg | UFO | MAE: 0.029; S-measure: 0.924; max E-measure: 0.969; max F-measure: 0.953 |
| video-salient-object-detection-on-davis-2016 | UFO | Average MAE: 0.015; max F-measure: 0.906; S-measure: 0.918 |
| video-salient-object-detection-on-fbms-59 | UFO | Average MAE: 0.028; max F-measure: 0.890; S-measure: 0.894 |
| video-salient-object-detection-on-segtrack-v2 | UFO | Average MAE: 0.022; max F-measure: 0.863; S-measure: 0.892 |
| video-salient-object-detection-on-visal | UFO | Average MAE: 0.011; S-measure: 0.953; max E-measure: 0.987 |
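
For readers unfamiliar with the metrics in the table, the sketch below shows how MAE and a single-threshold F-measure are commonly computed for a saliency map. It is an illustration under stated assumptions, not the evaluation code behind the numbers above: the weighting `beta_sq = 0.3` is the value conventionally used in the salient object detection literature, the fixed threshold of 0.5 is arbitrary, and the function names are hypothetical. Max and mean F-measure are usually obtained by sweeping the threshold and taking the maximum or the mean.

```python
import numpy as np


def mae(pred, gt):
    """Mean absolute error between a saliency map and its ground truth.

    Both arrays are expected to hold values in [0, 1] and share a shape.
    """
    return float(np.mean(np.abs(pred - gt)))


def f_measure(pred, gt, threshold=0.5, beta_sq=0.3):
    """F-measure at one binarization threshold.

    beta_sq = 0.3 is an assumed, conventional weighting that favors
    precision over recall.
    """
    binary = pred >= threshold
    gt_bool = gt > 0.5
    tp = np.logical_and(binary, gt_bool).sum()
    precision = tp / max(binary.sum(), 1)   # guard against empty prediction
    recall = tp / max(gt_bool.sum(), 1)     # guard against empty ground truth
    denom = beta_sq * precision + recall
    if denom == 0:
        return 0.0
    return float((1 + beta_sq) * precision * recall / denom)


# Toy 2x2 example: the top row is the object.
gt = np.array([[1.0, 1.0], [0.0, 0.0]])
pred = np.array([[0.9, 0.4], [0.2, 0.1]])

mae_value = mae(pred, gt)        # (0.1 + 0.6 + 0.2 + 0.1) / 4 = 0.25
f_value = f_measure(pred, gt)    # precision 1.0, recall 0.5 -> 0.8125
print(mae_value, f_value)
```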