Is Weakly-supervised Action Segmentation Ready For Human-Robot Interaction? No, Let's Improve It With Action-union Learning
Fan Yang, Shigeyuki Odashima, Shoichi Masui, Shan Jiang
Abstract
Action segmentation plays an important role in enabling robots to automatically understand human activities. When training an action recognition model, obtaining action labels for all frames is costly, whereas annotating timestamp labels for weak supervision is cost-effective. However, existing methods may not fully utilize timestamp labels, which leads to insufficient performance. To alleviate this issue, we propose a novel learning pattern for the training stage, which maximizes the probability of the action union of the surrounding timestamps for unlabeled frames. For the inference stage, we provide a new refinement solution that generates better hard-assigned action classes from soft-assigned predictions. Importantly, our methods are model-agnostic and can be applied to existing frameworks. On three commonly used action-segmentation datasets, our method outperforms previous timestamp-supervision methods and achieves new state-of-the-art performance. Moreover, our method uses less than 1% of fully-supervised labels to obtain comparable or even better results.
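The action-union idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `action_union_loss`, the negative-log form of the objective, and the handling of frames before the first or after the last timestamp (which here use only the single nearest timestamp label) are all assumptions made for illustration. For each unlabeled frame, the sketch sums the predicted probabilities of the classes annotated at the surrounding timestamps and penalizes a low total.

```python
import numpy as np

def action_union_loss(probs, timestamps):
    """Hypothetical sketch of an action-union objective.

    probs:      (T, C) array of per-frame class probabilities (rows sum to 1).
    timestamps: list of (frame_index, class_id) pairs, sorted by frame_index.
    Returns the mean negative log-probability of the action union
    over all unlabeled frames.
    """
    num_frames = probs.shape[0]
    frames = [t for t, _ in timestamps]
    labels = [c for _, c in timestamps]
    labeled = set(frames)
    losses = []
    for t in range(num_frames):
        if t in labeled:
            continue  # labeled frames would use ordinary cross-entropy
        # Index of the first timestamp that lies after frame t.
        k = int(np.searchsorted(frames, t))
        union = set()
        if k > 0:
            union.add(labels[k - 1])  # label of the preceding timestamp
        if k < len(frames):
            union.add(labels[k])      # label of the following timestamp
        # Probability that the frame belongs to any class in the union.
        p_union = probs[t, sorted(union)].sum()
        losses.append(-np.log(p_union + 1e-8))  # epsilon avoids log(0)
    return float(np.mean(losses)) if losses else 0.0
```

In a training loop, a term like this would presumably be combined with a standard classification loss on the timestamp frames themselves; the abstract does not specify the weighting, so that part is left out.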
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| action-segmentation-on-50-salads-1 | AUL | Acc: 77.9, Edit: 77.0, F1@10%: 84.4, F1@25%: 81.3, F1@50%: 67.1 |
| action-segmentation-on-gtea-1 | AUL | Acc: 69.2, Edit: 84.0, F1@10%: 88.2, F1@25%: 85.5, F1@50%: 67.3 |
| weakly-supervised-action-localization-on-gtea | AU-Action | mAP@0.1:0.7: 76.9, mAP@0.5: 66.3 |
| weakly-supervised-action-segmentation | AUL | Acc: 67.3 |