Fine-Grained Action Detection with RGB and Pose Information using Two Stream Convolutional Networks
Hacker Leonard ; Bartels Finn ; Martin Pierre-Etienne

Abstract
As participants in the MediaEval 2022 Sport Task, we propose a two-stream network approach for the classification and detection of table tennis strokes. Each stream is a succession of 3D Convolutional Neural Network (CNN) blocks using attention mechanisms, and each stream processes a different 4D input. Our method uses raw RGB data and pose information computed with the MMPose toolbox. The pose information is treated as an image by drawing the pose either on a black background or on the original RGB frame it was computed from. The best performance is obtained by feeding raw RGB data to one stream and Pose + RGB (PRGB) information to the other stream, then applying late fusion on the features. The approaches were evaluated on the provided TTStroke-21 data sets. We report an improvement in stroke classification, reaching 87.3% accuracy, while the detection does not outperform the baseline but still reaches an IoU of 0.349 and an mAP of 0.110.
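To make the two input variants concrete, the following is a minimal sketch of how a pose could be drawn either on a black canvas or on a copy of the RGB frame (PRGB), and how features from two streams could be combined afterwards. It is not the authors' implementation: the function names `render_pose` and `late_fusion` are hypothetical, keypoints are drawn as single pixels rather than a full skeleton, and fusion by concatenation is an assumption, since the abstract only states that late fusion is applied on the features.

```python
from typing import List, Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]
Frame = List[List[Pixel]]  # frame[y][x] = (r, g, b)


def render_pose(
    keypoints: Sequence[Tuple[int, int]],
    height: int,
    width: int,
    background: Optional[Frame] = None,
    color: Pixel = (255, 255, 255),
) -> Frame:
    """Draw (x, y) keypoints as single pixels (hypothetical helper).

    With background=None the pose is drawn on a black canvas (pose-only
    input). With an RGB frame as background, the pose is drawn on a copy
    of that frame (PRGB input); the original frame is left unchanged.
    """
    if background is None:
        canvas = [[(0, 0, 0) for _ in range(width)] for _ in range(height)]
    else:
        canvas = [list(row) for row in background]
    for x, y in keypoints:
        # Skip keypoints that fall outside the frame.
        if 0 <= x < width and 0 <= y < height:
            canvas[y][x] = color
    return canvas


def late_fusion(feat_a: Sequence[float], feat_b: Sequence[float]) -> List[float]:
    """Fuse per-stream feature vectors by concatenation (an assumption)."""
    return list(feat_a) + list(feat_b)


# Toy 3x4 RGB frame and three keypoints, one of them out of bounds.
rgb = [[(10, 20, 30) for _ in range(4)] for _ in range(3)]
kps = [(1, 1), (3, 2), (9, 9)]
pose_only = render_pose(kps, height=3, width=4)
prgb = render_pose(kps, height=3, width=4, background=rgb)
fused = late_fusion([0.1, 0.2], [0.3])
```

In a real pipeline, each rendered frame sequence would be stacked into a 4D tensor and passed to its own 3D CNN stream, and the fused vector would feed a final classifier.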
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| action-classification-on-ttstroke-21 | RGB and PRGB | Acc: 0.8731 |
| action-detection-on-ttstroke-21 | RGB and PRGB | IoU: 0.3491, mAP: 0.1101 |
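For readers unfamiliar with the detection metric, here is a small sketch of intersection over union (IoU) between a predicted and a ground-truth temporal segment. The function name `temporal_iou` and the half-open `(start, end)` frame-interval convention are assumptions for illustration; the exact evaluation protocol of the MediaEval Sport Task may differ.

```python
from typing import Tuple


def temporal_iou(pred: Tuple[int, int], truth: Tuple[int, int]) -> float:
    """IoU of two temporal segments given as (start, end) frame indices.

    Segments are treated as half-open intervals [start, end), so the
    length of a segment is end - start.
    """
    (ps, pe), (ts, te) = pred, truth
    intersection = max(0, min(pe, te) - max(ps, ts))
    union = (pe - ps) + (te - ts) - intersection
    return intersection / union if union > 0 else 0.0


# Predicted stroke on frames 10-30, annotated stroke on frames 20-40:
# the overlap is 10 frames and the union is 30 frames.
score = temporal_iou((10, 30), (20, 40))
```

An IoU of 1.0 means the predicted segment matches the annotation exactly, while 0.0 means the two segments do not overlap.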