| re-SEQ2SEQ [DBLP:conf/eccv/ZhangGS18] | - | 63.9 | - | - | Supervised Video Summarization via Multiple Feature Sets with Parallel Attention |  | 
| MC-VSA [DBLP:journals/corr/abs-2006-01410] | - | 63.7 | - | - | Supervised Video Summarization via Multiple Feature Sets with Parallel Attention |  | 
| PGL-SUM (maximum learning capacity) | - | 62.7 | - | - | Combining Global and Local Attention with Positional Encoding for Video Summarization |  | 
| M-AVS [DBLP:journals/corr/abs-1708-09545] | - | 61.0 | - | - | Supervised Video Summarization via Multiple Feature Sets with Parallel Attention |  | 
| VASNet [DBLP:conf/accv/FajtlSAMR18] | - | 59.8 | - | - | Supervised Video Summarization via Multiple Feature Sets with Parallel Attention |  |