Self-Attention Based Video Summarization
Graphical Abstract
Abstract
Video summarization aims to identify the most representative content in videos. In this paper, we propose a new video summarization method that assigns different importance to video frames. Specifically, we exploit bidirectional LSTMs to capture the temporal information of video frames and then employ a self-attention mechanism to attend differently to different frames when extracting their global features. Finally, we sample an action for each frame according to its regression score and apply a reinforcement learning strategy to optimize the parameters of our model, where an action is defined as selecting or not selecting the current frame, a state is defined as the actions for the whole video, and the reward is defined as the sum of representativeness and diversity costs. We conduct video summarization experiments on two public video summarization datasets, SumMe and TVSum, and evaluate performance using the F-measure. Experimental results demonstrate that our proposed video summarization method achieves superior performance compared to the state of the art.
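To make the described pipeline concrete, the following is a minimal sketch of the frame-scoring model (bidirectional LSTM, self-attention, per-frame score regression, and Bernoulli action sampling), assuming a PyTorch-style implementation; all module names, layer sizes, and hyperparameters are illustrative assumptions, not the authors' actual code.

# Sketch of the described pipeline: BiLSTM + self-attention + per-frame
# importance scores, with select / not-select actions sampled per frame.
# Names and sizes are illustrative, not the paper's released implementation.
import torch
import torch.nn as nn

class AttentiveSummarizer(nn.Module):
    def __init__(self, feat_dim=1024, hidden_dim=256, num_heads=4):
        super().__init__()
        # Bidirectional LSTM captures temporal context of the frame features.
        self.bilstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # Self-attention lets every frame attend to all other frames,
        # producing attention-weighted global features.
        self.attn = nn.MultiheadAttention(2 * hidden_dim, num_heads,
                                          batch_first=True)
        # Regression head maps each frame to an importance score in [0, 1].
        self.score = nn.Sequential(nn.Linear(2 * hidden_dim, 1), nn.Sigmoid())

    def forward(self, frame_feats):
        # frame_feats: (batch, num_frames, feat_dim), e.g. CNN features per frame.
        h, _ = self.bilstm(frame_feats)
        ctx, _ = self.attn(h, h, h)          # global, attention-weighted features
        return self.score(ctx).squeeze(-1)   # (batch, num_frames) frame scores

# Sample a select / not-select action for each frame from its score; the
# log-probabilities would drive a policy-gradient update whose reward combines
# representativeness and diversity, as described in the abstract.
feats = torch.randn(1, 120, 1024)            # dummy features for 120 frames
scores = AttentiveSummarizer()(feats)
dist = torch.distributions.Bernoulli(probs=scores)
actions = dist.sample()                       # 1 = keep the frame in the summary
log_probs = dist.log_prob(actions)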