A two-stage temporal proposal network for precise action localization in untrimmed video
Authors: Wang F (王斐) [1]; Wang, Guorui [2]; Du, Yuxuan [2]; He, Zhenquan [2]; Jiang Y (姜勇) [3]
Department: Process Equipment and Intelligent Robotics Laboratory (工艺装备与智能机器人研究室)
Source Publication: International Journal of Machine Learning and Cybernetics
ISSN: 1868-8071
2021
Volume: 12; Issue: 8; Pages: 2199-2211
Indexed By: SCI; EI
EI Accession Number: 20211410183581
WOS ID: WOS:000635039400001
Contribution Rank: 3
Funding Organization: National Natural Science Foundation of China under Grant 61973065; Fundamental Research Funds for the Central Universities of China under Grants N172608005, N182612002, and N2026002
Keywords: Action detection; Correctness discriminator; Extended context pooling; Temporal context regression
Abstract

In this paper, we propose a two-stage temporal proposal algorithm for action detection in long untrimmed videos. In the first stage, we propose a novel prior-minor watershed algorithm that generates action proposals by combining a precise prior watershed proposal algorithm with a minor supplementary sliding window algorithm. A correctness discriminator uses the sliding window proposals to fill in proposals that the watershed proposal algorithm may omit. In the second stage, we first propose extended context pooling (ECP), which has two modules (internal and context). The context information module of ECP structures the proposals and enhances the extended features of action proposals. Different levels of ECP are introduced to model the action proposal region and make its extended context region more targeted and precise. We then propose a temporal context regression network, which adopts a multi-task loss to train temporal coordinate regression and action/background classification simultaneously and outputs the precise temporal boundaries of the proposals. We also propose prior-minor ranking to balance the effect of the prior watershed proposals and the minor supplementary proposals. On three large-scale benchmarks, THUMOS14, ActivityNet (v1.2 and v1.3), and Charades, our approach achieves superior performance compared with other state-of-the-art methods and runs at over 1020 frames per second (fps) on a single NVIDIA Titan-X Pascal GPU, indicating that our method can efficiently improve the precision of action localization.
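
The first-stage idea of supplementing one proposal source with sliding windows can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how the correctness discriminator works, so the sketch swaps in a plain temporal-IoU threshold as a stand-in. The function names (`temporal_iou`, `sliding_windows`, `supplement`) and the `max_iou` parameter are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection-over-union of two 1-D temporal segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def sliding_windows(duration: float, length: float, stride: float) -> List[Segment]:
    """Fixed-length windows that fit entirely inside [0, duration]."""
    windows = []
    start = 0.0
    while start + length <= duration:
        windows.append((start, start + length))
        start += stride
    return windows


def supplement(prior: List[Segment], minor: List[Segment],
               max_iou: float = 0.3) -> List[Segment]:
    """Keep every prior proposal and add a minor (sliding-window) proposal
    only when it overlaps no prior proposal by more than max_iou.

    Assumption: a simple IoU threshold replaces the paper's correctness
    discriminator, whose actual form is not given in the abstract.
    """
    kept = list(prior)
    for window in minor:
        if all(temporal_iou(window, p) <= max_iou for p in prior):
            kept.append(window)
    return kept


if __name__ == "__main__":
    prior_proposals = [(2.0, 6.0)]                 # e.g. from a watershed step
    minor_proposals = sliding_windows(12.0, 4.0, 4.0)
    print(supplement(prior_proposals, minor_proposals))
```

In this toy run, the windows that substantially overlap the prior proposal are dropped and only the window over the uncovered region is added, which mirrors the "fill what was omitted" role described above under the stated assumption.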

Language: English
WOS Subject: Computer Science, Artificial Intelligence
WOS Research Area: Computer Science
Funding Project: National Natural Science Foundation of China [61973065]; Fundamental Research Funds for the Central Universities of China [N172608005, N182612002, N2026002]
Document Type: Journal article
Identifier: http://ir.sia.cn/handle/173321/28738
Collection: Process Equipment and Intelligent Robotics Laboratory (工艺装备与智能机器人研究室)
Corresponding Author: Wang F (王斐)
Affiliation:
1. Faculty of Robot Science and Engineering, Northeastern University, Shenyang 110169, China
2. College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
3. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Recommended Citation
GB/T 7714: Wang F, Wang, Guorui, Du, Yuxuan, et al. A two-stage temporal proposal network for precise action localization in untrimmed video[J]. International Journal of Machine Learning and Cybernetics, 2021, 12(8): 2199-2211.
APA: Wang F, Wang, Guorui, Du, Yuxuan, He, Zhenquan, & Jiang Y. (2021). A two-stage temporal proposal network for precise action localization in untrimmed video. International Journal of Machine Learning and Cybernetics, 12(8), 2199-2211.
MLA: Wang F, et al. "A two-stage temporal proposal network for precise action localization in untrimmed video." International Journal of Machine Learning and Cybernetics 12.8 (2021): 2199-2211.
Files in This Item:
File Name/Size: A two-stage temporal (3682KB)
DocType: Journal article
Version: Author accepted manuscript
Access: Open access
License: CC BY-NC-SA
File name: A two-stage temporal proposal network for precise action localization in untrimmed video.pdf
Format: Adobe PDF
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.