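Research on Multi-Camera Multi-Pedestrian Detection and Tracking Methods Based on Convolutional Neural Networks (基于卷积神经网络的跨镜多行人目标检测与跟踪方法研究)
Alternative Title: Multiple Camera Multiple Pedestrian Detection and Tracking Method Based on Convolutional Neural Networks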
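He Miao (何淼)1,2
Department: Optoelectronic Information Technology Research Laboratory (光电信息技术研究室)
Thesis Advisor: Luo Haibo (罗海波)
Keyword: pedestrian detection; multi-object tracking; re-identification; camera network topology estimation; multi-camera multi-object tracking
Pages: 156
Degree Discipline: Pattern Recognition and Intelligent Systems
Degree Name: Doctoral
2020-11-26
Degree Grantor: Shenyang Institute of Automation, Chinese Academy of Sciences
Place of Conferral: Shenyang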
Abstract: As urbanization advances and image acquisition and processing technology improves, the number of cameras installed indoors and outdoors across industries keeps growing, and the spatio-temporal relationships among cameras are becoming increasingly complex. Quickly organizing the spatio-temporal relationships of a large number of distributed cameras, and monitoring the many targets in a scene across cameras over long periods, has therefore become an urgent problem. A target's pose, viewing angle, illumination, imaging color, and shooting distance all differ from camera to camera, which complicates continuous monitoring of targets in large-scale scenes covered by multiple cameras and restricts the further development of related vision applications. To address these problems, this thesis focuses on pedestrian targets in the more challenging setting of non-overlapping fields of view and studies multi-camera multi-pedestrian detection and tracking. The main contributions are as follows:

1. A pedestrian detection algorithm based on semantic regions of interest is proposed. A weakly supervised semantic segmentation network is trained from detection labels, and the stronger representation ability of semantic segmentation is used to reduce the influence of dense crowds and complex backgrounds on detection. By reusing features between the weakly supervised semantic segmentation network and a detector based on a convolutional neural network, the fused detector can correct its detection results and improve robustness. By cascading semantic region-of-interest selection with a sliding-window detector, the sliding-window range is narrowed and the number of spatial pyramid levels is reduced, which improves efficiency and multi-scale robustness. Experimental results show that the proposed algorithm detects pedestrians well in both of these applications.

2. A fast online multi-object tracking algorithm based on deep appearance representation and Kalman filtering is proposed. Built on the pedestrian detector, the algorithm combines a constant-velocity linear motion model based on the Kalman filter with long-term and short-term appearance models based on a re-identification network, and uses the Hungarian algorithm to associate the motion predictions of the tracklets with the detector's observations. Through its tracking strategy, the algorithm is robust to false positives and missed detections of the detector. Experimental results show that the proposed algorithm achieves high tracking accuracy together with good real-time performance.

3. A person re-identification algorithm based on attributes and multi-level self-attention feature inhibition is proposed. The algorithm uses an improved CBAM attention module to select features. On this basis, a multi-level self-attention inhibition structure is proposed, in which mutual inhibition among multi-level self-attention features improves the ability of the re-identification features to capture detail. Finally, attribute attention features are integrated into the structure as its first level, which further improves accuracy and interpretability. Experimental results show that the proposed algorithm has strong person re-identification performance.

4. An unsupervised camera network topology estimation algorithm based on trajectory tracking and re-identification is proposed. Tracking trajectories are obtained by pedestrian tracking within each single camera, and the start and end points of the trajectories are clustered to obtain the camera network topology nodes within each field of view. On this basis, an average cumulative cross-correlation function based on appearance similarity is proposed. It analyzes the pedestrian departure signals, entry signals, and appearance similarity at the topology nodes to obtain the correlation between different nodes, and the physical connections between nodes and the transition-time probability function are derived from the function curves. Experimental results show that the proposed algorithm estimates the camera network topology well.

5. The workflow of a multi-camera multi-pedestrian tracking system is designed, and the system is built. Based on the nature of each subtask of multi-camera multi-target tracking, the dependencies among the subtasks are analyzed and the system workflow is designed. The system is then built on a GPU server and tested with several surveillance cameras in the laboratory, providing a preliminary exploration of its practical deployment.
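To make the idea behind contribution 4 more concrete, the following is a minimal sketch of a similarity-weighted cross-correlation between the departure events of one topology node and the entry events of another. It is not the thesis's average cumulative cross-correlation function, whose exact definition is not given in the abstract. The function names, the cosine similarity measure, the one-second bins, and the toy feature vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def weighted_cross_correlation(departures, entries, max_lag, bin_size=1.0):
    """Accumulate appearance similarity over departure/entry pairs, binned by lag.

    departures, entries: lists of (timestamp_seconds, feature_vector).
    Index k of the result covers lags in [k * bin_size, (k + 1) * bin_size).
    (Hypothetical helper, not taken from the thesis.)
    """
    n_bins = int(max_lag / bin_size)
    scores = [0.0] * n_bins
    for t_out, f_out in departures:
        for t_in, f_in in entries:
            lag = t_in - t_out
            if 0 <= lag < n_bins * bin_size:
                k = min(int(lag // bin_size), n_bins - 1)
                # Negative similarities are clipped so they cannot cancel evidence.
                scores[k] += max(0.0, cosine_similarity(f_out, f_in))
    return scores


def transition_time_distribution(scores):
    """Normalize the scores into an empirical transition-time distribution."""
    total = sum(scores)
    if total == 0:
        return None  # no evidence that the two nodes are connected
    return [s / total for s in scores]


# Toy data: three pedestrians leave node A and reappear at node B 10 s later.
departures = [(0, [1.0, 0.0, 0.0]), (5, [0.0, 1.0, 0.0]), (12, [0.0, 0.0, 1.0])]
entries = [(10, [0.9, 0.1, 0.0]), (15, [0.1, 0.9, 0.0]), (22, [0.0, 0.1, 0.9])]

scores = weighted_cross_correlation(departures, entries, max_lag=30)
dist = transition_time_distribution(scores)
peak_lag = max(range(len(dist)), key=dist.__getitem__)
print("most likely transition time (s):", peak_lag)
```

In this sketch, a clear peak in the distribution suggests that the two nodes are physically connected and indicates the typical transit time, whereas a flat curve suggests no direct link. Weighting each pair by appearance similarity reduces the contribution of pairs that are unlikely to be the same person.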
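Language: Chinese
Contribution Rank: 1
Document Type: Thesis
Identifier: http://ir.sia.cn/handle/173321/27974
Collection: Optoelectronic Information Technology Research Laboratory (光电信息技术研究室)
Affiliation: 1. Shenyang Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences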
Recommended Citation
GB/T 7714
何淼. 基于卷积神经网络的跨镜多行人目标检测与跟踪方法研究[D]. 沈阳. 中国科学院沈阳自动化研究所,2020.
Files in This Item:
File Name/Size | DocType | Version | Access | License
基于卷积神经网络的跨镜多行人目标检测与跟 (10372KB) | Thesis | | Open Access | CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.