Title: 基于视觉的非合作目标位姿估计方法研究
Alternative Title: Vision-Based Pose Estimation Method for Non-cooperative Targets
Author: 付明亮
Department: 空间自动化技术研究室
Thesis Advisor: 朱枫
Keywords: non-cooperative target; full degree-of-freedom pose estimation; weighted mode filtering; recursive Bayesian filtering; inter-frame tracking
Pages: 114
Degree Discipline: Mechatronic Engineering
Degree Name: Doctor (博士)
Date: 2019-09-19
Degree Grantor: 中国科学院沈阳自动化研究所 (Shenyang Institute of Automation, Chinese Academy of Sciences)
Place of Conferral: Shenyang (沈阳)
Abstract: A non-cooperative target generally refers to a target object that lacks cooperative markers and descriptions of physical characteristics such as its motion state. Such objects frequently appear in various on-orbit servicing missions, for example active space-debris removal and planetary exploration. On the ground, the pose estimation problem for non-cooperative targets also arises, for example in cluttered grasping scenes. Robust pose estimation of non-cooperative targets has therefore been receiving increasing attention in the related research communities. The challenges of this task are determined mainly by the environmental conditions and by the physical characteristics of the target itself. For a non-cooperative target whose geometric model is unknown and which carries no cooperative measurement markers, constraint equations on the pose parameters are usually built from typical geometric features on the target; for targets without such features, these methods no longer apply. In task scenes disturbed by environmental factors such as foreground occlusion, cluttered backgrounds, and changing illumination, stable pose estimation remains highly challenging. With the continuing development of depth sensors, more and more pose estimation tasks use depth information, which is more robust to illumination changes; however, the resolution of current commercial ToF sensors is generally low, which is the main bottleneck limiting their further application. Addressing these problems in the pose estimation of non-cooperative targets, this dissertation presents a systematic study along the following lines.

First, to address the low resolution of depth images acquired by time-of-flight (ToF) cameras, an upsampling method based on extended weighted mode filtering (EWMF) is proposed. A noise-suppression term is integrated into the weighted mode filter (WMF), so that the improved EWMF suppresses the noise in the original depth image while increasing its resolution. In addition, instead of the fixed filtering window used in the original WMF, a refined adaptive filtering window is designed to better adapt to local image details.

Second, to enable online estimation of the system noise when modeling the pose of a moving object, a pose estimation method based on Unscented FastSLAM is implemented. Several independent unscented Kalman filters (UKFs) separately update the positions of the corresponding tracked features, and the updated results are used for the subsequent measurement update. To obtain a more accurate proposal distribution, a Sage-Husa noise estimator is applied while the UKFs estimate the proposal distribution, realizing online estimation of the time-varying process noise. Finally, a particle filter samples from the resulting importance density to update the motion parameters.

Third, to improve the robustness of inter-frame pose tracking under dynamic occlusion, an occlusion-robust inter-frame pose tracking method is proposed. Built on the baseline RF-tracker (Tan et al., 2015), it improves the tracking accuracy and occlusion robustness of the RF-tracker in two respects. To improve accuracy, a local refinement module exploits the complementary information among neighboring viewpoint trees. To enhance robustness under occlusion, an occlusion-handling module based on an online rendering mechanism detects occlusion by comparing image depth values. In addition, to cope with large inter-frame motion and unavoidable image transmission delay, a lightweight convolutional-neural-network-based motion compensation module is used to improve the inter-frame projection overlap.

Fourth, to improve the accuracy of pose estimation in occluded scenes, a pose estimation method based on projection grouping and correspondence learning is proposed. Oberweger et al. (2018) predict projection heatmaps of the 3D bounding-box corners from random local patches of the object's region of interest, achieving stable pose estimation under partial occlusion. However, because the correlation between different parts of the object exploited by global methods is missing, the merged heatmap channels contain multiple local maxima. To resolve this, a projection grouping module guides the grouping of the initial merged heatmaps, so that correlation constraints among the heatmap channels eliminate implausible projections. A pool of 2D-3D correspondence hypotheses is then sampled from the filtered merged heatmaps and fed to a correspondence evaluation network, and finally the high-weight correspondence hypotheses are used to compute the pose parameters.
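The core operation in the weighted mode filtering mentioned above is, for every high-resolution pixel, to accumulate Gaussian-relaxed votes for candidate depth values from its neighbourhood, weight them by spatial distance and guidance-image similarity, and output the depth receiving the strongest vote (the mode). The Python sketch below is a minimal, unoptimized illustration of plain WMF-style joint upsampling under those assumptions; it is not the dissertation's EWMF, which further integrates a noise-suppression term and a refined adaptive window, and the function name and parameters (sigma_s, sigma_r, sigma_d, num_bins) are illustrative choices only.

```python
import numpy as np

def weighted_mode_upsample(depth_lr, guide_hr, scale,
                           radius=2, sigma_s=3.0, sigma_r=10.0,
                           sigma_d=2.0, num_bins=64):
    """Joint depth upsampling: output the mode of a weighted depth histogram.

    depth_lr : (h, w) low-resolution depth map
    guide_hr : (H, W) high-resolution grayscale guidance image, H ~= h*scale
    scale    : integer upsampling factor
    """
    H, W = guide_hr.shape
    # Nearest-neighbour upsampling provides the candidate depth values.
    depth_up = np.kron(depth_lr, np.ones((scale, scale)))[:H, :W]
    bins = np.linspace(depth_up.min(), depth_up.max(), num_bins)
    out = np.empty_like(depth_up)
    for y in range(H):
        for x in range(W):
            hist = np.zeros(num_bins)
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), H - 1)
                    xx = min(max(x + dx, 0), W - 1)
                    # Spatial weight and guidance-image similarity weight.
                    w_s = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
                    diff = float(guide_hr[y, x]) - float(guide_hr[yy, xx])
                    w_r = np.exp(-(diff * diff) / (2 * sigma_r ** 2))
                    # Each neighbour casts a Gaussian-relaxed vote around its depth.
                    hist += w_s * w_r * np.exp(
                        -((bins - depth_up[yy, xx]) ** 2) / (2 * sigma_d ** 2))
            out[y, x] = bins[np.argmax(hist)]  # mode of the weighted histogram
    return out
```

Because the output is a histogram mode rather than a weighted average, this style of filter avoids the flying-pixel artifacts that averaging filters produce at depth discontinuities, which is why the dissertation builds on WMF rather than on joint bilateral upsampling.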
Other Abstract: A non-cooperative target refers to a target that lacks, completely or partially, descriptions of its physical profile and physical characteristics. This category of targets is frequently encountered in a variety of space missions such as debris capturing and planetary surface exploration. On the ground, the pose estimation problem for non-cooperative targets also arises, for example in cluttered grasping scenes. How to effectively estimate the geometrical features and poses (positions and orientations) of these targets has been receiving more and more attention within the related research communities. The difficulty of estimating a non-cooperative target's physical characteristics depends on the environmental conditions and the complexity of the object itself. If non-cooperative targets cannot be fitted with visual labels or markers, the pose estimate must rely on parametric constraint equations established from the target's typical geometrical features; if, on the other hand, typical geometrical features cannot be identified, the parametric constraint equation method is no longer applicable. Under environmental conditions that interfere with the target's visibility, such as occlusions, shadows, or poor illumination, pose estimation remains challenging. With the rapid development of depth sensors, more and more pose estimation projects turn to depth information, since depth maps are more robust to illumination changes than RGB images. However, the resolution of currently available commercial ToF sensors is limited, which is the main bottleneck restricting their further application. Aiming at these problems and challenges in the pose estimation of non-cooperative targets, this dissertation makes the following contributions to enhance the robustness of pose estimation.

First, an Extended Weighted Mode Filtering (EWMF) approach is proposed to address the insufficient spatial resolution of depth maps captured by ToF cameras. A "noise-aware" term is designed into the framework of the original WMF model to suppress noise, and the original fixed filtering window is replaced by a refined adaptive filtering window that better adapts to the local details of the image.

Second, a recursive Bayesian framework is designed to handle unknown, time-varying process noise. To realize online estimation of the system noise when modeling the pose of a moving object, an Unscented-FastSLAM-based pose estimation method is implemented. The proposed method uses several independent Unscented Kalman Filters (UKFs) to separately update the locations of the tracked features; the updated results are then used for the subsequent measurement update. To obtain a more accurate proposal distribution, a Sage-Husa noise estimator is used while the UKFs estimate the distribution, achieving online estimation of the time-varying noise. Finally, the motion parameters are updated by a particle filter sampling from the resulting importance density.

Third, an occlusion-aware framework for real-time temporal pose tracking is proposed. It is based on the RF-tracker (Tan et al., 2015) but enhances the robustness and accuracy of the RF-tracker in two respects: integrated local refinement of the random forest on one hand, and online rendering-based occlusion handling on the other. To improve robustness against dynamic occlusion, an online rendering-based occlusion-handling module is implemented. In addition, a lightweight Convolutional Neural Network (CNN) based motion compensation module is designed to cope with fast motion and unavoidable transmission delay.

Finally, combined projection grouping and correspondence learning for full pose estimation is studied. Oberweger et al. (2018) use a CNN to map local image patches to projection heatmaps of the 3D bounding-box corners (BBCs). However, relying on local patches means that the correlation between different parts of a specific object is ignored, so multiple local maxima frequently appear in each channel of the merged heatmaps. To resolve these ambiguities, a simple projection grouping module is proposed to guide the projection selection. Instead of directly feeding 2D-3D correspondences to the Perspective-n-Point (PnP) algorithm, multiple correspondence hypotheses are sampled from the local maxima and their neighborhoods and ranked by a correspondence-evaluation network. Finally, the correspondences with higher confidence are selected to determine the object pose.
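To make the last step above concrete (recovering a pose from per-corner projection heatmaps and 2D-3D correspondences), the following hedged Python sketch takes the most confident location in each heatmap channel, pairs it with the matching 3D bounding-box corner, and solves PnP with RANSAC through OpenCV. It is a simplified stand-in, not the dissertation's pipeline: the projection grouping module and the correspondence-evaluation network are replaced by a plain confidence threshold, and the names pose_from_heatmaps and conf_thresh are assumptions made for illustration.

```python
import numpy as np
import cv2

def pose_from_heatmaps(heatmaps, corners_3d, K, conf_thresh=0.5):
    """Estimate an object pose from per-corner projection heatmaps.

    heatmaps   : (8, H, W) array, one channel per 3D bounding-box corner
    corners_3d : (8, 3) corner coordinates in the object frame
    K          : (3, 3) camera intrinsic matrix
    """
    pts_2d, pts_3d = [], []
    for channel, corner in zip(heatmaps, corners_3d):
        # Most confident projection hypothesis in this channel.
        v, u = np.unravel_index(np.argmax(channel), channel.shape)
        if channel[v, u] >= conf_thresh:   # keep only confident correspondences
            pts_2d.append([u, v])
            pts_3d.append(corner)
    if len(pts_2d) < 4:                    # PnP needs at least 4 correspondences
        return None
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(pts_3d, dtype=np.float64),
        np.asarray(pts_2d, dtype=np.float64),
        K.astype(np.float64), None)
    return (rvec, tvec) if ok else None
```

In the dissertation's method, the hard argmax-and-threshold step sketched here is replaced by sampling a pool of hypotheses around the local maxima and letting the learned evaluation network weight them before the pose is computed.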
Language: Chinese
Contribution Rank: 1
Document Type: Thesis (学位论文)
Identifier: http://ir.sia.cn/handle/173321/25940
Collection: 空间自动化技术研究室
Recommended Citation (GB/T 7714):
付明亮. 基于视觉的非合作目标位姿估计方法研究[D]. 沈阳: 中国科学院沈阳自动化研究所, 2019.
Files in This Item:
File Name/Size: 基于视觉的非合作目标位姿估计方法研究.p (28457 KB)
DocType: 学位论文 (Thesis)
Access: 开放获取 (Open Access)
License: CC BY-NC-SA