Research on Robot Localization Methods Based on Visual Perception and Spatial Cognition (基于视觉感知和空间认知的机器人定位方法研究)
Alternative Title: Robot Localization Based on Visual Perception and Spatial Cognition
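Zhao Dongye (赵冬晔) 1,2
Department: Robotics Laboratory (机器人学研究室)
Thesis Advisor: Feng Xisheng (封锡盛); Tang Fengzhen (唐凤珍)
Keyword: autonomous localization; visual perception; spatial cognition; long-term autonomy; backpropagation neural network
Pages: 116
Degree Discipline: Pattern Recognition and Intelligent Systems
Degree Name: Doctoral degree
2020-12-06
Degree Grantor: Shenyang Institute of Automation, Chinese Academy of Sciences
Place of Conferral: Shenyang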
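Abstract: Localization is the most fundamental component of robot navigation. When a mobile robot operates in an unstructured environment that is large in scale, difficult to communicate in, and highly dynamic, existing localization techniques suffer from many problems, including low adaptability and poor scalability. Exploring new localization techniques therefore becomes particularly urgent, and the superb navigation ability of mammals offers great inspiration for this exploration. Neurobiological studies have found that this remarkable ability depends on robust and efficient information-processing mechanisms inside the mammalian brain. How to draw on these brain navigation mechanisms to improve the localization performance of mobile robots is the top priority in research on new localization techniques. This thesis explores mobile robot localization at the theoretical level, based on the visual perception and spatial cognition mechanisms of mammals. By simulating the anatomical connectivity of the relevant brain regions and neurons, it reproduces the construction of the cognitive map in the mammalian brain and accomplishes end-to-end mobile robot localization, providing theoretical and methodological support for long-term, autonomous, and intelligent localization. The contents of the thesis can be summarized as follows.
First, visual perception. Visual perception extracts important landmarks of the environment from two-dimensional images and provides effective external information for localization, so the localization performance of a mobile robot is often limited by its visual perception ability. To improve place recognition in natural scenes, this thesis, inspired by visual processing in primates, builds a siamese hierarchical network that adaptively extracts, from pairs of images, visual features that are robust to scene changes. Unlike deep convolutional networks, the proposed feature extractor does not depend on large amounts of labeled data; it achieves high-accuracy place recognition from unlabeled visual input through a trace learning rule. Experiments show that, across several kinds of challenging natural scenes (weather changes, day-night changes, and viewpoint changes), the model's place recognition ability clearly exceeds the best results of existing place recognition work.
Second, autonomous localization based on sensorimotor information integration. Besides acquiring external information, a mobile robot needs a degree of spatial cognition to process important properties of physical space such as size, shape, orientation, and distance. Based on the spatial cognition mechanisms of the mammalian brain, this thesis integrates visual perception information and motion information to build a cognitive map representation of the environment that encodes the spatial relations between locations. The model contains a deep neural network that represents visual information about the environment, a recurrent neural network that integrates multi-sensory information and encodes spatial information, and a readout network that decodes the position coordinates of the moving agent. Recurrent connections link the neurons within the same layer of the recurrent network. The asymmetry of these connections allows the model to form, in the feature representation space, a dynamic memory that matches the agent's motion in physical space, providing a plausible neurobiological explanation for path integration. The model adjusts the connection strengths between layers through competitive learning, establishes the association between visual perception and spatial memory, and achieves integrated modeling of perception, memory, and decision-making. In simulated environments, the model shows excellent real-time localization performance and noise robustness, and it reproduces the spatial representations of the mammalian hippocampus.
Third, spatial cognition in large environments. The key to practical mobile robots is long-term autonomy, that is, the ability to represent large-scale environments efficiently. To improve localization in large-scale environments, this thesis explores the navigation mechanisms by which mammals travel thousands of kilometers in the wild. Unlike in conventional experimental environments, place cells found in the hippocampus often fire in multiple, irregularly distributed fields in large environments. By integrating motion information and visual information, this thesis builds a self-organizing model of cognitive map representation in large environments and quantitatively evaluates the statistical distribution of place cell activity, finding that the encoding of place cells in the environment is random and mutually independent. The analysis also finds that place cells are recruited in a memoryless manner in the mammalian navigation system, with the recruited proportion growing sublinearly as the explored area expands. Through detailed analysis and verification of the remapping ability of hippocampal place cells, the thesis finds that the motion information represented by grid cells is the decisive factor in hippocampal spatial coding; depolarizing grid cells markedly impairs spatial memory.
Fourth, autonomous localization in large environments based on backpropagation neural networks. To further explore long-term autonomous localization in large-scale environments, this thesis models the mechanism of cognitive map representation in large environments with a backpropagation neural network architecture. The model uses a two-branch deep hierarchical network to simulate information transfer along the entorhinal cortex-hippocampus pathway, learns the weights of the deep network through backpropagation, integrates visual perception information and motion information, and completes the encoding and decoding of physical space end to end. Simulations show that the model achieves high-accuracy, stable real-time localization in large experimental environments. Moreover, when the visual and motion inputs are unstable, the model adaptively adjusts the connection strengths between nodes and maintains good localization performance. This study illustrates the value of backpropagation neural networks for testing hypotheses about how the brain works.
This thesis combines deep learning, machine vision, life science, and robotics. Through integrated modeling of perception and memory, it realizes, in simulation, environment perception, spatial memory, and real-time localization for mobile robots in large-scale, unknown, highly dynamic unstructured environments.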
Other Abstract: To navigate in space, a robot must be able to localize itself. Existing models for navigational tasks adapt and generalize poorly when robots move in unstructured environments, so exploring new robot navigation technologies has become particularly urgent. Mammals offer great inspiration for this exploration: in the wild, many mammals demonstrate excellent navigation. Neurobiological studies have found that mammalian navigation depends on robust and efficient information-processing mechanisms in the brain. How to use the spatial cognition mechanisms of mammals to improve the navigation ability of robots is the focus of our research. The project provides theoretical models to support the development of long-term, autonomous, and intelligent navigation systems. Our work is summarized below.
First, visual place recognition in changing environments is a challenging and critical task for autonomous robot navigation. We developed an unsupervised learning method (siamese VisNet) that autonomously learns invariant features in changing environments from unlabeled images. The siamese VisNet has two identical sub-network branches. With a Hebbian-type learning rule that incorporates a trace of previous activity patterns, the siamese VisNet learns features whose invariance to environmental change increases from layer to layer. Experiments conducted on multiple datasets demonstrate the robustness of the siamese VisNet against viewpoint changes, appearance changes, and joint viewpoint-appearance changes. The siamese VisNet also outperforms state-of-the-art ConvNets for place recognition.
Second, how to transform a mixed flow of sensory and motor information into a memory state of self-location, and how to build map representations of the environment, are central questions in navigation research. We proposed a sensorimotor integration network (SeMINet) that learns cognitive map representations by integrating sensory and motor information. This biologically inspired model consists of a deep neural network representing visual features of the environment, a recurrent network of place units encoding spatial information, and a secondary network that decodes the location of the agent from the spatial representations. The recurrent connections between the place units sustain an activity bump in the network without the need for sensory inputs. The asymmetry of the recurrent connections moves the activity bump through the network in step with the motion of the agent, forming a dynamical memory. A competitive learning process establishes the association between the sensory representations and the memory state. Simulation results demonstrate that the network can form neural codes that convey the location of the agent independently of the agent's head direction. The decoding network reliably predicts the location even when the movement is subject to noise.
Third, unlike hippocampal representations in small laboratory environments, place cells usually exhibit multiple, irregularly spaced place fields in large-scale environments. With future real-world application in mind, we proposed a linear summation model for large environments. The proposed model integrates both self-motion information and visual information. Self-motion information of the agent provides inputs to multi-phase grid cells in the medial entorhinal cortex (MEC). Visual information extracted from images provides inputs to neurons in the lateral entorhinal cortex (LEC). We ran a series of simulations in a large-scale box maze. Multiple, irregularly spaced place fields form to encode this large laboratory environment. Quantitative measurements indicate that the spatial representations in the model are sparsely coded. We also studied the remapping ability of place fields under several conditions. Results show that LEC inputs only modulate the formation of the spatial code in the hippocampus, while MEC inputs determine the hippocampal representations. Even changes in the firing rate of MEC neurons may dramatically impair spatial memory.
Fourth, the thesis further explores the mammalian navigation mechanism in large-scale environments using deep learning. A biologically inspired hierarchical architecture is proposed, comprising two parallel subnetworks that mimic the LEC and the MEC and one convergent network that mimics the hippocampus. The LEC relays time-related visual information, while the MEC supplies space-related information to the hippocampus. The convergent network integrates the information from the parallel subnetworks and predicts the position of the agent in the environment. Synaptic weights of the vision-to-place and grid-to-place transformations are learned with the stochastic gradient descent algorithm. Simulations in a large-scale virtual maze demonstrate that place units in the model form multiple, irregularly spaced place fields, similar to those observed in neurobiological experiments. The model accurately tracks the agent's movement based on these spatial representations. Moreover, the model is robust to degraded visual inputs, but its position predictions degrade when motion inputs are limited.
The project integrates computer vision, deep learning, life science, and robotics to realize, in theory, autonomous navigation in unstructured environments.
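The Hebbian-type trace rule named in the abstract can be illustrated with a short sketch. The abstract does not give the thesis's exact formulation, so the code below uses the commonly cited form of the rule (a running trace of the unit's output, a Hebbian weight increment scaled by that trace, and weight normalization). The function name `trace_update`, the parameters `eta` and `alpha`, and the toy input frames are all assumptions made for illustration, not values taken from the thesis.

```python
import math

def trace_update(weights, inputs, output, prev_trace, eta=0.8, alpha=0.1):
    """One step of a Hebbian trace rule for a single unit (illustrative form).

    trace_t = (1 - eta) * output_t + eta * trace_{t-1}
    w_j    += alpha * trace_t * x_j, followed by L2 normalization.
    """
    # Blend the current output with the memory of earlier outputs.
    trace = (1.0 - eta) * output + eta * prev_trace
    # Hebbian increment driven by the trace instead of the raw output.
    updated = [w + alpha * trace * x for w, x in zip(weights, inputs)]
    # Normalize so the weight vector keeps unit length.
    norm = math.sqrt(sum(w * w for w in updated))
    if norm > 0.0:
        updated = [w / norm for w in updated]
    return updated, trace

# Hypothetical usage: three consecutive "views" of the same place.
weights = [0.6, 0.8]
trace = 0.0
for frame in ([1.0, 0.0], [0.9, 0.1], [0.8, 0.2]):
    output = sum(w * x for w, x in zip(weights, frame))
    weights, trace = trace_update(weights, frame, output, trace)
```

Because the trace carries activity over from earlier frames, inputs that occur close together in time strengthen the same weights, which is the intuition behind learning features that stay stable as the view of a place changes.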
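Language: Chinese
Contribution Rank: 1
Document Type: Thesis
Identifier: http://ir.sia.cn/handle/173321/27976
Collection: Robotics Laboratory (机器人学研究室)
Affiliation: 1. Shenyang Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences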
Recommended Citation
GB/T 7714
赵冬晔. 基于视觉感知和空间认知的机器人定位方法研究[D]. 沈阳. 中国科学院沈阳自动化研究所,2020.
Files in This Item:
File Name/Size | DocType | Version | Access | License
基于视觉感知和空间认知的机器人定位方法研 (224238KB) | Thesis | | Open Access | CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.