Institutional Repository of the Shenyang Institute of Automation, Chinese Academy of Sciences
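Title: Research on Face Recognition and Moving Human Detection Based on Video (基于视频的人脸识别和运动人体检测研究)
Alternative title: Research of Face Recognition and Moving Human Detection Based on Video Sequence
Author: 易轶虎
Advisors: 曲道奎; 徐方
Classification number: TP391.4
Keywords: pattern classification; face detection; face recognition; motion segmentation; human detection
Call number: TP391.4/Y48/2010
Degree discipline: Pattern Recognition and Intelligent Systems
Degree type: Doctoral
Defense date: 2010-06-02
Degree-granting institution: Shenyang Institute of Automation, Chinese Academy of Sciences
Place of conferral: Shenyang Institute of Automation, Chinese Academy of Sciences
Author department: Other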
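Chinese abstract (English translation): Among biometric recognition technologies, face recognition and human detection are the most natural, direct, and user-friendly means. In theory, research on face recognition and human detection spans several disciplines and has become one of the most challenging active topics in pattern recognition and artificial intelligence, which gives it significant theoretical value. In practice, the technology has broad application prospects in identity authentication, e-commerce, video surveillance, human-computer interaction, and related fields. This dissertation presents the author's research on several algorithms for face recognition and human detection. For face recognition, it studies skin-color models, joint extraction of gray-level face features, multi-view face detection, and the small-sample-size and information-fusion problems in face recognition. For moving human detection, it studies motion region segmentation under complex conditions and human detection based on an extended histogram of oriented gradients. The main results are as follows.
(1) For face detection, to address the inadequate characterization of the skin-color space by existing models, a skin-color detection algorithm based on a parameter look-up table is proposed. The method treats skin and non-skin as two pattern classes, uses statistics collected in the YCbCr color space to compute the probability distribution of samples along the luminance axis at each chrominance value, classifies with the Bayes decision rule, and stores the model parameters in a look-up table for fast lookup and computation. Experiments compare detection performance on both pixel-sample classification and color-image segmentation, and show that the algorithm has a high detection rate and good robustness. To address the very long training time of AdaBoost-based face detection and its weak ability to detect rotated faces, joint Haar-like features and a width-first decision tree for head pose estimation are proposed, respectively. Feature combination builds on the co-occurrence of multiple features in face detection, which captures the similarity of face patterns more effectively. Unlike AdaBoost, in which each weak classifier is built from one feature, each weak classifier in the proposed method can be built from n features. Individual feature values are computed with the integral image and the joint feature value is binary coded, so feature values can be computed quickly and are insensitive to noise. Compared with the Viola algorithm, the joint features reduce the training error rate by 37% and raise training speed by a factor of 2.6. For rotated face detection, a width-first decision tree detection framework is proposed to detect faces rotated in-plane and out-of-plane within a certain range of angles. The framework detects left-right deflection in 5 angle ranges and pitch rotation in 3 angular orientations.
(2) For face recognition, to address the shortcomings of the commonly used PCA and LDA on high-dimensional small-sample classification problems, multi-subspace linear discriminant analysis is proposed. Based on the maximum scatter difference criterion, the algorithm uses multiple linear subspaces to describe each class of samples separately and extracts for each class the feature subspace best suited to classification, so that the within-class and between-class distribution of the samples is reflected more accurately. Classification takes into account the probability distribution model of the projected samples: the decision is based not on distance but on the membership confidence obtained from the Bayes decision rule, and the most probable class is selected. Experimental results demonstrate the effectiveness of the method; compared with single-subspace methods, it can handle hard cases such as class overlap and class distortion after projection. In tests on the IMDB face database, the recognition rate improves. An information fusion framework for face recognition is also proposed. According to cognitive science, human cognition proceeds from coarse to fine with gradual refinement. Unlike the holistic description given by global features, local features are insensitive to within-class variation of face patterns and more robust to changes in expression, illumination, and occlusion, so combining global and local features is a feasible way to raise the recognition rate further. The fusion-based framework consists of two levels of classifiers. The global classifier first recognizes the input image and, following the KNN rule, yields K potential candidates; a local-feature classifier then computes a score and rank for each candidate. For feature extraction, LDA and the Gabor wavelet transform represent global and local discriminative information, respectively. In particular, to make matching more complete, weighted Gabor features are used: the extracted Gabor features receive different weight coefficients according to the importance of each facial region for recognition, and the weights are computed from the information entropy of the different facial parts. Experimental results show that the method improves face recognition rates under expression and illumination changes.
(3) For extracting moving regions from video sequences, an adaptive background update method based on scene change analysis is proposed. It introduces the concept of a temporary background and divides the background into an original background and a temporary background. A scene analysis method is proposed that uses the gray-value samples of each pixel over several consecutive frames to estimate the gray-level mean and variance of that pixel. Changes in the background are judged by the "3σ rule" of the normal distribution, covering global and partial sudden illumination changes and mutual conversion between foreground objects and background content, and a different update strategy and update rate are applied in each case to eliminate the effect of background change on object segmentation. Because a moving region may contain the object's own cast shadow, and because chrominance changes little while luminance differs greatly between the object and its shadow region, shadow regions are identified from the luminance and chrominance ratios between the current frame and the background. Finally, with the background model in place and the interference of shadow and illumination removed, a gray image is obtained as the difference between the current frame I(x,y) and the background model B(x,y), and moving regions are extracted by segmenting this gray image with a dynamic threshold.
(4) To detect moving humans quickly and accurately, a moving human detection method that uses an extended histogram of oriented gradients as its feature is proposed. The method first combines human appearance features, such as the relative positions of body parts, symmetry, and gradient density, with the histogram of oriented gradients to improve the discriminative power of the HOG feature. It then extends the feature spatially to broaden the description of the global gradient characteristics of the human target. Histogram similarity and the Fisher criterion measure the discriminative power of all defined features, and a set of strongly discriminative features is selected to represent the moving human; the selected features obtained from target and background are used to train a support vector machine classifier. Because HOG-based moving human detection suffers from high vector dimensionality and long detection time, a detection method based on body-part division is proposed: the histogram of oriented gradients is computed in 6 key regions, including the head and the limbs, which effectively reduces the vector dimensionality. Experimental results show that the method increases detection speed while the detection rate remains essentially unchanged.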
English abstract: Face recognition and human detection are the most natural, direct, and user-friendly means among biometric recognition technologies. In theory, this research spans several disciplines and has become a challenging active topic in pattern recognition and artificial intelligence, with considerable theoretical value. In practice, face recognition and human detection have been widely applied in tasks such as identity authentication, electronic commerce, video surveillance, and human-machine interaction. This dissertation first investigates several key problems of face recognition, including skin color modeling, joint feature extraction, multi-view face detection, the small sample size problem, and information fusion. For moving human detection, it explores moving region segmentation in complex scenes and extends histograms of oriented gradients for human detection. The contents of the dissertation can be summarized as follows: (1) Because existing models do not adequately characterize the distribution of skin tone, an algorithm is proposed that detects skin tone using a look-up table of stored parameters. The method treats skin tone and non-skin tone as two pattern classes. From statistics in the YCbCr color space, it obtains the probability distribution along the luminance axis of samples that share the same chrominance, and then classifies with the Bayesian discriminant rule; the look-up table stores the model parameters for fast lookup and computation. Experiments evaluate detection performance on both pixel-sample classification and image segmentation, and the results show that the method has a high detection rate and strong robustness.
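The look-up-table skin classifier of item (1) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the Gaussian model of luminance within each chrominance bin, the bin width `step`, the count-based priors, and all function names are hypothetical choices made for the example.

```python
import math
from collections import defaultdict


def chroma_bin(cb, cr, step=8):
    """Quantize integer (Cb, Cr) values into a look-up-table key (bin width is assumed)."""
    return (cb // step, cr // step)


def fit_table(samples, step=8):
    """Build the table from labelled pixels given as (y, cb, cr, is_skin).

    Each chrominance bin stores, per class, the mean and standard deviation
    of luminance Y plus the sample count (used as an unnormalized prior).
    """
    groups = defaultdict(lambda: {True: [], False: []})
    for y, cb, cr, skin in samples:
        groups[chroma_bin(cb, cr, step)][skin].append(y)
    table = {}
    for key, by_label in groups.items():
        entry = {}
        for label, ys in by_label.items():
            if ys:
                mean = sum(ys) / len(ys)
                var = sum((v - mean) ** 2 for v in ys) / len(ys)
                # Floor the deviation so a constant class does not divide by zero.
                entry[label] = (mean, max(math.sqrt(var), 1.0), len(ys))
        table[key] = entry
    return table


def gaussian(y, mean, std):
    """Normal density of luminance y (the Gaussian form is an assumption)."""
    return math.exp(-0.5 * ((y - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def is_skin(table, y, cb, cr, step=8):
    """Bayes decision: pick the class with the larger prior times likelihood."""
    entry = table.get(chroma_bin(cb, cr, step))
    if not entry or True not in entry:
        return False  # chrominance never observed as skin
    if False not in entry:
        return True   # chrominance only observed as skin
    score = {label: count * gaussian(y, mean, std)
             for label, (mean, std, count) in entry.items()}
    return score[True] > score[False]
```

At detection time each pixel costs one dictionary look-up and two density evaluations, which is the kind of fast search that storing the parameters in a table is meant to provide.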
To reduce training time in the face-classifier learning stage and to improve the poor multi-view face detection performance of the AdaBoost algorithm, the joint Haar-like feature and a Width-First-Search tree detector are proposed, respectively. The joint Haar-like feature is a new distinctive feature for detecting faces in images, based on the co-occurrence of multiple Haar-like features. Feature co-occurrence captures the structural similarities within the face class and makes it possible to construct an effective classifier. Whereas each weak classifier in AdaBoost is built from one feature, a weak classifier in our method can be built from n features. The joint Haar-like features are represented by combining the binary variables computed from multiple features, so they can be calculated very quickly and are robust to added noise and changes in illumination. A face detector is learned by stagewise selection of joint Haar-like features using AdaBoost. A small number of distinctive features achieves both computational efficiency and accuracy. Experimental results show that our detector yields higher classification performance than Viola and Jones' detector. Given the same number of features, our method reduces the error by 37%. Our detector is 2.6 times as fast as Viola and Jones' detector at the same performance. Face images are seldom upright and frontal, and existing methods suffer accuracy and speed penalties in rotated face detection. To detect faces with in-plane and out-of-plane rotation in still images or video sequences, we design the Width-First-Search tree detector structure to detect multi-view faces, covering roll with 3 angle intervals and yaw with 5 angle intervals.
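The binary coding of several Haar-like features into one joint feature value can be illustrated as below. This is a sketch under assumptions: only a two-rectangle (left minus right) feature is implemented, and the tuple layout `(top, left, height, width, threshold, polarity)` and the function names are hypothetical, not taken from the dissertation.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle using four table look-ups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of the window (width should be even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


def joint_code(ii, features):
    """Threshold each feature to one bit and concatenate the bits.

    n features give an integer in [0, 2**n), which can index a table of
    per-code statistics inside a weak classifier.
    """
    code = 0
    for top, left, height, width, threshold, polarity in features:
        value = two_rect_feature(ii, top, left, height, width)
        bit = 1 if polarity * value > polarity * threshold else 0
        code = (code << 1) | bit
    return code
```

Because each feature contributes only a thresholded bit, small perturbations of the raw feature values leave the code unchanged unless a value crosses its threshold, which is one way to read the abstract's claim of robustness to noise.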
(2) For face recognition, to address the shortcomings of PCA and LDA in classifying high-dimensional data under the small sample size problem, we present a new approach named multi-subspace linear discriminant analysis. We describe the samples of each class using a multi-subspace technique and the maximum scatter difference criterion, which reflects the between-class and within-class distribution of samples more accurately. The classification criterion is the confidence given by the Bayes rule rather than distance. Compared with other approaches, our method yields a piecewise linear feature subspace and is particularly well suited to difficult recognition problems where classes overlap heavily, or where prominent curvature in the data makes projection onto a single linear subspace inadequate. Experiments conducted on a subset of the IMDB face database indicate the effectiveness of the proposed method; in terms of recognition rate, a marked improvement is obtained. A framework is also proposed that fuses global and local information for face recognition. Numerous studies in psychophysics and neurophysiology have shown that humans generally identify a person from coarse to fine, using global and local information respectively. In contrast to relying only on a holistic representation of face appearance, local features can reduce intrapersonal variation through different mechanisms. More and more local feature descriptors are being designed and employed because of their robustness to different facial expressions, varying lighting conditions, and partial occlusions. It is therefore reasonable to expect better performance from combining global and local information. In our method, we build two classifiers that take full advantage of the Bayesian rule and combine them into a hierarchical ensemble.
Global features are extracted from whole face images by LDA, and local features are extracted by the Gabor wavelet transform and weighted according to the importance of each local region. First, the input image is identified by the global feature classifier, which yields k potential candidates according to the k-nearest-neighbors rule; then a score and rank are calculated for every candidate in terms of local features. To perform matching according to the richness of identity information rather than the size of a local area, and to handle the partial occlusion problem, the proposed method weights every sub-Gabor feature extracted from a local area by the importance of that area for face recognition. We evaluated the proposed method against other classical algorithms. Experiments on the Asia face databases show that our method achieves satisfactory performance not only under varied facial expression and lighting but also under varied pose and sample size. (3) To accurately extract the moving region in a video sequence, we propose a novel approach that updates the background model adaptively based on analysis of scene change. A temporary background concept is introduced, and all background content is divided into the original background and the temporary background. We estimate the mean and variance of the grey value of each pixel over a sequence of images to analyze the causes of variation in the scene. According to the 3σ rule of the normal distribution, changes in the background can be identified, including global and partial changes in lighting and mutual conversion between foreground and background, so that a different updating strategy and rate can be adopted for each case. To eliminate the shadow cast by the object itself, we estimate the shadow area from the property that the object and its shadow differ distinctly in luminance but only slightly in chrominance.
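The per-pixel 3σ test and a case-dependent update rate from item (3) can be sketched as follows. The abstract does not give the actual update rules, so the running-average update, the rates `slow` and `fast`, the `global_fraction` cut-off for a whole-scene lighting change, and the `min_std` floor are all assumptions chosen for illustration.

```python
import math


def pixel_stats(frames):
    """Per-pixel mean and standard deviation over a list of grey frames."""
    n = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    mean = [[sum(f[r][c] for f in frames) / n for c in range(w)] for r in range(h)]
    std = [[math.sqrt(sum((f[r][c] - mean[r][c]) ** 2 for f in frames) / n)
            for c in range(w)] for r in range(h)]
    return mean, std


def changed_mask(frame, mean, std, min_std=1.0):
    """Flag pixels that deviate from the background mean by more than 3 sigma."""
    return [[abs(frame[r][c] - mean[r][c]) > 3 * max(std[r][c], min_std)
             for c in range(len(frame[0]))] for r in range(len(frame))]


def update_background(frame, mean, mask, slow=0.05, fast=0.5, global_fraction=0.8):
    """Running-average update whose rate depends on the detected situation.

    If most pixels changed, treat it as a global lighting change and adapt
    quickly everywhere; otherwise freeze changed pixels (likely foreground)
    and adapt unchanged pixels slowly.
    """
    flags = [flag for row in mask for flag in row]
    is_global = sum(flags) / len(flags) >= global_fraction
    updated = []
    for r, row in enumerate(frame):
        new_row = []
        for c, value in enumerate(row):
            if is_global:
                rate = fast
            elif mask[r][c]:
                rate = 0.0
            else:
                rate = slow
            new_row.append(mean[r][c] + rate * (value - mean[r][c]))
        updated.append(new_row)
    return updated
```

The temporary-background bookkeeping and the shadow test on luminance and chrominance ratios are left out of this sketch.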
Finally, the gray image obtained after background subtraction is binarized with a dynamic threshold. (4) We introduce an augmented histogram of oriented gradients (HOG) feature to detect moving humans faster and more accurately. We enhance the discriminating power of the original HOG feature by adding human shape properties such as contour distances, symmetry, and gradient density. Based on the biological structure of the human shape, we impose the symmetry property on HOG features by computing the similarity between each feature and its symmetric pair and using it to weight the HOG features. We then extend the HOG feature in the spatial dimension to capture more global human features. After these changes, the feature describes humans much better than the original one, especially when the humans are moving across. Histogram similarity and the Fisher criterion are employed to measure the discriminability of all features, and a set of discriminative features is then selected to identify the human body. An SVM classifier is trained on the selected features from the target and the surrounding background. Experimental results show that the proposed approach is efficient and fast in pedestrian detection. The drawbacks of HOG-based human detection are the high dimensionality of the features and the slow detection speed, so a method based on HOG in regions of interest is proposed. HOG is calculated in six important regions located at the head and the limbs, which effectively reduces the dimensionality of the features. Experimental results show that this method speeds up detection while maintaining accuracy comparable to the method based on HOG. Finally, we study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for good results.
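Two building blocks named in item (4), a gradient-orientation histogram for one cell and Fisher-criterion ranking of candidate features, can be sketched as follows. This is an illustrative sketch only: the 9 unsigned orientation bins, the central-difference gradients, the two-class scalar form of the Fisher score, and the function names are assumptions, and the symmetry weighting and spatial extension of the dissertation are not reproduced.

```python
import math


def cell_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a 2-D list of grey values; border pixels are skipped because
    central differences need both neighbours.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(1, len(cell) - 1):
        for c in range(1, len(cell[0]) - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]
            gy = cell[r + 1][c] - cell[r - 1][c]
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle / bin_width), bins - 1)
            hist[index] += math.hypot(gx, gy)
    return hist


def fisher_score(pos, neg):
    """Squared mean difference over summed variances for one scalar feature."""
    mean_p, mean_n = sum(pos) / len(pos), sum(neg) / len(neg)
    var_p = sum((v - mean_p) ** 2 for v in pos) / len(pos)
    var_n = sum((v - mean_n) ** 2 for v in neg) / len(neg)
    return (mean_p - mean_n) ** 2 / (var_p + var_n + 1e-12)


def rank_features(features):
    """Sort feature names by descending Fisher score.

    `features` maps a name to (values on human samples, values on background).
    """
    return sorted(features, key=lambda name: fisher_score(*features[name]),
                  reverse=True)
```

Keeping only the top-ranked entries before training the SVM mirrors the selection step described in the abstract, with the number of features kept left as a free parameter.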
Language: Chinese
Ownership rank: 1
Content type: Dissertation
URI: http://ir.sia.cn/handle/173321/9256
Appears in Collections: Other_Dissertations

Files in This Item:
File Name/ File Size Content Type Version Access License
基于视频的人脸识别与运动人体检测研究.pdf (4924KB) ---- Restricted access; contact the repository to request the full text

Recommended Citation:
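易轶虎. Research of Face Recognition and Moving Human Detection Based on Video Sequence (基于视频的人脸识别和运动人体检测研究) [doctoral dissertation]. Shenyang Institute of Automation, Chinese Academy of Sciences. 2010.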

Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 
