Online Similarity Learning for Big Data with Overfitting
Cong Y(丛杨); Liu J(刘霁); Fan BJ(范保杰); Zeng P(曾鹏); Yu HB(于海斌); Luo JB(罗杰波)
Author department: Robotics Laboratory (机器人学研究室)
Keywords: Online Learning; Similarity Learning; Low Rank; Sparse Representation; Feature Selection; Overfitting; Redundancy
Journal: IEEE Transactions on Big Data
ISSN: 2332-7790
Publication year: 2018
Volume: 4, Issue: 1, Pages: 78-89
Indexed by: EI
EI accession number: 20181104888919
Ownership ranking: 1
Funding: NSFC (61375014, U1613214, 61533015), CAS Youth Innovation Promotion Association Scholarship (2012163), and the foundation of the Chinese Scholarship Council.
Abstract

In this paper, we propose a general model to address the overfitting problem in online similarity learning for big data, which is generally caused by two kinds of redundancy: 1) feature redundancy, that is, there exist redundant (irrelevant) features in the training data; and 2) rank redundancy, that is, the non-redundant (relevant) features lie in a low-rank space. To overcome these, our model is designed to obtain a simple and robust metric matrix by detecting the redundant rows and columns in the metric matrix and constraining the remaining matrix to a low-rank space. To reduce feature redundancy, we employ group sparsity regularization, i.e., the $\ell_{2,1}$ norm, to encourage a sparse feature set. To address rank redundancy, we adopt a low-rank regularization, the max norm, instead of calculating the SVD as in traditional models that use the nuclear norm. Therefore, our model not only generates a low-rank metric matrix to avoid overfitting but also achieves feature selection simultaneously. For model optimization, an online algorithm based on the stochastic proximal method is derived to solve this problem efficiently, with a complexity of $O(d^2)$. To validate the effectiveness and efficiency of our algorithm, we apply our model to online scene categorization and synthesized data, and conduct experiments on various benchmark datasets with comparisons to several state-of-the-art methods. Our model is as efficient as the fastest online similarity learning model, OASIS, while generally performing as well as the accurate model OMLLR. Moreover, our model can simultaneously exclude irrelevant or redundant feature dimensions.
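To make the group sparsity and stochastic proximal ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' algorithm: the bilinear similarity, the triplet hinge loss, the row-only grouping, and the names `prox_l21_rows`, `online_step`, `lr`, and `lam` are all assumptions made for illustration, and the max norm regularizer is left out entirely. It only shows how a row-wise $\ell_{2,1}$ proximal step can zero out whole rows of a metric matrix at $O(d^2)$ cost per update.

```python
import numpy as np

def prox_l21_rows(W, tau):
    """Proximal operator of tau * ||W||_{2,1}, treating each row as a group.

    Every row is shrunk toward zero by tau in Euclidean norm; a row whose
    norm is at most tau becomes exactly zero (the feature is dropped).
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return W * scale

def online_step(W, x, x_pos, x_neg, lr=0.1, lam=0.01):
    """One stochastic proximal step on an assumed triplet hinge loss.

    Assumed similarity: s(a, b) = a^T W b.
    Assumed loss: max(0, 1 - s(x, x_pos) + s(x, x_neg)).
    """
    margin = 1.0 - x @ W @ x_pos + x @ W @ x_neg
    if margin > 0:
        # Subgradient step: a rank-one update costing O(d^2).
        W = W + lr * np.outer(x, x_pos - x_neg)
    # Row-wise shrinkage, also O(d^2).
    return prox_l21_rows(W, lr * lam)

# Hypothetical usage on random data with d = 5 features.
rng = np.random.default_rng(0)
d = 5
W = np.eye(d)
for _ in range(100):
    x, x_pos, x_neg = rng.normal(size=(3, d))
    W = online_step(W, x, x_pos, x_neg)
```

The abstract describes removing both redundant rows and redundant columns; the sketch groups by rows only to stay short, and a symmetric treatment of columns would apply the same shrinkage to the transpose.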

Language: English
Document type: Journal article
Item identifier: http://ir.sia.cn/handle/173321/21394
Collection: Robotics Laboratory (机器人学研究室)
Corresponding author: Cong Y(丛杨)
Author affiliations:
1. State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
2. Department of Computer Science, University of Rochester, Rochester, NY 14611 USA
3. College of Automation, Nanjing University of Posts and Telecommunications, Nanjing, 210042 China
Recommended citation
GB/T 7714: Cong Y, Liu J, Fan BJ, et al. Online Similarity Learning for Big Data with Overfitting[J]. IEEE Transactions on Big Data, 2018, 4(1): 78-89.
APA: Cong Y, Liu J, Fan BJ, Zeng P, Yu HB, & Luo JB. (2018). Online Similarity Learning for Big Data with Overfitting. IEEE Transactions on Big Data, 4(1), 78-89.
MLA: Cong Y, et al. "Online Similarity Learning for Big Data with Overfitting." IEEE Transactions on Big Data 4.1 (2018): 78-89.
Files in this item
File name/size | Document type | Version | Access | License
Online Similarity Le(930KB) | Journal article | Author accepted manuscript | Open access | ODC PDDL
File name: Online Similarity Learning for Big Data with Overfitting.pdf
Format: Adobe PDF