Institutional Repository of Digital Factory Department
Title | 基于E-t-SNE的混合属性数据降维可视化方法 |
Alternative Title | Dimension reduction and visualization of mixed-type data based on E-t-SNE |
Author | 魏世超1,2,3,4; 李歆1,3,4 |
Department | Digital Factory Department (数字工厂研究室) |
Source Publication | Computer Engineering and Applications (计算机工程与应用) |
ISSN | 1002-8331 |
Year | 2020 |
Volume | 56 | Issue | 6 | Pages | 66-72 |
Indexed By | CSCD |
CSCD ID | CSCD:6707936 |
Contribution Rank | 1 |
Funding Organization | Shenyang Science and Technology Plan Project (No. Z18-5-102) |
Keyword | t-SNE algorithm; mixed-attribute data; dimensionality reduction; visualization |
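Abstract | The traditional t-distributed stochastic neighbor embedding (t-SNE) algorithm handles only data with a single attribute type and performs poorly on mixed-attribute data. To address this, an extended t-SNE dimensionality reduction and visualization algorithm, E-t-SNE, is proposed for mixed-attribute data. First, the method introduces information entropy to construct a distance matrix for the categorical attributes. Second, it combines the categorical-attribute distance with the Euclidean distance of the numerical attributes to build a distance matrix for the mixed-attribute data. Finally, the new distance matrix is fed into the t-SNE algorithm, which reduces the dimensionality of the data and displays it in two-dimensional space. In addition, to verify the effectiveness of the algorithm, the K-nearest neighbor (KNN) algorithm is used to evaluate the result of reducing the mixed data. Experiments on UCI datasets show that, on mixed-attribute data, the method not only visualizes well but also separates data of different classes into clusters after dimensionality reduction, improving the classification accuracy of subsequent classifiers. |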
Other Abstract | The traditional t-SNE algorithm can only deal with single-type attribute data and cannot handle mixed-type data well. To address this problem, an extended t-SNE dimensionality reduction and visualization algorithm named E-t-SNE is proposed to handle mixed-type data. Firstly, the concept of information entropy is introduced to construct the distance matrix of categorical data. Secondly, the distance matrix of mixed-type data is constructed by combining the distance between categorical data and the Euclidean distance of numerical data. Finally, the combined matrix is fed into the t-SNE algorithm to reduce the dimension of the data and display it in two-dimensional space. In addition, to verify the effectiveness of the algorithm, the K-Nearest Neighbor (KNN) algorithm is used to evaluate the result of reducing the mixed data. Experiments on UCI datasets show that, in dealing with mixed-attribute data, this method not only has good visualization ability but also effectively clusters data of different classes after dimensionality reduction and improves the classification accuracy of subsequent classifiers. |
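The abstract names the steps (an entropy-based distance for categorical attributes, combined with the Euclidean distance of numerical attributes) but gives no formulas. The sketch below is therefore only an illustration under stated assumptions, not the paper's E-t-SNE definition: each categorical attribute is weighted by its normalized Shannon entropy, a mismatch on that attribute adds its weight, and the two parts are simply summed. The function names `attribute_entropy` and `mixed_distance_matrix` are hypothetical.

```python
import math
from collections import Counter


def attribute_entropy(values):
    """Shannon entropy (in bits) of one categorical column."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mixed_distance_matrix(numeric_rows, categorical_rows):
    """Pairwise distances for mixed-attribute records.

    Assumed form (not taken from the paper): Euclidean distance on the
    numerical part plus an entropy-weighted mismatch count on the
    categorical part. Numerical attributes are assumed to be pre-scaled.
    """
    n = len(numeric_rows)
    columns = list(zip(*categorical_rows))
    entropies = [attribute_entropy(col) for col in columns]
    total = sum(entropies)
    if total > 0:
        # Assumption: weights are entropies normalized to sum to 1.
        weights = [h / total for h in entropies]
    else:
        # All columns constant (or no columns): fall back to uniform weights.
        weights = [1.0 / len(entropies)] * len(entropies)

    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d_num = math.dist(numeric_rows[i], numeric_rows[j])
            d_cat = sum(
                w
                for w, a, b in zip(weights, categorical_rows[i], categorical_rows[j])
                if a != b
            )
            dist[i][j] = dist[j][i] = d_num + d_cat
    return dist


if __name__ == "__main__":
    # Toy records: two numerical and two categorical attributes each.
    numeric = [(0.0, 0.0), (0.6, 0.8), (0.0, 0.0)]
    categorical = [("red", "a"), ("red", "b"), ("blue", "a")]
    for row in mixed_distance_matrix(numeric, categorical):
        print([round(d, 3) for d in row])
```

A symmetric matrix of this kind could then be passed to a t-SNE implementation that accepts precomputed distances, for example scikit-learn's `TSNE(metric="precomputed", init="random")`, and the two-dimensional embedding scored with a KNN classifier as the abstract describes.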
Language | Chinese |
Document Type | Journal article |
Identifier | http://ir.sia.cn/handle/173321/24933 |
Collection | Digital Factory Department (数字工厂研究室) |
Corresponding Author | 魏世超 |
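Affiliation | 1. Shenyang Institute of Automation, Chinese Academy of Sciences 2. University of Chinese Academy of Sciences 3. Key Laboratory of Networked Control Systems, Chinese Academy of Sciences 4. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences |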
Recommended Citation GB/T 7714 | 魏世超,李歆,张宜弛,等. 基于E-t-SNE的混合属性数据降维可视化方法[J]. 计算机工程与应用,2020,56(6):66-72. |
APA | 魏世超,李歆,张宜弛,周晓锋,&李帅.(2020).基于E-t-SNE的混合属性数据降维可视化方法.计算机工程与应用,56(6),66-72. |
MLA | 魏世超,et al."基于E-t-SNE的混合属性数据降维可视化方法".计算机工程与应用 56.6(2020):66-72. |
Files in This Item:
File Name/Size | DocType | Version | Access | License |
基于E_t_SNE的混合属性数据降维可视 (1725KB) | Journal article | Author accepted manuscript | Open access | CC BY-NC-SA |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.