Institutional Repository of Shenyang Institute of Automation, Chinese Academy of Sciences
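Title: 基于聚类的社区居民健康指数预测模型研究
Alternative title: Research on Prediction Model of Health Index for Community Residents Based on Clustering
Author: Zhang Yonglai (章永来)
Advisors: Shi Haibo (史海波); Zhou Xiaofeng (周晓锋)
Keywords: community health; machine learning; clustering; dimensionality reduction; prediction
Call number: O159/Z29/2015
Pages: 111
Degree discipline: Mechatronic Engineering
Degree type: Doctoral
Defense date: 2015-05-26
Degree-granting institution: Shenyang Institute of Automation, Chinese Academy of Sciences
Place of conferral: Shenyang Institute of Automation, Chinese Academy of Sciences
Author's department: Digital Factory Laboratory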
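Chinese abstract (translated): Community health services are an important part of China's public health services. They are the most basic link between individuals or families and the national health care system, and a relatively ideal model of care for meeting residents' basic health needs. Worldwide, the community health service model has developed very rapidly in recent decades. Urban community health services in China did not start late, but problems have emerged: medical resources are overly concentrated in large hospitals, medical costs keep rising, and care is hard to access and expensive. The focus of health care reform will therefore shift from a treatment-centered rehabilitation service system to a prevention-centered community health service system. Under these circumstances, further improving the community health service system is especially important and urgent, and it is also one of the most important breakthroughs for resolving the difficulties of China's health care system. At present, community health service centers in China are fairly well computerized, but the completeness of information system data and the utilization of that data need further improvement. Community hospitals have close and frequent contact with residents: annual physical examinations, vaccinations, prevention and treatment of minor illnesses, and management and treatment of chronic diseases can all be carried out there. As a result, community health service centers retain complete health-related records for the residents in their area, which makes a comprehensive evaluation of each resident's health level possible. Based on an analysis and consolidation of the information system database of one community health service center, this thesis predicts health indices for several common chronic diseases among community residents and comprehensively evaluates residents' health status, providing data support for evaluating and preventing common diseases. The research comprises six parts: analysis and construction of a disease health indicator system, preprocessing of medical data sets, dimensionality reduction methods, cluster analysis and outlier detection, sets of disease prediction models, and classification methods for assisted disease identification. The core idea is to run cluster analysis on the external feature data that a chronic disease presents in advance, use the clustering to analyze the internal classification mechanism of the disease, and thereby improve the accuracy of the chronic disease health index prediction models. Building on an architecture for chronic disease prediction models, the thesis focuses on an assisted diagnosis classification model for early liver disease and a risk prediction model for stroke. The main work is as follows:
(1) An architecture for chronic disease prediction models on health care data sets is studied, and the effectiveness of the method system is verified using early liver disease diagnosis and stroke risk prediction as examples. While studying the individual methods in the overall architecture (data preprocessing, cluster analysis, dimensionality reduction, classification, and prediction models), the health index indicator system for community residents is continually refined. The architecture is an organic whole whose parts are closely linked yet clearly distinct, and the direction of the medical data flow can be adjusted flexibly for the specific prediction problem.
(2) Because the quality of early feature extraction strongly affects the accuracy of the prediction models, a data preprocessing method suited to the community health service center database is selected. Using Visual Studio 2010 and MATLAB 2008R, the information system data are processed in three stages (data cleaning, data integration and transformation, and data reduction), yielding a 175 × 14 early liver disease feature data set, a 2343 × 28 stroke risk feature data set, and a 394 × 28 healthy population feature data set, which are then given a preliminary analysis.
(3) Classification methods for assisted diagnosis of early-stage chronic disease are one focus. Starting from extracted raw laboratory test data for early-stage chronic disease, an assisted diagnosis method is studied for imbalanced samples such as those from physical examinations. It has three main steps: dimensionality reduction for visualization, parameter optimization with the glowworm swarm optimization algorithm, and assisted diagnosis with support vector data description. This provides technical support for early detection and early treatment of chronic disease. The method's effectiveness is verified on assisted diagnosis of early liver disease.
(4) Sets of prediction models for chronic disease risk indices are a second focus. First, a chronic disease health index indicator system is established. Second, to address shortcomings of current feature selection algorithms, a new feature selection algorithm is studied that combines a measure of attribute characteristics with degree of usefulness. Third, based on the importance ranking from feature selection on the chronic disease risk data set, a multi-ellipsoid fast density clustering algorithm suited to chronic disease feature data is studied, combining the fast peak clustering algorithm based on density and distance with a hyper-ellipsoid algorithm that has good outlier detection capability. Fourth, to help identify patient types, an improved multiple-kernel support vector machine classification algorithm is studied for the clusters produced by cluster analysis. Finally, for the resulting clusters, a set of prediction models for the chronic disease risk health index is studied, based on support vector regression and extreme learning machines. The method system's effectiveness is verified on stroke risk prediction.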
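The abstract mentions a fast peak clustering algorithm based on density and distance. The sketch below shows the basic idea of density-peak clustering in plain Python, not the thesis's multi-ellipsoid variant. The function name `density_peaks`, the cutoff `dc`, and the rule of taking the `n_clusters` points with the largest density × distance product as centers are all assumptions made for illustration.

```python
import math

def density_peaks(points, dc, n_clusters):
    """Illustrative density-peak clustering with a cutoff kernel.

    points: list of coordinate tuples; dc: cutoff distance (assumed parameter);
    n_clusters: number of centers to pick (assumed selection rule).
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: how many other points lie strictly within dc.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc)
           for i in range(n)]
    # Rank by decreasing density; ties are broken by index so that
    # "higher density" is well defined even when densities are equal.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n   # distance to the nearest higher-ranked point
    parent = [-1] * n   # index of that nearest higher-ranked point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no higher-ranked neighbor;
            # by convention it gets the largest distance in its row.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j
    # Centers: points that are both dense and far from denser points.
    # The densest point always has the largest product, so it is always
    # selected and every parent chain ends at a labeled center.
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], i))[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Remaining points inherit the label of their nearest higher-ranked
    # point; processing in rank order guarantees that label already exists.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers

if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
           (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
    print(density_peaks(pts, dc=1.0, n_clusters=2))
```

Points with a large distance to any denser point but a very low density are natural outlier candidates, which may be one reason the abstract pairs this kind of clustering with an outlier detection step.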
English abstract: Community health services are an important part of public health services. They are the most basic link between individuals or families and the national health system, and a well-suited care model for meeting residents' basic health needs. Globally, the community health service model has grown very quickly over the last few decades. Urban community health services in China did not start late, but problems have emerged: medical resources are concentrated in large hospitals, medical costs keep growing, and treatment is difficult to obtain and expensive. Under these conditions, further improving the community health service system is particularly important and urgent, and it is one of the most important breakthroughs for resolving the difficulties of our health service system. Currently, community health service centers in China have a high degree of informatization. However, the data integrity of the information systems and the utilization of the data need further improvement. Community hospitals and residents are in close and frequent contact: annual physical examinations, vaccinations, prevention and treatment of minor illnesses, and management and treatment of chronic diseases can all be done in community hospitals. Community health service centers therefore retain the health records of local residents, which makes a comprehensive assessment of every resident's health possible. Based on an analysis and consolidation of the database of a community health service center information system, this thesis predicts health indices for some common chronic diseases and comprehensively evaluates the health of community residents, providing data support for evaluating and preventing common diseases among them.
The thesis consists of six parts: construction of a disease health index system, preprocessing of medical data sets, dimensionality reduction, cluster analysis and outlier detection, prediction models, and disease classification methods. The core idea is to use the external characteristics of a disease to analyze its internal mechanism, so that cluster analysis improves the accuracy of the disease health index prediction models. Building on an architecture for disease prediction models, the thesis focuses on a diagnosis and classification model for early liver disease and a risk prediction model for stroke. The main contents are as follows:
(1) A disease prediction model architecture based on health data sets. Alongside the study of preprocessing, cluster analysis, dimensionality reduction, classification, and prediction models, the health index system for community residents is continually improved. The predictive model architecture is an organic whole whose parts are closely linked yet distinct, and the direction of the data flow can be adjusted for specific problems.
(2) The quality of preprocessing has a far-reaching impact on the accuracy of the prediction models. Based on the database of a community health service center, the thesis adopts a suitable preprocessing method. Visual Studio 2010 and MATLAB 2008R are used for data cleansing, data integration and transformation, and data reduction. The thesis extracts an early liver disease data set (175 × 14), a high stroke risk data set (2343 × 28), and a healthy population data set (394 × 28), and then analyzes them.
(3) Classification models for early diagnosis of liver disease. Using the extracted raw liver test data set, the thesis studies a method for early diagnosis on imbalanced liver disease data sets. The method has three main steps: dimensionality reduction for visualization, parameter optimization using the glowworm swarm optimization algorithm, and support vector data description. It provides technical support for early detection and early treatment of liver disease.
(4) Risk prediction models for stroke. First, a health index for stroke is established. Second, a feature selection algorithm is studied that combines a measure of attribute characteristics with usefulness. Third, based on the feature importance results for stroke risk, a multi-ellipsoid fast density clustering algorithm is studied that combines the fast density peak clustering algorithm with the hyper-ellipsoid outlier detection algorithm. Fourth, to help identify patient types across clusters, an improved multiple-kernel support vector machine classification algorithm is studied, which achieved good results in testing. Finally, prediction models of the stroke health index based on support vector regression and extreme learning machines are studied for two clusters.
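The abstract's core idea, clustering first and then fitting one prediction model per cluster, can be sketched roughly as follows. This is a minimal illustration under stated assumptions: a single input feature, cluster labels already available, and ordinary least squares standing in for the support vector regression and extreme learning machine models named in the abstract. The helper names `fit_line`, `fit_model_set`, and `predict` are hypothetical.

```python
from statistics import fmean

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on one feature
    (a simple stand-in for the SVR / ELM regressors in the abstract)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    # Fall back to a flat line when all x values in the cluster coincide.
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx

def fit_model_set(xs, ys, labels):
    """Fit one regressor per cluster and remember each cluster's centroid."""
    models = {}
    for c in set(labels):
        cx = [x for x, lab in zip(xs, labels) if lab == c]
        cy = [y for y, lab in zip(ys, labels) if lab == c]
        models[c] = (fmean(cx), fit_line(cx, cy))
    return models

def predict(models, x):
    """Route x to the cluster with the nearest centroid, then apply
    that cluster's own regressor."""
    c = min(models, key=lambda k: abs(models[k][0] - x))
    a, b = models[c][1]
    return a * x + b

if __name__ == "__main__":
    xs = [0, 1, 2, 10, 11, 12]
    ys = [0, 2, 4, 50, 49, 48]
    labels = [0, 0, 0, 1, 1, 1]
    models = fit_model_set(xs, ys, labels)
    print(predict(models, 1.5), predict(models, 10.5))
```

In this toy data the two clusters follow opposite trends, so a single global regressor would fit both poorly, while the per-cluster models recover each trend. Nearest-centroid routing is only one possible choice; the abstract describes a separate classifier for assigning patients to clusters.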
Language: Chinese
Rights ranking: 1
Content type: Dissertation
URI: http://ir.sia.cn/handle/173321/16739
Appears in Collections: Digital Factory Laboratory_Dissertations

Files in This Item:
File Name/ File Size Content Type Version Access License
基于聚类的社区居民健康指数预测模型研究.pdf (2868KB) ---- Restricted access; contact to request full text
Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 
