Heart Disease Diagnosis via Nonparametric Mixture Models

Chipo Mufudza *

Department of Applied Mathematics, National University of Science and Technology, Corner Cecil Avenue and Gwanda Road, Bulawayo, Zimbabwe.

Hamza Erol

Department of Computer Engineering, Mersin University, Çiftlikkoy Campus, TR-33343, Mersin, Turkey.

*Author to whom correspondence should be addressed.


Abstract

Aims/Objectives: To predict heart disease effectively and efficiently using nonparametric mixture regression models.

Data Source: The data used in this paper are from the UCI database of the Cleveland Clinic Foundation for heart disease. The original data source contains 303 observations with 76 raw attributes each. For the purpose of this paper, only 14 attributes were used, as explained in Section 4.

Methodology: Cluster analysis was applied via mixture models in the form of nonparametric density-based models. The clusters were identified using a graph theory based technique. Voronoi diagrams were used, and their distributions were estimated nonparametrically through a mixture model with Gaussian kernels. The optimal number of clusters and the components of the identified clusters were determined, analysed and diagnosed using a density based silhouette information criterion. All data analysis and model diagnosis were performed in R using the pdfCluster package.
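
As a rough illustration of the Gaussian-kernel density estimation named in the methodology, the following is a minimal one-dimensional sketch in Python. It is not the authors' R analysis and does not reproduce the pdfCluster package; the function name `gaussian_kde`, the fixed bandwidth and the toy values are assumptions made only for this example.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a function that estimates the density of `sample`
    by averaging Gaussian kernels centred on each observation."""
    n = len(sample)
    # Normalising constant so that the estimate integrates to 1.
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density


# Hypothetical toy values with two groups (not taken from the Cleveland data).
toy = [1.0, 1.2, 0.9, 4.8, 5.1, 5.0]
f_hat = gaussian_kde(toy, bandwidth=0.5)

# The estimate is higher near the observed groups than in the gap between them.
print(f_hat(1.0) > f_hat(3.0))
```

In a density-based clustering approach of this kind, regions where such an estimate is high and connected are the candidates for clusters, and the valley between the two toy groups is what separates them.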

Results: Different numbers of components resulted in different numbers of clusters when nonparametric mixture models were applied to the heart disease data. However, using the density based silhouette information criterion, the optimal structure for heart disease risk was found to be two clusters with two components. Both clusters were well separated and well classified, as indicated by the absence of spurious clusters and by high positive density based silhouette values (see Figs. 2 and 4). Their properties are given in Table 2. This result holds regardless of the flexible, assumption-free conditions on the shape of the distribution, the number of components and the number of clusters.
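
To make the role of the density based silhouette values more concrete, here is a hedged Python sketch. It assumes the commonly stated form of the measure: for each observation, the log ratio of the estimated membership probability of its assigned group to that of the strongest competing group, scaled by the largest absolute log ratio in the sample. The function name, the input layout and the toy probabilities are hypothetical, and the paper's R computation may differ in detail.

```python
import math


def density_based_silhouette(tau, labels):
    """Compute density based silhouette values under the assumed definition.

    tau[i][m]  : estimated probability (> 0) that observation i belongs to group m
    labels[i]  : index of the group that observation i was assigned to
    """
    log_ratios = []
    for probs, m0 in zip(tau, labels):
        # Strongest competing group, excluding the assigned one.
        best_other = max(p for m, p in enumerate(probs) if m != m0)
        log_ratios.append(math.log(probs[m0] / best_other))
    # Scale so that every value lies in [-1, 1].
    scale = max(abs(r) for r in log_ratios)
    return [r / scale for r in log_ratios]


# Hypothetical membership probabilities for four observations and two groups.
tau = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.45, 0.55]]
labels = [0, 0, 1, 0]  # the last observation is deliberately misassigned
dbs = density_based_silhouette(tau, labels)
print(dbs)
```

Under this definition, values close to 1 indicate observations that sit firmly in their cluster, while values near zero or negative flag borderline or doubtful assignments, which is why high positive values are read as evidence of well-separated clusters.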

Conclusion: When nonparametric mixture models are used, individuals at risk of heart disease can be diagnosed as either high or low risk, depending on the dominant characteristics of a given individual. Those at high risk have attributes that make them progress to heart disease faster than those at low risk. Therefore, by classifying individuals into these categories, medical personnel can quickly diagnose heart disease and efficiently identify the characteristics associated with each category.

Keywords: Density based silhouette information, heart disease, Kernel Density Estimator, Nonparametric Mixture Models.


How to Cite

Mufudza, Chipo, and Hamza Erol. 2018. “Heart Disease Diagnosis via Nonparametric Mixture Models”. Journal of Advances in Mathematics and Computer Science 27 (5):1-17. https://doi.org/10.9734/JAMCS/2018/40440.
