Document Type: Original Research Paper

Authors

1 Department of Computer Science, Birjand University of Technology, Birjand, Iran.

2 Department of Electrical Engineering, Faculty of Engineering, University of Birjand, Birjand, Iran.

Abstract

Background and Objectives: In high-volume, high-dimensional datasets, the distribution of samples becomes sparse, and problems such as redundant or irrelevant features arise. Dimensionality reduction techniques exploit the correlation between features to map the original features into a lower-dimensional space. This usually reduces the computational burden and improves performance. In this paper, we study the problem of predicting heart disease when the dataset is large and/or the proportion of instances belonging to one class is significantly lower than that of the others.
Methods: We investigated the effect of three prominent dimensionality reduction techniques, Principal Component Analysis (PCA), Information Bottleneck (IB), and Variational Autoencoder (VAE), on popular classification algorithms. To provide the classifier with adequate samples in all classes, an efficient data balancing technique is used to compensate for there being fewer positives than negatives. Among the available data balancing methods, a SMOTE-based method is selected, which generates new samples at the boundary of the sample distribution and avoids synthesizing noisy and redundant data.
Results: The experimental results show that the VAE-based method outperforms the other dimensionality reduction algorithms on the performance measures. The proposed hybrid method improves accuracy to 97.1% and sensitivity to 99.2%.
Conclusion: Combining VAE with oversampling algorithms can significantly improve both system performance and computational time.
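
To make the oversampling idea in the Methods concrete, the following is a minimal sketch of basic SMOTE-style interpolation using only the Python standard library. It is not the boundary-focused SMOTE variant the abstract refers to, and it is not the authors' implementation; the function name `smote`, its parameters, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE;
    an illustrative assumption, not the paper's boundary-based variant)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (hypothetical): 3 positive cases vs. 8 negative cases, 2 features.
positives = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]
negatives = [[5.0 + 0.1 * i, 5.0 - 0.1 * i] for i in range(8)]

# Add just enough synthetic positives to match the number of negatives.
new_points = smote(positives, n_new=len(negatives) - len(positives), k=2)
balanced_positives = positives + new_points
print(len(balanced_positives), len(negatives))
```

Because each synthetic point is a convex combination of two real minority samples, it always lies on the segment between them. In a full pipeline, a step like this would typically be applied to the training split only, before or after dimensionality reduction depending on the design being evaluated.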

Open Access

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit: http://creativecommons.org/licenses/by/4.0/

 

Publisher’s Note

JECEI Publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

 

Publisher

Shahid Rajaee Teacher Training University

