Explainable machine learning for coronary artery disease risk assessment and prevention

Louridi Nabaouia; Douzi Samira; El Ouahidi Bouabid

doi:10.56294/dm202365

Original

Published: 2023-12-29

Explainable machine learning for coronary artery disease risk assessment and prevention

Abstract

Coronary artery disease (CAD) is an increasingly prevalent condition that significantly affects both longevity and quality of life. Lifestyle, genetics, nutrition, and stress all contribute to rising mortality rates. Because CAD is preventable through early intervention and lifestyle changes, low-cost automated solutions are needed to detect it early and help healthcare professionals treat chronic diseases efficiently. Machine learning is increasingly used in medicine because of its ability to detect patterns in data, and using it to classify the occurrence of CAD could help doctors reduce misinterpretation. This study develops a machine learning-based CAD diagnosis system. Using patient medical records, we show how machine learning can help identify whether an individual will develop CAD, and we highlight the most critical risk factors. We used two machine learning classifiers, CatBoost and LightGBM, to predict whether a patient has CAD. To address class imbalance, we employed data augmentation methods, including Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). Optuna was used to optimize hyperparameters. The proposed method was tested on the real-world Z-Alizadeh Sani dataset. The results were satisfactory: combining CatBoost with VAE predicted the likelihood of cardiovascular disease in an individual with good accuracy compared with the other approaches. The model was evaluated using accuracy, recall, F-score, precision, and the ROC curve. Finally, we used SHAP values and Boruta Feature Selection (BFS) to identify the essential risk factors for CAD.
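The threshold-based metrics named in the abstract (accuracy, precision, recall, F-score) all follow from the counts of a binary confusion matrix. The sketch below is a minimal illustration in plain Python, not the authors' evaluation code: the function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are assumptions made for this example, and the F-score is taken to be F1, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 from binary labels.

    Hypothetical helper for illustration; `positive` marks the CAD class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels (made up, not taken from the Z-Alizadeh Sani dataset).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are 4 true positives, 1 false positive, 1 false negative, and 2 true negatives, so accuracy is 0.75 and precision, recall, and F1 are each approximately 0.8. On an imbalanced dataset, recall on the CAD class is often more informative than accuracy alone, which is one reason to report the metrics together.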

Keywords: Coronary Artery Disease, Explainable Machine Learning, Risk Factors, Data Augmentation

How to Cite

Nabaouia L, Samira D, Ouahidi Bouabid E. Explainable machine learning for coronary artery disease risk assessment and prevention. Data and Metadata [Internet]. 2023 Dec. 29 [cited 2024 Jul. 26];2:65. Available from: https://dm.saludcyt.ar/index.php/dm/article/view/65

Copyright Notice

The article is distributed under the Creative Commons Attribution 4.0 License. Unless otherwise stated, associated published material is distributed under the same licence.

Vol. 2 (2023)
