Analysis of Heart Disease Dataset
The objective of this project is to predict the presence or absence of heart disease in a patient using a subset of 14 attributes from the Heart Disease dataset. The dataset contains patient information from four different locations, and it includes 76 attributes, of which 14 are commonly used for analysis.
Heart disease is a leading cause of death worldwide, and early detection is essential to prevent or mitigate its impact. Machine learning techniques have been used to build predictive models that identify individuals at risk of developing heart disease by learning patterns from data on lifestyle, medical history, and other factors. The goal here is to develop such a model and have it predict the risk of heart disease accurately.
Problem Statement
The task is binary classification: given the 14 commonly used attributes of a patient record, predict whether heart disease is present. The "target" field in the dataset indicates the presence of heart disease, with a value of 0 indicating no disease and 1 indicating the presence of the disease.
The project will convert each record into a dense feature vector, in both a supervised (labeled) and an unsupervised (unlabeled) version. The data will then be transformed into a DataFrame, and categorical labels and variables will be handled so that the models can consume them (for example, by encoding them numerically). The data will be split into training and test sets, and the training set will be used to fit three classification models: Random Forest, Decision Tree, and Naive Bayes.
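The split step described above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual code: the helper name split_train_test, the 30% test fraction, the seed, and the toy rows are all assumptions made for the example.

```python
import random


def split_train_test(rows, test_fraction=0.3, seed=42):
    """Shuffle rows reproducibly and split them into (train, test) lists.

    Hypothetical helper for illustration; the fraction and seed are assumptions.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                   # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # a fixed seed gives a repeatable split
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Toy records of the form (feature_vector, target); made-up values, not real patient data.
rows = [([float(i), float(i % 3)], i % 2) for i in range(10)]
train, test = split_train_test(rows, test_fraction=0.3, seed=42)
print(len(train), "training rows,", len(test), "test rows")
```

Fixing the seed keeps the split identical across runs, so the three models are compared on the same test records.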
The trained models will then make predictions on the test set and will be evaluated using accuracy, the fraction of test records classified correctly, as the metric. Finally, the results will be visualized to compare the accuracy of the three models.
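As a rough sketch of the evaluation step, accuracy can be computed and compared across the three models as follows. The labels and predictions are invented for illustration and are not results from this project. The accuracy helper and the text bar chart are likewise assumptions, standing in for whichever evaluation and plotting tools the project uses.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Invented test labels and model outputs (0 = no disease, 1 = disease).
y_test = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "Random Forest": [1, 0, 1, 1, 0, 1, 1, 0],
    "Decision Tree": [1, 0, 0, 1, 0, 1, 1, 0],
    "Naive Bayes": [1, 1, 0, 1, 0, 1, 1, 0],
}

scores = {name: accuracy(y_test, y_pred) for name, y_pred in predictions.items()}

# Simple text bar chart, best model first; each '#' is worth 5 percentage points.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<14} {score:.3f} {'#' * round(score * 20)}")
```

Accuracy is easy to interpret, but it treats both kinds of error equally, which is worth keeping in mind if the two classes are imbalanced.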
The project aims to provide an accurate classification model for the prediction of heart disease, which can assist healthcare professionals in diagnosing and treating the condition in a timely and effective manner.