Classification is a supervised machine learning method in which a model tries to predict the correct label for a given input. One algorithm used for classification is the “Naïve Bayes Classifier”.
Here, we use the drug200.csv dataset for classification; you can download it from the link at the top of this page. This dataset has 5 features:
“Age”, “Sex”, “Blood Pressure”, “Cholesterol” and “Sodium to Potassium Ratio”
The label for each item in the dataset is the type of drug you should take, which has 5 possible values:
“drugX”, “DrugY”, “drugA”, “drugB” and “drugC”.
In the form above, you enter the values of the 5 features, and the model returns the type of drug you should take in a pop-up. Because 3 features of our dataset are categorical (Sex, Blood Pressure and Cholesterol) and 2 are numerical (Age and Sodium to Potassium Ratio), we first convert the categorical features to numerical labels so that the Naïve Bayes Classifier can be applied to this mixed dataset.
Note that the values and ranges of each feature are as follows:
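As a rough illustration of how a Naïve Bayes classifier can work once every feature is numeric, here is a minimal, dependency-free sketch of Gaussian Naïve Bayes. It is not the training code linked from this page: the choice of the Gaussian variant, the function names `fit_gaussian_nb` and `predict_gaussian_nb`, and the toy rows are all assumptions made for illustration. The rows are invented, not taken from drug200.csv. Each row is assumed to be ordered as Age, Sex, Blood Pressure, Cholesterol, Sodium to Potassium Ratio, with the categorical features already converted to numeric codes.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels, var_floor=1e-9):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))  # one tuple per feature
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance positive if a feature is constant in a class
        variances = [
            sum((x - m) ** 2 for x in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(rows)), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the label with the highest log posterior for one encoded row."""
    def log_posterior(params):
        log_prior, means, variances = params
        # Naive assumption: features are independent given the class,
        # so the per-feature Gaussian log densities are simply summed.
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(row, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


# Invented toy rows (NOT from drug200.csv), already numerically encoded.
toy_rows = [
    [23, 0, 0, 0, 25.4], [47, 1, 1, 0, 18.0], [61, 0, 2, 1, 30.2],
    [28, 0, 2, 0, 7.8], [50, 1, 1, 1, 10.1], [35, 1, 2, 0, 12.3],
]
toy_labels = ["DrugY", "DrugY", "DrugY", "drugX", "drugX", "drugX"]

model = fit_gaussian_nb(toy_rows, toy_labels)
print(predict_gaussian_nb(model, [40, 1, 1, 0, 28.0]))
```

In practice a library implementation, such as scikit-learn's `GaussianNB`, would normally replace the two functions above. Note that the Gaussian variant treats the integer codes of the categorical features as if they were continuous values, which is a simplification.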
15 < Age < 75
Sex (0 → Female , 1 → Male)
Blood Pressure (0 → HIGH , 1 → LOW, 2 → NORMAL)
Cholesterol (0 → HIGH , 1 → NORMAL)
Sodium to Potassium Ratio (Between 6.2 and 38.3)
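The codes and ranges listed above can be sketched as a small encoding step that turns one form submission into a numeric feature row. This is only an illustration: the function name `encode_patient`, the input spellings ("Female", "Male", "HIGH", ...) and the choice to treat the Sodium to Potassium bounds as inclusive are assumptions, not details taken from the page's actual code.

```python
# Numeric codes copied from the list above.
SEX = {"Female": 0, "Male": 1}
BLOOD_PRESSURE = {"HIGH": 0, "LOW": 1, "NORMAL": 2}
CHOLESTEROL = {"HIGH": 0, "NORMAL": 1}


def _code(mapping, value, name):
    """Look up a categorical code, failing with a readable error."""
    if value not in mapping:
        raise ValueError(f"{name} must be one of {sorted(mapping)}, got {value!r}")
    return mapping[value]


def encode_patient(age, sex, blood_pressure, cholesterol, na_to_k):
    """Validate one form submission and return it as a numeric feature row."""
    if not 15 < age < 75:
        raise ValueError("Age must satisfy 15 < Age < 75")
    # Assumption: the stated bounds 6.2 and 38.3 are treated as inclusive.
    if not 6.2 <= na_to_k <= 38.3:
        raise ValueError("Sodium to Potassium Ratio must be between 6.2 and 38.3")
    return [
        age,
        _code(SEX, sex, "Sex"),
        _code(BLOOD_PRESSURE, blood_pressure, "Blood Pressure"),
        _code(CHOLESTEROL, cholesterol, "Cholesterol"),
        na_to_k,
    ]


print(encode_patient(23, "Female", "HIGH", "HIGH", 25.4))
```

The codes follow the alphabetical order of the category names, which is also the order a label encoder such as scikit-learn's `LabelEncoder` assigns. Whatever encoding is used at training time must be reused unchanged at prediction time.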
The sample code used to train the “Naïve Bayes” classification model is provided in the link at the top of this page. Each time you submit this form and a prediction is made, the values are stored in the database. The “result” link at the top of this page shows the model's previous predictions, with your most recent run added to the end of the list.