Naive Bayes in Python
Model Evaluation Key Ideas
- It’s necessary to first check for correlation between variables, since Naïve Bayes assumes the features are conditionally independent
- If a variable is highly correlated with others, remove it before modeling, or the model’s performance may suffer (see the correlation-check sketch after this list)
- Attempt to classify NFL games as going over or under the projected number of total points scored by both teams
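Because Naïve Bayes assumes its features are conditionally independent, a quick correlation check can flag redundant variables before modeling. Below is a minimal sketch of such a check; the file path and DataFrame name are illustrative assumptions, not the project’s actual code.

```python
import numpy as np
import pandas as pd

# Hypothetical path to the prepped dataset.
df = pd.read_csv("nfl_games_prepped.csv")

# Pairwise correlations between the numeric features.
corr = df.corr(numeric_only=True)

# Keep only the upper triangle so each pair is listed once,
# then report pairs whose absolute correlation exceeds 0.8.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
pairs = upper.stack()
print(pairs[pairs.abs() > 0.8])
```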
Modeling (Multinomial Naïve Bayes Default)
The Multinomial Naïve Bayes model uses a multinomial probability distribution, which means it is designed to work best for classifying discrete (count-like) features. Since multinomial distributions cannot contain negative values, none of the variables used in modeling can be negative either. Multinomial Naïve Bayes generally works best for text classification, but it is tested here anyway. To begin, a model was created using all of the variables and default parameters. This acts as the baseline model to be improved with hyperparameter tuning.
The confusion matrix and evaluation metrics can be viewed below.
- Accuracy: 0.51
- ROC AUC: 0.54
- Precision: 0.51
- Precision (Over): 0.50
- Precision (Under): 0.51
- Recall: 0.50
- Recall (Over): 0.22
- Recall (Under): 0.79
- F1: 0.46
- F1 (Over): 0.31
- F1 (Under): 0.62
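These numbers could be produced along the following lines, continuing the hypothetical sketch above (the ‘Over’/‘Under’ label values are assumed).

```python
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_auc_score)

# Rows of the confusion matrix are true classes, columns are predictions.
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))  # per-class precision/recall/F1
print(accuracy_score(y_test, y_pred))

# ROC AUC uses the predicted probability of one class ('Over' here).
over_col = list(mnb.classes_).index("Over")
proba_over = mnb.predict_proba(X_test)[:, over_col]
print(roc_auc_score((y_test == "Over").astype(int), proba_over))
```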
Overall, this baseline Multinomial Naïve Bayes model does a poor job of classifying whether the number of points went over or under the total. With an accuracy of 51%, it barely does better than random guessing. Notably, it often predicts ‘Under’ when it should predict ‘Over’ (Over recall is only 0.22), and this is where the majority of the accuracy is being lost. Next, hyperparameter tuning will be used to try to improve the accuracy of the model.
Hyperparameter Tuning of Baseline Model
Now that a baseline model has been created, hyperparameter tuning will be implemented to determine the optimal parameters with respect to the accuracy of the model. GridSearchCV, RandomizedSearchCV, and Bayesian optimization will be used.
GridSearchCV
Performing GridSearchCV to find the optimal parameters returned the following:
- alpha = 4
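A sketch of what that search might look like; the alpha grid, fold count, and variable names are illustrative assumptions.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB

# Search over the smoothing parameter alpha, scoring on accuracy.
param_grid = {"alpha": [0.01, 0.1, 0.5, 1, 2, 4, 8, 16]}
grid = GridSearchCV(MultinomialNB(), param_grid, scoring="accuracy", cv=5)
grid.fit(X_train, y_train)

print(grid.best_params_)          # the write-up reports alpha = 4
tuned_mnb = grid.best_estimator_  # already refit on the full training set
```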
RandomizedSearchCV
Coming Soon…
Bayesian Optimization
Coming Soon…
Modeling (GridSearchCV Tuned Model)
The confusion matrix and evaluation metrics for the GridSearchCV tuned model can be viewed below.
- Accuracy: 0.52
- ROC AUC: 0.54
- Precision: 0.51
- Precision (Over): 0.51
- Precision (Under): 0.52
- Recall: 0.51
- Recall (Over): 0.31
- Recall (Under): 0.71
- F1: 0.49
- F1 (Over): 0.39
- F1 (Under): 0.60
There is only a slight improvement in accuracy from the baseline to the GridSearchCV-tuned model. This indicates that tuning the parameters does not have much of an effect on the accuracy of the model and that Multinomial Naïve Bayes is not effective for this task.
Cross Validation (GridSearchCV Tuned Model)
To verify that the accuracy of the model is similar to what was obtained by using one random train/test split, cross-validation will be performed. KFold cross-validation can be used (rather than a stratified variant) since the label distribution is approximately equal (48.5% Over, 51.5% Under). The following cross-validation used 10 folds; a sketch of the setup appears below, followed by the per-fold accuracies.
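A minimal sketch of the 10-fold run, reusing the hypothetical X, y, and the tuned alpha from above.

```python
from sklearn.model_selection import KFold, cross_val_score
from sklearn.naive_bayes import MultinomialNB

# Plain (unstratified) KFold is acceptable here because the
# Over/Under classes are already close to balanced.
kf = KFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(MultinomialNB(alpha=4), X, y, cv=kf,
                         scoring="accuracy")

for i, score in enumerate(scores, start=1):
    print(f"Fold {i}: {score:.2f}")
print(f"Mean Accuracy: {scores.mean():.2f}")
```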
Fold 1: 0.52
Fold 2: 0.53
Fold 3: 0.59
Fold 4: 0.53
Fold 5: 0.51
Fold 6: 0.51
Fold 7: 0.49
Fold 8: 0.51
Fold 9: 0.50
Fold 10: 0.56
Mean Accuracy: 0.52
The accuracies are fairly similar across the folds, which is ideal. It means the model performs consistently regardless of how the training data is split, so there is little risk of overfitting to a particular split. The mean cross-validation accuracy, 52%, can therefore be used as a reasonable estimate of how the model will perform on new data.
Modeling (Default Bernoulli Naive Bayes)
The Bernoulli Naïve Bayes model uses a Bernoulli probability distribution, which means it is designed to work best for classifying binary features. Since total_result is a binary label, it is expected that Bernoulli Naïve Bayes will perform better than Multinomial Naïve Bayes. To begin, a model was created using all of the variables and default parameters. This acts as the baseline model to be improved with hyperparameter tuning.
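Only the estimator class changes from the earlier sketch; note that scikit-learn’s BernoulliNB thresholds non-binary inputs at its binarize parameter (0.0 by default).

```python
from sklearn.naive_bayes import BernoulliNB

# Same hypothetical train/test split as before. BernoulliNB treats
# each feature as binary, thresholding values at `binarize` (default 0.0).
bnb = BernoulliNB().fit(X_train, y_train)
y_pred = bnb.predict(X_test)
```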
The confusion matrix and evaluation metrics can be viewed below.
- Accuracy: 0.52
- ROC AUC: 0.54
- Precision: 0.51
- Precision (Over): 0.51
- Precision (Under): 0.52
- Recall: 0.51
- Recall (Over): 0.31
- Recall (Under): 0.71
- F1: 0.49
- F1 (Over): 0.39
- F1 (Under): 0.60
Overall, this baseline Bernoulli Naïve Bayes model also does a poor job of classifying whether the number of points went over or under the total. With an accuracy of 52%, it barely does better than random guessing. As with the Multinomial model, it often predicts ‘Under’ when it should predict ‘Over’ (Over recall is only 0.31), and this is where the majority of the accuracy is being lost. Next, hyperparameter tuning will be used to try to improve the accuracy of the model.
Hyperparameter Tuning of Baseline Model
Now that a baseline model has been created, hyperparameter tuning will be implemented to determine the optimal parameters with respect to the accuracy of the model. GridSearchCV, RandomizedSearchCV, and Bayesian optimization will be used.
GridSearchCV
Performing GridSearchCV to find the optimal parameters returned the following:
- alpha = 4
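For BernoulliNB the grid could also cover the binarize threshold alongside alpha; the values below are illustrative.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import BernoulliNB

# Tune smoothing (alpha) and the feature-binarization threshold.
param_grid = {
    "alpha": [0.01, 0.1, 0.5, 1, 2, 4, 8, 16],
    "binarize": [0.0, 0.5, 1.0],
}
grid = GridSearchCV(BernoulliNB(), param_grid, scoring="accuracy", cv=5)
grid.fit(X_train, y_train)
print(grid.best_params_)  # the write-up reports alpha = 4
```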
RandomizedSearchCV
Coming Soon…
Bayesian Optimization
Coming Soon…
Modeling (GridSearchCV Tuned Model)
The confusion matrix and evaluation metrics for the GridSearchCV tuned model can be viewed below.
- Accuracy: 0.52
- ROC AUC: 0.54
- Precision: 0.51
- Precision (Over): 0.51
- Precision (Under): 0.52
- Recall: 0.51
- Recall (Over): 0.31
- Recall (Under): 0.71
- F1: 0.49
- F1 (Over): 0.39
- F1 (Under): 0.60
There is no change in accuracy from the baseline to the GridSearchCV-tuned model. This indicates that tuning the parameters does not have much of an effect on the accuracy of the model and that Bernoulli Naïve Bayes is not effective for this task.
Cross Validation (GridSearchCV Tuned Model)
To verify that the accuracy of the model is similar to what was obtained by using one random train/test split, cross-validation will be performed, following the same setup as before. KFold cross-validation can be used since the label distribution is approximately equal (48.5% Over, 51.5% Under). The following cross-validation used 10 folds.
Fold 1: 0.51
Fold 2: 0.54
Fold 3: 0.57
Fold 4: 0.52
Fold 5: 0.52
Fold 6: 0.52
Fold 7: 0.48
Fold 8: 0.50
Fold 9: 0.48
Fold 10: 0.54
Mean Accuracy: 0.52
The accuracies are fairly similar across the folds, which is ideal. It means the model performs consistently regardless of how the training data is split, so there is little risk of overfitting to a particular split. The mean cross-validation accuracy, 52%, can therefore be used as a reasonable estimate of how the model will perform on new data.
Naive Bayes in R (Update in Progress)
The Naive Bayes model was created using all of the variables in the prepped dataset.
The following variables were used in the model:
- off_def_diff
- total_line
- wind
- temp
- away_total_defense_rank
- home_total_offense_rank
- home_total_defense_rank
- surface
- total_qb_elo
- away_total_offense_rank
- total_team_elo
- roof
- game_type
- div_game
- weekday
- home_rest
- location
- away_rest
Model Accuracy: 0.508
This accuracy can be examined in more detail through the resulting confusion matrix below.
Cross-Validation of Reduced Model
Cross-validation is used to determine how the model performs across multiple different training and testing sets, which gives a more reliable estimate of model performance. Because the training and testing sets are randomly sampled from the original dataset, a single split can produce a model that performs above or below average. Cross-validation trains and tests the model repeatedly, resampling each time, which reduces how much any one random split can affect the measured performance. K-fold cross-validation was performed on the reduced model with K equal to 10 (the model is assessed 10 times and the average accuracy is taken).
10 Fold CV Accuracy of Model: 0.527
Naive Bayes in Python Results
Coming Soon…