Sports Betting Neural Network Results

Initial ANN Model

Architecture:

The initial model consists of two hidden layers and an output layer. It is intentionally kept simple to serve as a baseline; an assumed Keras reconstruction follows the summary below.

Layer (type)          Output Shape    Activation    Param #
=================================================================
dense_33 (Dense)      (None, 8)       sigmoid       184
dense_34 (Dense)      (None, 16)      relu          144
dense_35 (Dense)      (None, 1)       sigmoid       17
=================================================================
Total params: 345 (1.35 KB)
Trainable params: 345 (1.35 KB)
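
For reference, a minimal Keras sketch of what this baseline presumably looks like. The input dimension of 22 is inferred from the first layer's parameter count (184 = 8 × (22 + 1)); the optimizer and binary cross-entropy loss are assumptions, since neither appears in the summary above.

from tensorflow import keras
from tensorflow.keras import layers

# Assumed reconstruction of the baseline; input_dim inferred from param counts
model = keras.Sequential([
    layers.Dense(8, activation="sigmoid", input_dim=22),  # 8*(22+1) = 184 params
    layers.Dense(16, activation="relu"),                  # 16*(8+1) = 144 params
    layers.Dense(1, activation="sigmoid"),                # 1*(16+1) = 17 params
])
model.compile(optimizer="adam",            # assumption: optimizer not reported
              loss="binary_crossentropy",  # assumption: consistent with the sigmoid output
              metrics=["accuracy"])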

Epochs: 200 total

Epoch 1/200
154/154 [==============================] - 2s 6ms/step - loss: 0.6940 - accuracy: 0.5022 - val_loss: 0.6936 - val_accuracy: 0.5292
Epoch 2/200
154/154 [==============================] - 0s 3ms/step - loss: 0.6934 - accuracy: 0.5020 - val_loss: 0.6927 - val_accuracy: 0.5292
...
Epoch 199/200
154/154 [==============================] - 0s 2ms/step - loss: 0.6851 - accuracy: 0.5473 - val_loss: 0.6905 - val_accuracy: 0.5179
Epoch 200/200
154/154 [==============================] - 0s 2ms/step - loss: 0.6841 - accuracy: 0.5524 - val_loss: 0.6934 - val_accuracy: 0.5097

Loss Plot:

Accuracy Plot:
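
The plots themselves are not reproduced in this text version. For reference, a minimal sketch of how such plots are typically generated from the History object returned by model.fit; the history variable and the fit arguments here are assumptions:

import matplotlib.pyplot as plt

# history = model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=200)
for metric in ("loss", "accuracy"):
    plt.figure()
    plt.plot(history.history[metric], label=f"train {metric}")
    plt.plot(history.history[f"val_{metric}"], label=f"val {metric}")
    plt.xlabel("epoch")
    plt.ylabel(metric)
    plt.legend()
    plt.show()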

Test Loss: 0.7002

Test Accuracy: 0.5081

Summary:

Looking at the loss and accuracy plots, neither metric changes meaningfully over 200 epochs: training accuracy only creeps from about 0.50 to 0.55, and validation accuracy stays near chance. This indicates the model isn't learning and is underfitting. A vanishing-gradient issue is also possible (the first hidden layer uses a sigmoid activation), but since this is a deliberately simple architecture, underfitting is the much more likely explanation. For the final model, the main focus will be on increasing the architecture's complexity enough for learning to occur, but not so much that it causes overfitting.

Intermediate ANN Model

Architecture:

Layer (type)            Output Shape    Activation    Param #
=================================================================
dense_136 (Dense)       (None, 100)     relu          2300
dense_137 (Dense)       (None, 100)     relu          10100
dropout_44 (Dropout)    (None, 100)     -             0
dense_138 (Dense)       (None, 1)       sigmoid       101
=================================================================
Total params: 12501 (48.83 KB)
Trainable params: 12501 (48.83 KB)
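
A matching sketch of the intermediate architecture, again an assumed reconstruction. The dropout rate is not recoverable from the summary (dropout layers have no parameters), so 0.5 below is a placeholder:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(100, activation="relu", input_dim=22),  # 100*(22+1) = 2300 params
    layers.Dense(100, activation="relu"),                # 100*(100+1) = 10100 params
    layers.Dropout(0.5),                                 # rate assumed; not shown in the summary
    layers.Dense(1, activation="sigmoid"),               # 1*(100+1) = 101 params
])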

Epochs:

Epoch 1/200
154/154 [==============================] - 2s 5ms/step - loss: 0.6963 - accuracy: 0.5071 - val_loss: 0.6918 - val_accuracy: 0.5211
Epoch 2/200
154/154 [==============================] - 1s 4ms/step - loss: 0.6928 - accuracy: 0.5162 - val_loss: 0.6940 - val_accuracy: 0.5211
...
Epoch 199/200
154/154 [==============================] - 1s 5ms/step - loss: 0.5612 - accuracy: 0.6865 - val_loss: 0.9260 - val_accuracy: 0.5211
Epoch 200/200
154/154 [==============================] - 1s 6ms/step - loss: 0.5579 - accuracy: 0.6871 - val_loss: 0.9127 - val_accuracy: 0.5227

Loss Plot:

Accuracy Plot:

Test Loss: 0.8502

Test Accuracy: 0.5617

Summary:

Using a more complex architecture produced a much higher accuracy on the training set (0.6871 by epoch 200). However, the loss and accuracy plots clearly show the model is now overfitting: training loss keeps falling while validation loss climbs, and validation accuracy never meaningfully improves after approximately the 20th epoch. The model's complexity needs to be reduced, since a single dropout layer was not enough on its own. This is addressed in the final model.
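
Since validation loss starts climbing early, a standard Keras safeguard worth noting is early stopping on validation loss. Nothing in the logs indicates it was used in this project; this is only a sketch:

from tensorflow import keras

early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                           restore_best_weights=True)
# history = model.fit(X_train, y_train, validation_data=(X_val, y_val),
#                     epochs=200, callbacks=[early_stop])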

Final ANN Model

Architecture:

Layer (type)            Output Shape    Activation    Param #
=================================================================
dense_151 (Dense)       (None, 8)       relu          184
dense_152 (Dense)       (None, 16)      relu          144
dropout_49 (Dropout)    (None, 16)      -             0
dense_153 (Dense)       (None, 1)       sigmoid       17
=================================================================
Total params: 345 (1.35 KB)
Trainable params: 345 (1.35 KB)
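
As with the earlier models, an assumed Keras reconstruction of this architecture; the dropout rate is again a placeholder, since it doesn't appear in the summary:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(8, activation="relu", input_dim=22),  # relu instead of the baseline's sigmoid
    layers.Dense(16, activation="relu"),               # 16*(8+1) = 144 params
    layers.Dropout(0.5),                               # rate assumed; not shown in the summary
    layers.Dense(1, activation="sigmoid"),             # 1*(16+1) = 17 params
])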

Epochs:

Epoch 1/50
154/154 [==============================] - 2s 4ms/step - loss: 0.6954 - accuracy: 0.5085 - val_loss: 0.6941 - val_accuracy: 0.5049
Epoch 2/50
154/154 [==============================] - 1s 4ms/step - loss: 0.6940 - accuracy: 0.5067 - val_loss: 0.6929 - val_accuracy: 0.5146
...
Epoch 49/50
154/154 [==============================] - 1s 4ms/step - loss: 0.6870 - accuracy: 0.5503 - val_loss: 0.6898 - val_accuracy: 0.5519
Epoch 50/50
154/154 [==============================] - 1s 5ms/step - loss: 0.6868 - accuracy: 0.5444 - val_loss: 0.6901 - val_accuracy: 0.5503

Loss Plot:

Accuracy Plot:

Test Loss: 0.6906

Test Accuracy: 0.5373

Confusion Matrix:
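
The matrix itself is not reproduced in this text version. A sketch of how it would typically be computed, assuming a 0.5 threshold on the sigmoid output and held-out arrays named X_test and y_test:

from sklearn.metrics import confusion_matrix

# Threshold of 0.5 is an assumption; model.predict returns sigmoid probabilities
y_pred = (model.predict(X_test) > 0.5).astype(int)
print(confusion_matrix(y_test, y_pred))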

Summary:

The final model architecture is very similar to the initial one; the only differences are a relu activation in the first layer and a dropout layer after the second hidden layer. Increasing the model complexity further, or training for more than about 75 epochs, results in overfitting. So while the final results aren't great (a validation accuracy of 0.5503 and a test accuracy of 0.5373), there isn't much more that can be done to improve this model. However, the goal was to obtain a model with accuracy greater than 52.4%, since that is the break-even mark for profitability. In that regard, the final model is actually successful.
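
For reference, the 52.4% target is the break-even win rate when betting at standard -110 American odds, where a bettor risks $110 to win $100:

# Break-even win probability p satisfies: p * 100 = (1 - p) * 110
break_even = 110 / (110 + 100)
print(f"{break_even:.4f}")  # 0.5238, i.e. ~52.4%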