Data & Analytics
Is the Model Effective?
Imagine making a cake for some occasion. The recipe is in your head, or perhaps jotted down in a notebook. The recipe reflects your own judgment that the cake is perfect, good, or at least edible.
However, there's a fatal flaw in that judgment. The true test of the recipe is serving the cake to friends, family, and others who were not involved in the baking process or the creation of the recipe. Some might enjoy it; some may not. You have to test how well the cake is actually received!
Was the model validated?
If not, the model's performance estimate is probably too optimistic, because the model was evaluated on the same data used to train it.
What should we do?
For a single model: Evaluate on a separate dataset known as the test set. A common approach is to use 70% of the dataset for training and 30% for testing.
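The 70/30 split above can be sketched in a few lines of numpy. The dataset here is a made-up toy example; the point is shuffling before splitting so the two sets are drawn from the same distribution:

```python
import numpy as np

# Toy dataset: 100 samples, 1 feature (purely illustrative)
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=100)

# Shuffle the indices, then take the first 70% for training
# and the remaining 30% for testing
indices = rng.permutation(len(X))
split = int(0.7 * len(X))
train_idx, test_idx = indices[:split], indices[split:]

X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

print(len(X_train), len(X_test))  # 70 30
```

Libraries such as scikit-learn provide the same idea as a one-liner (`train_test_split`), but the mechanics are just this shuffle-and-slice.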
For multiple models: Incorporate a validation set to select the model that performs best. However, the chosen model's score on the validation set is inflated, because the selection itself favored whichever model happened to do best there. In this case, employ a third set, the test set, so the flow for multiple models goes from Training Set —> Validation Set —> Test Set.
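Here is a minimal sketch of that Training —> Validation —> Test flow. The "multiple models" are polynomial fits of different degrees (an assumption for illustration); the validation set picks the degree, and only the untouched test set is used for the final score:

```python
import numpy as np

# Toy data with a quadratic trend plus noise
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = x**2 + rng.normal(scale=1.0, size=200)

# 60/20/20 split: Training Set -> Validation Set -> Test Set
idx = rng.permutation(len(x))
tr, va, te = idx[:120], idx[120:160], idx[160:]

def mse(degree, fit_idx, eval_idx):
    """Fit a polynomial on one subset, report mean squared error on another."""
    coeffs = np.polyfit(x[fit_idx], y[fit_idx], degree)
    pred = np.polyval(coeffs, x[eval_idx])
    return np.mean((y[eval_idx] - pred) ** 2)

# Candidate models: degrees 1..5, selected by validation-set error
degrees = range(1, 6)
best = min(degrees, key=lambda d: mse(d, tr, va))

# The honest performance estimate comes from the test set,
# which played no part in training or model selection
print("best degree:", best, "test MSE:", round(mse(best, tr, te), 3))
```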
K-fold cross validation:
K-fold cross validation is useful for training and validation: the data is split into k parts (folds). The model is trained on k-1 folds and validated on the remaining fold. This is repeated k times, once per fold, and the k evaluations are then averaged.
It's important to note that once cross validation is performed, it is a good idea to retrain on all of the training and validation data before evaluating on the test set.
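The two steps above, averaging over k folds and then refitting on everything, can be sketched as follows. The model here is ordinary least squares on synthetic data (an illustrative assumption), with k = 5:

```python
import numpy as np

# Synthetic regression data: y = 2*x1 - x2 + noise
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, -1.0]) + rng.normal(scale=0.3, size=100)

def fit(X, y):
    # Ordinary least squares fit
    return np.linalg.lstsq(X, y, rcond=None)[0]

def mse(w, X, y):
    return np.mean((y - X @ w) ** 2)

# Split shuffled indices into k folds
k = 5
folds = np.array_split(rng.permutation(len(X)), k)

scores = []
for i in range(k):
    # Fold i is held out for validation; the other k-1 folds train the model
    val_idx = folds[i]
    train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
    w = fit(X[train_idx], y[train_idx])
    scores.append(mse(w, X[val_idx], y[val_idx]))

print("mean CV MSE:", round(np.mean(scores), 3))

# After cross validation: refit on ALL training + validation data,
# then evaluate this final model once on the held-out test set
w_final = fit(X, y)
```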
Model specific measurements:
To be continued…but here are some commonly used ones for regression models:
Sum of Squared Errors
R-Squared
Likelihood/Maximum Likelihood
Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC)
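The first two measures in the list are straightforward to compute by hand. A small sketch on made-up data, fitting a line and scoring it with the Sum of Squared Errors and R-squared:

```python
import numpy as np

# Toy data with a linear trend plus noise
rng = np.random.default_rng(7)
x = rng.uniform(0, 10, size=50)
y = 1.5 * x + 2.0 + rng.normal(scale=1.0, size=50)

# Fit a straight line
slope, intercept = np.polyfit(x, y, 1)
pred = slope * x + intercept

sse = np.sum((y - pred) ** 2)        # Sum of Squared Errors
sst = np.sum((y - np.mean(y)) ** 2)  # total sum of squares
r_squared = 1 - sse / sst            # fraction of variance explained

print("SSE:", round(sse, 2), "R-squared:", round(r_squared, 3))
```

Likelihood-based measures and AIC/BIC additionally penalize model complexity, which is why they are often preferred when comparing models with different numbers of parameters.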
