The train, val, and test datasets
For the rest of the book, I will be structuring my data into three separate sets that I'll refer to as train, val, and test. These three datasets, drawn as random samples from the total dataset, will be structured and sized approximately like this:
[Figure: the total dataset divided into the train, val, and test sets]
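As a rough illustration of drawing the three sets as random samples, here is a minimal sketch using only the Python standard library. The function name `split_dataset`, the 80/10/10 proportions, and the fixed seed are my assumptions for the example, not sizes prescribed by the text.

```python
import random


def split_dataset(rows, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle rows and cut them into train, val, and test lists.

    The fractions are illustrative assumptions, not recommendations.
    """
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)          # random sampling without replacement

    n_val = int(len(shuffled) * val_frac)
    n_test = int(len(shuffled) * test_frac)

    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]   # everything left over
    return train, val, test


train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))  # 800 100 100
```

Because val and test are cut from the same shuffled pool, they share a distribution, which matters for the reasons discussed later in this section.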
The train dataset will be used for training the network, as expected.
The val dataset, or the validation dataset, will be used to find ideal hyperparameters and to measure overfitting. At the end of each epoch, which is when the network has had the opportunity to observe every data point in the training set, we will make predictions on the val set. Those predictions will be used to watch for overfitting and will help us know when the network has finished training. Using the val set at the end of each epoch like this differs somewhat from the typical usage. For more information on hold-out validation, please refer to The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman (https://web.stanford.edu/~hastie/ElemStatLearn/).
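To make the idea of watching the val set after each epoch concrete, here is a small sketch of one common way to decide when training has finished: stop once the val loss has not improved for a few epochs. The helper name `best_epoch`, the `patience` value, and the loss numbers are all hypothetical and chosen only for illustration.

```python
def best_epoch(val_losses, patience=2):
    """Return the index of the epoch with the lowest val loss.

    Scanning stops once `patience` epochs pass without improvement,
    which is a simple form of early stopping.
    """
    best_idx, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss = epoch, loss   # val loss still improving
        elif epoch - best_idx >= patience:
            break                               # likely overfitting; stop
    return best_idx


# Made-up val losses, one per epoch: they fall, then start to rise.
history = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
stop_at = best_epoch(history)
print(stop_at)  # 3
```

A val loss that rises while the training loss keeps falling is the usual sign of overfitting, which is why the lowest point of the val curve is a reasonable place to stop.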
The test dataset will be used once all training is complete, to accurately measure model performance on a set of data that the network hasn't seen.
It is very important that the val and test data come from the same distribution. It is less important that the train dataset matches val and test, although that is still ideal. If image augmentation were being used (making minor modifications to training images in an attempt to increase the size of the training set), for example, the training set distribution may no longer match the val set distribution. This is acceptable, and network performance can be adequately measured, as long as val and test are from the same distribution.
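The following toy sketch shows the augmentation case: only the training images are modified, so train drifts away from val and test while those two stay comparable. Images are represented here as nested lists of pixel values, and a horizontal flip stands in for whatever augmentation is actually used; both choices, along with the function names, are assumptions made for this example.

```python
def flip_horizontal(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def augment(train_images):
    """Return the original training images plus a flipped copy of each."""
    return train_images + [flip_horizontal(img) for img in train_images]


train_images = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
val_images = [[[9, 10], [11, 12]]]   # left untouched, like the test set

augmented_train = augment(train_images)
print(len(train_images), len(augmented_train))  # 2 4
```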