  1. Information Driven Healthcare: Data Visualization & Classification Lecture 6: Neural Networks (continued): Training, Stopping, Validation and Testing Centre for Doctoral Training in Healthcare Innovation Dr. Gari D. Clifford, University Lecturer & Associate Director, Centre for Doctoral Training in Healthcare Innovation, Institute of Biomedical Engineering, University of Oxford

  2. ANN training continued • Pseudo code for training and reporting classification performance:
     1. Partition data into training, validation & test sets
     2. for (j = 1; j <= jmax; j++) {
            for (run = 1; run <= 10; run++) {
                initialise weights;
                train MLP until the stopping criterion is reached
                    (min(ε_val) – the error on the validation set, not the training set);
                save min(ε_val);
            }
        }
     3. Select the optimal network on the basis of the lowest ε_val
     4. Test the optimal network on the test data and report the results.
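
A minimal runnable sketch of this procedure, assuming Python with scikit-learn (the lecture does not prescribe a toolkit, and j is taken here to index the number of hidden units). MLPClassifier's early_stopping option carves a validation set out of the training data and stops training when ε_val stops improving:

    # Sketch only: scikit-learn stands in for the lecture's MLP code.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=600, n_features=10, random_state=0)

    # 1. Partition the data: hold out a test set; the validation set is
    #    carved out of the training data by early_stopping below.
    X_trval, X_test, y_trval, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)

    best_score, best_net = -np.inf, None
    for j in range(2, 11):                 # 2. loop over hidden-layer sizes
        for run in range(10):              #    10 random initialisations each
            net = MLPClassifier(hidden_layer_sizes=(j,),
                                early_stopping=True,      # stop on eps_val
                                validation_fraction=0.2,
                                n_iter_no_change=10,
                                random_state=run, max_iter=2000)
            net.fit(X_trval, y_trval)
            # 3. highest validation accuracy = lowest eps_val
            if net.best_validation_score_ > best_score:
                best_score, best_net = net.best_validation_score_, net

    # 4. Test the selected network once on the held-out test set
    print("test accuracy:", best_net.score(X_test, y_test))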

  3. How to terminate training … • Your cost function doesn’t tell you when to end training …

  4. Use validation set to avoid overfitting • The error on your training set will continue to drop, but at some point the error, ε_val, on an independent (validation) set will start to rise … • At that point you are overfitting on the training data • The learned function fits the training data very closely, but it does not generalise well: it cannot model unseen data from the same task sufficiently well
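
As a concrete illustration of the stopping rule, here is a minimal sketch that applies it to gradient descent on a toy least-squares problem (an assumption made purely to keep the example short; the lecture applies the same rule to MLP weights): stop once ε_val has not improved for a set number of epochs, and keep the best weights seen:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 20))
    y = X @ rng.normal(size=20) + rng.normal(scale=2.0, size=200)
    X_tr, y_tr, X_val, y_val = X[:100], y[:100], X[100:], y[100:]

    w = np.zeros(20)
    best_val, best_w, since_best, patience = np.inf, w.copy(), 0, 20
    for epoch in range(5000):
        w -= 0.01 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)  # training step
        val_err = np.mean((X_val @ w - y_val) ** 2)         # eps_val
        if val_err < best_val:
            best_val, best_w, since_best = val_err, w.copy(), 0
        else:
            since_best += 1
            if since_best >= patience:    # eps_val has stopped improving
                break
    w = best_w                            # restore the best weights seen
    print(f"stopped at epoch {epoch}; validation MSE {best_val:.3f}")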

  5. Warning: local minima and overtraining • [Figure: classification accuracy vs. training epochs for the training and testing curves; best test score = 0.695, random ‘coin toss’ baseline = 0.5]

  6. Examples of over-fitting • Imagine a regression problem: y = f(x) + noise

  7. Examples of over-fitting • Three types of fit to the same data: linear regression, quadratic regression, and piecewise linear nonparametric regression

  8. Which is the ‘best’ fit? • Why not choose the technique which most closely fits the data? • Well – you need to answer the question: “How well are you going to predict future data drawn from the same distribution?”
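
A quick numerical check (a sketch; a degree-9 polynomial is used here as a stand-in for the flexible piecewise-linear fit): the model that fits the training data most closely is not the one that best predicts future data drawn from the same distribution:

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.uniform(-1, 1, 30)
    y = 1.5 * x**2 - x + rng.normal(scale=0.2, size=30)      # y = f(x) + noise
    x_new = rng.uniform(-1, 1, 1000)                         # future data
    y_new = 1.5 * x_new**2 - x_new + rng.normal(scale=0.2, size=1000)

    for degree in (1, 2, 9):              # linear, quadratic, very flexible
        coeffs = np.polyfit(x, y, degree)
        fit_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
        new_mse = np.mean((np.polyval(coeffs, x_new) - y_new) ** 2)
        print(f"degree {degree}: training MSE {fit_mse:.3f}, "
              f"future MSE {new_mse:.3f}")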

  9.–13. The train / test approach • The test set method: [sequence of five figure slides: randomly reserve a portion of the data (e.g. 30%) as a test set, fit each candidate model on the remaining training data, and estimate future performance from the error on the test set]

  14. The train / test approach • Good news: • Very simple • Can then simply choose the method with the best test set score • Bad News: • Wastes data: we get an estimate of the best method to apply to 30% less data • Test set estimator of performance has high variance – i.e. if we don’t have enough data, our test set might be unrepresentative
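
The high-variance point is easy to demonstrate; this sketch (with a simple logistic-regression classifier standing in for the MLP) repeats the random 70/30 split fifty times and looks at the spread of the test-set score:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=100, n_features=10, random_state=0)
    scores = []
    for seed in range(50):                # a different random split each time
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, random_state=seed)
        model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        scores.append(model.score(X_te, y_te))
    print(f"test accuracy: mean {np.mean(scores):.3f}, "
          f"std {np.std(scores):.3f}")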

  15. Cross Validation • We can improve on this by repeatedly holding out a different subset of the data and averaging the resulting error estimates – averaging reduces the variance … • LOOCV – Leave One Out Cross Validation:

  16. LOOCV • Leave One Out Cross Validation for linear regression:

  17. LOOCV • Leave One Out Cross Validation for quadratic regression:

  18. LOOCV • Leave One Out Cross Validation for piecewise linear regression:
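
The three comparisons above can be reproduced along these lines (a sketch; as before, a degree-9 polynomial stands in for the piecewise-linear nonparametric fit): each point is left out once, predicted from a model fitted to the remaining points, and the squared errors are averaged:

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(2)
    x = rng.uniform(-1, 1, (30, 1))
    y = 1.5 * x[:, 0]**2 - x[:, 0] + rng.normal(scale=0.2, size=30)

    for degree in (1, 2, 9):
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        mse = -cross_val_score(model, x, y, cv=LeaveOneOut(),
                               scoring="neg_mean_squared_error").mean()
        print(f"degree {degree}: LOOCV MSE {mse:.3f}")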

  19. K-Fold CV – Linear Regression

  20. K-Fold CV – Quadratic Regression

  21. K-Fold CV – Piecewise Linear Regression
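
K-fold CV refits each model only k times rather than n times, which is why it is usually preferred when fitting is expensive; this sketch mirrors the LOOCV one above with k = 5 (an arbitrary but common choice):

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import KFold, cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(2)
    x = rng.uniform(-1, 1, (30, 1))
    y = 1.5 * x[:, 0]**2 - x[:, 0] + rng.normal(scale=0.2, size=30)

    for degree in (1, 2, 9):
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        mse = -cross_val_score(model, x, y,
                               cv=KFold(n_splits=5, shuffle=True,
                                        random_state=0),
                               scoring="neg_mean_squared_error").mean()
        print(f"degree {degree}: 5-fold CV MSE {mse:.3f}")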

  22. Which scheme to use?

  23. Let’s look at the ANN train/test procedure again • Pseudo code for training and reporting classification performance:
     1. Partition data into training, validation & test sets
     2. for (j = 1; j <= jmax; j++) {
            for (run = 1; run <= 10; run++) {
                initialise weights;
                train MLP until the stopping criterion is reached
                    (min(ε_val) – the error on the validation set, not the training set);
                save min(ε_val);
            }
        }
     3. Select the optimal network on the basis of the lowest ε_val
     4. Test the optimal network on the test data and report the results.

  24. Pruning network nodes • Step 3 is used to identify the optimal I-J-K configuration on the basis of the lowest values of ε_val • Network nodes can be pruned during this stage • An outer loop can also be added that re-partitions the data on each iteration – that outer loop is cross-validation (see the sketch below)!
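
One way to realise that outer loop is nested cross-validation; here is a sketch with scikit-learn (an assumption about tooling, with the hidden-layer-size grid playing the role of the I-J-K search): an inner search selects the architecture, and an outer k = 5 loop estimates its performance on partitions never used for the selection:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV, cross_val_score
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=300, n_features=10, random_state=0)

    # inner 3-fold search over candidate hidden-layer sizes
    inner = GridSearchCV(MLPClassifier(max_iter=2000, random_state=0),
                         {"hidden_layer_sizes": [(2,), (4,), (8,)]}, cv=3)
    # outer k=5 loop: each fold's score comes from data the search never saw
    outer_scores = cross_val_score(inner, X, y, cv=5)
    print("nested CV accuracy per fold:", np.round(outer_scores, 3))
    print("mean:", outer_scores.mean().round(3))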

  25. E.g. k=5 on ICU data

  26. Common problems in training and testing
  • Poor training performance
    – Incorrect choice of problem: no or only a weak relationship between input and output
    – Wrong set of features, or incorrect pre-filtering applied
    – Stuck units (local minima): an initialization or normalization problem
  • Poor generalization performance
    – Insufficient number of training patterns: the network learns only the individual patterns, not the relationship between the patterns and the class
    – Over-fitting: a model order/architecture problem – the model learns the noise as well
    – Over-training: the model order is correct, but eventually the noise is learned anyway
    – Test examples of one class consistently wrong: an unbalanced database
    – Attempting to extrapolate rather than interpolate: the network is trained on data from one set of conditions (or for one population) and used to predict on another population that exhibits a different set of conditions

  27. Now to the lab …. www.devbio.uga.edu/gallery/index.html

  28. Acknowledgements • Overfitting, cross-validation and bootstrapping slides adapted from notes by Andrew W. Moore, School of Computer Science, Carnegie Mellon University: www.cs.cmu.edu/~awm, including “Cross-validation for detecting and preventing overfitting” – http://www.autonlab.org/tutorials/index.html
