17 Mar 2004
CS 3243 - Learning
Overfitting
• Does better training performance mean better test performance?
• Nope. Why?
  1. Hypothesis too specific
  2. Hypothesis models noise
• Pruning
  ◦ Keep complexity of hypothesis low
  ◦ Stop splitting when:
    1. IC below a threshold
    2. Too few data points in node
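
The two stopping rules can be sketched in a small decision tree learner. This is only an illustration under stated assumptions: "IC" is read here as the information gain of the best available split, the data is one-dimensional with threshold splits, and the names `min_gain`, `min_points`, `build`, and the toy data set are hypothetical, not taken from the course material.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy in bits of a list of labels (0 for a pure or empty list)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(data):
    """Return (gain, threshold) for the best split 'x < threshold', or (0.0, None)."""
    base = entropy([y for _, y in data])
    best = (0.0, None)
    xs = sorted({x for x, _ in data})
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2  # candidate threshold between neighbouring values
        left = [y for x, y in data if x < t]
        right = [y for x, y in data if x >= t]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(data)
        if base - remainder > best[0]:
            best = (base - remainder, t)
    return best

def build(data, min_gain=0.0, min_points=1):
    """Grow a tree; a leaf is a label, an internal node is (threshold, left, right)."""
    majority = Counter(y for _, y in data).most_common(1)[0][0]
    if len(data) < min_points:          # rule 2: too few data points in node
        return majority
    gain, t = best_split(data)
    if t is None or gain < min_gain:    # rule 1 (assumed reading): gain below threshold
        return majority
    left = [(x, y) for x, y in data if x < t]
    right = [(x, y) for x, y in data if x >= t]
    return (t, build(left, min_gain, min_points), build(right, min_gain, min_points))

def predict(tree, x):
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x < t else right
    return tree

def size(tree):
    """Number of nodes (internal nodes plus leaves)."""
    return 1 + size(tree[1]) + size(tree[2]) if isinstance(tree, tuple) else 1

def accuracy(tree, data):
    return sum(predict(tree, x) == y for x, y in data) / len(data)

# Toy data: the concept is x >= 50; every label at a multiple of 10 is flipped
# as a deterministic stand-in for noise.
train = [(x, (x >= 50) != (x % 10 == 0)) for x in range(0, 100, 2)]

full_tree = build(train)                     # no early stopping
pruned_tree = build(train, min_points=20)    # stop on small nodes

print("full:  ", size(full_tree), "nodes, train accuracy", accuracy(full_tree, train))
print("pruned:", size(pruned_tree), "nodes, train accuracy", accuracy(pruned_tree, train))
```

Without early stopping the tree keeps splitting until every leaf is pure, so it also fits the flipped labels. With `min_points=20` the tree is smaller and no longer fits the training set perfectly, which is the trade the slide describes as keeping the complexity of the hypothesis low.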
[Figure: precision (0% to 100%) plotted against DT size, with one curve for train performance and one for test performance]
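
The gap between train and test performance can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy concept `x >= 50`, the deterministic label flips standing in for noise, and the two hypothetical hypotheses `memorizer` (too specific, reproduces every training label including the flipped ones) and `simple_rule` (a single threshold).

```python
def true_label(x):
    """Underlying concept for the toy data (an assumption of this sketch)."""
    return x >= 50

# Training set: even x, labels flipped at multiples of 10 (stand-in for noise).
train = [(x, true_label(x) != (x % 10 == 0)) for x in range(0, 100, 2)]
# Test set: odd x, labels flipped where x ends in 5 (a different noise pattern).
test = [(x, true_label(x) != (x % 10 == 5)) for x in range(1, 100, 2)]

def memorizer(x):
    """Too-specific hypothesis: copy the label of the nearest training point.
    Ties are broken toward the smaller x."""
    _, label = min(train, key=lambda ex: (abs(ex[0] - x), ex[0]))
    return label

def simple_rule(x):
    """Low-complexity hypothesis: one threshold, ignores individual examples."""
    return x >= 50

def accuracy(hypothesis, data):
    return sum(hypothesis(x) == y for x, y in data) / len(data)

for name, h in [("memorizer", memorizer), ("simple rule", simple_rule)]:
    print(name, "train:", accuracy(h, train), "test:", accuracy(h, test))
```

On this toy data the memorizer is perfect on the training set but scores lower than the simple rule on the test set, because it reproduces the flipped training labels on nearby unseen points.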