Learning from Observations

17 Mar 2004

CS 3243 - Learning


Overfitting

• Better training performance = better test performance?

• Nope. Why?

1. The hypothesis is too specific to the training data

2. The hypothesis models noise in the training data
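
The two causes above can be seen in a minimal Python sketch. Everything in it is an assumption made for illustration: the toy data set (true concept `x >= 5`, one flipped training label) is invented, and `memorizer` and `threshold_rule` are hypothetical stand-ins for an overly specific hypothesis and a simple one.

```python
# Toy data, invented for illustration: the true concept is x >= 5.
train = [(x, x >= 5) for x in range(10)]
train[7] = (7, False)                    # one noisy training label
test = [(x, x >= 5) for x in range(10)]  # fresh examples with clean labels


def accuracy(predict, data):
    """Fraction of examples whose predicted label matches the given label."""
    return sum(predict(x) == y for x, y in data) / len(data)


# Overly specific hypothesis: memorizes every training label, noise included.
table = dict(train)


def memorizer(x):
    return table[x]


# Simple hypothesis: a single threshold test.
def threshold_rule(x):
    return x >= 5


train_mem, test_mem = accuracy(memorizer, train), accuracy(memorizer, test)
train_rule, test_rule = accuracy(threshold_rule, train), accuracy(threshold_rule, test)

print(train_mem, test_mem)    # 1.0 0.9
print(train_rule, test_rule)  # 0.9 1.0
```

The memorizer wins on the training set only because it reproduces the noisy label, and that same label costs it accuracy on the test set.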

• Pruning

◦ Keep complexity of hypothesis low

◦ Stop splitting when:

1. IC below a threshold

2. Too few data points in node
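
The two stopping conditions above could be checked roughly as follows. This is a sketch under stated assumptions, not the course's implementation: it assumes IC means the information content (entropy, in bits) of the class labels at a node, and the names `information_content`, `should_stop`, `ic_threshold`, and `min_points`, along with their default values, are hypothetical. If IC is instead read as the information gain of the best split, the same threshold test would be applied to that gain.

```python
import math
from collections import Counter


def information_content(labels):
    """Entropy, in bits, of the class labels at a node (assumed meaning of IC)."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def should_stop(labels, ic_threshold=0.2, min_points=5):
    """Return True if the node should become a leaf instead of being split.

    ic_threshold and min_points are illustrative defaults, not values
    taken from the slides.
    """
    if len(labels) < min_points:  # condition 2: too few data points in node
        return True
    return information_content(labels) < ic_threshold  # condition 1: IC below threshold


print(should_stop(["yes"] * 3))               # True: too few data points
print(should_stop(["yes"] * 10))              # True: pure node, IC is 0
print(should_stop(["yes"] * 5 + ["no"] * 5))  # False: IC is 1 bit
```

Checking the node size first also keeps the entropy computation from running on an empty node.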

[Figure: precision (0% to 100%) plotted against DT size, with one curve for train performance and one for test performance]