Learning from Observations

Overfitting

• Better training performance = better test performance?
• Nope. Why?
    1. Hypothesis too specific
    2. Models noise
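A minimal sketch of the two reasons above, on made-up 1-D data. Everything here is an assumption for illustration: the true concept (`x >= 5`), the single noisy training label, and the hypothetical names `memorizer` and `threshold`. The memorizer fits the training set perfectly, including the noisy label, and pays for it on the test set.

```python
# Toy data. Assumed true concept for this sketch: label is True when x >= 5.
# The training point (2, True) is a deliberately noisy label.
train = [(0, False), (2, True), (4, False), (6, True), (8, True)]
test = [(1.5, False), (2.5, False), (3.5, False), (5.5, True), (7.5, True)]


def memorizer(x):
    """Overly specific hypothesis: copy the label of the nearest training point."""
    _, nearest_label = min(train, key=lambda point: abs(point[0] - x))
    return nearest_label


def threshold(x):
    """Simple hypothesis: a single cut at x = 5."""
    return x >= 5


def accuracy(hypothesis, data):
    """Fraction of (x, label) pairs the hypothesis classifies correctly."""
    return sum(hypothesis(x) == label for x, label in data) / len(data)


for name, h in [("memorizer", memorizer), ("threshold", threshold)]:
    print(name, "train:", accuracy(h, train), "test:", accuracy(h, test))
```

On this toy data the memorizer wins on training accuracy and loses on test accuracy, while the simpler threshold does the opposite.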

• Pruning
    ○ Keep complexity of hypothesis low
    ○ Stop splitting when:
        1. IC below a threshold
        2. Too few data points in node
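The two stopping rules above could be sketched as follows. This assumes "IC" refers to an information-based split score and uses entropy-based information gain as a stand-in. The function names and the default values `min_gain=0.1` and `min_samples=5` are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    remainder = sum(len(ch) / n * entropy(ch) for ch in children if ch)
    return entropy(parent) - remainder


def should_stop(parent, children, min_gain=0.1, min_samples=5):
    """Return True if the node should become a leaf instead of being split.

    Rule 2: too few data points in the node.
    Rule 1: the split's score (here: information gain) is below a threshold.
    """
    if len(parent) < min_samples:
        return True
    return information_gain(parent, children) < min_gain


# Usage: a split that separates the classes well is kept,
# an uninformative split or a tiny node is not expanded.
node = ["+"] * 4 + ["-"] * 4
good_split = [["+"] * 4, ["-"] * 4]
useless_split = [["+", "+", "-", "-"], ["+", "+", "-", "-"]]
print(should_stop(node, good_split))
print(should_stop(node, useless_split))
print(should_stop(["+", "-"], [["+"], ["-"]]))
```

Raising either threshold gives smaller trees (lower hypothesis complexity) at the cost of a looser fit to the training data.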

[Figure: test performance vs. train performance]