Information gain
For the training set, p = n = 6, I(6/12, 6/12) = 1 bit
Consider the attributes Patrons and Type (and others too):
Patrons has the highest IG of all attributes and so is chosen by the DTL algorithm as the root
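
The comparison can be checked numerically with a minimal sketch. The per-value counts of positive and negative examples below are not given in the text above; they are assumed from the standard 12-example restaurant data set that this example is usually based on, and the helper names entropy and information_gain are hypothetical.

```python
from math import log2


def entropy(pos, neg):
    """Entropy in bits of a set with `pos` positive and `neg` negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count > 0:  # treat 0 * log2(0) as 0
            q = count / total
            result -= q * log2(q)
    return result


def information_gain(splits):
    """IG of an attribute, given {value: (pos, neg)} counts for each of its values."""
    p = sum(pos for pos, _ in splits.values())
    n = sum(neg for _, neg in splits.values())
    # Expected entropy remaining after the split, weighted by subset size.
    remainder = sum(
        (pi + ni) / (p + n) * entropy(pi, ni) for pi, ni in splits.values()
    )
    return entropy(p, n) - remainder


# ASSUMED counts (positive, negative) per attribute value; not from the text above.
patrons = {"None": (0, 2), "Some": (4, 0), "Full": (2, 4)}
restaurant_type = {"French": (1, 1), "Italian": (1, 1), "Thai": (2, 2), "Burger": (2, 2)}

print(entropy(6, 6))                      # entropy of the whole training set, 1 bit
print(information_gain(patrons))          # about 0.541 bits
print(information_gain(restaurant_type))  # 0 bits, up to floating-point rounding
```

Under these assumed counts, every value of Type leaves an even split of positive and negative examples, so it removes no uncertainty, while Patrons produces two pure subsets and therefore a much larger gain.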