Information Content
- Entropy measures the purity of a set of examples
  - Normally denoted H(S)
- Equivalently viewed as information content: the less you need to know (to determine the class of a new case), the more information you already have
- With two classes (P, N):
  - IC(S) = -(p/t) log2(p/t) - (n/t) log2(n/t), where t = p + n
  - E.g., p = 9, n = 5:
    IC([9,5]) = -(9/14) log2(9/14) - (5/14) log2(5/14)
              = 0.940
  - Also, IC([14,0]) = 0 and IC([7,7]) = 1 (see the sketch below)
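
A minimal Python sketch of the two-class formula above, reproducing the slide's numbers; the function name information_content is illustrative, not from the source:

import math

def information_content(p, n):
    """Information content (entropy, in bits) of a set
    with p positive and n negative examples."""
    t = p + n
    ic = 0.0
    for count in (p, n):
        if count > 0:            # treat 0 * log2(0) as 0 for pure sets
            frac = count / t
            ic -= frac * math.log2(frac)
    return ic

# Reproduce the slide's examples
print(f"{information_content(9, 5):.3f}")   # 0.940 -> IC([9,5])
print(information_content(14, 0))           # 0.0   -> pure set
print(information_content(7, 7))            # 1.0   -> maximally mixed

The guard against zero counts matters: a pure set like [14,0] would otherwise call log2(0), whereas by convention its entropy is 0.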