Entropy-based & ChiMerge Data Discretization
Feb. 12, 2008
Team #4: Seunghyun Kim, Craig Dunham, Suryo Muljono, Albert Lee

Entropy-based discretization
• Table 6.1 Class-labeled training tuples from the AllElectronics customer database (page 299).
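
Entropy-based (Cont’d)
Information gain
• $Info(D) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.940$ bits
• $Info_{age}(D) = \frac{5}{14}\left(-\frac{2}{5}\log_2\frac{2}{5} - \frac{3}{5}\log_2\frac{3}{5}\right) + \frac{4}{14}\left(-\frac{4}{4}\log_2\frac{4}{4}\right) + \frac{5}{14}\left(-\frac{3}{5}\log_2\frac{3}{5} - \frac{2}{5}\log_2\frac{2}{5}\right) = 0.694$ bits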

Entropy-based (Cont’d)
$Gain(A) = Info(D) - Info_A(D)$
• $Gain(age) = Info(D) - Info_{age}(D) = 0.940 - 0.694 = 0.246$ bits
• $Gain(income) = Info(D) - Info_{income}(D) = 0.940 - 0.911 = 0.029$ bits
• $Gain(student) = Info(D) - Info_{student}(D) = 0.940 - 0.788 = 0.152$ bits
• $Gain(credit) = Info(D) - Info_{credit}(D) = 0.940 - 0.892 = 0.048$ bits
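
The gain figures above can be checked with a short script. This is a minimal sketch, not the team's own code: the function names `info` and `info_after_split` are hypothetical, and the (yes, no) class counts per attribute value are an assumed transcription of Table 6.1, which the slides reference but do not reproduce.

```python
from math import log2


def info(counts):
    """Entropy, in bits, of a class distribution given as a sequence of counts."""
    total = sum(counts)
    # Zero counts contribute nothing (0 * log2(0) is taken as 0).
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def info_after_split(partitions):
    """Weighted entropy Info_A(D) after splitting D into the given partitions."""
    total = sum(sum(p) for p in partitions)
    return sum(sum(p) / total * info(p) for p in partitions)


# ASSUMPTION: (yes, no) counts of the class label per attribute value,
# transcribed from Table 6.1; the table itself is not shown in the slides.
partitions = {
    "age": [(2, 3), (4, 0), (3, 2)],      # youth, middle_aged, senior
    "income": [(2, 2), (4, 2), (3, 1)],   # high, medium, low
    "student": [(6, 1), (3, 4)],          # yes, no
    "credit": [(6, 2), (3, 3)],           # fair, excellent
}
class_counts = (9, 5)  # 9 "yes" tuples, 5 "no" tuples

info_d = info(class_counts)
gains = {attr: info_d - info_after_split(p) for attr, p in partitions.items()}
best = max(gains, key=gains.get)  # attribute with the highest information gain

for attr, gain in gains.items():
    print(f"Gain({attr}) = {gain:.3f} bits")
print("Split on:", best)
```

Because the slides subtract values already rounded to three decimals, the printed gains may differ from the slide figures in the last digit. Under these counts, age has the largest gain, which is why it is chosen as the first split.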

Entropy-based (Cont’d)
AllElectronics customer database
[Figure: first split of the decision tree. Root node "Age?" with branches Youth, Middle_age, Senior.]

Entropy-based (Cont’d)
AllElectronics customer database
[Figure: complete decision tree]
Age?
• Youth: Student?
  • Non Student: no
  • Student: yes
• Middle: yes
• Senior: Credit?
  • Excellent: no
  • Fair: yes
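
To show how the finished tree is used, here is a minimal sketch that encodes it as nested tuples and dictionaries and classifies a record by walking from the root to a leaf. The representation, the `classify` function, and the attribute and value spellings (`age`, `student`, `credit_rating`, `middle_aged`) are assumptions for illustration; the leaf labels follow the tree on the slide.

```python
# An internal node is (attribute, {value: subtree}); a leaf is a class label string.
# ASSUMPTION: key names and value spellings are illustrative, not taken from the slides.
tree = ("age", {
    "youth": ("student", {"no": "no", "yes": "yes"}),
    "middle_aged": "yes",
    "senior": ("credit_rating", {"excellent": "no", "fair": "yes"}),
})


def classify(node, record):
    """Follow the branch matching the record's attribute values until a leaf is reached."""
    while not isinstance(node, str):
        attribute, branches = node
        node = branches[record[attribute]]
    return node


customer = {"age": "senior", "student": "no", "credit_rating": "fair"}
print(classify(tree, customer))
```

Only the attributes on the path from root to leaf are consulted, so a middle-aged customer is classified without looking at the student or credit rating fields.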