Back to the main page About the NSTB Research Project Research related publications Product Overview Download DM II trial software DMII Development Team E-mail us at dm2@comp.nus.edu.sg

 

All 26 datasets are from the UCI repository for Machine Learning. To allow a fair comparison with other classification systems (e.g. C4.5 Rel 8), we use C4.5's shuffle utility to shuffle the datasets. After shuffling, the class frequency distributions are better suited to 10-fold cross-validation. CBA V2.0 achieves an average error rate of 15.2% over these datasets, compared with 16.7% for C4.5rules and 17.3% for C4.5tree (both C4.5 Rel 8), 16.9% for RIPPER, and 17.8% for Naive Bayesian.
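A minimal sketch of why shuffling matters for 10-fold cross-validation, assuming a dataset stored sorted by class and folds taken as contiguous blocks. It uses Python's `random` module rather than C4.5's shuffle utility, and the toy labels and helper names are hypothetical.

```python
import random
from collections import Counter

def contiguous_folds(n, k=10):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def fold_class_counts(labels, folds):
    """Count how often each class label occurs in each fold."""
    return [Counter(labels[i] for i in fold) for fold in folds]

# Hypothetical dataset stored sorted by class: 60 cases of "a", then 40 of "b".
labels = ["a"] * 60 + ["b"] * 40
folds = contiguous_folds(len(labels))

# Without shuffling, each fold holds cases of a single class only.
unshuffled = fold_class_counts(labels, folds)

# After shuffling (fixed seed for repeatability), the classes are spread over the folds.
shuffled_labels = labels[:]
random.Random(0).shuffle(shuffled_labels)
shuffled = fold_class_counts(shuffled_labels, folds)

print("unshuffled:", [dict(c) for c in unshuffled])
print("shuffled:  ", [dict(c) for c in shuffled])
```

Shuffling does not guarantee identical class proportions in every fold; it only removes the ordering of the original file from the fold assignment.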

Click here to download these datasets and CBA V2.0. 

Parameters used in CBA V2.0: 

  1. Use multiple class min support, with Min Support upper bound = 1%
  2. Use multiple class min confidence
  3. Use rule pruning
  4. Rule Limit = 80,000
  5. Do not use small rules
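This page does not define "multiple class min support". The sketch below assumes one formulation described in the CBA literature, in which each class gets its own threshold equal to the overall minimum support scaled by that class's relative frequency. Under that assumption no class threshold exceeds the 1% bound. The function name and toy labels are hypothetical, and this is not CBA's implementation.

```python
from collections import Counter

def per_class_min_support(labels, total_min_support=0.01):
    """Assumed rule: threshold(class) = total_min_support * relative frequency of class."""
    counts = Counter(labels)
    n = len(labels)
    return {cls: total_min_support * count / n for cls, count in counts.items()}

# Hypothetical unbalanced dataset: 90 "yes" cases and 10 "no" cases.
labels = ["yes"] * 90 + ["no"] * 10
thresholds = per_class_min_support(labels)

for cls, threshold in thresholds.items():
    print(f"{cls}: min support = {threshold:.4%}")
```

The motivation for such a scheme is that a single threshold can be too high for a rare class to produce any rules and too low for a frequent class to keep the rule count manageable.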

Parameters used in CBA V1.0: 

  1. Use single Min Support = 1%
  2. Use single Min Confidence = 50%
  3. Use rule pruning
  4. Rule Limit = 80,000
  5. Do not use small rules
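To illustrate what the single thresholds mean, here is a small sketch using the standard definitions for a class association rule "condition -> class": support is the fraction of all cases that match the condition and have the class, and confidence is the fraction of cases matching the condition that have the class. The record layout, item names, and function names are hypothetical and are not taken from CBA.

```python
def rule_support_confidence(records, condition, target_class):
    """records: list of (item_set, class_label); condition: set of items."""
    n = len(records)
    # Class labels of all cases whose items include every item in the condition.
    covered = [label for items, label in records if condition <= items]
    correct = sum(1 for label in covered if label == target_class)
    support = correct / n if n else 0.0
    confidence = correct / len(covered) if covered else 0.0
    return support, confidence

def passes_thresholds(support, confidence, min_support=0.01, min_confidence=0.50):
    """Apply the single Min Support = 1% and Min Confidence = 50% settings."""
    return support >= min_support and confidence >= min_confidence

# Hypothetical toy data: four cases, each a set of attribute=value items plus a class.
records = [
    ({"outlook=sunny", "windy=no"}, "play"),
    ({"outlook=sunny", "windy=yes"}, "play"),
    ({"outlook=sunny", "windy=yes"}, "stay"),
    ({"outlook=rain", "windy=yes"}, "stay"),
]

support, confidence = rule_support_confidence(records, {"outlook=sunny"}, "play")
print(support, confidence, passes_thresholds(support, confidence))
```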

C4.5 Tree (Disc.): C4.5tree run on discretized data.

C4.5 Rules (Disc.): C4.5rules run on discretized data.

Error rates (%) by dataset:

Dataset       CBA V2.0  CBA V1.0  C4.5 Tree  C4.5 Rules  C4.5 Tree (Disc.)  C4.5 Rules (Disc.)  RIPPER  Naive Bayesian
anneal        2.1       3.6       7.5        5.2         9.8                6.5                 4.6     2.7
australian    14.6      13.4      14.8       15.3        13.0               13.5                15.2    14.0
auto          19.9      27.2      17.6       19.9        27.8               29.2                23.8    32.1
breast-w      3.7       4.2       5.6        5.0         5.3                3.9                 4.0     2.4
cleve         17.1      16.7      21.5       21.8        20.8               18.2                21.1    17.1
crx           14.6      14.1      15.0       15.1        14.6               15.9                14.6    14.6
diabetes      25.5      25.3      26.1       25.8        25.0               27.6                25.3    24.4
german        26.5      26.5      28.4       27.7        28.7               29.5                27.8    24.6
glass         26.1      27.4      30.4       31.3        25.2               27.5                35.0    29.4
heart         18.1      18.5      21.8       19.2        19.2               18.9                19.6    18.1
hepatitis     18.9      15.1      18.2       19.4        18.1               22.6                17.5    15.0
horse         17.6      18.7      14.7       17.4        14.7               16.3                14.7    20.6
hypo          1.0       1.7       0.7        0.8         1.0                1.2                 0.8     1.5
ionosphere    7.7       8.2       10.5       10.0        9.7                8.0                 11.4    11.9
iris          5.3       7.1       4.7        4.7         6.0                5.3                 5.3     6.0
labor         13.7      17.0      22.3       20.7        23.0               21.0                16.5    14.0
led7          28.1      27.8      30.5       26.5        26.5               26.5                30.8    26.7
lymph         22.1      19.6      23.8       26.5        20.9               21.0                20.8    24.4
pima          27.1      27.6      25.8       24.5        27.0               27.5                26.3    24.5
sick          2.8       2.7       1.1        1.5         2.1                2.1                 1.9     3.9
sonar         22.5      21.7      28.4       29.8        29.8               27.8                27.9    23.0
tic-tac-toe   0.4       0.0       13.8       0.6         13.8               0.6                 2.4     30.1
vehicle       31.0      31.3      28.5       27.4        34.5               33.6                31.4    40.1
waveform21    20.3      20.6      22.8       21.9        29.9               24.6                20.5    19.3
wine          5.0       8.4       7.3        7.3         8.5                7.9                 8.5     9.5
zoo           3.2       5.4       7.8        7.8         7.8                7.8                 11.0    13.7
Avg           15.2      15.8      17.3       16.7        17.6               17.1                16.9    17.8
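
The Avg row can be checked directly from the per-dataset figures. As a sketch, the snippet below copies the CBA V2.0 and C4.5 Rules columns from the table above (rows anneal through zoo), recomputes their averages, and counts on how many datasets CBA V2.0 has the lower, equal, or higher error rate. The variable and function names are illustrative only.

```python
# Error rates (%) copied from the table, in row order anneal .. zoo.
cba_v2 = [2.1, 14.6, 19.9, 3.7, 17.1, 14.6, 25.5, 26.5, 26.1, 18.1, 18.9, 17.6, 1.0,
          7.7, 5.3, 13.7, 28.1, 22.1, 27.1, 2.8, 22.5, 0.4, 31.0, 20.3, 5.0, 3.2]
c45_rules = [5.2, 15.3, 19.9, 5.0, 21.8, 15.1, 25.8, 27.7, 31.3, 19.2, 19.4, 17.4, 0.8,
             10.0, 4.7, 20.7, 26.5, 26.5, 24.5, 1.5, 29.8, 0.6, 27.4, 21.9, 7.3, 7.8]

def average(values):
    """Unweighted mean over datasets, as in the Avg row."""
    return sum(values) / len(values)

def win_tie_loss(ours, theirs):
    """Count datasets where `ours` has lower, equal, or higher error than `theirs`."""
    wins = sum(a < b for a, b in zip(ours, theirs))
    ties = sum(a == b for a, b in zip(ours, theirs))
    losses = sum(a > b for a, b in zip(ours, theirs))
    return wins, ties, losses

avg_cba = round(average(cba_v2), 1)
avg_c45 = round(average(c45_rules), 1)
record = win_tie_loss(cba_v2, c45_rules)

print("CBA V2.0 average error:", avg_cba)
print("C4.5 Rules average error:", avg_c45)
print("CBA V2.0 vs C4.5 Rules (wins, ties, losses):", record)
```

The average weights every dataset equally regardless of its size, so the win/tie/loss count is a useful complement when a few datasets dominate the difference.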

 

Please direct queries and bug reports via E-mail: dm2@comp.nus.edu.sg
School of Computing, National University of Singapore