The 26 datasets are all from the UCI Machine Learning Repository.
To allow a fair comparison with other classification systems
(e.g. C4.5 Rel 8), we use C4.5's shuffle utility to shuffle the
datasets. After shuffling, the class frequency distributions are
more suitable for 10-fold cross-validation.
CBA V2.0 achieves an average error rate of 15.2% over these datasets,
whereas C4.5 Rel 8 achieves 16.7% for C4.5rules and 17.3% for
C4.5tree. RIPPER achieves 16.9% and Naive Bayesian achieves 17.8%
average error.
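The evaluation procedure described above (shuffle, then 10-fold cross-validation, then average the error) can be sketched roughly as follows. This is only an illustration under stated assumptions: Python's `random` module stands in for C4.5's shuffle utility, and the names `ten_fold_splits`, `cross_validated_error` and `majority_class_learner` are hypothetical helpers, not part of CBA or C4.5.

```python
import random
from collections import Counter


def ten_fold_splits(n_records, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs over a shuffled list of record indices."""
    indices = list(range(n_records))
    # Assumption: a seeded shuffle stands in for C4.5's shuffle utility.
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, test_idx


def cross_validated_error(records, labels, learner, n_folds=10, seed=0):
    """Return the error rate (in %) averaged over the test folds."""
    fold_errors = []
    for train_idx, test_idx in ten_fold_splits(len(records), n_folds, seed):
        predict = learner([records[i] for i in train_idx],
                          [labels[i] for i in train_idx])
        wrong = sum(1 for i in test_idx if predict(records[i]) != labels[i])
        fold_errors.append(100.0 * wrong / len(test_idx))
    return sum(fold_errors) / len(fold_errors)


def majority_class_learner(train_records, train_labels):
    """Toy baseline (hypothetical): always predict the most frequent training class."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda record: majority


if __name__ == "__main__":
    toy_records = list(range(100))
    toy_labels = ["a"] * 80 + ["b"] * 20
    print(cross_validated_error(toy_records, toy_labels, majority_class_learner))
```

Any classifier can be plugged in through the `learner` argument, as long as it takes training records and labels and returns a prediction function.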
Parameters used in CBA V2.0:
- Use multiple class min support with Min Support upper bound = 1%
- Use multiple class min confidence
- Use rule pruning
- Rule Limit = 80,000
- Do not use small rules
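As a rough illustration of what "multiple class min support" could mean, the sketch below distributes one overall minimum support across the classes in proportion to their frequency, so that no class receives more than the 1% upper bound. This proportional scheme is an assumption made for illustration; this page does not spell out the exact formula, and `per_class_min_support` is a hypothetical helper, not a CBA function.

```python
from collections import Counter


def per_class_min_support(labels, total_min_support=0.01):
    """Split a total minimum support across classes by class frequency.

    Assumption: minsup(c) = total_min_support * freq(c), where freq(c) is the
    fraction of training records labelled c. Rare classes get a lower threshold,
    so rules for them are not filtered out by a single global cutoff.
    """
    counts = Counter(labels)
    n_records = len(labels)
    return {cls: total_min_support * count / n_records
            for cls, count in counts.items()}


if __name__ == "__main__":
    toy_labels = ["neg"] * 90 + ["pos"] * 10
    for cls, minsup in per_class_min_support(toy_labels).items():
        print(f"{cls}: {minsup:.4%}")
```

With a single min support of 1%, as in the V1.0 settings, a class covering only a small fraction of the records may yield few or no rules; a per-class threshold is one way to avoid that.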
Parameters used in CBA V1.0:
- Use single Min Support = 1%
- Use single Min Confidence = 50%
- Use rule pruning
- Rule Limit = 80,000
- Do not use small rules
C4.5Tree (Disc.): C4.5tree uses discretized data
C4.5Rules (Disc.): C4.5rules uses discretized data
| Dataset | CBA V2.0 (%) | CBA V1.0 (%) | C4.5 Tree (%) | C4.5 Rules (%) | C4.5 Tree (Disc.) (%) | C4.5 Rules (Disc.) (%) | RIPPER (%) | Naive Bayesian (%) |
| anneal | 2.1 | 3.6 | 7.5 | 5.2 | 9.8 | 6.5 | 4.6 | 2.7 |
| australian | 14.6 | 13.4 | 14.8 | 15.3 | 13.0 | 13.5 | 15.2 | 14.0 |
| auto | 19.9 | 27.2 | 17.6 | 19.9 | 27.8 | 29.2 | 23.8 | 32.1 |
| breast-w | 3.7 | 4.2 | 5.6 | 5.0 | 5.3 | 3.9 | 4.0 | 2.4 |
| cleve | 17.1 | 16.7 | 21.5 | 21.8 | 20.8 | 18.2 | 21.1 | 17.1 |
| crx | 14.6 | 14.1 | 15.0 | 15.1 | 14.6 | 15.9 | 14.6 | 14.6 |
| diabetes | 25.5 | 25.3 | 26.1 | 25.8 | 25.0 | 27.6 | 25.3 | 24.4 |
| german | 26.5 | 26.5 | 28.4 | 27.7 | 28.7 | 29.5 | 27.8 | 24.6 |
| glass | 26.1 | 27.4 | 30.4 | 31.3 | 25.2 | 27.5 | 35.0 | 29.4 |
| heart | 18.1 | 18.5 | 21.8 | 19.2 | 19.2 | 18.9 | 19.6 | 18.1 |
| hepatitis | 18.9 | 15.1 | 18.2 | 19.4 | 18.1 | 22.6 | 17.5 | 15.0 |
| horse | 17.6 | 18.7 | 14.7 | 17.4 | 14.7 | 16.3 | 14.7 | 20.6 |
| hypo | 1.0 | 1.7 | 0.7 | 0.8 | 1.0 | 1.2 | 0.8 | 1.5 |
| ionosphere | 7.7 | 8.2 | 10.5 | 10.0 | 9.7 | 8.0 | 11.4 | 11.9 |
| iris | 5.3 | 7.1 | 4.7 | 4.7 | 6.0 | 5.3 | 5.3 | 6.0 |
| labor | 13.7 | 17.0 | 22.3 | 20.7 | 23.0 | 21.0 | 16.5 | 14.0 |
| led7 | 28.1 | 27.8 | 30.5 | 26.5 | 26.5 | 26.5 | 30.8 | 26.7 |
| lymph | 22.1 | 19.6 | 23.8 | 26.5 | 20.9 | 21.0 | 20.8 | 24.4 |
| pima | 27.1 | 27.6 | 25.8 | 24.5 | 27.0 | 27.5 | 26.3 | 24.5 |
| sick | 2.8 | 2.7 | 1.1 | 1.5 | 2.1 | 2.1 | 1.9 | 3.9 |
| sonar | 22.5 | 21.7 | 28.4 | 29.8 | 29.8 | 27.8 | 27.9 | 23.0 |
| tic-tac-toe | 0.4 | 0.0 | 13.8 | 0.6 | 13.8 | 0.6 | 2.4 | 30.1 |
| vehicle | 31.0 | 31.3 | 28.5 | 27.4 | 34.5 | 33.6 | 31.4 | 40.1 |
| waveform21 | 20.3 | 20.6 | 22.8 | 21.9 | 29.9 | 24.6 | 20.5 | 19.3 |
| wine | 5.0 | 8.4 | 7.3 | 7.3 | 8.5 | 7.9 | 8.5 | 9.5 |
| zoo | 3.2 | 5.4 | 7.8 | 7.8 | 7.8 | 7.8 | 11.0 | 13.7 |
| Avg | 15.2 | 15.8 | 17.3 | 16.7 | 17.6 | 17.1 | 16.9 | 17.8 |
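The Avg row appears to be the plain (unweighted) mean of the 26 per-dataset error rates. The sketch below checks that reading for two of the columns, CBA V2.0 and C4.5 Rules, using the values from the table above; the variable and function names are illustrative only.

```python
# (dataset, CBA V2.0 error %, C4.5 Rules error %), copied from the table above.
ERROR_RATES = [
    ("anneal", 2.1, 5.2), ("australian", 14.6, 15.3), ("auto", 19.9, 19.9),
    ("breast-w", 3.7, 5.0), ("cleve", 17.1, 21.8), ("crx", 14.6, 15.1),
    ("diabetes", 25.5, 25.8), ("german", 26.5, 27.7), ("glass", 26.1, 31.3),
    ("heart", 18.1, 19.2), ("hepatitis", 18.9, 19.4), ("horse", 17.6, 17.4),
    ("hypo", 1.0, 0.8), ("ionosphere", 7.7, 10.0), ("iris", 5.3, 4.7),
    ("labor", 13.7, 20.7), ("led7", 28.1, 26.5), ("lymph", 22.1, 26.5),
    ("pima", 27.1, 24.5), ("sick", 2.8, 1.5), ("sonar", 22.5, 29.8),
    ("tic-tac-toe", 0.4, 0.6), ("vehicle", 31.0, 27.4),
    ("waveform21", 20.3, 21.9), ("wine", 5.0, 7.3), ("zoo", 3.2, 7.8),
]


def column_average(rows, column):
    """Unweighted mean of one numeric column (1 = CBA V2.0, 2 = C4.5 Rules)."""
    values = [row[column] for row in rows]
    return sum(values) / len(values)


if __name__ == "__main__":
    print("CBA V2.0  :", round(column_average(ERROR_RATES, 1), 1))   # 15.2
    print("C4.5 Rules:", round(column_average(ERROR_RATES, 2), 1))   # 16.7
```

Because the mean is unweighted, small datasets such as labor and zoo count as much toward the average as large ones such as hypo and sick.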