Introduction About the NSTB Research Project Research related publications Product Overview Download DM II trial software DMII Development Team E-mail us at dm2@comp.nus.edu.sg

The DM-II system has two downloadable tools: CBA (v2.1) and IAS.

  1. CBA (v2.1) (last modified June 25, 2001) is a data mining tool developed at the School of Computing, National University of Singapore. Its main algorithm was presented at KDD-98 in the paper "Integrating Classification and Association Rule Mining". Further improvements draw on ideas from our papers presented at KDD-99 and KDD-00. CBA originally stood for Classification Based on Associations. However, it not only produces an accurate classifier for prediction but also mines various forms of association rules.

    In summary, CBA (v2.1) has the following unique features:

    1. Classification and prediction using association rules
    • Build accurate classifiers from relational data, where each record is described by a fixed number of attributes. This is the type of data that traditional classification techniques use, e.g., decision trees, neural networks, and many others.
    • Better classification accuracy (compared to CBA v1.0, C4.5, RIPPER, Naive Bayes): On the 26 datasets from the UCI Machine Learning Repository used in our KDD-98 paper, CBA v2.1 achieves an average error rate of 15.2%. On the same datasets, C4.5 (release 8) has an average error rate of 16.7% and CBA v1.0 has 15.8%. Click here to see the detailed results. You can also download these datasets from the CBA download section.

    2.   Mining association rules from relational data or transactional data

    3.   Mining with multiple minimum supports (KDD-99)

    4.   Lift analysis (curve) added to the Predicting module

    5.   Faster mining speed (compared to CBA v1.0)

    6.   An HTML viewer to help the user understand the rules

    7.   Text categorization and classification (single class only, at present)

    • Build accurate classifiers from transactional data, where each data record has a variable number of items, e.g., items bought in a supermarket by a customer, or the keywords in a text document.
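To make the idea of classification with association rules concrete, here is a minimal Python sketch. It is not the CBA implementation: the function names, thresholds, and toy data are hypothetical, and the rule-selection (pruning) step of a real associative classifier is omitted. It mines rules of the form "itemset -> class" that meet a minimum support and confidence, ranks them, and classifies a record with the first rule that matches.

```python
from collections import Counter
from itertools import combinations


def mine_class_rules(records, min_support=0.2, min_confidence=0.6, max_len=2):
    """Return rules (antecedent, label, support, confidence), best first.

    records: list of (set of items, class label). Brute-force enumeration,
    suitable only for tiny illustrative data.
    """
    n = len(records)
    antecedent_counts = Counter()
    rule_counts = Counter()
    for items, label in records:
        for k in range(1, max_len + 1):
            for antecedent in combinations(sorted(items), k):
                antecedent_counts[antecedent] += 1
                rule_counts[(antecedent, label)] += 1

    rules = []
    for (antecedent, label), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[antecedent]
        if support >= min_support and confidence >= min_confidence:
            rules.append((antecedent, label, support, confidence))

    # Rank by confidence, then support, then shorter antecedent (an assumed ordering).
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def classify(rules, items, default_label):
    """Predict with the first ranked rule whose antecedent is contained in items."""
    for antecedent, label, _support, _confidence in rules:
        if set(antecedent) <= set(items):
            return label
    return default_label


# Hypothetical relational records, encoded as attribute=value items.
records = [
    ({"outlook=sunny", "windy=no"}, "play"),
    ({"outlook=sunny", "windy=yes"}, "play"),
    ({"outlook=rain", "windy=yes"}, "stay"),
    ({"outlook=rain", "windy=no"}, "play"),
    ({"outlook=rain", "windy=yes"}, "stay"),
]
rules = mine_class_rules(records)
prediction = classify(rules, {"outlook=rain", "windy=yes"}, default_label="play")
```

Encoding each attribute value as an item lets the same code handle both relational records and transactional data such as keyword sets.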

     

    CBA also has many other features: for example, it supports cross-validation for evaluating classifiers and lets the user view and query the discovered rules.
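Feature 3 above, mining with multiple minimum supports, can be illustrated with a small sketch. It assumes that each item carries its own minimum support and that an itemset counts as frequent when its support reaches the lowest threshold among its items. The function name, the `mis` dictionary, and the data are hypothetical, and the brute-force enumeration stands in for whatever algorithm CBA actually uses.

```python
from itertools import combinations


def frequent_itemsets_multi_minsup(transactions, mis, max_len=2):
    """Return {itemset: support} using a per-item minimum support.

    mis maps each item to its own minimum support (an assumed interface).
    An itemset is kept when its support is at least the smallest
    minimum support among its items.
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, max_len + 1):
        for candidate in combinations(items, k):
            count = sum(1 for t in transactions if set(candidate) <= t)
            support = count / n
            threshold = min(mis[item] for item in candidate)
            if support >= threshold:
                result[candidate] = support
    return result


# Hypothetical supermarket baskets: "blender" is rare but still of interest.
transactions = [
    {"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"milk"},
    {"bread", "blender"}, {"bread", "milk"}, {"milk"},
    {"bread", "milk"}, {"bread"}, {"milk", "blender"},
]
mis = {"bread": 0.5, "milk": 0.5, "blender": 0.15}
frequent = frequent_itemsets_multi_minsup(transactions, mis)
```

With a single minimum support of 0.5 for every item, the rare item "blender" would be discarded; a lower per-item threshold keeps it without flooding the result with combinations of common items.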

  2. IAS is a post-analysis system that helps the user find interesting association rules. The user supplies existing domain knowledge, and the system analyzes the discovered rules against that knowledge to find conforming rules (those consistent with the domain knowledge) and unexpected rules. The main ideas were presented at KDD-97, KDD-99, and PAKDD-99.
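The kind of post-analysis IAS performs can be sketched as follows. This is a deliberately simplified illustration, not the IAS method: it assumes that domain knowledge is given as expected rules of the form "antecedent -> consequent" and that matching is exact. All names and data are hypothetical.

```python
def analyze_rules(discovered, expected):
    """Split discovered rules into conforming, unexpected, and unrelated.

    Both arguments are lists of (antecedent frozenset, consequent).
    A discovered rule is compared with every expected rule whose
    antecedent is contained in the discovered antecedent.
    """
    report = {"conforming": [], "unexpected": [], "unrelated": []}
    for antecedent, consequent in discovered:
        relevant = [
            exp_consequent
            for exp_antecedent, exp_consequent in expected
            if exp_antecedent <= antecedent
        ]
        if not relevant:
            report["unrelated"].append((antecedent, consequent))
        elif consequent in relevant:
            report["conforming"].append((antecedent, consequent))
        else:
            report["unexpected"].append((antecedent, consequent))
    return report


# Hypothetical domain knowledge: young applicants tend to be rejected.
expected = [(frozenset({"age=young"}), "loan=rejected")]
discovered = [
    (frozenset({"age=young", "job=none"}), "loan=rejected"),
    (frozenset({"age=young", "income=high"}), "loan=approved"),
    (frozenset({"owns_house=yes"}), "loan=approved"),
]
report = analyze_rules(discovered, expected)
```

Here the second discovered rule is flagged as unexpected because it shares the expected condition but reaches a different conclusion, which is the kind of rule a user is likely to want to inspect first.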

 


Please send queries and bug reports by e-mail to dm2@comp.nus.edu.sg

School of Computing, National University of Singapore