
    Introduction (to data mining): In today's world, knowledge is power. An important source of knowledge is the data stored in databases. Data allows us to learn from the past and to predict the future. With the rapid computerization of businesses and organizations, a huge amount of data has been collected and stored in databases, and the volume of stored data is growing at a phenomenal rate. As a result, traditional ad hoc mixtures of statistical techniques and data management tools are no longer adequate for analyzing this vast collection of data. Data mining (or knowledge discovery in databases, KDD for short) has emerged as a growing field of multidisciplinary research for discovering interesting and useful knowledge from large databases. KDD is defined as the extraction of implicit, previously unknown, and potentially useful patterns from data.

    Over the past few years, research and development in data mining has made great progress. A large number of research and application papers have appeared in the literature. Many successful applications have been reported in various sectors such as marketing, finance, banking, manufacturing, and telecommunications. Examples of business applications include analyzing customer databases so that potential customers can be selected more precisely, and detecting fraud, from cellular cloning fraud to financial transactions that may indicate money-laundering activities. Data mining systems typically help businesses to expose previously unknown patterns in their databases. These information "nuggets" are used to improve profits, enhance customer service, and ultimately achieve a competitive advantage.

    According to the US market research firm Gartner Group Inc., data mining is one of the top 10 technologies to be watched in 1998.
    Data Mining Process: A practical data mining application is often complex. It is interactive and iterative, involving a number of key steps:

      1. Understanding the application domain, and the application goals.
      2. Extracting one or more target data sets from databases.
      3. Cleaning the data, e.g., removing noise and handling missing data.
      4. Removing the irrelevant attributes and tuples from the data.
      5. Choosing the data mining task, i.e., deciding whether the goal of the data mining process is classification, association, clustering, etc., or a combination of them.
      6. Choosing the data mining algorithms.
      7. Running the selected algorithms to discover hidden patterns in the data.
      8. Post-processing the discovered patterns, i.e., analyzing the patterns automatically or semi-automatically to identify those truly interesting/useful patterns for the user.
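
A few of the steps above (cleaning, attribute selection, mining, and post-processing) can be sketched in a short Python example. This is only an illustrative sketch, not the group's system or algorithms: the records in RAW, the function names, and the min_support threshold are all hypothetical, and simple pair counting stands in for a real association mining algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records: one shopping basket per customer.
# None marks a missing value (an assumption made for this sketch).
RAW = [
    {"id": 1, "items": ["bread", "milk", None]},
    {"id": 2, "items": ["bread", "butter", "milk"]},
    {"id": 3, "items": ["milk", "butter"]},
    {"id": 4, "items": ["bread", "milk"]},
    {"id": 5, "items": None},
]


def clean(records):
    """Step 3: drop records with no items and strip missing values."""
    cleaned = []
    for rec in records:
        if rec["items"] is None:
            continue
        items = [item for item in rec["items"] if item is not None]
        if items:
            cleaned.append({**rec, "items": items})
    return cleaned


def select_attributes(records):
    """Step 4: keep only the attribute relevant to the task (the item sets)."""
    return [frozenset(rec["items"]) for rec in records]


def mine_pairs(transactions):
    """Step 7: count how often each pair of items occurs together."""
    counts = Counter()
    for transaction in transactions:
        for pair in combinations(sorted(transaction), 2):
            counts[pair] += 1
    return counts


def post_process(counts, n_transactions, min_support=0.5):
    """Step 8: keep only pairs whose support reaches the threshold."""
    return {
        pair: count / n_transactions
        for pair, count in counts.items()
        if count / n_transactions >= min_support
    }


def run_pipeline(records, min_support=0.5):
    """Chain steps 3, 4, 7, and 8 on the given records."""
    transactions = select_attributes(clean(records))
    return post_process(mine_pairs(transactions), len(transactions), min_support)


if __name__ == "__main__":
    for pair, support in sorted(run_pipeline(RAW).items()):
        print(pair, support)
```

In practice the process is interactive and iterative, so a user would typically rerun such a pipeline with different cleaning rules or thresholds after inspecting the patterns it returns.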

    Data Mining Research at NUS: At the School of Computing, National University of Singapore, we have an active group of data mining researchers. The group conducts both basic and applied research. It has published extensively in important international journals and conferences on data mining. The group is also building a data mining system that can be used by industry. The system has a number of unique features, which come from the group's latest research results. We welcome any form of collaboration and applications. Please contact:
      Dr. Liu Bing
      Phone: +65 874 6736
Please direct queries and bug reports via E-mail:
School of Computing, National University of Singapore