Overview of the course
Learning Objectives
After this unit, students should
- understand the aims of IT5006.
- understand the key concepts covered in IT5006.
What is this course about?
IT5006 is designed as a core course that equips students with the skill set needed for proficient data analytics. The course establishes a robust foundation in statistics, classical machine learning techniques, and the analysis of results. Through interactive tutorials, students are not only introduced to a diverse array of real-world datasets but also gain hands-on experience with the latest tools and libraries used in data analysis. This course prepares students for advanced courses in both machine learning and business analytics.
Data analytics has spread across all verticals in education, with different schools emphasising different aspects of it. While the mathematics department views it as a statistical modelling course, the business school emphasises its analytical and interpretative aspects. In science and engineering, machine learning models are often used as black-box, off-the-shelf tools. IT5006, however, adopts a balanced perspective on data analytics. Although students are not evaluated on mathematical derivations, they are exposed to the derivations during the lessons. This white-box approach aids their comprehension of mathematical language and its role in analytics. Additionally, IT5006 places equal emphasis on quantitative, qualitative, and comparative analyses of models.
What will you learn after the course?
Any data analytics project goes through the stages shown in the following figure, which form the building blocks of the analytics pipeline. We will learn about each building block over the lessons in the course.
After the course, you will have learnt to
- Formulate an analytics problem from a real-world use case.
- Write small scripts to generate insightful visualisations.
- Write small scripts that use statistical and machine learning toolboxes.
- Interpret and analyse the results obtained from the models.
- Design end-to-end pipelines for analytics projects that work on real-world data.
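To give a flavour of the "small scripts" mentioned above, here is a minimal sketch of an end-to-end pipeline in Python. It is only an illustration, not course material: the stages (prepare data, split, fit a model, evaluate) are assumed, since the pipeline figure is not reproduced here; the dataset is synthetic; and every function name (make_data, split, fit_line, mean_absolute_error, run_pipeline) is hypothetical. It uses only the Python standard library, whereas real projects would typically rely on dedicated data analysis libraries.

```python
import random
import statistics


def make_data(n=50, seed=0):
    """Generate a synthetic dataset where y is roughly 2*x + 1 plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


def split(xs, ys, train_fraction=0.8):
    """Hold out the last part of the data for evaluation."""
    cut = int(len(xs) * train_fraction)
    return (xs[:cut], ys[:cut]), (xs[cut:], ys[cut:])


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(xs, ys, slope, intercept):
    """Average absolute gap between predictions and observed values."""
    errors = [abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)]
    return statistics.fmean(errors)


def run_pipeline():
    """Chain the assumed stages: data -> split -> model -> evaluation."""
    xs, ys = make_data()
    (train_x, train_y), (test_x, test_y) = split(xs, ys)
    slope, intercept = fit_line(train_x, train_y)
    mae = mean_absolute_error(test_x, test_y, slope, intercept)
    return {"slope": slope, "intercept": intercept, "test_mae": mae}


if __name__ == "__main__":
    for name, value in run_pipeline().items():
        print(f"{name}: {value:.3f}")
```

Evaluating on held-out data rather than on the training data is what makes the final number interpretable as an estimate of how the model behaves on unseen inputs.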
What do you need to know before taking the course?
The course uses Python as the default language to demonstrate the concepts. Although students are free to use a language of their choice, we prefer Python to ensure fairness. Students are therefore expected to have completed a basic Python course before joining IT5006. Basic knowledge of statistics and linear algebra also flattens the learning curve. We have provided primers for Python and linear algebra in the Appendix section of the notes.
What is this course not about?
In the age of buzzwords, data analytics has become an umbrella term for many verticals such as Big Data, Deep Learning, AI, Fintech, and Blockchain. Although these areas overlap somewhat with the content of the course, the overlap is not significant enough for you to specialise in any of them. IT5006 aims to establish a strong foundation for you to take any of these specialised courses in the future.