CS5249 - Audio in Multimedia Systems
2008/2009, Semester I, Monday, 18:30-20:30, COM1-206
We will be using the Integrated Virtual Learning Environment (IVLE)
for forum discussions, announcements, and other time-sensitive materials.
This module is intended for advanced undergraduate and graduate students who wish
to understand modern audio technologies, from low-level audio representation
to high-level content analysis, and from basic waveforms to advanced audio
compression and compressed-domain processing.
The fundamentals of audio technologies are taught first, with a focus on the
aspects of audio that students need for research in multimedia,
embedded systems, databases, and networks. We also pay attention to
technologies with significant industrial relevance. Upon completion of this
module, students should be comfortable using audio in their own research.
They should also have the fundamental knowledge to perform industrial R&D or
Modular Credits: 4.
Prerequisite: CS3242 Hypermedia Technologies or equivalent. Basic knowledge
of signal processing is very helpful.
Ye Wang, Office: AS6 #04-08 (6516 2980). You may also contact me via email.
Workload: 2 lecture hours and about 8 hours of preparation per week,
depending on your background.
Reference Book: We will be reading relevant chapters from reference books
as well as journal and conference papers.
Ben Gold and Nelson Morgan (2000). Speech and Audio Signal Processing.
Part I: Basics of Signal and Music Processing
- Introduction to Audio Processing with Matlab
- Introduction to Music and Music Information Retrieval
- Speech/Music Recording and Processing
- Basics of Audio and Speech Synthesis
- Basics of Psychoacoustics and Audio Compression
Part II: Analysis of Speech and Music Signals
- Analysis Framework
- Audio Feature Extraction
- HMM and GMM
- Case study: speech recognition and speaker recognition
Part III: Student Projects
A demo, well-commented source code, and a Readme file are required upon completion of the projects.
Recommended programming languages: Matlab/Octave, C/C++, Python, Java.
- Project 1: Proposal of novel ideas on spoken language education, music education or music retrieval with mobile devices (one week)
- Project 2: Audio timescale modification (two weeks)
- Project 3: Music genre classification or Audio segmentation (six weeks)
(You may propose your own project, but discuss it with the lecturer first.)