1
LyricAlly: Automatic Synchronization of Acoustic Musical Signals and Textual Lyrics
  • Ye Wang, Min Yen Kan, Tin Lay Nwe, Arun Shenoy, Jun Yin
2
Introduction
  • Motivation
    • Is singing voice transcription really necessary?


      • Speech recognizers cannot be directly deployed
      • Availability of music lyrics on the internet


4
  • Bryan Adams – Back to you
13
  • Chorus sections detected by high level of repetition.
    • Accounts for phoneme, word and line level repetition.
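
A minimal sketch of the line-level part of this idea, assuming that lyric lines recurring after light normalization are candidate chorus lines. It is not the authors' detector: the function names, the `min_count` threshold, and the placeholder lyric lines are hypothetical, and phoneme- and word-level repetition are not modeled here.

```python
import re
from collections import Counter

def normalize(line: str) -> str:
    """Lowercase and strip punctuation so near-identical lines compare equal."""
    return re.sub(r"[^a-z0-9 ]+", "", line.lower()).strip()

def repeated_line_flags(lines, min_count=2):
    """Return one boolean per lyric line: True when its normalized text
    occurs at least min_count times (an assumed, illustrative threshold)."""
    keys = [normalize(line) for line in lines]
    counts = Counter(key for key in keys if key)
    return [bool(key) and counts[key] >= min_count for key in keys]

# Placeholder lines invented for illustration, not taken from any song.
lyrics = [
    "first verse line one",
    "first verse line two",
    "this is the hook",
    "sing the hook again",
    "second verse line one",
    "second verse line two",
    "This is the hook!",
    "Sing the hook again",
]
flags = repeated_line_flags(lyrics)
for flag, line in zip(flags, lyrics):
    print("chorus?" if flag else "       ", line)
```

Contiguous runs of flagged lines would then be grouped into candidate chorus sections; a fuller treatment would also tolerate small wording differences between repeats.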


16
  • Observation: gaps between sections are shorter and more stable than the sections themselves


20
  • LYRICALLY SYSTEM DEMO
21
  • Starting-point calculation is more difficult than duration estimation
22
  • Decreasing order of criticality:




23
  • Line level alignment of text and musical audio


  • Text is crucial for duration estimation


  • Rhythm detection can inform downstream components


  • Accuracy of chorus detection is vital


  • Vocal detection model uses a training-based approach


  • Real-time performance requires exploring alternative vocal detection models
24
  • GENERAL
  • Limitation – 4/4 meter and V1-C1-V2-C2-B-O song structure
  • Future Work – alternate meters and song structures



  • AUDIO
  • Limitation – Is MM-HMM the optimal classifier?
  • Future Work – mixture modeling or classifiers such as SVM and NN


  • Limitation – Restricted to percussive audio
  • Future Work – new approach to drumless rhythm detection



  • TEXT
  • Limitation – Phoneme duration estimation is independent of tempo
  • Future Work – Re-estimation using tempo information
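
To make the TEXT limitation concrete, the sketch below contrasts a tempo-independent duration estimate with one rescaled by tempo. The linear scaling rule, the function name, and all numeric values are assumptions chosen for illustration; the slides do not specify how the re-estimation would be done.

```python
def estimate_line_duration(num_phonemes, mean_phoneme_sec,
                           tempo_bpm=None, reference_bpm=None):
    """Estimate the sung duration of one lyric line in seconds.

    Without tempo information the estimate is simply the phoneme count
    times a mean phoneme duration. When both tempos are given, the
    estimate is scaled by reference_bpm / tempo_bpm, an assumed linear
    rule: a song twice as fast as the reference halves the estimate.
    """
    base = num_phonemes * mean_phoneme_sec
    if tempo_bpm is None or reference_bpm is None:
        return base
    return base * reference_bpm / tempo_bpm

# Hypothetical values: 20 phonemes at a mean of 0.1 s each.
plain = estimate_line_duration(20, 0.1)
scaled = estimate_line_duration(20, 0.1, tempo_bpm=120, reference_bpm=60)
print(plain, scaled)
```

The tempo-independent estimate returns the same value for a slow ballad and a fast song, which is the limitation the slide names.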