 
Details of Grant
 
EPSRC Reference: GR/S75802/01
Title: Object-based Coding of Musical Audio
Principal Investigator: Professor M Plumbley
Other Investigators:
Professor ME Davies
Professor M Sandler
Researcher Co-investigator:
Project Partner:
Department: Sch of Electronic Eng & Computer Science
Organisation: Queen Mary, University of London
Scheme: Standard Research
Starts: 01 October 2004
Ends: 30 November 2007
Value (£): 202,353
EPSRC Research Topic Classifications:
Multimedia User Interface Technologies
EPSRC Industrial Sector Classifications:
Creative Industries
Related Grants:
Panel History:  
Summary
Current coding systems for musical audio tend to use transform coding or filterbanks, with bit allocation determined by psychophysical masking thresholds. A recent alternative approach is to decompose the signal into a parametric encoding, consisting for example of sinusoids+noise or tonals+transients+residuals. In this project, we will develop a methodology for audio coding using higher-level "sound objects", consisting of individual notes or chords played by particular instruments. From a top-down direction, this will be based on our expertise in automatic music transcription. We will explore a range of alternative approaches using different levels of information, down to sparse coding methods for grouping of low-level parametric coding objects. We will initially investigate coding of simple monophonic (single-voice) musical audio, extending this work to single-instrument and small-group polyphonic musical audio as the project progresses. We will use MPEG-4 Structured Audio as our primary encoding format. While full object-based coding of complex polyphonic audio scenes (such as a symphony orchestra) is still a long way off, this project is a first step towards this important long-term goal.
Final Report Summary
Current coding systems for musical audio tend to use transform coding or filterbanks, with bit allocation determined by psychophysical masking thresholds. A recent alternative approach is to decompose the signal into a parametric encoding, consisting for example of either sine waves plus noise, or sine waves plus transients plus noise. In this project, we developed a methodology for audio coding using higher-level "sound objects", consisting of individual notes or chords played by particular instruments, using models based on Bayesian probability theory. While these models can be complex and time-consuming to calculate, we developed new efficient methods to learn these models faster than existing approaches. Using listening tests, we showed that these gave better results at low bit rates than alternative methods. We also explored closely related technologies, such as audio source separation and sparse representations, which we believe will be useful for future object-based coding systems. While full object-based coding of complex polyphonic audio scenes (such as a symphony orchestra) is still a long way off, this project is a first step towards this important long-term goal.
Further Information:  
Organisation Website: http://www.qmul.ac.uk