|
| EPSRC Reference: |
GR/S75802/01 |
| Title: |
Object-based Coding of Musical Audio |
| Principal Investigator: |
Professor M Plumbley |
| Other Investigators: |
|
| Researcher Co-investigator: |
|
| Project Partner: |
|
| Department: |
Sch of Electronic Eng & Computer Science |
| Organisation: |
Queen Mary, University of London |
| Scheme: |
Standard Research |
| Starts: |
01 October 2004 |
Ends: |
30 November 2007 |
Value (£): |
202,353
|
| EPSRC Research Topic Classifications: |
| Multimedia |
User Interface Technologies |
|
| EPSRC Industrial Sector Classifications: |
|
| Related Grants: |
|
| Panel History: |
|
|
Summary |
|
Current coding systems for musical audio tend to use transform coding or filterbanks, with bit allocation determined by psychophysical masking thresholds. A recent alternative approach is to decompose the signal into a parametric encoding, consisting for example of sinusoids+noise or tonals+transients+residuals. In this project, we will develop a methodology for audio coding using higher-level "sound objects", consisting of individual notes or chords played by particular instruments. From a top-down direction, this will be based on our expertise on automatic music transcription. We will explore a range of alternative approaches using different levels of information, down to sparse coding methods for grouping of low-level parametric coding objects. We will initially investigate coding of simple monophonic (single-voice) musical audio, extending this work to single-instrument and small-group polyphonic musical audio as the project progresses. We will use MPEG-4 Structured Audio as our primary encoding format. While full object-based coding of complex polyphonic audio scenes (such as a symphony orchestra) is still a long way off, this project is a first step towards this important long-term goal.
|
| Final Report Summary |
|
Current coding systems for musical audio tend to use transform coding or filterbanks, with bit allocation determined by psychophysical masking thresholds. A recent alternative approach is to decompose the signal into a parametric encoding, consisting for example of either sine waves plus noise, or sine waves plus transients plus noise. In this project, we developed a methodology for audio coding using higher-level "sound objects", consisting of individual notes or chords played by particular instruments, using models based on Bayesian probability theory. While these models can be complex and time-consuming to calculate, we developed new efficient methods to learn these models faster than existing approaches. Using listening tests, we showed that these gave better results at low bit rates than alternative methods. We also explored closely related technologies, such as audio source separation and sparse representations, which we believe will be useful for future object-based coding systems. While full object-based coding of complex polyphonic audio scenes (such as a symphony orchestra) is still a long way off, this project is a first step towards this important long-term goal.
|
| Further Information: |
|
| Organisation Website: |
http://www.qmul.ac.uk |
|
|