In sound design for immersive media and games, the designer typically starts by searching a large sound effects library for the desired sounds. The project team have delivered a highly successful InnovateUK feasibility project demonstrating that existing sound effects libraries can be replaced by sound synthesis: software algorithms that generate the desired sounds on demand, enabling high-quality sound effects across all forms of content creation. But the biggest challenge for professional sound design is the effort required to integrate sound effects into the timeline and story, and to synchronise them with video or animation content.
The Autonomous Systems for Sound Integration and GeneratioN (ASSIGN) project will deliver and validate a prototype sensor-driven sound synthesis service, capable of autonomous decision-making, for use by anyone wanting to enhance or interact with sound. It will allow synthesis of any sound, with integration into existing workflows.
The ASSIGN system generates sounds together with their context from other sensor data. It uses animation storyboarding and visual object recognition information to drive sound generation, placement and perspective automatically, thus enabling new forms of interaction.
By exploiting sensors to generate sounds and their context, we give intelligence to the sound effects generation. This fits nicely with computer graphics approaches, where much of the animation is driven by some high level information, e.g., if a man drops a glass, we see it falling in the virtual world of the game, film or augmented reality. The animation is a property of the object, and sound effects should follow this same paradigm, thus enabling synchronisation of sound effects with CGI in immersive, game, film and augmented reality design.
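The "sound as a property of the object" paradigm above can be illustrated with a minimal sketch. This is not the ASSIGN implementation; the class and function names (`GlassObject`, `on_impact`, `synth_impact`) and the damped-sinusoid impact model are illustrative assumptions only, showing how the same physics event that drives an object's animation could drive synthesis of its sound effect.

```python
import math

SAMPLE_RATE = 44100

def synth_impact(velocity, duration=0.3, freq=2500.0):
    """Toy impact model: a damped sinusoid whose amplitude scales with
    impact velocity, giving a short 'clink'. Returns a list of samples."""
    n = int(duration * SAMPLE_RATE)
    amp = min(1.0, velocity / 10.0)          # harder impacts are louder
    decay = 30.0                             # exponential decay rate (1/s)
    return [amp * math.exp(-decay * t / SAMPLE_RATE)
            * math.sin(2 * math.pi * freq * t / SAMPLE_RATE)
            for t in range(n)]

class GlassObject:
    """Sound is a property of the object, like its animation: when the
    physics engine reports an impact, the object synthesises its own
    effect, so sound and CGI stay synchronised by construction."""
    def on_impact(self, velocity):
        return synth_impact(velocity)

glass = GlassObject()
samples = glass.on_impact(velocity=4.0)      # the glass hits the floor
```

In this scheme no library lookup is needed: the event that animates the falling glass also parameterises its sound, which is the synchronisation the paragraph describes.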
ASSIGN has the potential to revolutionize the sound design process for production of film, TV and animation content. By integrating such technologies with intuitive cloud media production tools developed by the lead partner, RPPtv, we can deliver a novel, sensor-driven SFX synthesis service. It democratises the industry, giving anyone the ability to become a sound designer, with the control to harness their own creativity.
Initial work by the academic partner in this area has shown great promise. A wide range of sounds, including sound textures, soundscapes and impulse sounds covering the most popular sound effects, can be synthesized in real-time with high quality and relevant user controls. But significant research questions remain unanswered:
- How rich and relevant are available metadata for control and placement of synthesized sound effects?
- How effective are state-of-the-art object and scene recognition and object tracking methods for extracting details of an acoustic environment?
- How versatile are the sound synthesis models?
- How can the parameters of sound synthesis models be mapped to intuitive controls?
- How should autonomous sound effect generation and integration methods be evaluated?
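The parameter-mapping question above can be made concrete with a small sketch. The control names (`size`, `hardness`), ranges and mapping formulas below are illustrative assumptions, not the project's method: the idea is simply that a few perceptual controls a sound designer understands are mapped onto the low-level parameters a synthesis model requires.

```python
def map_controls(size, hardness):
    """Map intuitive perceptual controls in [0, 1] to (hypothetical)
    synthesis parameters: larger objects resonate lower; harder
    materials ring brighter and longer."""
    assert 0.0 <= size <= 1.0 and 0.0 <= hardness <= 1.0
    freq = 4000.0 - 3500.0 * size            # Hz: big objects ring low
    decay = 60.0 - 50.0 * hardness           # 1/s: hard objects ring longer
    brightness = 0.2 + 0.8 * hardness        # relative high-harmonic level
    return {"freq_hz": freq, "decay_per_s": decay, "brightness": brightness}

small_hard = map_controls(size=0.1, hardness=0.9)   # e.g. a wine glass
large_soft = map_controls(size=0.9, hardness=0.2)   # e.g. a cardboard box
```

Research would then ask whether such mappings generalise across synthesis models and whether the resulting controls feel intuitive to users, which is exactly what the evaluation question that follows addresses.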
The project is industry-led, supported by an experienced academic partner, and validated with industry users. It exploits the outputs of a highly successful InnovateUK Feasibility Study and groundbreaking research from Queen Mary University of London's audio engineering team. It brings together knowledge, skills and technologies from immersive and games audio production, cloud service delivery and academic research excellence. A demonstrator of the service will be evaluated and validated in audio post-production with immersive, object-based and 3D audio specialists Gareth Llewelyn Sound. The business potential is compelling, since the project will demonstrate a disruptive cloud service for a globally used and purchased media resource. The outputs will include a prototype for CGI-linked SFX production, with a full market analysis, business models and a roadmap to launch a commercial service.