Scalable Music: Automatic Music Retargeting and Synthesis

Project Members

Simon Wenner (ETH Zurich)
Jean-Charles Bazin (ETH Zurich)
Alexander Sorkine-Hornung (Disney Research Zurich)



Just like images and videos, music is an essential part of our everyday life. In fact, video and audio content are tightly coupled in many applications, such as soundtracks for movies, computer games, or music videos. Despite these apparent similarities, many recent advances in image and video processing have not been generalized appropriately to the audio domain, even though doing so would be beneficial in practice. In this paper, we propose a method for dynamic rescaling and reassembly of music, inspired by recent work on image retargeting, video reshuffling and character animation in the computer graphics community. Given the desired target length of a piece of music and optional additional constraints, such as the positions and importance of certain parts, we build on concepts from seam carving, video textures and motion graphs and extend them to allow for a global optimization of cuts, jumps and repetitions in an audio signal. Based on automatic feature extraction and spectral clustering for segmentation, we employ a k-stops shortest path search via dynamic programming to synthesize a novel piece of music that best fulfills all desired constraints, with imperceptible transitions between reshuffled parts. We show various applications of music retargeting, such as decreasing or increasing the length of a piece of music (up to infinity), video editing, and removal or addition of parts (e.g. discarding unpleasant singing parts).
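To make the idea of a k-stops shortest path search via dynamic programming more concrete, the following Python sketch finds the cheapest path through a small graph that uses exactly k transitions. It is only an illustration under stated assumptions and not the implementation from the paper: the nodes stand for hypothetical equal-length music segments, the matrix `transition_cost` holds made-up costs for jumping from one segment to another, and the number of transitions k stands in for the desired target length. The function name `k_step_shortest_path` and all values are invented for this example.

```python
import math

INF = math.inf


def k_step_shortest_path(cost, source, target, k):
    """Cheapest path from source to target that uses exactly k edges.

    cost[i][j] is the cost of going from node i to node j
    (math.inf if that transition is not allowed).
    Returns (total_cost, path), or (math.inf, []) if no such path exists.
    """
    n = len(cost)
    # best[t][j]: minimal cost of reaching node j from source with exactly t edges
    best = [[INF] * n for _ in range(k + 1)]
    # prev[t][j]: predecessor of j on that minimal path (for backtracking)
    prev = [[-1] * n for _ in range(k + 1)]
    best[0][source] = 0.0

    for t in range(1, k + 1):
        for j in range(n):
            for i in range(n):
                candidate = best[t - 1][i] + cost[i][j]
                if candidate < best[t][j]:
                    best[t][j] = candidate
                    prev[t][j] = i

    if math.isinf(best[k][target]):
        return INF, []

    # Walk the predecessor table backwards from the target.
    path = [target]
    node = target
    for t in range(k, 0, -1):
        node = prev[t][node]
        path.append(node)
    path.reverse()
    return best[k][target], path


# Hypothetical example: four segments 0..3. Playing segment i followed by
# i + 1 costs nothing (the original order); other jumps carry an assumed
# perceptual penalty, and INF marks transitions that are not allowed.
transition_cost = [
    [INF, 0.0, 0.5, INF],
    [0.4, INF, 0.0, 0.9],
    [INF, 0.3, INF, 0.0],
    [INF, INF, INF, INF],
]

if __name__ == "__main__":
    for k in (2, 3, 5):
        total, path = k_step_shortest_path(transition_cost, 0, 3, k)
        print(k, total, path)
```

In this toy setting, k = 3 reproduces the original segment order, a smaller k forces a cut (a jump that skips a segment), and a larger k forces a repetition (a jump back to an earlier segment). The table has k + 1 rows and n columns, and each entry scans n predecessors, so the search takes on the order of k · n² steps.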


May 6, 2013
Eurographics 2013
