Video Processing

A Disney story is often told through video, whether it's a movie, a serial, a newscast, or professional sports. This raises a range of research challenges with direct economic impact: for example, automating labor-intensive processes while preserving art directability, avoiding expensive reshoots by adding content-aware flexibility in post-production, and adapting to a world with increasingly diverse devices. Video technology, from capture to display, is of utmost importance for many Disney business units, and the Advanced Video Technology group addresses these needs through leading-edge research. The work spans the whole range from conventional 2D video, through 3D video (stereoscopic and beyond), to the most sophisticated free-viewpoint video representations. The group designs camera systems that take signal acquisition to the next level. A particular focus is on subsequent processing of any type, automatic or interactive, real-time or in post-production, to create the highest-quality images.

Projects

2D to 3D Video Conversion

The purpose of this project is to make research contributions in the area of 2D-to-3D conversion that can eventually be incorporated into the existing conversion pipeline used by Animation. To support 2D-to-3D conversion using warping technologies, the fundamental warping algorithms must be improved. Unlike in image retargeting and stereo editing, objects often have to separate from one another and move in opposite directions. Continuous image warping currently supports this poorly. Therefore, we will analyze new methods for introducing discontinuities into continuous image warping techniques.

FusionCam

Traditional cinematography uses a single camera, and two-camera rigs have become common with the recent wave of 3D cinema. Taking camera design a step further, this project proposes a system in which a central cinematographic camera is augmented with a clip-on frame of satellite sensors. The satellite devices include compact cameras, a depth sensor, and a thermal camera. The result is a FusionCam that supports more powerful post-production analysis than is possible with a single camera or a two-camera rig, and that can synthetically generate stereoscopic 3D imagery with specified stereo parameters.

The core research challenge is to produce better depth maps by integrating the high-resolution image from the central cinematographic camera with the information from the satellite sensor modalities. Fusing these different modalities raises the questions of how the strengths of each modality can best be exploited and how the weaknesses of each can best be compensated for. Current work analyzes the data at a single time instant; new work will extend this to temporal analysis of video.
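
The idea of letting each modality contribute where it is strong can be illustrated with a minimal sketch. It assumes, purely for illustration, that every modality delivers a depth reading per pixel together with an error variance, and it combines them with standard inverse-variance weighting. The project's actual fusion method is not described here, and the function names `fuse_depth` and `fuse_depth_maps` are hypothetical.

```python
from typing import List, Optional, Sequence


def fuse_depth(depths: Sequence[Optional[float]],
               variances: Sequence[float]) -> Optional[float]:
    """Inverse-variance weighted mean of the available readings for one pixel.

    depths[i] is the reading of modality i, or None if that modality has no
    measurement at this pixel. variances[i] is its assumed error variance (> 0).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for depth, variance in zip(depths, variances):
        if depth is None:
            continue  # this modality contributes nothing at this pixel
        weight = 1.0 / variance  # reliable modalities get a larger weight
        weighted_sum += weight * depth
        weight_total += weight
    if weight_total == 0.0:
        return None  # no modality saw this pixel
    return weighted_sum / weight_total


def fuse_depth_maps(maps: Sequence[List[List[Optional[float]]]],
                    variances: Sequence[float]) -> List[List[Optional[float]]]:
    """Apply fuse_depth independently at every pixel of equally sized maps."""
    height, width = len(maps[0]), len(maps[0][0])
    return [[fuse_depth([m[y][x] for m in maps], variances)
             for x in range(width)]
            for y in range(height)]
```

A per-pixel variance map, rather than one variance per modality, would be a natural refinement, since the reliability of a sensor usually depends on the scene content.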

*This project is categorized under Video Processing and Computer Vision.

Our video retargeting technology adapts the aspect ratio of a video to different output devices while preserving the shape of visually important objects and hiding the necessary deformation in visually less salient regions.
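
The principle of hiding deformation in less salient regions can be sketched in one dimension. The following is a simplified illustration, not the project's algorithm: it assumes a per-column saliency value in [0, 1] and narrows each column in proportion to how non-salient it is. The function names `retarget_column_widths` and `column_positions` are hypothetical.

```python
from typing import List, Sequence


def retarget_column_widths(saliency: Sequence[float],
                           target_width: float) -> List[float]:
    """Return a new width for each input column (input columns have width 1).

    Columns with saliency 1 keep their width; the required shrinkage is
    distributed over the other columns in proportion to (1 - saliency).
    """
    n = len(saliency)
    if not 0 < target_width <= n:
        raise ValueError("target_width must be in (0, number of columns]")
    # "Slack" is how much of a column may be squeezed away.
    slack = [1.0 - min(max(s, 0.0), 1.0) for s in saliency]
    shrink = n - target_width
    if shrink == 0:
        return [1.0] * n
    total_slack = sum(slack)
    if total_slack < shrink:
        raise ValueError("not enough non-salient area to reach target width")
    return [1.0 - shrink * s / total_slack for s in slack]


def column_positions(widths: Sequence[float]) -> List[float]:
    """Left edge of every column in the retargeted image."""
    positions = []
    x = 0.0
    for w in widths:
        positions.append(x)
        x += w
    return positions
```

For example, with saliency `[0, 0, 1, 1, 0, 0]` and a target width of 4, the two salient middle columns keep width 1 and the four border columns shrink to width 0.5 each.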

Stereoscopic 3D creates the illusion of depth. However, extremely careful design is necessary to ensure an excellent user experience; it has to take into account display technology, human visual perception, and artistic intent. One important capability in this context is changing the disparity composition (and with it, the depth perception) of stereo content after capture. So far, no system has supported this satisfactorily. In this ground-breaking work, we developed algorithms that provide full control over the disparity of a given stereo pair.
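
As a minimal sketch of what changing the disparity composition can mean, the following assumes the simplest possible case: a linear operator that rescales the disparity range of the input to a target range. The function name `linear_disparity_map` and its parameters are hypothetical, and the algorithms developed in this work may use different mappings.

```python
from typing import List, Sequence, Tuple


def linear_disparity_map(disparities: Sequence[float],
                         target_range: Tuple[float, float]) -> List[float]:
    """Linearly remap disparities so that their range becomes target_range.

    disparities: per-pixel (or per-feature) disparities of the input stereo pair.
    target_range: (lowest, highest) disparity wanted in the output.
    """
    d_min, d_max = min(disparities), max(disparities)
    t_min, t_max = target_range
    if d_max == d_min:
        # Flat scene: place everything at the centre of the target range.
        return [(t_min + t_max) / 2.0 for _ in disparities]
    scale = (t_max - t_min) / (d_max - d_min)
    return [(d - d_min) * scale + t_min for d in disparities]
```

For instance, `linear_disparity_map([-10, 0, 30], (-5, 15))` halves the disparity range and yields `[-5.0, 0.0, 15.0]`. The remapped disparities only specify the target; producing the new views from them is a separate step that this sketch does not cover.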

Stereoscopic 3D Copy & Paste

With the increase in popularity of stereoscopic 3D imagery for film, TV, and interactive entertainment, an urgent need for editing tools to support stereo content creation has become apparent. In this paper we present an end-to-end system for object copy & paste in a stereoscopic setting to address this need. As we show in this paper, there is no straightforward extension of 2D copy & paste that supports the added third dimension.

Warping-based Motion Estimation for High Efficiency Video Coding

Image warping techniques have recently been applied in different domains to create synthetic but visually plausible images, e.g. for image retargeting, artistic manipulation of images, or disparity mapping for stereoscopic 3D. In contrast, most state-of-the-art video coders, such as H.264/AVC, still employ block-based motion estimation, in which each predicted image is assembled from blocks of texture copied from an already encoded image. Predicted images often suffer from so-called blocking artifacts, which are strongly noticeable and appear mainly at block boundaries. The goal of this project is (a) to evaluate whether warping can lead to improved temporal prediction while creating visually plausible images, and (b) to evaluate whether warping can contribute to higher overall video coding efficiency.
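
The block-based baseline described above can be sketched as a full-search block matcher. This is a textbook illustration, assuming grayscale frames stored as lists of rows, integer-pixel motion, and the sum of absolute differences (SAD) as the matching cost; it is not the H.264/AVC procedure, which adds variable block sizes, sub-pixel motion, and rate-distortion decisions. All function names are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]


def sad(ref: Frame, cur: Frame, ry: int, rx: int,
        cy: int, cx: int, n: int) -> int:
    """Sum of absolute differences between two n-by-n blocks."""
    return sum(abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
               for j in range(n) for i in range(n))


def best_motion_vector(ref: Frame, cur: Frame, by: int, bx: int,
                       n: int, search: int) -> Tuple[int, int]:
    """Find the offset (dy, dx) into ref that best matches the block of cur
    whose top-left corner is (by, bx), searching +/- search pixels."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = by + dy, bx + dx
            # Skip candidates that reach outside the reference frame.
            if ry < 0 or rx < 0 or ry + n > height or rx + n > width:
                continue
            cost = sad(ref, cur, ry, rx, by, bx, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


def predict_frame(ref: Frame, cur: Frame, n: int, search: int) -> Frame:
    """Build the prediction of cur by copying the best-matching block of ref
    for every n-by-n block (frame size assumed to be a multiple of n)."""
    height, width = len(cur), len(cur[0])
    pred = [[0] * width for _ in range(height)]
    for by in range(0, height - n + 1, n):
        for bx in range(0, width - n + 1, n):
            dy, dx = best_motion_vector(ref, cur, by, bx, n, search)
            for j in range(n):
                for i in range(n):
                    pred[by + j][bx + i] = ref[by + dy + j][bx + dx + i]
    return pred
```

Because every block is copied with its own motion vector, neighbouring blocks can disagree along their shared edge, which is where the blocking artifacts mentioned above originate. A warping-based predictor would instead deform the reference image continuously.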