[visionlist] PhD proposal at INRIA, Grenoble: Learning-Based Camera Control for Stereoscopic 3D Cinematography
Remi Ronfard
remi.ronfard at inria.fr
Mon May 14 08:38:42 GMT 2012
The IMAGINE team at INRIA Grenoble is looking for a PhD student on the
topic of
Learning-Based Camera Control for Stereoscopic 3D Cinematography
Please contact Dr. Remi Ronfard (remi.ronfard at inria.fr) for information
or to apply.
Requirements : Master's degree in Computer Science or equivalent.
Excellent academic record in machine learning and computer graphics.
Context : The PhD is funded by the French government as part of the
ACTION3DS project, a national research project coordinated by Thalès
Angénieux, with the goal of building the next generation of stereoscopic
3D camera rigs for cinema and broadcast production.
Controlling a stereoscopic 3D camera in real time is a difficult
practical problem, involving the simultaneous control of a large number
of parameters, which typically requires a large crew of skilled and
perfectly synchronized technicians and artists. Automatic tools for
controlling some or all of the parameters are therefore needed to
increase the quality and bring down the cost of stereoscopic 3D
production, especially in broadcast and independent cinema. The problem
is made especially difficult in the case of dynamic scenes, because (1)
there is no established theory for predicting the quality of a
stereoscopic 3D sequence as a function of its parameters and (2)
real-time video analysis of disparities is difficult and error-prone in
fast moving dynamic scenes.
To overcome those difficulties, we propose to study the problem
experimentally in a simulated environment. We will pose the problem of
controlling a stereoscopic camera rig as a *sequential decision
problem,* where a virtual stereographer program evaluates the
stereoscopic image frame by frame and remotely controls the camera rig
parameters in order to maximize the expected viewing comfort and
interest of the entire sequence.
The goal of the thesis will therefore be to propose, design and
implement techniques for choosing optimal control policies that maximize
a reward function combining the viewing comfort and the interest of the
recorded sequence, from the viewpoint of the audience. To do this, we
will use the framework of reinforcement learning, using mathematical
models of viewing comfort and interest proposed in the literature, as
well as real examples from experts in 3-D cinematography and editing.
The proposed models will be tested and evaluated using real-time
computer simulations in the Blender Game Engine. When the tests are
successful, the trained models will be embedded in the motion control
program of a stereoscopic 3D rig.
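
As a rough illustration of the reinforcement learning framing above, here
is a minimal Python sketch of tabular Q-learning on a toy convergence
control problem. Everything in it is an assumption made for illustration:
the five depth bins, the three convergence actions, the weights, and the
placeholder comfort and interest functions are hypothetical and are not
the models from the literature cited in this proposal.

```python
import random

N_BINS = 5            # hypothetical number of discretized depth bins
ACTIONS = (-1, 0, 1)  # move convergence one bin nearer, hold, one bin farther


def comfort(subject, conv):
    # Placeholder comfort score: 0 (best) when the subject is on the screen plane.
    return -abs(conv - subject)


def interest(subject, conv):
    # Placeholder interest score: bonus when the subject is one bin in front
    # of the screen plane. Purely illustrative.
    return 1.0 if conv == subject + 1 else 0.0


def reward(subject, conv, w_comfort=1.0, w_interest=0.5):
    # Weighted combination of comfort and interest (weights are assumptions).
    return w_comfort * comfort(subject, conv) + w_interest * interest(subject, conv)


def step(subject, conv, action):
    # Apply the action, clip to the valid range, and score the new setting.
    new_conv = min(N_BINS - 1, max(0, conv + action))
    return new_conv, reward(subject, new_conv)


def train(episodes=5000, horizon=10, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # State = (subject depth bin, convergence bin); one Q-value per action.
    q = {(s, c): [0.0] * len(ACTIONS) for s in range(N_BINS) for c in range(N_BINS)}
    for _episode in range(episodes):
        subject, conv = rng.randrange(N_BINS), rng.randrange(N_BINS)
        for _frame in range(horizon):
            values = q[(subject, conv)]
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))   # explore
            else:
                a = values.index(max(values))     # exploit
            new_conv, r = step(subject, conv, ACTIONS[a])
            # Standard Q-learning update toward the bootstrapped target.
            target = r + gamma * max(q[(subject, new_conv)])
            values[a] += alpha * (target - values[a])
            conv = new_conv
    return q


def greedy_action(q, subject, conv):
    values = q[(subject, conv)]
    return ACTIONS[values.index(max(values))]
```

After training, greedy_action(q, subject, conv) gives the learned move for
a given subject depth bin and convergence bin. In this toy setting the
learned policy should step the convergence toward the subject and then
hold it there. A real system would replace the placeholder scores with
validated comfort and interest models and the bins with continuous rig
parameters.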
In the first year, the candidate will design and implement a realistic,
real-time stereoscopic camera rig model for the Blender Game Engine,
including the simulation of depth-of-field effects, and propose
algorithms for learning a control policy allowing the initial setup and
tuning of the rig parameters. The candidate will also review existing
mathematical models for predicting the viewing comfort (3,12,13) as a
function of the
scene content, the optical parameters of the cameras (focal length,
focus distance, depth of field) and the stereoscopic parameters of the
rig (convergence and interaxial distance). Together with our academic
and industrial partners, we will also propose mathematical models for
predicting the narrative and aesthetic "interest" of the 3D image as a
function of the same parameters (2,5,9,11,15).
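
To make the dependence on the rig parameters concrete, the following
Python sketch computes the parallax of a scene point from the interaxial
distance, the focal length and the convergence distance. It assumes an
idealized rig with parallel pinhole cameras that converge by horizontal
image shift, which is a simplification and not necessarily the rig model
the thesis will use. The function and parameter names are hypothetical.

```python
def sensor_parallax_mm(depth_m, interaxial_mm, focal_mm, convergence_m):
    """Parallax on the sensor, in mm, for a point at depth_m.

    Assumes parallel pinhole cameras converged by image shift, so that
    parallax = focal * interaxial * (1 / convergence - 1 / depth).
    Positive values are behind the screen plane, negative in front.
    """
    if depth_m <= 0 or convergence_m <= 0:
        raise ValueError("distances must be positive")
    depth_mm = depth_m * 1000.0
    convergence_mm = convergence_m * 1000.0
    return focal_mm * interaxial_mm * (1.0 / convergence_mm - 1.0 / depth_mm)


def parallax_fraction(depth_m, interaxial_mm, focal_mm, convergence_m,
                      sensor_width_mm):
    """Parallax as a fraction of image width (same on sensor and on screen)."""
    return sensor_parallax_mm(depth_m, interaxial_mm, focal_mm,
                              convergence_m) / sensor_width_mm
```

Under this model a point at the convergence distance has zero parallax,
farther points have positive parallax and nearer points negative
parallax. Comfort models such as those reviewed in (3,13) would then take
quantities of this kind, together with the display geometry, as input.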
In the second year, using the viewing comfort and aesthetic value as a
reward function, the candidate will learn optimal policies for
controlling the stereoscopic camera rig in real time, using examples of
motion control laws given by experts in a variety of typical movie and
broadcast scenarios. As a starting point, we will assume that the
stereoscopic camera rig controller (virtual stereographer) can be
modeled with a Markov Decision Process (MDP). Recently, Levine et al.
(8) have trained MDPs for coordinating the gestures of a virtual actor
with an input speech signal, using a training set of synchronized
speech+gesture examples. We therefore expect that similar techniques can
be used for learning "camera controllers" from a training set of
synchronized stereoscopic video and motion control examples.
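
As a very simple baseline for learning from expert examples, the Python
sketch below fits a policy by frequency counting: for each observed
state it keeps the action the experts chose most often. This is plain
behavior cloning, shown only to fix ideas. It is not the method of Levine
et al. (8), and the state and action labels used in the usage note are
hypothetical.

```python
from collections import Counter, defaultdict


def fit_policy(demonstrations):
    """Map each observed state to the action experts chose most often.

    demonstrations: iterable of trajectories, each a list of
    (state, action) pairs with hashable states and actions.
    """
    counts = defaultdict(Counter)
    for trajectory in demonstrations:
        for state, action in trajectory:
            counts[state][action] += 1
    # Pick the most frequent action per state.
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}
```

For example, given two short expert takes in which the state
"subject_near" is paired twice with "converge_in" and once with "hold",
fit_policy returns "converge_in" for that state. Such a lookup table does
not generalize to unseen states, which is one reason to prefer an
MDP-based formulation with a learned reward.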
The third year of the thesis will be devoted to tests and validation
with industry partners, and to generalizing the proposed methods to the
multi-camera case, with the additional difficulty of estimating the
comfort and interest during cuts from one camera to another
(stereoscopic editing).
References
1. Daniyal and Cavallaro: Multi-camera scheduling for video production.
CVMP 2011.
2. De Souza: Think in 3D: Food For Thought for Directors,
Cinematographers and Stereographers. CreateSpace, March 2012.
3. Didyk, Ritschel, Eisemann, Myszkowski, Seidel: A perceptual model
for disparity. ACM Transactions on Graphics (TOG) 30, 4, July 2011.
<http://dl.acm.org/citation.cfm?id=1964991&CFID=80949628&CFTOKEN=90924745>
4. Heinzle, Greisen, Gallup, Chen, Saner, Smolic, Burg, Matusik, Gross:
Computational stereo camera system with programmable control loop.
ACM SIGGRAPH 2011.
5. Hummel: 3-D Stereoscopic Cinematography. American Cinematographer,
Vol. 89, Num. 4, April 2008.
<http://news-business.vlex.com/source/american-cinematographer-3953/issue_nbr/%2389%234>
6. Ichikari, Kikuchi, Toishita, Tenmoku, Shibata, Tamura: On-site
real-time 3D match move for MR-based previsualization with relighting.
ACM SIGGRAPH 2010 Talks (SIGGRAPH '10).
7. Koppal, Zitnick, Cohen, Kang, Ressler, Colburn: A view-centric
editor for stereoscopic camera. IEEE Computer Graphics and
Applications, Jan./Feb. 2011, pp. 20-35.
8. Levine, Krähenbühl, Thrun, Koltun: Gesture Controllers. ACM SIGGRAPH
2010.
9. Mendiburu: 3D Movie Making: Stereoscopic Digital Cinema from Script
to Screen. Focal Press, 2009.
10. Oskam, Hornung, Bowles, Mitchell, Gross: Optimized stereoscopic
camera control for interactive 3D. ACM Trans. Graph. 30, 6, 2011.
11. Pennington and Giardina: Exploring 3D: The New Grammar of
Stereoscopic Filmmaking. Focal Press, in press.
12. Ronfard and Taubin (Eds.): Image and Geometry Processing for 3-D
Cinematography. Springer Berlin Heidelberg, 2010.
13. Shibata, Kim, Hoffman, Banks: The zone of comfort: Predicting visual
discomfort with stereo displays. Journal of Vision 11, 8, 2011.
14. Smolic, Kauff, Knorr, Hornung, Kunter, Müller, Lang:
Three-dimensional video postproduction and processing. Proceedings of
the IEEE 99, 4, 2011.
15. Zilly, Kluger, Kauff: Production rules for stereo
acquisition. Proceedings of the IEEE 99, 4, 2011.