[visionlist] PhD proposal at INRIA, Grenoble: Learning-Based Camera Control for Stereoscopic 3D Cinematography

Remi Ronfard remi.ronfard at inria.fr
Mon May 14 08:38:42 GMT 2012


The IMAGINE team at INRIA Grenoble is looking for a PhD student on the 
topic of
Learning-Based Camera Control for Stereoscopic 3D Cinematography

Please contact Dr. Remi Ronfard (remi.ronfard at inria.fr) for information 
or to apply.

Requirements: Master's degree in Computer Science or equivalent. Excellent 
academic record in machine learning and computer graphics.

Context: The PhD is funded by the French government as part of the 
ACTION3DS project, a national research project coordinated by Thalès 
Angénieux, with the goal of building the next generation of stereoscopic 
3D camera rigs for cinema and broadcast production.

Controlling a stereoscopic 3D camera in real time is a difficult 
practical problem, involving the simultaneous control of a large number 
of parameters, which typically requires a large crew of skilled and 
perfectly synchronized technicians and artists. Automatic tools for 
controlling some or all of the parameters are therefore needed to 
increase the quality and bring down the cost of stereoscopic 3D 
production, especially in broadcast and independent cinema. The problem 
is made especially difficult in the case of dynamic scenes, because (1) 
there is no established theory for predicting the quality of a 
stereoscopic 3D sequence as a function of its parameters and (2) 
real-time video analysis of disparities is difficult and error-prone in 
fast moving dynamic scenes.

To overcome those difficulties, we propose to study the problem 
experimentally in a simulated environment. We will pose the problem of 
controlling a stereoscopic camera rig as a *sequential decision 
problem,* where a virtual stereographer program evaluates the 
stereoscopic image frame by frame and remotely controls the camera rig 
parameters in order to maximize the expected viewing comfort and 
interest of the entire sequence.

The goal of the thesis will therefore be to propose, design and 
implement techniques for choosing optimal control policies that maximize 
a reward function combining the viewing comfort and the interest of the 
recorded sequence, from the viewpoint of the audience. To do this, we 
will use the framework of reinforcement learning, using mathematical 
models of viewing comfort and interest proposed in the literature, as 
well as real examples from experts in 3-D cinematography and editing. 
The proposed models will be tested and evaluated using real-time 
computer simulations in the Blender Game Engine. When the tests are 
successful, the trained models will be embedded in the motion control 
program of a stereoscopic 3D rig.
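
As a purely illustrative sketch of the objective described above, the 
per-frame reward can be written as a weighted combination of a comfort 
score and an interest score, and the quantity to maximize as the 
discounted sum of those rewards over the sequence. The function names, 
the weight w_comfort, the discount factor gamma and the assumption that 
both scores lie in [0, 1] are hypothetical choices made for this 
example; the proposal does not specify them.

```python
from typing import Sequence


def frame_reward(comfort: float, interest: float, w_comfort: float = 0.5) -> float:
    """Weighted combination of per-frame comfort and interest scores.

    Both scores are assumed to lie in [0, 1]; w_comfort is a hypothetical
    trade-off weight, not a value taken from the proposal.
    """
    if not 0.0 <= w_comfort <= 1.0:
        raise ValueError("w_comfort must lie in [0, 1]")
    return w_comfort * comfort + (1.0 - w_comfort) * interest


def discounted_return(rewards: Sequence[float], gamma: float = 0.95) -> float:
    """Discounted sum of per-frame rewards, accumulated from the last frame back."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total
```

A control policy would then be compared with another by the expected 
value of discounted_return over the sequences it produces.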

In the first year, the candidate will design and implement a realistic, 
real-time stereoscopic camera rig model for the Blender Game Engine, 
including the simulation of depth-of-field effects, and propose 
algorithms for learning a control policy allowing the initial setup and 
tuning of the rig parameters. The candidate will also review existing mathematical 
models for predicting the viewing comfort (3,12,13) as a function of the 
scene content, the optical parameters of the cameras (focal length, 
focus distance, depth of field) and the stereoscopic parameters of the 
rig (convergence and interaxial distance). Together with our academic 
and industrial partners, we will also propose mathematical models for 
predicting the narrative and aesthetic "interest" of the 3D image as a 
function of the same parameters (2,5,9,11,15).
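
To make the dependence on the rig parameters concrete, here is a minimal 
sketch of the textbook disparity relation for an idealized rig: two 
pinhole cameras with parallel optical axes, converged by shifting the 
images so that points at the convergence distance have zero disparity. 
Under those assumptions the disparity on the sensor is 
d = f * b * (1/C - 1/Z), with f the focal length, b the interaxial 
distance, C the convergence distance and Z the depth of the point. The 
function name, the units and the sign convention (positive behind the 
convergence plane) are choices made for this example only, and the 
comfort models cited above are considerably richer than this.

```python
def sensor_disparity_mm(depth_m: float, focal_mm: float,
                        interaxial_mm: float, convergence_m: float) -> float:
    """Disparity on the sensor (mm) for a point at depth_m.

    Idealized parallel, shifted-image rig: d = f * b * (1/C - 1/Z).
    Positive values: point lies beyond the convergence distance.
    Negative values: point lies in front of it.
    """
    if depth_m <= 0.0 or convergence_m <= 0.0:
        raise ValueError("depth and convergence distance must be positive")
    interaxial_m = interaxial_mm / 1000.0  # keep b, C and Z in metres
    return focal_mm * interaxial_m * (1.0 / convergence_m - 1.0 / depth_m)
```

For instance, with a 35 mm focal length, a 65 mm interaxial distance and 
convergence at 3 m, a point at 3 m has zero disparity, while points 
nearer or farther than 3 m have negative or positive disparity 
respectively.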

In the second year, using the viewing comfort and aesthetic value as a 
reward function, the candidate will learn optimal policies for 
controlling the stereoscopic camera rig in real time, using examples of 
motion control laws given by experts in a variety of typical movie and 
broadcast scenarios. As a starting point, we will assume that the 
stereoscopic camera rig controller (virtual stereographer) can be 
modeled with a Markov Decision Process (MDP). Recently, Levine et al. 
(8) have trained MDPs for coordinating the gestures of a virtual actor 
with an input speech signal, using a training set of synchronized 
speech+gesture examples. We therefore expect that similar techniques can 
be used for learning "camera controllers" from a training set of 
synchronized stereoscopic video and motion control examples.

The third year of the thesis will be devoted to tests and validation 
with industry partners, and to generalizing the proposed methods to the 
multi-camera case, with the additional difficulty of estimating the 
comfort and interest during cuts from one camera to another 
(stereoscopic editing).

References

1. Daniyal and Cavallaro: Multi-camera scheduling for video production. 
CVMP 2011.

2. De Souza: Think in 3D: Food For Thought for Directors, 
Cinematographers and Stereographers. CreateSpace, March 2012.

3. Didyk, Ritschel, Eisemann, Myszkowski, Seidel: A perceptual model 
for disparity. ACM Transactions on Graphics (TOG), v.30 n.4, July 2011 
<http://dl.acm.org/citation.cfm?id=1964991&CFID=80949628&CFTOKEN=90924745>

4. Heinzle, Greisen, Gallup, Chen, Saner, Smolic, Burg, Matusik, Gross: 
Computational stereo camera system with programmable control loop. 
In ACM SIGGRAPH 2011.

5. Hummel: 3-D Stereoscopic Cinematography. American Cinematographer, 
Vol. 89 Num. 4, April 2008 
<http://news-business.vlex.com/source/american-cinematographer-3953/issue_nbr/%2389%234>.

6. Ichikari, Kikuchi, Toishita, Tenmoku, Shibata, Tamura: On-site 
real-time 3D match move for MR-based previsualization with relighting. 
In ACM SIGGRAPH 2010 Talks (SIGGRAPH '10).

7. Koppal, Zitnick, Cohen, Kang, Ressler, Colburn: A view-centric 
editor for stereoscopic camera. IEEE Computer Graphics and 
Applications, Jan./Feb. 2011, pp. 20-35.

8. Levine, Krähenbühl, Thrun, Koltun: Gesture Controllers. ACM SIGGRAPH 
2010.

9. Mendiburu: 3D Movie Making: Stereoscopic Digital Cinema from Script 
to Screen. Focal Press, 2009.

10. Oskam, Hornung, Bowles, Mitchell, Gross: Optimized stereoscopic 
camera control for interactive 3D. ACM Trans. Graph. 30, 6, 2011.

11. Pennington and Giardina: Exploring 3D: The New Grammar of 
Stereoscopic Filmmaking. Focal Press, in press.

12. Ronfard and Taubin (Eds): Image and Geometry Processing for 3-D 
Cinematography. Springer Berlin Heidelberg, 2010.

13. Shibata, Kim, Hoffman, Banks: The zone of comfort: Predicting visual 
discomfort with stereo displays. Journal of Vision 11, 8, 2011.

14. Smolic, Kauff, Knorr, Hornung, Kunter, Müller, Lang: 
Three-dimensional video postproduction and processing. Proceedings of 
the IEEE 99, 4, 2011.

15. Zilly, Kluger, Kauff: Production rules for stereo 
acquisition. Proceedings of the IEEE 99, 4, 2011.


