
NEWS: 9 December 2005
Click >>here<< to download the corrected page 121 of the lecture notes.

NEWS: 24 November 2005
Due to serious and unforeseeable family problems, Prof. Van Gool will probably be unable to come to Verona for the School. In that case, Maarten Vergauwen, who was already in charge of half of the course, will give all the lectures. If you wish, you can withdraw from the School by sending a message to the school secretariat (vips_school@sci.univr.it) by Friday 2/12/05, and the fees will be refunded. We apologize for the inconvenience.
Vittorio Murino
Andrea Fusiello

NEWS: 7 October 2005
The end date of the course has changed: the course now ends on 7 December instead of 9 December, so the course period is 5-7 December. As a consequence, the contents listed below will be adapted to the shorter period. We will update them as soon as we know more.

6th VIPS Advanced School on
Computer Vision, Pattern Recognition
and Image Processing

Verona by night from Castel S. Pietro
(click >>here<< for a free tourist guide and infos about Verona)

December 5-7, 2005
Organized by the Vision, Image Processing and
Sound Laboratory
Department of
Computer Science, University of Verona, Italy 
This is the sixth in a series of Advanced Schools organized by the VIPS laboratory,
offering advanced lectures on significant topics in Computer Vision, Pattern Recognition,
and Image Processing.
These courses are aimed primarily at PhD
students, but are open to all researchers. Each course
typically lasts at most one week and focuses on one specific
topic, to allow a more productive interaction with the
lecturer.
The number of participants is limited to 50.
If more applications are received, priority will be
given to PhD students.
This school is titled "Computer Vision Techniques for Passive 3D Acquisition".
Details about the course contents and the registration procedure are given
below.
The 6th Advanced School is supported by GIRPR
(Gruppo Italiano Ricercatori in Pattern Recognition)


Lecturers

Department of Information Technology and Electrical Engineering

ETH Zurich, Switzerland


Departement Elektrotechniek

Katholieke Universiteit Leuven, Heverlee (Leuven), Belgium

Course title
Computer Vision Techniques for Passive 3D Acquisition
Contents

Introduction on 3D Techniques
The course starts with an overview of 3D reconstruction techniques. A distinction will be made between
active and passive, uni- and multi-directional, manual and automatic techniques.

Automatic Passive 3D Reconstruction: Basics
In the course we will focus on reconstructing 3D models of scenes and objects automatically from images.
In order to understand the techniques and algorithms, some basic principles must be explained.
 What is an image? How is it formed?
 Camera models:
 The linear pinhole model
 Nonlinear distortions (radial, tangential)
 Transformations between cameras for the special case of planar objects. The difference between
Euclidean, metric, affine and projective transformations.
 Internal calibration of cameras
 Intrinsics (and sometimes extrinsics) from known calibration objects (Tsai, ...)
 Radial distortion: from a calibration object or from an algorithm
that "straightens bent lines"
 External calibration of cameras, or "pose estimation"
(Grunert's algorithm)
 The projective world: a projection is written as a multiplication with a 3x4 matrix
 Principle of passive 3D: the basic stereo setup. For 3D reconstruction to succeed we need the two
"c's": calibration and correspondences
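
As a small illustration of the "projection as a 3x4 matrix" item above, the following Python sketch builds P = K [R | t] and projects two 3D points into the image. The intrinsics, the pose and the helper name `project` are made-up assumptions for illustration only, not material taken from the lecture notes.

```python
import numpy as np

def project(P, X):
    """Project 3D points X (N x 3) with a 3x4 matrix P; returns N x 2 pixel coordinates."""
    X_h = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous coordinates
    x_h = (P @ X_h.T).T                             # N x 3 homogeneous image points
    return x_h[:, :2] / x_h[:, 2:3]                 # divide by the third coordinate

# Assumed intrinsics: focal length 800 px, principal point (320, 240), no skew.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)            # camera aligned with the world axes (assumption)
t = np.zeros((3, 1))     # camera centre at the world origin (assumption)
P = K @ np.hstack([R, t])

points = np.array([[0.0, 0.0, 2.0],
                   [0.5, -0.25, 4.0]])
pixels = project(P, points)
print(pixels)
```

A point on the optical axis lands on the principal point, which is a quick sanity check for any K.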

Relating Images
 Features
As explained in the basics, one of the important prerequisites for 3D reconstruction is the detection of
correspondences between images. To facilitate matters we will first search for interesting points in the
images which are stable across views and are therefore excellent candidates for matching.
 Feature extraction: Harris corners; KLT features; invariant features; small- versus wide-baseline matching
 Feature matching: matching simple features without descriptors (SSD, NCC);
matching invariant features with descriptors; matching lines
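
The two descriptor-free similarity measures named above, SSD and NCC, can be sketched in a few lines of Python. This is a minimal sketch on small grey-value patches; the function names and the toy patch are assumptions chosen for illustration.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: 0 for identical patches, larger means less similar."""
    d = a.astype(float) - b.astype(float)
    return float(np.sum(d * d))

def ncc(a, b):
    """Normalized cross-correlation in [-1, 1]; unaffected by gain and offset changes."""
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    # A constant patch has zero variance; return 0 rather than divide by zero.
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

patch = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 10]])
brighter = 2 * patch + 10   # same structure, different gain and offset

print(ssd(patch, brighter), ncc(patch, brighter))
```

The example shows why NCC is often preferred when brightness differs between views: SSD penalizes the brighter patch heavily, while NCC still reports a perfect match.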
 The Geometry of Two Images
If we want to search for correspondences between two images, there is an underlying geometric structure
that can be employed, called epipolar geometry.
 What is epipolar geometry?
 How can it be employed?
 Computing the F-matrix: linearly (8 matches); nonlinearly (7 matches); least-squares with more
matches; with conditioning
 Robust matching: dealing with outliers with RANSAC. Description of the algorithm;
some statistical notes on the number of tries; preemptive RANSAC
 F-guided matching of features
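
The "number of tries" mentioned above is usually derived from the requirement that, with probability p, at least one random sample contains only inliers. The sketch below evaluates the standard formula N = log(1 - p) / log(1 - w^s), where w is the inlier ratio and s the sample size; the function name and the default confidence of 0.99 are assumptions for illustration.

```python
import math

def ransac_trials(inlier_ratio, sample_size, confidence=0.99):
    """Number of random samples needed so that, with the given confidence,
    at least one sample consists of inliers only."""
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    p_good = inlier_ratio ** sample_size   # probability that one sample is all inliers
    if p_good >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p_good))

# Half of the matches are outliers: compare the 7-point and 8-point F-matrix solvers.
print(ransac_trials(0.5, 7), ransac_trials(0.5, 8))
```

The count grows quickly with the sample size, which is one practical argument for minimal solvers.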
RANSAC is a generic tool that is not only used for computing F-matrices. Another typical example is the
computation of a homography matrix H between two images.
 How to compute H (another example of RANSAC)
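
Inside a RANSAC loop, H would be estimated from random samples of four matches. The following Python sketch shows one common linear way to do this (the direct linear transform, solved with an SVD). It is an assumed illustration, not necessarily the method used in the lectures, and it omits the conditioning of the point coordinates for brevity.

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate H (3x3, dst ~ H * src) from N >= 4 point matches with the linear DLT method."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    A = np.array(rows, dtype=float)
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]   # fix the free scale (assumes H[2, 2] is not zero)

def apply_homography(H, pts):
    """Map N x 2 points through H and return N x 2 inhomogeneous points."""
    pts_h = np.hstack([pts, np.ones((pts.shape[0], 1))])
    out = (H @ pts_h.T).T
    return out[:, :2] / out[:, 2:3]

# Toy example: a pure image translation by (10, -5).
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
dst = src + np.array([10.0, -5.0])
H = homography_dlt(src, dst)
```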

Relating Multiple Views
Matches between pairs of images can tell us something but not everything about the camera setup. We will
now deal with multiple views.
 Three views are related by means of a trifocal tensor.
Multiple images can be related with other tensors.
 Projective reconstruction approach: Projective frame initialization;
Projective projection matrix estimation with PRANSAC; Update and initialization of 3D points

Self Calibration
The images have been related to each other in a projective frame. Unfortunately this implies that the
resulting reconstruction is only valid up to any projective transformation. This means that many properties
(like orthogonality, relative distances, parallelism) are not preserved. We need to upgrade the result from
projective to metric, a process called self calibration.
 The projective ambiguity
 Upgrading the result means constraints are needed
 Constraints on the scene
 Constraints on the camera's intrinsics:
The absolute quadric;
Writing down constraints;
Solving for the quadric means upgrading to metric;
Coupled self-calibration if multiple sequences have the same intrinsics.
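
The constraint that links the absolute (dual) quadric to the intrinsics can be checked numerically. In a metric frame the quadric is diag(1, 1, 1, 0) and its image under P = K [R | t] is K K^T; the same relation holds after any projective change of frame. The Python sketch below only verifies this relation on made-up values; it is not a self-calibration solver, and all names and numbers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed intrinsics and pose, for illustration only.
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
a = 0.3
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([[0.2], [-0.1], [1.5]])
P_metric = K @ np.hstack([R, t])

# In a metric frame the absolute dual quadric is diag(1, 1, 1, 0).
Omega_metric = np.diag([1.0, 1.0, 1.0, 0.0])

# An unknown projective distortion T of the frame: points map as X' = T X.
T = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
P_proj = P_metric @ np.linalg.inv(T)     # cameras in the projective frame
Omega_proj = T @ Omega_metric @ T.T      # quadric in the projective frame

# The image of the quadric equals K K^T in either frame.
omega = P_proj @ Omega_proj @ P_proj.T
print(np.allclose(omega, K @ K.T))
```

Self-calibration runs this reasoning in reverse: assumptions on K give equations on the unknown quadric in the projective frame.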

Bundle Adjustment
Sequential matching of images has an impact on the buildup of errors. In order to distribute the error over
all images, a global optimization process is executed on the data which minimizes the total reprojection
error of all 3D points in all images, taking into account the camera model and other constraints.
 What is bundle adjustment?
 Substitution to limit the size of the system
 Usage of sparseness of the resulting matrix
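
The quantity that bundle adjustment minimizes can be written down directly. The Python sketch below evaluates the total squared reprojection error for a set of cameras, 3D points and image observations. The data layout and function name are assumptions for illustration; the optimizer itself (typically a sparse nonlinear least-squares method) is not shown.

```python
import numpy as np

def reprojection_error(cameras, points, observations):
    """Total squared reprojection error.

    cameras: list of 3x4 projection matrices
    points: M x 3 array of 3D points
    observations: iterable of (camera_index, point_index, u, v)
    """
    total = 0.0
    for cam_idx, pt_idx, u, v in observations:
        x = cameras[cam_idx] @ np.append(points[pt_idx], 1.0)  # homogeneous image point
        du = x[0] / x[2] - u
        dv = x[1] / x[2] - v
        total += du * du + dv * dv
    return total

# Two toy cameras (identity intrinsics) and two points, with exact observations.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
pts = np.array([[0.0, 0.0, 2.0], [1.0, 1.0, 4.0]])
obs = [(0, 0, 0.0, 0.0), (0, 1, 0.25, 0.25), (1, 0, -0.5, 0.0), (1, 1, 0.0, 0.25)]
print(reprojection_error([P1, P2], pts, obs))
```

Each observation involves only one camera and one point, which is the source of the sparseness mentioned above.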

Model Selection
At this point we are capable of reconstructing 3D points and cameras from images only. Unfortunately,
it often happens that so-called critical motions of the camera or critical surfaces are encountered during
recording. The most obvious of these is a (partly) planar scene.
 Why does computing an F-matrix fail if the scene is dominantly planar?
 What is an essential matrix?
 How to compute an essential matrix: From F to E via K; Directly (Nister's algorithm).
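
The "from F to E via K" route amounts to one matrix product, E = K2^T F K1, for matches that satisfy x2^T F x1 = 0. The Python sketch below checks this on a synthetic example; the intrinsics, the motion and the helper names are assumptions for illustration.

```python
import numpy as np

def essential_from_fundamental(F, K1, K2=None):
    """E = K2^T F K1, assuming matches satisfy x2^T F x1 = 0."""
    K2 = K1 if K2 is None else K2
    return K2.T @ F @ K1

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Synthetic check with made-up intrinsics and motion.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
a = 0.1
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a), np.cos(a), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 0.2, 0.0])
E_true = skew(t) @ R                      # essential matrix of the relative motion
K_inv = np.linalg.inv(K)
F = K_inv.T @ E_true @ K_inv              # corresponding fundamental matrix
E = essential_from_fundamental(F, K)
singular_values = np.linalg.svd(E, compute_uv=False)
print(singular_values)
```

A valid essential matrix has two equal singular values and one zero singular value, which gives a simple test of how well the assumed K fits an estimated F.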
An essential matrix needs the intrinsics. We can recover them from the nonplanar parts. How can we find
out which part is planar and which isn't?
 What is GRIC? Relation to Occam's Razor
 Formula of GRIC. Explanation of the elements
 F-GRIC, H-GRIC, PPP-GRIC, HH-GRIC
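
For orientation, one commonly cited form of GRIC (usually attributed to Torr) is GRIC = sum_i rho(e_i^2) + n d ln(r) + k ln(r n), with rho(e^2) = min(e^2 / sigma^2, 2 (r - d)). The Python sketch below implements this form. Whether the lectures use exactly these constants is an assumption, as are the function name and the toy residuals.

```python
import math
import numpy as np

def gric(residuals, sigma, d, k, r=4):
    """GRIC score of a model; lower is better.

    residuals: per-match residuals e_i
    sigma: assumed standard deviation of the measurement noise
    d: dimension of the structure (3 for F, 2 for H)
    k: number of model parameters (7 for F, 8 for H)
    r: dimension of the data (4 for a two-view match)
    """
    e2 = np.asarray(residuals, dtype=float) ** 2
    n = e2.size
    # Robust data term: each squared residual is capped at 2 (r - d).
    rho = np.minimum(e2 / sigma**2, 2.0 * (r - d))
    return float(rho.sum() + n * d * math.log(r) + k * math.log(r * n))

# If both models fit 100 matches perfectly, the simpler structure (H) wins.
perfect = np.zeros(100)
f_gric = gric(perfect, sigma=1.0, d=3, k=7)
h_gric = gric(perfect, sigma=1.0, d=2, k=8)
print(f_gric, h_gric)
```

This is the Occam's Razor behaviour referred to above: when a homography explains the data as well as an F-matrix does, the penalty terms favour the homography.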

Dense Matching
Camera calibration and sparse 3D point reconstruction are only one part of the story. For convincing 3D
models, we must reconstruct many more 3D points, i.e. obtain a dense reconstruction. In order to do so,
we must search for dense matches between the images.
 Standard Stereo: Rectification (homography, or radial polar); Dynamic programming for matching.
 Matching on the GPU
 Linking multiple sequential stereo pairs into dense depth maps
 Multi-View Stereo. A Bayesian approach deals better with occlusions

Combining all techniques: 3D Webservice
The Epoch webservice combines elements of all previous sections into an automatic 3D reconstruction system.
 Explanation of the setup
 Explanation of the server side
 Triplet matching and coupled self-calibration
 Hierarchical method
Possible Extras
If time permits, the topics of the following paragraphs can be discussed.
Final Lectures Schedule
Monday 5 
Tuesday 6 
Wednesday 7 

09.30 - 13.00 
09.30 - 13.00 
14.30 - 18.30 
15.00 - 18.30 
15.00 - 16.30 
Course Fees
150 euro for PhD and undergraduate
students.
200 euro for post doc, researchers, and other
people working directly in a university.
300 euro for everybody else.
Registration
If you are interested, send an email to vips_school@sci.univr.it
asking to participate. Please state your identity and
your status (undergraduate, PhD student, other) and wait for the
confirmation email. The final deadline is November 12, 2005.
Attached to our confirmation email you will find a
registration form to print, fill in and send by fax, together with proof of payment, before
November 19, 2005, to +39 045 8027068, for the
attention of Prof. V. Murino, 6th VIPS School on Computer Vision,
Pattern Recognition, and Image Processing.
The proposed payment method is bank wire transfer
(all necessary details are in the form).
Important Dates
Registration deadline:

November 12, 2005

(Email)

Course Fee payment deadline:

November 19, 2005

(Registration form + Proof of payment)

School:

December 5-7, 2005


Accommodation
Accommodation costs are not
covered by the Course Fee. However, we have made agreements
with some convenient hotels, and you can find a list of available
places here.
If you wish to take advantage of these offers, please remember
to tell the hotel that you are attending our school.
Information on how to reach our department
is presented on this page.
For any other information, please send an email to vips_school@sci.univr.it

