6th VIPS Advanced School on
Computer Vision, Pattern Recognition
and Image Processing
NEWS: 9 December 2005
to download the correct page 121 of the lecture notes.
NEWS: 24 November 2005
Due to serious and unforeseeable family problems, it is likely that Prof.
Van Gool will not be able to be in Verona for the School. In that case, Maarten Vergauwen,
who was already in charge of half of the course, will give all the
However, you can withdraw from the School by sending a message
to the school secretariat (firstname.lastname@example.org)
by Friday 2/12/05, and the fees will be refunded.
We apologize for the inconvenience.
NEWS: 7 October 2005
The end date of the course has changed: it is now 7 December instead of 9 December.
Hence the course period will be 5-7 December.
As a consequence, the contents listed below will be adapted to the shorter period.
We will update the contents as soon as we know more.
Verona by night from Castel S. Pietro
(click >>here<< for a free tourist guide and information about Verona)
December 5-7, 2005
Organized by the Vision, Image Processing and
Computer Science, University of Verona, Italy
This is the sixth in a series of Advanced Schools organized by the VIPS laboratory,
offering advanced lectures on significant topics in Computer Vision, Pattern Recognition,
and Image Processing.
These courses are particularly addressed to PhD
students, but are open to all types of researchers. Each course
typically lasts at most one week and focuses on one specific
topic in order to provide a more productive interaction with the
The number of participants is limited to 50.
If more applications are received, priority will be
given to PhD students.
This school is titled "Computer Vision Techniques for Passive 3D Acquisition".
Details about the course, its contents and the registration procedure are given
below.
The 6th Advanced School is supported by GIRPR
(Gruppo Italiano Ricercatori in Pattern Recognition)
Department of Information Technology and Electrical Engineering
ETH Zurich, Switzerland
Katholieke Universiteit Leuven, Heverlee (Leuven), Belgium
Computer Vision Techniques for Passive 3D Acquisition
Introduction on 3D Techniques
The course starts with an overview of 3D reconstruction techniques. A distinction will be made between
active and passive, uni- and multi-directional, manual and automatic techniques.
Automatic Passive 3D Reconstruction: Basics
In the course we will focus on reconstructing 3D models of scenes and objects automatically from images.
In order to understand the techniques and algorithms, some basic principles must be explained.
- What is an image? How is it formed?
- Camera models:
- The linear pinhole-model
- Non-linear distortions (radial, tangential)
- Transformations between cameras for the special case of planar objects. The difference between
Euclidean, metric, affine and projective transformations.
- Internal Calibration of cameras
- Intrinsics (and sometimes extrinsics) from known calibration objects (Tsai, ...)
- Radial distortion: from calibration object or algorithm
which "straightens bent lines"
- External Calibration of cameras or "pose-estimation"
- The projective world: A projection is written as a multiplication with a 3x4 matrix
- Principle of passive 3D: The basic stereo-setup. For 3D reconstruction to succeed we need the 2
"c's": calibration and correspondences
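As a rough illustration of the projective camera model listed above, the following sketch builds a 3x4 projection matrix P = K [R | t] and applies it to 3D points written in homogeneous coordinates. The function names and the numeric values chosen for K are illustrative assumptions, not part of the course material, and lens distortion is ignored.

```python
import numpy as np

def projection_matrix(K, R, t):
    """Compose the 3x4 matrix P = K [R | t] of the linear pinhole model."""
    return K @ np.hstack([R, np.asarray(t, dtype=float).reshape(3, 1)])

def project(P, X):
    """Project N x 3 Euclidean points with P and return N x 2 image points."""
    Xh = np.column_stack([X, np.ones(len(X))])   # homogeneous 3D points
    x = (P @ Xh.T).T                             # homogeneous image points
    return x[:, :2] / x[:, 2:3]                  # divide by the third coordinate

# Hypothetical intrinsics: focal length 1000 px, principal point (320, 240).
K = np.array([[1000.0, 0.0, 320.0],
              [0.0, 1000.0, 240.0],
              [0.0, 0.0, 1.0]])
P = projection_matrix(K, np.eye(3), np.zeros(3))
pixels = project(P, np.array([[0.0, 0.0, 5.0], [1.0, 0.0, 5.0]]))
```

The division by the third coordinate is the only non-linear step; everything before it is a single matrix multiplication.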
As explained in the basics, one of the important prerequisites for 3D reconstruction is the detection of
correspondences between images. To facilitate matters we will first search for interesting points in the
images which are stable across views and are therefore excellent candidates for matching.
- Feature extraction: Harris corners; KLT features; Invariant features; Small-versus wide-baseline matching
- Feature matching: Matching simple features without descriptors: SSD, NCC;
Matching invariant features with descriptors; Matching lines
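The two descriptor-free matching scores named above can be sketched as follows. This is a minimal example that assumes the patches are equally sized grayscale arrays; the function names are illustrative.

```python
import numpy as np

def ssd(patch_a, patch_b):
    """Sum of squared differences: 0 for identical patches, larger is worse."""
    d = np.asarray(patch_a, dtype=float) - np.asarray(patch_b, dtype=float)
    return float(np.sum(d * d))

def ncc(patch_a, patch_b):
    """Normalized cross-correlation in [-1, 1]: 1 means a perfect match.

    Subtracting the mean and dividing by the norms makes the score
    invariant to an affine change of intensity (gain and offset).
    """
    a = np.asarray(patch_a, dtype=float).ravel()
    b = np.asarray(patch_b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # A constant patch has no texture to correlate; return 0 by convention.
    return float(a @ b / denom) if denom > 0 else 0.0
```

SSD is cheaper but sensitive to brightness changes between views, whereas NCC tolerates them at the cost of the extra normalization.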
The Geometry of Two Images
If we want to search for correspondences between two images, there is an underlying geometric structure
that can be employed, called epipolar geometry.
RANSAC is a generic tool that is not limited to computing F-matrices. Another typical example is the
computation of a homography matrix H between two images.
- What is epipolar geometry?
- How can it be employed?
- Computing the F-matrix: linearly (8 matches); non-linearly (7 matches); least-squares with more
matches; with conditioning
- Robust matching - dealing with outliers with RANSAC: Description of the algorithm;
Some statistical notes on the number of tries; Preemptive RANSAC
- F-Guided matching of features
- How to compute H (another example of RANSAC)
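Two of the items above lend themselves to a short sketch: the linear computation of F from 8 or more matches with conditioning, and the number of RANSAC tries. The code below is one common way to write them, assuming the convention x2^T F x1 = 0 and a conditioning that scales the mean point distance to sqrt(2). The function names are illustrative and this is not necessarily the formulation used in the lectures.

```python
import numpy as np

def normalize_points(pts):
    """Condition N x 2 points: zero mean, mean distance sqrt(2) from the origin."""
    centroid = pts.mean(axis=0)
    mean_dist = np.linalg.norm(pts - centroid, axis=1).mean()
    s = np.sqrt(2.0) / mean_dist
    T = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T, T

def eight_point(pts1, pts2):
    """Linear estimate of F with x2^T F x1 = 0 from N >= 8 matches."""
    x1, T1 = normalize_points(pts1)
    x2, T2 = normalize_points(pts2)
    # Each match gives one row of A f = 0, with f the row-major entries of F.
    A = np.column_stack([
        x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
        x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
        x1[:, 0], x1[:, 1], np.ones(len(x1)),
    ])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)          # least-squares null vector of A
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                        # enforce rank 2
    F = U @ np.diag(S) @ Vt
    F = T2.T @ F @ T1                 # undo the conditioning
    return F / np.linalg.norm(F)

def ransac_trials(inlier_ratio, sample_size, confidence=0.99):
    """Tries needed so that, with the given confidence, at least one
    sample contains only inliers (assumes 0 < inlier_ratio < 1)."""
    p_good_sample = inlier_ratio ** sample_size
    return int(np.ceil(np.log(1.0 - confidence) / np.log(1.0 - p_good_sample)))
```

In a RANSAC loop, `eight_point` (or a 7-point solver) would be run on random minimal samples and scored by the number of matches close to their epipolar lines. `ransac_trials` shows why smaller samples are attractive: the required number of tries grows quickly with the sample size.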
Relating Multiple Views
Matches between pairs of images can tell us something but not everything about the camera-setup. We will
now deal with multiple views.
- Three views are related by means of a trifocal tensor.
Multiple images can be related with other tensors.
- Projective reconstruction approach: Projective frame initialization;
Projective projection matrix estimation with P-RANSAC; Update and initialization of 3D points
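Initializing 3D points from two known projection matrices is typically done by linear triangulation. The sketch below shows one standard linear (DLT) formulation; it is an illustrative assumption and not necessarily the method used in the course. The point is returned in homogeneous coordinates, which is the natural representation in a projective frame.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear triangulation of one point seen at x1 in view 1 and x2 in view 2.

    P1, P2 are 3x4 projection matrices; x1, x2 are (u, v) image points.
    Returns the homogeneous 4-vector X minimizing the algebraic error.
    """
    # Each view contributes two equations: u * (p3 . X) - (p1 . X) = 0, etc.
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]   # right singular vector of the smallest singular value
```

With noisy matches the algebraic solution is only an initial value, which a later non-linear refinement such as bundle adjustment can improve.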
The images have been related to each other in a projective frame. Unfortunately this implies that the
resulting reconstruction is only valid up to any projective transformation. This means that many properties
(like orthogonality, relative distances, parallelism) are not preserved. We need to upgrade the result from
projective to metric, a process called self-calibration.
- The projective ambiguity
- Upgrading the result means constraints are needed
- Constraints on the scene
- Constraints on the camera's intrinsics:
The absolute quadric;
Writing down constraints;
Solving for the quadric means upgrading to metric;
Coupled self-calibration if multiple sequences have the same intrinsics.
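The projective ambiguity can be checked numerically: transforming all 3D points by any invertible 4x4 matrix H and all cameras by its inverse leaves every image point unchanged, so the images alone cannot distinguish the two reconstructions. The sketch below uses random data and an arbitrary H purely for illustration.

```python
import numpy as np

def apply_projective_transform(P, X, H):
    """Map a camera P (3x4) and homogeneous points X (4xN) into a new frame."""
    return P @ np.linalg.inv(H), H @ X

rng = np.random.default_rng(0)
P = rng.normal(size=(3, 4))                        # arbitrary projective camera
X = rng.normal(size=(4, 10))                       # arbitrary homogeneous points
H = np.eye(4) + 0.1 * rng.normal(size=(4, 4))      # arbitrary invertible transform

P_new, X_new = apply_projective_transform(P, X, H)
# (P H^-1)(H X) = P X: both reconstructions explain the same image points.
same_images = np.allclose(P @ X, P_new @ X_new)
```

Self-calibration amounts to picking, among all such H, one that makes the cameras satisfy the constraints on their intrinsics.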
Sequential matching of images has an impact on the build-up of errors. In order to distribute the error over
all images, a global optimization process is executed on the data which minimizes the total reprojection
error of all 3D points in all images, taking into account the camera model and other constraints.
- What is bundle adjustment?
- Substitution to limit the size of the system
- Usage of sparseness of the resulting matrix
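The quantity minimized by bundle adjustment can be written down compactly. The sketch below only evaluates the total reprojection error for given cameras and points; the data layout (a list of 3x4 matrices, an N x 3 point array, and observation tuples) is an assumption made for illustration, and the minimization itself is not shown.

```python
import numpy as np

def reprojection_residuals(cameras, points, observations):
    """Stack the 2D residuals of all observations.

    cameras: list of 3x4 projection matrices
    points: N x 3 array of 3D points
    observations: iterable of (camera_index, point_index, u, v)
    """
    residuals = []
    for cam_idx, pt_idx, u, v in observations:
        x = cameras[cam_idx] @ np.append(points[pt_idx], 1.0)
        residuals.extend([x[0] / x[2] - u, x[1] / x[2] - v])
    return np.array(residuals)

def total_reprojection_error(cameras, points, observations):
    """Sum of squared reprojection residuals over all points in all images."""
    r = reprojection_residuals(cameras, points, observations)
    return float(r @ r)
```

Each residual depends on one camera and one point only, which is the source of the sparse structure mentioned above. A generic non-linear least-squares solver such as `scipy.optimize.least_squares`, which accepts a `jac_sparsity` argument, is one possible way to exploit it.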
At this point we are capable of reconstructing 3D points and cameras from images only. Unfortunately,
it often happens that so-called critical motions of the camera or critical surfaces are encountered during
recording. The most obvious of these is a (partly) planar scene.
- Why does computing an F-matrix fail if the scene is dominantly planar?
- What is an essential matrix?
- How to compute an essential matrix: From F to E via K; Directly (Nister's algorithm).
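The route "from F to E via K" can be sketched in a few lines, assuming the convention x2^T F x1 = 0 and known intrinsics K1 and K2 for the two views. With noisy data the result is generally not an exact essential matrix, so the sketch also projects it onto the nearest matrix with two equal singular values and one zero. The function name is illustrative, and the direct five-point method is not shown.

```python
import numpy as np

def essential_from_fundamental(F, K1, K2):
    """Compute E = K2^T F K1 and enforce the essential-matrix constraints."""
    E = K2.T @ F @ K1
    U, S, Vt = np.linalg.svd(E)
    s = 0.5 * (S[0] + S[1])          # two equal singular values, third zero
    return U @ np.diag([s, s, 0.0]) @ Vt
```

Because E depends on the relative pose only, it remains well defined for planar scenes, where F cannot be determined uniquely.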
An essential matrix needs the intrinsics. We can recover them from the non-planar parts. How can we find
out which part is planar and which isn't?
- What is GRIC? Relation to Occam's Razor
- Formula of GRIC. Explanation of the elements
- F-GRIC, H-GRIC, PPP-GRIC, HH-GRIC
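The page does not state the GRIC formula, so the sketch below uses the form in which it is commonly given following Torr: a robust sum of squared residuals plus penalties for the dimension of the model and its number of parameters. The constants (ln r, ln rn, and 2) and the values of d and k for F and H are the usual choices and should be read as assumptions here.

```python
import numpy as np

def gric(sq_residuals, sigma, d, k, r=4):
    """Geometric Robust Information Criterion (lower is better).

    sq_residuals: squared residuals of the n matches under the model
    sigma: assumed standard deviation of the measurement noise
    d: dimension of the model manifold (3 for F, 2 for H)
    k: number of model parameters (7 for F, 8 for H)
    r: dimension of the data (4 for a match between two views)
    """
    e2 = np.asarray(sq_residuals, dtype=float)
    n = len(e2)
    lam1, lam2, lam3 = np.log(r), np.log(r * n), 2.0
    # Robust cost: residuals of outliers are capped at lam3 * (r - d).
    rho = np.minimum(e2 / sigma**2, lam3 * (r - d))
    return float(rho.sum() + lam1 * d * n + lam2 * k)

def f_gric(sq_residuals, sigma):
    return gric(sq_residuals, sigma, d=3, k=7)

def h_gric(sq_residuals, sigma):
    return gric(sq_residuals, sigma, d=2, k=8)
```

When both models fit the matches equally well, as for a planar scene, the simpler model H gets the lower score, which is the link to Occam's Razor. F is preferred only when it explains the data sufficiently better.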
Camera calibration and sparse 3D point reconstruction are only one part of the story. For convincing 3D
models, we must reconstruct many more 3D points, i.e. obtain a dense reconstruction. In order to do so,
we must search for dense matches between the images.
- Standard Stereo: Rectification (homography, or radial polar); Dynamic programming for matching.
- Matching on the GPU
- Linking multiple sequential stereo pairs into dense depth maps
- Multi-View Stereo. A Bayesian approach deals better with occlusions
Combining all techniques: 3DWebservice
The Epoch-webservice combines elements of all previous sections into an automatic 3D reconstruction system.
- Explanation of the setup
- Explanation of the server-side
- Triplet matching and coupled self-calibration
- Hierarchical method
If time permits, the topics of the following paragraphs can be discussed.
Final Lectures Schedule
09.30 - 13.00
09.30 - 13.00
14.30 - 18.30
15.00 - 18.30
15.00 - 16.30
150 euro for PhD and undergraduate students.
200 euro for postdocs, researchers, and other
people working directly in a university.
300 euro for everybody else.
If you are interested, send an email to email@example.com
requesting participation. Please state your identity and
your status (undergraduate, PhD student, other) and wait for the
confirmation email. The final deadline is November 12, 2005.
Attached to our confirmation email you will find a
registration form to print, fill in, and fax together with proof of payment before
November 19, 2005, to +39 045 8027068, to the
attention of Prof. V. Murino, 6th VIPS School on Computer Vision,
Pattern Recognition, and Image Processing.
The proposed payment method is bank wire transfer
(all necessary data are in the form).
November 12, 2005
Course Fee payment deadline:
November 19, 2005
(Registration form + Proof of payment)
December 5-7, 2005
The accommodation costs are not
covered by the Course Fee. However, we have made agreements
with some convenient hotels and you can find a list of available
If you wish to take advantage of these opportunities, please remember
to notify the hotel that you are attending our school.
Information on how to reach our department
is presented on this page.
For any other information, please send an email to firstname.lastname@example.org