Research at the VIPS laboratory covers a broad spectrum of issues related to Computer Vision and Pattern Recognition, Image Processing, and Signal Processing (audio). The main lines of our research are briefly described below.
VIPS research is naturally focused on general aspects of image analysis and understanding, which underlie all advanced image processing. In particular, filtering and segmentation, feature extraction, object recognition, and scene reconstruction are all areas where VIPS people have gained skills and experience. Moreover, specific expertise in multisensory data processing, namely acoustic and optical imaging, has also been acquired, ultimately aimed at environment understanding.
VIPS people are interested in model acquisition, starting from 3D information acquired by different sensors, e.g. acoustic cameras and optical cameras. To this end, acoustic range images are used for underwater environment reconstruction. Research is focused on image restoration, multiview registration and fusion. The noisy nature of the data makes these tasks challenging. As for optical depth recovery, VIPS is investigating aspects of view synthesis, mosaicing, and methods for improving computational stereopsis with Markov Random Field models.
Research in this field is mainly focused on probabilistic models for pattern recognition, chosen for their interesting theoretical properties and their wide applicability in computer vision, bioinformatics and other research areas. Our interest lies mainly in investigating Hidden Markov Models, Markov Random Fields and Bayesian Networks. VIPS is interested both in methodological issues, e.g. learning and model selection, and in applied ones, such as shape classification, signal classification and clustering, target tracking, person detection, behavior analysis, object classification, object segmentation and video surveillance.
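As a minimal sketch of how a Hidden Markov Model can be used for signal classification, the example below scores a discrete observation sequence under two toy models with the scaled forward algorithm and picks the model with the higher log-likelihood. It is an illustration only, not VIPS code: the function names, the two class labels ("steady" and "alternating") and all probability values are assumptions made up for this example.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm."""
    if not obs:
        return 0.0  # an empty sequence has probability 1
    n_states = len(start)
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: propagate through the transitions, then weight by the emission
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid numerical underflow
    return log_lik


def classify(obs, models):
    """Return the name of the model that gives obs the highest log-likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Hypothetical two-state, two-symbol models: (start, transition, emission).
EMIT = [[0.9, 0.1], [0.1, 0.9]]
MODELS = {
    "steady": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], EMIT),
    "alternating": ([0.5, 0.5], [[0.1, 0.9], [0.9, 0.1]], EMIT),
}

if __name__ == "__main__":
    print(classify([0, 0, 0, 0, 1, 1, 1, 1], MODELS))
    print(classify([0, 1, 0, 1, 0, 1], MODELS))
```

In practice the model parameters would be learned from training data for each class rather than fixed by hand, which is where the learning and model selection issues mentioned above come in.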
Social signal processing represents an intriguing intersection of Social Psychology, Pattern Recognition and Computer Vision. It aims at modelling human activities by exploiting social theories, which are translated into statistical vision models. In particular, social signal processing focuses on social signals, which are the expressions of one's attitude towards social situations and interplays. In turn, social signals are composed of social cues, which are temporal changes in neuromuscular and physiological activity that last for short intervals of time (milliseconds to minutes). In social signal processing, social cues become the features, and the social signals are statistical models suited for classification and clustering. VIPS research is oriented toward the design of novel generative, discriminative and hybrid models for representing social signals, by collecting experimental data through advanced technologies and sharing knowledge with Social Psychology partners. In particular, we focus on the modeling of interactive activities, with particular regard to video surveillance scenarios.
One of the most exciting areas of research nowadays lies in the marriage between Computer Vision and Graphics. In particular, so-called Image Based Modeling and Rendering is overtaking traditional model-based rendering in the quest for photo-realism. In this context, VIPS people have started investigating view synthesis, motion capture and augmented reality (e.g., for remote vehicle control).
In several domains (home video market, cinematography, multimedia document indexing, etc.) audio mixing plays a key role. It is based on aesthetic rules that convey different emotions, ranging from sadness to joy, from tension to relaxation. Computational media aesthetics is defined as the algorithmic study of the mechanisms underlying these aesthetic rules. The project focuses on the algorithmic analysis of audio-video aesthetic rules and on methods that allow the automatic combination of the two modalities.