The research envisaged at the VIPS laboratory covers a broad spectrum of issues related to Computer Vision and Pattern Recognition, Image Processing, and Signal Processing (audio). In the following, we briefly describe the main streams of our research.

Image analysis and understanding

VIPS research is naturally focused on general aspects of image analysis and understanding, which are the basis of all advanced image processing. In particular, filtering and segmentation, feature extraction, object recognition, and scene reconstruction are all areas where VIPS people have gained skills and experience. Moreover, specific expertise in multisensory data processing, namely acoustic and optical imaging, has also been acquired, ultimately aimed at environment understanding.

Three-dimensional computer vision

VIPS people are interested in model acquisition, starting from 3D information acquired by different sensors, e.g. acoustic cameras and optical cameras. To this end, acoustic range images are used for underwater environment reconstruction. Research is focused on image restoration, multiview registration and fusion. The noisy nature of the data makes these tasks challenging. As for optical depth recovery, VIPS is investigating aspects of view synthesis, mosaicing, and methods for improving computational stereopsis with Markov Random Field models.

Pattern Recognition

Research in this field is mainly focused on probabilistic models for pattern recognition, chosen for their interesting theoretical properties and their wide applicability in computer vision, bioinformatics and other research areas. Our interest lies mainly in investigating Hidden Markov Models, Markov Random Fields and Bayesian Networks. VIPS is interested in both methodological issues, e.g. learning and model selection, and application issues, such as shape classification, signal classification and clustering, target tracking, person detection, behavior analysis, object classification, object segmentation and video surveillance.
Pattern Recognition research also has a crucial impact on the bioinformatics community. This line of research is mainly devoted to the development of interpretable pattern recognition solutions for relevant bioinformatics problems, such as expression microarray analysis or NMR spectra classification. Interpretability is typically obtained by tailoring existing probabilistic graphical models or designing new ones, such as topic models, mixture models or Hidden Markov Models.

Social Signal Processing

Social signal processing represents an intriguing intersection of Social Psychology, Pattern Recognition and Computer Vision. It aims at modelling human activities by exploiting social theories, which are translated into statistical vision models. In particular, social signal processing focuses on social signals, which are the expressions of one's attitude towards social situations and interplays. In turn, social signals are composed of social cues, which are temporal changes in neuromuscular and physiological activity that last for short intervals of time (milliseconds to minutes). In social signal processing, social cues become the features, and the social signals are statistical models suited for classification and clustering. VIPS research is oriented toward the design of novel generative, discriminative and hybrid models for representing social signals, by collecting experimental data through advanced technologies and sharing knowledge with Social Psychology partners. In particular, we focus on the modeling of interactive activities, with special regard to video surveillance scenarios.

Vision & Graphics

One of the most exciting areas of research nowadays lies in the marriage between Computer Vision and Graphics. In particular, so-called Image Based Modeling and Rendering is overtaking traditional model-based rendering in the quest for photo-realism. In this context, VIPS people have started investigating view synthesis, motion capture and augmented reality (e.g., for remote vehicle control).
Moreover, we focus on research on human body modeling, addressed by employing 3D digital scanning techniques. The overall aim is to automatically and reliably estimate anthropometric measurements. The methodologies involved are 3D model fitting, segmentation and skeletonization. Particular emphasis is given to feature-based techniques in order to detect and classify feature or salient points which can be associated with anthropometric landmarks.

Vision & Sound

In several domains (home video market, cinematography, multimedia document indexing, etc.) audio mixing plays a key role. It is based on aesthetic rules that convey different emotions, ranging from sadness to joy, from tension to relaxation. Computational media aesthetics is defined as the algorithmic study of the mechanisms underlying these aesthetic rules. The project focuses on the algorithmic analysis of audio-video aesthetic rules and on methods that allow automatic combination of the two modalities.

Former Activities

Sound Modelling
Developing sound models for human-computer interaction is a relatively under-explored area that may benefit from decades of experience in sound synthesis for computer music. The VIPS laboratory is steering sound synthesis techniques, especially those based on physical models, towards applications where the primary goal is neither realism nor pure abstraction, but rather an increased sense of Presence, that is, fidelity in interaction.

Auditory Display
Sound can complement images or even substitute for them in the effective display of information (data, processes, etc.). Understanding the perceptual basis of sound, translating it into effective sonifications and auditory icons, and studying effective ways of displaying the acoustic message are among the main activities at the VIPS laboratory. Sound modelling and spatial audio are used as tools to achieve these goals.

Sound and Music Computing
Sounds can be analyzed and modified in different ways. Classical approaches are based on the time and frequency domains (e.g. the Fourier transform). Different domains can be exploited to achieve the same goals or new objectives. We are studying Mellin, scale and beta-scale transforms, creating new kinds of sound analysis tools, filters and digital audio effects.

Last revision: Jan. 2012