In the first part of the talk, Prof. Xavier Giró-i-Nieto will offer an overview of the main research activities at the Image Processing Group of the Universitat Politècnica de Catalunya. After an introduction to the group's overall activities, the UPSeek retrieval system will be presented together with the open source interfaces GAT and GOS. Then, the ongoing work pursued by Prof. Giró and his students will be summarised, with topics such as joint analysis of regions and interest points, object segmentation through online games, and correlation between saliency maps and EEG signals.
In the second part of the talk, Prof. Xavier Giró-i-Nieto will present his PhD thesis, co-advised by Prof. Ferran Marqués (Universitat Politècnica de Catalunya) and Prof. Shih-Fu Chang (Columbia University). This work addresses the problem of visual object retrieval, where a user formulates a query to an image database by providing one or multiple examples of an object of interest. The presented techniques aim both at finding those images in the database that contain the object and at locating the object in the image and segmenting it from the background.
Every considered image, both the ones used as queries and the ones contained in the target database, is represented as a Binary Partition Tree (BPT), the hierarchy of regions previously proposed by Salembier and Garrido (2000). This data structure offers multiple opportunities and challenges when applied to the object retrieval problem.
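As a rough illustration of the data structure, the sketch below builds a binary tree of regions by repeatedly merging the most similar pair of adjacent regions. It is a toy example and not the construction used in the thesis: the region model (mean intensity and size), the similarity criterion (absolute difference of means), and the names `Node` and `build_bpt` are all assumptions made for illustration, and the initial partition is assumed to be connected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    id: int
    mean: float                      # toy region model: mean intensity
    size: int                        # number of pixels in the region
    left: Optional["Node"] = None    # children are None for leaves
    right: Optional["Node"] = None


def build_bpt(leaves, adjacency):
    """Build a binary tree of regions by greedy pairwise merging.

    leaves: list of (mean, size) tuples, one per initial region.
    adjacency: iterable of (i, j) index pairs of neighbouring regions.
    Returns the root node (assumes the adjacency graph is connected).
    """
    nodes = {i: Node(i, m, s) for i, (m, s) in enumerate(leaves)}
    edges = {frozenset(e) for e in adjacency}
    next_id = len(leaves)
    while edges:
        # Choose the adjacent pair with the closest means (ties broken by ids).
        a, b = min(
            (tuple(sorted(e)) for e in edges),
            key=lambda p: (abs(nodes[p[0]].mean - nodes[p[1]].mean), p),
        )
        na, nb = nodes[a], nodes[b]
        size = na.size + nb.size
        mean = (na.mean * na.size + nb.mean * nb.size) / size
        nodes[next_id] = Node(next_id, mean, size, na, nb)
        # Redirect the neighbours of a and b to the new parent region.
        updated = set()
        for e in edges:
            e2 = frozenset(next_id if x in (a, b) else x for x in e)
            if len(e2) == 2:
                updated.add(e2)
        edges = updated
        next_id += 1
    return nodes[next_id - 1]


# Four regions in a row: two dark ones followed by two bright ones.
root = build_bpt([(10, 1), (12, 1), (50, 1), (52, 1)], [(0, 1), (1, 2), (2, 3)])
```

In this toy run the two dark regions merge first, then the two bright ones, and the root joins both pairs, so each level of the tree corresponds to a coarser scale of the partition.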
One application of BPTs appears during the formulation of the query, when the user must interactively segment the query object from the background. Firstly, the BPT can assist in adjusting an initial marker, such as a scribble or bounding box, to the object contours. Secondly, the BPT can also define a navigation path for the user to adjust an initial selection to the appropriate scale.
The hierarchical structure of the BPT is also exploited to extract a new type of visual words named Hierarchical Bag of Regions (HBoR). Each region defined in the BPT is characterised with a feature vector that combines a soft quantisation on a visual codebook with an efficient bottom-up computation through the BPT. These features allow the definition of a novel feature space, the Parts Space, where each object is located according to the parts that compose it.
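The idea of combining soft quantisation with a bottom-up pass can be sketched as follows. This is a minimal illustration under stated assumptions, not the HBoR formulation from the thesis: the Gaussian weighting, the `sigma` parameter, the use of local descriptors at the leaves, and the names `soft_assign`, `Region` and `accumulate` are all hypothetical. The point it shows is that a parent region's histogram can be obtained by summing its children's histograms, so each region is processed once.

```python
import math


def soft_assign(descriptor, codebook, sigma=1.0):
    """Soft-quantise one descriptor into Gaussian-weighted votes that sum to 1."""
    d2 = [sum((x - c) ** 2 for x, c in zip(descriptor, word)) for word in codebook]
    d_min = min(d2)  # shift distances for numerical stability; cancels on normalisation
    w = [math.exp(-(d - d_min) / (2 * sigma ** 2)) for d in d2]
    total = sum(w)
    return [v / total for v in w]


class Region:
    def __init__(self, descriptors=(), children=()):
        self.descriptors = list(descriptors)  # local descriptors (leaves only)
        self.children = list(children)        # child regions (empty for leaves)
        self.hist = None                      # filled in by accumulate()


def accumulate(region, codebook, sigma=1.0):
    """Post-order pass: leaves vote softly, parents sum their children's histograms."""
    hist = [0.0] * len(codebook)
    if region.children:
        for child in region.children:
            child_hist = accumulate(child, codebook, sigma)
            hist = [h + c for h, c in zip(hist, child_hist)]
    else:
        for d in region.descriptors:
            votes = soft_assign(d, codebook, sigma)
            hist = [h + v for h, v in zip(hist, votes)]
    region.hist = hist
    return hist


# Toy example: a 1-D codebook with two words and a root with two leaf regions.
codebook = [[0.0], [1.0]]
leaf_a = Region(descriptors=[[0.0]])
leaf_b = Region(descriptors=[[1.0]])
root_region = Region(children=[leaf_a, leaf_b])
accumulate(root_region, codebook)
```

Because the histograms are kept unnormalised while they are summed, the root's total mass equals the number of descriptors in the image; any normalisation would be applied afterwards, per region.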
HBoR features have been applied to two scenarios for object retrieval, both of them solved by considering the decomposition of the objects into parts. In the first scenario, the query is formulated with a single object exemplar which is to be matched with each BPT in the target database. The matching problem is solved in two stages: an initial top-down one that assumes that the hierarchy from the query is respected in the target BPT, and a second bottom-up one that relaxes this condition and considers region merges which are not in the target BPT.
The second scenario where HBoR features are applied considers a query composed of several visual objects, such as a person, a bottle or a logo. In this case, the provided exemplars are considered as a training set to build a model of the query concept. This model is composed of two levels, a first one where each part is modelled and detected separately, and a second one that characterises the combinations of parts that describe the complete object. The analysis process exploits the hierarchical nature of the BPT by using a novel classifier that drives an efficient top-down analysis of the target BPTs.
Xavier Giró-i-Nieto is an assistant professor at the Universitat Politècnica de Catalunya (UPC). He graduated in Electrical Engineering at ETSETB (UPC) in 2000, after completing his master's thesis on image compression at the Vrije Universiteit in Brussels (VUB) under the direction of Professor Peter Schelkens. In 2001 he worked in the digital television group of Sony Brussels, before returning to Barcelona and joining the Image Processing Group at the UPC. In 2003, he started teaching courses in Electrical Engineering degrees at the EET and ETSETB schools of the UPC. Xavier Giró-i-Nieto obtained his PhD on image retrieval in 2012, under the supervision of Professor Ferran Marqués from UPC and Professor Shih-Fu Chang from Columbia University. In 2008, 2009, 2011 and 2012, he visited the Digital Video and MultiMedia laboratory at Columbia University, in New York. His more recent activity can be followed on the BitSearch blog.
This talk is organized by the research group Interactive Media Systems at the Institute of Software Technology and Interactive Systems.