Informatics, TU Vienna

Consistent biclustering for feature selection

Abstract

The selection of features that describe samples in a given set of data is a typical problem in data mining. A crucial issue is to select a maximal set of pertinent features, because scarce knowledge about the problem under study often leads one to consider features that do not actually provide a good description of the set. The concept of consistent biclustering of a set of data has been introduced with the aim of identifying a maximal subset of relevant features. This problem can be modeled as a 0–1 linear fractional program, which has been proved to be NP-hard. We propose a bilevel reformulation of this optimization problem, and a heuristic for its solution based on the Variable Neighborhood Search (VNS) meta-heuristic. We will review some interesting applications of consistent biclustering, such as the analysis of gene expression data and the early detection of problematic wine fermentations.
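
To make the notion of consistency concrete, here is a minimal Python sketch. It assumes the usual centroid-based definition from the biclustering literature, which the abstract itself does not spell out: each feature is assigned to the class whose samples give it the highest mean value, each sample is then re-assigned to the class whose features give it the highest mean value, and the biclustering is consistent when this recovers the original sample classes. The function names and the toy data are hypothetical, and the sketch only checks consistency; it does not implement the bilevel reformulation or the VNS heuristic presented in the talk.

```python
def mean(values):
    return sum(values) / len(values)


def classify_features(data, sample_classes, k):
    """Assign each feature (row) to the class whose samples give it the highest mean.

    Assumes every class 0..k-1 contains at least one sample.
    """
    feature_classes = []
    for row in data:
        centroids = [
            mean([row[j] for j, c in enumerate(sample_classes) if c == r])
            for r in range(k)
        ]
        feature_classes.append(max(range(k), key=lambda r: centroids[r]))
    return feature_classes


def classify_samples(data, feature_classes, k):
    """Assign each sample (column) to the class whose features give it the highest mean."""
    sample_classes = []
    for j in range(len(data[0])):
        centroids = []
        for r in range(k):
            vals = [data[i][j] for i, c in enumerate(feature_classes) if c == r]
            # A class without features can never win the comparison.
            centroids.append(mean(vals) if vals else float("-inf"))
        sample_classes.append(max(range(k), key=lambda r: centroids[r]))
    return sample_classes


def is_consistent(data, sample_classes, k):
    """True if the feature-induced sample classes match the given ones."""
    feature_classes = classify_features(data, sample_classes, k)
    return classify_samples(data, feature_classes, k) == list(sample_classes)


# Toy data (hypothetical): rows are features, columns are samples.
labels = [0, 0, 1, 1]
good = [[5, 6, 1, 0], [4, 5, 0, 1], [0, 1, 6, 5]]
noisy = [[5, 6, 1, 0], [0, 0, 0, 20]]

print(is_consistent(good, labels, 2))
print(is_consistent(noisy, labels, 2))
```

In the second data set a single outlying value pulls a feature into class 1 even though it does not describe the third sample well, so the sample classes are not recovered. Feature selection in this setting amounts to searching for a largest subset of rows for which the check succeeds.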

Biography

Antonio Mucherino has a Master's degree in Applied Mathematics and a PhD in Computational Biology, both granted by the Second University of Naples, Italy. He was a postdoc at the University of Florida and at universities in France (including the École Polytechnique in Palaiseau). He is currently an assistant professor at the University of Rennes 1, France. His research interests are mostly focused on distance geometry and classification methods, with particular attention to the optimization problems arising in these two domains.

Note

This talk is organized by the Institute of Computer Technology (ICT).