Koç University, İstanbul, Turkey
Title: Big-Data Analytics: Multi-Group Data Classification with Mathematical Programming
Abstract: Data classification is a hierarchical approach for predicting the class of incoming instances by using models constructed from a dataset with known class memberships. The first phase in data classification is the training phase, in which a set of classification rules is constructed using the training set. The class membership of each instance in the training set is known, and the classification rules try to infer relations between the classes and the attribute values of the data instances. Attributes represent the characteristics of an instance and are known for both the training and test data sets. In the second phase, the attribute values of the test data instances are used to predict their class membership via the classifier built in the training phase. In this talk, we focus on solving multi-class data classification problems with the hyper-box approach presented by Uney and Turkay (2006), which has been attracting a lot of attention recently. We enclose the data of each class in hyper-boxes that can be represented by upper and lower bounds on the attributes. A mixed-integer linear programming problem is solved to minimize an error function and construct the hyper-boxes that represent the different classes. The goal is to separate the classes using the minimum number of hyper-boxes and misclassified points. The solution of the mathematical programming model provides a classifier that is used to label the class membership of new samples. We also present a novel extension that builds the class boundaries with convex hulls rather than hyper-boxes. We first present the main approach and illustrate the effectiveness of these approaches on a variety of benchmark problems. We then present extensions of these approaches that handle big data and improve computational performance.
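To make the hyper-box representation concrete, the following is a minimal sketch and not the mixed-integer linear programming model of Uney and Turkay (2006). It assumes a single box per class, built directly from the per-attribute minimum and maximum of the training instances, and it labels a new sample with the class whose box is closest (distance zero if the sample lies inside the box). The function names fit_boxes, distance_to_box and predict, as well as the toy data, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# A hyper-box is stored as (lower bounds, upper bounds), one value per attribute.
Box = Tuple[List[float], List[float]]


def fit_boxes(X: Sequence[Sequence[float]], y: Sequence[str]) -> Dict[str, Box]:
    """Training phase (simplified): one bounding box per class.

    Assumption: a single box per class, with no optimization step. The MILP
    approach described in the abstract may use several boxes per class and
    minimizes an error function instead.
    """
    boxes: Dict[str, Box] = {}
    for point, label in zip(X, y):
        if label not in boxes:
            boxes[label] = (list(point), list(point))
        else:
            lower, upper = boxes[label]
            for j, value in enumerate(point):
                lower[j] = min(lower[j], value)
                upper[j] = max(upper[j], value)
    return boxes


def distance_to_box(point: Sequence[float], box: Box) -> float:
    """Euclidean distance from a point to a box; 0.0 if the point is inside."""
    lower, upper = box
    return sum(
        max(lo - v, 0.0, v - hi) ** 2 for v, lo, hi in zip(point, lower, upper)
    ) ** 0.5


def predict(point: Sequence[float], boxes: Dict[str, Box]) -> str:
    """Test phase: assign the class whose box is nearest to the point."""
    return min(boxes, key=lambda label: distance_to_box(point, boxes[label]))


# Toy training set with two attributes and two classes (made-up values).
X_train = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5), (5.5, 6.0)]
y_train = ["A", "A", "A", "B", "B", "B"]

boxes = fit_boxes(X_train, y_train)
print(boxes)
print(predict((1.2, 1.8), boxes))  # inside the box of class A
print(predict((4.8, 5.2), boxes))  # outside both boxes, nearest to class B
```

When the boxes of two classes overlap, this sketch simply returns the first class that attains the minimum distance. Resolving such overlaps, for example by splitting a class into several boxes while keeping the number of boxes and misclassified points small, is the part that the mathematical programming model addresses.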
Biography: Prof. Dr. Metin Türkay holds a PhD from Carnegie Mellon University (1996) and MS (1991) and BS (1989) degrees from Middle East Technical University. Before joining Koç University in 2000, where he set up the Systems Lab, he was principal consultant for optimization technologies at Mitsubishi Corporation Mizushima Research Center. His doctoral work on computational and system theories was selected as the most innovative thesis among all PhD dissertations completed in the US and received the 1997 Ted Peterson Student Paper Award. He is also the recipient of the Scientific and Technological Research Council of Turkey (TÜBİTAK) Career Award (2005), the TÜBİTAK Encouragement Award (2006), Turkey’s first IBM Shared University Research Award (2007), the IBM Faculty Award (2009) and the Open Collaborative Research Award by IBM Haifa Research (2012). He set up the Koç-IBM Supply Chain Research Center with funds from the IBM Shared University Research Award. His research focuses on optimization theory, mixed-integer programming, and the development of novel solution algorithms for mixed-integer programming problems. He applies these theoretical developments to sustainable supply chain management and logistics, the design of transportation systems with special emphasis on urban logistics, and systems biology. In 2006, he was elected Chair of the EURO Working Group on Computational Biology, Bioinformatics and Medicine. He is the İstanbul Representative of the Government-University-Industry Working Group of the Ministry of Science, Industry and Technology of the Republic of Turkey.