Data Summarization and Fairness in Big Data Processing (Lingxiao Huang)

Abstract

Researchers across business, machine learning, and social science are increasingly designing algorithms to address emerging challenges. In this talk, I examine two burgeoning directions in big data processing: data summarization and fairness in automated decision-making tasks.

With the prevalence of automatic information extraction/integration systems and predictive machine learning algorithms in numerous application areas, we are faced with a huge volume of (potentially uncertain) data. This has led to a surge of interest in data summarization from several research communities, including theoretical computer science, databases, and machine learning. We investigate a popular approach to data summarization, called coreset construction, which can help manage, analyze, and optimize over big datasets. In this talk, I will describe how to construct coresets for different types of data, e.g., uncertain data or data with bounded doubling dimension.

Automated decision-making algorithms are increasingly deployed and significantly affect people’s lives. Recently, there has been growing concern that such algorithms may systematically discriminate against particular groups of individuals. I will first introduce the motivation for fairness requirements in real-world applications. Then I will present our recent progress in designing algorithms that satisfy fairness requirements for some important decision-making tasks, such as personalization and multiwinner voting.

Time

2018-08-02   10:00 ~ 11:00   

Speaker

Lingxiao Huang, EPFL

Room

Room 602, School of Information Management & Engineering, Shanghai University of Finance & Economics