Data Summarization and Fairness in Big Data Processing (Lingxiao Huang)


Researchers across business, machine learning and social science are increasingly designing algorithms to address emerging challenges. In this talk, I examine two burgeoning directions in big data processing: data summarization and fairness in automated decision-making tasks.

With the prevalence of automatic information extraction/integration systems and predictive machine learning algorithms in numerous application areas, we are faced with a huge volume of (potentially uncertain) data. This has led to a surge of interest in data summarization from several research communities, including theoretical computer science, databases, and machine learning. We investigate a popular approach to data summarization, called coreset construction, which can help manage, analyze and optimize over big datasets. In this talk, I will explain how to construct coresets for different types of data, e.g., uncertain data or data with bounded doubling dimension.

Automated decision-making algorithms are increasingly deployed and affect people’s lives significantly. Recently, there has been growing concern that such algorithms may systematically discriminate against particular groups of individuals. I will first introduce the motivation for fairness requirements in real-world applications. Then I will present our recent progress in designing algorithms that satisfy fairness requirements for some important decision-making tasks, such as personalization and multiwinner voting.


2018-08-02   10:00 ~ 11:00   


Lingxiao Huang, EPFL


Room 602, School of Information Management & Engineering, Shanghai University of Finance & Economics