This is a course aimed at introducing various dimensionality reduction techniques, such as sampling and sketching, to speed up commonly occurring optimization problems, in particular those occurring in numerical linear algebra.
2016-06-27 ~ 2016-06-30 08:00-11:45
David P Woodruff,IBM Almaden Research Center
Room112, 4th Teaching Building, Shanghai University of Finance & Economics
Application and RegistrationNo registration fee.
We will cover problems such as least squares regression, low rank approximation, and many variants of these problems. In particular, we will also discuss column subset selection and CUR decompositions, kernel variations of these problems, and different error measures such as Frobenius, spectral, and versions robust to outliers, as well as weighted variants. Finally, we will also cover communication-efficient protocols in a distributed environment and memory-efficient algorithms in a data stream model.
The course will be mostly based on my monograph “Sketching as a Tool for Numerical Linear Algebra”, available here: http://arxiv.org/abs/1411.4357
Much of this course is based on work that has occurred in the last five years, so please see also the references therein.
Linear algebra, probability theory. This is a theory course so students are expected to have some comfort with mathematical proofs.
This course highlights the recent advances in algorithms for numerical linear algebra that have come from the technique of linear sketching, whereby given a matrix, one first compresses it to a much smaller matrix by multiplying it by a (usually) random matrix with certain properties. Much of the expensive computation can then be performed on the smaller matrix, thereby accelerating the solution for the original problem. This technique has led to the fastest known algorithms for fundamental problems in this area, and we will consider least squares as well as robust regression problems, low rank approximation, and many variants of these problems, such as those in distributed and streaming environments. Time-permitting we will also discuss the limitations of sketching methods.
Lecture 1: Regression
1. Dimensionality reduction
2. Approximate matrix product and subspace embeddings
3. Least squares regression
Lecture 2: Low Rank Approximation
1. Frobenius norm error
2. Spectral norm error
3. CUR decompositions
Lecture 3: Robust and Weighted Algorithms
1. M-estimator regression
2. Robust low rank approximation
3. Weighted low rank approximation
Lecture 4: Distributed and streaming environments
1. Distributed and streaming models
2. Low rank approximation in these models
3. Lower bounds
Directions to ITCS