Latino Studies at New York University

Aditya V. Rangan

Courant Institute of Mathematical Sciences
New York University

September 4

Efficient methods for detecting low-rank substructure

A common goal in data analysis is to capture some subset of the data using a reduced number of degrees of freedom.  For example, when analyzing genomic data, one is often interested in discovering subgroups of genes which exhibit correlated activity across a subset of patients.  This goal can be rephrased as follows: given a large, high-dimensional data matrix, how can one efficiently determine whether some submatrix is well captured using only a few principal components?  Naive methods for solving this problem are either very slow or do not scale well as the size of the matrix increases.  In this talk I will present a method that is quite fast, and practical even when the data sets are very large.
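
To make the phrase "well captured using only a few principal components" concrete, here is a minimal sketch in Python with NumPy. It is not the method presented in the talk. It only illustrates the criterion for a submatrix whose rows and columns are already known; the hard part of the problem is that they are not known in advance. The matrix sizes, the planted rank-1 block, the noise level, and the helper name captured_fraction are all assumptions made for illustration, and mean-centering of the block is omitted for brevity.

```python
import numpy as np


def captured_fraction(block, k):
    """Fraction of the block's squared Frobenius norm carried by its
    top-k singular values (hypothetical helper, for illustration only)."""
    s = np.linalg.svd(block, compute_uv=False)  # singular values, descending
    energy = s ** 2
    return float(energy[:k].sum() / energy.sum())


rng = np.random.default_rng(0)

# Assumed toy data: 200 "patients" by 300 "genes" of pure noise.
data = rng.standard_normal((200, 300))

# Plant a nearly rank-1 block on the first 40 patients and 60 genes.
rows = np.arange(40)
cols = np.arange(60)
signal = np.outer(rng.standard_normal(40), rng.standard_normal(60))
data[np.ix_(rows, cols)] = signal + 0.1 * rng.standard_normal((40, 60))

# One principal component captures almost all of the planted block,
# but only a small share of an equally sized block of pure noise.
planted = captured_fraction(data[np.ix_(rows, cols)], k=1)
background = captured_fraction(data[100:140, 100:160], k=1)

print(f"planted block:    {planted:.3f}")
print(f"background block: {background:.3f}")
```

Applying this check to every candidate pair of row and column subsets would be a naive search of the kind described above: the number of candidate submatrices grows exponentially with the size of the matrix.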