Learning Multi-Index Models

Tuesday, April 15, 2025, 4:15 PM to 5:15 PM
Refreshments: 4:00 PM
Location: 32-G449 (Kiva/Patil)
Speaker: Ilias Diakonikolas, UW Madison
Biography: 
Ilias Diakonikolas is the Lubar Professor in the Department of Computer Sciences at UW Madison. He obtained a Diploma in electrical and computer engineering from the National Technical University of Athens and a Ph.D. in computer science from Columbia University, where he was advised by Mihalis Yannakakis. Before moving to UW, he was an Andrew and Erna Viterbi Early Career Chair at USC and a faculty member at the University of Edinburgh. Prior to that, he was a Simons postdoctoral fellow in theoretical computer science at the University of California, Berkeley. His research focuses on the algorithmic foundations of massive data sets, in particular on designing efficient algorithms for fundamental problems in machine learning. He is a recipient of a Sloan Fellowship, an NSF CAREER Award, a Romnes Faculty Fellowship, a Google Faculty Research Award, a Marie Curie Fellowship, the Best Paper Award at NeurIPS 2019, the IBM Research Pat Goldberg Best Paper Award, and an honorable mention in the George Nicholson competition from the INFORMS society. With Daniel Kane, he co-authored the textbook "Algorithmic High-Dimensional Robust Statistics," published by Cambridge University Press.

Multi-index models (MIMs) are functions that depend on the projection of the input onto a low-dimensional subspace. These models offer a powerful framework for studying various machine learning tasks, including multiclass linear classification, learning intersections of halfspaces,
and more complex neural networks. Despite extensive investigation, there remains a vast gap in our understanding of the efficient learnability of MIMs.
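
For concreteness, a standard formalization (our illustrative notation; the abstract itself does not fix symbols) writes a MIM as a link function applied to a low-dimensional projection of the input:

    % f depends on x in R^d only through the k-dimensional projection Wx, with k << d.
    \[
      f(x) \;=\; g(Wx), \qquad W \in \mathbb{R}^{k \times d}, \quad k \ll d,
    \]

where g is an arbitrary (possibly unknown) link function on R^k. For example, an intersection of k halfspaces with normals w_1, ..., w_k (the rows of W) and thresholds t_1, ..., t_k is the MIM

    \[
      f(x) \;=\; \mathbf{1}\!\left[\, \langle w_i, x \rangle \ge t_i \ \text{ for all } i \in [k] \,\right].
    \]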

In this talk, we will survey recent algorithmic developments in learning MIMs, focusing on methods with provable performance guarantees.
In particular, we will present a robust learning algorithm for a broad class of well-behaved MIMs under the Gaussian distribution. A key feature of our algorithm is that its running time has fixed-degree polynomial dependence on the input dimension. We will also demonstrate how this framework leads to more efficient and noise-tolerant learners for multiclass linear classifiers and intersections of halfspaces.
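
To unpack "fixed-degree polynomial dependence" (our gloss; the abstract does not state the exact bound), the guarantee has the shape

    % The exponent c of the dimension d is a universal constant;
    % it does not grow with the hidden dimension k or the accuracy 1/epsilon.
    \[
      \mathrm{runtime} \;=\; d^{\,c} \cdot h(k, 1/\epsilon)
      \quad \text{for a universal constant } c,
    \]

where d is the input dimension, k the dimension of the hidden subspace, epsilon the accuracy parameter, and h some function of the remaining parameters.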

Time permitting, we will highlight some of the many open problems in this area.

The main part of the talk is based on joint work with G. Iakovidis, D. Kane, and N. Zarifis.