The frontier of modern machine learning lies in designing systems that can reliably make sequences of decisions. Such systems typically invoke a trained machine learning model in a loop, where the model's decision at one step affects the inputs it sees at subsequent steps. This feedback loop leads to empirical issues such as error compounding and distribution shift; mitigating these issues calls for specialized algorithm design. How do we design learning algorithms to train models that will be applied sequentially? And how do we design inference algorithms that use trained models to solve sequential tasks?
Taking the standard "building blocks" of machine learning—classifiers, regressors, even generative models—as given, my work identifies when and how these building blocks can (or cannot) be used to provably solve problems of sequential learning and inference. This thesis presents (1) the first algorithm for inference-time alignment with an imperfect process verifier; (2) the computational and statistical limits of supervised fine-tuning with a misspecified expert; and (3) the minimal regression oracles needed to solve reinforcement learning in a broad class of Markov decision processes. Together, these results build toward a computational theory of sequential decision-making.
Committee: Ankur Moitra (advisor), Sam Hopkins, Sasha Rakhlin.