Abstract: In practical reinforcement learning (RL), a representation of the full state that makes the system Markovian, and therefore amenable to most existing RL algorithms, is not known a priori. Decision makers often face so-called partial observability of the state information, which significantly hinders RL. Motivated by recent advances in causal inference, we study batch RL in the presence of unmeasured confounders using auxiliary variables. A number of non-parametric identification results are established, based on which several promising policy optimization algorithms with finite-sample regret guarantees are proposed. Further, time permitting, I will discuss the phenomenon called the "blessing from human-AI interaction" and introduce the framework of super reinforcement learning in the batch setting.
Bio: Zhengling Qi is an assistant professor of Decision Sciences at the School of Business, The George Washington University. He received his PhD from the Department of Statistics and Operations Research at the University of North Carolina at Chapel Hill. His general research interests are statistical machine learning and related non-convex optimization, with a focus on reinforcement learning and causal inference.
Zoom link: https://bit.ly/3Sv4sZF