Regularizing policy iteration for recursive feasibility and stability

Speaker  Prof. Dragan Nesic (University of Melbourne)

Place Building 133 Room 316-1

Time Oct. 16th (Mon) 13:00

Abstract

We will present a new algorithm called policy iteration plus (PI+) for the optimal control of nonlinear deterministic discrete-time plants with general cost functions. PI+ builds upon classical policy iteration and has the distinctive feature of enforcing recursive feasibility under mild conditions, in the sense that the minimization problems solved at each iteration are guaranteed to admit a solution. While recursive feasibility is a desirable property, existing results on the policy iteration algorithm appear to fail to ensure it in general, contrary to PI+. We also establish the recursive stability of PI+: the policies generated at each iteration ensure a stability property for the closed-loop system. For this purpose we rely on more general conditions than those currently available for policy iteration, notably covering set stability. Finally, we present characterizations of near-optimality bounds for PI+ and prove the uniform convergence of the value functions generated by PI+ to the optimal value function. We believe that these results would benefit the burgeoning literature on reinforcement learning, where recursive feasibility is typically assumed without a clear method for verifying it and where recursive stability is essential for safe operation of the system.

Biography

Dragan Nesic is a Professor at the Department of Electrical and Electronic Engineering at The University of Melbourne. He received his Bachelor of Mechanical Engineering degree from the University of Belgrade (1990) and his PhD from the Australian National University (1997). Professor Nesic’s research interests are in the broad area of control engineering, including its mathematical foundations (e.g. Lyapunov stability theory, hybrid systems, singular perturbations, averaging) and its applications to various areas of engineering (e.g. automotive control, optical telecommunications) and science (e.g. neuroscience). More specifically, he has made significant contributions to the areas of nonlinear sampled-data systems, nonlinear networked control systems, event-triggered control, optimization-based control and extremum seeking control, and he has presented several keynote lectures on these topics at international conferences.
Prof. Nesic is a Fellow of IEEE and a Fellow of IFAC, and he served as a Distinguished Lecturer of the IEEE Control Systems Society. He was a co-recipient (with M. Nagahara and D. Quevedo) of the George S. Axelby Outstanding Paper Award (2017). He is a recipient of numerous awards and prizes, including a Doctorate Honoris Causa from the University of Lorraine (2019), a Humboldt Research Award (2020), a Humboldt Research Fellowship (2003-2004), as well as a Future Fellowship (2010-2014) and an Australian Professorial Fellowship (2004-2009) funded by the Australian Research Council. He is an Associate Editor for the journal IEEE Transactions on Control of Network Systems (CONES) and for Foundations and Trends in Systems and Control. He has also served as Associate Editor for the IEEE Transactions on Automatic Control, Automatica, European Journal of Control, and Systems and Control Letters. Prof. Nesic was a General Co-Chair of the 2017 IEEE Conference on Decision and Control and a General Chair of the 2011 Australian Control Conference. He has served on the International Program Committees of many international conferences, including the American Control Conference, the IEEE Conference on Decision and Control, NOLCOS, the Asian Control Conference, and the European Control Conference. Prof. Nesic has also served on various committees, including the Board of Governors of the IEEE Control Systems Society.