2025 |
Yong Joo Do / Deuksun Hong / Hyungbo Shim: Distributed Q-Learning on Multi-Agent Markov Decision Process with Heterogeneous State Transition Probabilities. Proceedings Article. In: 2025 IEEE 64th Conference on Decision and Control (CDC), IEEE Control Systems Society, Rio de Janeiro, Brazil, 2025, ISBN: 979-8-3315-2627-6.
Tags: Blended dynamics, Reinforcement learning, Stochastic Approximation
Abstract: This paper presents a distributed Q-learning (DQ-learning) algorithm within the framework of a multi-agent Markov decision process characterized by heterogeneous state transition probabilities and a common reward function. Because the agents communicate during learning, agents experiencing different state transition probabilities do not converge to their own locally optimal Q-functions. Instead, the interaction forces their learning dynamics to align, and every agent converges to the optimal Q-function corresponding to the average of these heterogeneous state transition probabilities. To analyze this behavior further, we reformulate the DQ-learning update as continuous-time dynamics using a modified ODE-based stochastic approximation. The asymptotic behavior of these dynamics is then analyzed through the blended dynamics approach, which theoretically guarantees the emergent behavior of DQ-learning.
List of English Publications