2025
Yong Joo Do / Deuksun Hong / Hyungbo Shim
Distributed Q-Learning on Multi-Agent Markov Decision Process with Heterogeneous State Transition Probabilities
Proceedings Article. In: 2025 IEEE 64th Conference on Decision and Control (CDC), IEEE Control Systems Society, Rio de Janeiro, Brazil, 2025, ISBN: 979-8-3315-2627-6.
Tags: Blended dynamics, Reinforcement learning, Stochastic Approximation
Abstract: This paper presents a distributed Q-learning (DQ-learning) algorithm within the framework of a Multi-agent Markov Decision Process characterized by heterogeneous state transition probabilities and a common reward function. Because the agents communicate during learning, agents experiencing different state transition probabilities do not converge to their own local optimal Q-functions. Instead, the interaction forces their learning dynamics to align, and every agent converges to the optimal Q-function corresponding to the average of these heterogeneous state transition probabilities. To analyze this behavior further, we reformulate the DQ-learning update as continuous-time dynamics using a modified ODE-based stochastic approximation. Using the blended dynamics approach, we analyze the asymptotic behavior of these dynamics, which theoretically guarantees the emergent behavior of DQ-learning.
Jihoon Suh / Yeongjun Jang / Kaoru Teranishi / Takashi Tanaka
Relative entropy regularized reinforcement learning for efficient encrypted policy synthesis
Journal Article. In: IEEE Control Systems Letters, vol. 9, 2025, ISSN: 2475-1456.
Tags: Entropy Regularization, Homomorphic encryption, Reinforcement learning
Abstract: We propose an efficient encrypted policy synthesis to develop privacy-preserving model-based reinforcement learning. We first demonstrate that the relative-entropy-regularized reinforcement learning (RERL) framework offers a computationally convenient linear and “min-free” structure for value iteration, enabling a direct and efficient integration of fully homomorphic encryption (FHE) with bootstrapping into policy synthesis. We then analyze convergence and error bounds, since encrypted policy synthesis propagates encryption-induced errors, including those from quantization and bootstrapping. The theoretical analysis is validated by numerical simulations, and the results demonstrate the effectiveness of the RERL framework in integrating FHE for encrypted policy synthesis.
2024
Deuksun Hong / Hyungbo Shim
A Reinforcement Learning Approach for Safe Control Using Multi-Agent Q-Learning
Proceedings Article. In: 2024 14th Asian Control Conference, 2024.
Tags: Consensus, Multi-agent system, Reinforcement learning
Abstract: Reinforcement learning techniques typically optimize a controller only for the environment in which it was trained. This causes problems when the learned models are applied to other settings, particularly in real control scenarios where, because of modeling errors or aging, control must be performed under conditions that differ from those of the learning phase. This paper introduces a multi-agent version of Q-learning known as QD-learning. We also show how this algorithm operates when applied to various environment settings, and experimentally demonstrate its effectiveness for designing a safe controller.
2019
Jeong Woo Kim / Hyungbo Shim / Insoon Yang
On Improving the Robustness of Reinforcement Learning-Based Controllers Using Disturbance Observer
Proceedings Article. In: Proc. of 2019 IEEE 58th Conference on Decision and Control, pp. 847-852, IEEE, Nice, France, 2019.
Tags: Disturbance observer, Reinforcement learning
Abstract: Because reinforcement learning (RL) may cause stability and safety issues when directly applied to physical systems, a simulator is often used to learn a control policy. However, control performance may easily deteriorate in a real plant because of the discrepancy between the simulator and the plant. In this paper, we propose enhancing the robustness of such RL-based controllers by using a disturbance observer (DOB). This method compensates for the mismatch between the plant and the simulator, and rejects disturbances to maintain the nominal performance while guaranteeing robust stability. Furthermore, the proposed approach can be applied to partially observable systems. We also characterize conditions under which the learned controller has a provable performance bound when connected to the physical system.
List of English Publications
2025
Distributed Q-Learning on Multi-Agent Markov Decision Process with Heterogeneous State Transition Probabilities. Proceedings Article. In: 2025 IEEE 64th Conference on Decision and Control (CDC), IEEE Control Systems Society, Rio de Janeiro, Brazil, 2025, ISBN: 979-8-3315-2627-6.
Relative entropy regularized reinforcement learning for efficient encrypted policy synthesis. Journal Article. In: IEEE Control Systems Letters, vol. 9, 2025, ISSN: 2475-1456.
2024
A Reinforcement Learning Approach for Safe Control Using Multi-Agent Q-Learning. Proceedings Article. In: 2024 14th Asian Control Conference, 2024.
2019
On Improving the Robustness of Reinforcement Learning-Based Controllers Using Disturbance Observer. Proceedings Article. In: Proc. of 2019 IEEE 58th Conference on Decision and Control, pp. 847-852, IEEE, Nice, France, 2019.