
Ph.D. Defense: Songyang Han

April 4, 2023 @ 10:00 am - 11:00 am EDT

Title: Safe, Stable, and Robust Multi-Agent Reinforcement Learning for Connected Autonomous Vehicles

Student: Songyang Han

Major Advisor: Dr. Fei Miao

Associate Advisors: Dr. Caiwen Ding, Dr. Jinbo Bi

Review Committee Members: Dr. Dongjin Song, Dr. Yufeng Wu

Date/Time: Tuesday, April 4, 2023, 10:00 am

Location: WebEx Online & In Person

Meeting room: HBL1102 

Meeting link: https://uconn-cmr.webex.com/uconn-cmr/j.php?MTID=m7e81d9999c4c1880da4c5ab204a2021e

Meeting number: 2621 854 1526

Password: MRcbwvyF534

Join by video system: Dial 26218541526@uconn-cmr.webex.com

You can also dial and enter your meeting number.

Join by phone: +1-415-655-0002 US Toll

Access code: 2621 854 1526


With the development of sensing and communication technologies in networked cyber-physical systems (CPSs), multi-agent reinforcement learning (MARL)-based methodologies have been integrated into the control of physical systems and demonstrate prominent performance across a wide array of CPS domains, such as connected autonomous vehicles (CAVs). However, it remains challenging to take advantage of shared information in MARL to improve the safety of CAVs and the efficiency of traffic flow in dynamic and uncertain environments. It is also challenging to mathematically characterize the performance improvement that communication and cooperation capabilities bring to CAVs.

To address these challenges, we first design an information-sharing-based MARL framework for CAVs that exploits the extra information during decision-making to improve traffic efficiency and safety, using two new techniques: the truncated Q-function and safe action mapping. The truncated Q-function utilizes the shared information from neighboring CAVs so that the joint state and action spaces of the Q-function do not grow with the size of the CAV system. The safe action mapping provides a provable safety guarantee for both training and execution based on control barrier functions.

Second, we propose a Shapley value-based reward reallocation to motivate stable cooperation among autonomous vehicles. We prove that Shapley value-based reward reallocation of MARL is stable and efficient: agents stay within the coalition (the cooperating group) and communicate and cooperate with other coalition members to optimize the coalition-level objective.

Finally, we study the fundamental properties of MARL under state uncertainties. We prove that an optimal agent policy and a robust Nash equilibrium do not always exist for a State-Adversarial Markov Game (SAMG). Instead, we define a new solution concept, the robust agent policy, for the proposed SAMG under adversarial state perturbations, in which agents maximize the worst-case expected state value.
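The safe action mapping mentioned in the abstract filters a learned action through a control barrier function (CBF) constraint. As a minimal illustrative sketch only, not the dissertation's actual formulation: assume a toy single-integrator vehicle x_dot = u with barrier h(x) = x, so the safe set is {x >= 0}; the CBF condition h_dot >= -alpha * h then reduces to the linear constraint u >= -alpha * x, and the safe mapping is a simple clamp.

```python
def safe_action(u_nominal: float, x: float, alpha: float = 1.0) -> float:
    """Project a nominal (learned) action onto the CBF-admissible set.

    Toy system (an assumption for illustration): x_dot = u, barrier h(x) = x,
    safe set {x >= 0}. The CBF condition h_dot(x, u) >= -alpha * h(x)
    reduces to u >= -alpha * x, so the projection is a one-sided clamp.
    """
    return max(u_nominal, -alpha * x)

# A nominal action that would drive the state out of the safe set is clamped:
print(safe_action(-5.0, x=1.0))  # -1.0
# A nominal action already satisfying the constraint passes through unchanged:
print(safe_action(0.5, x=1.0))   # 0.5
```

In higher-dimensional settings the same projection is typically posed as a quadratic program over the CBF constraint rather than a scalar clamp.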
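The Shapley value underlying the reward reallocation scheme has a standard closed form: each agent receives its marginal contribution averaged over all orderings of the coalition. A small self-contained sketch follows; the two-vehicle characteristic function below is a made-up example for illustration, not data from the dissertation.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Shapley value for each player, given a characteristic function v
    mapping frozenset coalitions to real values (v must include frozenset())."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                S = frozenset(S)
                # Weight = |S|! (n - |S| - 1)! / n!  times the marginal contribution of i.
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += weight * (v[S | {i}] - v[S])
        phi[i] = total
    return phi

# Hypothetical 2-vehicle coalition game: cooperating yields more than the sum
# of the solo payoffs (v is an assumption for illustration).
v = {frozenset(): 0.0,
     frozenset({1}): 1.0,
     frozenset({2}): 2.0,
     frozenset({1, 2}): 4.0}
print(shapley_values([1, 2], v))  # {1: 1.5, 2: 2.5}
```

Note the efficiency property the abstract appeals to: the allocated values sum to the grand-coalition payoff v({1, 2}) = 4, so no reward is created or destroyed by the reallocation.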


