Safe multi-agent reinforcement learning for multi-robot control

A challenging problem in robotics is how to control multiple robots cooperatively and safely in real-world applications. Yet, …

Shangding Gu, Jakub Grudzien Kuba, Yuanpei Chen, Yali Du, Long Yang, Alois Knoll, Yaodong Yang

Large Sequence Models for Sequential Decision-Making: A Survey

Transformer architectures have facilitated the development of large-scale and general-purpose sequence models for prediction tasks in …

Muning WEN, Runji LIN, Hanjing WANG, Yaodong Yang, Ying Wen, Luo MAI, Jun Wang, Haifeng ZHANG, Weinan ZHANG

A Deep Reinforcement Learning-driven Vine Copula Method for Correlation Structure Analysis of Mortgage

Controlling risk is the key to playing a core role in financial services and effectively serving the high-quality development of the …

Qinghao WANG, Yanling PENG, Yijie PENG, Yaodong Yang

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning

Population-based multi-agent reinforcement learning (PB-MARL) encompasses a range of methods that merge dynamic population selection …

Ming Zhou, Ziyu Wan, Hanjing Wang, Muning WEN, Runzhe Wu, Ying Wen, Yaodong Yang, Yong Yu, Jun Wang, Weinan ZHANG

On the complexity of computing Markov perfect equilibrium in general-sum stochastic games

We introduce approximate Markov perfect equilibrium as a solution to the computational problem of finite-state stochastic games repeated in the infinite horizon and prove its PPAD-completeness.

Xiaotie Deng, Ningyuan Li, David Mguni, Jun Wang, Yaodong Yang

Solving Inventory Management Problems through Deep Reinforcement Learning

Inventory management (e.g. lost sales) is a central problem in supply chain management. Lost sales inventory systems with lead times …

Qinghao WANG, Yijie PENG, Yaodong Yang

MSRL: Distributed Reinforcement Learning with Dataflow Fragments

Reinforcement learning (RL) trains many agents, which is resource-intensive and must scale to large GPU clusters. Different RL training …

Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Yijie Chen, Liang Shi, Yaodong Yang, Peter Pietzuch, Lei Chen

Online Double Oracle

Solving strategic games with huge action space is a critical yet under-explored topic in economics, operations research and artificial …

Le Cong Dinh, Yaodong Yang, Stephen McAleer, Zheng Tian, Nicolas Perez-Nieves, Oliver Slumbers, David Henry Mguni, Haitham Bou Ammar, Jun Wang

Offline Pre-trained Multi-agent Decision Transformer

Offline reinforcement learning leverages previously collected offline datasets to learn optimal policies with no necessity to access …

Linghui Meng, Muning WEN, Chenyang Le, Xiyun Li, Dengpeng Xing, Weinan ZHANG, Ying Wen, Haifeng ZHANG, Jun Wang, Yaodong Yang, Bo Xu

Measuring the Non-Transitivity in Chess

In this paper, we quantify the non-transitivity in chess using human game data. Specifically, we perform non-transitivity …

Ricky Sanjaya, Jun Wang, Yaodong Yang