PAIR Lab: PKU Alignment and Interaction Research Lab
Is Nash Equilibrium Approximator Learnable?
In this paper, we investigate the learnability of the function approximator that approximates Nash equilibrium (NE) for games generated …
Zhijian Duan, Wenhan Huang, Dinghuai Zhang, Yali Du, Jun Wang, Yaodong Yang, Xiaotie Deng
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models
This paper introduces a distributed, GPU-centric experience replay system, GEAR, designed to perform scalable reinforcement learning …
Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai
A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems
In order for agents in multi-agent systems (MAS) to be safe, they need to take into account the risks posed by the actions of other …
Oliver Slumbers, David Henry Mguni, Stephen Marcus McAleer, Stefano B. Blumberg, Jun Wang, Yaodong Yang
Regret-Minimizing Double Oracle for Extensive-Form Games
By incorporating regret minimization, double oracle methods have demonstrated rapid convergence to Nash Equilibrium (NE) in normal-form …
Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang
MANSA: Learning Fast and Slow in Multi-Agent Systems
In multi-agent reinforcement learning (MARL), independent learning (IL) often shows remarkable performance and easily scales with the …
David Mguni, Haojun Chen, Taher Jafferjee, Jianhong Wang, Long Fei, Xidong Feng, Stephen McAleer, Feifei Tong, Jun Wang, Yaodong Yang
Learning to Shape Rewards using a Game of Two Partners
Reward shaping (RS) is a powerful method in reinforcement learning (RL) for overcoming the problem of sparse or uninformative rewards. …
David Mguni, Taher Jafferjee, Jianhong Wang, Nicolas Perez Nieves, Tianpei Yang, Matthew Taylor, Wenbin Song, Feifei Tong, Hui Chen, Jiangcheng Zhu, Jun Wang, Yaodong Yang
Quality-Similar Diversity via Population Based Reinforcement Learning
Diversity is a growing research topic in Reinforcement Learning (RL). Previous research on diversity has mainly focused on promoting …
Shuang Wu, Jian Yao, Haobo Fu, Ye Tian, Chao Qian, Yaodong Yang, Qiang Fu, Yang Wei
A Game-Theoretic Approach to Multi-Agent Trust Region Optimization
Trust region methods are widely applied in single-agent reinforcement learning problems due to their monotonic performance-improvement …
Ying Wen, Hui Chen, Yaodong Yang, Minne Li, Zheng Tian, Xu Chen, Jun Wang
ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency
Multi-agent reinforcement learning (MARL) suffers from the non-stationarity problem, which is the ever-changing targets at every …
Chuming Li, Jie Liu, Yinmin Zhang, Yuhong Wei, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang
Contextual Transformer for Offline Meta Reinforcement Learning
The pretrain-finetuning paradigm in large-scale sequence models has made significant progress in natural language processing and …
Runji Lin, Ye Li, Xidong Feng, Zhaowei Zhang, Xian Hong Wu Fung, Haifeng Zhang, Jun Wang, Yali Du, Yaodong Yang