PAIR Lab: PKU Alignment and Interaction Research Lab
Publications
Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang (2023). OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research. Under Review.
Shangding Gu, Jakub Grudzien Kuba, Yuanpei Chen, Yali Du, Long Yang, Alois Knoll, Yaodong Yang (2023). Safe multi-agent reinforcement learning for multi-robot control. Artificial Intelligence (AIJ).
Zhijian Duan, Wenhan Huang, Dinghuai Zhang, Yali Du, Jun Wang, Yaodong Yang, Xiaotie Deng (2023). Is Nash Equilibrium Approximator Learnable? Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023).
Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang (2023). Large Sequence Models for Sequential Decision-Making: A Survey. Frontiers of Computer Science (FCS).
Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai (2023). GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models. The Fortieth International Conference on Machine Learning (ICML 2023).
Oliver Slumbers, David Henry Mguni, Stephen Marcus McAleer, Stefano B. Blumberg, Jun Wang, Yaodong Yang (2023). A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems. The Fortieth International Conference on Machine Learning (ICML 2023).
Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang (2023). Regret-Minimizing Double Oracle for Extensive-Form Games. The Fortieth International Conference on Machine Learning (ICML 2023).
Qinghao Wang, Yanling Peng, Yijie Peng, Yaodong Yang (2023). A Deep Reinforcement Learning-driven Vine Copula Method for Correlation Structure Analysis of Mortgage. China Journal of Econometrics.
Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Yong Yu, Jun Wang, Weinan Zhang (2023). MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning. Journal of Machine Learning Research (JMLR).
David Mguni, Haojun Chen, Taher Jafferjee, Jianhong Wang, Long Fei, Xidong Feng, Stephen McAleer, Feifei Tong, Jun Wang, Yaodong Yang (2023). MANSA: Learning Fast and Slow in Multi-Agent Systems. The Fortieth International Conference on Machine Learning (ICML 2023).
David Mguni, Taher Jafferjee, Jianhong Wang, Nicolas Perez Nieves, Tianpei Yang, Matthew Taylor, Wenbin Song, Feifei Tong, Hui Chen, Jiangcheng Zhu, Jun Wang, Yaodong Yang (2023). Learning to Shape Rewards using a Game of Two Partners. Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023).
Shuang Wu, Jian Yao, Haobo Fu, Ye Tian, Chao Qian, Yaodong Yang, Qiang Fu, Yang Wei (2023). Quality-Similar Diversity via Population Based Reinforcement Learning. The Eleventh International Conference on Learning Representations (ICLR 2023).
Xiaotie Deng, Ningyuan Li, David Mguni, Jun Wang, Yaodong Yang (2023). On the complexity of computing Markov perfect equilibrium in general-sum stochastic games. National Science Review (NSR).
Ying Wen, Hui Chen, Yaodong Yang, Minne Li, Zheng Tian, Xu Chen, Jun Wang (2022). A game-theoretic approach to multi-agent trust region optimization. International Conference on Distributed Artificial Intelligence (DAI 2022).
Qinghao Wang, Yijie Peng, Yaodong Yang (2022). Solving Inventory Management Problems through Deep Reinforcement Learning. Journal of Systems Science and Systems Engineering.
Chuming Li, Jie Liu, Yinmin Zhang, Yuhong Wei, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang (2022). ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency. Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023).
Runji Lin, Ye Li, Xidong Feng, Zhaowei Zhang, Xian Hong Wu Fung, Haifeng Zhang, Jun Wang, Yali Du, Yaodong Yang (2022). Contextual Transformer for Offline Meta Reinforcement Learning. NeurIPS 2022 Foundation Models for Decision Making Workshop.
Jie Ren, Xidong Feng, Bo Liu, Xuehai Pan, Yao Fu, Luo Mai, Yaodong Yang (2022). TorchOpt: An Efficient Library for Differentiable Optimization. OPT2022: 14th Annual Workshop on Optimization for Machine Learning.
Yali Du, Chengdong Ma, Yuchen Liu, Runji Lin, Hao Dong, Jun Wang, Yaodong Yang (2022). Scalable Model-based Policy Optimization for Decentralized Networked Systems. The 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022).
Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Yijie Chen, Liang Shi, Yaodong Yang, Peter Pietzuch, Lei Chen (2022). MSRL: Distributed Reinforcement Learning with Dataflow Fragments. USENIX Annual Technical Conference (ATC).
Puhao Li, Tengyu Liu, Yuyang Li, Yiran Geng, Yixin Zhu, Yaodong Yang, Siyuan Huang (2022). GenDexGrasp: Generalizable Dexterous Grasping. 2023 IEEE International Conference on Robotics and Automation (ICRA 2023).
Le Cong Dinh, Yaodong Yang, Stephen McAleer, Zheng Tian, Nicolas Perez Nieves, Oliver Slumbers, David Henry Mguni, Haitham Bou Ammar, Jun Wang (2022). Online Double Oracle. Transactions on Machine Learning Research (TMLR).
Yuanpei Chen, Tianhao Wu, Shengjie Wang, Xidong Feng, Jiechuang Jiang, Stephen Marcus McAleer, Hao Dong, Zongqing Lu, Song-Chun Zhu, Yaodong Yang (2022). Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning. The 36th Conference on Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks.
Runze Liu, Fengshuo Bai, Yali Du, Yaodong Yang (2022). Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning. The 36th Conference on Neural Information Processing Systems (NeurIPS 2022).
Xuehai Pan, Mickel Liu, Fangwei Zhong, Yaodong Yang, Song-Chun Zhu, Yizhou Wang (2022). MATE: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control. The 36th Conference on Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks.
Long Yang, Jiaming Ji, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, Gang Pan (2022). Constrained Update Projection Approach to Safe Policy Optimization. The 36th Conference on Neural Information Processing Systems (NeurIPS 2022).
Zongkai Liu, Chao Yu, Yaodong Yang, Peng Sun, Zifan Wu, Yuan Li (2022). A Unified Diversity Measure for Multiagent Reinforcement Learning. The 36th Conference on Neural Information Processing Systems (NeurIPS 2022).
Bo Liu, Xidong Feng, Jie Ren, Luo Mai, Rui Zhu, Haifeng Zhang, Jun Wang, Yaodong Yang (2022). A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning. The 36th Conference on Neural Information Processing Systems (NeurIPS 2022).
Yiran Geng, Boshi An, Haoran Geng, Yuanpei Chen, Yaodong Yang, Hao Dong (2022). End-to-End Affordance Learning for Robotic Manipulation. 2023 IEEE International Conference on Robotics and Automation (ICRA 2023).
Zhitao Zhu, Shijing Si, Jianzong Wang, Yaodong Yang, Jing Xiao (2022). Debias the Black-Box: A Fair Ranking Framework via Knowledge Distillation. Web Information Systems Engineering–WISE 2022: 23rd International Conference.
Linghui Meng, Muning Wen, Chenyang Le, Xiyun Li, Dengpeng Xing, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Yaodong Yang, Bo Xu (2022). Offline Pre-trained Multi-agent Decision Transformer. Machine Intelligence Research.
Muning Wen, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang (2022). Multi-Agent Reinforcement Learning is a Sequence Modeling Problem. The 36th Conference on Neural Information Processing Systems (NeurIPS 2022).
Yurong Chen, Xiaotie Deng, Chenchen Li, David Mguni, Jun Wang, Xiang Yan, Yaodong Yang (2022). On the Convergence of Fictitious Play: A Decomposition Approach. The 31st International Joint Conference on Artificial Intelligence (IJCAI 2022).
Ricky Sanjaya, Jun Wang, Yaodong Yang (2022). Measuring the Non-Transitivity in Chess. Algorithms 2022.
Xidong Feng, Oliver Slumbers, Ziyu Wan, Bo Liu, Stephen McAleer, Ying Wen, Jun Wang, Yaodong Yang (2021). Neural Auto-Curricula in Two-Player Zero-Sum Games. The 35th Conference on Neural Information Processing Systems (NeurIPS 2021).
David Henry Mguni, Taher Jafferjee, Jianhong Wang, Oliver Slumbers, Nicolas Perez Nieves, Feifei Tong, Li Yang, Jiangcheng Zhu, Yaodong Yang, Jun Wang (2021). LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning. Tenth International Conference on Learning Representations (ICLR 2022).
Le Cong Dinh, David Henry Mguni, Long Tran-Thanh, Jun Wang, Yaodong Yang (2021). Online Markov Decision Processes with Non-oblivious Strategic Adversary. Autonomous Agents and Multi-Agent Systems (2023).
Jakub Grudzien Kuba, Ruiqing Chen, Muning Wen, Ying Wen, Fanglei Sun, Jun Wang, Yaodong Yang (2021). Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning. Tenth International Conference on Learning Representations (ICLR 2022).
Jakub Grudzien Kuba, Muning Wen, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang, Yaodong Yang (2021). Settling the Variance of Multi-Agent Policy Gradients. The 35th Conference on Neural Information Processing Systems (NeurIPS 2021).
Xiangyu Liu, Hangtian Jia, Ying Wen, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhipeng Hu, Yaodong Yang (2021). Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games. The 35th Conference on Neural Information Processing Systems (NeurIPS 2021).