PAIR Lab: PKU Alignment and Interaction Research Lab
Reinforcement Learning
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models
This paper introduces a distributed, GPU-centric experience replay system, GEAR, designed to perform scalable reinforcement learning …
Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai
A Deep Reinforcement Learning-driven Vine Copula Method for Correlation Structure Analysis of Mortgage
Controlling risk is the key to playing a core role in financial services and effectively serving the high-quality development of the …
Qinghao Wang, Yanling Peng, Yijie Peng, Yaodong Yang
Learning to Shape Rewards using a Game of Two Partners
Reward shaping (RS) is a powerful method in reinforcement learning (RL) for overcoming the problem of sparse or uninformative rewards. …
David Mguni, Taher Jafferjee, Jianhong Wang, Nicolas Perez Nieves, Tianpei Yang, Matthew Taylor, Wenbin Song, Feifei Tong, Hui Chen, Jiangcheng Zhu, Jun Wang, Yaodong Yang
Quality-Similar Diversity via Population Based Reinforcement Learning
Diversity is a growing research topic in Reinforcement Learning (RL). Previous research on diversity has mainly focused on promoting …
Shuang Wu, Jian Yao, Haobo Fu, Ye Tian, Chao Qian, Yaodong Yang, Qiang Fu, Yang Wei
Solving Inventory Management Problems through Deep Reinforcement Learning
Inventory management (e.g. lost sales) is a central problem in supply chain management. Lost sales inventory systems with lead times …
Qinghao Wang, Yijie Peng, Yaodong Yang
MSRL: Distributed Reinforcement Learning with Dataflow Fragments
Reinforcement learning (RL) trains many agents, which is resource-intensive and must scale to large GPU clusters. Different RL training …
Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Yijie Chen, Liang Shi, Yaodong Yang, Peter Pietzuch, Lei Chen
Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning
Setting up a well-designed reward function has been challenging for many reinforcement learning applications. Preference-based …
Runze Liu, Fengshuo Bai, Yali Du, Yaodong Yang
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
Achieving human-level dexterity is an important open problem in robotics. However, tasks of dexterous hand manipulation even at the …
Yuanpei Chen, Tianhao Wu, Shengjie Wang, Xidong Feng, Jiechuang Jiang, Stephen Marcus McAleer, Hao Dong, Zongqing Lu, Song-Chun Zhu, Yaodong Yang
End-to-End Affordance Learning for Robotic Manipulation
Learning to manipulate 3D objects in an interactive environment has been a challenging problem in Reinforcement Learning (RL). In …
Yiran Geng, Boshi An, Haoran Geng, Yuanpei Chen, Yaodong Yang, Hao Dong