PAIR Lab: PKU Alignment and Interaction Research Lab
Preference-Based Reinforcement Learning
Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning
Setting up a well-designed reward function has been challenging for many reinforcement learning applications. Preference-based …
Runze Liu, Fengshuo Bai, Yali Du, Yaodong Yang