PAIR Lab: PKU Alignment and Interaction Research Lab
RLHF
Panacea: Pareto Alignment via Preference Adaptation for LLMs
Current methods for large language model alignment typically use scalar human preference labels. However, this convention tends to …
Yifan Zhong, Chengdong Ma, Xiaoyuan Zhang, Ziran Yang, Haojun Chen, Qingfu Zhang, Siyuan Qi, Yaodong Yang
PDF
Cite