Time: 2:00 PM, December 12, 2024
Venue: 学11-304
Topic: Paper Sharing
Paper Title: RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback (ICML 2024)
Published in: International Conference on Machine Learning (ICML)
Year: 2024
Speaker: 姚宗贵
Abstract: Reward engineering has long been a challenge in Reinforcement Learning (RL) research, as it often requires extensive human effort and iterative processes of trial-and-error to design effective reward functions. In this paper, we propose RL-VLM-F, a method that automatically generates reward functions for agents to learn new tasks, using only a text description of the task goal and the agent's visual observations, by leveraging feedback from vision language foundation models (VLMs). The key to our approach is to query these models to give preferences over pairs of the agent's image observations based on the text description of the task goal, and then learn a reward function from the preference labels, rather than directly prompting these models to output a raw reward score, which can be noisy and inconsistent. We demonstrate that RL-VLM-F successfully produces effective rewards and policies across various domains - including classic control, as well as manipulation of rigid, articulated, and deformable objects - without the need for human supervision, outperforming prior methods that use large pretrained models for reward generation under the same assumptions.
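The core mechanism in the abstract, learning a reward function from pairwise preference labels rather than raw scores, can be sketched as below. This is an illustrative toy, not the authors' implementation: it uses a 1-D linear reward model and hard-coded preference labels standing in for actual VLM queries over image pairs, fit with the standard Bradley-Terry preference model.

```python
import math
import random

def sigmoid(x):
    """Logistic function, used for the Bradley-Terry preference probability."""
    return 1.0 / (1.0 + math.exp(-x))

def train_reward(pairs, steps=500, lr=0.1, seed=0):
    """Fit a toy 1-D linear reward r(obs) = w * obs from preference labels.

    pairs: list of (obs_a, obs_b, pref), where pref = 1 means the
    (simulated) VLM preferred observation A over B, and 0 means B over A.
    Bradley-Terry model: P(A preferred over B) = sigmoid(r_a - r_b).
    """
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        obs_a, obs_b, pref = rng.choice(pairs)
        p_a = sigmoid(w * (obs_a - obs_b))
        # Gradient of the negative log-likelihood with respect to w:
        # dL/dw = (p_a - pref) * (obs_a - obs_b)
        w -= lr * (p_a - pref) * (obs_a - obs_b)
    return w

if __name__ == "__main__":
    # Simulated feedback: observations closer to the goal value 1.0 are preferred.
    pairs = [(0.9, 0.2, 1), (0.1, 0.8, 0), (0.7, 0.3, 1), (0.2, 0.6, 0)]
    w = train_reward(pairs)
    print(w > 0)  # learned reward ranks goal-closer observations higher
```

In RL-VLM-F the preference labels come from querying a VLM with the task's text description and a pair of image observations; the learned reward then drives standard preference-based RL, avoiding the noise of asking the model for raw scalar rewards directly.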
The talk will introduce the research background and main contributions of the paper, and discuss related research ideas.
All faculty and students are welcome to attend!