Mathematics Seminar No. 2216: An Iterative Scheme of Safe Reinforcement Learning for Nonlinear Systems via Barrier Certificate Generation - College of Sciences, Shanghai University


Created: 2021/11/25 龚惠英

Title: An Iterative Scheme of Safe Reinforcement Learning for Nonlinear Systems via Barrier Certificate Generation

Speaker: Prof. 杨争峰 (East China Normal University)

Time: Thursday, November 25, 2021, 19:00

Place: Tencent Meeting, ID 427648902, password 123456

Inviter: 曾振柄

Host: Department of Mathematics, College of Sciences

Abstract: In this talk, I will introduce a safe reinforcement learning approach to synthesize deep neural network (DNN) controllers for nonlinear systems subject to safety constraints. The proposed approach employs an iterative scheme in which a learner and a verifier interact to synthesize safe DNN controllers. The learner trains a DNN controller via deep reinforcement learning, and the verifier certifies the learned controller by computing a maximal safe initial region and its corresponding barrier certificate, based on polynomial abstraction and solving bilinear matrix inequalities. Unlike existing verification-in-the-loop synthesis methods, our iterative framework synthesizes controllers and barrier certificates sequentially, so it can learn safe controllers with adaptive barrier certificates rather than user-defined ones. We implement the tool SRLBC and evaluate its performance on a set of benchmark examples. The experimental results demonstrate that our approach efficiently synthesizes safe DNN controllers even for nonlinear systems of dimension up to 12.
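The learner and verifier interaction described in the abstract can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and is not taken from SRLBC: the scalar system x' = A*x + u with feedback u = -gain*x, the fixed candidate barrier function B(x) = 1 - x**2, and the names `verify`, `make_learner`, and `synthesize`. The "learner" is a plain gain update standing in for deep reinforcement learning, and the "verifier" is a direct boundary check standing in for polynomial abstraction and bilinear matrix inequality solving.

```python
from typing import Callable, Optional, Tuple

A = 1.0  # hypothetical open-loop coefficient of the toy system x' = A*x + u


def verify(gain: float) -> bool:
    """Check the candidate barrier B(x) = 1 - x**2 for x' = (A - gain) * x.

    On the boundary B(x) = 0 (x = -1 or x = +1) the derivative along
    trajectories is dB/dt = -2 * (A - gain) * x**2. It is strictly positive
    there exactly when gain > A, which keeps trajectories inside |x| <= 1.
    """
    return all(-2.0 * (A - gain) * x * x > 0.0 for x in (-1.0, 1.0))


def make_learner(start: float, step: float) -> Callable[[bool], float]:
    """Stand-in for RL training: raise the gain after each failed check."""
    state = {"gain": start}

    def learn(previous_failed: bool) -> float:
        if previous_failed:
            state["gain"] += step
        return state["gain"]

    return learn


def synthesize(
    learn: Callable[[bool], float],
    check: Callable[[float], bool],
    max_rounds: int = 20,
) -> Optional[Tuple[float, int]]:
    """Alternate learner and verifier until the check passes or rounds run out."""
    failed = False
    for round_idx in range(1, max_rounds + 1):
        gain = learn(failed)       # learner proposes a controller
        if check(gain):            # verifier tries to certify it
            return gain, round_idx
        failed = True              # feedback for the next learning round
    return None


result = synthesize(make_learner(start=0.0, step=0.5), verify)
print(result)
```

In this toy setting the loop stops at the first gain above A. The scheme described in the talk differs in an essential way: the verifier generates the barrier certificate and a maximal safe initial region for each learned controller, whereas this sketch fixes B(x) in advance.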



