Mathematics Seminar No. 2778: Designing High-Dimensional Closed-Loop Optimal Control Using Deep Neural Networks

Posted: 2024/11/22 by 龚惠英

Title: Designing High-Dimensional Closed-Loop Optimal Control Using Deep Neural Networks

Speaker: 胡卫 (上海算法创新研究院)

Time: Friday, November 22, 2024, 16:00

Place: Main Campus, GJ406

Inviter: 秦晓雪

Organizer: Department of Mathematics, College of Sciences

Abstract: Optimal control of high-dimensional nonlinear systems is a longstanding challenge: classical approaches such as solving the Hamilton-Jacobi-Bellman equation suffer from the curse of dimensionality. A promising alternative uses supervised learning, with open-loop optimal control solvers generating the training data and neural networks serving as high-dimensional function approximators. Despite some successes, this approach struggles on more complex problems for two reasons: covariate shift (a mismatch between the training and testing distributions) and discontinuities in the data. We address these issues with a strategic forward training method and a special neural network design that exploits the linear quadratic regulator (LQR). Integrating these techniques yields a robust closed-loop policy that operates effectively over a broad domain for a 7-DoF manipulator and achieves near globally optimal total costs.
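The supervised-learning idea in the abstract can be illustrated on a toy problem. The sketch below is not the speaker's method and has nothing to do with the 7-DoF manipulator: it assumes a scalar discrete-time linear-quadratic problem, uses a backward Riccati recursion as a stand-in for an open-loop solver, and fits a one-parameter linear policy in place of a neural network. All names and constants (`riccati_gains`, `open_loop_solution`, `fit_linear_policy`, `a`, `b`, `q`, `r`, `horizon`) are hypothetical choices made for illustration.

```python
# Toy illustration only (assumed setup, not from the talk):
# dynamics x_{t+1} = a*x_t + b*u_t,
# cost sum_{t<T} (q*x_t^2 + r*u_t^2) + q*x_T^2.

def riccati_gains(a, b, q, r, horizon):
    """Backward Riccati recursion; returns time-varying gains K_0..K_{T-1}."""
    p = q  # terminal cost weight
    gains = []
    for _ in range(horizon):
        k = (a * b * p) / (r + b * b * p)
        p = q + a * a * p - a * b * p * k
        gains.append(k)
    gains.reverse()  # after reversing, gains[t] is the gain applied at time t
    return gains

def open_loop_solution(x0, a, b, q, r, horizon):
    """Stand-in for an open-loop solver: optimal (state, control) pairs from x0."""
    pairs, x = [], x0
    for k in riccati_gains(a, b, q, r, horizon):
        u = -k * x
        pairs.append((x, u))
        x = a * x + b * u
    return pairs

def fit_linear_policy(samples):
    """Least-squares fit of u = w * x (a one-parameter stand-in for a network)."""
    sxx = sum(x * x for x, _ in samples)
    sxu = sum(x * u for x, u in samples)
    return sxu / sxx

a, b, q, r, horizon = 1.2, 1.0, 1.0, 0.5, 50  # hypothetical constants
initial_states = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]

# Training data: the first (x_0, u_0) pair of each optimal trajectory.
samples = [open_loop_solution(x0, a, b, q, r, horizon)[0] for x0 in initial_states]
w = fit_linear_policy(samples)           # learned feedback law u = w * x
k0 = riccati_gains(a, b, q, r, horizon)[0]

print("learned weight:", w, "reference -K_0:", -k0)
print("closed-loop multiplier a + b*w:", a + b * w)
```

In this linear setting the fitted policy reproduces the LQR feedback law, which is why the learned weight can be checked against `-k0`. The difficulties named in the abstract (distribution mismatch and data discontinuities) arise in nonlinear problems and do not appear in this toy case.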


