Speaker: Donato Vásquez
Johann Radon Institute for Computational and Applied Mathematics.
Date: January 20th at 12:10 pm
Abstract: The design of optimal feedbacks for control problems is a challenging task. The classical method for tackling this problem is based on dynamic programming, which involves finding the value function of the control problem by solving the Hamilton-Jacobi-Bellman (HJB) equation. However, this equation suffers from the “curse of dimensionality”, i.e., the computational cost of solving it grows exponentially with the dimension of the underlying control problem. For this reason, several methods based on machine learning have been proposed to solve the HJB equation. Although numerical experiments have shown promising results, theoretical guarantees on the performance of these methods are still lacking. In this regard, one of the main difficulties is the low regularity of HJB solutions.
In this talk we will present results related to the approximation of HJB solutions. These results allow us to find bounds for the performance of feedbacks generated by machine learning methods. It is important to note that these bounds only require the value function to be Hölder continuous, while similar results in the literature require the value function to be at least C^1. To illustrate the importance of these bounds, a family of control problems indexed by a penalty coefficient will be presented. This coefficient controls the regularity of the value function: for values of the coefficient close to zero the value function is C^2, whereas it becomes non-differentiable when the coefficient is sufficiently large. Additionally, we will present the application of these results to the Averaged Feedback Learning Scheme (AFLS), a method that consists of solving an averaged version of the control problem. Finally, the ability of this method to solve high-dimensional problems will be shown through numerical examples.
Venue: DIM seminar room, Beauchef 851, 5th floor.
Zoom: https://uchile.zoom.us/j/96642349167?pwd=MkRVbWxzOFBUUXlCTWFicW0reWZ6dz09