Why does the trust-region-reflective algorithm in fmincon require a gradient as input?
Hello
According to the MATLAB fmincon documentation, the 'trust-region-reflective' algorithm requires you to specify the objective gradient.
1- Why does it need the gradient as input?
2- Does the objective function have to be analytical? My objective function is a function handle that computes some scalar as output.
Thanks in advance for your help
12 Comments
Torsten
25 Feb 2022
The objective function has to be differentiable and return a scalar value.
This makes sense, because how would you minimize a matrix?
SM
25 Feb 2022
My guess is that for the trust-region-reflective algorithm it is very important that the gradient of the objective function is calculated with high precision. That's why the user is requested to supply it (ideally analytically).
Of course, all the other methods also need the gradient of the objective, but there it will be calculated within fmincon by numerical differentiation.
But why worry about it? You won't be able to alter the requirements MATLAB imposes.
SM
25 Feb 2022
I tried to explain the logic behind it. Please read my response more carefully.
The essence is that all algorithms need the gradient of the objective, but that it is vital for the trust-region-reflective algorithm that the gradient is supplied with high precision. That's why the user is explicitly requested to supply it, ideally analytically.
Of course you can also use the TRA when you have no analytical derivative of the objective, namely by calculating the gradient numerically, but I don't know whether that is precise enough for the TRA to succeed. Especially because you said that you do not even know whether your objective is differentiable. If not, better use ga instead of fmincon.
SM
25 Feb 2022
(f(x(1), x(2), ..., x(i) + h, x(i+1), ..., x(120)) - f(x(1), x(2), ..., x(120))) / h
gives you an approximation to the i-th element of the gradient of the objective function f, where h is a small number (e.g. 1e-8).
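The forward-difference formula above can be sketched as follows (a minimal illustration in Python rather than MATLAB; the test function f, the point x0, and the step h are made-up choices, not from this thread):

```python
import numpy as np

def forward_diff_gradient(f, x, h=1e-8):
    """Approximate the gradient of a scalar function f at x by
    forward differences: g[i] ~ (f(x + h*e_i) - f(x)) / h."""
    x = np.asarray(x, dtype=float)
    f0 = f(x)
    g = np.empty_like(x)
    for i in range(x.size):
        xh = x.copy()
        xh[i] += h          # perturb only the i-th variable
        g[i] = (f(xh) - f0) / h
    return g

# Example: f(x) = sum(x**2) has exact gradient 2*x.
f = lambda x: np.sum(x**2)
x0 = np.array([1.0, -2.0, 3.0])
print(forward_diff_gradient(f, x0))  # close to [2, -4, 6]
```

Note the trade-off SM alludes to: the truncation error shrinks with h, but rounding error grows as h shrinks, so a forward difference is inherently less precise than an analytical gradient.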
There are several MATLAB codes in the File Exchange that calculate numerical gradients of a scalar function (I think Matt J gave a link to one of these codes).
But when using fmincon, make sure that your objective and your constraints are really differentiable with respect to the variables. If not, better use ga.
SM
26 Feb 2022
Zaikun Zhang
26 Feb 2022
Edited: Zaikun Zhang, 26 Feb 2022
Walter Roberson
26 Feb 2022
Edited: Walter Roberson, 26 Feb 2022
https://nmayorov.wordpress.com/2015/06/19/trust-region-reflective-algorithm/ has some description of the algorithm.
I do not see anything in there that requires analytic solutions, but I might have missed something.
Bruno Luong
26 Feb 2022
"I have 120 variables (x) and trying to minimise my objective function (f). The objective function is calculated numerically at each iteration, and I cannot calculate its gradient since there is no explicit formula."
This statement is simply not true. I have minimized a function of 1e6 variables with no explicit formula (solving a PDE), and I could compute the exact gradient by the chain rule and an adjoint equation. The only requirement is that the code that computes the objective function uses only differentiable functions and operators (so no min, max, or if) within it.
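Bruno's point, that a gradient can be exact even when the objective has no single closed-form expression, can be sketched with a toy example (Python for illustration; the matrix A, the composed function, and x0 are invented here, not taken from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))  # stand-in for some fixed linear operator

def objective_and_gradient(x):
    """f(x) = sum(sin(A @ x)**2), evaluated step by step.

    No explicit formula for f is ever written down and differentiated;
    instead the exact gradient is propagated through each differentiable
    step by the chain rule:
        y = A @ x            ->  dy/dx = A
        f = sum(sin(y)**2)   ->  df/dy = 2*sin(y)*cos(y)
        df/dx = A.T @ (df/dy)
    """
    y = A @ x
    f = float(np.sum(np.sin(y) ** 2))
    g = A.T @ (2.0 * np.sin(y) * np.cos(y))
    return f, g

x0 = np.array([0.3, -1.2, 0.7])
f0, g0 = objective_and_gradient(x0)
```

The same idea scales to objectives defined by long numerical pipelines (e.g. PDE solves), as long as every step is differentiable; the adjoint method is the efficient way to apply the chain rule in that setting.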
Walter Roberson
26 Feb 2022
I can imagine situations such as the function involving solving integral equations for which no closed-form formula exists. For example, ODEs often have a general solution involving the exp() of the sum of the roots of an integral equation.
Accepted Answer
More Answers (2)
Zaikun Zhang
26 Feb 2022
0 votes
Bruno Luong
26 Feb 2022
Edited: Bruno Luong, 26 Feb 2022
0 votes
I don't think Matt's answer is correct; Torsten's answer is better. The trust-region algorithm must approximate the objective function within the current "trust region". It typically uses the cost function and gradient at many points to build a quadratic approximation. That requires both the cost function and the gradient to depend continuously on the points at which they (and the objective) are evaluated. Numerical finite differences with an adaptive step are therefore not suitable, since there is no known method to make the step vary continuously with the point.
Therefore the TR algorithm requires the user to supply the "exact" gradient. Otherwise it won't converge.
Better still, if the user provides a Hessian-times-vector calculation, it will help the TR method work even more efficiently.
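The "Hessian times vector" idea can be illustrated with a small sketch (Python for illustration; Q, x0, v, and eps are made-up values): a trust-region subproblem solver only ever needs products H @ v, and these can be obtained from two gradient evaluations without forming the full Hessian.

```python
import numpy as np

def hessvec(grad, x, v, eps=1e-6):
    """Approximate H(x) @ v using only gradient evaluations:
        H v ~ (g(x + eps*v) - g(x)) / eps.
    This is the kind of Hessian-times-vector product a trust-region
    subproblem solver (e.g. conjugate gradients) consumes; the full
    Hessian is never formed."""
    return (grad(x + eps * v) - grad(x)) / eps

# Example: f(x) = 0.5 * x' Q x has gradient Q @ x and Hessian Q.
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
grad = lambda x: Q @ x
x0 = np.array([1.0, 2.0])
v = np.array([0.5, -1.0])
print(hessvec(grad, x0, v))  # close to Q @ v = [1.0, -2.5]
```

This also makes Bruno's precision argument concrete: the differencing divides by a tiny eps, so any noise in g is amplified, which is why an accurate gradient matters so much here.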
5 Comments
Matt J
26 Feb 2022
I don't follow that reasoning. By default, the TR method computes the Hessian using finite differences. I don't see why TR would be tolerant to finite difference approximations to the Hessian, but not the gradient.
Bruno Luong
26 Feb 2022
Edited: Bruno Luong, 26 Feb 2022
They use the (L-)BFGS formula to approximate the Hessian from the gradients, not finite differences. The difference from other methods is that TR has to solve a subproblem using the quadratic model (thus the Hessian), rather than just a descent direction (1D), which is more or less d := -H*g in many cases. Therefore the requirement that the gradient be smooth is more important.
Bruno Luong
26 Feb 2022
OK, I missed that.
But my point does not change: if they compute H by finite differences on g, it requires g to be exact so that H is meaningfully estimated. The bottom line is that TR requires more accuracy on g, not because of the computation time.
