Will MATLAB at some point support parallel computation of the finite-difference Hessian? More specifically, I've been using UseParallel in my fminunc options on a problem with many parameters, but computing the Hessian still takes a fair amount of time.

4 Comments

ruobing han on 2 Oct 2022
Same question here, have you figured it out?
Matt J on 2 Oct 2022
What makes you suppose UseParallel applies to gradient, but not to Hessian computations?
Walter Roberson on 2 Oct 2022
The option description says "When true, fminunc estimates gradients in parallel." But that covers gradients, not the Hessian.
Jonne Guyt on 4 Oct 2022
I've stopped using MATLAB, but simply monitoring system usage (number of busy cores), together with the slowness, was a giveaway that the Hessian is not being computed in parallel.


Answers (1)

Matt J on 2 Oct 2022 (edited 2 Oct 2022)

0 votes

I don't speak for MathWorks, but I think the issue is that finite-difference Hessians are only relevant to the trust-region algorithm, since the quasi-Newton algorithm does not use Hessian computations. In the trust-region algorithm, however, the user is required to provide an analytical gradient via SpecifyObjectiveGradient=true. It seems a rather narrow use case that an analytical gradient calculation would be tractable but an analytical Hessian would not, assuming the memory footprint of such a matrix is not prohibitive. If the memory footprint of the Hessian is prohibitive, the user is meant to use the HessianMultiplyFcn or HessPattern options.

10 Comments

Jonne Guyt on 4 Oct 2022 (edited 4 Oct 2022)
I do not think the case is as narrow as described in your answer. I've stopped using MATLAB in the meantime...
While not a drop-in replacement, one can use/tweak this function to calculate the Hessian in parallel.
(Simply omit/abort the Hessian calculation during optimization, then run this function afterwards to compute the Hessian.)
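For readers who want the flavor of this workaround (skip the Hessian during optimization, then compute it in parallel afterwards), here is a minimal sketch in Python rather than MATLAB. The function name `fd_hessian`, the step size, and the pool size are illustrative choices, not the linked code:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def fd_hessian(f, x, h=1e-4, max_workers=4):
    """Central finite-difference Hessian of f at x.

    Each entry depends on an independent set of function
    evaluations, so the per-entry work can be dispatched to a
    pool. A thread pool only helps if f releases the GIL (e.g.
    it calls into compiled code); otherwise use a process pool.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    f0 = f(x)  # shared by all diagonal entries

    def entry(ij):
        i, j = ij
        ei = np.zeros(n); ei[i] = h
        if i == j:
            return i, j, (f(x + ei) - 2.0 * f0 + f(x - ei)) / h**2
        ej = np.zeros(n); ej[j] = h
        val = (f(x + ei + ej) - f(x + ei - ej)
               - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h**2)
        return i, j, val

    H = np.empty((n, n))
    pairs = [(i, j) for i in range(n) for j in range(i, n)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for i, j, val in pool.map(entry, pairs):
            H[i, j] = H[j, i] = val  # exploit symmetry
    return H

# Example: quadratic with known Hessian [[2, 3], [3, 4]]
H = fd_hessian(lambda v: v[0]**2 + 3*v[0]*v[1] + 2*v[1]**2, [1.0, 2.0])
```

Since only the upper triangle is evaluated, a problem with n parameters needs roughly 2n(n+1) objective evaluations, all independent, which is exactly what makes this embarrassingly parallel.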
Matt J on 4 Oct 2022
"but using a quasi-newton algorithm (fminunc) as well as an interior-point (fmincon) will use finite differences to calculate the hessian and these are not fringe/edge cases."
No, fminunc's quasi-Newton algorithm does not do a full Hessian computation; only gradients are used.
fmincon's interior-point algorithm does have an option to compute the Hessian by finite differences but, like fminunc's trust-region algorithm, it requires the user to supply an analytical gradient, which means the Hessian is likely to be analytically tractable as well (or at least I've yet to see a counter-example). So I wonder why this option would ever be used.
Jonne Guyt on 4 Oct 2022
I'm no longer using MATLAB, and maybe I should have rephrased it, but I am pretty sure that not supplying gradients and/or Hessians is relatively common with fmincon/fminunc (it is by no means a requirement to provide them). In those cases (and there are many in which they're a pain to calculate/provide), you are still stuck with this problem...
Matt J on 4 Oct 2022 (edited 4 Oct 2022)
As I've been saying, not supplying gradients is common, but not in the specific algorithms where full Hessians are used.
Another reason finite-difference Hessians may be discouraged is that the Hessian needs to be inverted, which can be sensitive to finite-differencing errors.
Bruno Luong on 4 Oct 2022 (edited 4 Oct 2022)
@Jonne Guyt "but using a quasi-newton algorithm (fminunc) as well as an interior-point (fmincon) will use finite differences to calculate the hessian and these are not fringe/edge cases."
This statement is wrong. If the Hessian function is not provided by the user, the quasi-Newton Hessian used by both algorithms results from bookkeeping of the gradients evaluated at different points. There is no need for finite differences on top of the gradients.
The documentation mentions the "sparse finite difference algorithm on the gradients" only for the trust-region algorithm, as Matt correctly stated.
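As an illustration of the gradient bookkeeping Bruno describes, here is a minimal sketch (in Python, purely illustrative; this is not MATLAB's implementation) of the BFGS rank-two update, which builds a Hessian approximation from steps and gradient differences alone, with no extra function evaluations:

```python
import numpy as np

def bfgs_update(B, s, y):
    """One BFGS update of the Hessian approximation B, given a
    step s = x_new - x_old and the corresponding gradient change
    y = g_new - g_old. Only already-computed gradients are used;
    no finite differencing of the objective is involved."""
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

# For a quadratic objective with Hessian A, the gradient change
# along a step satisfies y = A @ s exactly.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
s = np.array([1.0, 0.0])
y = A @ s
B1 = bfgs_update(np.eye(2), s, y)
# The update enforces the secant condition: B1 @ s equals y.
```

Each iteration of the optimizer supplies one (s, y) pair for free, which is why no finite-difference Hessian pass is needed in the quasi-Newton algorithms.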
Jonne Guyt on 4 Oct 2022
@Bruno Luong you're right; my point was rephrased in my comment below, but I'll edit the post.
I think the argument is going sideways, and Matt J's comments are not helping users (other than saying "you shouldn't be in that situation"). The point is that if you use fminunc or fmincon without supplying gradients/Hessians, but do ask for the Hessian to be returned, it is calculated via finite differences without parallelization. This use case is common within, for example, discrete choice modeling: the Hessian is used to calculate the standard errors, so you do need it, and there's no simple way to get it via alternative means (e.g., providing the gradient/Hessian analytically).
The question was whether this can be parallelized. MathWorks has not done so; the code linked in my post allows you to do it manually...
Matt J on 4 Oct 2022 (edited 4 Oct 2022)
"if you use fminunc or fmincon and do not supply gradients/hessians, but do ask for the hessian to be calculated and if it is calculated via finite-differences."
That case does not exist in fmincon/fminunc. There is no fmincon/fminunc algorithm that performs a finite-difference Hessian calculation when an analytical gradient is not provided.
So the only question is: do you know of a case where it would make sense to supply an analytical gradient, but not an analytical Hessian?
Matt J on 4 Oct 2022 (edited 4 Oct 2022)
"This use case is common within for example discrete choice modeling. The hessian is used to calculate the standard errors, so you do need it and there's no simple way to get it via alternative means"
If you need the Hessian for the purpose of computing standard errors (and not for iterative optimization), then I agree it may make sense to have a parallelized finite differencer for that. However, it is not clear why that belongs in fminunc/fmincon. Because the Hessian is not being recomputed iteratively, you would use a standalone Hessian-computing routine for that.
Bruno Luong on 4 Oct 2022
And furthermore, the Hessian returned by minimization algorithms is usually NOT suitable for computing standard errors.
Jonne Guyt on 4 Oct 2022
"If you need the Hessian for the purposes of computing standard errors (and not iterative optimization), then I agree it may make sense to have a parallelized finite differencer for that. However, it is not clear why that belongs in fminunc/fmincon. Because the Hessian is not being recomputed iteratively, you would use a standalone Hessian computing routine for that."
Yes, this is exactly the use case. I think you inferred it was for optimization, but I never said that. If you request the Hessian to be returned at the end (to compute SEs), it is approximated with finite differences, so it is "built-in", but it might as well be standalone (hence my workaround). I hope it makes more sense now...
@Bruno Luong exact and approximated SEs may differ by a small amount, but can you point me to where I can find that these are not suitable for discrete choice models with non-obvious gradients?
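For concreteness, the standard-error computation under discussion amounts to inverting the Hessian of the negative log-likelihood at the optimum and taking square roots of its diagonal. A scalar sketch in Python with synthetic data (everything here, including the data and step size, is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=100)  # synthetic sample

def negloglik(mu):
    # Negative log-likelihood of N(mu, 1), up to an additive constant
    return 0.5 * np.sum((data - mu) ** 2)

mu_hat = data.mean()  # MLE of the mean

# 1x1 central finite-difference Hessian at the optimum; the exact
# second derivative here is n = 100 (the log-likelihood is quadratic)
h = 1e-4
H = (negloglik(mu_hat + h) - 2 * negloglik(mu_hat)
     + negloglik(mu_hat - h)) / h**2

# Standard error = sqrt(diag(inv(H))); in the scalar case, 1/sqrt(H).
# For this model that is 1/sqrt(n) = 0.1.
se = np.sqrt(1.0 / H)
```

Because this computation happens once, after optimization has finished, it can live in a standalone (and parallelized) routine rather than inside the solver, which is the point both sides of this exchange seem to agree on.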


Question asked: 29 Jun 2018
Latest comment: 4 Oct 2022
