How does L2 regularization in a custom training loop work?
Hi,
I implemented a custom training loop to train a sequence-to-sequence regression model. I also implemented L2 regularization as described in the documentation here: https://de.mathworks.com/help/deeplearning/ug/specify-training-options-in-custom-training-loop.html#mw_50581933-e0ce-4670-9456-af23b2b6f337
Now I'm wondering how this works. If I look at other documentation, such as this one from Google, it seems to work differently. Google describes it as adding the square of the weights to the loss. In MATLAB it looks like I add the weights to the gradients. Isn't that something different? Is one way better than the other?
Cheers
Accepted Answer
Richard on 26 September 2024
Short story: these are the same.
Long story: The link you gave explains the underlying mathematics of L2 regularization and its effect on the weights. However, it doesn't really explain the mechanics of how adding a regularization term to the loss affects the minimization algorithm - it stops at saying you "minimize(loss + lambda*complexity)".
In this case the minimization is done by taking steps based on the gradient of the total loss with respect to each weight. For each weight parameter, the required d(loss + lambda*complexity)/dw is equal to dL/dw + d(lambda*complexity)/dw, i.e. the gradient is the sum of the non-regularized gradient and a regularization term. For an L2 penalty of the form (lambda/2)*sum(w.^2), that regularization term is simply lambda*w for each weight, which is exactly the scaled weight the MATLAB example adds to the gradients. It is this sum that the MATLAB example is computing.
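For concreteness, here is a minimal sketch of how that gradient-level regularization typically looks in a custom training loop, along the lines of the documentation example linked in the question (the names net, gradients and l2Regularization are assumptions: net is a dlnetwork and gradients is the Learnables-style table returned from the model gradients function):

l2Regularization = 1e-4;                        % the lambda factor
idx = net.Learnables.Parameter == "Weights";    % regularize weights only, not biases
gradients(idx,:) = dlupdate(@(g,w) g + l2Regularization*w, ...
    gradients(idx,:), net.Learnables(idx,:));

Adding l2Regularization*w to each weight gradient gives the same update you would get by adding 0.5*l2Regularization*sum(w.^2) to the loss before calling dlgradient, because differentiating that penalty with respect to w gives exactly l2Regularization*w. If the penalty is instead written as lambda*sum(w.^2), as in the Google description, the derivative is 2*lambda*w, so the two conventions differ only by a constant factor that is absorbed into the choice of lambda.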