Gradient descent for a custom function

I have four equations:
1) e = m - y
2) y = W_3 * h
3) h = z + W_2 * z + f
4) f = W_1 * x
I want to update W_1, W_2, and W_3 to minimize the cost function J = e^T e using gradient descent.
x is an input, y is the output, and m is the desired value for each sample in the dataset.
I would like to do
W_1 = W_1 - eta * grad_{W_1}(J)
W_2 = W_2 - eta * grad_{W_2}(J)
W_3 = W_3 - eta * grad_{W_3}(J)
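For concreteness, this is the kind of update loop I have in mind (a sketch only; it assumes real-valued data, column vectors z and x, and gradients worked out by hand with the chain rule):

```matlab
% Sketch of plain gradient descent for J = e'*e, assuming real data.
% Chain rule gives:
%   dJ/dW_3 = -2*e*h',  dJ/dW_2 = -2*(W_3'*e)*z',  dJ/dW_1 = -2*(W_3'*e)*x'
eta = 1e-3;
for iter = 1:numIters
    f = W1*x;             % eq. 4
    h = z + W2*z + f;     % eq. 3
    y = W3*h;             % eq. 2
    e = m - y;            % eq. 1
    gW3 = -2*e*h';
    gW2 = -2*(W3'*e)*z';
    gW1 = -2*(W3'*e)*x';
    W1 = W1 - eta*gW1;
    W2 = W2 - eta*gW2;
    W3 = W3 - eta*gW3;
end
```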
Going through the documentation, I found that you can train standard neural networks. But note that I have some custom functions, so I guess a built-in optimization function would be more appropriate.
Any ideas?

2 Comments

Matt J, 24 Apr 2024
x is an input, y is the output and m is the desired value for each sample in the dataset
It looks like z is also an input. It is not given by any other equations.
L, 24 Apr 2024
Yes, z is another input.


Answers (2)

Matt J, 24 Apr 2024 (edited 24 Apr 2024)


so I guess a built-in optimization function would be more appropriate.
No, not necessarily. Your equations can be implemented with fullyConnectedLayer and additionLayer objects.
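One possible topology could be sketched as follows (untested; the layer names are illustrative, and the sizes assume the dimensions given later in the thread, i.e. z is 100-dimensional and x, y are scalars). Each W_i is a fullyConnectedLayer whose bias is frozen at its zero initialization via BiasLearnRateFactor = 0, and h = z + W_2*z + W_1*x is formed with a 3-input additionLayer:

```matlab
% Sketch: build the graph for y = W_3*(z + W_2*z + W_1*x) as a dlnetwork.
nz = 100;
lgraph = layerGraph();
lgraph = addLayers(lgraph, featureInputLayer(nz, Name="z"));
lgraph = addLayers(lgraph, featureInputLayer(1,  Name="x"));
lgraph = addLayers(lgraph, fullyConnectedLayer(nz, Name="W2", BiasLearnRateFactor=0));
lgraph = addLayers(lgraph, fullyConnectedLayer(nz, Name="W1", BiasLearnRateFactor=0));
lgraph = addLayers(lgraph, additionLayer(3, Name="add"));
lgraph = addLayers(lgraph, fullyConnectedLayer(1,  Name="W3", BiasLearnRateFactor=0));
lgraph = connectLayers(lgraph, "z",   "add/in1");  % z
lgraph = connectLayers(lgraph, "z",   "W2");
lgraph = connectLayers(lgraph, "W2",  "add/in2");  % W_2*z
lgraph = connectLayers(lgraph, "x",   "W1");
lgraph = connectLayers(lgraph, "W1",  "add/in3");  % f = W_1*x
lgraph = connectLayers(lgraph, "add", "W3");       % y = W_3*h
net = dlnetwork(lgraph);
```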

3 Comments

L, 24 Apr 2024
Thanks. And then I could use traingd? Do I need to create a network object?
Matt J, 24 Apr 2024
You would create a dlnetwork and then use trainnet.
L, 24 Apr 2024 (edited 24 Apr 2024)
Thanks @Matt J. I found the documentation hard to follow. I tried the Deep Learning Toolbox, but I was only able to generate the topology. What should I do next? How do I train it on the x, z inputs? For each (x, z) pair I have a desired output m. I don't want to learn a bias term for any of the matrices, so in the toolbox I set BiasLearnRateFactor = 0.
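The training step could be sketched roughly like this (untested; the variable names Z, X, M for the stacked samples are assumptions, and plain gradient descent corresponds to the "sgdm" solver with zero momentum):

```matlab
% Sketch: Z is N-by-100, X and M are N-by-1 (one row per sample).
% A network with multiple inputs is trained from a combined datastore.
ds = combine(arrayDatastore(Z), arrayDatastore(X), arrayDatastore(M));
opts = trainingOptions("sgdm", Momentum=0, InitialLearnRate=1e-3, MaxEpochs=300);
net = trainnet(ds, net, "mse", opts);
```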


Torsten, 24 Apr 2024 (moved 24 Apr 2024)


e = m - y = m - W_3*h = m - W_3*(z + W_2 * z + W_1 * x )
Now if you formulate this as
e = W1*z + W2*x - m
with
W1 = W_3 + W_2*W_3 and W2 = W_1*W_3
your problem is
min || [z.', x.'] * [W1; W2] - m ||_2
and you can use lsqlin to solve it.
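As a sketch (assuming real-valued data stacked row-wise as Z (N-by-100) and X, M (N-by-1); note that lsqlin handles real problems only, while backslash also solves the complex least-squares case):

```matlab
% Sketch: solve min || C*w - M ||_2 with one row [z_i.', x_i] per sample.
C = [Z, X];                 % N-by-101 design matrix
w = lsqlin(C, M, [], []);   % no inequality constraints
% or simply: w = C \ M;     % also works for complex data
W1hat = w(1:100).';         % estimate of the combined 1-by-100 coefficient
W2hat = w(101);             % estimate of the combined scalar coefficient
```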

10 Comments

L, 24 Apr 2024
But once I find W1 and W2, it will be an underdetermined system to find W_1, W_2, and W_3.
Torsten, 24 Apr 2024 (edited 24 Apr 2024)
Yes, your system is overparameterized. One parameter is free; the other two follow.
L, 24 Apr 2024 (edited 24 Apr 2024)
Can lsqlin handle complex variables? One other thing: shouldn't W1 = W_3 + W_2*W_3 and W2 = W_1*W_3 be W1 = W_3 + W_3*W_2 and W2 = W_3*W_1?
Torsten, 24 Apr 2024 (edited 24 Apr 2024)
One other thing: shouldn't W1 = W_3 + W_2*W_3 and W2 = W_1*W_3 be W1 = W_3 + W_3*W_2 and W2 = W_3*W_1?
Is W_2*W_3 different from W_3*W_2, and W_1*W_3 different from W_3*W_1?
I thought the W_i's were single numbers, as is usual for regression problems:
y = a*x + b
where a and b are scalars and x and y are vectors of a certain length.
L, 24 Apr 2024
The W_i's are matrices!
Torsten, 24 Apr 2024 (edited 24 Apr 2024)
Then please specify the dimensions of all variables/arrays involved.
L, 24 Apr 2024
e, m, y in C^1
W_3 in C^(1x100)
h in C^100
z in C^100
W_2 in C^(100x100)
W_1 in R^(100x1)
x in C^1
Torsten, 24 Apr 2024
But then you have many more free variables to be fitted than input data. Does that make sense?
L, 24 Apr 2024
Well, I can prescribe some of the variables.
Torsten, 24 Apr 2024 (edited 24 Apr 2024)
As far as I understand, you have 10200 free parameters and 200 known values. I think you should reconsider your problem.
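The parameter count follows directly from the dimensions given above:

```matlab
% W_3 is 1x100, W_2 is 100x100, W_1 is 100x1
nParams = 1*100 + 100*100 + 100*1   % 10200 free parameters
```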


Category: Deep Learning Toolbox

Question by L, 24 Apr 2024 (edited 24 Apr 2024)
