Problem while implementing "Gradient Descent Algorithm" in Matlab

I'm working on a programming assignment in a machine learning course, in which I have to implement the gradient descent algorithm.
I'm using the following code:
data = load('ex1data1.txt');
% text file contains 2 values in each row, separated by commas
X = [ones(m, 1), data(:,1)];
theta = zeros(2, 1);
iterations = 1500;
alpha = 0.01;
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
k=1:m;
j1=(1/m)*sum((theta(1)+theta(2).*X(k,2))-y(k))
j2=((1/m)*sum((theta(1)+theta(2).*X(k,2))-y(k)))*(X(k,2))
theta(1)=theta(1)-alpha*(j1);
theta(2)=theta(2)-alpha*(j2);
J_history(iter) = computeCost(X, y, theta);
end
end
theta = gradientDescent(X, y, theta, alpha, iterations);
On running the above code, I get an error message.

Comments: 3

Racz Robert, 6 Jan 2019 (edited 6 Jan 2019)
brackets mate, should work!
Cheers
Calculation of k can be outside the for loop. Improves performance!
Hey, have you found the answer to your question?

Accepted Answer

Matt J, 11 Apr 2015
j2 is not a scalar, but you are trying to assign it to a scalar location theta(2).
Did you intend for this line
k=1:m;
to be a for-loop
for k=1:m

Comments: 2

Why is j2 not a scalar? The expression
(1/m)*sum((theta(1)+theta(2).*X(k,2))-y(k))
produces a scalar result, which can be multiplied by
X(k,2)
to produce a scalar result. But in MATLAB, I've also seen that the result to be stored in j2 is a vector. Why?
k is not a scalar. You defined it to be the vector 1:m. Therefore X(k,2) is also a vector.
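A small sketch of this point, using made-up data (the 4-row X, y and the theta values are illustrative assumptions, not taken from ex1data1.txt). With k = 1:m the sum is a scalar, but multiplying it by X(k,2) outside the sum gives an m-by-1 vector, which cannot be stored in theta(2). Moving the multiplication inside the sum gives one scalar per parameter.

```matlab
% Illustrative data only (not the assignment's ex1data1.txt)
X = [ones(4, 1), [1; 2; 3; 4]];
y = [2; 4; 6; 8];
theta = [0.5; 1.5];
m = length(y);
k = 1:m;                        % k is a 1-by-m index vector, not a scalar

err = (theta(1) + theta(2) .* X(k, 2)) - y(k);   % m-by-1 residuals

% Multiplying outside the sum: scalar times m-by-1 vector stays a vector
j2_wrong = ((1/m) * sum(err)) * X(k, 2);
disp(size(j2_wrong))

% Multiplying inside the sum: the result is a single number
j2_right = (1/m) * sum(err .* X(k, 2));
disp(size(j2_right))
```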

More Answers (11)

Jayan Joshi, 15 Oct 2019 (edited 15 Oct 2019)
predictions =X*theta;
theta=theta-(alpha/m*sum((predictions-y).*X))';
Margo Khokhlova, 19 Oct 2015 (edited by Walter Roberson, 19 Oct 2015)
Well, this is very late, but you just misplaced the brackets. This one works for me:
k=1:m;
j1=(1/m)*sum((theta(1)+theta(2).*X(k,2))-y(k));
j2=(1/m)*sum(((theta(1)+theta(2).*X(k,2))-y(k)).*X(k,2));
theta(1)=theta(1)-alpha*(j1);
theta(2)=theta(2)-alpha*(j2);
Shekhar Raj, 19 Sep 2019
The code below works for me:
Prediction = X * theta;
temp1 = alpha/m * sum((Prediction - y));
temp2 = alpha/m * sum((Prediction - y) .* X(:,2));
theta(1) = theta(1) - temp1;
theta(2) = theta(2) - temp2;

Comments: 2

Thank you, this really helped. I tried a more vectorized form of this and it worked:
predictions =X*theta;
theta=theta-(alpha/m*sum((predictions-y).*X))';
How did you manage to vectorize it that much? I don't understand how to translate the formula into code; it seems confusing.
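One way to see the translation, sketched on made-up data (X, y and alpha below are illustrative assumptions): the update rule has one sum per parameter, sum over i of (h(x_i) - y_i) * x_ij. Row j of X' multiplied by the residual vector is exactly that sum, so X' * (X*theta - y) computes all of them at once. The sum(...)' form used in this thread should give the same numbers, but it relies on implicit expansion, which needs MATLAB R2016b or later.

```matlab
% Illustrative data only; any m-by-n design matrix works the same way
X = [ones(4, 1), [1; 2; 3; 4]];
y = [2; 4; 6; 8];
theta = [0; 0];
alpha = 0.01;
m = length(y);

err = X * theta - y;                 % h(x_i) - y_i for every example, m-by-1

% Formula written literally: one sum per parameter theta(j)
theta_loop = theta;
for j = 1:size(X, 2)
    theta_loop(j) = theta(j) - (alpha/m) * sum(err .* X(:, j));
end

% Same sums as a single matrix product: row j of X' times err
theta_vec = theta - (alpha/m) * (X' * err);

disp(max(abs(theta_loop - theta_vec)))   % difference between the two forms
```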

Sesha Sai Anudeep Karnam, 7 Aug 2019 (edited 7 Aug 2019)
temp0 = theta(1)-alpha*((1/m)*(theta(1)+theta(2).*X(k,2)-y(k)));
temp1 = theta(2)- alpha*((1/m)*(theta(1)+theta(2).*X(k,2)-y(k)).*X(k,2));
theta(1) = temp0;
theta(2) = temp1;
% This code gives approximate values, but when submitting I get 0 points for it.
% Theta found by gradient descent:
% -3.588389
% 1.123667
% Expected theta values (approx):
% -3.6303
% 1.1664
% How can I overcome this?

Comments: 2

The code below gave the exact value:
for iter = 1:num_iters
% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
% theta.
%
% Hint: While debugging, it can be useful to print out the values
% of the cost function (computeCost) and gradient here.
%
Prediction = X * theta;
temp1 = alpha/m * sum((Prediction - y));
temp2 = alpha/m * sum((Prediction - y) .* X(:,2));
theta(1) = theta(1) - temp1;
theta(2) = theta(2) - temp2;
% ============================================================
I've tried this code but still get an error about not enough input arguments at the line m = length(y). Do you know what the cause might be? As far as I can tell, I have coded it correctly.

ICHEN WU, 8 Nov 2015
Can you tell me why my answer is not correct? I felt they are the same.
theta(1)=theta(1)-(alpha/m)*sum( (X*theta)-y);
theta(2)=theta(2)-(alpha/m)*sum( ((X*theta)-y)'*X(:,2));

Comments: 5

I think we need the context of the rest of your code. Also, are you getting an error message?
Hi, thank you for the response. The other parts of the code are exactly the same. There is no error message; the final result it generates is just different.
I only replaced this part
k=1:m;
j1=(1/m)*sum((theta(1)+theta(2).*X(k,2))-y(k));
j2=(1/m)*sum(((theta(1)+theta(2).*X(k,2))-y(k)).*X(k,2));
theta(1)=theta(1)-alpha*(j1);
theta(2)=theta(2)-alpha*(j2);
to
theta(1)=theta(1)-(alpha/m)*sum( (X*theta)-y);
theta(2)=theta(2)-(alpha/m)*sum( ((X*theta)-y)'*X(:,2));
because I thought I could use matrix operations instead of summing individually over 1:m. But the final values of theta(1) and theta(2) differ slightly from the correct answer.
My answer: Theta found by gradient descent: -3.636063 1.166989
Correct answer: Theta found by gradient descent: -3.630291 1.166362
By assigning theta(1) before assigning theta(2), you've introduced a side effect.
One way of writing it:
temp1 = theta(1)-(alpha/m)*sum(X*theta-y);
theta(2) = theta(2)-(alpha/m)*sum((X*theta-y)'*X(:,2));
theta(1) = temp1;
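A minimal sketch of the side effect, on made-up data (X, y and alpha are illustrative assumptions). In the sequential version, X*theta in the second line already uses the new theta(1), so the step for theta(2) differs from the simultaneous update even after a single iteration.

```matlab
% Illustrative data only
X = [ones(4, 1), [1; 2; 3; 4]];
y = [2; 4; 6; 8];
alpha = 0.01;
m = length(y);

% Sequential update: theta(1) changes before theta(2) is computed
theta_seq = [0; 0];
theta_seq(1) = theta_seq(1) - (alpha/m) * sum(X*theta_seq - y);
theta_seq(2) = theta_seq(2) - (alpha/m) * sum((X*theta_seq - y) .* X(:, 2));

% Simultaneous update: both gradient components use the same old theta
theta_sim = [0; 0];
err = X*theta_sim - y;
theta_sim = theta_sim - (alpha/m) * [sum(err); sum(err .* X(:, 2))];

disp([theta_seq, theta_sim])    % first row matches, second row differs
```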
The one above works perfectly. Try my code below too.
Earlier I used
h = X * theta;
a0 = (1/m)*sum((h-y));
a1 = (1/m)*sum((h-y)'*x1);
Surprisingly, it didn't work.
Working code:
x1 = X(:,2);
a0 = (1/m)*sum((X * theta-y));
a1 = (1/m)*sum((X * theta-y)'*x1);
a = [a0;a1];
theta = theta- (alpha*a);
If anyone finds out what's wrong with my earlier code, it would be appreciated.
Yeah, I tried h = X*theta and it didn't work for me either. I think that when we use the variable h, its value remains unchanged as we update theta.
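On that last point: h staying fixed while theta is updated appears to be what the update rule wants, since every gradient component should be evaluated at the old theta. The sketch below uses made-up data and assumes x1 = X(:,2) is defined before it is used. Under those assumptions the cached-h version and the "working code" take the same step, so the earlier failure may have had another cause, for example x1 not being defined yet. That is a guess, since the full earlier code isn't shown.

```matlab
% Illustrative data only
X = [ones(4, 1), [1; 2; 3; 4]];
y = [2; 4; 6; 8];
theta = [0.5; 1.5];
alpha = 0.01;
m = length(y);
x1 = X(:, 2);                       % assumed definition of x1

% Version with the hypothesis cached in h
h = X * theta;
a0 = (1/m) * sum(h - y);
a1 = (1/m) * ((h - y)' * x1);       % inner product is already a scalar
theta_cached = theta - alpha * [a0; a1];

% "Working code" version: X*theta is recomputed, but theta has not changed yet
b0 = (1/m) * sum(X*theta - y);
b1 = (1/m) * sum((X*theta - y)' * x1);
theta_recomputed = theta - alpha * [b0; b1];

disp(max(abs(theta_cached - theta_recomputed)))   % difference between the two
```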

Ali Dezfooli, 17 Jun 2016
In this line
X = [ones(m, 1), data(:,1)];
You add a bias column to X, but in the formula in your picture (Ng's slides), you should leave that column out when you compute theta(2).
Utkarsh Anand, 17 Mar 2018
Looking at the problem, I also think that you cannot initialize theta as zero.
Rajeswari G, 2 Jan 2021
error = (X * theta) - y;
theta = theta - ((alpha/m) * X'*error);
In this equation, why do we take X'?

Comments: 1

This is because X is a 97x2 matrix and the error vector is 97x1. Only X' (2x97) makes the matrix product valid, giving a 2x1 vector whose entries correspond to theta(1) and theta(2) respectively.
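The dimension argument can be checked with a short sketch. The data here are synthetic stand-ins with 97 rows (an assumption, since ex1data1.txt is not reproduced in this thread), and the residual is named err rather than error to avoid shadowing MATLAB's built-in error function.

```matlab
m = 97;                                  % row count mentioned above
X = [ones(m, 1), linspace(5, 22, m)'];   % synthetic stand-in for the data
y = 1.2 * X(:, 2) - 3.5;                 % synthetic targets
theta = zeros(2, 1);
alpha = 0.01;

err = X * theta - y;                 % (97x2)*(2x1) - (97x1) gives 97x1
grad = X' * err;                     % (2x97)*(97x1) gives 2x1
disp(size(grad))

% X * err would fail: inner dimensions 2 and 97 do not agree
theta = theta - (alpha/m) * grad;    % stays 2x1, same shape as theta
disp(size(theta))
```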

Wamin Thammanusati, 21 Feb 2021 (edited 21 Feb 2021)
The code below works for this case (one variable) and also for multiple variables:
for iter = 1:num_iters
Hypothesis = X * theta;
for i=1:size(X,2)
theta(i) = theta(i) - alpha/m * sum((Hypothesis-y) .* X(:,i));
end
end
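As a rough check of the multi-variable claim, the same loop can be run on a small noise-free data set with two features. Everything below except the loop body is an illustrative assumption: the data, the generating coefficients, alpha and the iteration count.

```matlab
% Noise-free illustrative data with two features plus an intercept column
x1 = [0; 1; 2; 3; 4];
x2 = [1; 0; 1; 0; 1];
X = [ones(5, 1), x1, x2];
y = 1 + 2*x1 - 3*x2;                % generated from theta = [1; 2; -3]
m = length(y);

theta = zeros(size(X, 2), 1);
alpha = 0.1;                        % step size chosen for this small data set
num_iters = 3000;

for iter = 1:num_iters
    Hypothesis = X * theta;         % computed once, before any theta(i) changes
    for i = 1:size(X, 2)
        theta(i) = theta(i) - alpha/m * sum((Hypothesis - y) .* X(:, i));
    end
end

disp(theta')                        % approaches the generating values 1, 2, -3
```

Because Hypothesis is computed before the inner loop, every theta(i) is updated from the same old theta, which keeps the update simultaneous.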

Comments: 1

Having tried the same code, I am struggling to understand what I am doing wrong. I receive an error about not enough input arguments at the m = length(y) line. Do you have any ideas?
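One common cause of this message (an assumption here, since the call itself isn't shown) is running gradientDescent.m on its own, for example with the editor's Run button. The function then starts with no inputs, and the first line that reads y fails. Calling it from the exercise script or the command line with all five arguments avoids that. The sketch below shows the same effect with a hypothetical one-step function handle, oneStep, which is not part of the assignment. The exact error text may differ between environments.

```matlab
% Hypothetical one-step update that, like gradientDescent, needs its inputs
% supplied by the caller
oneStep = @(X, y, theta, alpha) theta - (alpha/length(y)) * (X' * (X*theta - y));

% Illustrative data only
X = [ones(4, 1), [1; 2; 3; 4]];
y = [2; 4; 6; 8];

% Calling with no inputs: the arguments do not exist inside the function
failed = false;
try
    oneStep();
catch
    failed = true;
end
disp(failed)

% Calling with every input supplied works
theta = oneStep(X, y, zeros(2, 1), 0.01);
disp(theta)
```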

Chong Lu, 16 Nov 2021 (edited by Walter Roberson, 27 Nov 2021)
temp1 = theta(1) - alpha*(sum(X*theta - y)/m);
temp2 = theta(2) - alpha*(sum((X*theta - y).*X(:,2))/m);
theta(1) = temp1;
theta(2) = temp2;

Asked: 11 Apr 2015
Commented: 4 Jul 2022