Coursera: Machine Learning (Week 5) [Assignment Solution] - Andrew NG

▸ Back-propagation algorithm for neural networks, applied to the task of hand-written digit recognition.

I have recently completed the Machine Learning course from Coursera by Andrew NG.

While doing the course, we have to go through various quizzes and assignments.

Here, I am sharing my solutions for the weekly assignments throughout the course.

These solutions are for reference only.

It is recommended that you solve the assignments honestly by yourself; only then does it make sense to complete the course.
But in case you get stuck in between, feel free to refer to the solutions provided by me.

NOTE:

Don't just copy-paste the code for the sake of completion.
Even if you copy the code, make sure you understand it first.

Click here to check out the week-4 assignment solutions. Scroll down for the week-5 assignment solutions.




In this exercise, you will implement the back-propagation algorithm for neural networks and apply it to the task of hand-written digit recognition. Before starting on the programming exercise, we strongly recommend watching the video lectures and completing the review questions for the associated topics.

The exercise consists of the following files:
  • ex4.m - Octave/MATLAB script that steps you through the exercise
  • ex4data1.mat - Training set of hand-written digits
  • ex4weights.mat - Neural network parameters for exercise 4
  • submit.m - Submission script that sends your solutions to our servers
  • displayData.m - Function to help visualize the dataset
  • fmincg.m - Function minimization routine (similar to fminunc)
  • sigmoid.m - Sigmoid function
  • computeNumericalGradient.m - Numerically compute gradients
  • checkNNGradients.m - Function to help check your gradients
  • debugInitializeWeights.m - Function for initializing weights
  • predict.m - Neural network prediction function
  • [*] sigmoidGradient.m - Compute the gradient of the sigmoid function
  • [*] randInitializeWeights.m - Randomly initialize weights
  • [*] nnCostFunction.m - Neural network cost function
* indicates files you will need to complete


sigmoidGradient.m :

function g = sigmoidGradient(z)
  %SIGMOIDGRADIENT returns the gradient of the sigmoid function
  %evaluated at z
  %   g = SIGMOIDGRADIENT(z) computes the gradient of the sigmoid function
  %   evaluated at z. This should work regardless if z is a matrix or a
  %   vector. In particular, if z is a vector or matrix, you should return
  %   the gradient for each element.
  
  g = zeros(size(z));
  
  % ====================== YOUR CODE HERE ======================
  % Instructions: Compute the gradient of the sigmoid function evaluated at
  %               each value of z (z can be a matrix, vector or scalar).
  
  g = sigmoid(z).*(1-sigmoid(z));
  
  % =============================================================
end
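
As a quick sanity check (this snippet is not part of the assignment files, just an illustrative example), sigmoidGradient should return exactly 0.25 at z = 0 and should work element-wise on vectors and matrices:

g0 = sigmoidGradient(0)           % expected: 0.2500
gv = sigmoidGradient([-1 0 1])    % expected: approx. [0.1966  0.2500  0.1966]
gm = sigmoidGradient(zeros(2))    % expected: 2 x 2 matrix of 0.2500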




randInitializeWeights.m :

function W = randInitializeWeights(L_in, L_out)
  %RANDINITIALIZEWEIGHTS Randomly initialize the weights of a layer with L_in
  %incoming connections and L_out outgoing connections
  %   W = RANDINITIALIZEWEIGHTS(L_in, L_out) randomly initializes the weights 
  %   of a layer with L_in incoming connections and L_out outgoing 
  %   connections. 
  %
  %   Note that W should be set to a matrix of size(L_out, 1 + L_in) as
  %   the first column of W handles the "bias" terms
  %
  
  % You need to return the following variables correctly 
  W = zeros(L_out, 1 + L_in);
  
  % ====================== YOUR CODE HERE ======================
  % Instructions: Initialize W randomly so that we break the symmetry while
  %               training the neural network.
  %
  % Note: The first column of W corresponds to the parameters for the bias unit
  %
  % epsilon_init = 0.12;
  
  epsilon_init = sqrt(6)/(sqrt(L_in)+sqrt(L_out));
  W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init; % uniform values in [-epsilon_init, epsilon_init]
  
  % =========================================================================
end
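
For context, ex4.m uses this function to create one weight matrix per layer and then unrolls them into a single parameter vector before training. A rough sketch of that step (assuming the usual ex4.m variable names) looks like this:

initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size); % 25 x 401
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels);       % 10 x 26

% Unroll the parameters into a single column vector
initial_nn_params = [initial_Theta1(:) ; initial_Theta2(:)];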

nnCostFunction.m :

function [J, grad] = nnCostFunction(nn_params, ...
      input_layer_size, ...
      hidden_layer_size, ...
      num_labels, ...
      X, y, lambda)
  %NNCOSTFUNCTION Implements the neural network cost function for a two layer
  %neural network which performs classification
  %   [J grad] = NNCOSTFUNCTION(nn_params, input_layer_size, hidden_layer_size, ...
  %   num_labels, X, y, lambda) computes the cost and gradient of the neural network. The
  %   parameters for the neural network are "unrolled" into the vector
  %   nn_params and need to be converted back into the weight matrices.
  %
  %   The returned parameter grad should be an "unrolled" vector of the
  %   partial derivatives of the neural network.
  %
  
  % Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
  % for our 2 layer neural network
  % DIMENSIONS:
  % Theta1 = 25 x 401
  % Theta2 = 10 x 26
  
  Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
      hidden_layer_size, (input_layer_size + 1));
  
  Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
      num_labels, (hidden_layer_size + 1));
  
  % Setup some useful variables
  m = size(X, 1);
  
  % You need to return the following variables correctly
  J = 0;
  Theta1_grad = zeros(size(Theta1)); % 25 x 401
  Theta2_grad = zeros(size(Theta2)); % 10 x 26
  
  % ====================== YOUR CODE HERE ======================
  % Instructions: You should complete the code by working through the
  %               following parts.
  %
  % Part 1: Feedforward the neural network and return the cost in the
  %         variable J. After implementing Part 1, you can verify that your
  %         cost function computation is correct by verifying the cost
  %         computed in ex4.m
  %
  % Part 2: Implement the backpropagation algorithm to compute the gradients
  %         Theta1_grad and Theta2_grad. You should return the partial derivatives of
  %         the cost function with respect to Theta1 and Theta2 in Theta1_grad and
  %         Theta2_grad, respectively. After implementing Part 2, you can check
  %         that your implementation is correct by running checkNNGradients
  %
  %         Note: The vector y passed into the function is a vector of labels
  %               containing values from 1..K. You need to map this vector into a
  %               binary vector of 1's and 0's to be used with the neural network
  %               cost function.
  %
  %         Hint: We recommend implementing backpropagation using a for-loop
  %               over the training examples if you are implementing it for the
  %               first time.
  %
  % Part 3: Implement regularization with the cost function and gradients.
  %
  %         Hint: You can implement this around the code for
  %               backpropagation. That is, you can compute the gradients for
  %               the regularization separately and then add them to Theta1_grad
  %               and Theta2_grad from Part 2.
  %
  
  %%%%%%%%%%% Part 1: Calculating J w/o Regularization %%%%%%%%%%%%%%%
  
  X = [ones(m,1), X];  % Adding 1 as first column in X
  
  a1 = X; % 5000 x 401
  
  z2 = a1 * Theta1';  % m x hidden_layer_size == 5000 x 25
  a2 = sigmoid(z2); % m x hidden_layer_size == 5000 x 25
  a2 = [ones(size(a2,1),1), a2]; % Adding bias unit (column of 1's) as the first column of a2 % m x (hidden_layer_size + 1) == 5000 x 26
  
  z3 = a2 * Theta2';  % m x num_labels == 5000 x 10
  a3 = sigmoid(z3); % m x num_labels == 5000 x 10
  
  h_x = a3; % m x num_labels == 5000 x 10
  
  % Converting y into a matrix of 0's and 1's (one-hot encoding) for multi-class classification
  
  %%%%% WORKING %%%%%
  % y_Vec = zeros(m,num_labels);
  % for i = 1:m
  %     y_Vec(i,y(i)) = 1;
  % end
  %%%%%%%%%%%%%%%%%%%
  
  y_Vec = (1:num_labels)==y; % m x num_labels == 5000 x 10
  
  % Cost function without regularization
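  % Note (added for reference): the line below computes
  %   J = (1/m) * sum over examples i and classes k of
  %       [ -y_Vec(i,k)*log(h_x(i,k)) - (1-y_Vec(i,k))*log(1-h_x(i,k)) ]
  % where y_Vec is the one-hot encoding of y and h_x is the network output.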
  J = (1/m) * sum(sum((-y_Vec.*log(h_x))-((1-y_Vec).*log(1-h_x))));  %scalar
  
  
  %%%%%%%%%%% Part 2: Implementing Backpropagation for Theta_grad w/o Regularization %%%%%%%%%%%%%
  
  %%%%%%% WORKING: Backpropagation using for-loop %%%%%%%
  % for t=1:m
  %     % Here X already includes the bias column of 1's at the beginning
  %     
  %     % for layer-1
  %     a1 = X(t,:)'; % (n+1) x 1 == 401 x 1
  %     
  %     % for layer-2
  %     z2 = Theta1 * a1;  % hidden_layer_size x 1 == 25 x 1
  %     a2 = [1; sigmoid(z2)]; % (hidden_layer_size+1) x 1 == 26 x 1
  %   
  %     % for layer-3
  %     z3 = Theta2 * a2; % num_labels x 1 == 10 x 1    
  %     a3 = sigmoid(z3); % num_labels x 1 == 10 x 1    
  % 
  %     yVector = (1:num_labels)'==y(t); % num_labels x 1 == 10 x 1    
  %     
  %     %calculating delta values
  %     delta3 = a3 - yVector; % num_labels x 1 == 10 x 1    
  %     
  %     delta2 = (Theta2' * delta3) .* [1; sigmoidGradient(z2)]; % (hidden_layer_size+1) x 1 == 26 x 1
  %     
  %     delta2 = delta2(2:end); % hidden_layer_size x 1 == 25 x 1 %Removing delta2 for bias node  
  %     
  %     % delta_1 is not calculated because we do not associate error with the input  
  %     
  %     % CAPITAL delta update
  %     Theta1_grad = Theta1_grad + (delta2 * a1'); % 25 x 401
  %     Theta2_grad = Theta2_grad + (delta3 * a2'); % 10 x 26
  %  
  % end
  % 
  % Theta1_grad = (1/m) * Theta1_grad; % 25 x 401
  % Theta2_grad = (1/m) * Theta2_grad; % 10 x 26
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  
  %%%%%% WORKING: Backpropagation (Vectorized Implementation) %%%%%%%
  % Here X already includes the bias column of 1's at the beginning
  A1 = X; % 5000 x 401
  
  Z2 = A1 * Theta1';  % m x hidden_layer_size == 5000 x 25
  A2 = sigmoid(Z2); % m x hidden_layer_size == 5000 x 25
  A2 = [ones(size(A2,1),1), A2]; % Adding bias unit (column of 1's) as the first column of A2 % m x (hidden_layer_size + 1) == 5000 x 26
  
  Z3 = A2 * Theta2';  % m x num_labels == 5000 x 10
  A3 = sigmoid(Z3); % m x num_labels == 5000 x 10
  
  % h_x = a3; % m x num_labels == 5000 x 10
  
  y_Vec = (1:num_labels)==y; % m x num_labels == 5000 x 10
  
  DELTA3 = A3 - y_Vec; % 5000 x 10
  DELTA2 = (DELTA3 * Theta2) .* [ones(size(Z2,1),1) sigmoidGradient(Z2)]; % 5000 x 26
  DELTA2 = DELTA2(:,2:end); % 5000 x 25 % Removing the delta for the bias unit
  
  Theta1_grad = (1/m) * (DELTA2' * A1); % 25 x 401
  Theta2_grad = (1/m) * (DELTA3' * A2); % 10 x 26
  
  %%%%%%%%%%%% WORKING: DIRECT CALCULATION OF THETA GRADIENT WITH REGULARISATION %%%%%%%%%%%
  % %Regularization term is later added in Part 3
  % Theta1_grad = (1/m) * Theta1_grad + (lambda/m) * [zeros(size(Theta1, 1), 1) Theta1(:,2:end)]; % 25 x 401
  % Theta2_grad = (1/m) * Theta2_grad + (lambda/m) * [zeros(size(Theta2, 1), 1) Theta2(:,2:end)]; % 10 x 26
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  
  
  %%%%%%%%%%%% Part 3: Adding Regularisation term in J and Theta_grad %%%%%%%%%%%%%
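  % Note (added for reference): the regularization term below sums the squares of all
  % weights in Theta1 and Theta2, excluding the first (bias) column of each matrix,
  % and scales the total by lambda/(2*m).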
  reg_term = (lambda/(2*m)) * (sum(sum(Theta1(:,2:end).^2)) + sum(sum(Theta2(:,2:end).^2))); %scalar
  
  % Cost function with regularization
  J = J + reg_term; %scalar
  
  %Calculating gradients for the regularization
  Theta1_grad_reg_term = (lambda/m) * [zeros(size(Theta1, 1), 1) Theta1(:,2:end)]; % 25 x 401
  Theta2_grad_reg_term = (lambda/m) * [zeros(size(Theta2, 1), 1) Theta2(:,2:end)]; % 10 x 26
  
  %Adding regularization term to earlier calculated Theta_grad
  Theta1_grad = Theta1_grad + Theta1_grad_reg_term;
  Theta2_grad = Theta2_grad + Theta2_grad_reg_term;
  
  % -------------------------------------------------------------
  
  % =========================================================================
  
  % Unroll gradients
  grad = [Theta1_grad(:) ; Theta2_grad(:)];

end
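
Once nnCostFunction passes the checkNNGradients test, ex4.m trains the network by passing a "short hand" of the cost function to fmincg. A rough sketch of that training and prediction step (assuming the usual ex4.m variable names) looks like this:

options = optimset('MaxIter', 50);
lambda = 1;

% Short hand: a function of the parameter vector p only
costFunction = @(p) nnCostFunction(p, input_layer_size, hidden_layer_size, ...
    num_labels, X, y, lambda);

[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);

% Reshape the learned parameters back into Theta1 and Theta2
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
    hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
    num_labels, (hidden_layer_size + 1));

% Evaluate training accuracy with the provided predict.m
pred = predict(Theta1, Theta2, X);
fprintf('Training Set Accuracy: %f\n', mean(double(pred == y)) * 100);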





I have tried to provide optimized solutions, such as vectorized implementations, for each assignment. If you think more optimization can be done, please suggest the corrections / improvements.

--------------------------------------------------------------------------------
Click here to see solutions for all Machine Learning Coursera Assignments.
&
Click here to see more codes for Raspberry Pi 3 and similar Family.
&
Click here to see more codes for NodeMCU ESP8266 and similar Family.
&
Click here to see more codes for Arduino Mega (ATMega 2560) and similar Family.

Feel free to ask doubts in the comment section. I will try my best to solve them.
If you find this helpful by any means, like, comment, and share the post.
This is the simplest way to encourage me to keep doing such work.

Thanks and Regards,
-Akshay P. Daga





61 Comments

  1. Hi,

    I am clear up to how to calculate partial derivatives. But I am having a doubt after calculating the delta values. I have got delta-2 values with dimension 10 x 25 and delta-1 with dimension 25 x 400. This is what I got for the first row of the input layer. So, for 5000 rows, how will these delta values be calculated?

    Thanks.

    ReplyDelete
    Replies

    1. Dimensions of delta should be the same as the corresponding Z dimensions, not the same as theta's dimensions.

      Delete
  2. Please clarify why you have used different variables in the cost function and backpropagation.

    ReplyDelete
    Replies
    1. Just to keep two implementations separate for easy understanding of users.

      Delete
  3. Your Program is not running.
    When I try to run this program sigmoidGradient.m it says error in line 9. Now what is the error ?

    ReplyDelete
    Replies
    1. Hi Sarthak, You should also post the error you are getting.

      All above programs are working and tested by me multiple times.

      Delete
  4. The program is giving the following error when running nnCostFunction.m

    Error using ==
    Matrix dimensions must agree.

    Error in nnCostFunction (line 87)
    y_Vec = (1:num_labels)==y; % m x num_labels == 5000 x 10

    ReplyDelete
  5. Hi Andrew, when I tried to test the sigmoid gradient descent by inputting the code
    g = sigmoid(z).*(1-sigmoid(z));

    I got the following error:
    undefined function or variable 'z'.

    Please why is this happening?


    ReplyDelete
    Replies
    1. Hi, I am not Andrew. I am Akshay.
      I think you are doing this assignment in Octave and that's why you are facing this issue.

      Chethan Bhandarkar has provided solution for it. Please check it out: https://www.apdaga.com/2018/06/coursera-machine-learning-week-2.html?showComment=1563986935868#c4682866656714070064

      Thanks

      Delete
    2. You'll be getting this error because you are running your program sigmoidGradient.m
      The variable z is defined in the main ex4.m file. So run the ex4.m file after running the sigmoidGradient.m file.

      Delete
  6. Hi Akshay, the link you sent is for week-2 and we are referring to week-5.

    ReplyDelete
    Replies
    1. In the link I have provided, go and check the comment by "Chethan Bhandarkar".
      She has provided the solution for a problem similar to yours.

      Delete
  7. Hi Akshay,

    I have a quick Q about the delta2 calculation in the 'nnCostFunction.m'

    delta2 = (Theta2' * delta3) .* [1; sigmoidGradient(z2)]; % (hidden_layer_size+1) x 1 == 26 x 1

    Why do you use a '1' --> .* [1; sigmoidGradient(z2)]; and just not multiply by the --> .* sigmoidGradient(z2) ?

    thank you in advance
    Bruno

    ReplyDelete
  8. I have entered the same code you mentioned for nnCostFunction and submitted it, but the backward propagation part did not get accepted.
    A1 = X; % 5000 x 401

    Z2 = A1 * Theta1'; % m x hidden_layer_size == 5000 x 25
    A2 = sigmoid(Z2); % m x hidden_layer_size == 5000 x 25
    A2 = [ones(size(A2,1),1), A2]; % Adding 1 as first column in z = (Adding bias unit) % m x (hidden_layer_size + 1) == 5000 x 26

    Z3 = A2 * Theta2'; % m x num_labels == 5000 x 10
    A3 = sigmoid(Z3); % m x num_labels == 5000 x 10

    % h_x = a3; % m x num_labels == 5000 x 10

    y_Vec = (1:num_labels)==y; % m x num_labels == 5000 x 10

    DELTA3 = A3 - y_Vec; % 5000 x 10
    DELTA2 = (DELTA3 * Theta2) .* [ones(size(Z2,1),1) sigmoidGradient(Z2)]; % 5000 x 26
    DELTA2 = DELTA2(:,2:end); % 5000 x 25 %Removing delta2 for bias node

    Theta1_grad = (1/m) * (DELTA2' * A1); % 25 x 401
    Theta2_grad = (1/m) * (DELTA3' * A2); % 10 x 26

    %%%%%%%%%%%% WORKING: DIRECT CALCULATION OF THETA GRADIENT WITH REGULARISATION %%%%%%%%%%%
    % %Regularization term is later added in Part 3
    % Theta1_grad = (1/m) * Theta1_grad + (lambda/m) * [zeros(size(Theta1, 1), 1) Theta1(:,2:end)]; % 25 x 401
    % Theta2_grad = (1/m) * Theta2_grad + (lambda/m) * [zeros(size(Theta2, 1), 1) Theta2(:,2:end)]; % 10 x 26
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


    %%%%%%%%%%%% Part 3: Adding Regularisation term in J and Theta_grad %%%%%%%%%%%%%
    reg_term = (lambda/(2*m)) * (sum(sum(Theta1(:,2:end).^2)) + sum(sum(Theta2(:,2:end).^2))); %scalar

    %Costfunction With regularization
    J = J + reg_term; %scalar

    %Calculating gradients for the regularization
    Theta1_grad_reg_term = (lambda/m) * [zeros(size(Theta1, 1), 1) Theta1(:,2:end)]; % 25 x 401
    Theta2_grad_reg_term = (lambda/m) * [zeros(size(Theta2, 1), 1) Theta2(:,2:end)]; % 10 x 26

    %Adding regularization term to earlier calculated Theta_grad
    Theta1_grad = Theta1_grad + Theta1_grad_reg_term;
    Theta2_grad = Theta2_grad + Theta2_grad_reg_term;

    ReplyDelete
    Replies
    1. My first concern is: "Why did you copy the code as it is?"
      Understand the code and do it on your own. The above code is just for reference.

      I have tested all my codes multiple times and they are 100% correct.

      Delete
    2. HI Akshay, thanks for your great work, could you please explain the below code,

      DELTA2 = (DELTA3 * Theta2) .* [ones(size(Z2,1),1) sigmoidGradient(Z2)]; % 5000 x 26

      Delete
  9. Hi Akshay,
    My approach to this problem is somewhat similar to yours (i.e. going from part 1 to part 3 in exact order). However, the relative difference I am getting is far too high. Don't you think we should first calculate the regularised cost function before implementing backpropagation? What difference will it make? Also, I have used a slightly different function to generate randomized initial weights, assuming it is random anyway. Can that cause any problem?

    ReplyDelete
  10. Hey,
    Why we are initialising epsilon as
    epsilon_init = sqrt(6)/(sqrt(L_in)+sqrt(L_out));
    Why by 6?

    ReplyDelete
  11. y_Vec = (1:num_labels)==y; ... what is this line of code doing? Please help me get a clear picture. Is it converting y to a vector?

    ReplyDelete
    Replies
    1. This line of code does exact same operation as below:
      y_Vec = zeros(m,num_labels);
      for i = 1:m
      y_Vec(i,y(i)) = 1;
      end

      Delete
    2. yeah thats what you have also commented in the code as well, but brother will you please elaborate, why it is done and also please provide a picture if you can, i tried it but i cannot visualize it

      Delete
    3. %%%%% WORKING %%%%%
      % y_Vec = zeros(m,num_labels);
      % for i = 1:m
      % y_Vec(i,y(i)) = 1;
      % end
      %%%%%%%%%%%%%%%%%%% OR

      y_Vec = (1:num_labels)==y;
      ----------------------------------------------------------
      % yVector = (1:num_labels)'==y(t);

      why it is in the code, what it is doing and how to visualize this, Brother?. Thanks From Gujarat?

      Delete
    4. Please read about vectorized implementation. A vectorized implementation is faster than a for-loop because it performs operations on all elements of the matrix / array simultaneously.

      About the above code, it is "%Converting y into vector of 0's and 1's for multi-class classification":
      1. Initializing y_Vec as a zero matrix.
      2. Then, as per the multi-class label, assigning 1 for the correct category.
      NOTE: each row in y_Vec corresponds to one input example (x);
      each column in y_Vec represents an output category (1 represents the correct category, 0 means incorrect).

      Delete
  12. Cheers from brazil,my friend.
    This was of great help!!

    ReplyDelete
  13. Hi Akshay,

    Thanks for the guide. I had a question regarding the following code:

    Theta1_grad_reg_term = (lambda/m) * [zeros(size(Theta1, 1), 1) Theta1(:,2:end)]; % 25 x 401
    Theta2_grad_reg_term = (lambda/m) * [zeros(size(Theta2, 1), 1) Theta2(:,2:end)]; % 10 x 26

    I'm not sure why the zeros(size(Theta1, 1), 1) is combined with the Theta1. I know from the lectures the formula to calculate the reg term is just:

    lambda/m * Thetaij ^ (L).

    Can you help me get a better understanding? I wasn't able to finish this part before the due date.

    ReplyDelete
    Replies
    1. As per the theory in the lectures, you don't apply regularization to theta_0. So, to exclude theta_0 from regularization, we have replaced it with 0, and the regularization effect will be nullified for theta_0 (since 0 multiplied by anything is 0).

      In other words, we have applied regularization to the 2nd entry through the last entry using Theta1(:,2:end), and the first entry, i.e. theta_0, is replaced by 0 using zeros(size(Theta1, 1), 1).

      Delete
  14. Hey, I'm getting this type of error in random.....m
    warning: function 'randInitializeWeights' defined within script file 'C:\Users\admin\Desktop\ML i
    mportant\machine-learning-ex4\ex4\randInitializeWeights.m'
    error: 'L_out' undefined near line 14 column 11
    error: called from
    randInitializeWeights at line 14 column 3

    ReplyDelete
  15. i know it doesn't matter much, but isn't it "sqrt(L_in+L_out)"
    not "sqrt(L_in)+sqrt(L_out)"

    ReplyDelete
  16. hi
    i am getting my answer as 0.0000 for everything be it forward propagation, backpropagation.. could you please tell where i can be wrong?

    ReplyDelete
    Replies
    1. Maybe you should check your randomInitializeWeights

      Delete
  17. Thank you so much, clearly explained each line of code. It's very helpful.

    ReplyDelete
  18. Thank you, its really helping me a lot how to figure out the code and steps.

    ReplyDelete
  19. Hey Akshay it's a great work

    ReplyDelete
  20. I’m getting 0.0000 for all the answers and I have checked for errors, there are none.

    ReplyDelete
  21. Hi! In
    [y_Vec = zeros(m,num_labels);
    for i = 1:m
    y_Vec(i,y(i)) = 1;
    end]
    and
    [y_Vec = (1:num_labels)==y; ]

    Is y_Vec a 5000x10 matrix?? Why so?? Or should it be 10x1??

    ReplyDelete
  22. Yes, y_Vec is 5000x10 i.e. m x num_labels

    y_Vec = (1:num_labels)==y; % m x num_labels == 5000 x 10

    ReplyDelete
  23. Just wanna say thank you! Seeing the differences between the iterative and vectorized approaches helped a lot!

    ReplyDelete
    Replies
    1. You are always welcome. Glad to know that my work helped you.

      Delete
  24. Hi.Thanks for your help.I had a question:
    I used this cost function (J = (1/m)*(-y'*log(h)-((1-y)')*log(1-h))) for my logistic regression assignment and it worked. I was wondering why can't I use the same thing here for neural network like (J = (1/m)* sum(sum((-y'*log(h)-((1-y)')*log(1-h)))) ) since the cost function formulas of both logistic regression and neural network look almost the same.

    ReplyDelete
  25. Here y and log(h) are two-dimensional, while in the case of logistic regression they were one-dimensional.

    ReplyDelete
  26. I'm getting the submission failed and the following error. Please help me!!
    unexpected error: Out of memory. The likely cause is an infinite recursion within the program.
    !! Please try again later.

    ReplyDelete
  27. In the regularization for Theta_grad, why is the zeros term necessary?

    ReplyDelete
  28. can u explain the step wise code used in each of the assignments

    ReplyDelete
    Replies
    1. I have already written comments for important lines of code.
      Please refer to those comments.

      Delete
  29. I highly appreciate your explanations within the code. Also, you have given the code in a very step-by-step manner, making sure everyone understands the code and the logic involved in it. GOOD JOB

    ReplyDelete
  30. Why do you consider h_x = a3?

    ReplyDelete
    Replies
    1. Because a3 is the output of the 3rd layer and also the final output of the neural network, which means a3 is the predicted output (which is also denoted by h_x).
      That's why h_x = a3.

      For more & better clarification, go through the respective theory video from the course.

      Delete
  31. sh: 1: curl: not found
    [error] submission with curl() was not successful

    !! Submission failed: Grader sent no response


    Function: submitWithConfiguration>validateResponse
    FileName: /home/bhargava/Downloads/ex4/lib/submitWithConfiguration.m
    LineNumber: 156

    Please correct your code and resubmit.
    I am getting this error. Can someone please help?

    ReplyDelete
  32. Could you explain
    DELTA2 = (DELTA3 * Theta2) .* [ones(size(Z2,1),1) sigmoidGradient(Z2)]; % 5000 x 26
    Because DELTA3 = 5000x10 and Theta2 = 10x26, how does this product work?

    ReplyDelete
  33. The codes and comments were very helpful Thank you.

    ReplyDelete
  34. Error in submitWithConfiguration (line 4)
    parts = parts(conf);

    Error in submit (line 35)
    submitWithConfiguration(conf);

    ReplyDelete
  35. Submission failed: unexpected error: Undefined function 'makeValidFieldName' for input arguments of type 'char'.

    I am getting this error after submitting.
    Can anyone help me out?

    ReplyDelete