How to change activation function for fully connected layer in convolutional neural network?

I'm in the process of implementing a wavelet neural network (WNN) using the SeriesNetwork class of the Neural Network Toolbox (v7). While stepping through a simple network line by line, I can clearly see where the fully connected layer multiplies the inputs by the appropriate weights and adds the bias; however, as best I can tell, no additional calculations are performed for the activations of the fully connected layer. It was my general understanding that standard perceptrons always have an activation/transfer function, and I was fully expecting to see the familiar sigmoid. However, it appears that the fully connected layer, as implemented here, assumes the identity operation as the transfer function (or, equivalently, no transfer function at all).
1) Do fully connected layers use an activation function, or are the outputs simply the weighted sums of the inputs plus the bias? My initial assumption is that they do not, since I see activations greater than +1 (see the example code at the bottom).
2) If an activation function is used, does anyone have suggestions on where I might find and/or alter the source? I have examined the FullyConnected class and definition files and the FullyConnectedGPU(Host)Strategy classes, the latter of which contain the actual multiplication by the weights and addition of the bias.
3) If I want to use a custom activation function (in this case a wavelet), is it safe for me to simply apply said transfer function following the weighting and addition of bias? For example, if I wanted to modify a FullyConnectedLayer to have a tanh activation function, for the forward pass could I simply alter the forward method as follows? (obviously changes to the backward pass and gradient determination would also be required for the full implementation):
classdef FullyConnectedGPUStrategy < nnet.internal.cnn.layer.util.ExecutionStrategy
...
    function [Z, memory] = forward(~, X, weights, bias)
        Z = iForwardConvolveOrMultiply(X, weights); %weighted sum of inputs
        Z = Z + bias;                               %add bias
        Z = tanh(Z);                                %addition of activation function
        memory = [];
    end
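On the backward-pass point, here is only a sketch of the extra chain-rule step (the names below are illustrative; the actual internal backward signature differs between releases): if the forward pass becomes Z = tanh(W*X + b), the gradient arriving from the next layer has to be scaled by the tanh derivative before the existing weight, bias, and input gradient code is applied:
dLdPre = dLdZ .* (1 - Z.^2); %derivative of tanh, using the cached output Z
%dLdPre then takes the place of dLdZ in the original (identity-activation)
%backward computations for the weights, bias, and inputs.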
Example code to illustrate the problem:
%Generate training data
[XTrain, YTrain] = digitTrain4DArrayData;
%Define layers
layers = [ ...
imageInputLayer([28 28 1])
fullyConnectedLayer(10)
softmaxLayer()
classificationLayer()];
%Train network using stochastic gradient descent with momentum
options = trainingOptions('sgdm');
net = trainNetwork(XTrain, YTrain, layers, options);
%View activations of fully connected layer
%Note: When testing this I see activations greater than +1 and
%less than 0, so it can't be using tanh or sigmoid
activations(net,XTrain(:,:,:,1),2)
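One way to confirm this directly (a sketch, assuming the R2017-era API, where imageInputLayer stores its zero-center mean as the AverageImage property) is to compare the reported activations against a manual Weights*x + Bias computation:
%Sanity check (sketch): undo the input layer's zero-center normalization,
%then apply the fully connected weights and bias by hand
x = double(XTrain(:,:,:,1)) - net.Layers(1).AverageImage;
manual = net.Layers(2).Weights * x(:) + net.Layers(2).Bias;
fcOut = activations(net, XTrain(:,:,:,1), 2);
max(abs(fcOut(:) - double(manual(:)))) %should be near zero if no activation is applied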
Note: The reason I chose the SeriesNetwork class used for CNNs, as opposed to the generic neural network classes, is that the output of the WNN will need to act as the input to a CNN, and the two will then be trained together as one unit.
1 Comment
Greg Heath on 18 Aug 2018
Before reading your question, let me state:
I am an engineer, not a mathematician, so my statements below may not be as precise as some would like. However, I believe it will be perfectly clear what I am stating:
The STANDARD UNIVERSAL APPROXIMATOR single hidden layer regression net has
1. A nonlinear hidden layer transfer function
2. A LINEAR output layer transfer function
I'm stating this because it is obvious that some believe that, for a universal approximator, the standard output transfer function has to be nonlinear.
Of course there are additional conditions on finiteness, etc., which I have omitted, but I think I have made my point.
Hope this Helps,
Greg
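In equation form (my notation, not something from this thread), the standard net Greg describes is y = b2 + LW*tanh(IW*x + b1): a nonlinear hidden transfer function such as tanh feeding a purely linear output layer.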


Accepted Answer

Joss Knight on 28 Jun 2017
Activations are added as a separate layer, and in R2017a there is only the ReLU layer (see reluLayer).
Custom layers have not been introduced yet, so you'd have to be hacking or masking the toolbox files, but that's fine. You could take a copy of the ReLULayer classes and modify them, or just edit your MATLAB install directly if you think that's safe.
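Using the layer array from the question, that looks like the following (a sketch of the approach, with reluLayer standing in for the eventual custom wavelet activation):
layers = [ ...
imageInputLayer([28 28 1])
fullyConnectedLayer(10)
reluLayer() %activation applied as its own layer after the fully connected layer
softmaxLayer()
classificationLayer()];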
10 Comments
Maxime Bezanilla on 14 Mar 2019
@wenyi Thank you for your work. However, it has a few mistakes in it.
I know the topic is old, but I am sure this can help some people, so I am posting my code for the sigmoid layer, based on the versions from @wenyi and @Balakrishnan_Rajan. There was also a mistake with the "~".
classdef sigmoidLayer < nnet.layer.Layer
    methods
        function layer = sigmoidLayer(name)
            % Set layer name
            if nargin == 1
                layer.Name = name;
            end
            % Set layer description
            layer.Description = 'sigmoidLayer';
        end
        function Z = predict(layer, X)
            % Forward input data through the layer and output the result
            Z = exp(X)./(exp(X)+1);
        end
        function dLdX = backward(layer, X, Z, dLdZ, memory)
            % Backward propagate the derivative of the loss function
            % through the layer
            dLdX = Z.*(1-Z) .* dLdZ;
        end
    end
end
This is accepted by checkLayer.
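For reference, a quick way to exercise the layer (a sketch; the layer name 'sig1' and the input size [1 1 10], matching a 10-unit fully connected output, are just examples):
checkLayer(sigmoidLayer('sig1'), [1 1 10]) %run the built-in layer validity checks
layers = [ ...
imageInputLayer([28 28 1])
fullyConnectedLayer(10)
sigmoidLayer('sig1')
softmaxLayer()
classificationLayer()];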


More Answers (0)
