Training a CNN model with Numerical Data for Binary Classification

Question

Emmanuel on 10 Jul 2023

https://kr.mathworks.com/matlabcentral/answers/1994033-training-a-cnn-model-with-numerical-data-for-binary-classification

Edited: Emmanuel on 12 Jul 2023

I want to train a CNN model for binary classification on numeric datasets extracted from levels 1-4 and the approximation coefficient level of a Discrete Wavelet Transform decomposition. The data have been partitioned into training, validation and test sets, and stored as separate CSV files with corresponding labels.

Input Data reshaped as (4D double):

Train data size is : 5x30660

Test size: 5x6570

Validation size: 5x6570

While the responses (categorical) are:

XTrain label size: 5x30660,

XTest label size: 5x6570,

XValidation label size: 5x6570.

I used imageInputLayer as the input layer and convolution1dLayer, as seen in the code provided.

This is the error message from this code:

"Error in another (line 107)
net = trainNetwork(XTrain, categorical(YTrain), layers, options);

Caused by:
Layer 'Conv_1': Input data must have one spatial dimension only, one temporal dimension only, or one of each. Instead, it has 2 spatial dimensions and 0 temporal dimensions."

This is the code:

% Initializing empty arrays for data and labels
allTrainData = cell(1, 5);
allTrainLabels = cell(1, 5);
allValidationData = cell(1, 5);
allValidationLabels = cell(1, 5);
allTestData = cell(1, 5);
allTestLabels = cell(1, 5);

% Loading and concatenating the training datasets
trainDataFiles = ["activetrain.csv", "ambienttrain.csv", "generatedtrain.csv", "moduletrain.csv", "radiationtrain.csv"];
trainLabelFiles = ["labeltrainactive.csv", "labeltrainambient.csv", "labeltraingenerated.csv", "labeltrainmodule.csv", "labeltrainradiation.csv"];
for i = 1:5
    trainData = load(trainDataFiles(i));
    trainLabels = load(trainLabelFiles(i));
    % Extracting the numeric arrays from the structure arrays
    allTrainData{i} = trainData; % Store the numeric arrays directly
    allTrainLabels{i} = trainLabels;
end

% Loading and concatenating the validation datasets
validationDataFiles = ["activevalid.csv", "ambientvalid.csv", "generatedvalid.csv", "modulevalid.csv", "radiationvalid.csv"];
validationLabelFiles = ["labelvalidactive.csv", "labelvalidambient.csv", "labelvalidgenerated.csv", "labelvalidmodule.csv", "labelvalidradiation.csv"];
for i = 1:5
    validationData = load(validationDataFiles(i));
    validationLabels = load(validationLabelFiles(i));
    % Extracting the numeric arrays from the structure arrays
    allValidationData{i} = validationData; % Store the numeric arrays directly
    allValidationLabels{i} = validationLabels;
end

% Loading and concatenating the test datasets
testDataFiles = ["activetest.csv", "ambienttest.csv", "generatedtest.csv", "moduletest.csv", "radiationtest.csv"];
testLabelFiles = ["labeltestactive.csv", "labeltestambient.csv", "labeltestgenerated.csv", "labeltestmodule.csv", "labeltestradiation.csv"];
for i = 1:5
    testData = load(testDataFiles(i));
    testLabels = load(testLabelFiles(i));
    % Extract the numeric arrays from the structure arrays
    allTestData{i} = testData; % Store the numeric arrays directly
    allTestLabels{i} = testLabels;
end

% Reshaping the input data to 4D tensor: [height, width, channels, samples]
inputHeight = 1;
inputWidth = 5; % length of your input data
numChannels = 5; % 5 coefficient levels
numTrainSamples = size(allTrainData{i}, 2);
numTestSamples = size(allTestData{i}, 2);
numValidationSamples = size(allValidationData{i}, 2);
XTrain = reshape(allTrainData{i}, inputHeight, inputWidth, numChannels, numTrainSamples);
XTest = reshape(allTestData{i}, inputHeight, inputWidth, numChannels, numTestSamples);
XValidation = reshape(allValidationData{i}, inputHeight, inputWidth, numChannels, numValidationSamples);

% Normalizing the input data
XTrain = normalize(XTrain);
XTest = normalize(XTest);
XValidation = normalize(XValidation);

% Converting the labels to categorical format
YTrain = categorical(cell2mat(allTrainLabels));
YTest = categorical(cell2mat(allTestLabels));
YValidation = categorical(cell2mat(allValidationLabels));

% Defining the CNN architecture
layers = [
    imageInputLayer([1 5 5],"Name","Input","Normalization","zscore")
    convolution1dLayer(3,8,"Name","Conv_1","Padding","same")
    batchNormalizationLayer("Name","Bnorm")
    reluLayer("Name","relu_1")
    maxPooling1dLayer(2, "Padding", "same", "Stride", 2)
    convolution1dLayer(3,16,"Name","Conv_2","Padding","same")
    batchNormalizationLayer("Name","Bnorm_2")
    reluLayer("Name","relu_2")
    maxPooling1dLayer(2,"Padding","same","Stride",2)
    convolution1dLayer(3,32,"Name","Conv_3","Padding","same")
    batchNormalizationLayer("Name","Bnorn_3")
    reluLayer("Name","relu_3")
    maxPooling1dLayer(2,"Padding","same","Stride",2)
    convolution2dLayer(3,64,"Name","Conv_4","Padding","same")
    batchNormalizationLayer("Name","BNorm_4")
    reluLayer("Name","relu_4")
    fullyConnectedLayer(8,"Name","FC_1","WeightLearnRateFactor",0.01)
    reluLayer("Name","relu_5")
    fullyConnectedLayer(2,"Name","FC_2","WeightLearnRateFactor",0.01)
    softmaxLayer("Name","Softmax_layer")
    classificationLayer("Classes","auto")];

plot(layerGraph(layers));

% Setting the training options
options = trainingOptions('adam', ...
    'InitialLearnRate', 0.01, ...
    'MaxEpochs', 5, ...
    'MiniBatchSize', 32, ...
    'ValidationData', {XValidation, YValidation}, ...
    'ValidationFrequency', 10, ...
    'Verbose', true, ...
    'Plots', 'training-progress');

% Training the CNN model
net = trainNetwork(XTrain, categorical(YTrain), layers, options);

% Perform anomaly detection on the test dataset
YTestPred = classify(net, XTest);

% Evaluating the performance
accuracy = sum(YTestPred == YTest) / numel(YTest);
disp(['Test Accuracy: ', num2str(accuracy)]);

Kindly share your thoughts. Thanks.


Answer 1

ProblemSolver on 10 Jul 2023

https://kr.mathworks.com/matlabcentral/answers/1994033-training-a-cnn-model-with-numerical-data-for-binary-classification#answer_1270468

Edited: ProblemSolver on 10 Jul 2023

Hello @Emmanuel:

There are a couple of things that you have overlooked, and they are causing the errors:

Since you are working with numerical data only, I would suggest preallocating with "zeros" instead of "cell" to simplify the code:

allTrainData = zeros(5, 30660);
allTrainLabels = zeros(5, 30660);
allValidationData = zeros(5, 6570);
allValidationLabels = zeros(5, 6570);
allTestData = zeros(5, 6570);
allTestLabels = zeros(5, 6570);

You have not properly concatenated the loaded data into your variables "allTrainData" and "allTrainLabels". Instead of creating separate arrays for each dataset, you can combine them in a single loop such as:

trainDataFiles = ["activetrain.csv", "ambienttrain.csv", "generatedtrain.csv", "moduletrain.csv", "radiationtrain.csv"];
trainLabelFiles = ["labeltrainactive.csv", "labeltrainambient.csv", "labeltraingenerated.csv", "labeltrainmodule.csv", "labeltrainradiation.csv"];
for i = 1:5
    trainData = load(trainDataFiles(i));
    trainLabels = load(trainLabelFiles(i));
    
    allTrainData(i, :) = trainData(:)';
    allTrainLabels(i, :) = trainLabels(:)';
end
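
As a minimal sketch of the same row-by-row filling, assuming each CSV holds one purely numeric vector, the loop can check each file's element count before assigning it. The file names and the sample count below are hypothetical stand-ins written to a temporary folder so the snippet runs on its own, and readmatrix is used here in place of load:

```matlab
% Hypothetical stand-in files; replace with the real CSV names from the thread.
tmpDir = tempname;
mkdir(tmpDir);
names = ["a.csv", "b.csv", "c.csv"];   % hypothetical file names
numSamples = 8;                        % hypothetical number of values per file
for k = 1:numel(names)
    writematrix(rand(1, numSamples), fullfile(tmpDir, names(k)));
end

% Preallocate one row per file, then fill row by row.
allData = zeros(numel(names), numSamples);
for k = 1:numel(names)
    raw = readmatrix(fullfile(tmpDir, names(k)));   % numeric matrix
    % Fail early with a readable message if a file has an unexpected size.
    assert(numel(raw) == numSamples, ...
        "File %s has %d values, expected %d.", names(k), numel(raw), numSamples);
    allData(k, :) = raw(:).';
end
disp(size(allData))
```

Checking numel before the assignment turns a size mismatch into a message that names the offending file rather than a generic indexing error.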

The dimensions used when reshaping the input data are wrong:

inputHeight = 1;
inputWidth = 5; % length of your input data
numChannels = 5; % 5 coefficient levels
numTrainSamples = size(allTrainData, 2);
numTestSamples = size(allTestData, 2);
numValidationSamples = size(allValidationData, 2);
XTrain = reshape(allTrainData, inputHeight, inputWidth, numChannels, numTrainSamples);
XTest = reshape(allTestData, inputHeight, inputWidth, numChannels, numTestSamples);
XValidation = reshape(allValidationData, inputHeight, inputWidth, numChannels, numValidationSamples);

Your original code did account for normalizing the dataset. I am not sure whether it is required, but I use it for my own data sets:

XTrain = normalize(XTrain);
XTest = normalize(XTest);
XValidation = normalize(XValidation);
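
One point worth checking here: called without a dimension argument, normalize works along the first dimension whose size is not 1, and each of the three calls above uses its own statistics. A common alternative, sketched below on hypothetical random data laid out as 5 features by N observations, is to compute the mean and standard deviation on the training set only and reuse them for the other sets. Note also that an imageInputLayer with "Normalization","zscore" already standardizes its input, so a manual step like this may be redundant.

```matlab
rng(0);                                  % reproducible stand-in data
XTrain = 5 + 2*randn(5, 100);            % hypothetical: 5 features x 100 observations
XTest  = 5 + 2*randn(5, 20);             % hypothetical: 5 features x 20 observations

% Per-feature statistics from the training set only.
mu = mean(XTrain, 2);
sigma = std(XTrain, 0, 2);
sigma(sigma == 0) = 1;                   % guard against constant features

% Apply the same shift and scale to every set (implicit expansion).
XTrainN = (XTrain - mu) ./ sigma;
XTestN  = (XTest  - mu) ./ sigma;
```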

Finally, the error appears to be generated because the input shape of the 'imageInputLayer' is incorrect. Adjust it to match the reshaped input data, something like this:

layers = [
    imageInputLayer([inputHeight, inputWidth, numChannels], "Name", "Input", "Normalization", "zscore")
    % Rest of the layers...
];
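
The error text itself says that 'Conv_1' received 2 spatial dimensions, which is what an image input supplies to the layers after it. One possible way to keep the image input is to use 2-D layers with 1-by-k filters instead of the 1-D layers. The sketch below is an illustration only, not the original poster's network: it assumes each observation is a single row of 5 coefficient values with one channel and one label per observation, and it requires Deep Learning Toolbox.

```matlab
numFeatures = 5;   % assumption: 5 coefficient values per observation
numClasses  = 2;   % binary classification

layers = [
    imageInputLayer([1 numFeatures 1], "Name", "Input", "Normalization", "zscore")
    % 1-by-3 filters slide along the width only, mimicking a 1-D convolution.
    convolution2dLayer([1 3], 8, "Name", "Conv_1", "Padding", "same")
    batchNormalizationLayer("Name", "BNorm_1")
    reluLayer("Name", "relu_1")
    maxPooling2dLayer([1 2], "Name", "Pool_1", "Stride", [1 2], "Padding", "same")
    convolution2dLayer([1 3], 16, "Name", "Conv_2", "Padding", "same")
    batchNormalizationLayer("Name", "BNorm_2")
    reluLayer("Name", "relu_2")
    fullyConnectedLayer(numClasses, "Name", "FC")
    softmaxLayer("Name", "Softmax")
    classificationLayer("Name", "Output")];
```

Under the same assumption, a 5-by-N data matrix would be passed as reshape(X, 1, numFeatures, 1, []) together with N labels. Running analyzeNetwork(layers) before training reports dimension mismatches without starting a training run.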

I hope these suggestions help you to solve the error.


Emmanuel on 11 Jul 2023

Edited: Emmanuel on 11 Jul 2023

@ProblemSolver, Thank you for your detailed response and input. It is highly appreciated.

However, I got the error below from this reshaping:

XTrain = reshape(allTrainData, inputHeight, inputWidth, numChannels, numTrainSamples);
XTest = reshape(allTestData, inputHeight, inputWidth, numChannels, numTestSamples);
XValidation = reshape(allValidationData, inputHeight, inputWidth, numChannels, numValidationSamples);

Error using reshape

Number of elements must not change. Use [ ] as one of the size inputs to automatically calculate the appropriate size for that dimension.

Error in (line 60)

XTrain = reshape(allTrainData, inputHeight, inputWidth, numChannels, numTrainSamples)

When I tried the following instead, the earlier error came back:

XTrain = reshape(allTrainData, inputHeight, inputWidth, numChannels, []);
XTest = reshape(allTestData, inputHeight, inputWidth, numChannels, []);
XValidation = reshape(allValidationData, inputHeight, inputWidth, numChannels, []);

Error in (line 117)

net = trainNetwork(XTrain, categorical(YTrain), layers, options);

Caused by:

Layer 'Conv_1': Input data must have one spatial dimension only, one temporal dimension only, or one of each. Instead, it has 2 spatial dimensions and 0 temporal dimensions.
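
The reshape error follows from element counts alone. The sketch below assumes only the 5x30660 size stated in the question: the requested 1x5x5x30660 shape needs five times more elements than the matrix holds, and using [] makes reshape infer 6132 observations instead of 30660. The last line shows one layout that keeps all 30660 columns as observations, on the assumption that each column is one observation of 5 values.

```matlab
allTrainData = zeros(5, 30660);              % size stated in the question

available = numel(allTrainData);             % 5 * 30660 = 153300
requested = 1 * 5 * 5 * 30660;               % 766500, so reshape must fail
fprintf("available = %d, requested = %d\n", available, requested);

% With [] the last dimension is inferred: 153300 / (1*5*5) = 6132.
X = reshape(allTrainData, 1, 5, 5, []);
fprintf("inferred observations = %d\n", size(X, 4));

% Assumption: each column is one observation of 5 values and 1 channel.
XAlt = reshape(allTrainData, 1, 5, 1, []);
fprintf("observations with one channel = %d\n", size(XAlt, 4));
```

The second error is separate from the reshape: it names 'Conv_1' and the number of spatial dimensions, so changing the sample count alone would not be expected to remove it.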

ProblemSolver on 11 Jul 2023

@Emmanuel: I need to check your .csv files. Please send me some sample data, including the structure of the tables, so that I know what I am dealing with.

Emmanuel on 11 Jul 2023

Edited: Emmanuel on 12 Jul 2023

@ProblemSolver, I was able to find a way around it using this:

% Reshaping the input data to 4D tensor: [height, width, channels, samples]
inputHeight = 1;
inputWidth = 1;
numChannels = 1;
numTrainSamples = size(allTrainData, 2);
numTestSamples = size(allTestData, 2);
numValidationSamples = size(allValidationData, 2);
XTrain = reshape(allTrainData, inputHeight, inputWidth, numChannels, []);
XValidation = reshape(allValidationData, inputHeight, inputWidth, numChannels, []);

% Normalize Input Data.......

% Convert response to categorical vector........

% Define the CNN architecture
layers = [
    imageInputLayer([inputHeight, inputWidth, numChannels], "Name", "Input", "Normalization", "zscore")

Thank you very much for your initial response.
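
A note on what this workaround implies, as a sketch on hypothetical stand-in data with the sizes from the question: reshaping to 1x1x1x[] turns every single value into its own observation, so the network sees 153300 scalar inputs rather than 30660 observations of 5 values. Whether that is the intended model depends on the data. The check below confirms that the observation count and the label count agree, and that flattening the labels with (:) follows the same column-major order as reshape.

```matlab
rng(0);
allTrainData   = rand(5, 30660);             % stand-in for the loaded data
allTrainLabels = randi([0 1], 5, 30660);     % stand-in 0/1 labels, same size

XTrain = reshape(allTrainData, 1, 1, 1, []); % one scalar per observation
YTrain = categorical(allTrainLabels(:));     % same column-major order as reshape

numObs = size(XTrain, 4);
assert(numObs == numel(YTrain), "Observation and label counts differ.");
fprintf("observations = %d, labels = %d\n", numObs, numel(YTrain));
```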
