signalDatastore of a large dataset for feedforward training

5 views (last 30 days)
Daniele on 19 Dec 2024
Answered: Gayathri on 23 Dec 2024
I'm trying to train a feedforward net on a very large amount of data (approx. 13k files with more than 3000 rows each). Since I can't fit all of the data into a single matrix for training, I tried to build a signalDatastore and pass it to the network, but I always get the same error:
'Error using trainNetwork (line 191)
Invalid training data. The output size (1) of the last layer does not match the response size (2201).
Error in NN_datastore_v2 (line 76)
net=trainNetwork(sdsTrain, layers,options);'
Where is the mistake? I suppose it's in the read function, maybe the output format; I tried several options but I can't seem to get the right combination (see the quick check sketched right after the code below). Please help.
Here's the full code:
clc
clear all;
Folders="********";
sds = signalDatastore(Folders, "IncludeSubfolders", true, "ReadFcn", @dataproc, 'FileExtensions', '.txt');
numFiles = numel(sds.Files);
rng('default'); % For reproducibility
fileIndices = randperm(numFiles);
trainRatio = 0.7;
valRatio = 0.15;
numTrain = floor(trainRatio * numFiles);
numVal = floor(valRatio * numFiles);
% Indices for each set
trainIdx = fileIndices(1:numTrain);
valIdx = fileIndices(numTrain+1:numTrain+numVal);
testIdx = fileIndices(numTrain+numVal+1:end);
% Create the sub-datastores
sdsTrain = subset(sds, trainIdx);
sdsVal = subset(sds, valIdx);
sdsTest = subset(sds, testIdx);
%%
layers = [
    featureInputLayer(429, "Normalization", "zscore")
    reluLayer
    ...
    fullyConnectedLayer(1)
    regressionLayer
    ];
options = trainingOptions('adam', ...
    'MaxEpochs', 1000, ...
    'MiniBatchSize', 64, ...
    'ValidationData', sdsVal, ...
    'OutputNetwork', 'best-validation', ...
    'Verbose', true);
%%
net = trainNetwork(sdsTrain, layers, options);
%%
function data = dataproc(filename)
    l_max = 2500;
    % opts = detectImportOptions(filename, 'Delimiter','\t');
    opts = delimitedTextImportOptions("NumVariables", 442);
    opts.Delimiter = "\t";
    fixedVariableNames = [******];
    dynamicVariableNames = "Gage" + string(1:429);
    opts.VariableNames = [fixedVariableNames, dynamicVariableNames];
    opts.VariableTypes = repmat("double", 1, 442);
    opts = setvaropts(opts, "DecimalSeparator", ",");
    tableData = readtable(filename, opts);
    dataNumeric = table2array(tableData);
    % Skip files that are too short
    if size(dataNumeric,1) < l_max
        data = {};
        return
    end
    % if size(dataNumeric,1) > l_max
    Fz = dataNumeric(300:l_max, strcmp(fixedVariableNames, 'FzN'));
    lambdas = dataNumeric(300:l_max, 14:end);
    [b_butter, a_butter] = butter(7, 0.03); % Low-pass filter
    window_size = 5; % Window for the median filter
    % Remove outliers, interpolate the gaps and filter the signals
    outlierIndices = isoutlier(lambdas, 'mean');
    lambdas(outlierIndices) = nan;
    lambdas = fillmissing(lambdas, 'linear');
    strain_filt = medfilt1(lambdas, window_size);
    filtered_force = filtfilt(b_butter, a_butter, Fz);
    % data.X = strain_filt;
    % data.Y = filtered_force;
    data = {strain_filt, filtered_force};
    % end
end
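A quick check that helps narrow down the size mismatch (a minimal sketch using the variables defined above, not part of the training script): preview one read from the training datastore and look at the sizes trainNetwork actually receives.
% Diagnostic sketch (assumes sdsTrain from the code above)
sample = preview(sdsTrain);   % same output as one call to dataproc
class(sample)                 % expected: 'cell'
size(sample)                  % expected: 1-by-2, i.e. {predictors, responses}
size(sample{1})               % predictors, e.g. 2201-by-429 (rows = samples)
size(sample{2})               % responses,  e.g. 2201-by-1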
2 Comments
Abhaya on 19 Dec 2024
Hi Daniele, could you please provide the data you're using to train the network?
Daniele on 19 Dec 2024
I'm sorry, but I can't; they are text files taken from experimentation. The "lambdas" are 2201x429 (samples x features) and the forces are 2201x1 (samples x 1) for each file.


Answers (1)

Gayathri on 23 Dec 2024
As per my understanding, each of your files has 2201 samples, but the network outputs only one value per observation because the number of neurons in the last "fullyConnectedLayer" is 1. Please replace that line of code with the following:
fullyConnectedLayer(2201)
This would most probably solve the issue you are facing. I have not implemented the code at my end, as I do not have access to the input data.
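For reference, here is a sketch of how the layer array from your script would look with that change (the hidden layers you elided with "..." are left as a placeholder):
layers = [
    featureInputLayer(429, "Normalization", "zscore")
    reluLayer
    ...                        % your hidden layers, unchanged
    fullyConnectedLayer(2201)  % matches the 2201 responses per file
    regressionLayer
    ];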
For more information about "fullyConnectedLayer", please refer to its documentation page.
Hope you find this information helpful!
