How to partition data in cells for validation in machine learning model?

Isabelle Museck, August 1, 2024
Commented: Isabelle Museck, August 13, 2024
Hello there, I have training data for 4 trials stored in 4x1 cell arrays named "trainingdataX" and "trainingdataY", as shown here, and I am trying to pull out 15 percent of all this data for validation purposes and store it in the variables "Xval" and "Yval". How can I do this when the data is stored in cells corresponding to the trials, and how do I ensure that the corresponding Y value is partitioned out for validation along with each X value? Any help is greatly appreciated!
%Exclude Data for Val
rng('default')
n = % I'm not sure what to put here to have it pull data from each of the 4 trials
partition = cvpartition(n,'Holdout',0.15);
idxTrain = training(partition);
FinalTrainX = trainingdataX(idxTrain,:)
FinalTrainY = trainingdataY(idxTrain,:)
idxNew = test(partition);
Xval = trainingdataX(idxNew,:)
Yval = trainingdataY(idxNew,:)

Answers (2)

YERRAMADAS, August 1, 2024
Use the cross-validation method to maximize the data available for each of these sets.
  Comments: 1
Isabelle Museck, August 13, 2024
Could you possibly expand on how to use the cross-validation method in MATLAB?
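As a rough illustration of what k-fold cross-validation could look like for this data, here is a minimal sketch. It assumes the cells have already been oriented with one observation per row, uses synthetic data in place of the real trials, picks k = 5 arbitrarily, and fits an ordinary least-squares model purely as a placeholder for whatever model is actually being trained. `cvpartition` requires Statistics and Machine Learning Toolbox.

```matlab
% Synthetic stand-in for the real data (assumption): 4 trials, 541 rows each
rng('default');                        % for reproducibility
trainingdataX = cell(4, 1);
trainingdataY = cell(4, 1);
for i = 1:4
    trainingdataX{i} = rand(541, 63);
    trainingdataY{i} = rand(541, 1);
end
allX = vertcat(trainingdataX{:});      % stack all trials row-wise
allY = vertcat(trainingdataY{:});

k = 5;                                 % number of folds (arbitrary choice)
cvp = cvpartition(size(allX, 1), 'KFold', k);
foldMSE = zeros(k, 1);
for f = 1:k
    idxTrain = training(cvp, f);       % logical index of training rows for fold f
    idxVal   = test(cvp, f);           % logical index of validation rows for fold f
    % Placeholder model (assumption): least squares with an intercept term
    beta = [ones(nnz(idxTrain), 1), allX(idxTrain, :)] \ allY(idxTrain);
    pred = [ones(nnz(idxVal), 1), allX(idxVal, :)] * beta;
    foldMSE(f) = mean((allY(idxVal) - pred).^2);
end
fprintf('Mean validation MSE over %d folds: %.4f\n', k, mean(foldMSE));
```

With k folds, every observation is used for validation exactly once and for training k-1 times, which is the sense in which cross-validation makes fuller use of the data than a single 15% holdout.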


Aditya, August 1, 2024
To partition data stored in cells for validation, first concatenate the data from all trials into single matrices. You can then partition the rows of those matrices into the training and validation sets.
Before moving forward, you need to transpose your X and Y data so that each row of X corresponds to the matching row of Y (one observation per row).
Here's some sample code for this:
% sample data
trainingdataX = cell(4, 1);
trainingdataY = cell(4, 1);
for i = 1:4
    trainingdataX{i} = rand(541, 63);
    trainingdataY{i} = rand(541, 1);
end
% Concatenate data
allX = vertcat(trainingdataX{:});
allY = vertcat(trainingdataY{:});
% Partition data (15% holdout for validation)
rng('default'); % For reproducibility
partition = cvpartition(size(allX, 1), 'Holdout', 0.15);
idxTrain = training(partition);
idxVal = test(partition);
% Split into training and validation sets
FinalTrainX = allX(idxTrain, :);
FinalTrainY = allY(idxTrain, :);
Xval = allX(idxVal, :);
Yval = allY(idxVal, :);
% Display results
fprintf('Training data X size: %dx%d\n', size(FinalTrainX, 1), size(FinalTrainX, 2));
fprintf('Training data Y size: %dx%d\n', size(FinalTrainY, 1), size(FinalTrainY, 2));
fprintf('Validation data X size: %dx%d\n', size(Xval, 1), size(Xval, 2));
fprintf('Validation data Y size: %dx%d\n', size(Yval, 1), size(Yval, 2));
I hope this helps!
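If the goal is for each of the 4 trials to contribute its own 15% to the validation set, one possible variation is to partition inside a loop over the cells and concatenate afterwards. This is only a sketch on synthetic data; it assumes each cell already has one observation per row, and the helper names `trainX`, `trainY`, `valX`, and `valY` are made up for illustration.

```matlab
% Synthetic stand-in for the real data (assumption): 4 trials, 541 rows each
rng('default');                             % for reproducibility
trainingdataX = cell(4, 1);
trainingdataY = cell(4, 1);
for i = 1:4
    trainingdataX{i} = rand(541, 63);
    trainingdataY{i} = rand(541, 1);
end

nTrials = numel(trainingdataX);
trainX = cell(nTrials, 1);  trainY = cell(nTrials, 1);
valX   = cell(nTrials, 1);  valY   = cell(nTrials, 1);
for i = 1:nTrials
    n = size(trainingdataX{i}, 1);          % observations in this trial
    p = cvpartition(n, 'Holdout', 0.15);    % separate 15% holdout per trial
    idxTrain = training(p);
    idxVal   = test(p);
    % The same logical index is applied to X and Y, so pairs stay aligned
    trainX{i} = trainingdataX{i}(idxTrain, :);
    trainY{i} = trainingdataY{i}(idxTrain, :);
    valX{i}   = trainingdataX{i}(idxVal, :);
    valY{i}   = trainingdataY{i}(idxVal, :);
end

FinalTrainX = vertcat(trainX{:});
FinalTrainY = vertcat(trainY{:});
Xval = vertcat(valX{:});
Yval = vertcat(valY{:});
```

Compared with partitioning the concatenated matrix, this guarantees that every trial is represented in the validation set in the same proportion.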
  Comments: 2
Isabelle Museck, August 1, 2024
Hello Aditya,
Thank you so much for your help. This makes a lot of sense; however, when I try to integrate the code with my data, it is not vertically concatenating the data in the cells properly. I am ending up with "allX" being a 252x541 double and "allY" being a 4x541 double, as shown here:
When I run the code you provided, I should be getting a 2164x63 double for "allX" and a 2164x1 double for "allY". Do you know why it may not be concatenating correctly for me and my data?
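One way to see where sizes like 252x541 and 4x541 could come from is to inspect the size of each cell before concatenating. The sketch below uses made-up data and assumes each X cell is 63x541 and each Y cell is 1x541, which would be consistent with the reported result, since stacking four 63-row matrices gives 4 x 63 = 252 rows.

```matlab
% Hypothetical data with features along rows and samples along columns
trainingdataX = cell(4, 1);
trainingdataY = cell(4, 1);
for i = 1:4
    trainingdataX{i} = rand(63, 541);
    trainingdataY{i} = rand(1, 541);
end

% List the size of every cell, one row per trial
sizesX = cellfun(@size, trainingdataX, 'UniformOutput', false);
disp(vertcat(sizesX{:}));

% vertcat stacks rows, so the row counts add up and the column count is kept
stackedX = vertcat(trainingdataX{:});
stackedY = vertcat(trainingdataY{:});
disp(size(stackedX));   % 252 541
disp(size(stackedY));   % 4 541
```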
Aditya, August 1, 2024
Edited: Aditya, August 1, 2024
As mentioned in my answer, your initial data has the shape 63x541 (X) and 1x541 (Y) in each cell, which is the wrong orientation for vertical concatenation. You need to transpose each cell before concatenating.
To transpose the cells, you can use the lines of code below:
% Transpose each cell using cellfun
trainingdataX = cellfun(@transpose, trainingdataX, 'UniformOutput', false);
trainingdataY = cellfun(@transpose, trainingdataY, 'UniformOutput', false);
Alternatively, you can do it manually using a for loop.
Hope this clarifies your doubt!
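The manual for-loop alternative mentioned above might look like the following sketch. The data here is synthetic and assumes the 63x541 and 1x541 cell shapes described in this thread.

```matlab
% Hypothetical data in the original orientation (features x samples)
trainingdataX = cell(4, 1);
trainingdataY = cell(4, 1);
for i = 1:4
    trainingdataX{i} = rand(63, 541);
    trainingdataY{i} = rand(1, 541);
end

% Transpose each cell in place so that every row is one observation
for i = 1:numel(trainingdataX)
    trainingdataX{i} = trainingdataX{i}.';   % 63x541 -> 541x63
    trainingdataY{i} = trainingdataY{i}.';   % 1x541  -> 541x1
end

% Vertical concatenation now stacks observations from all trials
allX = vertcat(trainingdataX{:});
allY = vertcat(trainingdataY{:});
disp(size(allX));   % 2164 63
disp(size(allY));   % 2164 1
```

The `.'` operator is the non-conjugate transpose, which is what `@transpose` calls in the `cellfun` version.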
