How can I do an 80-20 split on a dataset to obtain training and test datasets?

I tried [training, test] = partition(faceDatabase, [0.8, 0.2]); but it gives me an error. Can anyone help? Is there a way to do this manually? I can't find a function for this.
  Comments: 2
Chidiebere Ike on 15 Mar 2018
OK. Thanks for your response. I will give it a try. But can this be achieved with a for loop?


Accepted Answer

KSSV on 15 Mar 2018
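Let P and T be your input and target sets, with one observation per row.
PD = 0.80 ; % fraction of the data used for training (80%)
nTrain = round(PD*length(T)) ; % number of training observations
Ptrain = P(1:nTrain,:) ; Ttrain = T(1:nTrain) ;
Ptest = P(nTrain+1:end,:) ; Ttest = T(nTrain+1:end) ; % start at nTrain+1 so the two sets do not overlap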
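For anyone unsure what P and T stand for, here is a minimal sketch with made-up data. It assumes P is an N-by-M matrix with one observation per row and T is an N-by-1 vector with one label per observation. The toy values and the name nTrain are illustrative only and do not come from the question.

```matlab
% Toy data (hypothetical): 10 observations with 3 features each
P = reshape(1:30, 10, 3);          % inputs, one observation per row
T = (1:10)';                       % targets, one label per observation

PD = 0.80;                         % fraction of rows used for training
nTrain = round(PD * size(P, 1));   % number of training rows

% The first nTrain rows go to training, the remaining rows to test
Ptrain = P(1:nTrain, :);      Ttrain = T(1:nTrain);
Ptest  = P(nTrain+1:end, :);  Ttest  = T(nTrain+1:end);

disp(size(Ptrain));                % size of the training inputs
disp(size(Ptest));                 % size of the test inputs
```

This takes rows in their stored order. If the data is sorted by class, the test set may end up containing only the last classes, so a random split is usually the safer choice.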
  Comments: 2
Chidiebere Ike on 15 Mar 2018
Edited: Chidiebere Ike on 15 Mar 2018
I tried the code, and it says "undefined function or variable T". I would appreciate it if you could describe what P, T, and length are. How do I resolve this?
Prasobhkumar P. P. on 7 Nov 2020
P and T correspond to the inputs and their labels (or categories), respectively.


More Answers (2)

Akira Agata on 15 Mar 2018
Edited: Akira Agata on 15 Mar 2018
If you want to randomly select 80% of your data as the training dataset, please try the following:
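PD = 0.80 ; % fraction of the data used for training (80%)
% Let P be your N-by-M input dataset
% Solution-1 (needs Statistics and Machine Learning Toolbox)
cv = cvpartition(size(P,1),'HoldOut',1-PD); % 'HoldOut' takes the test fraction, so pass 1-PD
Ptrain = P(training(cv),:);
Ptest = P(test(cv),:);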
Another possible solution:
% Solution-2 (using basic MATLAB function)
N = size(P,1);
idx = randperm(N);
Ptrain = P(idx(1:round(N*PD)),:);
Ptest = P(idx(round(N*PD)+1:end),:);
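If there is also a label vector, the same shuffled index has to be applied to both arrays so that each row stays paired with its label. The following is a sketch under assumptions: T is taken to be an N-by-1 label vector, and the toy data, the rng seed, and the names nTrain, trainIdx, and testIdx are illustrative, not part of the original answer.

```matlab
rng(0);                        % fixed seed so the split is reproducible (assumed choice)

% Toy data (hypothetical): 10 observations with 3 features each
P = rand(10, 3);               % inputs, one observation per row
T = (1:10)';                   % labels, one per observation

PD = 0.80;                     % fraction of rows used for training
N = size(P, 1);
idx = randperm(N);             % random ordering of the row indices
nTrain = round(N * PD);        % number of training rows

trainIdx = idx(1:nTrain);      % shuffled indices assigned to training
testIdx  = idx(nTrain+1:end);  % remaining indices assigned to test

% Index inputs and labels with the same index vectors
Ptrain = P(trainIdx, :);   Ttrain = T(trainIdx);
Ptest  = P(testIdx, :);    Ttest  = T(testIdx);
```

Calling randperm once and reusing idx is the important part. Shuffling P and T with two separate permutations would break the pairing between rows and labels.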
  Comments: 1
Chidiebere Ike on 15 Mar 2018
Solution 1 gives an error message: Error in cvpartition CV.Impl = internal.stats.cvpartitionInMemoryImpl(varargin{:});



Munshida P on 14 Jan 2020
This will help you.
[training,test] = partition(faceDatabase,[0.8 0.2]);
