How to randomly select data out of a dataset?

Ines on 29 May 2012
Hey there, I want to randomly select 80% of my data to create a training dataset and use the remaining 20% to evaluate the model obtained from the training dataset. How can I best perform this split in MATLAB? (Actually, I want to perform this split multiple times within a loop so that I can deliver a more robust result.)

Answers (3)

Wayne King on 29 May 2012
One way, if you have the Statistics Toolbox, is to use randsample:
x = randn(1000,1);
idx = randsample(length(x),800); % 800 distinct random indices into x
y = x(idx);
Another way, if you don't have the Statistics Toolbox, is to use randperm:
R = randperm(length(x));
indices = R(1:800);
y = x(indices);
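The randperm approach might be extended to produce both the training and the test set, repeated inside a loop as the question asks. This is only a sketch: the example data, the variable names (nTrain, nRepeats, trainIdx, testIdx) and the number of repetitions are assumptions for illustration.

```matlab
% Hypothetical example data; replace x with your own observations.
x = randn(1000,1);
n = length(x);
nTrain = round(0.8*n);      % 80% of the observations go to training
nRepeats = 10;              % number of random splits (an arbitrary choice)

for k = 1:nRepeats
    R = randperm(n);                 % random ordering of the indices 1..n
    trainIdx = R(1:nTrain);          % first 80% of the shuffled indices
    testIdx  = R(nTrain+1:end);      % remaining 20%
    xTrain = x(trainIdx);
    xTest  = x(testIdx);
    % Fit the model on xTrain and evaluate it on xTest here.
end
```

If the data is a matrix with one observation per row, index the rows instead, for example x(trainIdx,:).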
  Comments: 1
Peter Perkins on 29 May 2012
In newer releases of the Statistics Toolbox, you can (and should) use datasample rather than randsample. It does some things better and is perhaps a little easier to use.
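A minimal sketch of how datasample could be used for the 80/20 split, assuming a release of the Statistics Toolbox that includes it. The example data and the variable names are made up for illustration.

```matlab
x = randn(1000,1);                  % hypothetical example data
n = size(x,1);
nTrain = round(0.8*n);              % 80% of the observations go to training

% 'Replace',false samples without replacement, so no observation is drawn twice.
% The second output holds the indices of the sampled observations.
[xTrain, trainIdx] = datasample(x, nTrain, 'Replace', false);

testIdx = setdiff((1:n)', trainIdx(:));   % indices that were not drawn
xTest = x(testIdx);
```

Unlike randsample(n,k), datasample returns the sampled data directly and hands back the indices as a second output, which is what makes recovering the test set straightforward.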



Thomas on 29 May 2012
I don't know if this will help.
Suppose your data is in a:
a=rand(10,1); % generate random data
trainingset = intersect(a,randsample(a,8)) % training set of 8 random samples from a (returned sorted); change 8 to set the size of your training set
testset=a(~ismember(a,trainingset)) % test set: the elements of a that are not in the training set
% Note: matching by value like this assumes the values in a are all distinct.

Peter Perkins on 29 May 2012
Another possibility if you have the Statistics Toolbox is to use cvpartition. There are various ways to use it, from the simplest kind of "hold out" scheme that you describe, to more complicated k-fold cross-validation.
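As a rough sketch of the hold-out scheme with cvpartition, repeated in a loop: the example data, the variable names and the number of repetitions below are assumptions, not part of the original answer.

```matlab
x = randn(1000,1);                 % hypothetical example data
n = size(x,1);
c = cvpartition(n,'HoldOut',0.2);  % hold out about 20% of the observations for testing
nRepeats = 10;                     % arbitrary number of repeated splits

for k = 1:nRepeats
    trainIdx = training(c);        % logical vector, true for training observations
    testIdx  = test(c);            % logical vector, true for test observations
    xTrain = x(trainIdx);
    xTest  = x(testIdx);
    % Fit the model on xTrain and evaluate it on xTest here.
    c = repartition(c);            % draw a new random split of the same type
end
```

For k-fold cross-validation, cvpartition(n,'KFold',k) could be used instead, with training(c,i) and test(c,i) selecting the i-th fold.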
