crossvalind
Generate indices for training and test sets
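Syntax
cvIndices = crossvalind(cvMethod,N,M)
[train,test] = crossvalind(cvMethod,N,M)
___ = crossvalind(___,Name,Value)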
Description
___ = crossvalind(___,Name,Value) specifies additional options using one or more name-value pair arguments in addition to the arguments in previous syntaxes. For example, cvIndices = crossvalind('HoldOut',Groups,0.2,'Classes',{'Cancer','Control'}) specifies to use observations from the 'Cancer' and 'Control' groups to generate indices that represent 20% of observations as the holdout set and 80% as the training set.
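As a rough sketch of this syntax, the following uses a hypothetical label vector Groups with a third 'Other' class; the labels and group sizes are assumptions made for illustration and are not part of the original example. Requesting both outputs makes it easier to tell held-out observations from excluded ones, because excluded rows are 0 in both train and test.

```matlab
% Hypothetical labels (illustration only): 50 'Cancer', 30 'Control', 20 'Other'.
Groups = [repmat({'Cancer'},50,1); repmat({'Control'},30,1); repmat({'Other'},20,1)];

rng(0);  % seed the random number generator so the split is repeatable

% Hold out about 20% of the 'Cancer' and 'Control' observations; ignore 'Other'.
[train,test] = crossvalind('HoldOut',Groups,0.2,'Classes',{'Cancer','Control'});

% Rows of the excluded class are 0 in both outputs.
isOther = strcmp(Groups,'Other');
fprintf('Training: %d, test: %d, excluded: %d\n', ...
    sum(train),sum(test),sum(~train(isOther) & ~test(isOther)));
```

The seed passed to rng is arbitrary; any fixed value makes the random selection repeatable between runs.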
Examples
Create indices for the 10-fold cross-validation and classify measurement data for the Fisher iris data set. The Fisher iris data set contains width and length measurements of petals and sepals from three species of irises.
Load the data set.
load fisheriris
Create indices for the 10-fold cross-validation.
indices = crossvalind('Kfold',species,10);
Initialize an object to measure the performance of the classifier.
cp = classperf(species);
Perform the classification using the measurement data and report the error rate, which is the number of incorrectly classified samples divided by the total number of classified samples.
for i = 1:10
    test = (indices == i);
    train = ~test;
    class = classify(meas(test,:),meas(train,:),species(train,:));
    classperf(cp,class,test);
end
cp.ErrorRate
ans = 0.0200
Suppose you want to use the observation data from the setosa and virginica species only and exclude the versicolor species from cross-validation.
labels = {'setosa','virginica'};
indices = crossvalind('Kfold',species,10,'Classes',labels);
indices now contains zeros for the rows that belong to the versicolor species.
Perform the classification again.
for i = 1:10
    test = (indices == i);
    train = ~test;
    class = classify(meas(test,:),meas(train,:),species(train,:));
    classperf(cp,class,test);
end
cp.ErrorRate
ans = 0.0160
Load the carbig data set.
load carbig;
x = Displacement; 
y = Acceleration;
N = length(x);
Train a second-degree polynomial model with leave-one-out cross-validation, and evaluate the averaged cross-validation error. On each call, the function randomly selects one observation to hold out for the evaluation set. Calling it within a loop does not guarantee disjoint evaluation sets, so you may see a different CVerr for each run.
sse = 0; % Initialize the sum of squared error.
for i = 1:100
    [train,test] = crossvalind('LeaveMOut',N,1);
    yhat = polyval(polyfit(x(train),y(train),2),x(test));
    sse = sse + sum((yhat - y(test)).^2);
end
CVerr = sse / 100;
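If every observation should be held out exactly once, one possible alternative is to request as many folds as there are observations with the 'Kfold' method, so that the evaluation sets are disjoint by construction. The sketch below uses synthetic data in place of carbig; the data, the seed, and the variable names are assumptions made for illustration.

```matlab
rng(4);                                        % repeatable synthetic data and folds
x = linspace(0,10,40)';                        % hypothetical predictor
y = 2 + 0.5*x - 0.1*x.^2 + 0.2*randn(40,1);    % hypothetical noisy quadratic response
N = numel(x);

foldId = crossvalind('Kfold',N,N);   % N folds, so each observation is its own fold

sse = 0;                             % running sum of squared errors
for k = 1:N
    test = (foldId == k);            % exactly one held-out observation
    train = ~test;
    p = polyfit(x(train),y(train),2);
    sse = sse + (polyval(p,x(test)) - y(test))^2;
end
CVerr = sse / N;                     % average squared error over all N observations
```

With a fixed data set, this variant visits each observation once, so the only remaining source of run-to-run variation is the order in which observations are assigned to folds, which does not affect CVerr.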
Input Arguments
cvMethod
Cross-validation method, specified as a character vector or string.
This table describes the valid cross-validation methods. Depending on the method, the third input argument (M) has different meanings and requirements.
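| cvMethod | M | Description |
|---|---|---|
| 'Kfold' | Positive integer: the number of folds (K). | The method uses K-fold cross-validation to generate indices. This method uses M-1 folds for training and the remaining fold for evaluation, and repeats the process M times so that each fold is used for evaluation once. |
| 'HoldOut' | Scalar between 0 and 1: the proportion of observations to hold out. | The method randomly selects approximately N*M observations to hold out for the test (evaluation) set. |
| 'LeaveMOut' | Positive integer: the number of observations to leave out. | The method randomly selects M observations to hold out for the evaluation set. Using this method within a loop does not guarantee disjoint evaluation sets. |
| 'Resubstitution' | Two-element vector [P,Q], each element a scalar between 0 and 1. | The method randomly selects N*P observations for the evaluation set and N*Q observations for the training set. |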
Example: 'Kfold'
Data Types: char | string
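To make the role of M concrete, here is a minimal sketch that calls three of the methods on a hypothetical data set of 20 observations. The sizes, the seed, and the variable names are assumptions chosen for illustration.

```matlab
N = 20;   % hypothetical number of observations
rng(1);   % repeatable random selection

foldId = crossvalind('Kfold',N,5);               % M = 5 folds: labels 1 through 5
[trainH,testH] = crossvalind('HoldOut',N,0.25);  % M = 0.25: hold out about 25%
[trainL,testL] = crossvalind('LeaveMOut',N,3);   % M = 3: leave three observations out

foldSizes = accumarray(foldId(:),1)'   % number of observations in each fold
nHeldOut  = sum(testH)                 % size of the holdout test set
nLeftOut  = sum(testL)                 % size of the leave-M-out test set
```

For 'Kfold' the output is a vector of fold labels, whereas the other two calls return logical masks that can index the rows of the data directly.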
N
Total number of observations or grouping information, specified as a positive integer, vector of positive integers, logical vector, or cell array of character vectors.
For instance, N can be a positive integer specifying the total number of samples in your data set.
N can also be a vector of positive integers or logical values, or a cell array of character vectors, containing grouping information or labels for your samples. The partition of the groups depends on the type of cross-validation. For 'Kfold', each group is divided into M subsets, approximately equal in size. For all other methods, approximately equal numbers of observations from each group are selected for the evaluation (test) set. The training set contains at least one observation from each group regardless of the cross-validation method you use.
Example: 100
Data Types: double | cell
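The per-group partitioning described above can be checked with a short sketch. It assumes a hypothetical numeric grouping vector with 30 observations in group 1 and 20 in group 2; the labels and the seed are illustrative only.

```matlab
groups = [ones(30,1); 2*ones(20,1)];   % hypothetical group labels (1 or 2)
rng(2);                                % repeatable fold assignment

foldId = crossvalind('Kfold',groups,5);   % each group is split across 5 folds

% Rows are groups, columns are folds; each entry counts the observations
% of that group assigned to that fold.
counts = accumarray([groups foldId(:)],1,[2 5])
```

Each row of counts should be roughly constant, which is what keeps the class proportions similar across folds.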
M
Cross-validation parameter, specified as a positive scalar between 0 and 1, positive integer, or two-element vector. Depending on the cross-validation method, the requirements for M differ. For details, see cvMethod.
Example: 5
Data Types: double
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: [train,test] = crossvalind('LeaveMOut',groups,1,'Min',3) specifies to have at least three observations in each group in the training set when performing the leave-one-out cross-validation.
Classes
Class or group information, specified as the comma-separated pair consisting of 'Classes' and a vector of positive integers, character vector, string, string vector, or cell array of character vectors. This option lets you restrict the observations to only the specified groups.
This name-value pair argument is applicable only when you specify N as a grouping variable. The data type of 'Classes' must match that of N. For example, if you specify N as a cell array of character vectors containing class labels, you must use a cell array of character vectors to specify 'Classes'. The output arguments you specify contain the value 0 for observations belonging to excluded classes.
Example: 'Classes',{'Cancer','Control'}
Data Types: double | cell
Min
Minimum number of observations for each group in the training set, specified as the comma-separated pair consisting of 'Min' and a positive integer. Setting a large value can help to balance the training groups, but causes partial resubstitution when there are not enough observations.
This name-value pair argument is not applicable for the 'Kfold' method.
Example: 'Min',3
Data Types: double
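A small sketch of the 'Min' option follows, assuming three hypothetical groups of ten observations each, so that there are enough observations and no resubstitution is needed. The labels, seed, and variable names are illustrative assumptions.

```matlab
groups = [ones(10,1); 2*ones(10,1); 3*ones(10,1)];   % hypothetical group labels
rng(3);                                              % repeatable random selection

% Leave one observation out while keeping at least 3 per group for training.
[train,test] = crossvalind('LeaveMOut',groups,1,'Min',3);

trainPerGroup = accumarray(groups(train),1,[3 1])   % training observations per group
testPerGroup  = accumarray(groups(test),1,[3 1])    % test observations per group
```

Counting per group in this way is a quick check that the training set respects the requested minimum before fitting a model.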
Output Arguments
cvIndices
Cross-validation indices, returned as a vector.
If you are using 'Kfold' as the cross-validation method, cvIndices contains equal (or approximately equal) proportions of the integers 1 through M, which define a partition of the N observations into M disjoint subsets.
For other cross-validation methods, cvIndices is a logical vector containing 1s for observations that belong to the training set and 0s for observations that belong to the test (evaluation) set.
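To illustrate what such a fold-label vector looks like, the sketch below builds one by hand without calling crossvalind. It is only an approximation of the idea under assumed sizes (N = 23, M = 5) and is not presented as the function's actual implementation.

```matlab
N = 23;  M = 5;   % hypothetical number of observations and folds
rng(5);           % repeatable shuffle

blockLabels = ceil(M*(1:N)'/N);       % labels 1..M in near-equal contiguous blocks
foldId = blockLabels(randperm(N));    % shuffle so fold membership is random

foldSizes = accumarray(foldId,1)';    % observations per fold (sizes differ by at most 1)

% Logical masks for one round of cross-validation, using fold 1 as the test set.
testMask  = (foldId == 1);
trainMask = ~testMask;
```

Because every observation carries exactly one label, the M test masks are disjoint and together cover all N observations.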
train
Training set, returned as a logical vector. This argument specifies which observations belong to the training set.
test
Test set, returned as a logical vector. This argument specifies which observations belong to the test set.
Version History
Introduced before R2006a