Well yes, I know how to actually partition the data. The problem is that you can't throw a cell array of separate data sets into fitcensemble and have it calculate a k-fold loss across the entire thing.
This question is closed. Reopen it to edit or answer.
Explicit indices for k-fold partitioning
Is there any way to explicitly provide the indices of each partition in a k-fold partition? I'd like to find optimal hyperparameters, but all the methods seem to divide up the data either sequentially or randomly. My data evolves over time, and each time step has a different number of observations. Dividing the data either sequentially or randomly results in 'looking into the future'. I'd like the partitions to reflect the information I have up to a given time and predict the response for the next time step, to obtain a kfoldloss.
(Time itself has no relevance however, so this isn't amenable to time-series type analysis. It's a classification problem)
thanks in advance
anthony
Answers (1)
Adam Danz
11 Sep 2020
Edited: Adam Danz
14 Sep 2020
Perhaps something like
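x = 1:100;          % demo vector (stands in for the observation indices)
k = 5;              % number of partitions (folds)
folds = cell(k,1);  % one cell per fold
for i = 1:k
    % fold i takes elements i, i+k, i+2k, ... so order is kept within the fold
    folds{i} = x(i:k:end);
end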
Those partitions maintain temporal order, but they are far from randomized. To fix that, you could create a grouping variable for each segment, randomize the segments, and then execute the loop above on the randomized segments.
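The segment idea could be sketched roughly as below. This is only an illustration, not a tested recipe for your data: segLen, seg and segOrder are made-up names, and the fixed segment length of 10 is an assumption (with real data the grouping variable would more likely be the time step of each observation).

```matlab
% Sketch only: segLen, seg and segOrder are illustrative, hypothetical names.
rng(0);                               % fixed seed so the shuffle is repeatable
x = 1:100;                            % demo vector, as in the loop above
k = 5;                                % number of folds
segLen = 10;                          % assumed segment length
seg = ceil((1:numel(x)) / segLen);    % grouping variable: segment id per element
segOrder = randperm(max(seg));        % randomize the segments
folds = cell(k,1);
for i = 1:k
    segsInFold = segOrder(i:k:end);           % segments dealt to fold i
    folds{i} = x(ismember(seg, segsInFold));  % logical mask keeps original order
end
```

Each fold then holds whole segments chosen at random, while the elements inside a fold stay in their original order.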
Alternatively, you could use stratified sampling within subgroups using
but that only ensures that each group is represented equally; it will not maintain the temporal order of your data.
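If the goal is to train only on what was known up to a given time step and test on the next one, one possible workaround is to build the index sets by hand and pool the errors yourself rather than relying on a built-in k-fold loss. The sketch below assumes a vector t holding the time step of each observation; the data are synthetic, and t, X, Y and pooledLoss are hypothetical names. It uses fitcensemble and predict from the Statistics and Machine Learning Toolbox with default settings.

```matlab
% Sketch only: synthetic data; t, X, Y and pooledLoss are hypothetical names.
rng(1);                                   % repeatable demo data
n = 300;
t = sort(randi(6, n, 1));                 % time step per observation (uneven counts)
X = randn(n, 4);                          % predictors
Y = double(X(:,1) + 0.5*randn(n,1) > 0);  % binary class labels
steps = unique(t);
nWrong = 0;
nTested = 0;
for j = 2:numel(steps)
    trainIdx = t <  steps(j);             % everything observed before this step
    testIdx  = t == steps(j);             % the next step only
    mdl  = fitcensemble(X(trainIdx,:), Y(trainIdx));
    yhat = predict(mdl, X(testIdx,:));
    nWrong  = nWrong  + sum(yhat ~= Y(testIdx));
    nTested = nTested + nnz(testIdx);
end
pooledLoss = nWrong / nTested;            % misclassification rate over all held-out steps
```

For a hyperparameter search, this loop could be wrapped in a function of the hyperparameters and pooledLoss used as the objective, at the cost of refitting the ensemble once per time step.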