필터 지우기
필터 지우기

error using the function 'splitapply'

조회 수: 6 (최근 30일)
Alberto Acri
Alberto Acri 2023년 11월 30일
댓글: Alberto Acri 2023년 11월 30일
Hi! I have a problem about using 'splitapply'.
Using the matrix 'matrix_out_98' the code works, but using the matrix 'matrix_out_12' does not work.
Would anyone be able to help me understand why this?
% matrix_out = importdata("matrix_out_12.mat");
matrix_out = importdata("matrix_out_98.mat");
max_matrix_out = max(matrix_out(:,2));
max_matrix_out_r = floor(max_matrix_out*10)/10;
%Mention the bins to group data in
j = [0 0.1:0.1:max_matrix_out_r Inf];
%Discretize the data
idx2 = discretize(matrix_out(:,2),j);
%Get the sum of the 2nd column according to the groups
out2 = splitapply(@(x) sum(x), matrix_out(:,2), idx2);
Error:
Error using splitapply (line 111)
For N groups, every integer between 1 and N must occur at least once in the vector of group numbers.
  댓글 수: 2
Dyuman Joshi
Dyuman Joshi 2023년 11월 30일
이동: Dyuman Joshi 2023년 11월 30일
accumarray will be a better fit here.
matrix_out_12 = importdata("matrix_out_12.mat");
matrix_out_98 = importdata("matrix_out_98.mat");
out_12 = fun(matrix_out_12)
out_12 = 22×1
0.7000 1.4500 7.0500 3.5000 4.7500 6.2700 3.7600 11.0000 6.6800 5.6100
out_98 = fun(matrix_out_98)
out_98 = 17×1
2.3100 5.1200 7.9900 5.3200 2.1600 4.4100 4.5100 5.9000 2.5300 8.4900
function out = fun(matrix_out);
max_matrix_out = max(matrix_out(:,2));
max_matrix_out_r = floor(max_matrix_out*10)/10;
%Mention the bins to group data in
j = [0 0.1:0.1:max_matrix_out_r Inf];
%Discretize the data
idx = discretize(matrix_out(:,2),j);
%Get the sum of the 2nd column according to the groups
out = accumarray(idx, matrix_out(:,2), [], @sum);
end
Dyuman Joshi
Dyuman Joshi 2023년 11월 30일
이동: Dyuman Joshi 2023년 11월 30일
splitapply, as the error message states, requires there to be atleast 1 data element in every bin/group in which the data is discretized.
There's no such requirement with accumarray(), where it fills the gap automatically - with the default value of 0, or manually as the value specified (for example NaN)

댓글을 달려면 로그인하십시오.

채택된 답변

Stephen23
Stephen23 2023년 11월 30일
편집: Stephen23 2023년 11월 30일
S = load('matrix_out_12.mat') % LOAD is better than IMPORTDATA
S = struct with fields:
matrix_out: [159×2 double]
max_matrix_out = max(S.matrix_out(:,2));
max_matrix_out_r = floor(max_matrix_out*10)/10; % use ROUND(..,1)
j = [0:0.1:max_matrix_out_r,Inf];
idx2 = discretize(S.matrix_out(:,2),j);
Lets check if all of those groups exist in your data:
ismember(1:max(idx2),idx2) % Nope, some don't:
ans = 1×22 logical array
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1
SPLITAPPLY cannot cope with that: it applies the function to every group from 1 to N, it does not know how to apply your function to data if that data simply does not exist.
Use ACCUMARRAY instead, which in contrast will happily replace undefined outputs with a fill value (by default zero, but you can also set the fill value to whatever you want):
out = accumarray(idx2,S.matrix_out(:,2),[max(idx2),1],@sum)
out = 22×1
0.7000 1.4500 7.0500 3.5000 4.7500 6.2700 3.7600 11.0000 6.6800 5.6100
out(end-6:end)
ans = 7×1
1.5700 3.3400 5.2100 0 0 0 2.1200
For example, using NaN as the fill value:
out = accumarray(idx2,S.matrix_out(:,2),[max(idx2),1],@sum, NaN)
out = 22×1
0.7000 1.4500 7.0500 3.5000 4.7500 6.2700 3.7600 11.0000 6.6800 5.6100
out(end-6:end)
ans = 7×1
1.5700 3.3400 5.2100 NaN NaN NaN 2.1200
  댓글 수: 7
Stephen23
Stephen23 2023년 11월 30일
편집: Stephen23 2023년 11월 30일
"'out1' consists of {A×1 double}, should be {A×2 double}"
So you want both columns. Here is an approach using ARRAYFUN:
load("matrix_out_98.mat")
load("idx1_98.mat")
out = arrayfun(@(n)matrix_out(n==idx1,:),1:max(idx1),'Uni',0)
out = 1×17 cell array
Columns 1 through 12 {104×2 double} {34×2 double} {30×2 double} {19×2 double} {5×2 double} {8×2 double} {7×2 double} {8×2 double} {3×2 double} {9×2 double} {7×2 double} {5×2 double} Columns 13 through 17 {4×2 double} {13×2 double} {8×2 double} {2×2 double} {[460 1.6400]}
Checking a few cells:
out{5}
ans = 5×2
410.0000 0.4100 415.0000 0.4100 416.0000 0.4400 418.0000 0.4700 494.0000 0.4300
out{9}
ans = 3×2
432.0000 0.8500 434.0000 0.8100 483.0000 0.8700
Note that often splitting up data make it harder to work with.
Alberto Acri
Alberto Acri 2023년 11월 30일
ok

댓글을 달려면 로그인하십시오.

추가 답변 (0개)

카테고리

Help CenterFile Exchange에서 Statistics and Machine Learning Toolbox에 대해 자세히 알아보기

제품


릴리스

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by