For loop only working/filling cell array for half of data

조회 수: 5 (최근 30일)
Claudia
Claudia 2022년 11월 11일
편집: Karim 2022년 11월 12일
I am trying to use a for loop to fill a cell array containing tables with various statistics (e.g. mean, median ...) for sites within a large dataset.
The aim is to end up with a cell array 1x42, with a table for each variable.
The loop seems to only work for the first 16 variables. The remaining tables are empty. However, if I run the same loop specifiying a single variable (eg. i = 20), the code works and that output gives a filled table.
Code and input data are attached.
clear variables; clc; load x.mat;
for i = 1:(size(x,2))
x = x(~isnan(table2array(x(:,i))),:);
[site_num,ia,obs_count] = unique(x.site_num,'sorted');
ans_mean = accumarray(obs_count,table2array(x(:,i)),[],@(x)mean(x,'omitnan')); ans_mean = [array2table(ans_mean)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_mean.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_mean = renamevars(ans_mean,'ans_mean',header);
ans_median = accumarray(obs_count,table2array(x(:,i)),[],@(x)median(x,'omitnan')); ans_median = [array2table(ans_median)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_median.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_median = renamevars(ans_median,'ans_median',header);
ans_std = accumarray(obs_count,table2array(x(:,i)),[],@(x)std(x,'omitnan')); ans_std = [array2table(ans_std)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_std.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_std = renamevars(ans_std,'ans_std',header);
ans_lq = accumarray(obs_count,table2array(x(:,i)),[],@(x)quantile(x,0.25)); ans_lq = [array2table(ans_lq)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_lq.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_lq = renamevars(ans_lq,'ans_lq',header);
ans_uq = accumarray(obs_count,table2array(x(:,i)),[],@(x)quantile(x,0.75)); ans_uq = [array2table(ans_uq)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_uq.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_uq = renamevars(ans_uq,'ans_uq',header);
obs_count = array2table(accumarray(obs_count,1)); txt1 = x(:,i).Properties.VariableNames; header = strcat(txt1,{'_'},{'obs_count'}); obs_count = renamevars(obs_count,'Var1',header);
all{i} = [array2table(site_num) ans_mean ans_median ans_std ans_lq ans_uq obs_count];
end
Any thoughts/help/tips would be greatly appreciated! Thank you!
Apologies if my code is quite inefficient, I'm still in the learning process :)
  댓글 수: 2
Stephen23
Stephen23 2022년 11월 11일
Simplify your code by replacing these:
table2array( x(:,1) )
with
x{:,1}
Claudia
Claudia 2022년 11월 11일
Thanks so much for the tip Stephen! I will make sure to that in the future :)

댓글을 달려면 로그인하십시오.

채택된 답변

Karim
Karim 2022년 11월 11일
편집: Karim 2022년 11월 12일
One issue was the reuse of the variable name "x" directly after entering the loop, you overwrite your orinal data by removing elements with a nan. After a few loops you are left with no data.
It's better to create a temporary variable, I called it "currData" to extract the data on which your are working in the current loop. I shortend the code a bit and added a few comments.
% load mat file
load(websave('myFile', "https://www.mathworks.com/matlabcentral/answers/uploaded_files/1189013/x.mat"));
% allocate a cell array for the output data
AllData = cell(1,size(x,2));
for i = 1:size(x,2)
% extract data for current loop, and convert to array
% EDIT: included Stephen23's proposal to extract the data
currData = x{:,i};
% figure out which values are a number
NumIdx = ~isnan( currData );
% only keep the numbers for further processing
currData = currData(NumIdx);
% sort the "site num" for the numbers in tha array
[site_num,~,obs_count] = unique(x.site_num(NumIdx) ,'sorted');
% get the name of the current variable
currVarName = x(:,i).Properties.VariableNames + "_";
% do the processing
ans_mean = accumarray(obs_count,currData,[],@(x)mean(x,'omitnan'));
ans_median = accumarray(obs_count,currData,[],@(x)median(x,'omitnan'));
ans_std = accumarray(obs_count,currData,[],@(x)std(x,'omitnan'));
ans_lq = accumarray(obs_count,currData,[],@(x)quantile(x,0.25));
ans_uq = accumarray(obs_count,currData,[],@(x)quantile(x,0.75));
% create the table names for the current variable
varNames = [ currVarName + "site_num";
currVarName + "ans_mean";
currVarName + "ans_median";
currVarName + "ans_std";
currVarName + "ans_lq";
currVarName + "ans_uq"
currVarName + "obs_count";];
% gather the data in a table
currTable = table(site_num, ans_mean, ans_median, ans_std, ans_lq, ans_uq, accumarray(obs_count,1),...
'VariableNames',varNames);
% store the table in the output cell array
AllData{i} = currTable;
end
% have a look at the data in the output cell
AllData
AllData = 1×42 cell array
{130×7 table} {130×7 table} {130×7 table} {130×7 table} {92×7 table} {76×7 table} {57×7 table} {104×7 table} {99×7 table} {53×7 table} {130×7 table} {104×7 table} {67×7 table} {102×7 table} {98×7 table} {45×7 table} {26×7 table} {26×7 table} {18×7 table} {68×7 table} {81×7 table} {69×7 table} {27×7 table} {62×7 table} {9×7 table} {0×7 table} {66×7 table} {66×7 table} {65×7 table} {65×7 table} {15×7 table} {8×7 table} {46×7 table} {27×7 table} {51×7 table} {29×7 table} {51×7 table} {29×7 table} {29×7 table} {17×7 table} {16×7 table} {9×7 table}
  댓글 수: 1
Claudia
Claudia 2022년 11월 11일
Thank you soooooo much Karim! You have literally saved the day :)
Thank you for your detailed, thoughtful and super helpful answer! I really appreciate it!

댓글을 달려면 로그인하십시오.

추가 답변 (0개)

카테고리

Help CenterFile Exchange에서 Logical에 대해 자세히 알아보기

제품


릴리스

R2022a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by