- Have your loadPrc return a 4 × 1483 × 2824 numeric matrix (rather than a cell array)
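- Your corresponding tall array t will then be 25000 × 1483 × 2824 when loadPrc returns one 1 × 1483 × 2824 map per file (with UniformRead the per-file outputs are concatenated vertically, so returning all four maps as 4 × 1483 × 2824 gives 100000 rows instead)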
- Instead of the for loop, simply call prctile along dimension 1
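As a rough illustration of the first bullet, a read function along these lines could return a numeric array instead of a cell array. This is only a sketch under assumptions: the variable name `maps` inside each MAT-file is hypothetical (the thread does not say what it is called), and this variant keeps only the first of the four maps so that each file contributes exactly one row to the tall array.

```matlab
function data = loadPrc(filename)
% Hypothetical sketch of a read function for fileDatastore.
% Assumption: each MAT-file holds a 4x1 cell array of equally sized
% numeric maps in a variable called 'maps' (the name is a guess).
s = load(filename, 'maps');
firstMap = s.maps{1};                 % keep only map 1 (assumed choice)
% Put a singleton dimension in front so that UniformRead stacks the
% files vertically into an N x rows x cols array.
data = reshape(firstMap, [1, size(firstMap)]);
end
```

With 1483 × 2824 maps, each call would return a 1 × 1483 × 2824 array, which is the layout the remaining bullets rely on.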
big data 2d matrix percentile calculation using tall
I'm trying to calculate a percentile over a large number of files (25000 or even more), each containing a 4x1 cell array that holds 4 maps, i.e. 1483x2824 matrices.
I'm using tall arrays, following the documentation example "Percentiles of Tall Matrix Along Different Dimensions":
tic
%start local pool for multithreading
c=parcluster('local');
c.NumWorkers=20;
parpool(c, c.NumWorkers);
folder='/home/temporal2/dsantos/mat/*.mat'; %more than 25000 files
A=ones(1483,2824,2);%aux matrix to establish the prctile data type
y=tall(A);
%datastore of files containing 4x1 cell of 1483*2824 maps
ds=fileDatastore(folder,'ReadFcn',@loadPrc,'FileExtensions','.mat','UniformRead', true)
t=tall(ds);
%fill the aux tall array with each map in the correct format
for i=1:25000
y(:,:,i)=t(1+(i-1)*1483:1483*i,:);
end
%calculate the percentile
p90_1=prctile(y,90,3)
P90_1=gather(p90_1);
save('/home/temporal2/dsantos/p90_1.mat','P90_1','-v7.3');
toc
But it seems that tall arrays won't work for this because I get the error:
Warning: Error encountered during preview of tall array 'p90_1'. Attempting to
gather 'p90_1' will probably result in an error. The error encountered was:
Requested 500025x500025 (1862.8GB) array exceeds maximum array size preference.
Creation of arrays greater than this limit may take a long time and cause
MATLAB to become unresponsive. See <a href="matlab: helpview([docroot
'/matlab/helptargets.map'], 'matlab_env_workspace_prefs')">array size limit</a>
or preference panel for more information.
> In tall/display (line 21)
p90_1 =
MxNx... tall array
? ? ? ...
? ? ? ...
? ? ? ...
: : :
: : :
>> Error using digraph/distances (line 72)
Internal problem while evaluating tall expression. The problem was:
Requested 500028x500028 (1862.9GB) array exceeds maximum array size preference.
Creation of arrays greater than this limit may take a long time and cause
MATLAB to become unresponsive. See <a href="matlab: helpview([docroot
'/matlab/helptargets.map'], 'matlab_env_workspace_prefs')">array size limit</a>
or preference panel for more information.
Error in matlab.bigdata.internal.lazyeval.LazyPartitionedArray>iGenerateMetadata (line 756)
allDistances = distances(cg.Graph);
Error in matlab.bigdata.internal.lazyeval.LazyPartitionedArray>iGenerateMetadataFillingPartitionedArrays (line 739)
[metadatas, partitionedArrays] = iGenerateMetadata(inputArrays, executorToConsider);
Error in ...
Error in tall/gather (line 50)
[varargout{:}] = iGather(varargin{:});
Caused by:
Error using matlab.internal.graph.MLDigraph/bfsAllShortestPaths
Requested 500028x500028 (1862.9GB) array exceeds maximum array size
preference. Creation of arrays greater than this limit may take a long time
and cause MATLAB to become unresponsive. See <a href="matlab:
helpview([docroot '/matlab/helptargets.map'],
'matlab_env_workspace_prefs')">array size limit</a> or preference panel for
more information.
Any clue on how to solve this problem?
All the best
Answers (2)
Edric Ellis
13 August 2019
That error is an internal one, raised because your tall array expression is simply too large: it contains too many operations. Tall arrays work by building up a symbolic representation of every expression you have evaluated, and then running them all together when you call gather. Because your for loop runs over 25000 elements, that symbolic representation becomes too large to evaluate. Tall arrays are not designed to be looped over in this way. Instead, you need to express your program as a small number of vectorised operations.
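The deferred behaviour described above can be seen on a toy example. The following is a minimal sketch on a small in-memory matrix (the sizes and variable names are made up for illustration); `mapreducer(0)` is used only so that the example runs in the client session without starting a parallel pool.

```matlab
% Minimal sketch of deferred evaluation with a small in-memory tall array.
mapreducer(0);             % evaluate tall arrays serially in the client session
x = tall(rand(1000, 3));   % tall array backed by an ordinary matrix
m = mean(x, 1);            % deferred: recorded, but not computed yet
s = std(x, 0, 1);          % deferred as well
disp(class(m));            % still 'tall', i.e. an unevaluated expression
[M, S] = gather(m, s);     % both results are computed together here
disp(size(M));             % ordinary 1-by-3 double after gather
```

Each operation on a tall array adds to the pending expression, which is why a loop of 25000 indexed assignments produces an expression far larger than these two operations.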
I would proceed in the following manner (I can't be more specific since your problem statement isn't executable; see this page for tips on making a minimal reproduction):
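% same datastore arguments as in the question; loadPrc must return a numeric array
ds = fileDatastore(folder,'ReadFcn',@loadPrc,'FileExtensions','.mat','UniformRead',true);
t = tall(ds);
% one vectorised percentile along the tall (first) dimension
p90_1 = prctile(t,90,1);
P90_1 = gather(p90_1);
% and then perhaps drop the leading singleton dimension
P90_1 = shiftdim(P90_1, 1);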
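To check that taking the percentile along dimension 1 and then calling shiftdim gives the same map as the dimension-3 layout used in the question, here is a small in-memory sketch. It uses ordinary arrays rather than tall arrays, and the sizes (50 "files" of 4 × 5 maps) are made up so that it runs quickly.

```matlab
% In-memory sketch with small made-up sizes, not the real 1483 x 2824 maps.
rng(0);
nFiles = 50; nRows = 4; nCols = 5;
y = rand(nRows, nCols, nFiles);    % question's layout: maps stacked along dim 3
t = permute(y, [3 1 2]);           % answer's layout: one file per row (dim 1)

pDim3 = prctile(y, 90, 3);         % nRows x nCols
pDim1 = prctile(t, 90, 1);         % 1 x nRows x nCols
pDim1 = shiftdim(pDim1, 1);        % drop leading singleton: nRows x nCols

maxDiff = max(abs(pDim1(:) - pDim3(:)));
disp(maxDiff);                     % expected to be zero up to round-off
```

For real tall arrays, the prctile documentation should be checked for your release: percentiles along dimension 1 of a tall array may require the 'Method','approximate' option, in which case the result is an approximation rather than an exact match.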