How to use ImageDataStore together with tall array?
I'm trying to work with a large amount of data (~10 TB) that will by no means fit into memory, even though the computer I'm working on has 256 GB of RAM.
I run into an issue using the following code (simply as an example, trying to understand the functionality of ImageDataStore and Tall arrays):
% Please don't mind the variable names!
ds = datastore('/Users/dummyPath/blabla/','Type','image');
K = tall(ds);
D = 0;
for ii = 1:10
D = D + mean(mean(mean(K(ii))));
end
N = gather(D);
OUTPUT:
Evaluating tall expression using the Parallel Pool 'local':
- Pass 1 of 2: Completed in 17 sec
- Pass 2 of 2: Completed in 0 sec
Evaluation 100% complete
Error using tall/mean (line 22)
Argument 1 to MEAN must be one of the following data types: numeric logical duration datetime char.
Learn more about errors encountered during GATHER.
Error in untitled (line 5)
D = mean(mean(mean(K(ii))));
Error in tall/gather (line 50)
[varargout{:}] = iGather(varargin{:});
Error in untitled (line 7)
N = gather(D)
Hope someone can help, I was unable to find anyone having the issue with datastore and tall arrays anywhere :/ Using them together could potentially alleviate a tedious process for me (shifting data in and out of memory, etc.).
Accepted Answer
Edric Ellis
7 Sep 2018
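Calling tall on an imageDatastore gives a tall cell array with one image per cell. mean does not accept cells, which is what the error message is reporting, so you'll almost certainly want to use cell2mat to convert this to a tall numeric array.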
There's an additional complexity - tall arrays can be "large" only in the first dimension, so it's going to be easiest to perform per-image computations by first using permute on each image so that the dimensions are offset by 1. An example is probably in order.
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'images', 'imdata', 'AT*.tif'));
t = tall(imds);
At this point, t is an M×1 tall cell array where each cell is a 480×640 uint8 array. If we use cell2mat at this point, we'll end up with a (M*480)×640 uint8 array. By using permute on each cell prior to calling cell2mat, we can end up instead with an M×480×640 uint8 array.
t2 = cellfun(@(im) permute(im, [3, 1, 2]), t, 'UniformOutput', false);
t2 = cell2mat(t2);
gather(size(t2)) % gets [10 480 640]
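The effect of permute before cell2mat can be checked on small in-memory arrays, without a datastore. This is only a sketch: the three 4×5 images below are synthetic stand-ins, and the variable names are made up for illustration.

```matlab
% Three synthetic 4-by-5 "images" (assumed data, not the AT*.tif files).
imgs = {uint8(ones(4, 5)); uint8(2 * ones(4, 5)); uint8(3 * ones(4, 5))};

% Without permute, cell2mat stacks the images along the rows.
stacked = cell2mat(imgs);
disp(size(stacked))   % 12 5

% With permute, each image becomes 1-by-4-by-5, so cell2mat stacks
% the images along a new leading dimension instead.
shifted = cellfun(@(im) permute(im, [3, 1, 2]), imgs, 'UniformOutput', false);
volume = cell2mat(shifted);
disp(size(volume))    % 3 4 5

% Slice k of the first dimension is image k again.
secondImage = squeeze(volume(2, :, :));
```

The same two calls are what the tall version above applies lazily, one block of images at a time.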
Now we can compute the mean of each image separately:
meanPerImage = mean(mean(t2, 2), 3)
gather(meanPerImage)
Unfortunately, in this case, this turns out to be not the most efficient way to compute this mean. It works better to do:
sumPerImage = sum(sum(t2, 2), 3);
numelPerImage = cellfun(@numel, t);
meanPerImage2 = gather(sumPerImage ./ numelPerImage)
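If the images are not all the same size, cell2mat cannot stack them into one array. A possible alternative, sketched here under assumptions, is to reduce each cell directly with cellfun and skip the conversion. The images below are synthetic, the name meanPerImage3 is made up for illustration, and the sketch assumes that tall accepts an in-memory cell array; with real data, t would come from tall(imds) as above.

```matlab
% Synthetic images of different sizes (assumed data for illustration).
imgs = {uint8(10 * ones(4, 5)); uint8(20 * ones(2, 3)); uint8(30 * ones(6, 6))};
t = tall(imgs);

% cellfun runs the function once per cell, so each image is reduced
% independently and only one scalar per image is kept.
meanPerImage3 = gather(cellfun(@(im) mean(im(:)), t));
disp(meanPerImage3)
```

Whether this is faster or slower than the sum-based version for a given dataset is not something this sketch establishes; it only avoids the equal-size requirement.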