Dear all,
Can you please help me vectorize (or otherwise speed up) this code? Below are the original (parfor) version and my vectorized attempt, which does not work (the resulting image is different). How should this be vectorized, and where is the error? The inner loop (two lines) is executed 47 billion times in my code, so any speed-up helps.
noised = imnoise(zeros(230,230), 'salt & pepper', 0.2);
imshow(noised, []); impixelinfo
%% Original
myTempModel = zeros(1, 230);
signalInBlock = zeros(230, 230);
tic
parfor i = 1 : 199
myTemp = myTempModel;
ii=i+31;
for j = 1 : 199
block = noised( i:ii, j:j+31);
myTemp(j+15) = sum(block(:));
end
signalInBlock(i+15, :) = myTemp;
end
toc
imshow(signalInBlock,[]); impixelinfo
%% Vectorized, but not working
signalInBlock = zeros(230, 230);
tic
i = 1:1:199;
j = 1:1:199;
signalInBlock(i+15, j+15) = sum(sum(noised(i:i+31, j:j+31)));
toc
imshow(signalInBlock,[]); impixelinfo
Best regards, Alex
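
On the question of where the error is: as far as I know, when the colon operator is given non-scalar operands MATLAB uses only their first elements, so `i:i+31` with `i = 1:199` is just `1:32` rather than 199 sliding ranges. The sketch below illustrates this under that assumption; the `rand`-based image is a stand-in for the `imnoise` output, and `rowRange`, `colRange` and `s` are names introduced here for illustration. Newer releases may warn about non-scalar colon operands.

```matlab
% Stand-in test image (assumption: any 230x230 matrix shows the effect).
noised = double(rand(230) < 0.1);

i = 1:199;
j = 1:199;

% With vector operands, colon uses only the first element of each operand,
% so i:i+31 evaluates to 1:32 here, not to a set of sliding ranges.
rowRange = i:i+31;
colRange = j:j+31;

% The double sum is therefore one scalar: the sum of the top-left 32x32 block.
s = sum(sum(noised(rowRange, colRange)));

% Scalar expansion then writes that single value into every target pixel.
signalInBlock = zeros(230, 230);
signalInBlock(i+15, j+15) = s;
```

This is why the vectorized image comes out as a constant square instead of a sliding-window sum.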

Accepted Answer

Joseph Cheng, 25 Sep 2015

Why not use conv2?
signalInBlock2 = zeros(230, 230);
tic
temp = conv2(noised,ones(32,32),'valid');
signalInBlock2(16:214,16:214)=temp;
figure,imshow(signalInBlock2,[]);
toc
When I ran your code, the parfor version took 0.327807 seconds and the conv2 version took 0.131374 seconds.
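
A quick way to convince yourself that the conv2 version computes the same thing as the loops is to run both on the same input and compare. This is a minimal sketch, not a benchmark; the `rand`-based image stands in for the `imnoise` output, and `ref`, `fast` and `maxDiff` are names introduced here.

```matlab
% Stand-in for the salt & pepper image (assumption: 0/1 valued, 230x230).
noised = double(rand(230) < 0.1);

% Reference result: explicit double loop over all 32x32 blocks.
ref = zeros(230, 230);
for i = 1:199
    for j = 1:199
        block = noised(i:i+31, j:j+31);
        ref(i+15, j+15) = sum(block(:));
    end
end

% Same sums via a 32x32 box kernel; 'valid' yields a 199x199 result.
fast = zeros(230, 230);
fast(16:214, 16:214) = conv2(noised, ones(32, 32), 'valid');

% Largest absolute difference between the two results.
maxDiff = max(abs(ref(:) - fast(:)));
```

Because the kernel is all ones, the flip that convolution applies to the kernel has no effect, so each output pixel is simply the sum of the block under it.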

Comments: 3

I just realized that this is computed in a loop, about 30 times on average:
function signalInBlock = squaredFrameProcess(noised, signalInBlock)
arrayOnes = ones(32,32); % 32x32 box kernel, so the 'valid' output is 199x199
temp = conv2(noised, arrayOnes, 'valid');
signalInBlock(16:214, 16:214) = temp;
end
And I call it like this:
...
signalInBlock = squaredFrameProcess(noiseFrame, signalInBlock);
blurred = imfilter(signalInBlock, GaussianFilter);
...
So is there a way to make it faster? Can I somehow put the frames into one data cube and process them (on the GPU?) in vectorized form?
Alex Kurek, 30 Sep 2015
Edited: Alex Kurek, 30 Sep 2015
I tried this:
noiseFrameCollector = zeros(230, 230, 30);
signalInBlock = noiseFrameCollector;
zzz = 1:1:30;
tic
signalInBlock(:,:,zzz) = squaredFrameProcess(noiseFrameCollector(:,:,zzz), signalInBlock);
toc
But got the following error:
Undefined function 'conv2' for input arguments of type 'double' and attributes 'full 3d real'.
Error in squaredFrameProcess (line 3)
temp = conv2(noised, arrayOnes, 'valid');
Is there any other possibility?
Joseph Cheng, 2 Oct 2015
Edited: Joseph Cheng, 2 Oct 2015
If my memory of the documentation is correct, conv2 only works on 2-D matrices. You can write a for loop that goes through each "layer" of signalInBlock. If the stack is large and the loop is really slow, the Parallel Computing Toolbox can make it faster, since the layers do not depend on each other. As for GPU processing, I'm still dabbling in it, so I'm not sure.
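
The per-layer loop suggested here could look like the sketch below. The `frames` stack and its random contents are hypothetical stand-ins for the 30 noise frames. As an alternative, `convn` accepts N-D input, and a 32x32 kernel is treated as 32x32x1, so it filters every layer in one call; whether that is faster than the loop would need timing on your machine.

```matlab
% Hypothetical stack of 30 frames (stand-in data, 230x230 each).
frames = double(rand(230, 230, 30) < 0.1);
kernel = ones(32, 32);

% Option 1: loop over layers, one conv2 call per 2-D slice.
% The iterations are independent, so parfor could replace for here.
outLoop = zeros(size(frames));
for k = 1:size(frames, 3)
    outLoop(16:214, 16:214, k) = conv2(frames(:, :, k), kernel, 'valid');
end

% Option 2: convn on the whole stack; the 2-D kernel acts on each layer.
outN = zeros(size(frames));
outN(16:214, 16:214, :) = convn(frames, kernel, 'valid');

% Largest absolute difference between the two approaches.
maxDiff = max(abs(outLoop(:) - outN(:)));
```

To my knowledge conv2 also accepts gpuArray inputs when the Parallel Computing Toolbox and a supported GPU are available, but for frames this small the transfer overhead may outweigh any gain, so measure before committing to it.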

More Answers (1)

Alex Kurek, 25 Sep 2015

Thank you,
I implemented it like this (toc moved before the figure call, with preallocation):
signalInBlock2 = zeros(230, 230);
tic
arrayOnes = ones(32,32);
temp = conv2(noised, arrayOnes, 'valid');
signalInBlock2(16:214, 16:214) = temp;
toc
figure, imshow(signalInBlock2, []);
It takes 0.005525 seconds, which is 34x faster.
Now I wonder whether there is anything faster than conv2.

Comments: 2

Joseph Cheng, 25 Sep 2015
Edited: Joseph Cheng, 25 Sep 2015
Good catch. I put the figure call towards the end to visually compare the parfor output with the conv2 output, and forgot to copy the timing results without the figure when replying to you.
Image Analyst, 25 Sep 2015
conv2() is highly optimized, especially for separable kernels like the one you're using (just a flat box filter). You won't find anything faster. You could compare it with imfilter() if you want; it's similar.
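
The separability mentioned here can also be spelled out by hand using the `conv2(u, v, A, shape)` form, which convolves the columns with `u` and then the rows with `v`. The sketch below compares that with the full 2-D kernel and, as a different technique swapped in purely for comparison, with a summed-area table built from `cumsum`. The stand-in image and the names `full2d`, `sep`, `C` and `sat` are introduced here; nothing in this sketch claims either variant beats plain conv2.

```matlab
% Stand-in for the salt & pepper image (assumption: 0/1 valued, 230x230).
noised = double(rand(230) < 0.1);

% Full 2-D box kernel, as in the accepted answer.
full2d = conv2(noised, ones(32, 32), 'valid');

% Explicitly separable form: column vector first, then row vector.
sep = conv2(ones(32, 1), ones(1, 32), noised, 'valid');

% Summed-area table: C(a+1, b+1) holds the sum of noised(1:a, 1:b).
C = zeros(231, 231);
C(2:end, 2:end) = cumsum(cumsum(noised, 1), 2);

% Each 32x32 block sum is a combination of four table entries.
sat = C(33:231, 33:231) - C(1:199, 33:231) ...
    - C(33:231, 1:199) + C(1:199, 1:199);

% Largest deviations from the full-kernel result.
diffSep = max(abs(full2d(:) - sep(:)));
diffSat = max(abs(full2d(:) - sat(:)));
```

To compare speeds, wrapping each variant in `timeit`, for example `timeit(@() conv2(noised, ones(32,32), 'valid'))`, gives steadier numbers than a single tic/toc pair.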


Asked: 25 Sep 2015
Edited: 2 Oct 2015
