Why ocr function doesn't recognize the numbers?

조회 수: 20 (최근 30일)
Adriano
Adriano 2018년 1월 16일
댓글: José Luis Sandoval 2020년 6월 8일
Hi,
I have the image below:
I need to capture the numbers through ocr function. Thus, I use this code to do it:
capture = imread('Capture.png');
my_image = imresize(capture, 1.4);
ocrResults = ocr(my_image,'CharacterSet','.0123456789');
recognizedText = ocrResults.Words;
However, ocr function only recognize some numbers. In fact, the output is a cell array 37x1 while I should have 39 rows:
{'18.33'}
{'1423' }
{'1' }
{'6.55' }
{'1' }
{'5.65' }
{'12.54'}
{'14.77'}
{'10.33'}
{'13.79'}
{'12.94'}
{'1255' }
{'1' }
{'1.70' }
{'9.84' }
{'10.71'}
{'9.74' }
{'9.98' }
{'933' }
{'9.00' }
{'7.22' }
{'3.02' }
{'7.45' }
{'7.10' }
{'6.56' }
{'6.28' }
{'5.86' }
{'5.40' }
{'5.01' }
{'4.57' }
{'4.10' }
{'174' }
{'3.39' }
{'3.011'}
{'2.71' }
{'2.33' }
{'2.118'}
Then, many numbers are worng. Please, somone can help me? Thanks!

채택된 답변

Birju Patel
Birju Patel 2018년 1월 17일
편집: Birju Patel 2018년 1월 17일
Hi,
A little bit of pre-processing and using ROIs to specify where the words are will help. By default, OCR uses page layout analysis to determine blocks of text. In this case, the image doesn't look like a normal page of text (like a PDF article for example).
To make it easier for OCR, first you can find the location of the words using regionprops and then pass the location of the words (as bounding boxes) to the OCR function. See the code below and results. They look accurate. You may have to play around more with the pre-processing to make this robust for a collection of different images. But hopefully, this gives you an idea on how to proceed:
capture = imread('Captura.PNG');
% Increase image size by 3x
my_image = imresize(capture, 3);
figure
imshow(my_image)
% Localize words
BW = imbinarize(rgb2gray(my_image));
BW1 = imdilate(BW,strel('disk',6));
s = regionprops(BW1,'BoundingBox');
bboxes = vertcat(s(:).BoundingBox);
% Sort boxes by image height
[~,ord] = sort(bboxes(:,2));
bboxes = bboxes(ord,:);
% Pre-process image to make letters thicker
BW = imdilate(BW,strel('disk',1));
% Call OCR and pass in location of words. Also, set TextLayout to 'word'
ocrResults = ocr(BW,bboxes,'CharacterSet','.0123456789','TextLayout','word');
words = {ocrResults(:).Text}';
words = deblank(words)
words =
39×1 cell array
{'0' }
{'18.33'}
{'0' }
{'14.23'}
{'0' }
{'16.55'}
{'0' }
{'15.65'}
{'12.64'}
{'14.77'}
{'10.83'}
{'13.79'}
{'12.94'}
{'12.55'}
{'0' }
{'11.70'}
{'9.84' }
{'10.71'}
{'9.74' }
{'9.98' }
{'9.33' }
{'9.00' }
{'7.22' }
{'8.02' }
{'7.45' }
{'7.10' }
{'6.56' }
{'6.28' }
{'5.86' }
{'5.40' }
{'5.02' }
{'4.57' }
{'4.10' }
{'3.74' }
{'3.39' }
{'3.00' }
{'2.71' }
{'2.33' }
{'2.08' }
  댓글 수: 4
Robert Cadavos
Robert Cadavos 2020년 4월 26일
you're my hero man!
José Luis Sandoval
José Luis Sandoval 2020년 6월 8일
It still does not work for me.

댓글을 달려면 로그인하십시오.

추가 답변 (0개)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by