OCR low-res image from specified set of characters

Amit on 8 Feb 2016
Answered: Amit on 21 Feb 2016
Hello all:
Is there a way to give the OCR a superset of characters, so that no characters outside that set are expected? For example, if I know that my images contain only [U, T, C, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, :, space], can I pass that set to the MATLAB OCR and expect better predictions?
Could this help improve accuracy on a low-res image?
Please see the attached image, for which I am struggling to get good results.
Thanks much for your attention. Any thoughts will be immensely helpful.
Regards,
Amit

Accepted Answer

Image Analyst on 8 Feb 2016
For something like that, you can probably just crop out each character and find its area. Then, assuming each character has a unique area, build a lookup table in which a given area maps to a given character.
  2 Comments
Amit on 8 Feb 2016
Ingenious! So you mean using functions like 'regionprops'?
Thanks @Image Analyst. Such an approach might be helpful for many other related things, though I am hoping for a simpler solution for this one particular case.
Please let me know if you have functions other than 'regionprops' in mind.
Regards,
Amit
Image Analyst on 10 Feb 2016
Right. Something like this:
measurements = regionprops(labeledImage, 'Area');
allAreas = [measurements.Area];
% Define the reference area of each known character.
characterAreas = [300,410,130,500,........]; % Whatever they are.
for k = 1 : length(allAreas)
    % Compare this blob's area against every reference area.
    differences = abs(allAreas(k) - characterAreas);
    [~, closestCharacterIndex] = min(differences);
    % Now you know what character it is....
end
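
To make the last step of the loop concrete, here is a minimal sketch of how the closest index could be turned into a character. The names knownCharacters and recognizedText, the three reference areas, and the sample blob areas are all hypothetical placeholders, not values measured from the attached image.

```matlab
% Hypothetical lookup table: one reference area (in pixels) per known character.
% These numbers are placeholders; measure the real ones from your own images.
knownCharacters = 'UTC';
characterAreas  = [300 410 130];

% Stand-in for allAreas = [measurements.Area] from regionprops.
allAreas = [128 305 402];

recognizedText = blanks(numel(allAreas));
for k = 1 : numel(allAreas)
    % Pick the reference area closest to this blob's area.
    [~, closestCharacterIndex] = min(abs(allAreas(k) - characterAreas));
    recognizedText(k) = knownCharacters(closestCharacterIndex);
end
disp(recognizedText)   % prints CUT for the placeholder values above
```

This only works if the reference areas are well separated. On low-resolution or noisy images, characters with similar areas may be confused, so it is worth checking the spread of measured areas first.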


More Answers (2)

Anand on 8 Feb 2016
Edited: Anand on 8 Feb 2016
This exact functionality is available in the ocr function. Use the 'CharacterSet' Name-Value pair to achieve this. Something like this:
ocrResults = ocr(yourImage,'CharacterSet','UTC1234567890')
In your case, you may have even more information that you can supply to the ocr function. If it's a valid assumption that the left half of the image contains only letters and the right half only digits, you could make two calls to ocr with different ROIs.
For example (this is pseudo-code),
% Each ROI is [x y width height]; im is assumed to be the input image.
halfWidth = floor(size(im,2)/2);
leftROI = [1 1 halfWidth size(im,1)-1];
ocrLeftResults = ocr(im, leftROI, 'CharacterSet', 'UTC');
% Width chosen so the right ROI stays inside the image for even and odd widths.
rightROI = [halfWidth+1 1 size(im,2)-halfWidth-1 size(im,1)-1];
ocrRightResults = ocr(im, rightROI, 'CharacterSet', '0123456789');
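
As a rough end-to-end sketch of the 'CharacterSet' idea, the snippet below restricts recognition to the letters and digits from the question plus the colon, and upscales the image first, which sometimes helps with small text. The synthetic test image, the scale factor of 3, and the names scaleFactor, imLarge and recognizedText are assumptions for illustration; on real data the image would come from imread, and the result may still be poor on very low-resolution input.

```matlab
% Requires Computer Vision Toolbox (ocr, insertText) and
% Image Processing Toolbox (imresize).

% Synthetic stand-in for one of the low-resolution images; in practice this
% would be something like im = imread(yourFileName).
im = insertText(255 * ones(30, 160, 'uint8'), [4 4], 'UTC 12:34:56', ...
    'FontSize', 12, 'BoxOpacity', 0);

% Upscale so the characters are taller; the factor 3 is an assumption to tune.
scaleFactor = 3;
imLarge = imresize(im, scaleFactor, 'bicubic');

% Restrict recognition to the characters expected in the image.
results = ocr(imLarge, 'CharacterSet', 'UTC0123456789:');

% The recognized string is in the Text property of the returned ocrText object.
recognizedText = strtrim(results.Text);
disp(recognizedText)
```

The returned object also carries per-character and per-word confidences (CharacterConfidences, WordConfidences), which can be used to flag images that need a manual check.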
  5 Comments
Anand on 8 Feb 2016
Amit, that should be perfectly fine. There is a list of supported languages which you can see here.
You need to add the Name-Value pair 'Language'.
Amit on 10 Feb 2016
Hello Anand, thank you. It's very helpful, though I guess the low resolution of my image is resulting in very poor predictions.
Any thoughts?
Thanks again.



Amit on 21 Feb 2016
Dear all:
Opening the question again. I have 2864 files such as the one attached. I have not been able to find anything, MATLAB or otherwise, that works reliably to give me the OCR output. That was my Sunday.
Any of your kind suggestions/directions will be immensely helpful.
Thanks much.
Amit
