How to make OCR recognize Upper and Lower case letters with similar shapes

Jacob Ebilane on 6 Jun 2022
Answered: Ayush on 3 Jan 2024
I'm having trouble recognizing upper- and lower-case letters. For instance, I'm trying to read 'c' but it keeps being returned as 'C'. The same goes for O/o and any other letter whose upper- and lower-case shapes are similar. My classifier is trained on a dataset that contains both upper- and lower-case samples from the EMNIST ByClass dataset.

Answers (1)

Ayush on 3 Jan 2024
Hi Jacob,
I understand that you want to distinguish between lowercase and uppercase letters that have similar shapes, such as "c" versus "C" and "o" versus "O", during classification.
One way to do this is to work with contour-based features. The steps below outline the idea:
  • Normalize the character size consistently. Scale every character by the same reference, such as the height of the text line, and preserve the aspect ratio. If each character is instead stretched to fill its own bounding box, "c" and "C" end up the same size and the classifier loses the size cue that separates the two cases.
  • Use contour feature extraction. Extract features from the contours of the characters that can help to distinguish between letter cases. For example, the height of the character's bounding box relative to the text line can be a useful feature, because upper-case letters are usually taller than lower-case letters that have no ascender.
  • You can also use additional features such as the topological structure of the letters. When the scale and aspect ratio of the characters are preserved, you can compare structural measurements. For example, the hole enclosed by "O" is larger than the hole enclosed by "o".
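The steps above can be sketched roughly as follows. This is only a minimal illustration in plain Python, not a definitive implementation: the function names (`bounding_box`, `hole_area`, `case_features`), the synthetic `ring` glyph, and the idea of passing the text-line height in as `line_height` are all assumptions made for the example. It assumes each character is a binary image (1 = ink, 0 = background) cropped from a text line of known height.

```python
from collections import deque

def bounding_box(img):
    """Return (top, bottom, left, right) of the foreground (value 1) pixels."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return rows[0], rows[-1], cols[0], cols[-1]

def hole_area(img):
    """Count background pixels that cannot be reached from the image border."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed a flood fill with every background pixel on the image border.
    for r in range(h):
        for c in range(w):
            if (r in (0, h - 1) or c in (0, w - 1)) and img[r][c] == 0:
                seen[r][c] = True
                queue.append((r, c))
    # 4-connected flood fill over the outer background.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not seen[nr][nc] and img[nr][nc] == 0:
                seen[nr][nc] = True
                queue.append((nr, nc))
    # Whatever background was not reached is enclosed by the glyph.
    return sum(1 for r in range(h) for c in range(w)
               if img[r][c] == 0 and not seen[r][c])

def case_features(img, line_height):
    """Size and topology features, scaled by the text-line height (assumed known)."""
    top, bottom, left, right = bounding_box(img)
    height = bottom - top + 1
    width = right - left + 1
    return {
        "relative_height": height / line_height,
        "aspect_ratio": width / height,
        "hole_ratio": hole_area(img) / (line_height * line_height),
    }

def ring(size, canvas):
    """Hypothetical test glyph: a square ring sitting on the baseline of a canvas."""
    img = [[0] * canvas for _ in range(canvas)]
    top = canvas - size
    for r in range(top, canvas):
        for c in range(size):
            if r in (top, canvas - 1) or c in (0, size - 1):
                img[r][c] = 1
    return img

# A large ring stands in for "O" and a small ring for "o" on the same text line.
upper = case_features(ring(10, 10), line_height=10)
lower = case_features(ring(6, 10), line_height=10)
print(upper)
print(lower)
```

Both rings have the same aspect ratio, so shape alone cannot separate them, but the smaller ring gets a lower `relative_height` and a lower `hole_ratio`. Features like these could be appended to the classifier input or used in a second pass over the ambiguous letter pairs. In MATLAB, similar measurements should be obtainable from `regionprops` (for example the `BoundingBox`, `Area`, and `FilledArea` properties).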
For more information on the contour formation, refer to the link below:
Regards,
Ayush

Release

R2022a