ocrTrainingData
Description
[imds,boxds,txtds] = ocrTrainingData(gTruth,labelName,attributeName) creates datastores for loading images, bounding boxes, and image text from ground truth.
ocrTrainingData creates training data that you can use to train and evaluate an optical character recognition (OCR) model from ground truth data. Use the trainOCR function to train an OCR model and the evaluateOCR function to evaluate the model.
Examples
Analyze OCR Ground Truth Data
This example shows how to analyze OCR ground truth data to identify its character set and to understand its class distribution.
Load the ground truth data and then extract text labels.
ld = load("14SegmentGtruth.mat");
gTruth = ld.gTruth;
[~,~,txtds] = ocrTrainingData(gTruth,"Text","Word");
Read all ground truth text corresponding to each image and combine them.
allImagesText = txtds.readall;
allText = strjoin([allImagesText{:}], "");
Find the unique set of characters in the ground truth text.
[characterSet, ~, idx] = unique(char(allText));
Display the ground truth character set.
disp("Ground Character Set: " + string(characterSet))
Ground Character Set: +,-./3ABCDEFGHIJKLMNOPQRSTUVWXYZ
The ground truth data contains all 26 uppercase letters of the English alphabet, the digit 3, and five special characters (+ , - . /).
To understand the class distribution, count the character occurrences and tabulate the character counts.
characterSet = cellstr(characterSet');
characterCount = accumarray(idx,1);
characterCountTbl = table(characterSet, characterCount, ...
    VariableNames=["Character", "CharacterCount"]);
characterCountTbl = sortrows(characterCountTbl, ...
    "CharacterCount", "descend");
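The counting step relies on the third output of unique, which maps every input character to its position in the sorted character set, and on accumarray, which adds 1 to the bin of each character. The following minimal sketch shows the same technique on a short literal string. The string sampleText and the variable names here are hypothetical and stand in for the ground truth text.

```matlab
% Hypothetical sample text standing in for the ground truth labels.
sampleText = 'HELLO+WORLD';

% unique returns the sorted distinct characters. idx maps each input
% character to its position in characterSet.
[characterSet,~,idx] = unique(sampleText);

% Add 1 to the bin of each character to get per-character counts.
characterCount = accumarray(idx(:),1);

% Tabulate the counts and sort them in descending order.
countTbl = table(cellstr(characterSet'),characterCount, ...
    VariableNames=["Character","CharacterCount"]);
countTbl = sortrows(countTbl,"CharacterCount","descend");
disp(countTbl)
```

In this sketch, the character L appears three times and therefore sorts to the top of the table.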
Visualize the character count with a word cloud chart.
wordcloud(characterCountTbl, "Character", "CharacterCount")
The characters O, E, T, N, and A have the highest counts, and the characters -, +, /, ., and 3 have the lowest counts.
Visualize the class distribution with a bar graph.
figure
numCharacters = numel(characterSet);
bar(1:numCharacters, characterCountTbl.CharacterCount)
xticks(1:numCharacters)
xticklabels(characterCountTbl.Character)
xlabel("Character")
ylabel("Number of samples")
Prepare Data for OCR Training
This example shows how to prepare data to train an OCR model that can recognize fourteen-segment characters.
The training data contains word samples of fourteen-segment characters from a page of text. Read the training image and display it.
I = imread("CVT-DSEG14.jpg");
imshow(I)
This image was annotated with bounding boxes around words, and text labels were added to these bounding boxes as an attribute by using the Image Labeler app. To learn more about labeling images for OCR training, see Train Custom OCR Model. The labels were exported from the app as a groundTruth object and saved in the 14SegmentGtruth.mat file.
ld = load("14SegmentGtruth.mat");
gTruth = ld.gTruth;
Create datastores that contain images, bounding boxes, and text labels from the groundTruth object by using the ocrTrainingData function with the label and attribute names used during labeling.
labelName = "Text";
attributeName = "Word";
[imds,boxds,txtds] = ocrTrainingData(gTruth,labelName,attributeName);
Combine the datastores.
cds = combine(imds,boxds,txtds);
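Before training, it can help to check what one sample of the combined datastore looks like. The following sketch assumes that cds is the combined datastore created above, and that the elements of a sample follow the order passed to combine (image, bounding boxes, text). It prints the class and size of each element rather than assuming a particular shape.

```matlab
% Assumes cds is the combined datastore created above.
% preview returns the first sample without advancing the datastore.
sample = preview(cds);

% Elements are expected in the order given to combine:
% image, bounding boxes, text labels.
for k = 1:numel(sample)
    fprintf("Element %d: class %s, size %s\n", ...
        k, class(sample{k}), mat2str(size(sample{k})));
end
```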
You can use the combined datastore to train an OCR model with the trainOCR function.
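As a rough sketch of that next step, the following code passes the combined datastore to trainOCR. The output folder name, the model name, the choice of "english" as the base model, and the single training epoch are all assumptions made for illustration and are not part of this example. Check the trainOCR and ocrTrainingOptions reference pages for the options that suit your data.

```matlab
% Assumes cds is the combined datastore created above.
outputDir = "OCRModel";              % hypothetical output folder
if ~isfolder(outputDir)
    mkdir(outputDir)
end

% A single epoch keeps the sketch short; real training typically needs more.
ocrOptions = ocrTrainingOptions(MaxEpochs=1,OutputLocation=outputDir);

modelName = "fourteenSegment";       % hypothetical model name
baseModel = "english";               % assumed base model to fine-tune
outputModel = trainOCR(cds,modelName,baseModel,ocrOptions);
```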
Input Arguments
gTruth — Ground truth data
groundTruth object | M-by-1 array of groundTruth objects
Ground truth data, specified as a groundTruth object or an M-by-1 array of groundTruth objects exported from the Image Labeler app.
labelName — Name of rectangular ROI label
string scalar | character vector
Name of the rectangular ROI label used for labeling ground truth, specified as a string scalar or character vector. You must use the Rectangle label type for OCR ground truth labeling.
Use the Image Labeler app to label ground truth data. After loading your images into the app, select Label from the toolbar, then select Rectangle. A dialog box appears that provides the field for entering the label name.
attributeName — Attribute name
string scalar | character vector
Attribute name that corresponds to the label name, specified as a string scalar or character vector. The attribute identifies what the OCR detects in the specified ROI labelName, for example, "word". To name an attribute in Image Labeler, after creating the ROI, select Attribute from the toolbar. A dialog box appears that provides the field for entering the attribute name.
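If you do not remember the label name used during labeling, one way to look it up is to inspect the LabelDefinitions property of the groundTruth object. The following sketch assumes that gTruth is a groundTruth object loaded as in the examples above. The exact set of columns in the table depends on how the data was labeled, so the code lists the column names instead of assuming them.

```matlab
% Assumes gTruth is a groundTruth object loaded as in the examples above.
defs = gTruth.LabelDefinitions;      % table that describes each label

% List the available columns, then display the full table.
disp(defs.Properties.VariableNames)
disp(defs)
```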
Output Arguments
imds — Image datastore
imageDatastore object
Image datastore, returned as an imageDatastore object that contains the images extracted from the specified groundTruth object or objects.
boxds — Bounding box label datastore
arrayDatastore object
Bounding box label datastore associated with the ground truth images, returned as an arrayDatastore object.
txtds — Text label datastore
arrayDatastore object
Text label datastore that corresponds to the attribute name input, returned as an arrayDatastore object.
Version History
Introduced in R2023a