
evaluateOCR

Evaluate OCR results against ground truth

Since R2023a

    Description


    metrics = evaluateOCR(resultTxt,groundTruthTxt) evaluates the optical character recognition (OCR) results resultTxt against the ground truth groundTruthTxt. The function evaluates the quality of the OCR results by computing the character error rate and word error rate metrics across images and the entire data set.

    metrics = evaluateOCR(___,Name=Value) specifies options using one or more name-value arguments in addition to any combination of arguments from previous syntaxes. For example, Metrics="word-error-rate" specifies to evaluate results using only the word error rate metric.

    Examples


    This example shows how to evaluate the accuracy of an OCR model that recognizes seven-segment numerals. The evaluation dataset contains images of energy meter displays with seven-segment numerals.

    Download and extract the dataset.

    datasetURL = "https://ssd.mathworks.com/supportfiles/vision/data/7SegmentImages.zip";
    datasetZip = "7SegmentImages.zip";
    if ~exist(datasetZip,"file")
        disp("Downloading evaluation dataset (" + datasetZip + " - 96 MB) ...");
        websave(datasetZip,datasetURL);
    end
    
    datasetFiles = unzip(datasetZip);

    Load the evaluation ground truth.

    ld = load("7SegmentGtruth.mat");
    gTruth = ld.gTruth;

    Create datastores that contain the images, bounding boxes, and text labels from the groundTruth object by using the ocrTrainingData function with the label and attribute names used during labeling.

    labelName = "Text";
    attributeName = "Digits";
    [imds,boxds,txtds] = ocrTrainingData(gTruth,labelName,attributeName);

    Combine the datastores.

    cds = combine(imds,boxds,txtds);

    Run OCR on the evaluation dataset.

    results = ocr(cds, Model="seven-segment");

    Evaluate the OCR results against the ground truth.

    metrics = evaluateOCR(results,cds);
    Evaluating ocr results
    ----------------------
    * Selected metrics: character error rate, word error rate.
    * Processed 119 images.
    * Finalizing... Done.
    * Data set metrics:
    
        CharacterErrorRate    WordErrorRate
        __________________    _____________
    
             0.082195            0.19958   
    

    Display the accuracy of the OCR model.

    modelAccuracy = 100*(1-metrics.DataSetMetrics.CharacterErrorRate);
    disp("Accuracy of the OCR model = " + modelAccuracy + "%")
    Accuracy of the OCR model = 91.7805%
    

    Input Arguments


    resultTxt — Prediction from OCR model

    Prediction from the OCR model, specified as a cell array of ocrText objects, or as any datastore object that returns a cell array with at least two elements, such that the last two cells contain:

    • Cell 1 — N-by-4 matrix of text line bounding boxes, where N is the number of ROIs in the image.

    • Cell 2 — N-element vector of strings of text predictions, where N is the number of ROIs in the image.

    If specified as a cell array, resultTxt must contain one element for each image in the original ground truth data.
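    For illustration, this sketch packages hypothetical per-image predictions as a datastore in the required format using arrayDatastore; the bounding boxes and strings are placeholders, not values from any real data set:

    % Each row holds one image's predictions: an N-by-4 matrix of text line
    % bounding boxes and an N-element string vector of predicted text.
    predData = {
        [10 10 50 20],            "123";        % image 1: one ROI
        [5 5 40 18; 5 30 40 18],  ["4"; "56"]   % image 2: two ROIs
        };
    % Each read returns one row as a 1-by-2 cell: {bboxes, predictedText}.
    resultDS = arrayDatastore(predData,OutputType="same");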

    groundTruthTxt — Ground truth text

    Ground truth text, specified as a datastore object that returns a cell array with at least two elements, such that the last two cells contain:

    • Cell 1 — N-by-4 matrix of text line bounding boxes, where N is the number of ROIs in the image.

    • Cell 2 — N-element vector of strings of ground truth text, where N is the number of ROIs in the image.
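    For example, this sketch (assuming the combined datastore cds created in the example above) previews one read to verify that the last two cells have the expected layout:

    sample = preview(cds);   % 1-by-3 cell: {image, bboxes, text}
    bboxes = sample{end-1};  % N-by-4 matrix of text line bounding boxes
    gtText = sample{end};    % N-element string vector of ground truth text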

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: metrics = evaluateOCR(resultTxt,groundTruthTxt,Metrics="word-error-rate") specifies to evaluate results using only the word error rate metric.

    Metrics — Metrics to compute

    Metrics to compute, specified as "all", "character-error-rate", or "word-error-rate". When you specify "all", the function computes both the character error rate and the word error rate.

    Verbose — Progress information display

    Progress information display, specified as a numeric or logical 1 (true) or 0 (false).
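    For example, this sketch (reusing the resultTxt and groundTruthTxt inputs described above) suppresses the progress display:

    metrics = evaluateOCR(resultTxt,groundTruthTxt,Verbose=false);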

    Output Arguments


    metrics — OCR metrics

    OCR metrics, returned as an ocrMetrics object.
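    For example, assuming the metrics object computed in the example above, you can access the data set level results, and, assuming the ocrMetrics object also carries per-image results in an ImageMetrics property, the per-image table:

    dsCER  = metrics.DataSetMetrics.CharacterErrorRate;  % scalar over the data set
    imgTbl = metrics.ImageMetrics;                       % per-image metrics table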

    Tips

    • Error rates are the fraction of characters (words) in the input groundTruthTxt that are incorrectly predicted in resultTxt.

    • To compute the number of incorrect predictions to use in the error rate calculation, the function uses the Levenshtein distance, which is the minimum number of edits (insertions, deletions, or substitutions) required to change one word (or sentence) into another. The error rate is then (see the sketch after this list):

      Error rate = (S + D + I)/N, where:

      • S — Number of substitutions

      • D — Number of deletions

      • I — Number of insertions

      • N — Maximum of the number of characters (words) in groundTruthTxt and resultTxt
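      As an illustration (a sketch with made-up strings; evaluateOCR performs this computation internally), the following code computes the character error rate for a single string pair using a hand-rolled Levenshtein distance:

      % Character error rate for one hypothetical ground truth/prediction pair.
      gt   = "energy";
      pred = "enerqy7";
      d = levenshtein(char(gt),char(pred));
      cer = d/max(strlength(gt),strlength(pred))   % (S + D + I)/N

      function d = levenshtein(a,b)
          % Minimum number of insertions, deletions, and substitutions
          % required to turn character vector a into character vector b.
          m = numel(a); n = numel(b);
          D = zeros(m+1,n+1);
          D(:,1) = 0:m;    % cost of deleting all of a
          D(1,:) = 0:n;    % cost of inserting all of b
          for i = 1:m
              for j = 1:n
                  cost = a(i) ~= b(j);   % 0 if characters match, 1 otherwise
                  D(i+1,j+1) = min([D(i,j+1)+1, D(i+1,j)+1, D(i,j)+cost]);
              end
          end
          d = D(m+1,n+1);
      end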

    Version History

    Introduced in R2023a