Object Detection In Large Satellite Imagery Using Deep Learning
This example shows how to perform object detection on large satellite imagery using deep learning.
Overview
Object detection is a key component in many computer vision applications such as automated driving, surveillance, and tracking. For many of these applications, the image data is typically smaller than 1K-by-1K pixels, and images of this size generally do not require a size-management process. However, satellite images, which can be larger than 10K-by-10K pixels, usually require additional strategies.
The size of satellite imagery gives rise to several challenges. One challenge is the amount of memory needed to store and process the images. Object detector training and prediction on very large images is impractical due to GPU resource constraints.
Another challenge is the sparsity of objects within the images. There are often large regions in the image that do not contain any objects at all. Processing these areas is wasteful and often not useful for training object detectors.
A third challenge is class imbalance, where one or more classes do not have the same number of samples as other classes. This can bias the performance of deep learning based object detectors toward the overrepresented classes.
This example shows how to apply several strategies to mitigate these challenges by:
Using block processing during training and prediction to make better use of the available GPU resources.
Automatically sampling blocks of data from the large imagery to ensure that the blocks used for training contain objects of interest.
Balancing the class distribution in a training dataset created from sampled blocks.
This example first shows how to perform object detection on a large satellite image from the RarePlanes [1,2] dataset using a pretrained SSD object detector [3]. The second part of the example shows how to train an SSD object detector on the RarePlanes dataset. All the steps for object detection and training can be adapted to other large image datasets.
To learn more about the RarePlanes dataset, see the RarePlanes User Guide.
Load Pretrained Object Detector
Download a pretrained object detector. See the Train Object Detector example section for more information on training this detector.
downloadFolder = tempdir;
detector = helperDownloadObjectDetector(downloadFolder);
Load Satellite Image
Use blockedImage to load a test image from the RarePlanes dataset. The blockedImage object represents a very large image as a collection of smaller blocks, which permits processing on a resource-constrained system.
imageFilename = helperDownloadSampleImage(downloadFolder);
bim = blockedImage(imageFilename);
Use bigimageshow to display the image.
figure
bigimageshow(bim)
Perform Object Detection
Apply the pretrained object detector to overlapping image blocks from the large image using the blockedImage apply method. Overlapping blocks are necessary for object detection in large imagery because some objects may be clipped when a block is extracted from the image. If this is not addressed, the clipped objects can introduce detection artifacts. The helperDetectObjectsInBlock function, listed at the end of this example, addresses this by discarding detections that overlap the border area by more than 50%. The use of overlapping blocks ensures that an object clipped in one block is fully visible in an adjacent block.
Specify the desired size of the blocks to process based on the detector input size. See Select Blocks for Training and Validation for more information on choosing a block size.
blockSize = detector.InputSize(1:2);
Specify the border size around the block to create overlapping blocks. Choose the border size based on the largest object size you expect for your application to ensure that the object is not clipped in at least one of the overlapping blocks. For the real portion of the RarePlanes dataset, the largest object is about 360-by-360 pixels. See Analyze Dataset Object Sizes to see how to determine object sizes in a dataset.
borderSize = [180 180];
Calculate the actual block size that the blockedImage apply object function should produce.
actualBlockSize = blockSize - 2*borderSize;
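For example, if the pretrained detector has a 512-by-512 input size (an assumption about this particular detector), apply produces 152-by-152 blocks that the 180-pixel border on each side expands back to 512-by-512. A minimal sanity check of this relationship:
% The padded block that reaches the detector should match the detector
% input size exactly.
assert(isequal(actualBlockSize + 2*borderSize, detector.InputSize(1:2)))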
The apply object function executes a custom function for each block within the blockedImage. Define the custom function, helperDetectObjectsInBlock, as the function to execute for each block.
threshold = 0.3;
detectionFcn = @(bstruct)helperDetectObjectsInBlock(bstruct, detector, borderSize, threshold);
For faster throughput on a GPU (at the cost of additional memory usage), specify a batch size greater than one to concatenate blocks into a batch of images. The exact speed-up depends on how fast blocks can be read from the image versus the time it takes to process the batch of data. Empirical performance analysis is required to identify the ideal batch size for each system. Reduce the batch size to prevent out-of-memory errors.
batchSize = 4;
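A minimal sketch of such an empirical analysis, timing the full detection pipeline over a few candidate batch sizes. The candidate values are assumptions; each run processes the whole image, so average several runs on your own system for a stable estimate.
% Time apply for several candidate batch sizes and keep the fastest one.
candidateBatchSizes = [1 2 4 8];
elapsed = zeros(size(candidateBatchSizes));
for k = 1:numel(candidateBatchSizes)
    t = tic;
    apply(bim, detectionFcn, ...
        PadPartialBlocks=true, ...
        BlockSize=actualBlockSize, ...
        BorderSize=borderSize, ...
        DisplayWaitbar=false, ...
        BatchSize=candidateBatchSizes(k));
    elapsed(k) = toc(t);
end
[~,fastest] = min(elapsed);
batchSize = candidateBatchSizes(fastest);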
Invoke the apply object function to run the object detector on overlapping blocks. Set PadPartialBlocks to true to ensure that all the blocks have the same size. This simplifies the code in helperDetectObjectsInBlock because all the input blocks have the same size.
results = apply(bim, detectionFcn, ...
    PadPartialBlocks=true, ...
    BlockSize=actualBlockSize, ...
    BorderSize=borderSize, ...
    DisplayWaitbar=true, ...
    BatchSize=batchSize);
Aggregate the detection results across all the blocks.
allBoxes = vertcat(results.Source.bboxes);
allScores = vertcat(results.Source.scores);
allLabels = vertcat(results.Source.labels);
Display all the detection results.
figure
bigimageshow(bim)
showShape("rectangle", allBoxes)
It is difficult to see the detections in the large image because the objects in the RarePlanes dataset are small compared to the full image. Set the x and y axis limits to zoom into a region with multiple detections.
figure
bigimageshow(bim)
showShape("rectangle", allBoxes)
xlim([2700 3300])
ylim([3800 4100])
The pretrained detector detects many of the airplanes, but several are missed. Many factors contribute to the overall performance of the detector, such as the number of objects in the training data, the object detector configuration, and the hyperparameters used for training.
Detecting objects in satellite imagery is a challenging application, and the RarePlanes dataset provides data you can use to explore various techniques to create a robust detector. This example shows how to set up the training and prediction pipelines but does not explore other avenues to improve the detector, as that requires additional empirical analysis.
The rest of the example shows how to train an SSD object detector on the real portion of the RarePlanes dataset.
Load Training Data
Create a directory to store the RarePlanes dataset.
dataFolder = fullfile(tempdir,"RarePlanes");
Go to the RarePlanes Dataset website, follow the instructions to download all the real images (~107 GB), and then uncompress the data into the folder created above. After uncompressing the data, you should have the following folders:
<dataFolder>/RarePlanes/real/train
<dataFolder>/RarePlanes/real/test
Create a list of all the RGB images and their corresponding label data files from the train/PS-RGB_cog folder with matlab.io.datastore.FileSet. The data from this folder is used for training and validation.
trainingImagesFolder = fullfile(dataFolder,"real","train","PS-RGB_cog");
trainingLabelFolder = fullfile(dataFolder,"real","train","geojson_aircraft");
trainingImages = matlab.io.datastore.FileSet(trainingImagesFolder);
trainingLabels = matlab.io.datastore.FileSet(trainingLabelFolder);
The RarePlanes dataset contains ground truth for many object attributes. In this example, the object classes are created based on the "wing_type" attribute, which consists of four classes:
classes = ["delta" "straight" "swept" "variable swept" ];
Load the labels using a fileDatastore with the custom read function, helperReadGeoJSONGroundTruth, which is listed at the end of this example. helperReadGeoJSONGroundTruth parses the GeoJSON files that contain the ground truth information for each image and returns the latitude and longitude coordinates of polygon ROI labels around each plane.
labelDS = fileDatastore(trainingLabels, ReadFcn=@(filename)helperReadGeoJSONGroundTruth(filename,'wing_type'));
Prepare Data for Training
The polygon ROI label data is provided in latitude and longitude coordinates. To train an object detector, the polygon ROIs must be transformed to axis-aligned rectangle ROIs, and the latitude and longitude coordinate values must be transformed to intrinsic image coordinates. The helperLatLonPolyToBoundingBox function uses the georasterinfo and geographicToIntrinsic functions from Mapping Toolbox™ to convert geographic coordinates into intrinsic image coordinates.
The conversion process requires the label and image data. Combine the label datastore with a datastore that returns the image filenames, and use a datastore transform to apply the helperLatLonPolyToBoundingBox function to the combined datastore.
imageFileNameDS = arrayDatastore(trainingImages.FileInfo.Filename);
bldsTrain = combine(labelDS, imageFileNameDS);
bldsTrain = transform(bldsTrain, @(data)helperLatLonPolyToBoundingBox(data, classes));
Extract the transformed bounding boxes and labels.
boxLabels = readall(bldsTrain);
bboxes = vertcat(boxLabels{:,1});
labels = vertcat(boxLabels{:,2});
Inspect Dataset Statistics
It is important to understand the distribution of classes in a dataset as well as the size of objects. This can help you identify issues in your dataset prior to running training experiments and can often help you remedy certain data issues ahead of time.
Analyze Dataset Object Sizes
Approximate the size of each object using the diagonal of the bounding box.
diagonalLength = hypot(bboxes(:,3),bboxes(:,4));
Group object sizes by class.
G = findgroups(labels);
groupedDiagonalLength = splitapply(@(x){x},diagonalLength,G);
Visualize the distribution of object lengths for each class.
figure
numClasses = numel(classes);
for i = 1:numClasses
    len = groupedDiagonalLength{i};
    x = repelem(i,numel(len),1);
    semilogy(x,len,"o");
    hold on
end
hold off
ylabel("Diagonal box length (pixels)")
xticks(1:numClasses)
xticklabels(classes)
The object size analysis shows that, across all classes, most of the objects have roughly the same size. In the next section, the example shows how to use this information to select blocks for training.
Analyze Object Class Distribution
Count the labels in the training dataset to determine the distribution of classes. This checks whether the dataset classes are balanced.
originalDatasetCount = countlabels(labels);
Display the class distribution.
figure
histogram(Categories=originalDatasetCount.Label, BinCounts=originalDatasetCount.Count);
The class distribution analysis shows that this dataset is imbalanced. The delta and variable swept classes have significantly fewer samples than straight and swept. Class imbalance is a common challenge in many object detection applications. Common approaches to address this challenge include over- or under-sampling objects, data augmentation, specialized loss functions, and data synthesis. The RarePlanes dataset includes synthetic data to help balance the classes, but this example does not highlight that workflow. Instead, the Select Blocks for Training and Validation section below shows how to sample very large images to balance the class distribution in the training dataset.
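As an aside, if you pursue the specialized loss function route, one common ingredient is a per-class weight inversely proportional to class frequency. A minimal sketch, not used elsewhere in this example:
% Inverse-frequency class weights computed from the label counts.
classWeights = median(originalDatasetCount.Count)./originalDatasetCount.Count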
Select Blocks for Training and Validation
As mentioned earlier, one challenge with processing large satellite imagery using deep learning is that the data must be processed in blocks due to GPU resource constraints. Use blockedImage to represent the training images as a collection of blocks.
filenames = trainingImages.FileInfo.Filename;
bims = blockedImage(filenames);
The block size is a critical parameter for block-based object detector training. Select a block size based on the size of objects in the dataset such that the object and a sufficient amount of background is visible. This ensures that the object detector is trained on image blocks where the objects of interest are fully visible. Use the object data size analysis to guide the block size selection. In this dataset, using a block size of 512-by-512 pixels ensures that all the objects of interest are visible in the image blocks.
blockSize = [512 512];
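As a cross-check, you can derive a comparable block size directly from the bounding box data. This is a sketch of one possible heuristic, assuming that rounding the largest object side up to the next power of two leaves sufficient background context:
% Derive a candidate block size from the largest object side length.
% With a largest object of about 360-by-360 pixels, this yields 512.
maxSideLength = max(bboxes(:,3:4),[],"all");
candidateBlockSize = 2^nextpow2(maxSideLength);
blockSize = [candidateBlockSize candidateBlockSize];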
With the block size defined, the next step is to specify which blocks to use from the training images. This is not a trivial task in satellite imagery because large areas within the images often do not contain any objects of interest. Therefore, naively selecting all the overlapping blocks from the training images using the selectBlockLocations function would create many image blocks with no objects, which do not provide any useful information during training. In addition, the class distribution analysis showed that the classes are imbalanced.
To find image blocks with objects for training and to balance the training dataset, use balanceBoxLabels. This function samples blocks from image regions that contain objects and returns a predefined number of blocks. Areas of the image with underrepresented object classes are sampled at a higher frequency to help balance the class distribution. The sampling process randomly shifts a sampling window to ensure that objects are not at the same position in all the blocks. Set the number of blocks balanceBoxLabels should select based on the average number of object instances per class.
numClasses = height(originalDatasetCount);
numBlocks = mean(originalDatasetCount.Count) * numClasses;
Create a table from the boxes and labels and invoke balanceBoxLabels. In this example, blocks are selected from the highest resolution level.
boxLabelTable = table(boxLabels(:,1),boxLabels(:,2));
balancedLocationSet = balanceBoxLabels(boxLabelTable, bims, blockSize, numBlocks, Levels=1);
[==================================================] 100%
Elapsed time: 00:00:11
Estimated time remaining: 00:00:00
Balancing box labels complete.
Recompute the class distribution to verify that it is more balanced.
bldsBalanced = boxLabelDatastore(boxLabelTable, balancedLocationSet);
balancedDatasetCount = countEachLabel(bldsBalanced);
Display the class distribution.
figure
histogram(Categories=balancedDatasetCount.Label, BinCounts=balancedDatasetCount.Count);
The balancing process increased the number of samples for the underrepresented classes, but the class distribution remains uneven because the original imbalance is severe. The remaining imbalance will hinder the performance of the detector on the underrepresented classes. You can consider additional techniques to address the class imbalance, such as collecting more data, using data augmentation, or generating synthetic data. These additional techniques are beyond the scope of this example.
Because of the class imbalance, training a robust detector for all four classes is not feasible. This example combines all the classes into a single Airplane class. Use helperCombineClasses to combine all the box labels into Airplane, and rerun balanceBoxLabels to sample new block locations from the dataset. Although class balancing is not needed for a single class, balanceBoxLabels enables you to sample block locations from regions of the image where objects are present.
boxLabelTable = helperCombineClasses(boxLabelTable);
numObservations = 5000;
airplaneLocationSet = balanceBoxLabels(boxLabelTable, bims, blockSize, numObservations, Levels=1);
[==================================================] 100%
Elapsed time: 00:00:04
Estimated time remaining: 00:00:00
Balancing box labels complete.
bldsAirplane = boxLabelDatastore(boxLabelTable, airplaneLocationSet);
The number of observations is set to 5000 so that the sampled block locations contain approximately as many objects as the original dataset. Count the number of labeled objects to verify that the count matches the original dataset.
countEachLabel(bldsAirplane)
ans=1×3 table
      Label       Count    ImageCount
    ________    _____    __________

    Airplane    13431       5000
Split Blocks Into Training and Validation Sets
With a set of blocks sampled from the images, the next step is to split the blocks into a training set and a validation set. Create a blockedImageDatastore to load the selected block locations from the training images.
bimds = blockedImageDatastore(bims, BlockLocationSet=airplaneLocationSet);
Combine the blockedImageDatastore with the corresponding boxLabelDatastore.
ds = combine(bimds, bldsAirplane);
Shuffle the datastore prior to splitting into training and validation sets to ensure that blocks from all images are included in both sets.
ds = shuffle(ds);
Finally, split the selected blocks into a training set and a validation set using an 80/20 split.
totalNumBlocks = bimds.TotalNumBlocks;
numBlocksTraining = round(totalNumBlocks*0.8);
dsTrain = subset(ds,1:numBlocksTraining);
dsVal = subset(ds,numBlocksTraining+1:totalNumBlocks);
Configure Object Detector
The SSD object detection network is composed of a feature extraction network followed by a detection subnetwork. The feature extraction network is typically a pretrained CNN. The detection subnetwork is a small CNN compared to the feature extraction network, and is composed of a few convolutional layers and layers specific to SSD.
This example uses ImageNet pretrained ResNet-50 for feature extraction from the Deep Learning Toolbox™ Model for ResNet-50 Network. Other pretrained networks such as MobileNet v2 or ResNet-18 can also be used depending on application requirements.
Load ResNet-50 and extract the layer graph. If Deep Learning Toolbox™ Model for ResNet-50 Network is not installed, the software provides a download link.
net = resnet50;
featureExtractionNet = layerGraph(net);
Specify the feature extraction layers to which the SSD detection subnetworks connect. Choosing the optimal feature extraction layers depends on the size of the objects in the training dataset and requires empirical analysis.
layersToConnect = ["activation_22_relu", "activation_40_relu"];
The SSD object detector uses anchor boxes. Use the estimateAnchorBoxes function to estimate anchor boxes based on the size of objects in the training data. In this example, six anchor boxes are estimated. Choosing the optimal number of anchor boxes requires empirical analysis.
numAnchors = 6;
anchors = estimateAnchorBoxes(bldsBalanced, numAnchors);
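One way to perform this empirical analysis is to sweep the number of anchors and inspect the mean IoU that estimateAnchorBoxes returns as its second output. A sketch; the sweep range is an assumption:
% Sweep the number of anchors and record the mean IoU between the training
% boxes and the estimated anchors. Higher mean IoU generally improves
% localization at the cost of a larger model.
maxNumAnchors = 10;
meanIoU = zeros(maxNumAnchors,1);
for k = 1:maxNumAnchors
    [~,meanIoU(k)] = estimateAnchorBoxes(bldsBalanced,k);
end
figure
plot(1:maxNumAnchors,meanIoU,"-o")
xlabel("Number of anchors")
ylabel("Mean IoU")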
Sort the anchors by size and distribute them into two groups, one for each detection subnetwork in SSD. For more information about anchor boxes, see Anchor Boxes for Object Detection.
area = anchors(:,1).*anchors(:,2);
[~,idx] = sort(area,"descend");
sortedAnchors = anchors(idx,:);
anchorBoxes = {sortedAnchors(1:3,:); sortedAnchors(4:6,:)};
Use ssdObjectDetector to configure a detector for the classes in RarePlanes. The SSD detection subnetworks are automatically added to the feature extraction network, and the input size is set to match the size of the blocks.
detector = ssdObjectDetector(featureExtractionNet, classes, anchorBoxes, ...
    DetectionNetworkSource=layersToConnect, ...
    InputSize=blockSize);
Specify Training Options
Specify the training options.
opts = trainingOptions("adam", ...
    MaxEpochs=20, ...
    MiniBatchSize=batchSize, ...
    InitialLearnRate=1e-3, ...
    LearnRateSchedule="piecewise", ...
    LearnRateDropPeriod=10, ...
    LearnRateDropFactor=0.1, ...
    Plots="none", ...
    Shuffle="every-epoch", ...
    BatchNormalizationStatistics="moving", ...
    ValidationData=dsVal, ...
    VerboseFrequency=1);
These training options were selected using Experiment Manager. For more information on using Experiment Manager for hyperparameter tuning, see Train Object Detectors in Experiment Manager.
Train Object Detector
Use trainSSDObjectDetector to train the object detector if the doTraining variable is true. Training takes about 8 hours and was run on a GPU with 12 GB of memory.
doTraining = false;
if doTraining
    detector = trainSSDObjectDetector(dsTrain, detector, opts);
end
Evaluate Object Detector
Evaluate the trained object detector on test images to measure the performance. Computer Vision Toolbox™ provides object detector evaluation functions to measure common metrics such as average precision (evaluateDetectionPrecision) and log-average miss rate (evaluateDetectionMissRate). This example uses the average precision metric to evaluate performance. The average precision provides a single number that incorporates the ability of the detector to make correct classifications (precision) and the ability of the detector to find all relevant objects (recall).
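To make the two quantities concrete, here is an illustrative calculation with hypothetical detection counts (the numbers are made up for illustration):
% Hypothetical counts: 80 correct detections, 20 spurious detections,
% and 40 missed objects.
tp = 80;                    % true positives
fp = 20;                    % false positives
fn = 40;                    % false negatives
precision = tp/(tp + fp)    % 0.80 - fraction of detections that are correct
recall = tp/(tp + fn)       % ~0.67 - fraction of objects that were found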
Load the test set data using the helperLoadTestData function, which uses the same loading procedure as shown for the training data.
[bimsTest, bldsTest] = helperLoadTestData(dataFolder);
Use the blockedImage apply method to run the object detector on all the test images with a detection threshold value of 0.01. The low threshold value generates many detections and provides a more comprehensive view of detector performance across the full threshold range. By default, this example loads saved test results to allow the example to run quickly. Set doEvaluation to true to recompute the evaluation results, if required.
doEvaluation = false;
if doEvaluation
    threshold = 0.01;
    testSetResults = apply(bimsTest, ...
        @(bs) helperDetectObjectsInBlock(bs, detector, borderSize, threshold), ...
        PadPartialBlocks=true, ...
        BlockSize=actualBlockSize, ...
        BorderSize=borderSize, ...
        BatchSize=batchSize);

    % Gather the results from across all the images into a table.
    numTestImages = numel(testSetResults);
    allResults = table( ...
        Size=[numTestImages 3], ...
        VariableNames=["Boxes", "Scores", "Labels"], ...
        VariableTypes=["cell", "cell", "cell"]);
    for i = 1:numTestImages
        allResults.Boxes{i} = vertcat(testSetResults(i).Source.bboxes);
        allResults.Scores{i} = vertcat(testSetResults(i).Source.scores);
        allResults.Labels{i} = vertcat(testSetResults(i).Source.labels);
    end
else
    % Load test results.
    allResults = helperLoadTestResults(downloadFolder);
end
Use evaluateDetectionPrecision to compute the precision and recall metrics.
[ap, recall, precision] = evaluateDetectionPrecision(allResults, bldsTest);
Plot the precision and recall metrics for each class. The plot shows that the detector recalls about 35% of the objects in the test dataset at the specified threshold value. This highlights the challenging nature of the RarePlanes dataset. Improving the results may require adding data augmentation, more hyperparameter tuning, or trying another object detector such as YOLO v4 or Faster R-CNN.
figure
plot(recall,precision)
title("Airplane" + " (AP:" + ap + ")")
xlabel("Recall")
ylabel("Precision")
grid on
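As a rough cross-check of the reported AP, you can approximate the area under the precision-recall curve with trapezoidal integration. This is only an approximation of the interpolation that evaluateDetectionPrecision uses internally, so expect a similar but not identical value:
% Approximate the area under the precision-recall curve.
apApprox = trapz(recall,precision)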
Summary
This example showed how to use blockedImage to implement block-based object detector training and prediction workflows. Block-based processing enables object detection on large images by breaking them down into blocks of data that can be processed on resource-constrained GPUs. Moreover, selecting blocks for training with balanceBoxLabels helps mitigate class imbalance and helps select training blocks that contain objects of interest.
The SSD training and detection workflow shown in this example can be extended to other object detectors such as Faster R-CNN, YOLO v4, and YOLO v2 by changing the detector and training function used in the Configure Object Detector and Train Object Detector sections, respectively. For more information about other object detectors, see Getting Started with Object Detection Using Deep Learning.
Supporting Functions
helperDetectObjectsInBlock
Run the object detector on blocks of data supplied by the blockedImage apply method.
function bres = helperDetectObjectsInBlock(bstruct, detector, borderSize, threshold)
% Get the block of data.
paddedBlock = bstruct.Data;

% Run the object detector.
[bboxes, scores, labels] = detect(detector, paddedBlock, Threshold=threshold);

if ~iscell(bboxes)
    bboxes = {bboxes};
    scores = {scores};
    labels = {labels};
end

% Determine the position of the valid block region (excluding the border
% area). This is needed to remove boxes that are detected in the border.
actualBlockSize = size(paddedBlock,[1 2]) - 2*borderSize;
blockPosition = [borderSize([2 1])+1 actualBlockSize([2 1])];

% Offset to place boxes in data world coords. This offset is used to update
% the position of the boxes from the local block space to the world
% coordinates of the larger image.
offset = [1 1] - bstruct.Start(:,[2 1]);

for i = 1:numel(bboxes)
    % Remove boxes that lie in the border region.
    [bboxes{i}, scores{i}, labels{i}] = ...
        helperRemoveDetectionsInBorderRegion(bboxes{i}, scores{i}, labels{i}, blockPosition);

    % Update box positions to be relative to the world coordinates.
    bboxes{i}(:,[1 2]) = bboxes{i}(:,[1 2]) - offset(i,:);
end

% Pack the detection results into a struct and ensure the last dimension
% equals bstruct.BatchSize by transposing the struct.
bres = struct('bboxes', bboxes, 'labels', labels, 'scores', scores)';
end
helperRemoveDetectionsInBorderRegion
This function removes detections that fall across the border between two blocks and would otherwise be cut in half.
function [bboxes, scores, labels] = helperRemoveDetectionsInBorderRegion( ...
    bboxes, scores, labels, blockPosition)

% Use bboxcrop to find out which boxes are inside the block position.
[~, valid] = bboxcrop(bboxes, blockPosition, OverlapThreshold=0.5);

bboxes = bboxes(valid,:);
scores = scores(valid);
labels = labels(valid);
end
helperBboxFromPolygon
Converts M four-sided polygons stored in a 4-by-2-by-M array to axis-aligned bounding boxes stored in an M-by-4 matrix.
function bbox = helperBboxFromPolygon(poly)
X = poly(:,1,:);
Y = poly(:,2,:);
X = squeeze(X);
Y = squeeze(Y);
xmin = min(X)';
xmax = max(X)';
ymin = min(Y)';
ymax = max(Y)';
bbox = [xmin ymin xmax-xmin ymax-ymin];
end
helperLatLonPolyToBoundingBox
Convert a polygon specified in latitude and longitude coordinates to a bounding box in pixel coordinates. This function uses georasterinfo and geographicToIntrinsic from Mapping Toolbox™.
function out = helperLatLonPolyToBoundingBox(data, classes)
poly = data{1}{1};
imageFile = data{2};

rasterInfo = georasterinfo(imageFile);
rasterRef = rasterInfo.RasterReference;
for i = 1:size(poly,3)
    lon = poly(:,1,i);
    lat = poly(:,2,i);
    [xi,yi] = geographicToIntrinsic(rasterRef,lat,lon);
    poly(:,1,i) = xi;
    poly(:,2,i) = yi;
end

bbox = helperBboxFromPolygon(poly);

if nargin == 1
    % Use single Airplane class.
    numObjects = size(bbox,1);
    labels = repmat("Airplane",numObjects,1);
    labels = categorical(labels,"Airplane");
else
    % Use classes from input.
    labels = categorical(data{1}{2}, classes);
end

out = {bbox, labels};
end
helperDownloadObjectDetector
Download a pretrained SSD object detector.
function detector = helperDownloadObjectDetector(folder)
url = "https://ssd.mathworks.com/supportfiles/vision/data/pretrainedSSDRarePlanes.zip";
zipFile = fullfile(folder,"pretrainedSSDRarePlanes.zip");
if ~exist(zipFile,"file")
    websave(zipFile, url);
end
matFile = fullfile(folder, "pretrainedSSDRarePlanes.mat");
if ~exist(matFile, "file")
    unzip(zipFile,folder);
end
pretrained = load(matFile);
detector = pretrained.detector;
end
helperReadGeoJSONGroundTruth
Read GeoJSON ground truth data from the RarePlanes dataset. For more information about the ground truth format, see the RarePlanes User Guide.
function [data, info] = helperReadGeoJSONGroundTruth(filename, attribute)
% Load labels from JSON file.
txt = fileread(filename);
labelsJSON = jsondecode(txt);
s = [labelsJSON.features(:).geometry];

% Concatenate coordinates into numObjects-by-5-by-2. These are the
% coordinates of the polygon used to label the planes. The coordinates are
% stored as [longitude latitude] in GeoJSON.
lonlat = cat(1,s(:).coordinates);

% Discard the last coordinate. It is the same as the first and closes the
% polygon.
lonlat(:,5,:) = [];

% Permute the coordinates to 4-by-2-by-numObjects.
lonlat = permute(lonlat,[2 3 1]);

% Extract object property information.
info = vertcat(labelsJSON.features(:).properties);

% Extract object label information.
labels = string({info(:).(attribute)}');

data = {lonlat labels};
end
helperLoadTestData
Load RarePlanes test data.
function [bimsTest, bldsTest] = helperLoadTestData(dataFolder)
% Load test images.
testImagesFolder = fullfile(dataFolder,"real","test","PS-RGB_cog");
testImages = matlab.io.datastore.FileSet(testImagesFolder);
bimsTest = blockedImage(testImages);

% Load test box labels.
testLabelFolder = fullfile(dataFolder,"real","test","geojson_aircraft");
testLabels = matlab.io.datastore.FileSet(testLabelFolder);
testLabelDS = fileDatastore(testLabels, ...
    ReadFcn=@(filename)helperReadGeoJSONGroundTruth(filename,'wing_type'));
imageFileNameDS = arrayDatastore(testImages.FileInfo.Filename);
dsTest = combine(testLabelDS, imageFileNameDS);
bldsTest = transform(dsTest, @(data)helperLatLonPolyToBoundingBox(data));
end
helperLoadTestResults
Load saved test results.
function allResults = helperLoadTestResults(folder)
url = "https://ssd.mathworks.com/supportfiles/vision/data/pretrainedSSDRarePlanes.zip";
zipFile = fullfile(folder,"pretrainedSSDRarePlanes.zip");
if ~exist(zipFile,"file")
    websave(zipFile, url);
end
matFile = fullfile(folder, "pretrainedSSDRarePlanes.mat");
if ~exist(matFile, "file")
    unzip(zipFile,folder);
end
pretrained = load(matFile);
allResults = pretrained.allResults;
end
helperCombineClasses
Combine airplane classes into a single Airplane class.
function boxLabelTable = helperCombineClasses(boxLabelTable)
for i = 1:height(boxLabelTable)
    numLabels = numel(boxLabelTable{i,2}{1});
    boxLabelTable{i,2}{1} = repmat("Airplane", numLabels, 1);
end
end
helperDownloadSampleImage
Download a sample image from the RarePlanes dataset.
function filename = helperDownloadSampleImage(folder)
url = "https://ssd.mathworks.com/supportfiles/vision/data/RarePlanesSampleImage.zip";
zipFile = fullfile(folder,"RarePlanesSampleImage.zip");
if ~exist(zipFile,"file")
    websave(zipFile, url);
end
filename = fullfile(folder, "113_104001003D8DB300.tif");
if ~exist(filename, "file")
    unzip(zipFile,folder);
end
end
References
[1] J. Shermeyer, T. Hossler, A. Van Etten, D. Hogan, R. Lewis, and D. Kim. RarePlanes Dataset. In-Q-Tel - CosmiQ Works, 2020.
[2] J. Shermeyer, T. Hossler, A. Van Etten, D. Hogan, R. Lewis, and D. Kim. "RarePlanes: Synthetic Data Takes Flight." Jun. 2020.
[3] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg. "SSD: Single Shot MultiBox Detector." European Conference on Computer Vision (ECCV), Springer Verlag, 2016.