I have been at it for a while but cannot figure it out, and I have also been lost in the documentation for a few days now. I am attempting to train a Faster R-CNN model with a pretrained ResNet backbone. So far I have found documentation stating not to use lgraph and to use dlnetwork instead, so I attempted it that way as well and got the same error. Other documentation stated not to use net = resnet50 either, and to use [net,classNames] = imagePretrainedNetwork instead. The issue is that I cannot figure out how to fit all of these pieces together. The dataset can be found here and downloaded for free: https://www.flir.com/oem/adas/adas-dataset-form/
When the model attempts to run, it appears to detect only 3 classes. I also used analyzeNetwork and the Deep Network Designer to inspect the layers, and the boxDeltas and rcnnClassification layers appear to have the correct number of outputs. Any help is greatly appreciated!!
Here is the code so far (some parts generated by ChatGPT and others taken from official documentation); I have several versions of this with slight variations:
%% Define the custom read function
% (note: in a script, local functions like this must appear at the end of the file)
function imgOut = ensureRGB(imgIn)
    [~, ~, numChannels] = size(imgIn);
    if numChannels == 1
        imgOut = repmat(imgIn, [1 1 3]);  % replicate grayscale to 3 channels
    else
        imgOut = imgIn;
    end
end
%% Define the paths
imageFolder = "C:\Users\User\Desktop\FLIR_Thermal_Dataset\FLIR_ADAS_v2\images_thermal_train";
matFile = "C:\Users\User\Documents\MATLAB\trainingData.mat"; % MATLAB-format annotations (there is a function to convert the original data into this .mat file if anyone needs it)
%% Load the training data from the MAT-file
load(matFile, 'trainingData');
% Shuffle the training data
rng(0);
shuffledIdx = randperm(height(trainingData));
trainingData = trainingData(shuffledIdx,:);
%% Create image datastore with custom read function and specify file extensions
imds = imageDatastore(trainingData.imageFilename, ...
    'ReadFcn', @(filename) ensureRGB(imread(filename)), ...
    'FileExtensions', {'.jpg', '.jpeg', '.png', '.bmp'});
%% Create box label datastore
blds = boxLabelDatastore(trainingData(:, {'bbox', 'label'}));
%% Combine the datastores
ds = combine(imds, blds);
%% Verify with a sample image
sampleImg = readimage(imds, 1);
[imgHeight, imgWidth, numChannels] = size(sampleImg);  % avoid shadowing height()
disp(['Sample Image Number of Channels: ', num2str(numChannels)]);
%% Define number of classes
numClasses = 16;
%% Define input image size and anchor boxes
inputImageSize = [512 640 3];
anchorBoxes = [32 32; 64 64; 128 128];
%% Load the ResNet-50 network
lgraph = layerGraph(resnet50);
% Specify the feature extraction layer
featureLayer = 'activation_40_relu';
% Create Faster R-CNN layers
% (note: naming this variable "dlnetwork" shadows the dlnetwork function)
dlnetwork = fasterRCNNLayers(inputImageSize, numClasses, anchorBoxes, lgraph, featureLayer);
%% Analyze the network to ensure all layers are correct
analyzeNetwork(dlnetwork);
%% Define training options
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 16, ...
    'InitialLearnRate', 1e-4, ...
    'MaxEpochs', 10, ...
    'Verbose', true, ...
    'Shuffle', 'every-epoch', ...
    'Plots', 'training-progress');
% Train the network
detector = trainFasterRCNNObjectDetector(ds, dlnetwork, options);
ERROR:
Training a Faster R-CNN Object Detector for the following object classes:
* car
* light
* person
Error using trainFasterRCNNObjectDetector (line 33)
Invalid network.
Error in untitled (line 74)
detector = trainFasterRCNNObjectDetector(ds, dlnetwork, options);
Caused by:
Layer 'boxDeltas': The input size must be 1×1×12. This R-CNN box regression layer expects the third input dimension to be 4 times the number of object classes the network should detect (3 classes). See the documentation for more details about creating Fast or Faster R-CNN networks.
Layer 'rcnnClassification': The input size must be 1×1×4. The classification layer expects the third input dimension to be the number of object classes the network should detect (3 classes) plus 1. The additional class is required for the "background" class. See the documentation for more details about creating Fast or Faster R-CNN networks.
So far I have tried:
1) using dlnetwork instead of lgraph
2) using [net,classNames] = imagePretrainedNetwork instead of net=resnet50
3) manually changing the layers in the designer
4) changing the channels from 1 to 3 (when loaded into my Python environment the images had three channels; in MATLAB they showed 1)
5) resizing the images

2 comments

Corey Kurowski, 24 Jul 2024
Hey Brenon,
At a surface level, it looks like there may be a mismatch in the input datastore and the network configuration. Unfortunately, I am unable to use the dataset directly, but if you are able to share your network and combinedDatastore, it might provide some clarity as to where this mismatch is occurring. Alternatively, even a set of screenshots of the feature extraction layer and output layers in your dlnetwork (from within analyzeNetwork) and your boxLabelDatastore may yield the information needed.
As a slight aside, what is gearing you towards using Faster RCNN as opposed to another network? It would be good to understand your approach fully to see if another solution/approach might exist.
Brenon Tate, 24 Jul 2024
Good morning Corey, and thanks for the reply! The reason I am using Faster R-CNN is that I believe it may do a better job at identifying small objects than some of the other networks. I have already trained a Faster R-CNN model in Python with this set, and I want to do the same thing in MATLAB for practice. The ultimate goal is to use this model on my own dataset, which I am in the process of creating with a thermal sensor and which will contain small arms. The plan is to get it working in parallel, then train the model as I capture data and annotate my images.
I'm not sure if these files are what you were requesting, but if they aren't, please let me know and I will upload more. I could not upload the model as it is too large, but there is an image of the last 10 layers.


Accepted Answer

Corey Kurowski, 30 Jul 2024

The following resolved the immediate issue at hand:
Hey Brenon,
My apologies for not seeing your initial reply sooner. After diving into this a bit, the issue seems to stem from your boxLabelDatastore and the categories produced for your labels. There is an underlying expectation that the categories denoted in each row of the combined datastore provide a comprehensive list of all classes in your datastore. Currently, your combinedDatastore produces only a subset of classes (representing only the labels present in that image):
categories(blds.LabelData{1,2})
ans =
  3×1 cell array
    {'car'   }
    {'light' }
    {'person'}
The easiest fix would likely be configuring how your trainingData.mat is formatted. The first column should be the image names and then every column after should be a specific class with each row having associated bounding boxes for that respective image. Similar to:
Then, to create a boxLabelDatastore, you'd call:
blds = boxLabelDatastore(trainingData(:,2:end));
This will automatically configure that underlying comprehensive categorical class assumption. Please try this out when you are able to and see if it allows you to proceed.
In the meantime, it would be great if you could share your trainingData.mat file as it is so we can see what your formatting is. In most cases, we are able to build that underlying assumption upon boxLabelDatastore construction, but it seems like you may have an edge case configuration that we haven't properly captured.
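A minimal sketch of that long-to-wide conversion, assuming the current trainingData holds one object per row with columns imageFilename, bbox (a 1-by-4 [x y w h] row), and label; the variable names and column shapes here are assumptions about Brenon's file, not its confirmed format:

```matlab
% Sketch: reshape a long-format annotation table (one object per row) into
% the wide layout boxLabelDatastore expects: one column per class, each row
% holding an M-by-4 [x y w h] matrix of that class's boxes for that image.
load("trainingData.mat", "trainingData");
classNames = categories(categorical(trainingData.label));
files      = unique(trainingData.imageFilename, 'stable');

wideData = table(files, 'VariableNames', {'imageFilename'});
for c = 1:numel(classNames)
    boxes = cell(numel(files), 1);
    for i = 1:numel(files)
        rows = strcmp(trainingData.imageFilename, files{i}) & ...
               trainingData.label == classNames{c};
        boxes{i} = trainingData.bbox(rows, :);   % M-by-4, possibly empty
    end
    wideData.(classNames{c}) = boxes;            % one column per class
end

% All 16 classes are now encoded in the table schema, so the
% boxLabelDatastore sees the full class list regardless of which
% classes appear in any single image:
blds = boxLabelDatastore(wideData(:, 2:end));
```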

Additional Answers (1)

Brenon, 27 Jul 2024

I decided to go back to Python, as this is consuming way more time than it's worth. I loaded an example to compare my datastore to, and everything is the same; I loaded an image with bounding boxes to ensure they were correct; then I checked that the inputs to the "rcnnBoxDeltas" and "rcnnClassification" layers were 64 (4 × 16 classes) and 17 (16 classes + 1), and they were. For some reason this model is only detecting 3 of the 16 classes.
Also, the example Faster R-CNN loads with similar errors.

8 comments

Corey Kurowski, 29 Jul 2024 (edited 29 Jul 2024)
Hey Brenon,
(This comment is identical to the accepted answer above.)
Brenon, 29 Jul 2024
Good morning Corey,
Thank you for bringing this to my attention. I will try to alter the trainingData file as you suggested as soon as I get a chunk of free time. The issue may be the function that I created to convert the COCO annotations to a format that (I thought) MATLAB likes. I inspected some other datastores and mine appeared to be in the same format, but there is likely an issue there. I have attached the function used to convert the annotations, along with the associated trainingData file, in case you have a chance to look at it before I work on the fix you suggested.
R/
Brenon
Brenon, 29 Jul 2024 (edited 29 Jul 2024)
Corey,
My impatience got the best of me; it appears that your suggestion fixed the issue. I did the quick fix, which was asking ChatGPT to alter the trainingData to match your example, but when I have more time I will properly adjust the original function that converts the COCO annotations so I can use it in the future without issues. I am getting another error, but an initial look online seems to point to something RAM-related, which I believe shouldn't be too difficult to address. Thanks again for looking into this for me, I greatly appreciate it!
EDIT: should I copy and paste your answer into a new comment and accept it?
Maximum variable size allowed on the device is exceeded.
Corey Kurowski, 30 Jul 2024
I am glad that the change fixed the issue. I did get a chance to look at your original formatting for trainingData and I very much appreciate you sharing that. We can look at ways that we can cover this type of formatting in the future, since it is certainly a reasonable way that data might be organized.
And yes, regarding your new error, that is RAM-related. A variable is being created that is too large for your currently allowed max array size in MATLAB. The first step would be to go to Preferences -> MATLAB -> Workspace and increase your MATLAB array size limit setting. If maxing this out does not fix the problem, you can decrease the batch size or reduce the input image size (or both). If you are still not able to get past this issue, please let me know.
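For reference, the batch-size and image-size knobs mentioned above look roughly like this; the specific values are illustrative starting points, not tuned recommendations:

```matlab
% Illustrative mitigations for "Maximum variable size allowed on the
% device is exceeded"; values are guesses to be tuned, not prescriptions.
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 4, ...            % smaller batches -> smaller activations
    'InitialLearnRate', 1e-4, ...
    'MaxEpochs', 10, ...
    'Shuffle', 'every-epoch', ...
    'Plots', 'training-progress');

% Alternatively, train at a reduced input size. If you resize the images
% themselves, remember the bounding boxes must be scaled by the same factor.
inputImageSize = [256 320 3];          % half of the original 512x640
```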
Brenon, 30 Jul 2024
Good morning Corey,
I haven't adjusted the array size limit yet, but yesterday I decreased the batch size to 4 and the model began training without errors. I'll try some other adjustments later in the week when I have more time, including that limit you mentioned. Now that I know what the issue was, I plan to go back and change some of the code to reflect the recent documentation I read, such as using dlnetwork instead of lgraph and incorporating imagePretrainedNetwork. Thanks again for your help, there is no telling when or if I would have figured that out on my own!
Brenon
Corey Kurowski, 30 Jul 2024
Not a problem Brenon, I'm just glad we were able to resolve this relatively quickly and get you moving again (my apologies again for not seeing your initial reply sooner). Again, feel free to start a new post or let me know if any other issues pop up.
Corey Kurowski, 30 Jul 2024
Hey Brenon, one more thing that popped into my mind regarding your choice of Faster R-CNN for small-object identification: I would suggest you give YOLOX a chance if time allows. We recently released it and have seen very good results on smaller objects due to its anchorless design and feature pyramid network.
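A rough sketch of that YOLOX workflow, assuming the Computer Vision Toolbox Automated Visual Inspection Library support package is installed; the function names and signatures below should be verified against the current documentation for your release:

```matlab
% Hedged sketch, not a confirmed recipe: create a pretrained YOLOX
% detector and fine-tune it on the same combined datastore "ds".
classNames = ["car" "light" "person"];        % ...plus the rest of the 16
detector   = yoloxObjectDetector("small-coco", classNames);
trainedDet = trainYOLOXObjectDetector(ds, detector, options);
```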
Brenon, 30 Jul 2024
Corey,
Thank you for providing that link; I will definitely try that out when the time comes. I may even try it on this FLIR ADAS set this weekend just to play around with it. I won't get to try it on the dataset that I mentioned previously, as I believe it will take me a while to create it.
Brenon


Release: R2024a
Asked: 20 Jul 2024
Last comment: 30 Jul 2024