Change Object Detection to own Objects (fasterRCNN)

2 views (last 30 days)
MatLabMcLovinPotato on 28 May 2020
Edited: Ali Ozturk on 27 May 2023
Afternoon!
I've been working my way through this example as it's the closest I've found for what I'm trying to do: https://www.mathworks.com/help/vision/examples/object-detection-using-faster-r-cnn-deep-learning.html
This is a detector trained to look for vehicles; I am trying to use it to detect other things. I need to detect more than one object, of more than one class.
I've changed the above to:
  1. Use my own groundtruth table
  2. I have 8 objects I'd want to be detecting (+ Background)
  3. The main (really the only) change is that I modified all references from 'vehicle' to my groundtruth table columns, where I need all of my new objects.
i.e. I changed:
FROM: bldsTest = boxLabelDatastore(testDataTbl(:,'vehicle'));
TO: bldsTest = boxLabelDatastore(testDataTbl(:,2:end)); I also tried listing out all the classes in curly braces, rather than referencing columns, with the same errors.
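The multi-class datastore change above can be sketched as follows (a hedged sketch: the variable and column names are assumptions mirroring the linked example, not my exact script):

```matlab
% Sketch of the multi-class datastore setup (names are illustrative).
% Column 1 of the ground truth table is assumed to hold image filenames;
% columns 2:end hold one bounding-box column per class (8 classes here).
imdsTest = imageDatastore(testDataTbl{:, 'imageFilename'});
bldsTest = boxLabelDatastore(testDataTbl(:, 2:end));  % all class columns
testData = combine(imdsTest, bldsTest);               % pairs images with boxes
```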
I labelled my own images and have been updating and adjusting as and where I can. I've changed lines in the options, learning that I need to edit which and what layers?
Although at this point, I have the following road block, at the trainFasterRCNNObjectDetector step, I get the following error:
Error using trainFasterRCNNObjectDetector (line 426)
Invalid network.
Caused by:
Layer 'boxDeltas': The input size must be 1×1×32. This R-CNN box regression layer expects the third input dimension to be 4 times the number of object classes the network should detect (8 classes). See the documentation for more details about creating Fast or Faster R-CNN networks.
Layer 'rcnnClassification': The input size must be 1×1×9. The classification layer expects the third input dimension to be the number of object classes the network should detect (8 classes) plus 1. The additional class is required for the "background" class. See the documentation for more details about creating Fast or Faster R-CNN networks.
REQUEST
Would it be possible to please get an idea of how I can adjust my workings to not have these errors, such as which variable, object, or deity I need to reference? Do I need to run through the steps of this first? For example: https://www.mathworks.com/help/vision/ug/faster-r-cnn-examples.html
Should my images be smaller? Should there be more of them, or more anti-object images? I've changed the options for training, although I'm not sure if that's the right direction. I'm asking what to do about my boxDeltas and rcnnClassification errors. Given this isn't the first post on this topic, if you do feel the need to reply, please don't just reword the error message back to me. If that's what I was after, I'd have posted this weeks ago...
  1 comment
Ali Ozturk on 27 May 2023
Edited: 27 May 2023
You need to set the numClasses variable to your number of classes in the faster_rcnn.m file.
For example, if you have 8 classes, the line would be:
numClasses = 8;
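A hedged sketch of the same idea, deriving the class count from the ground truth table instead of hard-coding it (trainingDataTbl and its column layout are assumptions):

```matlab
% Assumes trainingDataTbl has the image filename column first,
% followed by one column per object class.
numClasses = width(trainingDataTbl) - 1;   % 8 classes in this case
```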


Answers (1)

Madhav Thakker on 24 July 2020
I understand that you want to train a Faster-RCNN for multi-class object detection.
It seems that the Faster-RCNN network is instantiated as expected (8 classes + 1 background). I think the input data is not read properly. You can compute width(dataset) - 1 to verify the number of classes in your input dataset.
fasterRCNNLayers(inputSize,numClasses,anchorBoxes,featureExtractionNetwork,featureLayer) should be able to create a working Faster-RCNN network with the correct number of classes. The required sizes are also reflected in the 'boxDeltas' and 'rcnnClassification' layer errors.
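A minimal sketch of that approach, assuming a ResNet-50 backbone and a commonly used feature layer (the input size, anchor count, and layer name here are assumptions, not the poster's exact setup):

```matlab
% Rebuild the network so 'boxDeltas' and 'rcnnClassification' are sized
% for the right class count. Backbone and feature layer are assumptions.
inputSize   = [224 224 3];
numClasses  = width(trainingDataTbl) - 1;            % subtract the filename column
anchorBoxes = estimateAnchorBoxes(trainingData, 3);  % 3 anchors, from the datastore
featureExtractionNetwork = resnet50;
featureLayer = 'activation_40_relu';                 % common choice for ResNet-50
lgraph = fasterRCNNLayers(inputSize, numClasses, anchorBoxes, ...
    featureExtractionNetwork, featureLayer);
```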
To answer your other questions -
  • The minimum input image size should be [224, 224, 3], but if you have a powerful GPU, you can even feed in the original image size.
  • The more training images you have, the more generalizable and robust the learned network will be.
  • Ideally, you should have some data with no foreground objects, but that depends on the use case.
  1 comment
永涛 贾 on 24 May 2021
According to the help documentation: when training a Faster-RCNN for multi-class object detection using a datastore, calling the datastore with the read and readall functions returns a cell array or table with two or three columns. The second column must be a cell array that contains M-by-5 matrices of bounding boxes of the form [xcenter, ycenter, width, height, yaw]. The vectors represent the location and size of the bounding boxes for the objects in each image.
What does the parameter 'yaw' mean? Where does it come from?
