Why do I get different calibration values for weights and biases depending on the specified ExecutionEnvironment?
Views: 3 (last 30 days)
  
    
I am using the dlquantizer and calibrate functions to quantize a fully convolutional network model. When I set the ExecutionEnvironment to 'MATLAB' for a target-agnostic quantization, everything is OK and I get good performance with the quantized model. However, when I select 'FPGA', performance is severely degraded. When I analyze the results of the calibration processes (MAX_MIN), I can see that in the first case the values of the model parameters are not modified (beyond the limits imposed by the 8-bit resolution), but in the second case the values are completely different. For instance, in the first convolutional layer I get the following ranges:
Floating point net:
- Bias: [-0.006, 0.0014]
- Weights: [-0.1555, 0.1405]
Agnostic qnet:
- Bias: [-0.006, 0.0014]
- Weights: [-0.1555, 0.1405]
FPGA qnet:
- Bias: [-3.4817, 4.6812]
- Weights: [-1.7785, 0.3995]
That does not make sense to me unless there is a parameter-adaptation process that somehow compensates for deviations introduced by the target hardware (activations?), but I assume that is not the case (it would only make sense if there were a predefined processor IP, say DPUs for Xilinx devices, whereas I understand this is a workflow for designing a semi-custom processor IP). I am sure there is a reason for this, but I am not able to infer it from the documentation. Any ideas? For reference, the steps I follow are roughly sketched below.
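(A minimal sketch of those steps; net and calDS stand in for my network and calibration datastore, and this requires the Deep Learning Toolbox Model Quantization Library.)

% Target-agnostic quantization
quantObjMatlab = dlquantizer(net, 'ExecutionEnvironment', 'MATLAB');
calResultsMatlab = calibrate(quantObjMatlab, calDS);  % table of min/max ranges
qNetMatlab = quantize(quantObjMatlab);

% FPGA-targeted quantization of the same network
% (for actual deployment the calibrated dlquantizer object feeds the
% dlhdl.Workflow FPGA flow; quantize here is only for inspection)
quantObjFpga = dlquantizer(net, 'ExecutionEnvironment', 'FPGA');
calResultsFpga = calibrate(quantObjFpga, calDS);
qNetFpga = quantize(quantObjFpga);

% Compare the calibrated MinValue/MaxValue entries
disp(head(calResultsMatlab))
disp(head(calResultsFpga))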
Comments: 4

- In the case of deployment target options (ExecutionEnvironment other than 'MATLAB'), if a Convolution or GroupedConvolution layer is followed by a BatchNormalization layer, its weights and bias are replaced with batchnorm-folded weights and bias (and the BatchNormalization layer becomes a no-op) to speed up inference in deployment workflows. Is that the case with the model in question? (One way to check is sketched below.)
- Could you say more about the performance degradation? Is the degradation in accuracy or in inference time? Could you share the steps you are following and the step at which you see the degradation? Also mention whether you are running on hardware or in simulation in MATLAB.
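One way to check for such Convolution -> BatchNormalization pairs (a rough sketch, assuming the floating-point network is in a variable named net):

lg = layerGraph(net);
names = {lg.Layers.Name};
for i = 1:height(lg.Connections)
    src = lg.Connections.Source{i};
    dst = strtok(lg.Connections.Destination{i}, '/');  % drop port suffixes such as '/in1'
    srcLayer = lg.Layers(strcmp(names, src));
    dstLayer = lg.Layers(strcmp(names, dst));
    if (isa(srcLayer, 'nnet.cnn.layer.Convolution2DLayer') || ...
        isa(srcLayer, 'nnet.cnn.layer.GroupedConvolution2DLayer')) && ...
       isa(dstLayer, 'nnet.cnn.layer.BatchNormalizationLayer')
        fprintf('%s -> %s (candidate for batchnorm folding)\n', src, dst);
    end
end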
Dhananjay Kumar on 20 Jun 2023 (edited 20 Jun 2023)
- Will let the relevant team know about this suggestion of showing the merging of the layers.
- If you don't mind, could you attach the model (or a similar model) along with a small code snippet that demonstrates the accuracy drop? Or describe more about the network architecture, the types of layers, and the layer properties? You can also create a Technical Support request with MathWorks if there is a concern about sharing the model or model details here.
Dhananjay Kumar on 20 Jun 2023
Hi Koldo,
The accuracy loss for the FPGA target could have multiple causes. One of them: for the 'MATLAB' target, many layers are left unquantized and their computation happens in floating point, whereas for 'FPGA' almost all layers are quantized.
>> qDetails = quantizationDetails(qNet)
qDetails = struct with fields:
            IsQuantized: 1
          TargetLibrary: "none"
    QuantizedLayerNames: [26×1 string]
    QuantizedLearnables: [52×3 table]
The QuantizedLayerNames field above tells you which layers are quantized in qNet.
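For example, to see which layers are quantized under one target but not the other (a sketch; qNetMatlab and qNetFpga are placeholder names for the two quantized networks):

dM = quantizationDetails(qNetMatlab);
dF = quantizationDetails(qNetFpga);
onlyFpga = setdiff(dF.QuantizedLayerNames, dM.QuantizedLayerNames);
disp('Layers quantized only for the FPGA target:')
disp(onlyFpga)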
I also see that the batchnorm-folded weights/bias values are much higher than the original values, so how the BatchNormalization layer is used in the network could be an issue. Or maybe BatchNormalization is not needed after some of the convolution layers?
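For reference, the usual batchnorm-folding arithmetic (a sketch of the standard formula, not necessarily the exact internals; gamma, beta, mu, sigma2, epsilon are the BatchNormalization parameters and W, b the convolution weights and bias):

scale = gamma ./ sqrt(sigma2 + epsilon);    % one factor per output channel
W_fold = W .* reshape(scale, 1, 1, 1, []);  % scale each output-channel filter
b_fold = (b - mu) .* scale + beta;
% If sqrt(sigma2 + epsilon) is small relative to gamma, the folded values can
% be much larger than the originals, which widens the calibrated ranges and
% costs int8 precision.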
Answers (2)