Why do I get different calibration values for weights and biases depending on the specified ExecutionEnvironment?

Views: 3 (last 30 days)
I am using dlquantizer and calibrate functions to quantize a Fully Convolutional Network model. When I set the ExecutionEnvironment to ‘MATLAB’ for a target agnostic quantization, everything is OK and I am able to get good performance with the quantized model. However, when I select ‘FPGA’, performance is severely degraded. When I analyze the results of the calibration processes (MAX_MIN), I can see that in the first case the values of model parameters are not modified (beyond the limits imposed by the 8-bit resolution), but in the second case the values are completely different. For instance, in the first convolutional layer I get the following ranges:
Floating point net:
  • Bias -0.006, 0.0014
  • Weights -0.1555, 0.1405
Agnostic qnet:
  • Bias -0.006, 0.0014
  • Weights -0.1555, 0.1405
FPGA qnet:
  • Bias -3.4817, 4.6812
  • Weights -1.7785, 0.3995
That does not make sense to me unless there is a parameter adaptation process that somehow compensates for deviations introduced by the target hardware (activations?), but I assume that is not the case. Such an adaptation would only make sense if there were a predefined processor IP, say DPUs for Xilinx devices, whereas I understand this is a workflow for designing a semi-custom processor IP. I am sure there is a reason for this, but I am not able to infer it from the documentation. Any ideas?
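For context, the comparison I am running looks roughly like this (a sketch; `net` and `calDS` are placeholders for my actual network and calibration datastore):

```matlab
% Sketch of the two calibration workflows being compared.
% net and calDS stand in for the actual network and calibration datastore.
quantObjMAT  = dlquantizer(net, 'ExecutionEnvironment', 'MATLAB');
calResMAT    = calibrate(quantObjMAT, calDS);   % target-agnostic min/max ranges

quantObjFPGA = dlquantizer(net, 'ExecutionEnvironment', 'FPGA');
calResFPGA   = calibrate(quantObjFPGA, calDS);  % ranges differ for the same layers
```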
  4 Comments
Dhananjay Kumar, 20 June 2023
Edited: Dhananjay Kumar, 20 June 2023
  1. I will let the relevant team know about this suggestion of showing the merging of the layers.
  2. If you don't mind, could you attach the model (or a similar model) along with a small code snippet that demonstrates the accuracy drop? Or describe the network architecture, layer types, and layer properties in more detail? You can also create a Technical Support request with MathWorks if sharing the model or model details here is a concern.
Dhananjay Kumar, 20 June 2023
Hi Koldo,
The accuracy loss for the FPGA target here could have several causes. One of them:
Many layers remain unquantized and compute in floating point for the 'MATLAB' target. For 'FPGA', almost all the layers are quantized.
>> qDetails = quantizationDetails(qNet)
qDetails = struct with fields:
            IsQuantized: 1
          TargetLibrary: "none"
    QuantizedLayerNames: [26×1 string]
    QuantizedLearnables: [52×3 table]
QuantizedLayerNames above tells you which layers are quantized in qNet.
I also see that the batchnorm-folded weight/bias values are much larger than the original values, so how the BatchNormalization layers are used in the network could be the issue. Or perhaps BatchNormalization is not needed after some of the convolution layers?
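To illustrate the folding effect (a sketch of the standard conv + batchnorm fold, not the actual toolbox implementation): folding a BatchNormalization layer into the preceding convolution rescales the weights and bias by gamma/sqrt(variance + epsilon), so channels with a small running variance can end up with folded parameters far outside the original ranges:

```matlab
% Hypothetical illustration of conv + batchnorm folding.
% W, b are conv parameters; gamma, beta, mu, sigma2 come from the BN layer.
W = 0.15; b = -0.006;
gamma = 1.2; beta = 0.05; mu = 0.3; sigma2 = 1e-3; epsilon = 1e-5;

s      = gamma ./ sqrt(sigma2 + epsilon);  % per-channel scale factor
W_fold = s .* W;                           % folded weights can be much larger
b_fold = s .* (b - mu) + beta;             % folded bias absorbs the BN shift
```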


Answers (2)

Koldo Basterretxea, 20 June 2023
Hi Dhananjay,
This is a four-level encoder-decoder fully convolutional network. I don't think there is anything wrong with the code; it is just that the merging of convolution and batch-normalization layers increases the range of activation values in many layers (first figure), so the 8-bit integer representation cannot produce satisfactory results (second figure). Unfortunately, the distribution-based exponent scheme ('histogram') does not solve the problem.
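For reference, I selected the distribution-based scheme roughly like this (a sketch, assuming the ExponentScheme name-value option of quantize; `quantObj` is the calibrated dlquantizer object):

```matlab
% Sketch: quantize with the histogram-based exponent scheme
% instead of the default 'MaxMin'.
qNet = quantize(quantObj, 'ExponentScheme', 'Histogram');
```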

Koldo Basterretxea, 22 June 2023
Hi.
I tried that and got no significant improvement.
And you were right: the 'agnostic' quantization did not perform 8-bit quantization at every layer (the 'FPGA' version did).
Thank you.
