Method of Real number multiplication operation
Environment
In Simulink,
[ Value (uint16) ] ----> [ Gain: 9.8, fixdt(1,16,0.1,0) ] ----> [out]
When I generate code from the model above, the generated code contains, briefly, the expression below.
(uint16_T) ((((uint32_T)((uint16_T)( (((uint32_T)49U)*((uint32_T)Value)) >> 3 )) * ((uint32_T)52429U))) >> 15)
I think the above expression can be represented as shown below.
Value * (49>>3) * (52429>>15)
Additionally, I noticed that the binary bit patterns of 49 and 52429 correspond to 98 and 0.1, respectively.
So I think that the 49 >> 3 part represents 98, and the 52429 >> 15 part represents 0.1.
From what I know, even though we represent a real number in binary, the computer can only handle it as an integer, which is why the expression above appears.
However, I couldn't figure out the meaning of the bitwise operations (>> 3, >> 15).
I mean... Why is " >> 3 " generated with 49 instead of 52429?
Please let me know more than what I know...
0 Comments
Accepted Answer
Andy Bartlett
6 Mar 2023
Quick Answer:
Shifts are Multiply or Divide by a Power of Two
Shift left of M bits is equivalent to multiplication by 2^M when no overflow occurs.
Arithmetic shift right of N bits is equivalent to dividing by 2^N followed by rounding toward floor.
Integer shift left and shift right are fast machine instructions that take just one clock cycle on most modern microcontrollers.
Shifts are also much more efficient than division.
When you see shifts in generated C code for fixed-point math, the shifts usually mean multiply by 2^M or divide by 2^N.
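For example (a small standalone C sketch, not taken from any generated file), the equivalence for unsigned integers looks like this:

#include <stdio.h>
#include <stdint.h>

/* Sketch: for unsigned integers, a left shift by M multiplies by 2^M and
   a right shift by N divides by 2^N, discarding the remainder
   (i.e., rounding toward floor). */
int main(void)
{
    uint32_t x = 45U;

    printf("%u %u\n", x << 3, x * 8U);   /* both print 360: 45 * 2^3       */
    printf("%u %u\n", x >> 2, x / 4U);   /* both print 11: floor(45 / 2^2) */
    return 0;
}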
So
( ( ( Value * 49 ) >> 3 ) * 52429 ) >> 15
is mathematically equivalent to
floor( ( floor( ( Value * 49 ) * 2^-3 ) * 52429 ) * 2^-15 )
which if we ignore the rounding is
Value * ( 49 * 2^-3 * 52429 * 2^-15)
or, combining constants, shows the expected math (with input and output both uint16).
Value * 9.8
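As a quick numeric check, here is a small sketch (plain C, mirroring the generated expression with intermediate variables; not the actual generated file) for one sample input that does not overflow:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint16_t Value = 1000U;   /* sample input; 1000 * 9.8 = 9800 fits in uint16 */

    /* Stage 1: multiply by 49, shift right 3, then downcast to uint16 */
    uint32_t stage1 = (uint16_t)((49U * (uint32_t)Value) >> 3);

    /* Stage 2: multiply by 52429 and shift right 15 */
    uint16_t out = (uint16_t)((stage1 * 52429U) >> 15);

    printf("generated math: %u\n", out);         /* 9799        */
    printf("ideal value   : %f\n", Value * 9.8); /* 9800.000000 */
    return 0;
}

The one-count difference comes from the floor rounding introduced by the two right shifts.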
Long Answer:
Assumptions
Based on the generated code, I assume
- you've configured the gain's output data type to be uint16.
- all the fixed-point scaling Biases are zero.
Ideal math
The ideal math for a gain in real-world values is
VY = VU * VG
where VG is the gain parameter value.
Replacing each of the variables with the zero-bias fixed-point scaling equation
V = Slope * Q
where Q is the stored integer value, gives
SlopeY * QY = SlopeU * QU * SlopeG * QG
solving for QY gives
QY = QU * QG * (SlopeU * SlopeG / SlopeY)
Using the known values
Based on the given information, we have
SlopeU = 1.0
SlopeY = 1.0
SlopeG = 0.1
QG = 98
Substituting gives
QY = QU * 98 * (1 * 0.1 / 1)
QY = QU * 98 * 0.1
or
QY = QU * 9.8
This is the "ideal math" expressed in terms of the stored integer values.
We expect the generated code to be equivalent to this except for rounding or overflow issues.
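For concreteness, here is a tiny sketch of that substitution (plain C with doubles, using the slopes and gain value from above; it only restates the arithmetic and is not generated code):

#include <stdio.h>

int main(void)
{
    double SlopeU = 1.0, SlopeY = 1.0, SlopeG = 0.1;
    double VG = 9.8;
    double QG = VG / SlopeG;   /* stored integer of the gain: 98     */
    double QU = 1000.0;        /* sample stored integer of the input */

    double QY = QU * QG * (SlopeU * SlopeG / SlopeY);
    printf("QG = %g, QY = %g\n", QG, QY);   /* QG = 98, QY = 9800 */
    return 0;
}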
Why the math is done in two parts
The math for implementing the gain is effectively done in two parts
QTemp = ( Value * 49 ) >> 3
QY = ( QTemp * 52429 ) >> 15
One reason the math is broken into two parts is that the Gain block is designed to treat the gain parameter as if it could be tuned at some later point to a different stored integer value. So it starts with stored integer value 98, but it might later be tuned to another unsigned 16-bit stored integer value, such as 65001.
So the gain does NOT attempt to optimize the ideal math for
QY = QU * 98 * 0.1
Instead it tries to optimize the generated code for the ideal math
QY = QU * QG * 0.1
This could be implemented as
QY = QU * QG * 52429 * 2^-19
where 52429 * 2^-19 approximates 0.1.
But the pair of integer multiplications QU * QG * 52429 requires 48 bits to avoid all possible overflows over the range of tunable values for QG. For the Simulink model's specified production hardware target, the Gain block is trying to avoid going above 32-bit integers.
The operation is therefore split into two parts, with each part needing no more than 32 bits.
QY = ( QU * QG * 2^-4 ) * 52429 * 2^-15
or conceptually
QTemp = QU * QG * 2^-4
QY = QTemp * 52429 * 2^-15
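The following sketch (hypothetical C, sample values only) contrasts the single-step product, which needs up to 48 bits and therefore 64-bit arithmetic, with the two-stage form, where for in-range inputs each product fits in 32 bits:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t QU = 6000U;   /* sample stored input                  */
    uint32_t QG = 98U;     /* stored integer of the gain (tunable) */

    /* Single step: QU * QG * 52429 can approach 2^16 * 2^16 * 2^16,
       so holding the worst case needs about 48 bits (64-bit math). */
    uint64_t full = (uint64_t)QU * QG * 52429U;
    uint32_t QY_single = (uint32_t)(full >> 19);

    /* Two stages: for in-range inputs, both products fit in 32 bits
       (the generated code additionally downcasts QTemp to uint16). */
    uint32_t QTemp    = (QU * QG) >> 4;
    uint32_t QY_split = (QTemp * 52429U) >> 15;

    printf("single: %u, split: %u\n", QY_single, QY_split);  /* 58800, 58800 */
    return 0;
}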
Code Generation Optimization
Later when code is finally generated, the gain parameter is known to be constant.
The code generator can exploit this information, provided doing so doesn't change the defined math behavior.
The coder sees this math to be implemented:
QTemp = QU * 98 * 2^-4
QY = QTemp * 52429 * 2^-15
Since 98 in binary has a trailing zero, this
QTemp = QU * 98 * 2^-4
can be replaced with the slightly simpler
QTemp = QU * 49 * 2^-3
Now replacing the multiplications and divisions by powers of two with shifts gives
QTemp = (QU * 49) >> 3
QY = ( QTemp * 52429 ) >> 15
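As a quick sanity check (a throwaway C sketch, not part of the generated code), the 98-with-shift-4 and 49-with-shift-3 forms agree for every uint16 input, since 98 == 2*49 and floor(2*x/16) == floor(x/8):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t QU;

    for (QU = 0U; QU <= 65535U; QU++) {
        if (((QU * 98U) >> 4) != ((QU * 49U) >> 3)) {
            printf("mismatch at %u\n", QU);   /* never reached */
            return 1;
        }
    }
    printf("identical for all uint16 inputs\n");
    return 0;
}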
This finishes the explanation of where the pieces of the generated code come from.
Recommendation
Using Slope and Bias scaling can improve accuracy. But to achieve the full accuracy and also get maximally efficient generated code, you want the output type to have scaling that matches the operation. So for gain block multiplication, you want
SlopeY = SlopeU * SlopeG
In your example, you've configured the scaling so that it does not match:
1 ~= 1 * 0.1
This causes less efficient code to be generated and removes the accuracy gains in representing 9.8 with Slope 0.1.
Either change the output of the gain to have SlopeY = 0.1.
Or use binary point scaling for the gain parameter. For example, if you use unsigned 16 bit best precision scaling
format long
g = fi( 9.8, 0, 16 )
QG = g.storedInteger
Then for cases that don't overflow the output type (uint16),
the worst case rounding error is just 0.8.
And the generated code is just
QY = ( QU * 40141U ) >> 12;
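For comparison with the earlier two-stage sketch, here is a sketch (hypothetical C, not the actual generated file) of that single-multiply form for the same sample input:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint16_t QU = 1000U;   /* sample input, within the non-overflowing range */

    /* 40141 is the stored integer of fi(9.8, 0, 16); 12 is its fraction length */
    uint16_t QY = (uint16_t)(((uint32_t)QU * 40141U) >> 12);

    printf("QY    = %u\n", QY);        /* 9800        */
    printf("ideal = %f\n", QU * 9.8);  /* 9800.000000 */
    return 0;
}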
2 Comments
Andy Bartlett
Edited: 7 Mar 2023
N-bit by M-bit Multiplies in C
When multiplying an N bit integer by an M bit integer, the full precision product will need N+M bits.
But N-bit by N-bit multiplications provided by the C language do NOT give the full precision 2N-bit product. They only give the least significant N bits of the product. So the trick in C to get the full precision product is to upcast the inputs to a data type that has M+N bits, then do the multiplication in that bigger type.
C normally provides two sizes of integer multiplies, int * int and long long * long long. (Note this assumes long is the same size as int or long long.) Usually, the int multiply has a 2X or more speed and code compactness advantage over the long long multiply. Multiplications bigger than long long are not native and require much bigger and slower code to implement.
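Here is a small sketch of that point (plain C; 32-bit operands are used because 16-bit operands get promoted to int before multiplying, which would hide the truncation):

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    uint32_t a = 100000U;
    uint32_t b = 100000U;   /* a * b = 10^10, which needs about 34 bits */

    uint32_t low  = a * b;            /* 32-bit multiply keeps only the low 32 bits */
    uint64_t full = (uint64_t)a * b;  /* upcast first to get the full product       */

    printf("low 32 bits : %" PRIu32 "\n", low);    /* 1410065408  */
    printf("full product: %" PRIu64 "\n", full);   /* 10000000000 */
    return 0;
}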
Your example is 16 bit by 16 bit multiplies
In the example you provided, the multiplications are a 16 bit input times a 16 bit input initially producing a 32 bit product. These can both be produced by a lean and fast int32 (or uint32) multiplication.
The first multiplication is easily observed to be 16 bit by 16 bit based on the input and gain parameter types.
The second multiplication is also 16 bit by 16 bit because an EXPLICIT DOWNCAST to uint16 was inserted after the first multiply and the shift right of 3 bits. Hence the second multiply can also be done with a lean and fast int32 (or uint32) multiplication.
Overflow Management is Part of the Strategy
The math you've configured, uint16 * 9.8 assigned back to uint16, will overflow for most of the possible inputs.
For example, for 6688 <= u <= 65535, the product u * 9.8 is greater than intmax('uint16') == 65535 so overflow is expected.
The implementation, by design, has structured the first stage to overflow slightly less often than the overall operation. Stage 1 will overflow for 10700 <= u. The expected overflows for 6688 <= u < 10700 will happen in the second stage. More importantly, for the "good cases" 0 <= u <= 6687, neither stage will overflow.
Keeping the multiplications lean and keeping the good cases free of overflows is why the 2^-19 was split up as 2^-4 for stage one and 2^-15 for stage 2.
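Those thresholds can be checked with a brute-force sketch (hypothetical C, not from the model) that scans all uint16 inputs:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t u;
    uint32_t first_ideal  = 0U;   /* smallest u where u * 9.8 exceeds 65535       */
    uint32_t first_stage1 = 0U;   /* smallest u where (u * 49) >> 3 exceeds 65535 */

    for (u = 1U; u <= 65535U; u++) {
        if (first_ideal == 0U && u * 9.8 > 65535.0) {
            first_ideal = u;
        }
        if (first_stage1 == 0U && ((u * 49U) >> 3) > 65535U) {
            first_stage1 = u;
        }
    }
    printf("ideal result overflows from u = %u\n", first_ideal);   /* 6688  */
    printf("stage 1 overflows from u = %u\n", first_stage1);       /* 10700 */
    return 0;
}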
More Answers (0)