FAQ: Why is a fixed-point type's Fraction Length or Integer Length sometimes negative?

Question

MathWorks Fixed Point Team on 7 Nov 2022

https://kr.mathworks.com/matlabcentral/answers/1845588-faq-why-is-a-fixed-point-type-s-fraction-length-or-integer-length-sometimes-negative

Edited: MathWorks Fixed Point Team on 7 Nov 2022

Sometimes fixed-point variables have fraction lengths that are negative.

a = 5.6632765314184e+15
a = 5.6633e+15
aFi = fi(a,1,4)
aFi = 
   5.6295e+15

          DataTypeMode: Fixed-point: binary point scaling
            Signedness: Signed
            WordLength: 4
        FractionLength: -50
curFractionLength = aFi.FractionLength
curFractionLength = -50

Sometimes fixed-point variables have negative integer lengths.

b = 0.00037;
bFi = fi(b,0,3)
bFi = 
   3.6621e-04

          DataTypeMode: Fixed-point: binary point scaling
            Signedness: Unsigned
            WordLength: 3
        FractionLength: 14
curIntegerLength = bFi.WordLength - bFi.FractionLength
curIntegerLength = -11

How can that be correct? Is that a bug?


Answer 1

MathWorks Fixed Point Team on 7 Nov 2022

https://kr.mathworks.com/matlabcentral/answers/1845588-faq-why-is-a-fixed-point-type-s-fraction-length-or-integer-length-sometimes-negative#answer_1093498

Edited: MathWorks Fixed Point Team on 7 Nov 2022


Maximizing precision for limited bits

The ability to have a negative fraction length or a negative integer length is a very good thing. For a given number of bits, it allows a fixed-point type to give maximum precision for very big numbers and for very small numbers, respectively.

To see this benefit, look at the amount of error in the examples provided in the question.

a = 5.6632765314184e+15;
aFi = fi(a,1,4);
b = 0.00037;
bFi = fi(b,0,3);
error_a = double(aFi) - a;
relative_abs_error_a = abs(error_a) / abs(a)
relative_abs_error_a = 0.0060
error_b = double(bFi) - b;
relative_abs_error_b = abs(error_b) / abs(b)
relative_abs_error_b = 0.0102

Relative errors of about 0.6% and 1.0% are "pretty darn good" for just 4 bits and 3 bits of word length, respectively.
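To see where those quantized values come from, here is a minimal sketch in plain MATLAB arithmetic (no toolbox calls). It assumes round-to-nearest and simply copies the fraction lengths from the fi displays in the question; the variable names (flA, storedA, and so on) are made up for illustration and are not fi properties.

```matlab
% Fraction lengths copied from the fi displays in the question.
a = 5.6632765314184e+15;  flA = -50;   % signed, 4-bit word
b = 0.00037;              flB = 14;    % unsigned, 3-bit word

% Stored integer = real-world value scaled by 2^FractionLength,
% rounded to nearest (an assumption of this sketch).
storedA = round(a * 2^flA);   % 5, fits the signed 4-bit range -8..7
storedB = round(b * 2^flB);   % 6, fits the unsigned 3-bit range 0..7

% Real-world value = stored integer * 2^(-FractionLength).
quantA = storedA * 2^(-flA);
quantB = storedB * 2^(-flB);

% Relative errors, computed the same way as in the answer above.
relErrA = abs(quantA - a) / abs(a);
relErrB = abs(quantB - b) / abs(b);

fprintf('storedA = %d, quantA = %.4e, relErrA = %.4f\n', storedA, quantA, relErrA);
fprintf('storedB = %d, quantB = %.4e, relErrB = %.4f\n', storedB, quantB, relErrB);
```

The only things that would live in memory are the small integers 5 and 6. The fraction length just says which power of two to scale them by.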

Fraction Length and Integer Length are Intuitive when "scaling is small".

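When thinking of fixed-point numbers using binary-point notation, fraction length and integer length are intuitive: the fraction length is the number of bits to the right of the binary point, and the integer length is the number of bits to the left.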

        Type           Real-World Value   Notation: Binary Point
 numerictype(0,3,0)         5           = 101.
 numerictype(0,3,1)         2.5         =  10.1
 numerictype(0,3,2)         1.25        =   1.01
 numerictype(0,3,3)         0.625       =    .101

FALSE Belief that binary-point must be adjacent to the bits.

If you only look at examples of binary-point displays, it is natural to form a FALSE belief that the binary point must be adjacent to the bits that make up the variable's word length. But that is not true.

Binary-point notation is too limiting. There is an extremely useful and well-known generalization that removes the needless constraint that the binary point be adjacent to the bits.

Scientific Notation Removes Needless Limits

Decimal-point notation is intuitive, but limiting. As you know, power of 10 scientific notation allows you to represent very big and very small numbers with good accuracy using far fewer digits than decimal-point notation would require.

Likewise, binary-point notation can be generalized with power of 2 scientific-style notation. Same concept, same benefits, more accuracy with few symbols for very big and very small numbers.

A flavor of binary scientific notation that uses integer-valued mantissas is shown here. Notice that the integer mantissas are multiplied by two raised to an exponent.

        Type            Real-World Value   Notation: Integer Mantissa in Binary and Pow2 Exponent
 numerictype(0,3,-2)        20           = 101        * 2^2
 numerictype(0,3,-1)        10           =  101       * 2^1
 numerictype(0,3,0)          5           =   101      * 2^0
 numerictype(0,3,1)          2.5         =    101     * 2^-1
 numerictype(0,3,2)          1.25        =     101    * 2^-2
 numerictype(0,3,3)          0.625       =      101   * 2^-3
 numerictype(0,3,4)          0.3125      =       101  * 2^-4
 numerictype(0,3,5)          0.15625     =        101 * 2^-5

The attributes of these eight types are

vecWordLengths =
     3     3     3     3     3     3     3     3
vecFractionLengths =
    -2    -1     0     1     2     3     4     5
vecIntegerLengths =
     5     4     3     2     1     0    -1    -2

Notice that the first two fraction lengths are negative and the last two integer lengths are negative.

But none of that is a problem. What really lives in the memory of the microcontroller, FPGA, or ASIC are the bits that make up the word length. The word length is the only thing that needs to be positive. Let's have some fun and call that the "Law of Bit Conservation".

If you embrace the binary scientific notation way of thinking about fixed-point types, then the type's fixed exponent becomes a more natural description. The fixed exponent is simply the negative of the fraction length.

vecFixedExponents =
     2     1     0    -1    -2    -3    -4    -5
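The attribute vectors above can be rebuilt with a few lines of plain MATLAB. This is only a sketch of the arithmetic: storedInteger and vecRealWorldValues are names made up here, and the other names are taken from the displays above.

```matlab
% Every row of the table uses the same 3-bit pattern 101, i.e. stored integer 5.
storedInteger = 5;
wordLength    = 3;

vecFractionLengths = -2:5;
vecWordLengths     = wordLength * ones(size(vecFractionLengths));

% Integer length is whatever is left of the word after the fraction length.
vecIntegerLengths  = vecWordLengths - vecFractionLengths;

% Fixed exponent is the negative of the fraction length.
vecFixedExponents  = -vecFractionLengths;

% Real-world value = integer mantissa * 2^(fixed exponent).
vecRealWorldValues = storedInteger .* 2.^vecFixedExponents;

disp(vecIntegerLengths)
disp(vecFixedExponents)
disp(vecRealWorldValues)
```

The word length stays at 3 in every column. Only the exponent moves, and that is what pushes the fraction length or integer length below zero.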

Conceptual padding bits

If you were really keen to see the binary-point display even when the integer length or fraction length is negative, what could you do? Just as you can convert a number in decimal scientific notation to a traditional decimal-point display, you can jam in padding slots on the left end or the right end as needed. For example, for 3e16, you would jam in 16 zeros after the 3 to get a decimal-point display. The same concept applies when taking binary scientific notation to a binary-point display.

For example, if the fraction length is negative, then you need to jam in -1 * FractionLength bits on the least significant end. These pad bits are always zero.

Likewise, if the integer length is negative, then you need to jam in -1 * IntegerLength bits on the most significant end. If the value is unsigned, then these pad bits are always zero. If the value is two's complement and non-negative, then these pad bits are all zero. If the value is two's complement and negative, then these pad bits are all one. A simplified way to describe all these cases is to say that the pad bits are just a sign extension. For unsigned types, the conceptual sign bit is always implicitly zero.
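As a rough illustration of the padding rules, the sketch below builds a binary-point display string from a bit pattern and a fraction length. It is plain MATLAB string handling, not a toolbox feature, and the names (bitPatterns, isNegative, displays) are hypothetical. The third case treats 101 as a negative two's complement value, so its pad bits are ones.

```matlab
% Three cases: unsigned with negative fraction length, unsigned with negative
% integer length, and negative two's complement with negative integer length.
bitPatterns     = {'101', '101', '101'};
fractionLengths = [-2, 5, 5];
isNegative      = [false, false, true];   % conceptual sign bit of each value

displays = cell(size(bitPatterns));
for k = 1:numel(bitPatterns)
    bits = bitPatterns{k};
    fl   = fractionLengths(k);
    il   = numel(bits) - fl;              % integer length = word length - fraction length

    % Negative fraction length: pad zeros on the least significant end.
    padRight = repmat('0', 1, max(0, -fl));

    % Negative integer length: sign-extend on the most significant end.
    signChar = char('0' + isNegative(k)); % '0' or '1'
    padLeft  = repmat(signChar, 1, max(0, -il));

    padded = [padLeft, bits, padRight];
    nInt   = max(il, 0);                  % digits to the left of the binary point
    displays{k} = [padded(1:nInt), '.', padded(nInt+1:end)];
end

disp(displays)
```

For the three cases this gives 10100. (which is 20), .00101 (which is 0.15625), and .11101 (the sign-extended form of -3 * 2^-5). The pad bits exist only in the display; the stored word is still 3 bits.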

This answer has shown how to reconcile the interconnected concepts of word length, fraction length, and integer length when one of the latter two is negative. The math is all consistent. The "Law of Bit Conservation" is satisfied in all the cases.

In summary,

Negative fraction lengths are great for maximized accuracy of very large numbers.

Negative integer lengths are great for maximized accuracy of very small numbers.

Binary-point thinking is more intuitive, but for a limited range of scalings.

Scientific notation is more general and gives the power of maximum accuracy for a limited number of bits.
