precision of double variables

Question

Arda Yigit 2022년 6월 17일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/1742685-precision-of-double-variables

댓글: Walter Roberson 2022년 6월 24일

채택된 답변: Walter Roberson

The code below outputs 1 and 1.1 (and not 0.9):

for ii = -1.1:0.1:1.1

if ii >= 0.9

disp(ii)

end

If the first line is replaced as the following:

for ii = -1.2:0.1:1.2

the output becomes 0.9, 1, 1.1 and 1.2.

The difference betwwen the actual value of the variable ii and the expected value is epsilon=2.2204e-16.

Does anyone have any idea why replacing 1.1 by 1.2 causes the behavior?

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

Walter Roberson 2022년 6월 18일

1
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/1742685-precision-of-double-variables#answer_987965

MATLAB Online에서 열기

Suppose you were using decimal to 4 decimal places.

1/3 == 0.3333
1/3 + 1/3 = 0.3333 + 0.3333 = 0.6666
2/3 = 0.6667 because that is the closest representable number to 4 decimal places

Notice that taking the closest representation to 1/3 and adding it to itself does not get you the closest representation to 2/3

You could clearly expand this to any finite number of decimal places. 1/3 is 3 repeated any finite number of times, add it to itself to get 6 repeated that number of times, but 2/3 is 6 repeating ending in 7.

This is thus an inherent problem in using any fixed numeric base to a finite number of places: there will exist some number that rounds up when calculated as a fraction but rounds down through repeated addition.

It happens that MATLAB uses industry standard IEEE 754 binary floating point, which is what is designed into your computer CPU (if you have a GPU then the GPU might be nearly the same but differ on handling very small magnitude numbers near 1e-308)

IEEE 754 binary floating point (and every other binary floating point) has the property that 1/10 cannot be exactly represented. This is for the same mathematical reasons that decimal cannot exactly represent 1/3 or 1/7 or 1/11: every system that represents numbers with finite precision in a fixed numeric base has the same deficiency for some numbers.

So 0.1+0.1 in binary floating point is not guaranteed to be exactly the same as starting with 0.2 directly.

The next thing you need to know is that "for" loops operate by cumulative addition. First value plus increment. Add the increment again. Add the increment again. And so on.

"for" loops do not work by taking the starting point and multiply the increment by the number of iterations and add to the base: that has slightly different effects on accumulated loss of precision.

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

Answer 2

David Goodmanson 2022년 6월 17일

1
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/1742685-precision-of-double-variables#answer_987855

편집: David Goodmanson 2022년 6월 17일

MATLAB Online에서 열기

Hi Arda,

you pretty much answered the question by mentioning precision. Most floating point numbers are not represented exactly in memory. That includes most rational numbers such as .9. So you can't always expect the floating point values in the for loop to exactly agree with a value established in some other way. In the following example, you would at least expect the two would-be values of .9 to agree with each other:

% save the values of ii using concatenation (not recommended but the list is small here)
iivals = [];
for ii = -1.1:0.1:1.1
    iivals = [iivals ii];
end
iivals(21) - .9             % 21st value should be .9
   ans =  -1.1102e-16
a = (-1.1:0.1:1.1);
a(21) - .9
   ans =  1.1102e-16

but they don't, and neither equals Matlab's best approximation to .9. The for loop value of ii is too small, so .9 doesn't get displayed. (Doing the 1.2 case will show that the value of ii is just over the line so .9 does get displayed).

To see how these values are stored in memory (Matlab uses the IEEE754 standard) you can use format hex:

   format hex
   iivals(21)
   a(21)
   .9
   
   ans = 3feccccccccccccc       % too small
   ans = 3fecccccccccccce       % too large
   ans = 3feccccccccccccd 

The fix is pretty simple: don't use floating point values in an indexing situation. For example, something like

   for ii = -11:11
       x = ii/10      % use this value to calculate stuff
       if ii >= 9     % use ii for equality-type checks
   end

avoids a lot of problems.

댓글 수: 20
이전 댓글 18개 표시이전 댓글 18개 숨기기

Stephen23 2022년 6월 21일

"only explains the behavior if we a priori assume that the parser treats the square brackets as superfluous"

Not at all: observing the behavior is sufficient to demonstrate that different algorithms are at play. Anything supplementary to that is merely amusing speculation (which will have zero effect on my code).

"...especially when the use of square brackets here could actually reflect a specific intent, i.e., evaluate values exactly as if creating the same for assignment, for which there could be reasons to do so."

Well, if this is turning into a discussion of what is documented or not, then your "specific intent" is definitely not documented anywhere, you just made that up. The actual purposes of square brackets is documented here:

https://www.mathworks.com/help/matlab/matlab_prog/matlab-operators-and-special-characters.html

Nope, there is absolutely nothing there about "evaluate values exactly as if creating the same for assignment". In your example nothing is being concatenated together (i.e. the documented purpose of square brackets), therefore the syntax parser and JIT engine are completely within their intended purposes to optimize such superfluous things away. I would certainly hope that they do.

So far no big surprises.

Walter Roberson 2022년 6월 22일

MATLAB Online에서 열기

Experimentally, for with =(colon) or =[colon] or =colon all compare the same, but storing into an array beforehand is different.

iivals0 = [];
for ii = (-1.1:0.1:1.1)
    iivals0 = [iivals0 ii];
end
iivals1 = [];
for ii = [-1.1:0.1:1.1]  % added brackets
    iivals1 = [iivals1 ii];
end
iivals2 = [];
for ii = -1.1:0.1:1.1  % added brackets
    iivals2 = [iivals2 ii];
end
avalues = -1.1:0.1:1.1;   %precompute 1
iivals3 = [];
for ii = avalues
    iivals3 = [iivals3 ii];
end
avalues = [-1.1:0.1:1.1];   %precompute 2
iivals4 = [];
for ii = avalues
    iivals4 = [iivals4 ii];
end
a = [-1.1:0.1:1.1];
parts = {a, iivals0, iivals1, iivals2, iivals3, iivals4};
for J = 1 : length(parts)
    for K = 1 : length(parts)
        compare(J, K) = isequal(parts{J}, parts{K});
    end
end
N = {'[:] bulk', '(:)', '[:]', ': loop', 'array', '[array]'};
compares = array2table(compare, 'VariableNames', N, 'RowNames', N)
compares = 6×6 table
                [:] bulk     (:)      [:]     : loop    array    [array]
                ________    _____    _____    ______    _____    _______

    [:] bulk     true       false    false    false     true      true  
    (:)          false      true     true     true      false     false 
    [:]          false      true     true     true      false     false 
    : loop       false      true     true     true      false     false 
    array        true       false    false    false     true      true  
    [array]      true       false    false    false     true      true  

Stephen23 2022년 6월 22일

편집: Stephen23 2022년 6월 22일

"In accordance with the doc pages, I coded the for statement that way with intent, namely the first use of square brackets on the linked doc page, which is "array construction,""

No, the colon operator already constructs an array. Your syntax cannot construct an array which already exists.

Whether that array values explicitly exist in memory or not is of academic interest only: the array has been constructed by the colon operator insomuch as MATLAB is able to say "here is an array which I will iterate over". Whether it stores all of the values at once or generates them on the fly is of absolutely no relevance to me, the user. The MATLAB documentation does not (never has, never will) tell us such details, in exactly the same way that they will never tell us in how a million other features are internally implemented and can be optimized and changed over different versions.

"Yet no array was constructed. So the parser ignored the brackets ..."

No. Your and Walter Roberson's incorrect understanding of "array construction" is not supported by the documentation. Nowhere in the MATLAB documentation does it state that using square brackets et al will explicitly create all elements in memory. "Array construction" is just concatenation by another name, which, just to emphasize again, your various example does not do and hence is optimized away. As they should be.

"Experimentally, for with =(colon) or =[colon] or =colon all compare the same, but storing into an array beforehand is different."

Thank you for taking the time to confirm what I expect and wrote.

So far absolutely no surprises.

"I agree that I would expect for=[colon] to work differently than for=colon."

Yet that is entirely inconsistent with the meaning of square brackets (concatenation), the optimization of FOR loops, the reality of all of your own tests, and the MATLAB documentation.

Stephen23 2022년 6월 22일

편집: Stephen23 2022년 6월 22일

MATLAB Online에서 열기

"Apparently the colon operator is not constructing an array when used with for."

It seems that we disagree on the definition of "construct" an array, which for me is something like "defines an array in such as way that its class, size, and element values can be accessed (e.g. by indexing)." Note that this definition does not require the elements to explicitly exist in the computer's memory (it does even refer to memory), but the array certainly has been constructed because its elements can be referred to by the user. Whether MATLAB stores them all in a long list in the computer's memory or magically makes them appear when I want them is to me, the user, a completely irrelevant implementation detail. The array has been constructed: it has a class, a size, and elements that can be retrieved.

This definition is also consistent with the fact that no computer has ever stored an array, just representations of them. Arrays are (much like triangles and numbers) abstract concepts that can only be written/stored using a representation of them. But there is nothing inherently special about one representation or another. Consider a sparse array: once the user has defined it, would you state that it has been "constructed" ? Most of the sparse array's elements are not in memory, yet we can get its class, size, and value of any element (including the zeros, which are not stored explicitly). So... is it constructed? Extend that argument to its logical conclusion: is there a canonical form for storing arrays in computer memory? No, there are infinite possible formats that we could store arrays in memory. Do these formats require storing elements explicitly? No, an infinite subset of those infinite formats could implicitly store any number of their values in an infinite number of ways (sparse are one example of this). Is such an array "constructed"? Yes, when we can access its class, size, and element values.

In contrast if we follow your argument to its logical conclusion we end up with the situation that only some of those infinite array formats meet your (so far unwritten) definition of "constructed" whereas other do not, even though the user can equally access the class, size, and element values of all of them (and sees no practical difference between them). As a simple MATLAB user, I see little added value in such a definition.

"Why does the Special Characters documentation show "Array construction" and "Array concatenation" as two different uses of Square Bracket if they are really the same thing?"

In one word: users. Users are not going to search for or understand "concatenate lots of scalar arrays" when they want to create a vector with some numbers. Would you? I certainly wouldn't have, when I was first learning MATLAB. It is reasonable that users will search for / understand "array construction", but in practice it is simply the concatenation of scalar arrays (all numbers in MATLAB are scalar arrays, there is no special scalar class). From a functional perspective there is absolutely no difference between "array construction" and concatenation of scalar arrays.

Note that if you claim these terms are fundamentally different, then in what precise way are these four operations actually different?:

[1,ones(1,3)]
[1,ones(1,2)]
[1,ones(1,1)]
[1,1]

Remember, scalars are not a special class, they are perfectly ordinary 1x1(x1x1...) arrays.

So, still absolutely no surprises here.

Walter Roberson 2022년 6월 22일

MATLAB Online에서 열기

@Stephen23

There are three documented syntaxes for "for"

initVal:endVal — Increment the index variable from initVal to endVal by 1, and repeat execution of statements until index is greater than endVal.
initVal:step:endVal — Increment index by the value step on each iteration, or decrements index when step is negative.
valArray — Create a column vector, index, from subsequent columns of array valArray on each iteration. For example, on the first iteration, index = valArray(:,1). The loop executes a maximum of n times, where n is the number of columns of valArray, given by numel(valArray(1,:)). The input valArray can be of any MATLAB® data type, including a character vector, cell array, or struct.

The forms with the colon operator directly after the = are defined in terms of incrementing a variable, rather than in terms of evaluating the colon operator and using the resulting array as the value array.

When you use

for ii = [-1.1:0.1:1.1]

then that does not match either of the first two syntaxes, and so must be a match for the third syntax -- which involves first evaluating the right hand side and then indexing into the result.

Therefore, according to the documentation

for ii = [-1.1:0.1:1.1]

should be the same as

iivals = [-1.1:0.1:1.1];
for ii = iivals

But, experimentally, it is not the same. Experimentally for ii = [-1.1:0.1:1.1] is the same as for ii = -1.1:0.1:1.1 which uses the increment strategy.

There is no documented special case for ii = [-1.1:0.1:1.1] as being the same as for ii = -1.1:0.1:1.1

@Paul is completely right to expect that the for ii = [-1.1:0.1:1.1] case will evaluate [-1.1:0.1:1.1] into a temporary internal location and then iterate over its columns, as that is what is documented as happening except for the initVal:endVal and initVal:step:endVal cases.

Stephen23 2022년 6월 23일

편집: Stephen23 2022년 6월 23일

"then that does not match either of the first two syntaxes, and so must be a match for the third syntax... "

Irrelevant. Either:

the MATLAB engine removes your superfluous fluff (unlike what you state, square brackets et al are not documented to actually create an explicit array in memory), in which case the 1st and 2nd syntaxes are perfectly relevant.
the definition I already gave (which you accepted by not disagreeing with it or providing your own definition) does not require the explicit listing of all values in memory. So the array is constructed, just as I defined, and MATLAB iterates over it, following the 3rd syntax.

So every syntax is consistent with my explanation already given.

"@Paul is completely right to expect that the for ii = [-1.1:0.1:1.1] case will evaluate [-1.1:0.1:1.1] into a temporary internal location"

Where is that supported by the documentation? Answer: nowhere. Nowere in the documentation does it talk about "temporary internal locations" or anything about the implementation. Never has, never will.

Question: why are you trying to convince me of something that disagrees with your own test results ? To be honest I am struggling to follow your argument:

provide test evidence of A
claim ~A

It is certainly an innovative strategy.

Paul 2022년 6월 23일

Hi @Stephen23

I've read this comment a few times and I feel like it puts too much emphasis on two issues.

First, I think we all understand how for operates in practice for the case under discussion (for ii = [colon]). We all understand @Walter Roberson's test results, and nobody is trying to use those results to demonstrate that for works any differently than it actually does.

Second, speaking for myself, my mental model of how I think for should work in this case is that the parser sees I'm asking for the third documented syntax, evaluates the valarray on RHS of the = sign, and then loops index over the columns of the valarray. It is appealing, at least to me, to think of this process as actually constructing an array in a termporary internal location to understand the algorithmic flow. But I certainly realize that such construction doesn't have to happen and I don't care if it does, as long as the end result is the same as if such an array was actually constructed.

What I do care about is if Matlab is self-consistent in its syntax and with the documentation. From a user perspective, I don't think it is (surely, for could be improved to mention this issue). Others may disagree.

Are you aware of any cases besides for where [colon] (or (colon) for that matter) in an expression is not functionally equivalent to how [colon] is evaluated on the RHS of an assignment statement?

Stephen23 2022년 6월 23일

편집: Stephen23 2022년 6월 23일

Hello Paul,

Good points, thank you for your comment.

I think what you term the "mental model" is highly relevant to this topic. Really all code (and by extension, all explanations/mental models of code) are just abstractions of what actually happens deep inside the computer. Thus we must consider this paradigm:

https://en.wikipedia.org/wiki/All_models_are_wrong

Levels of abstraction suit different purposes (and also incite endless flame wars about whose language rulz)... but none the less, every model must necessarily simplify, which means they are wrong.. But some models are useful, because they make it easier for us understand or do something (example: any map).

MATLAB suits a particular type of work, it is not a low-level language. Not only that, TMW have made it quite clear that they will not give details of the low-level implementation or optimization, with (IMHO) good reason (for the kind of language it is):

https://www.mathworks.com/matlabcentral/answers/430098-documentation-on-jit-compiler#comment_638005

All kinds of things can and regularly do change in the optimization. As you wrote, "...I don't care if it does, as long as the end result is the same..."

So... what is the meaning of "same" ?

Over the years I have seen minor tweaks to various functions which changed the output bits just slightly, i.e. the usual binary floating point noise. And then some users would complain, because algorithm B delivered different bits to algorithm A, which they previously used and loved. Did TMW do something wrong? Are the outputs the "same" ? Inasmuch as both algorithms are numerically equally good (or equally bad), returning the same values but slightly different numerically meaningless floating point noise, ... are the values the "same"? From a scientific point of view, they are indistinguishable (error propogation is not just related to numeric computing, but is a well-defined topic that is relevant to all measurements and calculations). Or, in George Box's terms, the difference is not "importantly wrong".

That is my mental model of MATLAB: as a user, my code should not rely on a specific algorithm that delivers specific bits (because these are numerically meaningless noise quadrillions of times smaller than the precision of my data) that TMW have already explicitly stated that they will not document. I find that model useful.

"Are you aware of any cases besides for where [colon] (or (colon) for that matter) in an expression is not functionally equivalent to how [colon] is evaluated on the RHS of an assignment statement?"

They are already functionally equivalent.

I guess you are actually asking about the implentation: I can't think of any other example... but then according to my mental model there is also no reason to assume that RHS assignment necessarily requires an explicit array in memory either: perhaps this year's release does, but that is no reason to assume that next year's relase will not do something else. As long as they are functionally the same, I don't care.

Walter Roberson 2022년 6월 24일

@Stephen23

The third form of for is defined as doing indexing. It would be against the documentation to use any kind of incremental computation for the third form.

댓글을 달려면 로그인하십시오.

Answer 3

Jan 2022년 6월 17일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/1742685-precision-of-double-variables#answer_987835

편집: Jan 2022년 6월 17일

MATLAB Online에서 열기

Welcome to the world of numerics with limited precision.

These are the expected effects. You observe the value mentioned in the frequently asked question:

FAQ: Why is 0.3 - 0.2 - 0.1 not equal to zero?

If you start at -1.1, you see the deviation at 1.0:

for ii = -1.1:0.1:1.1
    if ii >= 1.0
        disp(ii)
    end
end
     1

    1.1000

If you start at -1.2 you see it at 0.9, so this is simply shifted by 0.1 .

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

precision of double variables

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

채택된 답변

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

추가 답변 (2개)

댓글 수: 20
이전 댓글 18개 표시이전 댓글 18개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

제품

릴리스

Community Treasure Hunt

precision of double variables

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

채택된 답변

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

추가 답변 (2개)

댓글 수: 20 이전 댓글 18개 표시이전 댓글 18개 숨기기

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

제품

릴리스

Community Treasure Hunt

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글 수: 20
이전 댓글 18개 표시이전 댓글 18개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기