Different results on different computers, Matlab 2021b - 64 bit, windows 64 bit, both Intel chips

Question

Roman Foell on 1 Dec 2021

https://kr.mathworks.com/matlabcentral/answers/1600765-different-results-on-different-computers-matlab-2021b-64-bit-windows-64-bit-both-intel-chips

Commented: Walter Roberson on 2 Nov 2023

Hello everyone,

I was testing some code on two different machines, both running 64-bit Windows and 64-bit MATLAB R2021b.

I was surprised that a simple operation on the same variables, with the same precision, produces slightly different results.

It is not a huge operation, just a vector-matrix multiplication of a 1x16 vector A with a 16x3 matrix B, both of class single, resulting in a 1x3 vector C.

I checked the bit representations of the entries of both the vector A and the matrix B, and they are exactly the same on the two machines.

But when I perform the multiplication C = A*B; the first entry of C differs between the two machines.

The funny thing is that when I perform C(1) = A*B(:,1); I get the same value on both machines, and I also get the same value on both machines (but the other of the two results) when I perform C(1) = sum(A.*B(:,1)');

To summarize:

when I perform C = A*B, the first entries are different on the two machines ('10111101111101110110011011110110' and '10111101111101110110011011111000')
when I perform C(1) = A*B(:,1), the values are the same on the two machines ('10111101111101110110011011111000')
when I perform C(1) = sum(A.*B(:,1)'), the values are the same on the two machines ('10111101111101110110011011110110')

How does this come, and which value to trust?

Thanks!

Roman Foell on 1 Dec 2021

Edited: Roman Foell on 1 Dec 2021

Edit: For information, I tested many such vector-matrix multiplications (40000 overall) with different vectors but the same matrix, and nearly all of them gave different results.

James Tursa on 1 Dec 2021

Edited: James Tursa on 1 Dec 2021

Can you post some actual small examples that demonstrate this? Either post the hex versions of the numbers, or maybe a mat file. I.e., post something so that we can use the exact same numbers to start with.

Do the two machines use different BLAS libraries? Or maybe the floating point rounding mode is set differently on the two machines for some reason.

Answer 1

Christine Tobler on 10 Dec 2021

https://kr.mathworks.com/matlabcentral/answers/1600765-different-results-on-different-computers-matlab-2021b-64-bit-windows-64-bit-both-intel-chips#answer_852250

Edited: Christine Tobler on 10 Dec 2021

First, about "which value to trust?"

Both values are equally trustworthy. The differences in results come down to applying the multiplications and additions of the matrix multiplication in a different order. There is no right or wrong choice of order, so it is made based on performance considerations for each hardware architecture.

So for an individual matrix, you could use the symbolic toolbox and compare which machine got closer to the exact result, but this will come down to random luck for any specific input matrix.
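As a minimal sketch of why the order of operations matters, the lines below use values chosen purely for illustration (they are not the A and B from the question). Single-precision addition is not associative, so two mathematically equivalent groupings of the same sum can round differently:

```matlab
% Illustrative values only; not the A and B from the question.
a = single(1e8);     % exactly representable in single precision
b = single(-1e8);
c = single(1);       % smaller than half the spacing of singles near 1e8

left  = (a + b) + c; % a + b is exactly 0, so the result is 1
right = a + (b + c); % b + c rounds back to -1e8, so the result is 0

fprintf('(a + b) + c = %g\n', left);
fprintf('a + (b + c) = %g\n', right);
fprintf('spacing of singles near 1e8: %g\n', eps(a));
```

Neither grouping is "the" correct one for general inputs. A kernel that accumulates the products in a different grouping is simply choosing a different parenthesization of the same sum.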

The harder question is "how does this come?"

We make sure that MATLAB commands are reproducible, by which we mean: If you're on the same MATLAB version, same machine, same OS, same number of threads allowed for MATLAB to use, no change in any deep-down BIOS settings, the outputs of the same command with the exact same inputs are always the same.

To see what might be going on, can you run

version -blas

on both of your machines? This will tell us which version of the library that we call for matrix multiplication has been chosen. I suspect the two machines might be using different instruction set levels (e.g., AVX2 vs AVX512).
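To make the comparison between machines easier, the relevant settings could be collected in one place. This is only a sketch: the struct name info and its field names are made up for illustration, while version, computer and maxNumCompThreads are standard MATLAB functions.

```matlab
% Collect settings that can influence floating-point results.
% The struct and its field names are hypothetical, chosen for this sketch.
info = struct();
info.release = version('-release');   % MATLAB release, e.g. '2021b'
info.blas    = version('-blas');      % BLAS library description
info.lapack  = version('-lapack');    % LAPACK library description
info.arch    = computer('arch');      % platform, e.g. 'win64'
info.threads = maxNumCompThreads;     % current maximum number of computational threads

disp(info);
```

Running this on each machine and comparing the output side by side shows whether the BLAS description or the thread count differs.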

Christine Tobler on 10 Dec 2021

Yes, I would say the differences are coming down to which branch is used. The newer machine has AVX512 registers, which allow faster speed, and the MKL library detects this and uses code which uses these new registers.

I'm not aware of a way to switch the branch chosen here by hand, although I assume there might be some deep-down way to do this.

However, even if possible, I wouldn't recommend doing this, because this would affect all computations in MATLAB, meaning that you wouldn't get some of the performance benefits of your new machine. Also, this would be a configuration of MATLAB that hasn't been tested (running AVX2 instructions on a machine that has AVX512 hardware), so you'd be losing some safety there, too.

The change in round-off you're experiencing here is likely to also happen with a MATLAB update, or at your next machine update, so changing to using AVX2 would only be delaying these types of portability concerns.

Andrew Roscoe on 23 Oct 2023

Edited: Andrew Roscoe on 25 Oct 2023

After a lot of digging I finally found this webpage and document set:

https://www.intel.com/content/www/us/en/docs/onemkl/developer-guide-windows/2023-2/overview.html

and the pdf version:

https://cdrdv2.intel.com/v1/dl/getContent/781016?fileName=onemkl_developer-guide-windows_2023.2-766692-781016.pdf

It is worth a read.

In particular, you CAN force MATLAB to use AVX2 across all machines if they are all AVX2 capable, so for example the Xeon and other new i7 CPUs will be "restrained" from AVX512 to AVX2. The performance loss does not seem to be significant on the Xeon-processor machine I tried, and it allows me to get bitwise-identical MATLAB (and more significantly, Simulink) simulations between multiple workstations running different i5/i7/Xeon processors that otherwise produce DIFFERENT results. I don't see any evidence (at least so far) that using AVX2 on the AVX512-capable machines produces incorrect results.

To get it to work, start MATLAB via a batch file that sets the environment variable MKL_ENABLE_INSTRUCTIONS, or, set that environment variable directly via Windows settings, BEFORE starting MATLAB.

In batch file:

set MKL_ENABLE_INSTRUCTIONS=AVX2
"C:\Program Files\MATLAB\R2021b\bin\matlab.exe" -singleCompThread

(or similar).

Then try version('-blas') at the MATLAB command prompt and check that all Xeon-type processors now say "AVX2" not "AVX512".

I am finding that I also need to constrain all MATLAB sessions to the same number of threads across workstations, as well as the same "AVX2" setting, to guarantee numerical repeatability. Practically this is easiest by just using a single thread for MATLAB/Simulink. This can be achieved either by using the -singleCompThread option when starting MATLAB, or by executing

maxNumCompThreads(1);

early in the MATLAB script that configures a simulation.
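A guard along the following lines could be placed at the top of such a configuration script. It is only a sketch: the variable names are made up, and 'AVX2' is assumed as the target setting because that is what the batch file above uses. Since the environment variable has to be set before MATLAB starts, the sketch only checks it and does not try to set it.

```matlab
% Sketch of a guard to run before configuring a simulation.
% expectedSetting is an assumption taken from the batch file above.
expectedSetting = 'AVX2';

mklSetting = getenv('MKL_ENABLE_INSTRUCTIONS');   % empty if the variable is not set
if ~strcmpi(mklSetting, expectedSetting)
    warning('MKL_ENABLE_INSTRUCTIONS is "%s", expected "%s".', ...
        mklSetting, expectedSetting);
end

% Restrict MATLAB to one computational thread; the call returns the previous maximum.
previousThreads = maxNumCompThreads(1);

% Restore the previous thread count when restoreThreads is cleared.
restoreThreads = onCleanup(@() maxNumCompThreads(previousThreads));

fprintf('Computational threads: %d (was %d)\n', maxNumCompThreads, previousThreads);
```

The onCleanup object is optional. It is there so that a session shared with other work gets its original thread count back once the variable is cleared.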

I did experiment with the other environment variable setting:

set MKL_CBWR=AVX2,STRICT

This did not seem to have any effect on MATLAB when I tried version('-blas'); I don't know why that doesn't work as per the documentation.

I also experimented with another environment variable setting:

set MKL_DEBUG_CPU_TYPE=5

This DID work, and seemed to be equivalent to

set MKL_ENABLE_INSTRUCTIONS=AVX2

but it isn't as well documented, so I chose MKL_ENABLE_INSTRUCTIONS.

Andrew Roscoe on 2 Nov 2023

Edited: Andrew Roscoe on 2 Nov 2023

A further finding is that to get the same Simulink results I need to bdclose() the relevant model and re-open it just prior to the simulation, OR, start a fresh MATLAB session. Otherwise, "something" in the cached memory of Simulink can cause a set of simulations to produce different numerical results if you re-run the set of simulations in a different order to the first time.

Deleting the slxc files, and/or the contents of the slprj directory, seems to have no effect. It seems to be something in the memory of the Simulink session, related to the open model, that is relevant.

This is rather annoying, because, while the time penalty for constraining to AVX2 and a single thread is not too bad (10-20% simulation slowdown), the time penalty for not being able to benefit from reduced JIT acceleration times, for simulations run in a sequence, is very large. Often the JIT acceleration time is nearly 50% of the total time required. For a sequence of simulations, if the simulations use the same model (or nearly the same model), the JIT acceleration time can be dropped to almost-zero or even zero, if Simulink realises that the model is similar to the one it just simulated.

BUT, to get numerical reproducibility I seem to have to bdclose() the model between every simulation, which means that the JIT acceleration takes the full time, every time, even if I leave the slprj directory intact.

Walter Roberson on 2 Nov 2023

@Andrew Roscoe

To check:

If you set the random number seed to a constant before each run, does the same problem happen? (I assume here that even if you do not knowingly use random numbers, that something in your model might just be using random numbers.)

Something else that can cause subtle differences is if the rounding mode somehow got changed. Rounding mode at the MATLAB level is not documented; it is set via system_dependent() or feature(); see https://undocumentedmatlab.com/articles/undocumented-feature-function

Answer 2

Roman Foell on 1 Dec 2021

https://kr.mathworks.com/matlabcentral/answers/1600765-different-results-on-different-computers-matlab-2021b-64-bit-windows-64-bit-both-intel-chips#answer_845330

Edited: Roman Foell on 2 Dec 2021

I attached the example variables A,B in my first post.

@James Tursa: How to check the BLAS setting? How to check the rounding setting for floating point?

Edit: According to https://de.mathworks.com/matlabcentral/answers/223952-configuration-of-lapack-and-blas-in-matlab the BLAS setting depends on the MATLAB version. Both machines use MATLAB R2021b.

Andres on 7 Dec 2021

I can also confirm different results on different computers.

Matlab Online gave the same results as in my previous comment, but

R2020a and R2021b on Intel(R) Core(TM) i5-7300U:

C1 = dec2bin(typecast( (A*B) * [1;0;0], 'uint32'))
C2 = dec2bin(typecast( A*B(:,1), 'uint32'))
C3 = dec2bin(typecast( sum(A.*B(:,1)'), 'uint32'))
N1 = (A*B) * [1;0;0] - A*B(:,1)
C1 =
    '10111101111101110110011011111000'
C2 =
    '10111101111101110110011011111000'
C3 =
    '10111101111101110110011011110110'
N1 =
  single
     0
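To put the size of the discrepancy in perspective, the two bit patterns from the output above can be decoded back into single values. This is a sketch with made-up variable names. It relies on both patterns sharing the same sign and exponent bits, so the difference of their integer representations counts units in the last place (ULPs).

```matlab
% Bit patterns copied from the output above.
bitsA = '10111101111101110110011011110110';   % sum(A.*B(:,1)')
bitsB = '10111101111101110110011011111000';   % A*B(:,1)

% Reinterpret the 32-bit patterns as single-precision values.
uA = uint32(bin2dec(bitsA));
uB = uint32(bin2dec(bitsB));
xA = typecast(uA, 'single');
xB = typecast(uB, 'single');

% Same sign and exponent, so the integer difference is the distance in ULPs.
ulpsApart = double(uB) - double(uA);
relDiff   = abs(double(xA) - double(xB)) / abs(double(xA));

fprintf('xA = %.10g, xB = %.10g\n', xA, xB);
fprintf('ULPs apart: %d, relative difference: %.3g (eps(''single'') = %.3g)\n', ...
    ulpsApart, relDiff, eps('single'));
```

The two results are 2 ULPs apart, which is a relative difference on the order of eps('single'), the size expected from ordinary rounding in a 16-term single-precision sum.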

Roman Foell on 7 Dec 2021

@Andres: Thanks, so you get the same results as I do. Could you figure out where this difference comes from?
