Half precision using GPU

Fernando on 10 Apr 2023
Commented: Fernando on 11 Apr 2023
Hello, I am trying to see whether I can run some code in half precision rather than single.
Before converting my code, I tried a very simple example:
A=gpuArray(magic(3));
A=half(A);
This gives me the error: No constructor 'half' with matching signature found.
Using half on the CPU works flawlessly.
Any idea whether this is supported at all? According to https://www.mathworks.com/help/gpucoder/ug/what-is-half-precision.html, it seems some GPUs should support it.
I am using a 16 GB RTX 3080 Mobile with R2022b.
  2 Comments
Walter Roberson on 11 Apr 2023
Perhaps
A=gpuArray(half(magic(3)))
??
I do not have a GPU available to test with
Fernando on 11 Apr 2023
Unfortunately, this won't work either. It gives: GPU arrays support only fundamental numeric or logical data types.

Accepted Answer

Joss Knight on 11 Apr 2023
As pointed out, gpuArray does not support half. The main reason is that half is an emulated type that is only meaningful for deployment to special hardware; it is not native to most processors. Feel free to investigate the use of half for code generation.
Do you just want to store data in half to save space on the GPU? You can use the following code to get something like the behaviour you're after:
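function u = toHalf(x)
    % Pack values into uint16 using a crude fp16-like layout:
    % 1 sign bit, 5 exponent bits, 10 fraction bits.
    realmaxHalf = single(65504);
    x = min(max(single(x),-realmaxHalf),realmaxHalf);  % clamp to the fp16 range
    [f,e] = log2(abs(x));                  % abs(x) = f.*2.^e with 0.5 <= f < 1
    sgn = uint16(x>=0);                    % 1 marks a non-negative value here
    sgnbit = bitshift(sgn,15);
    expbits = bitshift(uint16(e+15),10);   % exponent with a bias of 15
    fbits = uint16(f.*2.^10 - 1);
    u = bitor(bitor(sgnbit, expbits), fbits);
    u(x == 0) = 0;                         % reserve the all-zero pattern for zero
end
function x = fromHalf(u)
    % Unpack the uint16 produced by toHalf back into single.
    u = uint16(u);
    sgn = single(bitshift(u,-15));
    fbits = bitand(u,uint16(1023));        % low 10 bits
    f = single(fbits+1)./(2.^10);
    expbits = bitand(u,uint16(31744));     % bits 10 to 14
    e = single(bitshift(expbits,-10))-15;
    x = (sgn.*2-1).*f.*2.^e;
    x(u == 0) = 0;                         % all-zero pattern decodes to zero, element by element
end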
Note that this is a very crude implementation of fp16 that takes no account of NaNs, Infs, correct overflow behaviour, or denormals. The "half" version is just a uint16 with the data packed into it; you can't actually use it to compute anything in fp16.
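For reference, the limits that a full fp16 implementation would have to respect follow from the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits). The following is a minimal sketch in plain double arithmetic; it shows where the 65504 used as realmaxHalf comes from. The other variable names are chosen here for illustration only.

```matlab
% fp16 (IEEE 754 binary16) limits, computed in ordinary double precision.
% minNormalHalf and minDenormHalf are illustrative names, not from the answer.
realmaxHalf   = (2 - 2^-10) * 2^15;   % largest finite value: 65504
minNormalHalf = 2^-14;                % smallest normal value
minDenormHalf = 2^-24;                % smallest denormal (subnormal) value

fprintf('largest finite: %g\n', realmaxHalf);
fprintf('smallest normal: %g\n', minNormalHalf);
fprintf('smallest denormal: %g\n', minDenormHalf);
```

Values whose magnitude falls between the smallest denormal and the smallest normal are the denormals that the note says are not handled.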
  4 Comments
Joss Knight on 11 Apr 2023
'fraid not. No chance of that! Your only hope is to actually convert to int16 (by rescaling to some range), but you will find many blockers in the way such as integer overflow and unsupported mathematical operations. The code I gave you merely stores the number you have as a float into 16 bits; you can't actually do any computation with it.
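A rough sketch of the int16 rescaling idea, assuming the data is known to lie within a fixed range. The sample values, the bound xmax, and all variable names are made up for illustration and are not from the original comment. It also shows one of the blockers mentioned: MATLAB integer arithmetic saturates instead of wrapping or widening.

```matlab
% Hypothetical data with an assumed known bound on abs(x).
x    = single([-3.2 0 1.7 4.0]);
xmax = single(4);

% Map [-xmax, xmax] onto the int16 range.
scale = single(intmax('int16')) / xmax;   % 32767 / 4
q = int16(x .* scale);                    % int16() rounds to nearest and saturates

% Convert back to single before doing real arithmetic.
xr  = single(q) ./ scale;
err = max(abs(xr - x));                   % quantization error, about half a step

% Blocker: integer results saturate at the limits of int16.
s = q + q;                                % the last element stays at intmax('int16')
```

The quantization step is 1/scale, so precision is uniform across the range rather than relative to magnitude as it is with a floating-point type.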
Fernando on 11 Apr 2023
I see. The issue is that I gain more from having larger matrices than from having smaller ones with higher precision.
I guess I could try to work with your solution while I figure out another way or buy a better GPU.

More Answers (1)

Matt J on 11 Apr 2023
Edited: Matt J on 11 Apr 2023
GPU code generation does support half, but the Parallel Computing Toolbox, which is where gpuArray is defined, does not.
  3 Comments
Matt J on 11 Apr 2023
Edited: Matt J on 11 Apr 2023
You should probably break the data set into smaller chunks and process them sequentially. The GeForce RTX 3080 can only process about 70000 threads at a time anyway.
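A minimal sketch of chunked processing, assuming Parallel Computing Toolbox and a supported GPU are available. The data, the chunk size, and the operation (a column-wise sum of squares) are placeholders chosen for illustration; only one chunk is resident on the GPU at a time.

```matlab
% Stand-in host data set; in practice this would be the large matrix.
data = rand(1000, 64, 'single');
chunkSize = 16;                              % hypothetical number of columns per chunk
nCols = size(data, 2);
result = zeros(1, nCols, 'single');

for first = 1:chunkSize:nCols
    last = min(first + chunkSize - 1, nCols);
    chunk = gpuArray(data(:, first:last));   % upload one chunk to the GPU
    partial = sum(chunk.^2, 1);              % compute on the GPU
    result(first:last) = gather(partial);    % copy the result back to host memory
end
```

The chunk size would normally be chosen so that one chunk plus its temporaries fits comfortably in GPU memory.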
Fernando on 11 Apr 2023
Ok, I will try to look into this.
