Problem evaluating very small numbers

Question

Marc Laub 2023년 12월 29일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/2064956-problem-evaluating-very-small-numbers

댓글: Steven Lord 2024년 1월 1일

Hey, following problem

I trie to calculate probabilities from a bernoulli process.

So lets say I have a pool with 2 kind of balls in it, A and B. P_A=A/(A+B) is the probability to pick A and P_B=B/(A+B) is the probability to pick B.

My question know is to calculate the probability to pick a certain composition.

So when I take a bucket and draw some balls, my expected percentage of A in the bucket is P_A.

The spread, standart deviation is given by :sigma=sqrt(var)=sqrt(P_A*P_B/N), with N beeing the sample size in the bucket.

Now I just sweep over the composition space and evaluate my propabilities with given parameters and put them into a norma distribution.

My problem now is, that sample size N is dependent on the ratio in the bucket itsself and there are ceratin ratios that are (since it is a thermodynamic problem) impossible.

This leads to the fact that compositions where the probability would be highest is impossible (sigma=0) and the only compositions that are theoretically possible have such a low probability, that their probabiity is evaluated as zero, but in fact must be larger than zero.

Then pikcing rand composition ccording to their probability leads to an error since there is no probability value larger 0...

Is there some trick to avaluate those non zero probabilities and overcome the limit of eps(0)=4.9407e-324

Just some example numbers:

Cmp= 0.96; %expected value
x=linspace(0,1,100);
N=[-1040.60913764593	-1274.45152017157	-1517.53917725789	-1795.99182909355	-2122.34693486558	-2510.06682714684	-2975.64759019587	-3540.23709594117	-4231.60030785356	-5086.84402905997	-6156.38134667049	-7509.86596572323	-9245.29486431790	-11503.3241150237	-14490.4056493614	-18517.3387573243	-24065.7714122282	-31907.5747396386	-43329.2815472576	-60577.7164494251	-87804.5311543582	-133234.395760636	-214664.544494865	-375335.735182308	-738466.463858388	-1748414.20821032	-5774747.55005333	-41795919.1425974	-90501964828.2929	72025044.7727275	8009861.69662709	2313015.83600968	972090.250808377	500079.010918636	291994.059652964	185962.451073684	126170.032691603	89826.2196141791	66420.8965345887	50639.6682593742	39593.8852638151	31619.0994776238	25708.9843417277	21230.0434065884	17769.5292145238	15050.4283306574	12881.9596492194	11129.7129415162	9697.06783111667	8513.31060620962	7525.84673350056	6694.98454205075	5990.37155119071	5388.51592935915	4871.03437008720	4423.39490718085	4034.00244586203	3693.52515537666	3394.39247493320	3130.41695426230	2896.50651023482	2688.44342683401	2502.71313027739	2336.37044190681	2186.93430415773	2052.30432191963	1930.69415199415	1820.57800282139	1720.64740916629	1629.77611481347	1546.99139540023	1471.45052919991	1402.42140865373	1339.26650328765	1281.42955245236	1228.42449663107	1179.82625816233	1135.26306309632	1094.41006084980	1056.98405150267	1022.73917534666	991.463458577286	962.976145552797	937.125784786936	913.789076401132	892.870538113866	874.303112577631	858.049933356184	844.107611471935	832.511638100216	823.344895877077	816.750981076786	812.955382999989	812.300289399375	815.304778866772	822.776840476043	836.045264946388	857.525633118955	892.577334329355	966.725416708172];
var=Cmp*(1-Cmp)/N
simga=real(sqrt(var));
P=1./(sigma*sqrt(2*pi)).*exp(-0.5*((x-Cmp)./sigma).^2);

P(30:end) are evaluated as zero, where in fact must be >0 and those values are important for me, because with those values being zero, there is no solution, wher ein fact there is a solution, just with a very small probability.

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

Steven Lord 2024년 1월 1일

So lets say I have a pool with 2 kind of balls in it, A and B. P_A=A/(A+B) is the probability to pick A and P_B=B/(A+B) is the probability to pick B.

What are typical values for A and B for your problem? It's not clear to me how the "example numbers" in your code segment are related to the problem you described. You have defined a value N, but it's also not clear to me how you can have a sample size of -1040.60913764593.

Are you drawing with or without replacement?

This leads to the fact that compositions where the probability would be highest is impossible (sigma=0) and the only compositions that are theoretically possible have such a low probability, that their probabiity is evaluated as zero, but in fact must be larger than zero.

By the infinite monkey theorem there is a non-zero probability of a monkey sitting at a typewriter and producing the complete works of William Shakespeare. But one of the examples on the orders of magnitude (for numbers) Wikipedia page is for the probability of a monkey reproducing just one of those works, Hamlet. That probability (which is almost certainly never going to happen in this universe) is much, MUCH greater than one of the probabilities Walter Roberson computed in a comments (of 1e-864000000)

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

Ayush 2023년 12월 29일

1
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/2064956-problem-evaluating-very-small-numbers#answer_1380066

Hi @Marc Laub

I understand that you are finding your probabilities being rounded down to zero due to numerical precision issues, you might consider working with log-probabilities instead. When you need to compare probabilities or normalize them, you can do so in the log space, which can help manage issues with numerical underflow.

To calculate the log of the value you may read: https://www.mathworks.com/help/matlab/ref/log.html

Thanks,

Ayush

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

Marc Laub 2023년 12월 31일

I tried and saved my probability matrices as log-probabilities.

I then tried to spread the log-probabilities onto the whole space from -10^-308 till +10^308, instead of from -10^-308 to 1 as it would be for a normalized function. problem now is, that the function randsample intrinsic normalizes the probabilities which destroys all efford.... since it needs to take the some over all probabilities for the normaization and the sum of numbers that are spread up to +10^308 is unfortunately Inf... so all probabilities are Inf after normalization within the randperm function.....

댓글을 달려면 로그인하십시오.

Answer 2

Hassaan 2023년 12월 29일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/2064956-problem-evaluating-very-small-numbers#answer_1380096

MATLAB Online에서 열기

% Assuming you have some values for p_A, p_B, and N
p_A = 0.96; % Just an example value
p_B = 1 - p_A;
N = 100; % Example sample size
% Calculating variance using the provided formula
variance = p_A * p_B / N;
% If the variance is too small, consider using vpa for higher precision
variance = vpa(variance, 50); % 50 digits of precision
% Calculating standard deviation
std_dev = sqrt(variance);
% If you need to evaluate the probability density function (pdf)
% at a particular point x, you could use the normpdf function.
% However, if the std_dev is too small, this could be problematic.
% Instead, you can compute the log-pdf.
% Example x value
x = 0.95;
% Log-pdf calculation
log_pdf = -0.5 * ((x - p_A) / std_dev)^2 - log(std_dev) - 0.5 * log(2*pi);
% If you need the actual pdf value, convert back from the log scale
pdf = exp(log_pdf);
disp(pdf)
17.872927470476997409064296230165

------------------------------------------------------------------------------------------------------------------------------------------------

If you find the solution helpful and it resolves your issue, it would be greatly appreciated if you could accept the answer. Also, leaving an upvote and a comment are also wonderful ways to provide feedback.

댓글 수: 3
이전 댓글 1개 표시이전 댓글 1개 숨기기

Walter Roberson 2023년 12월 31일

MATLAB Online에서 열기

variance = vpa(variance, 50);

You cannot just introduce vpa at that point and expect it to improve precision.

Walter Roberson 2023년 12월 31일

MATLAB Online에서 열기

Q = @(v) sym(v);

% Assuming you have some values for p_A, p_B, and N

p_A = Q(0.96); % Just an example value

p_B = 1 - p_A;

N = Q(100); % Example sample size

% Calculating variance using the provided formula

variance = p_A * p_B / N

variance =

% Calculating standard deviation

std_dev = sqrt(variance)

std_dev =

% If you need to evaluate the probability density function (pdf)

% at a particular point x, you could use the normpdf function.

% However, if the std_dev is too small, this could be problematic.

% Instead, you can compute the log-pdf.

% Example x value

x = Q(0.95);

% Log-pdf calculation

log_pdf = -Q(0.5) * ((x - p_A) / std_dev)^2 - log(std_dev) - Q(0.5) * log(2*Q(pi));

% If you need the actual pdf value, convert back from the log scale

pdf = exp(log_pdf);

disp(pdf)

댓글을 달려면 로그인하십시오.

Answer 3

Walter Roberson 2023년 12월 31일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/2064956-problem-evaluating-very-small-numbers#answer_1380936

Is there some trick to avaluate those non zero probabilities and overcome the limit of eps(0)=4.9407e-324

Not without using the Symbolic Toolbox, which you have already indicated you are not willing to do.

댓글 수: 4
이전 댓글 2개 표시이전 댓글 2개 숨기기

Walter Roberson 2024년 1월 1일

편집: Walter Roberson 2024년 1월 1일

MATLAB Online에서 열기

You have problems because you take real(sqrt(var)) when var contains a bunch of negatives. real(sqrt()) comes out as 0. You end up multiplying and dividing by 0.

Q = @(v) sym(v);

Cmp = Q(0.96); %expected value

x = Q(linspace(0,1,101));

N = Q([-1040.60913764593 -1274.45152017157 -1517.53917725789 -1795.99182909355 -2122.34693486558 -2510.06682714684 -2975.64759019587 -3540.23709594117 -4231.60030785356 -5086.84402905997 -6156.38134667049 -7509.86596572323 -9245.29486431790 -11503.3241150237 -14490.4056493614 -18517.3387573243 -24065.7714122282 -31907.5747396386 -43329.2815472576 -60577.7164494251 -87804.5311543582 -133234.395760636 -214664.544494865 -375335.735182308 -738466.463858388 -1748414.20821032 -5774747.55005333 -41795919.1425974 -90501964828.2929 72025044.7727275 8009861.69662709 2313015.83600968 972090.250808377 500079.010918636 291994.059652964 185962.451073684 126170.032691603 89826.2196141791 66420.8965345887 50639.6682593742 39593.8852638151 31619.0994776238 25708.9843417277 21230.0434065884 17769.5292145238 15050.4283306574 12881.9596492194 11129.7129415162 9697.06783111667 8513.31060620962 7525.84673350056 6694.98454205075 5990.37155119071 5388.51592935915 4871.03437008720 4423.39490718085 4034.00244586203 3693.52515537666 3394.39247493320 3130.41695426230 2896.50651023482 2688.44342683401 2502.71313027739 2336.37044190681 2186.93430415773 2052.30432191963 1930.69415199415 1820.57800282139 1720.64740916629 1629.77611481347 1546.99139540023 1471.45052919991 1402.42140865373 1339.26650328765 1281.42955245236 1228.42449663107 1179.82625816233 1135.26306309632 1094.41006084980 1056.98405150267 1022.73917534666 991.463458577286 962.976145552797 937.125784786936 913.789076401132 892.870538113866 874.303112577631 858.049933356184 844.107611471935 832.511638100216 823.344895877077 816.750981076786 812.955382999989 812.300289399375 815.304778866772 822.776840476043 836.045264946388 857.525633118955 892.577334329355 966.725416708172]);

var = Cmp*(1-Cmp)./N

var =

sigma = real(sqrt(var))

sigma =

part1 = 1./(sigma*sqrt(2*Q(pi)));

part2 = exp(-0.5*((x(:)-Cmp)./sigma).^2);

P = part1 .* part2;

surf(double(P), 'edgecolor', 'none')

Walter Roberson 2024년 1월 1일

MATLAB Online에서 열기

Q = @(v) sym(v);
Cmp = Q(0.96); %expected value
x = Q(linspace(0,1,101));
N = Q([-1040.60913764593	-1274.45152017157	-1517.53917725789	-1795.99182909355	-2122.34693486558	-2510.06682714684	-2975.64759019587	-3540.23709594117	-4231.60030785356	-5086.84402905997	-6156.38134667049	-7509.86596572323	-9245.29486431790	-11503.3241150237	-14490.4056493614	-18517.3387573243	-24065.7714122282	-31907.5747396386	-43329.2815472576	-60577.7164494251	-87804.5311543582	-133234.395760636	-214664.544494865	-375335.735182308	-738466.463858388	-1748414.20821032	-5774747.55005333	-41795919.1425974	-90501964828.2929	72025044.7727275	8009861.69662709	2313015.83600968	972090.250808377	500079.010918636	291994.059652964	185962.451073684	126170.032691603	89826.2196141791	66420.8965345887	50639.6682593742	39593.8852638151	31619.0994776238	25708.9843417277	21230.0434065884	17769.5292145238	15050.4283306574	12881.9596492194	11129.7129415162	9697.06783111667	8513.31060620962	7525.84673350056	6694.98454205075	5990.37155119071	5388.51592935915	4871.03437008720	4423.39490718085	4034.00244586203	3693.52515537666	3394.39247493320	3130.41695426230	2896.50651023482	2688.44342683401	2502.71313027739	2336.37044190681	2186.93430415773	2052.30432191963	1930.69415199415	1820.57800282139	1720.64740916629	1629.77611481347	1546.99139540023	1471.45052919991	1402.42140865373	1339.26650328765	1281.42955245236	1228.42449663107	1179.82625816233	1135.26306309632	1094.41006084980	1056.98405150267	1022.73917534666	991.463458577286	962.976145552797	937.125784786936	913.789076401132	892.870538113866	874.303112577631	858.049933356184	844.107611471935	832.511638100216	823.344895877077	816.750981076786	812.955382999989	812.300289399375	815.304778866772	822.776840476043	836.045264946388	857.525633118955	892.577334329355	966.725416708172]);
var = Cmp*(1-Cmp)./N;
sigma = real(sqrt(var));
part1 = 1./(sigma*sqrt(2*Q(pi)));
part2 = exp(-0.5*((x(:)-Cmp)./sigma).^2);
P = part1 .* part2;
nnz(isnan(double(P))) / numel(P) * 100
ans = 29.0000

29% of your solution comes out NaN

Walter Roberson 2024년 1월 1일

MATLAB Online에서 열기

format long g

Q = @(v) sym(v);

Cmp = Q(0.96); %expected value

x = Q(linspace(0,1,101));

N = Q([-1040.60913764593 -1274.45152017157 -1517.53917725789 -1795.99182909355 -2122.34693486558 -2510.06682714684 -2975.64759019587 -3540.23709594117 -4231.60030785356 -5086.84402905997 -6156.38134667049 -7509.86596572323 -9245.29486431790 -11503.3241150237 -14490.4056493614 -18517.3387573243 -24065.7714122282 -31907.5747396386 -43329.2815472576 -60577.7164494251 -87804.5311543582 -133234.395760636 -214664.544494865 -375335.735182308 -738466.463858388 -1748414.20821032 -5774747.55005333 -41795919.1425974 -90501964828.2929 72025044.7727275 8009861.69662709 2313015.83600968 972090.250808377 500079.010918636 291994.059652964 185962.451073684 126170.032691603 89826.2196141791 66420.8965345887 50639.6682593742 39593.8852638151 31619.0994776238 25708.9843417277 21230.0434065884 17769.5292145238 15050.4283306574 12881.9596492194 11129.7129415162 9697.06783111667 8513.31060620962 7525.84673350056 6694.98454205075 5990.37155119071 5388.51592935915 4871.03437008720 4423.39490718085 4034.00244586203 3693.52515537666 3394.39247493320 3130.41695426230 2896.50651023482 2688.44342683401 2502.71313027739 2336.37044190681 2186.93430415773 2052.30432191963 1930.69415199415 1820.57800282139 1720.64740916629 1629.77611481347 1546.99139540023 1471.45052919991 1402.42140865373 1339.26650328765 1281.42955245236 1228.42449663107 1179.82625816233 1135.26306309632 1094.41006084980 1056.98405150267 1022.73917534666 991.463458577286 962.976145552797 937.125784786936 913.789076401132 892.870538113866 874.303112577631 858.049933356184 844.107611471935 832.511638100216 823.344895877077 816.750981076786 812.955382999989 812.300289399375 815.304778866772 822.776840476043 836.045264946388 857.525633118955 892.577334329355 966.725416708172]);

var = Cmp*(1-Cmp)./N;

sigma = real(sqrt(var));

part1 = 1./(sigma*sqrt(2*Q(pi)));

part2 = exp(-0.5*((x(:)-Cmp)./sigma).^2);

P = part1 .* part2;

minP = min(P,[],'all')

minP =

double(minP)

ans =

0

maxP = max(P,[],'all')

maxP =

double(maxP)

ans =

17277.7116645823

simplify(log(minP))

ans =

double(ans)

ans =

-864300527.515557

댓글을 달려면 로그인하십시오.

Problem evaluating very small numbers

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

답변 (3개)

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

댓글 수: 3
이전 댓글 1개 표시이전 댓글 1개 숨기기

댓글 수: 4
이전 댓글 2개 표시이전 댓글 2개 숨기기

참고 항목

카테고리

태그

제품

릴리스

Community Treasure Hunt

Problem evaluating very small numbers

댓글 수: 1 이전 댓글 -1개 표시이전 댓글 -1개 숨기기

답변 (3개)

댓글 수: 1 이전 댓글 -1개 표시이전 댓글 -1개 숨기기

댓글 수: 3 이전 댓글 1개 표시이전 댓글 1개 숨기기

댓글 수: 4 이전 댓글 2개 표시이전 댓글 2개 숨기기

참고 항목

카테고리

태그

제품

릴리스

Community Treasure Hunt

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

댓글 수: 3
이전 댓글 1개 표시이전 댓글 1개 숨기기

댓글 수: 4
이전 댓글 2개 표시이전 댓글 2개 숨기기