How to efficiently generate a random integer within a range from an arbitrary probability distribution

7 Jan 2017

2 answers

Answer accepted

Updated: 8 Jan 2017

I need to generate a random integer within a range from an arbitrary probability distribution, within a loop of 100000 iterations. My implementation works, but I am not sure it is mathematically clean, and it takes forever:

pdf = [ 0.9 0.3 0.003 0.1 0.07 0.0005 0.003 0.15 0.009 0.08 ]; % discrete prob distrib function
cdf = cumsum(pdf); % cumulative distribution function
cdf = cdf / max(cdf); cdf(1) = 0; % normalization
index = ceil(interp1(cdf, [1:numel(pdf)], rand(1)))

Notice that the pdf above is just an example: my actual case is a vector of about 500 numbers.

Here is a different solution, which seems mathematically cleaner, but does not work for my overall problem, and is just as slow as above:

pdf = [ 0.9 0.3 0.003 0.1 0.07 0.0005 0.003 0.15 0.009 0.08 ]; % discrete prob distrib function
cdf = cumsum(pdf); % cumulative distribution function
cdf = cdf - min(cdf); cdf = cdf / max(cdf); % normalization
index = round(interp1(cdf, [1:numel(pdf)], rand(1)))

Is there a more efficient/correct way to do this?

Accepted Answer

the cyclist on 8 Jan 2017

Edited: the cyclist on 8 Jan 2017

I think that

index = sum(rand()>cdf)+1;

will be much faster than using interp1 as you do, and will give the same result.
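
For reference, here is a minimal sketch of the comparison-based draw with an explicitly normalized CDF. It assumes the entries of `pdf` are nonnegative weights that need not sum to 1. Unlike the code in the question, it does not set `cdf(1) = 0`, since `rand()` is never 0 and index 1 could then never be drawn. The names `idx`, `counts` and `freq` are illustrative and do not come from the thread.

```matlab
pdf = [0.9 0.3 0.003 0.1 0.07 0.0005 0.003 0.15 0.009 0.08]; % unnormalized weights
cdf = cumsum(pdf);
cdf = cdf / cdf(end);              % normalized CDF; last entry is exactly 1

% Single draw: count the CDF entries strictly below u, then add 1.
u = rand();
index = sum(u > cdf) + 1;

% Empirical check: draw many samples and tabulate how often each index occurs.
N = 100000;
counts = zeros(1, numel(pdf));
for k = 1:N
    idx = sum(rand() > cdf) + 1;
    counts(idx) = counts(idx) + 1;
end
freq = counts / N;

% Row 1: target probabilities, row 2: observed frequencies (should be close).
disp([pdf / sum(pdf); freq]);
```

If the two displayed rows agree to within sampling noise, the draw follows the intended distribution.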

the cyclist on 8 Jan 2017

I double-checked by generating 100,000 samples, and the results seem right.

the cyclist on 8 Jan 2017

Regarding speed ...

I get roughly equivalent results between my solution and randsample. Note that both of these solutions can be vectorized:

index = sum(rand(N,1) > cdf, 2)' + 1; % +1 as in the scalar version; implicit expansion needs R2016b or later

and

index = randsample(1:numel(pdf),N,true,pdf);

These will both generate N = 100,000 samples in a few milliseconds.

I did not try to vectorize your solution, but as it currently stands, it took 5 seconds to generate that many.
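
A rough way to check timings like these on your own machine is `tic`/`toc`. The sketch below is an assumption-laden stand-in rather than the benchmark used above: it uses a random 500-element `pdf` to mimic the size mentioned in the question, normalizes the CDF so its last entry is 1, adds 1 to the comparison count so that indices run from 1 to `numel(pdf)`, and relies on implicit expansion (R2016b or later). The names `tLoop`, `tVec`, `indexLoop` and `indexVec` are illustrative.

```matlab
pdf = rand(1, 500);                 % stand-in weights for the 500-element case
cdf = cumsum(pdf);
cdf = cdf / cdf(end);               % normalized CDF; last entry is exactly 1
N = 100000;

% Loop version: one comparison-based draw per iteration.
tic;
indexLoop = zeros(1, N);
for k = 1:N
    indexLoop(k) = sum(rand() > cdf) + 1;
end
tLoop = toc;

% Vectorized version: N-by-1 uniforms compared against the 1-by-500 CDF.
tic;
indexVec = sum(rand(N, 1) > cdf, 2)' + 1;
tVec = toc;

fprintf('loop: %.3f s, vectorized: %.3f s\n', tLoop, tVec);
```

Note that the vectorized comparison builds an N-by-500 logical matrix, so memory use grows with both the number of samples and the length of `pdf`.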

More Answers (1)

the cyclist on 8 Jan 2017

Do you have the Statistics and Machine Learning Toolbox? If so, you can do this with the randsample command.

Paolo Binetti on 8 Jan 2017

Thank you. No, I do not have this toolbox but I'll see if I can get it.
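
If the toolbox is not available, base MATLAB's `discretize` (R2015a and later) can do the same binning without it. This is a minimal sketch, not something proposed in the thread: it assumes every weight in `pdf` is strictly positive, so that the bin edges are strictly increasing, and the name `edges` is illustrative.

```matlab
pdf = [0.9 0.3 0.003 0.1 0.07 0.0005 0.003 0.15 0.009 0.08]; % positive weights
cdf = cumsum(pdf);
cdf = cdf / cdf(end);               % normalized CDF; last entry is exactly 1
edges = [0 cdf];                    % bin k spans edges(k) to edges(k+1)

N = 100000;
u = rand(N, 1);                     % uniform draws in the open interval (0, 1)
index = discretize(u, edges)';      % bin number of each draw = sampled integer
```

Because `discretize` works on the vector of draws directly, it avoids the N-by-numel(pdf) comparison matrix that the `sum(rand(N,1) > cdf, 2)` form creates.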
