paretotails function and frequency vector

조회 수: 1 (최근 30일)
Rémy Bretin
Rémy Bretin 2020년 11월 5일
답변: Aditya Patil 2020년 11월 20일
Hi matlab community,
I have a series of measurements, following a normal distribution, which I know that they must be greater than Xmin and lower than Xmax.
My problem is that I have 10^9 measurements, and I can’t save all of them in on array due to memory issue.
My solution is to discretize the range of value, knowing that my measurements are nd decimals accurate:
Step=10^-nd;
X=floor(Xmin*10^nd)/10^nd : step : ceil(Xmax*10^nd)/10^nd;
Therefore, for each measurement, I find the index of the closest values from X, and add +1 to the frequency vector CNT :
CNT=zeros(size(X));
For each measurement m do:
[~,i]=min(abs(X-m)); CNT(i)=CNT(i)+1;
The obtained distribution can be display by:
Edges=[X-step/2 Xmax+step/2];
figure; histogram('BinEdges',Edges,'BinCounts',CNT);
My goal is to be able to estimate the probability to measure a value greater than a threshold value Xth after 10^14 or more measurements.
For that, I would like to apply the “paretotails” function to my problem.
Unfortunately, the function doesn’t propose a way to use a frequency vector.
So I’m asking for your help, if anyone has a solution to my issue.
Thank you all in advance,
Rémy

채택된 답변

Aditya Patil
Aditya Patil 2020년 11월 20일
I have brought this issue to the notice of the concerned people.
If the data is time based, then you can work around the problem by downsampling it. If not, you can recreate the data by using the histogram, but by reducing the number of samples in each bin by same factor. Then you should be able to use paretotails.

추가 답변 (0개)

카테고리

Help CenterFile Exchange에서 Histograms에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by