Trying to remove NaNs when plotting histogram, PDF and CDF

clear;
load InsulinReadings.mat
xX2 = InsulinReadings;
xX2(xX2==0)=missing;
A2 = mean(xX2,'all',"omitnan")
B2 = median(xX2,'all',"omitnan")
C2 = max(xX2,[],'all',"omitnan")
D2 = min(xX2,[],'all', "omitnan")
figure
histogram(InsulinReadings(~isnan(InsulinReadings),128,'Normalization')
Invalid expression. When calling a function or indexing a variable, use parentheses. Otherwise, check for mismatched delimiters.
xlabel('Insulin ng/dL')
%Now get pdf
[D PD] = allfitdist(xX2,'PDF');
xlabel('Insulin ng/dL');
%Now get the CDF
[D PD] = allfitdist(xGlucoseReadings,'CDF');
xlabel('Insulin ng/dL')

Accepted Answer

Voss on 25 Feb 2022
I think you basically have it right. I just "fixed" a syntax error on the line where you call histogram(). ("fixed" is in quotes because I can't be sure what you're going for there.)
(Also, it looks like allfitdist.m has been removed from the File Exchange, so I can't run it. Maybe your copy does the right thing here, but I can't tell.)
clear;
load InsulinReadings.mat
xX2 = InsulinReadings;
xX2(xX2==0)=missing;
A2 = mean(xX2,'all',"omitnan")
A2 = 296.9102
B2 = median(xX2,'all',"omitnan")
B2 = 86.4899
C2 = max(xX2,[],'all',"omitnan")
C2 = 1.0226e+04
D2 = min(xX2,[],'all', "omitnan")
D2 = 3.6900e-07
figure
histogram(InsulinReadings(~isnan(InsulinReadings)),128)%,'Normalization')
xlabel('Insulin ng/dL')
%Now get pdf
[D PD] = allfitdist(xX2,'PDF');
Unrecognized function or variable 'allfitdist'.
xlabel('Insulin ng/dL');
%Now get the CDF
[D PD] = allfitdist(xGlucoseReadings,'CDF');
xlabel('Insulin ng/dL')
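
The 'Normalization' argument that is commented out in the histogram() call above is a name-value pair, so it needs a value after the name (for example 'count', 'probability', 'pdf' or 'cdf'). A minimal sketch of the completed call, assuming 'pdf' is what was intended; the data here is synthetic and only stands in for InsulinReadings, which is not available:

```matlab
% Synthetic stand-in for InsulinReadings (assumption: not the real data)
rng(0);                          % reproducible random numbers
x = exp(randn(1000, 1));         % skewed, strictly positive values
x(1:100) = 0;                    % pretend some readings are exactly zero
x(x == 0) = NaN;                 % mark zeros as missing, as in the question

valid = x(~isnan(x));            % keep only the non-NaN values
figure
% 'Normalization' must be followed by its value
h = histogram(valid, 128, 'Normalization', 'pdf');
xlabel('Insulin ng/dL')
```

With 'pdf' the bar heights are scaled so that the bar areas sum to 1; with 'probability' the bar heights themselves sum to 1.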

6 Comments

If you look at the histogram, the zeros form the tallest column.
I am trying to avoid this. Would you be able to help me remove them?
I see.
Well, they're not necessarily zeros; they're just values that fall within the first bin of the histogram (i.e., values smaller than the first bin's right edge). It makes sense to plot a histogram of xX2 rather than InsulinReadings, since xX2 has the zeros replaced with NaNs (but you still get a lot of small values).
If you want to remove any value that would be in the first bin, you have to know the bin edges before creating the final histogram (which is ok). Alternatively, you could replace any value less than some threshold with a NaN (or with a 0, then replace all the 0s with NaNs like you're already doing). In that case you have to choose the threshold, and your histogram will depend on that choice.
clear;
load InsulinReadings.mat
xX2 = InsulinReadings;
nnz(xX2 == 0) % 41209 zeros
ans = 41209
nnz(isnan(xX2)) % 0 NaNs
ans = 0
xX2(xX2==0)=missing;
nnz(isnan(xX2)) % now 41209 NaNs
ans = 41209
figure
h = histogram(xX2(~isnan(xX2)),128);%,'Normalization')
xlabel('Insulin ng/dL')
% now make a new histogram with values in the first bin replaced with NaNs
edges = get(h,'BinEdges');
xX2(xX2 < edges(2)) = NaN;
figure();
h = histogram(xX2(~isnan(xX2)),128);%,'Normalization')
xlabel('Insulin ng/dL')
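
Both options described above can also be sketched without drawing a first histogram. This is only an illustration under assumptions: the data is synthetic, the threshold value is made up, and histcounts is used to obtain the bin edges ahead of time (it should produce the same edges that histogram would choose for the same data and bin count).

```matlab
% Synthetic skewed data standing in for the insulin readings (assumption)
rng(1);
x = exp(2 * randn(5000, 1));

% Option 1: compute the bin edges without plotting, then drop the first bin
[~, edges] = histcounts(x, 128);
x1 = x;
x1(x1 < edges(2)) = NaN;         % everything left of the first bin's right edge

% Option 2: choose a cutoff by hand (hypothetical value; depends on the data)
threshold = 0.5;
x2 = x;
x2(x2 < threshold) = NaN;

figure
histogram(x1(~isnan(x1)), 128)   % re-binned over the remaining values
xlabel('Insulin ng/dL')
```

Either way the new histogram is re-binned over the remaining range, so its first bin can again be the tallest if the data is strongly skewed.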
Okay, thank you.
Do you know of any way I can reproduce the shape of this histogram as a PDF and a CDF?
There probably are built-in functions, but I don't know what they are off the top of my head (maybe search the documentation).
It's relatively straightforward to calculate a PDF and CDF from the properties of the histogram:
clear;
load InsulinReadings.mat
xX2 = InsulinReadings;
xX2(xX2==0)=missing;
figure();
h = histogram(xX2(~isnan(xX2)),128);%,'Normalization')
% now make a new histogram with values in the first bin replaced with NaNs
edges = get(h,'BinEdges');
xX2(xX2 < edges(2)) = NaN;
figure();
h = histogram(xX2(~isnan(xX2)),128);%,'Normalization')
xlabel('Insulin ng/dL')
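% pdf and cdf
figure();
edges = get(h,'BinEdges');
counts = get(h,'BinCounts');
bin_centers = (edges(1:end-1)+edges(2:end))/2;
total_counts = sum(counts);
% per-bin probability (sums to 1); also divide by diff(edges) for a true density
pdf_vals = counts/total_counts; % not named pdf, to avoid shadowing the toolbox function pdf
cdf_vals = cumsum(counts)/total_counts; % running fraction of samples up to each bin
plot(bin_centers,pdf_vals,'LineWidth',2);
hold on
plot(bin_centers,cdf_vals,'LineWidth',2);
xlim(bin_centers([1 end]));
legend('PDF','CDF');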
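
As for built-in functions: histcounts (and histogram) accept 'Normalization' values of 'pdf' and 'cdf', which perform the same normalization directly. A minimal sketch on synthetic data (an assumption, since the real readings are not available); the variable names are illustrative:

```matlab
% Synthetic skewed data standing in for the insulin readings (assumption)
rng(2);
x = exp(randn(2000, 1));

% Density estimate per bin: counts / (total count * bin width)
[pdf_vals, edges] = histcounts(x, 128, 'Normalization', 'pdf');
% Cumulative fraction of samples, reusing the same edges
cdf_vals = histcounts(x, edges, 'Normalization', 'cdf');

bin_centers = (edges(1:end-1) + edges(2:end)) / 2;

figure
plot(bin_centers, pdf_vals, 'LineWidth', 2);
hold on
plot(edges(2:end), cdf_vals, 'LineWidth', 2);   % each CDF value applies at its bin's right edge
legend('PDF', 'CDF');
```

If the Statistics and Machine Learning Toolbox is installed, ecdf and ksdensity are other options worth checking in the documentation.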

Release: R2021b
Asked: 25 Feb 2022
Commented: 26 Feb 2022