Unexpected interquartile range (IQR) result

조회 수: 14 (최근 30일)
Sim
Sim 2023년 12월 9일
댓글: Sim 2023년 12월 11일
For a number of distributions I would like to compare and show the interquartile range (IQR) and the standard deviation (STD).
For the normal distribution I got more or less what expected, i.e. the percentage of data within 1 STD, is around 68% of the distribution, and the IQR is around 50% of the distribution (i.e. the central half of the distribution). Here following my test:
clear all; clc;
samplesize = 100000;
% generate distribution
mu = 0;
sigma = 1;
data = normrnd(mu,repmat(sigma,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 68.1040
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data((data < (q(2)+q(1))) | (data > (q(2)-q(1))));
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 50.1370
% plot
figure
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(2)-q(1) q(2) q(2)+q(1)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
hold off
However, if I try the same with another distribution, like a gamma one, the IQR is not 50% anymore of the distribution. What did I do wrong?
clear all; clc;
samplesize = 100000;
% generate distribution
a = 1;
b = 5;
data = gamrnd(a,repmat(b,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 86.5350
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data((data < (q(2)+q(1))) | (data > (q(2)-q(1))));
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 100
% plot
figure
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(2)-q(1) q(2) q(2)+q(1)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
hold off

답변 (1개)

Sim
Sim 2023년 12월 9일
편집: Sim 2023년 12월 9일
my bad.. this is the solution:
dataIQR = data( data > q(1) & data < q(3) );
and the vertical lines related to the quartiles need to be replaced by this command:
xline([q(1) q(2) q(3)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
This is a correct example:
% generate distribution
samplesize = 100000;
a = 1;
b = 8;
data = gamrnd(a,repmat(b,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 86.3970
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data( data > q(1) & data < q(3) );
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 50
% plot
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(1) q(2) q(3)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
  댓글 수: 2
Steven Lord
Steven Lord 2023년 12월 9일
You could check your results using the iqr function and/or the prctile function, each moved from Statistics and Machine Learning Toolbox to MATLAB in release R2022a.
Sim
Sim 2023년 12월 11일
Thanks a lot @Steven Lord for your nice comment and suggestion! :-) :-)

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Descriptive Statistics에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by