This section explains how the Statistics and Machine Learning Toolbox™ functions quantile and prctile compute quantiles and percentiles.
The prctile function calculates percentiles in much the same way that quantile calculates quantiles. The following steps for computing quantiles also apply to percentiles because, for the same data sample, the quantile at the value Q is the same as the percentile at the value P = 100*Q.
The quantile function first sorts the values in X and assigns them to the (0.5/n), (1.5/n), ..., ([n – 0.5]/n) quantiles, where n is the number of elements in X. For example:
For a data vector of six elements such as {6, 3, 2, 10, 8, 1}, the sorted elements {1, 2, 3, 6, 8, 10} respectively correspond to the (0.5/6), (1.5/6), (2.5/6), (3.5/6), (4.5/6), and (5.5/6) quantiles.
For a data vector of five elements such as {2, 10, 5, 9, 13}, the sorted elements {2, 5, 9, 10, 13} respectively correspond to the 0.1, 0.3, 0.5, 0.7, and 0.9 quantiles.
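The assignment described above can be sketched in a few lines of Python. This is an illustrative re-implementation of the described step, not the toolbox code; the function name assign_quantile_positions is hypothetical.

```python
def assign_quantile_positions(data):
    """Pair each sorted value with the probability (i - 0.5)/n, for i = 1..n."""
    xs = sorted(data)
    n = len(xs)
    return [((i - 0.5) / n, x) for i, x in enumerate(xs, start=1)]


# Five-element example from the text: X = {2, 10, 5, 9, 13}
pairs = assign_quantile_positions([2, 10, 5, 9, 13])
for p, x in pairs:
    print(f"{p:.1f} quantile -> {x}")
```

For the five-element vector, the loop pairs the sorted values 2, 5, 9, 10, and 13 with the probabilities 0.1, 0.3, 0.5, 0.7, and 0.9.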
The following figure illustrates this approach for the data vector X = {2, 10, 5, 9, 13}. The first sorted observation corresponds to the cumulative probability 1/5 = 0.2, the second corresponds to the cumulative probability 2/5 = 0.4, and so on. The step function in the figure shows these cumulative probabilities. The quantile function instead places the observations at the midpoints, so that the first corresponds to 0.5/5 = 0.1, the second corresponds to 1.5/5 = 0.3, and so on, and then connects these midpoints. The red lines in the figure connect the midpoints.
[Figure: Assigning Observations to Quantiles. Axis labels: p, Quantiles of X.]
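One way to read the midpoint placement is that each observation sits halfway up its jump in the step function, that is, halfway between (i – 1)/n and i/n. The short Python sketch below checks this reading for n = 5; the variable names are illustrative only.

```python
n = 5  # number of observations in X = {2, 10, 5, 9, 13}

# Bottom and top of the i-th jump of the cumulative step function.
step_bottoms = [(i - 1) / n for i in range(1, n + 1)]
step_tops = [i / n for i in range(1, n + 1)]

# Midpoint of each jump; algebraically this equals (i - 0.5)/n.
midpoints = [(lo + hi) / 2 for lo, hi in zip(step_bottoms, step_tops)]

for lo, hi, mid in zip(step_bottoms, step_tops, midpoints):
    print(f"jump from {lo:.1f} to {hi:.1f}, midpoint {mid:.1f}")
```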
The quantile function uses linear interpolation to find quantiles for probabilities that fall between those assigned to the data values.
Linear interpolation uses linear polynomials to approximate a function f(x) and construct new data points within the range of a known set of data points. Algebraically, given the data points (x1, y1) and (x2, y2), where y1 = f(x1) and y2 = f(x2), linear interpolation finds y = f(x) for a given x between x1 and x2 as follows:
$$y = y_1 + \frac{x - x_1}{x_2 - x_1}\,(y_2 - y_1)$$
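Similarly, if the 1.5/n quantile is $y_{1.5/n}$ and the 2.5/n quantile is $y_{2.5/n}$, then linear interpolation finds the 2.3/n quantile $y_{2.3/n}$ as
$$y_{2.3/n} = y_{1.5/n} + \frac{2.3/n - 1.5/n}{2.5/n - 1.5/n}\,\left(y_{2.5/n} - y_{1.5/n}\right)$$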
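As a worked instance, take X = {2, 10, 5, 9, 13}, so n = 5, the 1.5/n = 0.3 quantile is 5, and the 2.5/n = 0.5 quantile is 9. The Python sketch below interpolates the 2.3/n = 0.46 quantile between them; the helper name lerp is hypothetical.

```python
def lerp(x, x1, y1, x2, y2):
    """Linearly interpolate y at x between the points (x1, y1) and (x2, y2)."""
    return y1 + (x - x1) / (x2 - x1) * (y2 - y1)


n = 5
# Known points: the 1.5/n quantile is 5 and the 2.5/n quantile is 9.
y = lerp(2.3 / n, 1.5 / n, 5, 2.5 / n, 9)
print(round(y, 6))
```

By hand, the fraction is (2.3 – 1.5)/(2.5 – 1.5) = 0.8, so the result is 5 + 0.8 × (9 – 5) = 8.2.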
The quantile function assigns the first and last sorted values of X (the minimum and maximum) to the quantiles for probabilities less than (0.5/n) and greater than ([n – 0.5]/n), respectively.
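Putting the three steps together, a minimal Python sketch for a single data vector and a single probability might look like the following. It is an illustration of the steps described in this section, not the toolbox implementation: the names quantile_sketch and prctile_sketch are hypothetical, and the sketch makes no attempt to handle missing values, matrices, or any of the options the toolbox functions accept.

```python
from bisect import bisect_right


def quantile_sketch(data, p):
    """Quantile of a non-empty vector at probability p, with 0 <= p <= 1."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must not be empty")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")

    # Step 1: the sorted values are the (i - 0.5)/n quantiles, i = 1..n.
    probs = [(i - 0.5) / n for i in range(1, n + 1)]

    # Step 3: probabilities outside [0.5/n, (n - 0.5)/n] map to the
    # minimum or maximum of the data.
    if p <= probs[0]:
        return float(xs[0])
    if p >= probs[-1]:
        return float(xs[-1])

    # Step 2: linear interpolation between the two neighboring points,
    # where probs[k - 1] <= p < probs[k].
    k = bisect_right(probs, p)
    x1, x2 = probs[k - 1], probs[k]
    y1, y2 = xs[k - 1], xs[k]
    return y1 + (p - x1) / (x2 - x1) * (y2 - y1)


def prctile_sketch(data, pct):
    """Percentile at pct (0..100), using the relation P = 100*Q."""
    return quantile_sketch(data, pct / 100)


X = [2, 10, 5, 9, 13]
print(quantile_sketch(X, 0.5))    # median: the 0.5 quantile is the value 9
print(quantile_sketch(X, 0.05))   # below 0.5/n, so the minimum, 2
print(prctile_sketch(X, 95))      # above (n - 0.5)/n, so the maximum, 13
```

For the six-element vector {6, 3, 2, 10, 8, 1}, the same sketch places the 0.5 quantile halfway between the sorted values 3 and 6, which gives 4.5.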