quantile
Quantiles of data set
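Syntax
Q = quantile(A,p)
Q = quantile(A,n)
Q = quantile(___,"all")
Q = quantile(___,dim)
Q = quantile(___,vecdim)
Q = quantile(___,"Method",method)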
Description
Q = quantile(A,p) returns quantiles of elements in input data A for the cumulative probability or probabilities p in the interval [0,1].
- If A is a vector, then Q is a scalar or a vector with the same length as p. Q(i) contains the p(i) quantile.
- If A is a matrix, then Q is a row vector or a matrix, where the number of rows of Q is equal to length(p). The ith row of Q contains the p(i) quantiles of each column of A.
- If A is a multidimensional array, then Q contains the quantiles computed along the first array dimension whose size does not equal 1.
Q = quantile(A,n) returns quantiles for n evenly spaced cumulative probabilities (1/(n + 1), 2/(n + 1), ..., n/(n + 1)) for integer n > 1.
- If A is a vector, then Q is a scalar or a vector with length n.
- If A is a matrix, then Q is a matrix with n rows.
- If A is a multidimensional array, then Q contains the quantiles computed along the first array dimension whose size does not equal 1.
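As a quick check of how the two syntaxes relate, the following sketch compares quantile(A,n) with quantile(A,p) for p = (1:n)/(n + 1). The data and variable names are illustrative assumptions, not part of the function reference.

```matlab
% Illustrative data (assumption: any numeric vector works here).
A = [6 3 2 10 8 1 7];
n = 4;

% Evenly spaced cumulative probabilities 1/(n+1), 2/(n+1), ..., n/(n+1).
p = (1:n)/(n + 1);

Qn = quantile(A,n);   % n evenly spaced quantiles
Qp = quantile(A,p);   % same quantiles, with the probabilities given explicitly

% The two results should agree up to floating-point round-off.
maxDiff = max(abs(Qn - Qp));
```

If maxDiff is at round-off level, the integer syntax is behaving as shorthand for the explicit probability vector.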
Q = quantile(___,"all") returns quantiles of all the elements of A for either of the first two syntaxes.
Q = quantile(___,dim) operates along the dimension dim for either of the first two syntaxes. For example, if A is a matrix, then quantile(A,p,2) operates on the elements in each row.
Q = quantile(___,vecdim) operates along the dimensions specified in the vector vecdim for either of the first two syntaxes. For example, if A is a matrix, then quantile(A,n,[1 2]) operates on all the elements of A because every element of a matrix is contained in the array slice defined by dimensions 1 and 2.
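The equivalence described for a matrix can be sketched as follows. The matrix magic(4) and the variable names are assumptions chosen for illustration; the comparison uses a small tolerance rather than exact equality.

```matlab
A = magic(4);                 % illustrative 4-by-4 matrix with elements 1..16
n = 3;

Qvec = quantile(A,n,[1 2]);   % operate along dimensions 1 and 2 together
Qall = quantile(A,n,"all");   % quantiles of all the elements of A
Qcol = quantile(A(:),n);      % same data, flattened to a column vector

% All three calls see the same 16 elements, so the results should agree.
diffAll = max(abs(Qvec(:) - Qall(:)));
diffCol = max(abs(Qvec(:) - Qcol(:)));
```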
Examples
Quantiles for Given Probabilities
Calculate the quantiles of a data set for specified probabilities.
Generate a data set of size 7.
rng default % for reproducibility
A = randn(1,7)
A = 1×7
0.5377 1.8339 -2.2588 0.8622 0.3188 -1.3077 -0.4336
Calculate the 0.3 quantile of the elements of A.
Q = quantile(A,0.3)
Q = -0.7832
Calculate the quantiles of the elements of A for the cumulative probabilities 0.025, 0.25, 0.5, 0.75, and 0.975.
Q = quantile(A,[0.025 0.25 0.5 0.75 0.975])
Q = 1×5
-2.2588 -1.0892 0.3188 0.7810 1.8339
Quantiles for n Evenly Spaced Cumulative Probabilities
Calculate the quantiles of a data set for a given number of probabilities.
Generate a data set of size 7.
rng default % for reproducibility
A = randn(1,7)
A = 1×7
0.5377 1.8339 -2.2588 0.8622 0.3188 -1.3077 -0.4336
Calculate four evenly spaced quantiles of the elements of A.
Q = quantile(A,4)
Q = 1×4
-1.4028 -0.2079 0.4720 0.9593
Using Q = quantile(A,[0.2,0.4,0.6,0.8]) is another way to return the four evenly spaced quantiles.
Quantiles of Matrix for Given Probabilities
Calculate the quantiles along the columns and rows of a data matrix for specified probabilities.
Generate a 4-by-6 data matrix.
rng default % for reproducibility
A = randn(4,6)
A = 4×6
0.5377 0.3188 3.5784 0.7254 -0.1241 0.6715
1.8339 -1.3077 2.7694 -0.0631 1.4897 -1.2075
-2.2588 -0.4336 -1.3499 0.7147 1.4090 0.7172
0.8622 0.3426 3.0349 -0.2050 1.4172 1.6302
Calculate the 0.3 quantile for each column of A.
Q = quantile(A,0.3,1)
Q = 1×6
-0.3013 -0.6958 1.5336 -0.1056 0.9491 0.1078
quantile returns a row vector Q when calculating one quantile for each column in A. -0.3013 is the 0.3 quantile of the first column of A with elements 0.5377, 1.8339, -2.2588, and 0.8622. Because the default value of dim is 1, Q = quantile(A,0.3) returns the same result.
Calculate the 0.3 quantile for each row of A.
Q = quantile(A,0.3,2)
Q = 4×1
0.3844
-0.8642
-1.0750
0.4985
quantile returns a column vector Q when calculating one quantile for each row in A. 0.3844 is the 0.3 quantile of the first row of A with elements 0.5377, 0.3188, 3.5784, 0.7254, -0.1241, and 0.6715.
Quantiles of Matrix for n Evenly Spaced Probabilities
Calculate evenly spaced quantiles along the columns and rows of a data matrix.
Generate a 6-by-7 data matrix.
rng default % for reproducibility
A = randi(10,6,7)
A = 6×7
9 3 10 8 7 8 7
10 6 5 10 8 1 4
2 10 9 7 8 3 10
10 10 2 1 4 1 1
7 2 5 9 7 1 5
1 10 10 10 2 9 4
Calculate the quantiles for each column of A for three evenly spaced cumulative probabilities.
Q = quantile(A,3,1)
Q = 3×7
2.0000 3.0000 5.0000 7.0000 4.0000 1.0000 4.0000
8.0000 8.0000 7.0000 8.5000 7.0000 2.0000 4.5000
10.0000 10.0000 10.0000 10.0000 8.0000 8.0000 7.0000
Each column of matrix Q contains the quantiles for the corresponding column in A. 2, 8, and 10 are the quantiles of the first column of A with elements 9, 10, 2, 10, 7, and 1. Q = quantile(A,3) returns the same result because the default value of dim is 1.
Calculate the quantiles for each row of A for three evenly spaced cumulative probabilities.
Q = quantile(A,3,2)
Q = 6×3
7.0000 8.0000 8.7500
4.2500 6.0000 9.5000
4.0000 8.0000 9.7500
1.0000 2.0000 8.5000
2.7500 5.0000 7.0000
2.5000 9.0000 10.0000
Each row of matrix Q contains the three evenly spaced quantiles for the corresponding row in A. 7, 8, and 8.75 are the quantiles of the first row of A with elements 9, 3, 10, 8, 7, 8, and 7.
Quantiles of Multidimensional Array for Given Probabilities
Calculate the quantiles of a multidimensional array for specified probabilities by using the "all" and vecdim inputs.
Create a 3-by-5-by-2 array. Specify a vector of probabilities.
A = reshape(1:30,[3 5 2])
A =
A(:,:,1) =
     1     4     7    10    13
     2     5     8    11    14
     3     6     9    12    15
A(:,:,2) =
    16    19    22    25    28
    17    20    23    26    29
    18    21    24    27    30
p = [0.25 0.75];
Calculate the 0.25 and 0.75 quantiles of all the elements of A.
Qall = quantile(A,p,"all")
Qall = 2×1
8
23
Qall(1) is the 0.25 quantile of A, and Qall(2) is the 0.75 quantile of A.
Calculate the 0.25 and 0.75 quantiles for each page of A by specifying dimensions 1 and 2 as the operating dimensions.
Qpage = quantile(A,p,[1 2])
Qpage =
Qpage(:,:,1) =
    4.2500
   11.7500
Qpage(:,:,2) =
   19.2500
   26.7500
Qpage(1,1,1) is the 0.25 quantile of the first page of A, and Qpage(2,1,1) is the 0.75 quantile of the first page of A.
Calculate the 0.25 and 0.75 quantiles of the elements in each A(i,:,:) slice by specifying dimensions 2 and 3 as the operating dimensions.
Qrow = quantile(A,p,[2 3])
Qrow = 3×2
7 22
8 23
9 24
Qrow(3,1) is the 0.25 quantile of the elements in A(3,:,:), and Qrow(3,2) is the 0.75 quantile of the elements in A(3,:,:).
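One way to sanity-check the page-wise result is to compute the same quantiles one page at a time with the "all" option. This is a minimal sketch; the loop and the variable Qloop are illustrative additions, not part of the example above.

```matlab
A = reshape(1:30,[3 5 2]);
p = [0.25 0.75];

% Quantiles of each page in a single call: length(p)-by-1-by-2 output.
Qpage = quantile(A,p,[1 2]);

% Same quantiles, computed one page at a time.
Qloop = zeros(numel(p),1,size(A,3));
for k = 1:size(A,3)
    Qloop(:,1,k) = quantile(A(:,:,k),p,"all");
end

% The two approaches should agree up to round-off.
maxDiff = max(abs(Qpage(:) - Qloop(:)));
```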
Median and Quartiles for Even Number of Data Elements
Find the median and quartiles of a vector with an even number of elements.
Create a data vector.
A = [2 5 6 10 11 13]
A = 1×6
2 5 6 10 11 13
Calculate the median of the elements of A.
Q = quantile(A,0.5)
Q = 8
Calculate the quartiles of the elements of A.
Q = quantile(A,[0.25, 0.5, 0.75])
Q = 1×3
5 8 11
Using Q = quantile(A,3) is another way to compute the quartiles of the elements of A.
These results might be different from the textbook definitions because quantile uses Linear Interpolation to find the median and quartiles.
Median and Quartiles for Odd Number of Data Elements
Find the median and quartiles of a vector with an odd number of elements.
Create a data vector.
A = [2 4 6 8 10 12 14]
A = 1×7
2 4 6 8 10 12 14
Calculate the median of the elements of A.
Q = quantile(A,0.50)
Q = 8
Calculate the quartiles of the elements of A.
Q = quantile(A,[0.25, 0.5, 0.75])
Q = 1×3
4.5000 8.0000 11.5000
Using Q = quantile(A,3) is another way to compute the quartiles of A.
These results might be different from the textbook definitions because quantile uses Linear Interpolation to find the median and quartiles.
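To illustrate how the results can differ, this sketch compares quantile with one common textbook convention, in which the lower and upper quartiles are the medians of the lower and upper halves of the sorted data, excluding the overall median when the number of elements is odd. The choice of convention is an assumption for illustration only; textbooks use several different definitions [1].

```matlab
A = [2 4 6 8 10 12 14];
x = sort(A);
n = numel(x);

% Assumed textbook convention: split the sorted data around the median,
% leaving the median itself out of both halves when n is odd.
lowerHalf = x(1:floor(n/2));
upperHalf = x(ceil(n/2)+1:end);
Qtextbook = [median(lowerHalf) median(x) median(upperHalf)];

% quantile interpolates linearly between neighboring sorted elements.
Qinterp = quantile(A,[0.25 0.5 0.75]);
```

For this vector the two approaches agree on the median but give different first and third quartiles.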
Quantiles of Tall Vector for Given Probability
Calculate exact and approximate quantiles of a tall column vector for a given probability.
When you perform calculations on tall arrays, MATLAB® uses either a parallel pool (default if you have Parallel Computing Toolbox™) or the local MATLAB session. To run the example using the local MATLAB session when you have Parallel Computing Toolbox, change the global execution environment by using the mapreducer function.
mapreducer(0)
Create a datastore for the airlinesmall data set. Treat "NA" values as missing data so that datastore replaces them with NaN values. Specify to work with the ArrTime variable.
ds = datastore("airlinesmall.csv","TreatAsMissing","NA", ...
    "SelectedVariableNames","ArrTime");
Create a tall table tt on top of the datastore, and extract the data from the tall table into a tall vector A.
tt = tall(ds)
tt =
  Mx1 tall table
    ArrTime
    _______
      735
     1124
     2218
     1431
      746
     1547
     1052
     1134
       :
       :
A = tt{:,:}
A =
  Mx1 tall double column vector
      735
     1124
     2218
     1431
      746
     1547
     1052
     1134
       :
       :
Calculate the exact quantile of A for cumulative probability p = 0.5. Because A is a tall column vector and p is a scalar, quantile returns the exact quantile value by default.
p = 0.5;
Qexact = quantile(A,p)
Qexact =
  tall double
    ?
Calculate the approximate quantile of A for p = 0.5. Specify method as "approximate" to use an approximation algorithm based on T-Digest for computing the quantiles.
Qapprox = quantile(A,p,"Method","approximate")
Qapprox =
  MxNx... tall array
    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :
Evaluate the tall arrays and bring the results into memory by using gather.
[Qexact,Qapprox] = gather(Qexact,Qapprox)
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 4: Completed in 0.77 sec
- Pass 2 of 4: Completed in 0.24 sec
- Pass 3 of 4: Completed in 0.39 sec
- Pass 4 of 4: Completed in 0.29 sec
Evaluation completed in 2.2 sec
Qexact = 1522
Qapprox = 1.5220e+03
The values of the exact quantile and the approximate quantile are the same to the four digits shown.
Quantiles of Tall Matrix Along Different Dimensions
Calculate exact and approximate quantiles of a tall matrix for specified cumulative probabilities along different dimensions.
When you perform calculations on tall arrays, MATLAB® uses either a parallel pool (default if you have Parallel Computing Toolbox™) or the local MATLAB session. To run the example using the local MATLAB session when you have Parallel Computing Toolbox, change the global execution environment by using the mapreducer function.
mapreducer(0)
Create a tall matrix A containing a subset of variables stored in varnames from the airlinesmall data set. See Quantiles of Tall Vector for Given Probability for details about the steps to extract data from a tall array.
varnames = ["ArrDelay","ArrTime","DepTime","ActualElapsedTime"];
ds = datastore("airlinesmall.csv","TreatAsMissing","NA", ...
    "SelectedVariableNames",varnames);
tt = tall(ds);
A = tt{:,varnames}
A =
  Mx4 tall double matrix
     8     735     642     53
     8    1124    1021     63
    21    2218    2055     83
    13    1431    1332     59
     4     746     629     77
    59    1547    1446     61
     3    1052     928     84
    11    1134     859    155
     :      :       :      :
     :      :       :      :
When operating along a dimension that is not 1, the quantile function calculates the exact quantiles only so that it can perform the computation efficiently using a sorting-based algorithm (see Algorithms) instead of an approximation algorithm based on T-Digest.
Calculate the exact quantiles of A along the second dimension for the vector p of cumulative probabilities 0.25, 0.5, and 0.75.
p = [0.25 0.5 0.75];
Qexact = quantile(A,p,2)
Qexact =
  MxNx... tall array
    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :
When the function operates along the first dimension and p is a vector of cumulative probabilities, you must use the approximation algorithm based on t-digest to compute the quantiles. Using the sorting-based algorithm to find quantiles along the first dimension of a tall array is computationally intensive.
Calculate the approximate quantiles of A along the first dimension for the cumulative probabilities 0.25, 0.5, and 0.75. Because the default dimension is 1, you do not need to specify a value for dim.
Qapprox = quantile(A,p,"Method","approximate")
Qapprox =
  MxNx... tall array
    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :
Evaluate the tall arrays and bring the results into memory by using gather.
[Qexact,Qapprox] = gather(Qexact,Qapprox);
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 1: Completed in 1.9 sec
Evaluation completed in 2.4 sec
Show the first five rows of the exact quantiles of A (along the second dimension) for the cumulative probabilities 0.25, 0.5, and 0.75.
Qexact(1:5,:)
ans = 5×3
10^3 ×
0.0305 0.3475 0.6885
0.0355 0.5420 1.0725
0.0520 1.0690 2.1365
0.0360 0.6955 1.3815
0.0405 0.3530 0.6875
Each row of the matrix Qexact contains the three quantiles of the corresponding row in A. For example, 30.5, 347.5, and 688.5 are the 0.25, 0.5, and 0.75 quantiles, respectively, of the first row in A.
Show the approximate quantiles of A (along the first dimension) for the cumulative probabilities 0.25, 0.5, and 0.75.
Qapprox
Qapprox = 3×4
10^3 ×
-0.0070 1.1149 0.9322 0.0700
0 1.5220 1.3350 0.1020
0.0110 1.9180 1.7400 0.1510
Each column of the matrix Qapprox contains the three quantiles of the corresponding column in A. For example, the first column of Qapprox with elements -7, 0, and 11 contains the quantiles for the first column of A.
Quantiles of Tall Matrix for n Evenly Spaced Probabilities
Calculate exact and approximate quantiles along different dimensions of a tall matrix for a given number of evenly spaced cumulative probabilities.
When you perform calculations on tall arrays, MATLAB® uses either a parallel pool (default if you have Parallel Computing Toolbox™) or the local MATLAB session. To run the example using the local MATLAB session when you have Parallel Computing Toolbox, change the global execution environment by using the mapreducer function.
mapreducer(0)
Create a tall matrix A containing a subset of variables stored in varnames from the airlinesmall data set. See Quantiles of Tall Vector for Given Probability for details about the steps to extract data from a tall array.
varnames = ["ArrDelay","ArrTime","DepTime","ActualElapsedTime"];
ds = datastore("airlinesmall.csv","TreatAsMissing","NA", ...
    "SelectedVariableNames",varnames);
tt = tall(ds);
A = tt{:,varnames}
A =
  Mx4 tall double matrix
     8     735     642     53
     8    1124    1021     63
    21    2218    2055     83
    13    1431    1332     59
     4     746     629     77
    59    1547    1446     61
     3    1052     928     84
    11    1134     859    155
     :      :       :      :
     :      :       :      :
To calculate quantiles for evenly spaced cumulative probabilities along the first dimension, you must use the approximation algorithm based on T-Digest. Using the sorting-based algorithm (see Algorithms) to find quantiles along the first dimension of a tall array is computationally intensive.
Calculate the quantiles for three evenly spaced cumulative probabilities along the first dimension of A. Because the default dimension is 1, you do not need to specify a value for dim. Specify the method as "approximate" to use the approximation algorithm.
Qapprox = quantile(A,3,"Method","approximate")
Qapprox =
  MxNx... tall array
    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :
To calculate quantiles for evenly spaced cumulative probabilities along any other dimension (dim is not 1), quantile calculates the exact quantiles only, so that it can perform the computation efficiently by using the sorting-based algorithm.
Calculate the quantiles for three evenly spaced cumulative probabilities along the second dimension of A. Because dim is not 1, quantile returns the exact quantiles by default.
Qexact = quantile(A,3,2)
Qexact =
  MxNx... tall array
    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :
Evaluate the tall arrays and bring the results into memory by using gather.
[Qapprox,Qexact] = gather(Qapprox,Qexact);
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 1: Completed in 1.2 sec
Evaluation completed in 1.4 sec
Show the approximate quantiles of A (along the first dimension) for the three evenly spaced cumulative probabilities.
Qapprox
Qapprox = 3×4
10^3 ×
-0.0070 1.1148 0.9321 0.0700
0 1.5220 1.3350 0.1020
0.0110 1.9180 1.7400 0.1510
Each column of the matrix Qapprox contains the quantiles of the corresponding column in A. For example, the first column of Qapprox with elements -7, 0, and 11 contains the quantiles for the first column of A.
Show the first five rows of the exact quantiles of A (along the second dimension) for the three evenly spaced cumulative probabilities.
Qexact(1:5,:)
ans = 5×3
10^3 ×
0.0305 0.3475 0.6885
0.0355 0.5420 1.0725
0.0520 1.0690 2.1365
0.0360 0.6955 1.3815
0.0405 0.3530 0.6875
Each row of the matrix Qexact contains the three evenly spaced quantiles of the corresponding row in A. For example, 30.5, 347.5, and 688.5 are the 0.25, 0.5, and 0.75 quantiles, respectively, of the first row in A.
Input Arguments
A — Input array
vector | matrix | multidimensional array
Input array, specified as a vector, matrix, or multidimensional array.
Data Types: double | single | duration
p — Cumulative probabilities for which to compute quantiles
scalar | vector
Cumulative probabilities for which to compute quantiles, specified as a scalar or vector of scalars from 0 to 1.
Example: 0.3
Example: [0.25, 0.5, 0.75]
Example: (0:0.25:1)
Data Types: double | single
n — Number of probabilities for which to compute quantiles
positive integer scalar
Number of probabilities for which to compute quantiles, specified as a positive integer scalar. quantile returns n quantiles that divide the data set into n+1 evenly distributed segments.
Data Types: double | single
dim — Dimension to operate along
positive integer scalar
Dimension to operate along, specified as a positive integer scalar. If you do not specify the dimension, then the default is the first array dimension whose size does not equal 1.
Consider an input matrix A and a vector of cumulative probabilities p:
- Q = quantile(A,p,1) computes quantiles of the columns in A for the cumulative probabilities in p. Because 1 is the specified operating dimension, Q has length(p) rows.
- Q = quantile(A,p,2) computes quantiles of the rows in A for the cumulative probabilities in p. Because 2 is the specified operating dimension, Q has length(p) columns.
Consider an input matrix A and a number n of evenly spaced cumulative probabilities:
- Q = quantile(A,n,1) computes quantiles of the columns in A for the n evenly spaced cumulative probabilities. Because 1 is the specified operating dimension, Q has n rows.
- Q = quantile(A,n,2) computes quantiles of the rows in A for the n evenly spaced cumulative probabilities. Because 2 is the specified operating dimension, Q has n columns.
Dimension dim indicates the dimension of Q whose length is equal to length(p) or n.
Data Types: double | single | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
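The effect of dim on the output size can be checked with a short sketch. The matrix size and the probabilities here are arbitrary, illustrative choices.

```matlab
A = rand(4,6);            % illustrative 4-by-6 matrix
p = [0.25 0.5 0.75];

Q1 = quantile(A,p,1);     % quantiles of each column
Q2 = quantile(A,p,2);     % quantiles of each row

size1 = size(Q1);         % length(p) rows, one column per column of A
size2 = size(Q2);         % one row per row of A, length(p) columns
```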
vecdim — Vector of dimensions to operate along
vector of positive integers
Vector of dimensions to operate along, specified as a vector of positive integers. Each element represents a dimension of the input data.
The size of the output Q in the smallest specified operating dimension is equal to length(p) or n. The size of Q in the other operating dimensions specified in vecdim is 1. The size of Q in all dimensions not specified in vecdim remains the same as the input data.
Consider a 2-by-3-by-3 input array A and the cumulative probabilities p. quantile(A,p,[1 2]) returns a length(p)-by-1-by-3 array because 1 and 2 are the operating dimensions and min([1 2]) = 1. Each page of the returned array contains the quantiles of the elements on the corresponding page of A.
Data Types: double | single | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
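The size rule for vecdim can be sketched for the 2-by-3-by-3 case described above. The random data and the second call with [2 3] are illustrative assumptions added to show the rule for a different pair of dimensions.

```matlab
A = rand(2,3,3);               % illustrative 2-by-3-by-3 array
p = [0.1 0.5 0.9];

% Operating dimensions 1 and 2: the smallest one (1) gets size length(p),
% dimension 2 collapses to 1, and dimension 3 keeps its size.
Q12 = quantile(A,p,[1 2]);
sz12 = size(Q12);              % length(p)-by-1-by-3

% Operating dimensions 2 and 3: dimension 2 gets size length(p),
% dimension 3 collapses to 1, and dimension 1 keeps its size.
Q23 = quantile(A,p,[2 3]);
sz23 = size(Q23);              % 2-by-length(p)
```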
method — Method for calculating quantiles
"exact" (default) | "approximate"
More About
Linear Interpolation
Linear interpolation uses linear polynomials to find $y_i = f(x_i)$, the values of the underlying function $Y = f(X)$ at the points in the vector or array $x$. Given the data points $(x_1, y_1)$ and $(x_2, y_2)$, where $y_1 = f(x_1)$ and $y_2 = f(x_2)$, linear interpolation finds $y = f(x)$ for a given $x$ between $x_1$ and $x_2$ as
$$y = f(x) = y_1 + \frac{x - x_1}{x_2 - x_1}\,(y_2 - y_1).$$
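Similarly, if the 1.5/n quantile is $y_{1.5/n}$ and the 2.5/n quantile is $y_{2.5/n}$, then linear interpolation finds the 2.3/n quantile $y_{2.3/n}$ as
$$y_{2.3/n} = y_{1.5/n} + \frac{2.3/n - 1.5/n}{2.5/n - 1.5/n}\,\left(y_{2.5/n} - y_{1.5/n}\right).$$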
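As a worked instance, consider the seven-element vector from the first example, whose sorted elements include -1.3077 and -0.4336 in the second and third positions. With n = 7, these two elements are the 1.5/7 and 2.5/7 quantiles, and the probability 0.3 = 2.1/7 lies between them. Applying the same interpolation rule, under the assumption that the sorted elements sit at the (k - 0.5)/n probabilities as described in Algorithms:

```latex
\begin{aligned}
y_{2.1/7} &= y_{1.5/7} + \frac{2.1/7 - 1.5/7}{2.5/7 - 1.5/7}\,\left(y_{2.5/7} - y_{1.5/7}\right) \\
          &= -1.3077 + 0.6 \times \bigl(-0.4336 - (-1.3077)\bigr) \\
          &= -1.3077 + 0.6 \times 0.8741 \approx -0.7832
\end{aligned}
```

This agrees with the value Q = -0.7832 returned by quantile(A,0.3) in that example.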
T-Digest
T-digest [2] is a probabilistic data structure that is a sparse representation of the empirical cumulative distribution function (CDF) of a data set. T-digest is useful for computing approximations of rank-based statistics (such as percentiles and quantiles) from online or distributed data in a way that allows for controllable accuracy, particularly near the tails of the data distribution.
For data that is distributed in different partitions, t-digest computes quantile estimates (and percentile estimates) for each data partition separately, and then combines the estimates while maintaining a constant-memory bound and constant relative accuracy of computation ($q(1 - q)$ for the $q$th quantile). For these reasons, t-digest is practical for working with tall arrays.
To estimate quantiles of an array that is distributed in different partitions, first build a t-digest in each partition of the data. A t-digest clusters the data in the partition and summarizes each cluster by a centroid value and an accumulated weight that represents the number of samples contributing to the cluster. T-digest uses large clusters (widely spaced centroids) to represent areas of the CDF that are near q = 0.5 and uses small clusters (tightly spaced centroids) to represent areas of the CDF that are near q = 0 and q = 1.
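T-digest controls the cluster size by using a scaling function that maps a quantile $q$ to an index $k$ with a compression parameter $\delta$. That is,
$$k(q,\delta) = \delta \cdot \left(\frac{\sin^{-1}(2q - 1)}{\pi} + \frac{1}{2}\right),$$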
where the mapping k is monotonic with minimum value k(0,δ) = 0 and maximum value k(1,δ) = δ. This figure shows the scaling function for δ = 10.
The scaling function translates the quantile q to the scaling factor k in order to give variable-size steps in q. As a result, cluster sizes are unequal (larger around the center quantiles and smaller near q = 0 and q = 1). The smaller clusters allow for better accuracy near the edges of the data.
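The effect of the scaling function on cluster sizes can be sketched numerically. The code below assumes the arcsine form of the scaling function, k(q,δ) = δ(asin(2q − 1)/π + 1/2), which satisfies k(0,δ) = 0 and k(1,δ) = δ. The function handles and variable names are illustrative and are not part of quantile.

```matlab
delta = 10;                                   % compression parameter

% Assumed arcsine scaling function and its inverse.
k    = @(q,d) d.*(asin(2*q - 1)/pi + 1/2);
kInv = @(kk,d) (sin(pi*(kk./d - 1/2)) + 1)/2;

% Quantile boundaries that correspond to unit steps k = 0,1,...,delta.
edges = kInv(0:delta,delta);

% Width of each unit step measured in q: narrow near q = 0 and q = 1,
% wide near q = 0.5, which mirrors the unequal cluster sizes.
widths = diff(edges);
```

Plotting widths, for example with bar(widths), would show the narrow steps at both tails and the wide steps near the median.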
To update a t-digest with a new observation that has a weight and location, find the cluster closest to the new observation. Then, add the weight and update the centroid of the cluster based on the weighted average, provided that the updated weight of the cluster does not exceed the size limitation.
You can combine independent t-digests from each partition of the data by taking a union of the t-digests and merging their centroids. To combine t-digests, first sort the clusters from all the independent t-digests in decreasing order of cluster weights. Then, merge neighboring clusters, when they meet the size limitation, to form a new t-digest.
Once you form a t-digest that represents the complete data set, you can estimate the endpoints (or boundaries) of each cluster in the t-digest and then use interpolation between the endpoints of each cluster to find accurate quantile estimates.
Algorithms
For an n-element vector A, quantile computes quantiles by using a sorting-based algorithm:
- The sorted elements in A are taken as the (0.5/n), (1.5/n), ..., ([n – 0.5]/n) quantiles. For example:
  - For a data vector of five elements such as {6, 3, 2, 10, 1}, the sorted elements {1, 2, 3, 6, 10} respectively correspond to the 0.1, 0.3, 0.5, 0.7, and 0.9 quantiles.
  - For a data vector of six elements such as {6, 3, 2, 10, 8, 1}, the sorted elements {1, 2, 3, 6, 8, 10} respectively correspond to the (0.5/6), (1.5/6), (2.5/6), (3.5/6), (4.5/6), and (5.5/6) quantiles.
- quantile uses Linear Interpolation to compute quantiles for probabilities between (0.5/n) and ([n – 0.5]/n).
- For the quantiles corresponding to the probabilities outside that range, quantile assigns the minimum or maximum values of the elements in A.
quantile treats NaNs as missing values and removes them.
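The steps above can be sketched with interp1 as a rough re-implementation for a vector input. This is a minimal illustration under the stated rules, not the actual internal code of quantile; the data, the probabilities, and the names pk, pc, and Qmanual are assumptions.

```matlab
A = [6 3 2 10 8 1 NaN];          % illustrative data with one missing value
p = [0.05 0.3 0.5 0.9 0.99];

x = sort(A(~isnan(A)));          % remove NaNs, then sort
n = numel(x);

% The sorted elements are the (0.5/n), (1.5/n), ..., ((n-0.5)/n) quantiles.
pk = ((1:n) - 0.5)/n;

% Clamp p so that probabilities outside [0.5/n, (n-0.5)/n] map to the
% minimum or maximum, then interpolate linearly between sorted elements.
pc = min(max(p,pk(1)),pk(end));
Qmanual = interp1(pk,x,pc);

% Compare with the built-in result.
Qbuiltin = quantile(A,p);
maxDiff = max(abs(Qmanual - Qbuiltin));
```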
References
[1] Langford, E. “Quartiles in Elementary Statistics”, Journal of Statistics Education. Vol. 14, No. 3, 2006.
[2] Dunning, T., and O. Ertl. “Computing Extremely Accurate Quantiles Using T-Digests.” August 2017.
Extended Capabilities
Tall Arrays
Calculate with arrays that have more rows than fit in memory.
The quantile function supports tall arrays with the following usage notes and limitations:
- Q = quantile(A,p) and Q = quantile(A,n) return the exact quantiles (using a sorting-based algorithm) only if A is a tall numeric column vector.
- Q = quantile(___,dim) returns the exact quantiles only when one of these conditions exists:
  - A is a tall numeric column vector.
  - A is a tall numeric array and dim is not 1. For example, quantile(A,p,2) returns the exact quantiles along the rows of the tall array A.
  If A is a tall array and dim is 1, then you must specify method as "approximate" to use an approximation algorithm based on T-Digest for computing the quantiles. For example, quantile(A,p,1,"Method","approximate") returns the approximate quantiles along the columns of the tall array A.
- Q = quantile(___,vecdim) returns the exact quantiles only when one of these conditions exists:
  - A is a tall numeric column vector.
  - A is a tall numeric array and vecdim does not include 1. For example, if A is a 3-by-5-by-2 array, then quantile(A,p,[2,3]) returns the exact quantiles of the elements in each A(i,:,:) slice.
  - A is a tall numeric array and vecdim includes 1 and all the dimensions of A whose size does not equal 1. For example, if A is a 10-by-1-by-4 array, then quantile(A,p,[1 3]) returns the exact quantiles of the elements in A(:,1,:).
  If A is a tall numeric array and vecdim includes 1 but does not include all the dimensions of A whose size does not equal 1, then you must specify method as "approximate" to use the approximation algorithm. For example, if A is a 10-by-1-by-4 array, you can use quantile(A,p,[1 2],"Method","approximate") to find the approximate quantiles of each page of A.
For more information, see Tall Arrays.
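The dim-related rules can be tried on a small tall array created from in-memory data. This sketch assumes the local MATLAB session is used; the data X and the variable names are illustrative, and no accuracy figure is claimed for the approximate result.

```matlab
mapreducer(0);                        % run in the local MATLAB session

X = reshape(1:4000,[1000 4]);         % illustrative in-memory data
T = tall(X);                          % tall matrix backed by X
p = [0.25 0.5 0.75];

% dim is not 1: exact quantiles along the rows.
Qrows = quantile(T,p,2);

% dim is 1 on a tall matrix: request the approximation algorithm.
Qcols = quantile(T,p,1,"Method","approximate");

% Evaluate both deferred results and bring them into memory.
[Qrows,Qcols] = gather(Qrows,Qcols);

% Exact in-memory reference for comparison with the approximate result.
Qref = quantile(X,p,1);
relErr = max(abs(Qcols(:) - Qref(:))./abs(Qref(:)));
```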
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
- The "all" and vecdim inputs are not supported.
- The Method name-value argument is not supported.
- The dim input argument must be a compile-time constant.
- If you do not specify the dim input argument, the working (or operating) dimension can be different in the generated code. As a result, run-time errors can occur. For more details, see Automatic dimension restriction (MATLAB Coder).
- If the output Q is a vector, the orientation of Q differs from MATLAB® when all of these conditions are true:
  - You do not supply dim.
  - A is a variable-size array, and not a variable-size vector, at compile time, but A is a vector at run time.
  - The orientation of the vector A does not match the orientation of the vector p.
  In this case, the output Q matches the orientation of A, not the orientation of p.
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
The quantile function supports GPU array input with these usage notes and limitations:
- The "all" and vecdim inputs are not supported.
- The Method name-value argument is not supported.
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
Version History
Introduced before R2006a
R2022b: Improved performance with small input data
The quantile function shows improved performance due to faster input parsing. The performance improvement is most significant when input parsing is a greater portion of the computation time. This situation occurs when:
- The size of the input data is small.
- The number of cumulative probabilities for which to compute quantiles is small.
- Computation is along the default operating dimension.
For example, this code calculates four quantiles for a 3000-element matrix. The code is about 4.95x faster than in the previous release.
function timingQuantile
A = rand(300,10);
for k = 1:3e3
    % Cumulative probabilities must lie in the interval [0,1].
    Q = quantile(A,[0.2 0.4 0.6 0.8]);
end
end
The approximate execution times are:
R2022a: 0.94 s
R2022b: 0.19 s
The code was timed on a Windows® 10, Intel® Xeon® CPU E5-1650 v4 @ 3.60 GHz test system using the timeit function:
timeit(@timingQuantile)
R2022a: Moved to MATLAB from Statistics and Machine Learning Toolbox
Previously, quantile required Statistics and Machine Learning Toolbox™.