Package: clustering.evaluation
Superclasses: ClusterCriterion
Gap criterion clustering evaluation object
GapEvaluation is an object consisting of sample data, clustering
data, and gap criterion values used to evaluate the optimal number of clusters. Create a
gap criterion clustering evaluation object using evalclusters.
eva = evalclusters(x,clust,'Gap') creates a gap criterion clustering evaluation object.
eva = evalclusters(x,clust,'Gap',Name,Value) creates a gap criterion clustering evaluation object using additional options specified
by one or more name-value pair arguments.
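The two syntaxes above might be used as in the following sketch. It assumes the Statistics and Machine Learning Toolbox is installed and uses the fisheriris sample data set (variable meas) purely for illustration. The 'KList', 'B', and 'Distance' options are name-value arguments of evalclusters that are not listed in the text above. 'KList' is included because evalclusters needs the candidate cluster counts when clust is an algorithm name.

```matlab
% Sketch only: requires Statistics and Machine Learning Toolbox.
load fisheriris      % sample data; provides meas, a 150-by-4 numeric matrix
rng('default')       % fix the seed so the generated reference data is reproducible

% Basic syntax: k-means clustering, evaluating 1 through 6 clusters.
eva = evalclusters(meas,'kmeans','Gap','KList',1:6);

% Name-value syntax: fewer reference data sets and a different distance metric.
eva2 = evalclusters(meas,'kmeans','Gap','KList',1:6, ...
    'B',50,'Distance','cityblock');

disp(class(eva))     % show the class of the returned evaluation object
```

Fixing the random seed matters here because the gap criterion depends on randomly generated reference data, so repeated runs can otherwise give slightly different criterion values.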

Number of data sets generated from the reference distribution, stored as a positive integer value. 

Clustering algorithm used to cluster the input data, stored
as a valid clustering algorithm name or function handle. If the clustering
solutions are provided in the input, 

Name of the criterion used for clustering evaluation, stored as a valid criterion name. 

Criterion values corresponding to each proposed number of clusters
in 

Distance metric used for clustering data, stored as a valid distance metric name. 

Expectation of the natural logarithm of W based on the
generated reference data, stored as a vector of scalar values.
W is the within-cluster dispersion computed using the
distance metric 

List of the number of proposed clusters for which to compute criterion values, stored as a vector of positive integer values. 

Natural logarithm of W based on the input data, stored
as a vector of scalar values. W is the within-cluster
dispersion computed using the distance metric


Logical flag for excluded data, stored as a column vector of
logical values. If 

Number of observations in the data matrix 

Optimal number of clusters, stored as a positive integer value. 

Optimal clustering solution corresponding to 

Reference data generation method, stored as a valid reference distribution name. 

Standard error of the natural logarithm of W with
respect to the reference data for each number of clusters in


Method for determining the optimal number of clusters, stored as a valid search method name. 

Standard deviation of the natural logarithm of W with
respect to the reference data for each number of clusters in


Data used for clustering, stored as a matrix of numerical values. 
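The dispersion-related properties described above are connected by the gap statistic of Tibshirani et al. [1]. The following summary uses the notation of that paper, not necessarily the notation of this object: $W_k$ is the within-cluster dispersion for $k$ clusters, $B$ is the number of reference data sets, and $\mathrm{sd}_k$ is the standard deviation of $\log W_k$ over the reference data sets. That the stored standard error follows the second formula exactly is an assumption based on the paper.

```latex
% Gap value for k clusters: expected log-dispersion under the
% reference distribution minus the observed log-dispersion.
\mathrm{Gap}(k) = E^{*}\!\left[\log W_k\right] - \log W_k

% Standard error derived from the standard deviation over the
% B reference data sets (simulation-error correction from [1]).
s_k = \mathrm{sd}_k \sqrt{1 + \frac{1}{B}}
```

A larger gap value indicates that the clustering of the input data is tighter than would be expected for data drawn from the reference distribution.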
increaseB  Increase reference data sets 
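A minimal sketch of how increaseB might be called, assuming the Statistics and Machine Learning Toolbox and the fisheriris sample data. The property name B (the number of reference data sets) and the 'KList' and 'B' name-value arguments are assumptions taken from the toolbox, because they are not named in the text above.

```matlab
% Sketch only: requires Statistics and Machine Learning Toolbox.
load fisheriris      % sample data; provides meas, a 150-by-4 numeric matrix
rng('default')       % fix the seed for reproducible reference data

% Start with a small number of reference data sets to keep the first run fast.
eva = evalclusters(meas,'kmeans','Gap','KList',1:4,'B',20);

% Add 30 more reference data sets; the evaluation criteria stay the same.
eva = increaseB(eva,30);

disp(eva.B)          % total number of reference data sets after the increase
```

Starting small and increasing afterwards avoids recomputing the reference data sets that were already generated.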
[1] Tibshirani, R., G. Walther, and T. Hastie. “Estimating the number of clusters in a data set via the gap statistic.” Journal of the Royal Statistical Society: Series B. Vol. 63, Part 2, 2001, pp. 411–423.