clusterDBSCAN.estimateEpsilon

Estimate neighborhood clustering threshold

Since R2021a

Description

epsilon = clusterDBSCAN.estimateEpsilon(X,MinNumPoints,MaxNumPoints) returns an estimate of the neighborhood clustering threshold, epsilon, used in the density-based spatial clustering of applications with noise (DBSCAN) algorithm. epsilon is computed from input data X using a k-nearest neighbor (k-NN) search. MinNumPoints and MaxNumPoints set a range of k-values for which epsilon is calculated. The range extends from MinNumPoints – 1 through MaxNumPoints – 1. k is the number of neighbors of a point, which is one less than the number of points in a neighborhood.

clusterDBSCAN.estimateEpsilon(X,MinNumPoints,MaxNumPoints), called with no output argument, displays a figure showing the k-NN search curves and the estimated epsilon. The estimate is computed from X over the same range of k-values as in the syntax above, MinNumPoints – 1 through MaxNumPoints – 1.
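The k-range described above can be illustrated with a minimal sketch that uses only base MATLAB. It builds one sorted k-NN distance curve for each k from MinNumPoints – 1 through MaxNumPoints – 1. This is an illustration of the k-NN search idea, not the function's implementation: how estimateEpsilon derives a single epsilon from such curves is not described on this page and is not reproduced here. The seed, the two-cluster data, and the names kValues, D, Dsorted, and knnCurves are assumptions made for the sketch.

```matlab
% Sketch only: k-NN distance curves over the k-range implied by
% MinNumPoints and MaxNumPoints. The step that turns these curves
% into a single epsilon estimate is NOT reproduced here.
rng(0);                                    % fixed seed (assumption, for repeatability)
X = [randn(20,2) + [11.5,11.5]; randn(20,2) + [25,15]];

minNumPoints = 3;
maxNumPoints = 5;
kValues = (minNumPoints-1):(maxNumPoints-1);   % k = 2, 3, 4

% Pairwise Euclidean distances using base MATLAB only.
N = size(X,1);
D = zeros(N);
for p = 1:size(X,2)
    D = D + (X(:,p) - X(:,p).').^2;        % implicit expansion gives an N-by-N matrix
end
D = sqrt(D);

% Sort each row. Column 1 is the distance from a point to itself (0),
% so column k+1 holds the distance to the k-th nearest neighbor.
Dsorted = sort(D,2);
knnCurves = sort(Dsorted(:,kValues+1),1);  % one ascending curve per k

plot(knnCurves)
xlabel('Index')
ylabel('k-NN distance')
legend("k = " + kValues)
```

Each column of knnCurves corresponds to one k in the range, so widening the gap between MinNumPoints and MaxNumPoints adds more curves to the search.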

Examples

Create simulated target data and use the clusterDBSCAN.estimateEpsilon function to calculate an appropriate epsilon threshold.

Create the target data as xy Cartesian coordinates.

X = [randn(20,2) + [11.5,11.5]; randn(20,2) + [25,15]; ...
    randn(20,2) + [8,20]; 10*rand(10,2) + [20,20]];

Set the range of values for the k-NN search.

minNumPoints = 15;
maxNumPoints = 20;

Estimate the clustering threshold epsilon and display its value on a plot.

clusterDBSCAN.estimateEpsilon(X,minNumPoints,maxNumPoints)

[Figure: Estimated Epsilon. x-axis: Index; y-axis: Epsilon. Legend: Estimated Epsilon, Time-Averaged Epsilon.]

Use the estimated epsilon value, 3.62 for this data set, in the clusterDBSCAN clusterer. Because X is generated randomly, your estimate may differ. Then, plot the clusters.

clusterer = clusterDBSCAN('MinNumPoints',6,'Epsilon',3.62, ...
    'EnableDisambiguation',false);
[idx,cidx] = clusterer(X);
plot(clusterer,X,idx)

[Figure: Clusters. x-axis: Dimension 1; y-axis: Dimension 2.]
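Instead of reading the estimate off the plot and typing it in, the output-argument syntax from the Description can feed the clusterer directly. The following is a minimal sketch of that workflow. The seed value is an arbitrary choice made here for repeatability, and the clusterer settings are copied from the example above.

```matlab
rng(2021)   % fixed seed; the value is an arbitrary choice for illustration

% Same target data layout as in the example above.
X = [randn(20,2) + [11.5,11.5]; randn(20,2) + [25,15]; ...
    randn(20,2) + [8,20]; 10*rand(10,2) + [20,20]];

% Capture the estimate in a variable instead of hard-coding it.
epsilon = clusterDBSCAN.estimateEpsilon(X,15,20);

% Reuse the clusterer settings from the example above.
clusterer = clusterDBSCAN('MinNumPoints',6,'Epsilon',epsilon, ...
    'EnableDisambiguation',false);
[idx,cidx] = clusterer(X);
plot(clusterer,X,idx)
```

Passing the variable keeps the clustering step consistent with the data whenever X changes.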

Input Arguments

X

Input feature data, specified as a real-valued N-by-P matrix. The N rows correspond to feature points in a P-dimensional feature space. The P columns contain the values of the features over which clustering takes place. The DBSCAN algorithm can cluster any type of data with appropriate MinNumPoints and Epsilon settings. For example, a two-column input can contain xy Cartesian coordinates, or range and Doppler.

Data Types: double

MinNumPoints

Starting value of the k-NN search range, specified as a positive integer. The starting value of k is one less than MinNumPoints.

Example: 10

Data Types: double

MaxNumPoints

Ending value of the k-NN search range, specified as a positive integer. The ending value of k is one less than MaxNumPoints.
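As a worked example of how the two arguments map to k, the sketch below checks the documented constraint (positive integers) and computes the resulting range for the values used in the Examples section. The check that the minimum does not exceed the maximum is an assumption made here; this page does not state it. The names kStart, kEnd, and numK are hypothetical helpers.

```matlab
minNumPoints = 15;
maxNumPoints = 20;

% Both arguments are documented as positive integers.
validateattributes(minNumPoints,{'double'},{'scalar','integer','positive'});
validateattributes(maxNumPoints,{'double'},{'scalar','integer','positive'});

% Assumption (not stated on this page): the range should be non-decreasing.
assert(minNumPoints <= maxNumPoints, ...
    'minNumPoints must not exceed maxNumPoints.');

kStart = minNumPoints - 1;   % 14
kEnd   = maxNumPoints - 1;   % 19
numK   = kEnd - kStart + 1;  % 6 values of k in the search range
fprintf('k ranges from %d to %d (%d values)\n',kStart,kEnd,numK);
```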

Output Arguments

epsilon

Estimated epsilon, returned as a positive scalar.

Algorithms

Extended Capabilities

Version History

Introduced in R2021a