fcm
Fuzzy c-means clustering
Syntax
Description
[centers,U,objFcn,info] = fcm(___) returns the clustering results for all numbers of clusters used, along with the validity index used for determining the optimal number of clusters. When the distance metric specified in options is either "mahalanobis" or "fmle", info also contains the covariance matrices generated for each number of clusters.
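As a rough illustration of the four-output syntax, the following sketch clusters a small made-up data set with a Mahalanobis distance metric. The data, the seed, and the choice of two clusters are assumptions for illustration only. The field names of info are not listed here, so the sketch displays the structure rather than assuming them.

```matlab
% Illustrative data: two well-separated 2-D blobs (made up for this sketch).
rng(0);
data = [randn(50,2); randn(50,2) + 5];

% Use the "mahalanobis" metric so that info also carries covariance matrices.
% NumClusters, DistanceMetric, and Verbose are fcmOptions properties.
options = fcmOptions(NumClusters=2, DistanceMetric="mahalanobis", Verbose=false);

[centers,U,objFcn,info] = fcm(data,options);

disp(size(centers))   % expected to hold one row per cluster
disp(size(U))         % membership values for every cluster and data point
disp(info)            % inspect the returned fields instead of guessing names
```

The sketch assumes that each column of U corresponds to one data point, so that the memberships in each column sum to one.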
Examples
Input Arguments
Output Arguments
Tips
To generate a fuzzy inference system using FCM clustering, use the genfis function. For example, suppose that you cluster your data using the following syntax.

    [centers,U] = fcm(data,fcmOpt);

The first M columns of data correspond to input variables and the remaining columns correspond to output variables.

You can generate a fuzzy system using the same training data and FCM clustering configuration. To do so:
1. Configure the clustering options.

    opt = genfisOptions("FCMClustering");
    opt.NumClusters = fcmOpt.NumClusters;
    opt.Exponent = fcmOpt.Exponent;
    opt.MaxNumIteration = fcmOpt.MaxNumIteration;
    opt.MinImprovement = fcmOpt.MinImprovement;
    opt.DistanceMetric = fcmOpt.DistanceMetric;
    opt.Verbose = fcmOpt.Verbose;

2. Extract the input and output variable data.

    inputData = data(:,1:M);
    outputData = data(:,M+1:end);

3. Generate the FIS structure.

    fis = genfis(inputData,outputData,opt);
The fuzzy system fis contains one fuzzy rule for each cluster, and each input and output variable has one membership function per cluster. For more information, see genfis and genfisOptions.
Algorithms
FCM is a clustering method that allows each data point to belong to multiple clusters with varying degrees of membership. To configure clustering options, create an fcmOptions object.
The FCM algorithm computes cluster centers and membership values to minimize the following objective function.

$$J_m = \sum_{i=1}^{C} \sum_{j=1}^{N} \mu_{ij}^{m} D_{ij}^{2}$$

Here:
N is the number of data points.
C is the number of clusters. To specify this value, use the NumClusters option.
m is the fuzzy partition matrix exponent for controlling the degree of fuzzy overlap, with m > 1. Fuzzy overlap refers to how fuzzy the boundaries between clusters are, that is, the number of data points that have significant membership in more than one cluster. To specify the fuzzy partition matrix exponent, use the Exponent option.
Dij is the distance from the jth data point to the ith cluster.
μij is the degree of membership of the jth data point in the ith cluster. For a given data point, the sum of the membership values for all clusters is one.
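To make these quantities concrete, the following is a minimal sketch of the classical alternating updates for the Euclidean case, written in base MATLAB without calling fcm. It is not the implementation used by the fcm function. The data, the fixed count of 20 iterations, and the variable names X, D2, W, and J are assumptions for illustration. The objective evaluated at each iteration is the sum over all clusters i and data points j of μij^m times Dij squared.

```matlab
% Minimal sketch of classical (Euclidean) FCM updates; not the fcm implementation.
rng(0);                                  % reproducible illustrative data
X = [randn(30,2); randn(30,2) + 4];      % N-by-d data, two made-up blobs
N = size(X,1);                           % number of data points
C = 2;                                   % number of clusters
m = 2;                                   % fuzzy partition matrix exponent, m > 1

% Random initial memberships; each column (one data point) sums to one.
U = rand(C,N);
U = U ./ sum(U,1);

J = zeros(1,20);                         % objective value per iteration
for iter = 1:20
    Um = U.^m;

    % Cluster centers as membership-weighted means of the data (C-by-d).
    centers = (Um*X) ./ sum(Um,2);

    % Squared Euclidean distance from data point j to cluster center i.
    D2 = zeros(C,N);
    for i = 1:C
        D2(i,:) = sum((X - centers(i,:)).^2, 2)';
    end

    % Objective for the current memberships and centers.
    J(iter) = sum(Um .* D2, 'all');

    % Membership update: mu_ij is proportional to D_ij^(-2/(m-1)),
    % normalized so that the memberships of each data point sum to one.
    D2 = max(D2, eps);                   % guard against division by zero
    W = D2.^(-1/(m-1));
    U = W ./ sum(W,1);
end
```

In this sketch, J should not increase from one iteration to the next, and each column of U sums to one, matching the membership constraint stated above.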
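The fcm function supports three types of FCM clustering:
Classical FCM, which uses a Euclidean distance metric [1]
Gustafson-Kessel FCM, which uses a Mahalanobis distance metric [2]
Gath-Geva FCM, which uses an exponential distance metric based on fuzzy maximum likelihood estimation [3]
These methods differ in the distance metric used for computing Dij. For more information, see Fuzzy Clustering.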
References
[1] Bezdek, James C. Pattern Recognition with Fuzzy Objective Function Algorithms. Boston, MA: Springer US, 1981. https://doi.org/10.1007/978-1-4757-0450-1.
[2] Gustafson, Donald, and William Kessel. “Fuzzy Clustering with a Fuzzy Covariance Matrix.” In 1978 IEEE Conference on Decision and Control Including the 17th Symposium on Adaptive Processes, 761–66. San Diego, CA, USA: IEEE, 1978. https://doi.org/10.1109/CDC.1978.268028.
[3] Gath, I., and A.B. Geva. “Unsupervised Optimal Fuzzy Clustering.” IEEE Transactions on Pattern Analysis and Machine Intelligence 11, no. 7 (July 1989): 773–80. https://doi.org/10.1109/34.192473.
[4] Xie, X.L., and G. Beni. “A Validity Measure for Fuzzy Clustering.” IEEE Transactions on Pattern Analysis and Machine Intelligence 13, no. 8 (August 1991): 841–47. https://doi.org/10.1109/34.85677.