Core points of clusters
I need to find the center points of clusters. I used dbscan for clustering, and now I need to find the core points of these clusters. I used the corepts output, but it is a logical array. How can I find the core points of the clusters, or at least one point contained in each cluster? Any help would be appreciated.
[idx, corepts] = dbscan(asc,epsilon,minpts);
Accepted Answer
  Ameer Hamza
      
      
 March 7, 2020
As discussed here, https://stackoverflow.com/questions/52364959/how-to-find-center-points-of-dbscan-clusrering-in-sklearn and here, https://www.quora.com/Is-there-anything-equivalent-to-a-centroid-in-DBSCAN, DBSCAN does not define a center for a cluster. However, it does identify core points. Because corepts is a logical array, you can use it to index the rows of your input matrix (data below, asc in your code) and get the core points:
core = data(corepts, :);
This returns all rows containing core points. Similarly, you can get the cluster number of each core point:
corr_idx = idx(corepts, :);
As an example, try this
data=xlsread('glass.xlsx');
minpts=6;
epsilon=4;
[idx, corepts] = dbscan(data,epsilon,minpts);
fig1 = figure();
gscatter(data(:,1),data(:,2),idx);
fig2 = figure();
core=data(corepts, :);
corr_idx = idx(corepts, :);
gscatter(core(:,1),core(:,2),corr_idx);
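If glass.xlsx is not at hand, the same indexing can be tried on synthetic data. The following is only a minimal sketch: the two random blobs, the epsilon and minpts values, and the names core_row, labels, firstPos, and representatives are assumptions for illustration and do not come from the original question. It also shows one possible way to pick a single point per cluster, which is what the question asks for as a fallback. dbscan requires the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic 2-D data (an assumption for illustration; not the glass data).
rng(0);                                   % reproducible random numbers
data = [randn(50,2)*0.3;                  % blob around (0,0)
        randn(50,2)*0.3 + 5];             % blob around (5,5)
epsilon = 1;                              % neighborhood radius (assumed value)
minpts  = 5;                              % minimum number of neighbors (assumed value)

[idx, corepts] = dbscan(data, epsilon, minpts);

% corepts is a logical column vector: true where a row of data is a core point.
core     = data(corepts, :);              % coordinates of the core points
core_idx = idx(corepts);                  % cluster label of each core point
core_row = find(corepts);                 % row numbers of the core points in data

% One representative per cluster: the first core point listed for each label.
[labels, firstPos] = unique(core_idx, 'first');
representatives = core(firstPos, :);
```

Here find(corepts) converts the logical array into ordinary row numbers, which can be easier to work with when the positions in the original matrix matter. The representative chosen this way is arbitrary; it is guaranteed to lie inside the cluster, but not to be near its middle.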
  Ameer Hamza
      
      
 March 8, 2020
I think you misunderstood the meaning of core points. All the points shown in the image in my last comment are core points of that cluster. A core point in DBSCAN is not the center of the cluster. If you want to find the five points closest to the center of each cluster (the center as I calculated it in the last comment, by taking the average of the cluster), you can try the following code:
clc;
clear;
data=xlsread('glass.xlsx');
minpts=6;
epsilon=4;
[idx, corepts] = dbscan(data,epsilon,minpts);
fig1 = figure();
gscatter(data(:,1),data(:,2),idx);
fig2 = figure();
ax = axes();
hold on;
core=data(corepts, :);
core_idx = idx(corepts, :);
gscatter(core(:,1),core(:,2),core_idx);
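% One center per cluster: the mean of that cluster's core points.
centers = splitapply(@(x) mean(x, 1), core, core_idx);
numClusters = size(centers, 1);   % 6 for this data set; avoids hard-coding it
gscatter(centers(:,1), centers(:,2), (1:numClusters)');
% The center markers are the most recently added children of the axes.
for i = 1:numClusters
    ax.Children(i).Marker = 'x';
    ax.Children(i).MarkerSize = 30;
    ax.Children(i).LineWidth = 10;
end
% Collect the core points of each cluster into one cell per cluster.
clusters = splitapply(@(x) {x}, core, core_idx);
closest_points = cell(1, numClusters);
closest_idx = cell(1, numClusters);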
for i = 1:length(clusters)
    [~, index] = mink(sum((clusters{i}-centers(i,:)).^2,2), 5, 1);
    closest_points{i} = clusters{i}(index,:);
    closest_idx{i} = i*ones(size(closest_points{i},1),1);
end
closest_points = cell2mat(closest_points');
closest_idx = cell2mat(closest_idx');
g = gscatter(closest_points(:,1), closest_points(:,2), closest_idx);
[g.MarkerSize] = deal(30);
[g.Color] = deal([0 0 0]);
In the resulting plot, the closest points are shown in black. Note that the distance is calculated in all 11 dimensions, so these points may not appear close to the center in the 2-D plot, even though they are the closest ones when all 11 dimensions are considered.
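The distance step can also be looked at separately from the plotting. The sketch below is an illustration under assumptions, not the code from the answer: the synthetic 3-D data, the epsilon and minpts values, and the names k, labels, nearest, pts, center, and d2 are all hypothetical. It uses a plain loop over whatever cluster labels dbscan returns, so the number of clusters does not need to be known in advance.

```matlab
% Synthetic 3-D data (assumed for illustration; not the glass data).
rng(1);                                    % reproducible random numbers
data = [randn(40,3)*0.4;                   % blob around (0,0,0)
        randn(40,3)*0.4 + 4];              % blob around (4,4,4)
[idx, corepts] = dbscan(data, 1.5, 5);     % epsilon and minpts are assumed values

core     = data(corepts, :);               % core points only
core_idx = idx(corepts);                   % their cluster labels
k = 5;                                     % how many points to keep per cluster

labels  = unique(core_idx);
nearest = cell(numel(labels), 1);
for c = 1:numel(labels)
    pts    = core(core_idx == labels(c), :);   % core points of this cluster
    center = mean(pts, 1);                     % mean over all columns
    d2     = sum((pts - center).^2, 2);        % squared Euclidean distance, all columns
    [~, order] = mink(d2, k);                  % fewer than k if the cluster is small
    nearest{c} = pts(order, :);
end
```

Squared distances are enough here because only the ordering matters. Setting k = 1 gives a single core point per cluster that lies closest to the cluster mean, which can serve as a representative point when an actual member of the cluster is needed rather than the mean itself.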
