clustering of 1d data

Question

joy on 2 Mar 2015


https://kr.mathworks.com/matlabcentral/answers/181070-clustering-of-1d-data

Edited: Sim on 12 Oct 2020


So let's say I have an array like this:

[1,1,2,3,10,11,13,67,71]

Is there a convenient way to partition the array into something like this?

[[1,1,2,3],[10,11,13],[67,71]]

I searched on this topic, and it seems that kmeans is not a suitable solution for 1D data. Jenks Natural Breaks Optimization or Kernel Density Estimation could be options, but which method is suitable for a MATLAB implementation? Is there any other way to do this in MATLAB?


Answer 1

MS on 11 Sep 2019


https://kr.mathworks.com/matlabcentral/answers/181070-clustering-of-1d-data#answer_391461


Yes, you can apply Jenks Natural Breaks iteratively to split the array into several classes based on the similarity of the elements. I wrote a function that applies this method to a one-dimensional array to split it into two classes. You can call it several times, each time on the elements that still need to be split.

Function: https://www.mathworks.com/matlabcentral/fileexchange/72677-clustering-via-jenks-natural-breaks

Example:

data = [1,1,2,3,10,11,13,67,71];
total = length(data);
% Split the initial array into two classes based on Jenks Natural Breaks
[SDCM_All, GF] = get_jenks_interface(data);
% get the first interface: index of maximum Goodness of Variance Fit
[M, I1] = max(GF);
% extract sub_array_3 (everything above the first interface)
sub_array_3 = data(I1+1:total);

% get the remaining elements (everything up to the first interface)
remaining_elements = data(1:I1);
total = length(remaining_elements);
% Split the remaining elements into two classes based on Jenks Natural Breaks
[SDCM_All, GF] = get_jenks_interface(remaining_elements);
% get the second interface: index that has the maximum Goodness of Variance Fit
[M, I2] = max(GF);
% extract sub_array_2 (indexing remaining_elements makes the intent explicit)
sub_array_2 = remaining_elements(I2+1:total);
% extract sub_array_1
sub_array_1 = remaining_elements(1:I2);
disp(sub_array_1);
disp(sub_array_2);
disp(sub_array_3);

Output:

>> main
1 1 2 3
10 11 13
67 71
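For readers who do not have the File Exchange function at hand, here is a minimal sketch of what a single two-class split by Goodness of Variance Fit could look like. It is not the implementation of get_jenks_interface, whose internals are not shown in this thread. The variable names (sdam, sdcm, gvf, split_idx, lo_part, hi_part) are my own, and the sketch assumes the data is a row vector that is sorted before splitting.

```matlab
% Sketch only: brute-force search for the best two-class split of 1D data.
% Assumption: data is a row vector; it is sorted before splitting.
data = sort([1,1,2,3,10,11,13,67,71]);
n = numel(data);

% Sum of squared deviations from the array mean (SDAM)
sdam = sum((data - mean(data)).^2);

gvf = zeros(1, n-1);
for k = 1:n-1
    lo_part = data(1:k);      % candidate lower class
    hi_part = data(k+1:n);    % candidate upper class
    % Sum of squared deviations from the class means (SDCM) for this split
    sdcm = sum((lo_part - mean(lo_part)).^2) + sum((hi_part - mean(hi_part)).^2);
    % Goodness of Variance Fit: share of the total variance the split explains
    gvf(k) = (sdam - sdcm) / sdam;
end

% The split with the highest GVF is the interface between the two classes
[~, split_idx] = max(gvf);
disp(data(1:split_idx));
disp(data(split_idx+1:n));
```

On the example array the best first split separates 67 and 71 from the rest, which agrees with sub_array_3 in the output above.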

Comments: 1

Sim on 12 Oct 2020

Edited: Sim on 12 Oct 2020


I have just re-written your code in a more compact way, allowing you to select as many classes as you want (in this example, classes = 4):

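clear output sub_array;
% named "values" rather than "input" to avoid shadowing the built-in input function
values = [1,1,2,3,10,11,13,67,71];
classes = 4;
remaining_elements = values;
sub_array = cell(classes-1, 1);
for i = 1 : classes-1
    data = remaining_elements;
    [SDCM_All, GF] = get_jenks_interface(data);
    % index of maximum Goodness of Variance Fit marks the split point
    [M, I1] = max(GF);
    % peel off the upper class; the next pass splits the lower part again
    sub_array{i} = data(I1+1:end);
    remaining_elements = data(1:I1);
end
% lowest class first, then the peeled-off classes in ascending order
output = vertcat({remaining_elements}, flipud(sub_array));
output{:}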

The result with classes = 4 is the following:

ans =
     1     1
ans =
     2     3
ans =
    10    11    13
ans =
    67    71


Answer 2

Adam on 2 Mar 2015


https://kr.mathworks.com/matlabcentral/answers/181070-clustering-of-1d-data#answer_169826


data = [1, 1, 2, 3, 10, 11, 13, 67, 71]'
idx = kmeans( data, 3 );

seems to give the correct clustering if you then apply that indexing to your data.

Unfortunately, one thing I find with kmeans is that the cluster indices come back in an arbitrary order. I think you can stop that by defining the initial cluster centres yourself (the 'Start' option of kmeans).
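The two points above (arbitrary label order, and fixing it through the starting centres) can be sketched as follows. kmeans requires the Statistics and Machine Learning Toolbox, and its 'Start' name-value argument accepts a k-by-p matrix of initial centroids. The starting values [1; 10; 70] are hand-picked guesses for this particular array, and rank_of, idx_sorted and idx_fixed are illustrative names, not part of the original answer.

```matlab
data = [1, 1, 2, 3, 10, 11, 13, 67, 71]';
k = 3;

% Option 1: cluster with the default random start, then relabel so that
% cluster 1 has the smallest centroid, cluster 2 the next, and so on.
[idx, C] = kmeans(data, k);
[~, order] = sort(C);      % order(j) = old label of the j-th smallest centroid
rank_of = zeros(k, 1);
rank_of(order) = 1:k;      % rank_of(old label) = new label
idx_sorted = rank_of(idx);

% Option 2: pass explicit starting centroids (assumed values for this data),
% so that label i belongs to the cluster grown from row i of the matrix.
start_centres = [1; 10; 70];
idx_fixed = kmeans(data, k, 'Start', start_centres);

% Collect the groups as a cell array, one cell per cluster
groups = arrayfun(@(c) data(idx_fixed == c)', 1:k, 'UniformOutput', false);
groups{:}
```

Option 1 only fixes the order of the labels; a random start can still settle on a different grouping from run to run. Option 2 removes the randomness, at the cost of having to guess reasonable centres up front.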
