Sort Group Data Trends

Question

T on 3 Apr 2013


https://kr.mathworks.com/matlabcentral/answers/69728-sort-group-data-trends

Suppose you had a vector

Is it possible to sort this data into groups automatically, without knowing in advance whether the vector has groups in 1.X, 2.X or 4.X? Is it possible to detect a trend?

Group#1:
2 
3
4
5
Group#2:
4
5
6
7
8
9
Group#3:
1
2
3
4
6
7

Comments: 3

T on 3 Apr 2013

Edited: T on 3 Apr 2013

Right, but suppose you don't know what the other groups could be and you just have a bunch of data, say:

93420752164
78836160714
53852291667
56820951705
26671073718
92029910714
44986160714
83251
44219642857
91854589844
7494221591
72721875
7595013587
1245013587
4331732955
3921853147
4290520833
0803513889
4949040616
1499960937
1340036932
80777125
8052548077
8122678571
8893194444
6083854167
6444630682
30253125
4106951531
6680576002

There are three groups here: 7k, 11k and 15k. In the simple example you showed, you assumed that we already know what the groups would look like.

Sven on 3 Apr 2013

So how do you know that there are three groups?

If your criterion is "my human brain looked at the numbers and, using all my years of experience with numbers and the knowledge of my current problem, I chose 3", then things will be difficult.

If instead you can describe a thought process such as:

I took the smallest number (7885) and saw it had 4 digits
I divided all numbers by 10^(4-1)
I rounded all the numbers down, and grouped by the result.

... then we might be able to do something.


Answer 1

Sven on 3 Apr 2013


https://kr.mathworks.com/matlabcentral/answers/69728-sort-group-data-trends#answer_80919

Edited: Sven on 3 Apr 2013

Anthony, here's a solution relevant to my comment above. Note that there are in fact 4 groups: 7k, 8k, 11k, and 15k.

nums = [7885.93420752164
78836160714
53852291667
56820951705
26671073718
92029910714
44986160714
83251
44219642857
91854589844
7494221591
72721875
7595013587
1245013587
4331732955
3921853147
4290520833
0803513889
4949040616
1499960937
1340036932
80777125
8052548077
8122678571
8893194444
6083854167
6444630682
30253125
4106951531
6680576002]
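% Number of digits in the integer part of the smallest value
minDigits = length(num2str(round(min(nums))));
% Scale so the smallest value has a single digit before the decimal point
rescaledNums = nums / 10^(minDigits-1);
% The integer part of each rescaled value is its group prefix;
% groups(i) is the index into grpPrefixes for nums(i)
[grpPrefixes, ~, groups] = unique(floor(rescaledNums))
grpPrefixes =
     7
     8
    11
    15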
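
As a rough sketch of how the `groups` index vector from the code above might be used, the snippet below repeats the same three steps on a short made-up vector and then collects the members of each group into a cell array. The values in `nums` are illustrative only and are not the data from this thread; `members` is a name introduced here for the example.

```matlab
% Illustrative values only (not the data from the thread)
nums = [7885.9; 7912.3; 8105.4; 11020.7; 11544.2; 15003.1; 15870.6];

% Same steps as the answer: scale by the digit count of the smallest value
minDigits    = length(num2str(round(min(nums))));
rescaledNums = nums / 10^(minDigits - 1);
[grpPrefixes, ~, groups] = unique(floor(rescaledNums));

% Collect the members of each group into one cell per prefix
members = accumarray(groups, nums, [], @(v) {v});

% Print each prefix followed by the values assigned to it
for k = 1:numel(grpPrefixes)
    fprintf('%dk: %s\n', grpPrefixes(k), mat2str(members{k}.'));
end
```

Note that `accumarray` does not promise any particular order of the values inside each cell, so sort `members{k}` if the order matters.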

Comments: 3

Sven on 4 Apr 2013

Anthony, in your first example you separated numbers that were less than 1 apart. Here you're merging numbers "because their difference is 26". Can you see the inconsistency? I agree that a human mind can see patterns, but can you see that if you want a computer to find the same pattern as your mind you need to describe how you are choosing your separation?

If you can describe clearly why you chose 3 groups for your first example and then (using the exact same reasoning!) have it also choose 3 groups for your second example, then we can help you cluster your data. For example, the logic I gave in my first comment clusters your first example into 3 groups and your second example into 4 groups, but at least it is completely unambiguous and can therefore be coded.
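
To illustrate what an unambiguous separation rule could look like, here is a minimal sketch of one hypothetical alternative: sort the values and start a new group wherever the jump to the next value exceeds a fixed threshold. Nobody in this thread proposed this rule. The vector `nums` and the value of `gapThreshold` are assumptions chosen purely for illustration.

```matlab
% Illustrative values only (not the data from the thread)
nums = [7885.9; 15003.1; 7912.3; 11020.7; 8105.4; 15870.6; 11544.2];
gapThreshold = 1000;   % hypothetical: smallest jump that starts a new group

[sortedNums, order] = sort(nums);

% A new group starts at the first value and wherever the jump exceeds the threshold
startsGroup  = [true; diff(sortedNums) > gapThreshold];
sortedGroups = cumsum(startsGroup);

% Map the group numbers back to the original ordering of nums
groups = zeros(size(nums));
groups(order) = sortedGroups;
```

With this threshold the values near 7.9k and 8.1k end up in the same group, whereas the leading-digit rule would separate them. The result depends entirely on the rule you state.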

If you know in advance how many clusters there are, then you can use kmeans(), which will (possibly) conform to how you would expect the clustering to be done:

groups = kmeans(nums,3)
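
A slightly fuller sketch of the same call, assuming the Statistics and Machine Learning Toolbox is installed (`kmeans` is part of it). The values in `nums` are made up for illustration, and `newLabel` is a helper name introduced here. Because `kmeans` starts from random centers and assigns arbitrary label numbers, the sketch fixes the random seed, asks for several replicates, and relabels the clusters in order of increasing center.

```matlab
% Illustrative values only (not the data from the thread)
nums = [7885.9; 7912.3; 8105.4; 11020.7; 11544.2; 15003.1; 15870.6];

rng(0);   % fix the random seed so repeated runs give the same result

% 'Replicates' reruns the algorithm from several random starts and keeps the best
[groups, centers] = kmeans(nums, 3, 'Replicates', 5);

% Cluster labels are arbitrary, so relabel them in order of increasing center
[sortedCenters, order] = sort(centers);
newLabel = zeros(3, 1);
newLabel(order) = 1:3;
groups = newLabel(groups);
```

After relabeling, group 1 holds the smallest values and group 3 the largest, which makes the output easier to compare with the other approaches.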

Does that help you out?

T on 4 Apr 2013

I'll look into it. Thanks!
