Clustering/sorting of close data points
Hello all,
I would like to bundle up some close data points. This is what my data looks like: 214981 366893 455877 455877 455877 455878 889359 889359 1443570 1443570........
Can anybody suggest an easy way to do this?
Thanks.
Raju
Comments: 6
Amish Saxena
6 Jul 2022
Can you please explain your question in a bit more detail?
Do you want to do something like K-Means clustering, or do you want to separate the values (those that differ by less than some specific number) into sets or arrays?
Walter Roberson
6 Jul 2022
Raju Kumar
6 Jul 2022
Walter Roberson
6 Jul 2022
uniquetol() will do it, but you have to decide how you want to classify points as "close". In particular, is "close" based on absolute difference, or is it proportional (such as 5%)?
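As a minimal sketch of the uniquetol() suggestion, using the ten values from the question: the tolerance values (5 and 0.05) are assumptions chosen only for illustration. Note that by default uniquetol() scales the tolerance by the largest absolute value in the data, so it is not a per-value percentage; passing 'DataScale', 1 turns the tolerance into an absolute difference.

```matlab
% Sample values copied from the question.
data = [214981 366893 455877 455877 455877 455878 889359 889359 1443570 1443570];

% Absolute tolerance (assumed value): with 'DataScale', 1, two values
% are treated as the same when abs(u - v) <= absTol.
absTol = 5;
[C, ~, ic] = uniquetol(data, absTol, 'DataScale', 1);

% C holds one representative per group; ic maps each data point to its
% group, so data is approximately C(ic).
fprintf('Absolute tolerance: %d groups\n', numel(C));

% Default scaling (assumed tolerance 0.05): two values are treated as
% the same when abs(u - v) <= relTol * max(abs(data(:))).
relTol = 0.05;
[Crel, ~, icRel] = uniquetol(data, relTol);
fprintf('Scaled tolerance: %d groups\n', numel(Crel));
```

The third output (ic here) is the group label for each point, which is usually what you want for "bundling".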
Raju Kumar
11 Jul 2022
Edited: Raju Kumar
11 Jul 2022
Walter Roberson
11 Jul 2022
Consider for example your value 455877: what are the minimum and maximum values that you wish to be considered part of the same group as 455877, if those values were encountered as data?
Consider also 214981: what are the minimum and maximum values that you wish to be considered part of the same group as 214981, if those values were encountered as data?
I ask about two different values because the boundary for higher values might not have the same range as for lower values. For example, in [1 8 25] the 8 might be considered relatively far from the 1 (since it is 8 times the value), whereas by the time you get to 200000, the value 200010 might be considered "close" to 200000, since the difference is pretty small relative to the value.
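To make the absolute-versus-relative distinction concrete, here is a small sketch that sorts the values and starts a new group wherever the gap to the previous value exceeds a threshold. This is a gap-based alternative to uniquetol(), not something proposed in the thread; the helper name labelGaps and the thresholds (10, 5%, and 5) are assumptions for illustration.

```matlab
% Hypothetical helper: x is a sorted column vector, thresh is a scalar
% or a vector with one threshold per gap. A new group starts wherever
% the gap to the previous value is larger than the threshold.
labelGaps = @(x, thresh) cumsum([true; diff(x) > thresh]);

% The illustrative values from the comment above.
y = [1; 8; 25; 200000; 200010];

% Absolute rule (assumed threshold 10): 1 and 8 end up together.
gAbs = labelGaps(y, 10);

% Relative rule (assumed 5% of the previous value): 1 and 8 are split,
% but 200000 and 200010 stay together.
gRel = labelGaps(y, 0.05 * y(1:end-1));

disp([y gAbs gRel]);

% The same absolute rule applied to the data from the question
% (assumed threshold 5).
toa = sort([214981 366893 455877 455877 455877 455878 889359 889359 1443570 1443570]');
gToa = labelGaps(toa, 5);
disp([toa gToa]);
```

Which rule is appropriate depends entirely on the answers to the questions above; the code only shows that the two rules can disagree.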
Answers (1)
This is really too broad a question to answer yet. You haven't even plotted your data or told us what "close" is. I suggest you start by reading this page:
Maybe you can simply take the histogram.
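A rough sketch of the histogram idea, using the ten values from the question. The bin width of 250000 is an assumption picked only for illustration; it directly controls which values are counted together.

```matlab
% Sample values copied from the question.
data = [214981 366893 455877 455877 455877 455878 889359 889359 1443570 1443570];

% Assumed bin edges: fixed-width bins of 250000 covering the data range.
edges = 0:250000:1500000;

% Number of values falling in each bin.
counts = histcounts(data, edges);

% Bin number for each individual value, usable as a crude group label.
binIdx = discretize(data, edges);

disp(counts);
disp(binIdx);
```

With these edges, 366893 lands in the same bin as the 4558xx values, which shows how sensitive the result is to the bin choice. Calling histogram(data, edges) draws the same counts as a plot.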
If you have any more questions, then attach your data and code to read it in with the paperclip icon after you read this:
This is how I'd classify your data. With so few data points, you can basically do it manually. For far more data points, you can try the Classification Learner app on the Apps tab of the tool ribbon. Even for these few, it looks like an SVM might be good. But please attach far more data so we can find the best classifier.
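data = [214981 366893 455877 455877 455877 455878 889359 889359 1443570 1443570]';
% Manually assigned class label for each data point.
classes = [1,1,1,1,1,1,2,2,3,3]';
% Plot each value against its index in the vector.
plot(data, 'b.', 'MarkerSize', 30);
grid on;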
Comments: 2
Raju Kumar
12 Jul 2022
Edited: Raju Kumar
12 Jul 2022
Exactly how are you seeing clusters in that data?
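% Load the attached file (no semicolon, so the loaded fields are displayed).
s = load('raju.mat')
toa = s.ToA;
% Labels left over from the earlier 10-point example; not used in this plot.
classes = [1,1,1,1,1,1,2,2,3,3]';
% Plot each value against its index in the vector.
plot(toa, 'b.', 'MarkerSize', 10);
grid on;
xlabel('Index of Vector');
ylabel('Value of toa');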