How do I assign cluster IDs of test data based on DBSCAN?
I'm interested in testing the clustering accuracy of a DBSCAN model. My idea is to split the dataset into training and test partitions via k-fold cross-validation and feed each training partition into the DBSCAN function, which outputs a cluster assignment for every point in the training data. Is it possible to then use this result to determine which clusters new data points should belong to?
In k-means clustering, for instance, I know I can assign a cluster ID to a test data point by computing its distance to each cluster centroid via pdist2 and picking the nearest one. Since centroids don't exist in DBSCAN, I'm not sure what the "equivalent" procedure is.
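For reference, the k-means assignment I have in mind looks roughly like the sketch below. The data and the variable names (Xtrain, Xtest, k) are made up purely for illustration, and it assumes the Statistics and Machine Learning Toolbox is available.

```matlab
% Toy data: two well-separated groups (made-up values for illustration only)
rng(0);
Xtrain = [randn(50,2); randn(50,2) + 10];
Xtest  = [0.2 -0.1; 9.8 10.3];

k = 2;                                 % assumed number of clusters
[trainIdx, C] = kmeans(Xtrain, k);     % C holds the k centroids, one per row

% Distance from every test point to every centroid (nTest-by-k matrix)
D = pdist2(Xtest, C);

% The nearest centroid gives each test point its cluster ID
[~, testIdx] = min(D, [], 2);
```

Here testIdx uses the same cluster numbering as trainIdx, so the two can be compared directly.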
Edit: Thinking it over, my initial idea is the following. For each test data point, find the nearest data point in the "trained" model, i.e., the training point with the smallest Euclidean distance, and assign the test point to the same cluster. The test point gets no cluster if its nearest training point is an outlier or if the distance exceeds some threshold. Would that be a good approach?
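A minimal sketch of that nearest-neighbor idea is below. The toy data and variable names are hypothetical, and using epsilon itself as the distance threshold is just my assumption; any other cutoff could be substituted.

```matlab
% Toy data: two dense groups plus one isolated point (made-up values)
rng(0);
Xtrain = [0.3*randn(50,2); 0.3*randn(50,2) + 5; 20 20];
Xtest  = [0.1 0.1; 5.1 4.9; 12 12; 20.1 20];

epsilon = 1;    % assumed neighborhood radius
minpts  = 5;    % assumed minimum number of neighbors for a core point
trainIdx = dbscan(Xtrain, epsilon, minpts);   % -1 marks noise points

% Nearest training point (Euclidean distance) for each test point
[nnIdx, nnDist] = knnsearch(Xtrain, Xtest);

% Inherit the label of the nearest training point. If that point is an
% outlier, its label is already -1, so the test point becomes noise too.
testIdx = trainIdx(nnIdx);

% Assumption: points farther than epsilon from any training point are noise
testIdx(nnDist > epsilon) = -1;
```

dbscan can also return a logical vector flagging core points as a second output. Restricting the search to those rows, e.g. knnsearch(Xtrain(isCore,:), Xtest), might be closer to how DBSCAN itself attaches border points to clusters, but I haven't settled on which variant is more appropriate.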