Evaluate quality of a classification feature based on distance matrix

LMarcel on 27 Nov 2019
Commented: LMarcel on 28 Nov 2019
Hello,
I am currently working on feature extraction from measurement data that have been preprocessed into normalized 1D histograms. Since I don't know in advance where the relevant features are (they are expected to lie within certain ranges of histogram indices), I am "scanning" my data by applying custom distance metrics to data sets with known labels. Applying a metric yields a square matrix of pairwise distances, as shown below. The upper-left and lower-right quadrants always contain the pairwise distances within the two respective clusters; the other two quadrants contain the distances between the clusters. The indices belonging to each cluster are always known.
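To make the quadrant layout concrete, here is a minimal sketch of how the within-cluster and between-cluster distances could be pulled out of such a matrix. The toy data and the names `D`, `idxA`, `idxB`, `withinA`, `withinB` and `between` are assumptions for illustration only and are not taken from the real data set.

```matlab
% Hypothetical toy data: 4 items in cluster A, 3 items in cluster B.
rng(0);                                % reproducible toy points
X = [randn(4,2); randn(3,2) + 5];      % cluster B is shifted away from A
n = size(X,1);

% Stand-in for the real distance matrix (plain Euclidean, no toolbox needed).
D = zeros(n);
for i = 1:n
    for j = 1:n
        D(i,j) = norm(X(i,:) - X(j,:));
    end
end

idxA = 1:4;                            % known indices of cluster A
idxB = 5:7;                            % known indices of cluster B

% Within-cluster blocks (upper-left and lower-right quadrants).
DA = D(idxA, idxA);
DB = D(idxB, idxB);
% Between-cluster block (upper-right quadrant; lower-left is its transpose).
DAB = D(idxA, idxB);

% Keep each unordered pair once and skip the zero diagonal.
withinA = DA(triu(true(numel(idxA)), 1));
withinB = DB(triu(true(numel(idxB)), 1));
between = DAB(:);
```

Any single-value score can then be computed from these three vectors, so the blocks only have to be extracted once per scanned feature.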
To evaluate candidate features quickly (there is a lot of data to scan), I came up with a measure intended to reflect the discriminative power of the feature under observation:
  • score=
Although this score works quite well in most cases and is easy and fast to compute, I would like to know whether there is a better, more expressive measure that is computationally efficient, yields a single value, and is less sensitive to outliers in the data.
Unfortunately, the approaches I have found use raw data (instead of distances, which are definitely the input here) and/or require an interpretation of the result.
I hope y'all get what I am looking for and that my approach is not fundamentally stupid. If it is, let me know :)
Thanks in advance!
Distance_Matrices.png
Comments
LMarcel on 27 Nov 2019
Edited: LMarcel on 27 Nov 2019
I accidentally uploaded the wrong distance matrix at first. Sorry for that.
Attached is the one matching the example data set, together with the distance computation for the first and second "runs", using indices 66:75 and the Euclidean distance:
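>> idx = 66:75;  % histogram indices under test
>> sample1 = sampledata.DATA{1,1}(idx) ./ sum(sampledata.DATA{1,1}(idx))  % run 1, renormalized over the window
>> sample2 = sampledata.DATA{2,1}(idx) ./ sum(sampledata.DATA{2,1}(idx))  % run 2, renormalized over the window
>> D_12 = pdist2(sample1', sample2', 'euclidean')  % a single distance if DATA{k,1} is a column vector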
distmatrix_euclidean_66_75.png
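For reference, the same computation can be extended from two runs to all runs at once. The following is only a sketch: it assumes `sampledata.DATA` is an N-by-1 cell array with one histogram vector per run, and it uses a random toy `DATA` as a stand-in so that it runs on its own. `S`, `nRuns` and `w` are names made up for this example.

```matlab
% Toy stand-in for sampledata.DATA (assumed layout: N-by-1 cell array,
% one histogram vector per run).
rng(1);
DATA = cell(6,1);
for k = 1:numel(DATA)
    DATA{k,1} = rand(100,1);
end

idx = 66:75;                           % histogram window under test
nRuns = numel(DATA);
S = zeros(nRuns, numel(idx));          % one renormalized window per row
for k = 1:nRuns
    w = DATA{k,1}(idx);
    S(k,:) = w(:).' ./ sum(w);         % renormalize so the window sums to 1
end

% All pairwise Euclidean distances between the rows of S
% (pdist and squareform are in the Statistics and Machine Learning Toolbox).
D = squareform(pdist(S, 'euclidean'));
```

Here `D(1,2)` corresponds to `D_12` from the two-run computation, and the full `D` has the quadrant structure described in the question as long as the runs are ordered by cluster.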
LMarcel on 28 Nov 2019
@ Image Analyst: Was that the input you were asking for or did I get it wrong? And thanks for the hint on PNGs, good call!

Release: R2019b
