Dataset condensation and distance function optimization in KNN classifier
Hi,
I am working on a decision system for stocks. I have a lot of data (time series for 6000 stocks, with 10 years of daily data on various metrics related to valuation, price momentum, street estimates, intrinsic business quality, etc.).
From my reading, it sounds like a KNN classifier is the easiest and best type of framework for me to focus on (after considering NNs, decision trees, etc.). However, the toolboxes provided with MATLAB seem to lack two important components that I would need: data condensation and some way of optimizing the distance function.
I Googled a few things and found that the "Hart" algorithm (condensed nearest neighbor, "CNN") is often used for condensation. I also found this link, which seems to be the kind of thing I need (http://mirlab.org/jang/matlab/toolbox/machineLearning/help/dsCondense_help.html#2). Unfortunately, that code does not seem to be freely available.
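To make the condensation idea concrete, here is a minimal sketch of Hart's rule in plain Python (standard library only). It is an illustration under stated assumptions, not the code behind the link above: the function names `sq_dist`, `nearest_label`, and `hart_condense` are hypothetical, the distance is plain squared Euclidean, and the data are assumed to be small enough for a brute-force search.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_label(store, point):
    """Label of the stored (features, label) pair closest to `point` (1-NN)."""
    return min(store, key=lambda s: sq_dist(s[0], point))[1]

def hart_condense(X, y, seed=0):
    """Return sorted indices of a condensed subset of (X, y).

    Hart's rule: start with one sample in the store, then repeatedly scan
    the data and add every sample that the current store misclassifies
    with 1-NN. Stop after a full pass that adds nothing.
    """
    rng = random.Random(seed)
    order = list(range(len(X)))
    rng.shuffle(order)          # the result depends on the visiting order
    keep = [order[0]]
    kept = {order[0]}
    changed = True
    while changed:
        changed = False
        for i in order:
            if i in kept:
                continue
            store = [(X[j], y[j]) for j in keep]
            if nearest_label(store, X[i]) != y[i]:
                keep.append(i)
                kept.add(i)
                changed = True
    return sorted(keep)
```

Calling `hart_condense(X, y)` on a list of feature tuples `X` and a list of labels `y` returns the indices to keep. With distinct points, the kept subset classifies every training sample correctly under 1-NN, which is the property the condensation is meant to preserve.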
For optimizing distance functions, there seems to be more freely available code online, such as http://www.cs.cmu.edu/~liuy/distlearn.htm and http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html.
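As a rough illustration of what "optimizing the distance function" can mean in the simplest case, the sketch below tunes per-feature weights of a weighted Euclidean distance by leave-one-out 1-NN accuracy. This is a deliberately simple stand-in, not the method used by either toolbox linked above. The names `weighted_sq_dist`, `loo_accuracy`, and `tune_weights`, the candidate weight grid, and the number of sweeps are all assumptions made for the example.

```python
def weighted_sq_dist(a, b, w):
    """Squared Euclidean distance with one non-negative weight per feature."""
    return sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b))

def loo_accuracy(X, y, w):
    """Leave-one-out accuracy of a 1-NN classifier under weights `w`."""
    hits = 0
    for i in range(len(X)):
        # Nearest neighbor of sample i among all other samples.
        j = min((k for k in range(len(X)) if k != i),
                key=lambda k: weighted_sq_dist(X[i], X[k], w))
        hits += (y[j] == y[i])
    return hits / len(X)

def tune_weights(X, y, candidates=(0.0, 0.5, 1.0, 2.0), sweeps=2):
    """Greedy coordinate search over per-feature weights.

    Starts from all weights equal to 1 and, one feature at a time, keeps a
    candidate weight only if it strictly improves leave-one-out accuracy.
    """
    w = [1.0] * len(X[0])
    best = loo_accuracy(X, y, w)
    for _ in range(sweeps):
        for d in range(len(w)):
            for c in candidates:
                trial = w[:d] + [c] + w[d + 1:]
                acc = loo_accuracy(X, y, trial)
                if acc > best:
                    w, best = trial, acc
    return w, best
```

`tune_weights(X, y)` returns the chosen weights and their leave-one-out accuracy. Each evaluation compares every pair of samples, so on a dataset the size described here it would need a condensed or subsampled training set first, and the accuracy it reports is optimistic because the same data are used for tuning and scoring.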
Does anybody know where I can find good code to accomplish condensing and distance function optimization? Any other comments on the general approach would be greatly appreciated.
THANK YOU!
Regards, Mike