Plot a curve that splits data into two sets
Hello,
I have data points that represent two classes (collisions avoided and probable collisions). My goal is to plot a curve (a polynomial equation) that splits the data points in a chosen ratio (say, 90% collisions avoided to 10% probable collisions). Note that the data points of the two classes lie very close to each other.
I have tried the 'fit' function in MATLAB, and for a polynomial of degree 8 here is what I get (refer to the image). But it doesn't split the data as required.
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/166534/image.jpeg)
I am looking at Support Vector Machines for binary classification (I am not an expert in this domain), but I am not sure if they would help. How can I get the data segregation I want?
Best,
Raj
Answers (3)
Greg Heath
4 Aug 2017
Your data is extremely discontinuous. The best you can hope for is a decision tree.
Hope this helps
Thank you for formally accepting my answer
Greg
Comments: 2
Image Analyst
14 Aug 2017
Too bad, because I think that's your best shot at a possible solution. Since your data overlap so much, I think those two parameters are not enough to do the discrimination. You'd best look for a third or fourth parameter, like acceleration, velocity vector angles, or something. If you can't, then I think a TreeBagger/random forest/decision tree type of approach is the best you can hope for, like Greg said. See the scatterplot example on https://www.mathworks.com/help/stats/ensemble-methods.html#bsx62vu Actually, your ad hoc convex hull example is somewhat related to a TreeBagger type of solution. It also sounds a bit like DBSCAN: https://en.wikipedia.org/wiki/DBSCAN
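For readers who want to try the tree-based route, here is a minimal sketch rather than a definitive recipe. It assumes the Statistics and Machine Learning Toolbox is installed. The data are synthetic and purely hypothetical (the variable names `distance`, `velocity`, and `isCollision`, the units, and the noisy labelling rule are all made up for illustration), so the printed error rates say nothing about the real data set.

```matlab
% Hypothetical synthetic data standing in for the real measurements.
rng(0);                                  % reproducible random numbers
n = 400;
distance = 100 * rand(n, 1);             % inter-vehicle distance (assumed units: m)
velocity = 40 * rand(n, 1);              % vehicle velocity (assumed units: m/s)
% Assumed noisy rule: closer and faster means a collision is more likely.
isCollision = (0.15 * velocity - 0.08 * distance + randn(n, 1)) > 0;
X = [distance, velocity];

% Single decision tree; MinLeafSize limits overfitting to overlapping points.
tree = fitctree(X, isCollision, 'MinLeafSize', 20);
treeErr = kfoldLoss(crossval(tree));     % 10-fold cross-validated error

% Bagged ensemble of 100 trees with out-of-bag error tracking.
bag = TreeBagger(100, X, isCollision, 'OOBPrediction', 'on');
oobErr = oobError(bag);                  % cumulative error vs. number of trees
bagErr = oobErr(end);                    % error of the full ensemble

fprintf('Tree CV error: %.3f, bagged OOB error: %.3f\n', treeErr, bagErr);
```

If both error estimates stay high on the real data, that supports the point above: two predictors may simply not carry enough information, and adding a third or fourth is worth more than tuning the model.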
John D'Errico
4 Aug 2017
But why would a polynomial regression fit have any chance of satisfying this goal? It would be pure random chance if it came even close. It is especially wrong to hope that such a fit, based purely on distance as the independent variable, would have a chance.
It seems you are looking for a nonlinear discriminant curve, based on both velocity and distance. I'd suggest neural nets, but just because you want to see a 90% success rate does not mean any such function exists. You could just as easily have insisted on a 99.99% target success rate. If wishes were horses, beggars would ride.
What you need to be modeling is a boolean result, thus collision or not, as a function of TWO independent variables, vehicle velocity and inter-vehicle distance. Again, use a tool of your choice. But a polynomial regression is still NOT the tool I would ever advise here.
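As one possible illustration of modeling a boolean outcome on two predictors, the sketch below swaps in a logistic regression (via `fitglm`) for the "tool of your choice"; it is not a neural net and not necessarily what the answer had in mind. It assumes the Statistics and Machine Learning Toolbox, and the data, variable names, and the idea of picking the probability threshold from a quantile are all hypothetical choices made for the example.

```matlab
% Hypothetical synthetic data; replace with the real distance/velocity/labels.
rng(1);
n = 500;
distance = 100 * rand(n, 1);             % inter-vehicle distance (assumed units)
velocity = 40 * rand(n, 1);              % vehicle velocity (assumed units)
isCollision = (0.15 * velocity - 0.08 * distance + randn(n, 1)) > 0;
X = [distance, velocity];

% Logistic regression with quadratic terms: P(collision | distance, velocity).
mdl = fitglm(X, double(isCollision), 'quadratic', 'Distribution', 'binomial');
pCollision = predict(mdl, X);

% Assumed scheme: pick the probability level that leaves 90% of the
% avoided-collision points on the "safe" side of the boundary.
threshold = quantile(pCollision(~isCollision), 0.90);
safeFrac = mean(pCollision(~isCollision) <= threshold);    % close to 0.90
missedFrac = mean(pCollision(isCollision) <= threshold);   % collisions on safe side

% The dividing curve is the contour of fitted probability at that level.
[dGrid, vGrid] = meshgrid(linspace(0, 100, 200), linspace(0, 40, 200));
pGrid = reshape(predict(mdl, [dGrid(:), vGrid(:)]), size(dGrid));
gscatter(distance, velocity, isCollision, 'br', '.');
hold on
contour(dGrid, vGrid, pGrid, [threshold threshold], 'k', 'LineWidth', 1.5);
hold off
xlabel('Inter-vehicle distance'); ylabel('Vehicle velocity');
```

The threshold fixes only what share of the avoided-collision points lands on the safe side. How many probable collisions end up there too (`missedFrac`) is set by how much the classes overlap, which is why a target ratio cannot be guaranteed in advance.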